A few months ago, I got a surprise call from Subra Suresh, director of the National Science Foundation, who told me I was going to share this year’s Alan T. Waterman Award with Robert Wood of Harvard.  (At first I assumed it was a telemarketing call, since pretty much no one calls my office phone; I use my iPhone exclusively and have trouble even operating my desk phone.)  Dr. Suresh explained that this was the first time the Waterman would ever be awarded to two people the same year, but that the committee was unanimous in supporting both me and Rob.  Looking up my co-winner, I quickly learned that Rob was a leader in the field of robot bees (see here for video)—and that his work, despite having obvious military applications, had been singled out by Sean Hannity as the latter’s #1 example of government waste (!).  That fact, alone, made me deeply honored to share the award with Rob, and eager to meet him in person.
Happily, I finally got to do that this past Thursday, at the Waterman award ceremony in Washington DC.  The festivities started in the morning, with talks by me and Rob to the National Science Board.  (I just performed my usual shtick.  I was hoping Rob would bring some actual RoboBees, but he said he no longer does that due to an unfortunate run-in with airport security.)  Then, after lunch and meetings at the NSF, it was back to the hotel to change into a tux, an item I’d never worn before in my life (not even at my wedding).  Fortunately, my dad was there to help me insert the cufflinks and buttons, a task much more complicated than anything I was allegedly getting the award for.  Then Dana and I were picked up by a limo, to begin the arduous mile-long journey from Dupont Circle to the State Department for the awards dinner.
Besides me and Rob, there were three other awardees that night:
* Leon Lederman, the 89-year-old Nobel physicist whose popular book (The God Particle) I enjoyed as a kid, received the Vannevar Bush Award.
* Lawrence Krauss, physicist and popular science writer, and National Public Radio’s science desk shared the National Science Board Public Service Award.  Some readers of science blogs might recognize Lawrence Krauss from his recent brouhaha over literally nothing with the philosopher of science David Albert.  (For whatever it’s worth, I have little to add to Sean Carroll’s diplomatic yet magisterial summary of the issues over on Cosmic Variance.)
Speaking of diplomacy, the awards dinner was held in the “diplomatic reception rooms” on the top floor of the State Department’s Harry S. Truman Building.   These were pretty awesome rooms: full of original portraits of George Washington, Ben Franklin, etc., as well as antique furniture pieces like a desk that Thomas Jefferson allegedly used while writing the Declaration of Independence.  I could easily eat dinner there on a regular basis.
Carl Wieman, the Nobel physicist and Associate Director for Science at the White House Office of Science and Technology Policy, read out a congratulatory message from President Obama.  I feel certain the President remembered I was the same dude he shook hands with a while back.
Anyway, cutting past dinner and dessert, here was my short acceptance speech:
Thanks for this honor, and huge congratulations to my co-winners, wherever in the alphabet they might lie [a reference to my getting called up before Rob Wood, simply because Aaronson<Wood lexicographically].  I like to describe my research, on the limits of quantum computers, as the study of what we can’t do with computers we don’t have.  Why would I or anyone else study such a bizarre thing?  Mostly because we’re inspired by history.  In the 1930s, before electronic computers even existed, a few people like Alan Turing were already trying to understand mathematically what such devices would or wouldn’t be able to do.  Their work ultimately made possible the information age.  Today, we don’t know exactly where curiosity about (say) quantum computers or the P versus NP question is going to lead, but I’m grateful to live in a country that’s able to support this kind of thing.  I thank the NSF and the Obama administration for supporting basic science even in difficult times.  I thank Subra Suresh (my former dean at MIT), and my phenomenal program officer Dmitry Maslov.  I thank the teachers and mentors to whom I owe almost everything, including Chris Lynch, Bart Selman, Avi Wigderson, and Umesh Vazirani.  I thank my wonderful colleagues at MIT—including my department head Anantha Chandrakasan, who’s here now—and my students and postdocs.  I thank my collaborators, and the entire theory of computing and quantum information communities, which I’m so proud to be part of.  I thank my students in 6.045 for understanding why I had to miss class today.  Most of all, I thank four people who are here with me now—my mom, dad, and my brother David, who’ve always believed in me, whether justified or not, and my wife, Dana Moshkovitz Aaronson, who’s enriched my life ever since she came into it three years ago.  Thank you.
The next day, I had the privilege of giving a quantum computing talk to more than 100 students at the Thomas Jefferson High School for Science and Technology in nearby Alexandria, VA.  Visiting TJ had special meaning for me, since while I was suffering through high school, TJ was my “dream school”: I wished my parents lived in the DC area so that I could go there.  I told the TJ students never to forget just how good they had it.  (To this day, when I meet fellow American-raised scientists, and they tell me they’re surprised I had such an unhappy time in high school, since they themselves had a great time, I always ask them which high school they went to.  In a large fraction of cases, the answer turns out to be TJ—and when it isn’t, it’s often the Bronx High School of Science or another similar place.)  As should surprise no one, the students had vastly more detailed questions about my talk than did the National Science Board (for example, they wanted to know whether I thought progress in group theory would lead to new quantum algorithms).
Without doubt, the most surreal aspect of this trip was the contrast between what was going on in my “real” and “virtual” lives.  Again and again, I’d be shaking hands with the Undersecretary of Defense, Director of the National Institute of Prestigiousness, etc. etc., and warmly accepting these fine people’s congratulations.  Then I’d sneak away for a minute to moderate my blog comments on my iPhone, where I’d invariably find a fresh round of insults about my “deeply ignorant lesser brain” from entanglement denier Joy Christian.
Perhaps the funniest contrast had to do with a MathOverflow question that I posted just before I left for DC, and which was quickly answered, just as I had hoped.  During the limo ride back from the dinner, I got the following polite inquiry from a blog commenter calling himself “Mike”:
Hey Scott, I’m wondering how you got the courage to post that question on [MathOverflow]. In truth it wasn’t that hard of a question and if you have trouble solving it then…well, no offense, but you see what I mean. Reputation matters.
As I contemplated Mike’s question, a profound sense of peace came over me.  Probably for the first time in my life, I realized just how lucky I really am.  I’m lucky that I feel free to ask naïve, simpleminded questions, toss out speculations, and most importantly, admit when I don’t know something or made a mistake, without worrying too much about whether those actions will make me look foolish before the “Mikes” of the world.  If I want to work on a problem myself, I can do that; if I prefer giving the problem out to others, I can do that as well.  Let Mike, with his greater wisdom, sit in judgment of me for my failure to see all the answers that no doubt are obvious to him.  I don’t mind.  In science, like in everything else, I’ll continue being an unabashed doofus—partly because it seems to work OK, but mostly just because it’s the only way I know.
Thanks so much to all of you for your support.
I’m convinced that the following diagram means something precise:
My question is, what does it mean?
Intuitively, it means that if your software package can solve SDP’s, then you can easily use it to solve LP’s; if it can solve LP’s, you can easily use it to invert matrices, and so on, but not vice versa. But it can’t mean (for example) that SDP’s are harder than LP’s in the usual complexity theory sense, since both problems are P-complete!
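To make the first arrow concrete, here's a minimal sketch of that reading (my own toy example, using cvxpy; the data and variable names are just for illustration, not anything from the diagram itself): an LP is an SDP in which all the matrices are diagonal, so anything that solves SDPs solves LPs with no extra machinery.

```python
# A minimal sketch of the "SDP solver => LP solver" arrow (my own toy example).
# An LP   min c^T x  s.t.  A x >= b,  x >= 0   is the special case of an SDP in
# which every data matrix is diagonal: a diagonal matrix is PSD iff its entries
# are nonnegative.
import cvxpy as cp
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 5.0])
c = np.array([1.0, 1.0])

# The LP, stated directly.
x = cp.Variable(2, nonneg=True)
lp = cp.Problem(cp.Minimize(c @ x), [A @ x >= b])
lp.solve()

# The same problem fed to an SDP solver: the variable is a PSD matrix X, and
# only its diagonal enters the objective and constraints.
X = cp.Variable((2, 2), PSD=True)
sdp = cp.Problem(cp.Minimize(cp.trace(np.diag(c) @ X)),
                 [cp.trace(np.diag(A[i]) @ X) >= b[i] for i in range(2)])
sdp.solve()

print(lp.value, sdp.value)  # both ~2.6: the optimal values coincide
```

(The two values agree because the diagonal of any feasible PSD matrix X is a feasible LP point with the same objective value, and conversely diag(x) is a feasible PSD matrix.)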
Maybe it means that, if your axiom system is strong enough to prove SDP is in P, then it’s also strong enough to prove LP is in P, and so on — but not necessarily vice versa. But how would we show such a separation?
(Sorry, no money this time. We’ll see if it makes any difference — I’m guessing that it doesn’t.)
I arrived this morning in Prague for the 2006 Complexity conference. Soon I’ll have the photos to prove it. For now, though, I wish to blog neither about the breathtakingly beautiful city in which I find myself, nor about the meaty, succulent topic alluded to in my previous post, but instead about anthropicisms.
Inspired by Peter Woit’s almost-daily anti-anthropic broadsides, and in the spirit of my earlier Best Umeshism Contest, I hereby announce a new contest for Best Application of the Anthropic Principle. Here are a few samples to get the self-selected tautological ball rolling, not that it could do otherwise than roll:
> Why do so many people seem to care about being remembered after they die? Because we only remember the ones who cared about being remembered.
>
> Academics comprise only a tiny portion of humanity, so what are the chances of being an academic as opposed to someone else? Conditioned on asking such a question in the first place, pretty high.
>
> Why is the moon round? Because if it were square, you wouldn’t be you — you would instead be a being extremely similar to you, except that he or she lives in a universe with a square moon.
>
> Why am I a blogger? Because if I weren’t, you wouldn’t be reading this.
The rules are similar to the Best Umeshism Contest: up to three entries per person. Please include a name — despite the nature of the contest, “He Who Posted This” doesn’t count. Entries must be in by July 22nd. The winner (as chosen by me) gets to ask any question and have me answer it here.
Update: I decided to close comments on this post and the previous Joy Christian post, because they simply became too depressing for me.
I’ve further decided to impose a moratorium, on this blog, on all discussions about the validity of quantum mechanics in the microscopic realm, the reality of quantum entanglement, or the correctness of theorems such as Bell’s Theorem.  I might lift the moratorium at some future time.  For now, though, life simply feels too short to me, and the actually-interesting questions too numerous.  Imagine, for example, that there existed a devoted band of crackpots who believed, for complicated, impossible-to-pin-down reasons of topology and geometric algebra, that triangles actually have five corners.  These crackpots couldn’t be persuaded by rational argument—indeed, they didn’t even use words and sentences the same way you do, to convey definite meaning.  And crucially, they had infinite energy: you could argue with them for weeks, and they would happily argue back, until you finally threw up your hands in despair for all humanity, at which point the crackpots would gleefully declare, “haha, we won!  the silly ‘triangles have 3 corners’ establishment cabal has admitted defeat!”  And, in a sense, they would have won: with one or two exceptions, the vast majority who know full well how many corners a triangle has simply never showed up to the debate, thereby conceding to the 5-cornerists by default.
What would you do in such a situation?  What would you do?  If you figure it out, please let me know (but by email, not by blog comment).
* * *
In response to my post criticizing his “disproof” of Bell’s Theorem, Joy Christian taunted me that “all I knew was words.”  By this, he meant that my criticisms were entirely based on circumstantial evidence, for example that (1) Joy clearly didn’t understand what the word “theorem” even meant, (2) every other sentence he uttered contained howling misconceptions, (3) his papers were written in an obscure, “crackpot” way, and (4) several people had written very clear papers pointing out mathematical errors in his work, to which Joy had responded only with bluster.  But I hadn’t actually studied Joy’s “work” at a technical level.  Well, yesterday I finally did, and I confess that I was astonished by what I found.  Before, I’d actually given Joy some tiny benefit of the doubt—possibly misled by the length and semi-respectful tone of the papers refuting his claims.  I had assumed that Joy’s errors, though ultimately trivial (how could they not be, when he’s claiming to contradict such a well-understood fact provable with a few lines of arithmetic?), would nevertheless be artfully concealed, and would require some expertise in geometric algebra to spot.  I’d also assumed that of course Joy would have some well-defined hidden-variable model that reproduced the quantum-mechanical predictions for the Bell/CHSH experiment (how could he not?), and that the “only” problem would be that, due to cleverly-hidden mistakes, his model would be subtly nonlocal.
What I actually found was a thousand times worse: closer to the stuff freshmen scrawl on an exam when they have no clue what they’re talking about but are hoping for a few pity points.  It’s so bad that I don’t understand how even Joy’s fellow crackpots haven’t laughed this off the stage.  Look, Joy has a hidden variable λ, which is either 1 or -1 uniformly at random.  He also has a measurement choice a of Alice, and a measurement choice b of Bob.  He then defines Alice and Bob’s measurement outcomes A and B via the following functions:
A(a,λ) = something complicated = (as Joy correctly observes) λ
B(b,λ) = something complicated = (as Joy correctly observes) -λ
I shit you not.  A(a,λ) = λ, and B(b,λ) = -λ.  Neither A nor B has any dependence on the choices of measurement a and b, and the complicated definitions that he gives for them turn out to be completely superfluous.  No matter what measurements are made, A and B are always perfectly anticorrelated with each other.
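If you want to check the obvious consequence numerically, here's a minimal sketch (my own code, not Joy's): for a model with A(a,λ) = λ and B(b,λ) = -λ, the correlation is -1 no matter what the settings are, so the CHSH combination comes out to exactly 2, never more.

```python
# Minimal numerical check (my own illustration): with A(a,lam) = lam and
# B(b,lam) = -lam, the correlation E(a,b) is -1 for every pair of settings,
# so |E(a,b) + E(a,b') + E(a',b) - E(a',b')| = 2 <= 2.  No Bell violation.
import numpy as np

rng = np.random.default_rng(0)

def E(a, b, trials=100_000):
    lam = rng.choice([-1, 1], size=trials)  # hidden variable, uniform on {-1, +1}
    A = lam                                  # Alice's outcome ignores her setting a
    B = -lam                                 # Bob's outcome ignores his setting b
    return np.mean(A * B)

a, a2 = 0.0, np.pi / 2            # two (irrelevant) settings for Alice
b, b2 = np.pi / 4, 3 * np.pi / 4  # two (irrelevant) settings for Bob
chsh = abs(E(a, b) + E(a, b2) + E(a2, b) - E(a2, b2))
print(chsh)  # exactly 2; quantum mechanics reaches 2*sqrt(2)
```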
You might wonder: what could lead anyone—no matter how deluded—even to think such a thing could violate the Bell/CHSH inequalities?  Aha, Joy says you only ask such a naïve question because, lacking his deep topological insight, you make the rookie mistake of looking at the actual outcomes that his model actually predicts for the actual measurements that are actually made.  What you should do, instead, is compute a “correlation function” E(a,b) that’s defined by dividing A(a,λ)B(b,λ) by a “normalizing factor” that’s a product of the quaternions a and b, with a divided on the left and b divided on the right.  Joy seems to have obtained this “normalizing factor” via the technique of pulling it out of his rear end.  Now, as Gill shows, Joy actually makes an algebra mistake while computing his nonsensical “correlation function.”  The answer should be -a.b-a×b, not -a.b.  But that’s truthfully beside the point.  It’s as if someone announced his revolutionary discovery that P=NP implies N=1, and then critics soberly replied that, no, the equation P=NP can also be solved by P=0.
So, after 400+ comments on my previous thread—including heady speculations about M-theory, the topology of spacetime, the Copenhagen interpretation, continuity versus discreteness, etc., as well as numerous comparisons to Einstein—this is what it boils down to.  A(a,λ) = λ and B(b,λ) = -λ.
I call on FQXi, in the strongest possible terms, to stop lending its legitimacy to this now completely-unmasked charlatan.  If it fails to do so, then I will resign from FQXi, and will encourage fellow FQXi members to do the same.
While I don’t know the exact nature of Joy’s relationship to Oxford University or to the Perimeter Institute, I also call on those institutions to sever any connections they still have with him.
Finally, with this post I’m going to try a new experiment.  I will allow comments through the moderation filter if, and only if, they exceed a minimum threshold of sanity and comprehensibility, and do not randomly throw around terms like “M-theory” with no apparent understanding of what they mean.  Comments below the sanity threshold can continue to appear freely in the previous Joy Christian thread (which already has a record-setting number of comments…).
Update (May 11): A commenter pointed me to a beautiful preprint by James Owen Weatherall, which tries sympathetically to make as much sense as possible out of Joy Christian’s ideas, and then carefully explains why the attempt fails (long story short: because of Bell’s theorem!).  Notice the contrast between the precision and clarity of Weatherall’s prose—the way he defines and justifies each concept before using it—and the obscurity of Christian’s prose.
Another Update: Over on the previous Joy Christian thread, some commenters are now using an extremely amusing term for people who believe that theories in physics ought to say something comprehensible about the predicted outcomes of physics experiments.  The term: “computer nerd.”
Third Update: Quite a few commenters seem to assume that I inappropriately used my blog to “pick a fight” with poor defenseless Joy Christian, who was minding his own business disproving and re-disproving Bell’s Theorem.  So let me reiterate that I wasn’t looking for this confrontation, and in fact took great pains to avoid it for six years, even as Joy became more and more vocal.  It was Joy, not me, who finally forced matters to a head through his absurd demand that I pay him $100,000 “with interest,” and then his subsequent attacks.
A year ago, I relinquished my dictatorial control of the Complexity Zoo, accepting an offer from John Stockton to convert the Zoo into wiki format. Unfortunately, the wiki site has been down for days and shows no signs of coming back anytime soon. So for now, I’ve put the old Zoo back up at www.complexityzoo.com. I’ve learned my lesson: in times of crisis, it takes a leader with an iron fist to keep the trains running on time and the animals in their cages.
John Baez is back on the scene, with an account of his recent visit to our quantum computing group at Waterloo. Among other things, he gives a lucid explanation of how, while it’s generally impossible to keep information from leaking out of a computer, it is possible to arrange things so that the information that does leak is irrelevant to the computation. Baez links to the papers that prove this is true for quantum computing as well as classical, but complains that “most of it speaks the language of ‘error correction’ rather than thermodynamics.” Question for the audience: can the fault-tolerance theorems be reproved more physicsly? (“We now define a PHYSICAL SYSTEM called the concatenated Steane code…”)
Baez’s semi-conversion to the Church of Knill, Laflamme, and Zurek (or the Shul of Aharonov and Ben-Or) has inspired me to propose a far-reaching hypothesis:
> While it’s generally impossible to explain computer science concepts to physicists so that they understand them on your terms, it is sometimes possible to explain them so that they understand on their terms.
Naturally, it helps if the physicist in question is Baez.
From these lecture notes by Harvey Friedman comes one of the best metamathematical anecdotes I’ve ever heard (and yes, I’ve heard my share). Apparently Friedman was attending a talk by the “ultra-finitist” Alexander Yessenin-Volpin, who challenged the “Platonic existence” not only of infinity, but even of large integers like 2^100. So Friedman raised the obvious “draw the line” objection: in the sequence 2^1, 2^2, …, 2^100, which is the first integer that Yessenin-Volpin would say doesn’t exist?
Yessenin-Volpin asked Friedman to be more specific.
“Okay, then. Does 2^1 exist?”
Yessenin-Volpin quickly answered “yes.”
“What about 2^2?”
After a noticeable delay: “yes.”
“2^3?”
After a longer delay: “yes.”
It soon became clear that Yessenin-Volpin would answer “yes” to every question, but would take twice as long for each one as for the one before it.
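(For the pedantically inclined, here's the arithmetic behind the joke, with a made-up one-second base delay of my own choosing: if the delay doubles with each question, the “yes” for 2^100 never actually arrives.)

```python
# Toy model of the anecdote (my own numbers): affirming the existence of 2^n
# takes about 2^(n-1) seconds if the first answer takes 1 second and each
# subsequent answer takes twice as long.
def seconds_to_affirm(n, first_delay=1.0):
    return first_delay * 2 ** (n - 1)

print(seconds_to_affirm(5))    # 16 seconds
print(seconds_to_affirm(100))  # ~6.3e29 seconds, roughly 10^12 times the age of the universe
```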
Back in April, I blogged about why we should all support open-access journals. But after receiving a long string of referee requests from closed-access journals, I’ve completely changed my mind about this issue. I now believe we should keep Kluwer, Elsevier, and the other publishing conglomerates rolling in dough for as long as possible, and do whatever we can to sabotage the open-access movement. Why? Because as soon as the world switches to open-access, I’ll have no choice but to start accepting referee requests again.
A few days ago, a writer named John Rico emailed me the following question, which he’s kindly given me permission to share.
If a computer, or robot, was able to achieve true Artificial Intelligence, but it did not have a parallel programming or capacity for empathy, would that then necessarily make the computer psychopathic?  And if so, would it then follow the rule devised by forensic psychologists that it would necessarily then become predatory?  This then moves us into territory covered by science-fiction films like “The Terminator.”  Would this psychopathic computer decide to kill us?  (Or would that merely be a rational logical decision that wouldn’t require psychopathy?)
See, now this is precisely why I became a CS professor: so that if anyone asked, I could give not merely my opinion, but my professional, expert opinion, on the question of whether psychopathic Terminators will kill us all.
My response (slightly edited) is below.
Dear John,
I fear that your question presupposes way too much anthropomorphizing of an AI machine—that is, imagining that it would even be understandable in terms of human categories like “empathetic” versus “psychopathic.”  Sure, an AI might be understandable in those sorts of terms, but only if it had been programmed to act like a human.  In that case, though, I personally find it no easier or harder to imagine an “empathetic” humanoid robot than a “psychopathic” robot!  (If you want a rich imagining of “empathetic robots” in science fiction, of course you need look no further than Isaac Asimov.)
On the other hand, I personally also think it’s possible—even likely—that an AI would pursue its goals (whatever they happened to be) in a way so different from what humans are used to that the AI couldn’t be usefully compared to any particular type of human, even a human psychopath.  To drive home this point, the AI visionary Eliezer Yudkowsky likes to use the example of the “paperclip maximizer.”  This is an AI whose programming would cause it to use its unimaginably-vast intelligence in the service of one goal only: namely, converting as much matter as it possibly can into paperclips!
Now, if such an AI were created, it would indeed likely spell doom for humanity, since the AI would think nothing of destroying the entire Earth to get more iron for paperclips.  But terrible though it was, would you really want to describe such an entity as a “psychopath,” any more than you’d describe (say) a nuclear weapon as a “psychopath”?  The word “psychopath” connotes some sort of deviation from the human norm, but human norms were never applicable to the paperclip maximizer in the first place … all that was ever relevant was the paperclip norm!
Motivated by these sorts of observations, Yudkowsky has thought and written a great deal about the question of how to create a “friendly AI,” by which he means one that would use its vast intelligence to improve human welfare, instead of maximizing some arbitrary other objective like the total number of paperclips in existence that might be at odds with our welfare.  While I don’t always agree with him—for example, I don’t think AI has a single “key,” and I certainly don’t think such a key will be discovered anytime soon—I’m sure you’d find his writings at yudkowsky.net, lesswrong.com, and overcomingbias.com to be of interest to you.
I should mention, in passing, that “parallel programming” has nothing at all to do with your other (fun) questions.  You could perfectly well have a murderous robot with parallel programming, or a kind, loving robot with serial programming only.
Hope that helps,
Scott
Yesterday brought the tragic news that Mihai Pătraşcu—who revolutionized the field of data structures since he burst onto the scene a decade ago—has passed away at the age of 29, after a year-and-a-half-long battle with brain cancer.  Mihai was not only an outstanding researcher but a fun-loving, larger-than-life personality in the computer science theory community.  For more information, see Lance and Bill’s or Michael Mitzenmacher’s blogs.
Mihai was an MIT CS PhD student (advised by Erik Demaine), who worked on the same floor as me for the first couple years I was here.  I’m still in shock over his loss—I hadn’t even known about the cancer before yesterday.   Mihai and I had pretty big disagreements, mostly over the viability of quantum computing, the “technical” versus “conceptual” theory debate, various things he wrote on his blog and various things I wrote on mine.  But it seems terribly stupid now to have let this stuff get in the way of collegiality.  I feel guilty for not trying to mend bridges with him when I had the chance.
Rest in peace, Mihai.
So, my Best Anthropicism Contest elicited almost 50 submissions. Thanks so much to everyone who entered — if not for you, this tautological tug-of-war would’ve been something other than what it was!
To choose the winning entry, the first rule I adopted was that, when I did find the winning entry, conditions would have to be such as to make it the winning entry, since otherwise it wouldn’t be the winning entry in the first place, but rather a losing entry. Since that didn’t get me very far, I quickly fell back on other criteria.
First, the winning entry would have to be short — longwinded explanations were out right away.
Second, it would have to make sense.
Third, it would have to illustrate the anthropic principle specifically, not some sort of generic Zen wisdom.
That already killed most of the entries. Among the ones left, many dealt Hofstadterifically with the contest itself:
> wolfgang: Applying the principle of mediocrity I have to conclude that it is unlikely that I will win this contest.
>
> Matt Wedel: Oh, c’mon! Just give me the prize! If I wasn’t going to win, I’d be living in a different universe where I didn’t win. BUT — I’m not. So give me the prize.
>
> MX: Why am I entering this contest? Because if I weren’t, I wouldn’t be me, I would be a being very similar to me living in a universe in which I did not enter this contest.
Other entries worked well as parody:
> sockatume: How much wood could a woodchuck chuck if a woodchuck could chuck would? As much wood as a woodchuck could chuck if a woodchuck could chuck would, otherwise it wouldn’t be a woodchuck.
>
> Bram Cohen: Why have all dates thus far come before January 1, 3000? Because the universe will cease to exist on that day.
In the end, though, I decided that what I was looking for wasn’t mere wit, but the real, genuine illusion of explanatory insight. And that’s why Lev R. takes the prize, with the following perspicacious pearl:
> why aren’t physicists too interested in computational complexity? because if they were, they’d be computer scientists.
Tomorrow, at 9AM EST (or an hour before teatime in Britain), I’ll be giving an online talk on Quantum Money from Hidden Subspaces (see here for PowerPoint slides) at the Q+ hangout on Google+.  To watch the talk, go here, then click the “Play” button on the video that will be there tomorrow.  Abstract:
Forty years ago, Wiesner pointed out that quantum mechanics raises the striking possibility of money that cannot be counterfeited according to the laws of physics. We propose the first quantum money scheme that is (1) public-key, meaning that anyone can verify a banknote as genuine, not only the bank that printed it, and (2) cryptographically secure, under a “classical” hardness assumption that has nothing to do with quantum money. Our scheme is based on hidden subspaces, encoded as the zero-sets of random multivariate polynomials. A main technical advance is to show that the “black-box” version of our scheme, where the polynomials are replaced by classical oracles, is unconditionally secure. Previously, such a result had only been known relative to a quantum oracle (and even there, the proof was never published). Even in Wiesner’s original setting — quantum money that can only be verified by the bank — we are able to use our techniques to patch a major security hole in Wiesner’s scheme. We give the first private-key quantum money scheme that allows unlimited verifications and that remains unconditionally secure, even if the counterfeiter can interact adaptively with the bank. Our money scheme is simpler than previous public-key quantum money schemes, including a knot-based scheme of Farhi et al. The verifier needs to perform only two tests, one in the standard basis and one in the Hadamard basis — matching the original intuition for quantum money, based on the existence of complementary observables. Our security proofs use a new variant of Ambainis’s quantum adversary method, and several other tools that might be of independent interest.  Joint work with Paul Christiano.
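If you want a feel for the “two tests” part without any of the cryptography, here's a toy classical sketch (my own illustration, not the scheme from the paper: the real verifier uses obfuscated low-degree polynomials rather than bare linear algebra). The standard-basis test amounts to checking membership in a hidden subspace A ⊆ F_2^n, and the Hadamard-basis test amounts to checking membership in its dual subspace A^⊥.

```python
# Toy classical skeleton of the two verification tests (my own illustration;
# the actual scheme hides the subspace inside random multivariate polynomials).
import numpy as np

def rank_gf2(M):
    """Rank of a 0/1 matrix over GF(2), by Gaussian elimination."""
    M = M.copy() % 2
    r = 0
    for col in range(M.shape[1]):
        pivot = next((i for i in range(r, M.shape[0]) if M[i, col]), None)
        if pivot is None:
            continue
        M[[r, pivot]] = M[[pivot, r]]
        for i in range(M.shape[0]):
            if i != r and M[i, col]:
                M[i] ^= M[r]
        r += 1
    return r

def in_subspace(basis, x):
    """Standard-basis test: is x in the row space of `basis` over GF(2)?"""
    return rank_gf2(np.vstack([basis, x])) == rank_gf2(basis)

def in_dual(basis, y):
    """Hadamard-basis test: is y orthogonal (mod 2) to every basis vector?"""
    return not (basis @ y % 2).any()

# A random 4-dimensional subspace A of F_2^8, known only to the bank.
rng = np.random.default_rng(0)
basis = rng.integers(0, 2, size=(4, 8))
while rank_gf2(basis) < 4:
    basis = rng.integers(0, 2, size=(4, 8))

x = (basis[0] + basis[2]) % 2  # an element of A (what a standard-basis measurement of |A> would yield)

# Brute-force an element of the dual A^perp (what a Hadamard-basis measurement would yield).
y = None
for k in range(1, 2 ** 8):
    cand = np.array([(k >> i) & 1 for i in range(8)])
    if in_dual(basis, cand):
        y = cand
        break

print(in_subspace(basis, x), in_dual(basis, y))  # True True
```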
Update: Here’s a YouTube video of the talk.
In unrelated news, Alistair Sinclair asked me to announce that UC Berkeley’s new Simons Institute for Theoretical Computer Science—you know, the thing Berkeley recently defeated MIT (among others) to get—is soliciting proposals for programs.  The deadline is July 15, so be sure to get your proposal in soon.
* Why do I procrastinate so much on blog posts, even to the extent of not blogging about a trip until well after it’s over? Because, while coming up with the ideas (i.e., the jokes) is trivial, writing the connective tissue is a pain in the ass.
* Bulleted lists are easier. Expect me to fall back on them more often.
* So, Prague. It was nice. Really nice. Nicer than Amsterdam even.
* Like a fool, I somehow expected that, since it’s been less than two decades since the Velvet Revolution, Prague would still be some sort of backwards city in consonant-intensive Eastern Europe, grateful for any tourists it could get.
* I dramatically overestimated how long it would take for a former Communist stronghold to become Disneyland, a.k.a. the college backpacker capital of the world.
* I’m told there are two reasons for this transformation: (1) castles and cathedrals that weren’t completely reduced to rubble by WWII, and (2) cheap beer (less than $1 a pint). Of course, factoring in the cost of airfare and hotels, you’d have to drink hundreds of beers to save money. But we are talking about college backpackers.
* Have you heard of Jan Hus? A century before Martin Luther, he was already pulling the same shtick: condemning the selling of indulgences, advocating a return to Christ’s original teachings, etc. Of course the Catholics burned him at the stake. This led to the Hussite Wars, which I guess I would’ve learned about had I stayed in high school long enough to take AP Euro. Anyway, there’s a big statue of Mr. Hus in Prague’s Old Town Square (you can see a photo of it on Hus’s Wikipedia page). Get this: the statue is glaring angrily at a nearby Catholic church. As you might have gathered, I’ve never been much of an art critic, but I think I more-or-less understood what the sculptor was getting at.
* I also saw the biggest telescope in the Czech Republic.
* Oh, yeah: there was a conference. It was about complexity or something.
* Seriously, it was an excellent conference, except that the lecture room wasn’t air-conditioned. As a direct result, I can remember very little of the talks. (Is it better to contribute to global warming or to experience it?)
* If you’re ever in Prague, definitely visit the Museum of Communism (“back-handed bribes accepted in our gift shop”), especially if you’ve never been to a Soviet-bloc country before (as I hadn’t). Learning about the 19th century’s worst idea on a North American campus is different from learning about it on Wenceslas Square.
* Unfortunately, when I visit European cities like Amsterdam and Prague, I can never completely forget that I’m walking through a big murder scene. (“Thank you, waiter, for bringing me my chicken! And thank you, as well, for not deporting me to Theresienstadt or shooting me into an open pit! When you get a chance, could you maybe refill my water?”)
* Why does Prague have one of the best Judaica collections in the world? Because the Nazis shipped their loot there, expecting to open a historical museum about the human bacillus they had successfully eradicated. (There is such a museum today, but run by the bacillus itself.)
* Speaking of which, have you heard of the Golem? It was a clay robot allegedly built in the 1500’s by Rabbi Judah Löw of Prague. This robot, you see, went rampaging around, causing random destruction, until the townspeople agreed to halt their anti-Semitic attacks. (A bit like the IDF in Lebanon.) According to legend, the Golem’s remains are still in the attic of Prague’s Old-New Synagogue, and can be reanimated if necessary. The attic is closed to visitors, but the guidebooks say that recently some great rabbi was allowed to ascend to the attic, and “returned white and trembling.” (As a friend of mine remarked, they forgot to mention that the old fellow was also white and trembling before he went up the attic.) In any case, the Golem was apparently out of service when most needed.
I’m in Edinburgh this week to visit my wonderful old friends Elham Kashefi and Rahul Santhanam, and to give a series of talks.  It’s my first visit to my “ancestral homeland,” and as you can see above, I’ve enjoyed visiting my namesake monument and eating some freshly-ground haggis.
Earlier today, I was delighted to meet the matrix-multiplication-exponent-lowerer and unwilling Shtetl-Optimized celebrity Andrew Stothers, and to treat him to lunch.  (I’d promised to buy Andrew a beer if I was ever in Edinburgh, to apologize for the blog-circus I somehow dragged him into, but he only wanted a diet Coke.)  I’m now convinced that Andrew’s not publicizing his lowering of ω was mostly a very simple matter of his not being in contact with the theoretical computer science community.  One factor might be that, here at U. of Edinburgh, the math and CS buildings are on different campuses two miles away from each other!
I apologize for the light-to-nonexistent blogging.  To tide you over until I have time to post something real, here are some extremely-interesting quantum information papers that appeared on the arXiv just recently: A multi-prover interactive proof for NEXP sound against entangled provers by Tsuyoshi Ito and my postdoc Thomas Vidick, and Bell’s Theorem Without Free Will by Tobias Fritz.
A commenter on a previous post writes:
> A lot of great discoveries came from non-scientific losers. E=MCC. Airplanes. America. Someone discovered how to make an airplane by playing with a box. Physics is mostly theoretical. America, I guess, is the most scientific discovery. They applied the scientific method to determine its existence, but they used no control group, and no placebo. For that, America’s existence is not yet proven. There seem to be other ways of establishing truth than just the scientific method. Scientists are contemporary soothsayers. They should use every means possible of proving a fact.
Despite its insightfulness and coherence, the above argument raises some immediate questions:
1. What does it have to do with anything I said?
2. E=MCC?
3. What would it mean to use a placebo or control group to test America’s existence? Would it mean sending a ship in a different direction, and checking that it didn’t also reach America? Would it mean verifying that America can’t be reached from Europe by foot — since if it could, then it wouldn’t be America, but rather part of Eurasia?
4. Has England’s existence been scientifically proven? What about France’s?
5. Where do so many people get the cockamamie idea that there’s such a thing as a “scientific method” — that science is not just really, really, really careful thinking? (I blame the school system.)
Update (August 5): Sorry for the delay! Now that the Zoo is back up, my sense of urgency has decreased, but we still do need a long-term solution. Thanks so much to everyone who offered hosting. Alas, I was persuaded by the argument that it’s too complicated to have a wiki mirrored at multiple locations, so I should really choose one—and ideally it should be someplace where I retain control of the files, in case anything goes wrong again. Following the helpful directions of Eric Price, I set up a MediaWiki installation at http://scottaar.scripts.mit.edu/zoo. Is anyone interested in helping me transfer over the content from the qwiki Zoo?
* * *
Update (August 1): Thanks to the efforts of Gopal Sarma at Stanford, the Zoo is back up and running!!  However, I believe the only long-term solution is to get the Zoo mirrored at other locations.  I can then direct the domain complexityzoo.com to point to any of them that are currently up.  So, to all of those who volunteered to mirror the Zoo: thanks so much, and please go ahead and do so!  Let me know what you need for that (I can ask Gopal to get the source files).
* * *
As some of you have noticed, the Complexity Zoo (well, don’t bother clicking the link!) has been down for the past couple weeks.  Some Stanford students volunteered to host the Zoo years ago but then graduated, and these sorts of outages have been a frustrating reality since then.  So my co-zookeeper Greg Kuperberg and I are looking for a volunteer to help us get the Zoo back online.  The reward?  Eternal gratitude and a co-zookeeper title for yourself.  In principle, I could host the Zoo on my Bluehost account, but I don’t know how to set up the wiki software, and I’m not even sure how to retrieve the Zoo pages prior to its going down (Google Cache?).  If you’re interested or have ideas, leave a comment or send me an email.
Thanks!!
A reader from Istanbul wrote in, asking me to comment on the war in Israel and Lebanon. In other words, he wants me to make this blog the scene of yet another intellectual bloodbath, with insult-laden rockets launched from untraceable IP addresses and complexity-theoretic civilians trapped in the crossfire. What a neat idea! Why didn’t I think of it before?
Alright, let me start with some context. No, I’m not talking about the Gaza pullout, or Camp David, or the last Lebanon invasion, or the Yom Kippur War, or the Six-Day War, or the War of Independence, or the UN partition plan, or the 1939 White Paper. I’m talking about the first appearance of Israel in the extrabiblical historical record, which seems to have been around 1200 BC. Boasting in a victory stele about his recent military conquests in Canaan, the Egyptian pharaoh Merneptah included a single sentence about Israel:
> Israel is laid waste; his seed is destroyed.
Sure, the pharaoh was a bit premature. But give him credit for prescience if not for accuracy. Unlike (say) pyramid-building or Ra-worship, Merneptah’s Jew-killing idea has remained consistently popular for 3.2 millennia.
Today, in the year 2006, as the LHC prepares to find the Higgs boson and the New Horizons probe heads to Pluto, Am Yisra’el (literally, “the people that argues with God”) is once again surrounded by enemies whose stated goal is to wipe it off the face of the Earth. And, in the familiar process of fighting for its existence, that people is grievously, inexplicably, incompetently, blowing up six-year-olds and farmers while failing to make any visible progress on its military objectives.
So what is there to say about this that hasn’t already been said Ackermann(50) times? Instead of cluttering the blogosphere any further, I’ll simply point you to a beautiful New York Times op-ed by Rebecca Goldstein, commemorating the 350th anniversary of Spinoza’s excommunication from the Jewish community of Amsterdam. Actually, I’ll quote a few passages:
> Spinoza’s reaction to the religious intolerance he saw around him was to try to think his way out of all sectarian thinking. He understood the powerful tendency in each of us toward developing a view of the truth that favors the circumstances into which we happened to have been born. Self-aggrandizement can be the invisible scaffolding of religion, politics or ideology.
>
> Against this tendency we have no defense but the relentless application of reason.
>
> Spinoza’s system is a long deductive argument for a conclusion as radical in our day as it was in his, namely that to the extent that we are rational, we each partake in exactly the same identity.
>
> Spinoza’s dream of making us susceptible to the voice of reason might seem hopelessly quixotic at this moment, with religion-infested politics on the march. But imagine how much more impossible a dream it would have seemed on that day 350 years ago.
1. The 1936 Berlin Olympics, in which American participation was ensured by the racist, sexist, antisemitic, Nazi-sympathizing future decades-long IOC president Avery Brundage (also, the IOC’s subsequent failure to accept responsibility for its role in legitimizing Hitler).
2. The 1972 Munich Olympics (and the IOC’s subsequent refusal even to memorialize the victims, apparently for fear of antagonizing those Olympic countries that still celebrate the murder of the 11 Israeli athletes).
3. Even after you leave out 1936 and 1972, the repeated granting of unearned legitimacy to the world’s murderous dictatorships—as well as “glory” to those countries most able to coerce their children into lives of athletic near-slavery (or, in the case of more “civilized” countries, outspend their rivals).
4. The sanctimonious fiction that, after all this, we need the Olympics because of their contributions to world peace and brotherhood (a claim about which we now arguably have a century of empirical data).
5. The double-standard that holds “winning a medal is everything” to be a perfectly-reasonable life philosophy for a gymnast, yet would denounce the same attitude if expressed by a scientist or mathematician.
6. The increasingly-convoluted nature of what it is that the athletes are supposed to be optimizing (“run the fastest, but having taken at most these performance-enhancing substances and not those, unless of course you’re a woman with unusually-high testosterone, in which case you must artificially decrease your testosterone before competing in order to even things out”).
7. The IOC’s notorious corruption, and the fact that hosting the Olympics is nevertheless considered such a wonderful honor and goal for any aspiring city.
8. The IOC’s farcical attempts to control others’ use of five interlocked rings and of the word “Olympics.”
9. ~~The fact that swimmers have to use a particular stroke, rather than whichever stroke will propel them through the water the fastest~~ (alright, while the “freestyle” rules still seem weird to me, I’m taking this one out given the amount of flak it’s gotten).
10. The fact that someone like me, who knows all the above, and who has less interest in sports than almost anyone on earth, is still able to watch an Olympic event and care about its outcome.
(Note for non-US readers: This will be another one of my America-centric posts.  But don’t worry, it’s probably one you’ll agree with.)
There’s one argument in favor of gun control that’s always seemed to me to trump all others.
In your opinion, should private citizens be allowed to own thermonuclear warheads together with state-of-the-art delivery systems?  Does the Second Amendment give them the right to purchase ICBMs on the open market, maybe after a brief cooling-off period?  No?  Why not?
OK, whatever grounds you just gave, I’d give precisely the same grounds for saying that private citizens shouldn’t be allowed to own assault weapons, and that the Second Amendment shouldn’t be construed as giving them that right.  (Personally, I’d ban all guns except for the bare minimum used for sport-shooting, and even that I’d regulate pretty tightly.)
Now, it might be replied that the above argument can be turned on its head: “Should private citizens be allowed to own pocket knives?  Yes, they should?  OK then, whatever grounds you gave for that, I’d give precisely the same grounds for saying that they should be allowed to own assault weapons.”
But crucially, I claim that’s a losing argument for the gun-rights crowd.  For as soon as we’re anywhere on the slippery slope—that is, as soon as it’s conceded that the question hinges, not on absolute rights, but on actual tradeoffs in actual empirical reality—then the facts make it blindingly obvious that letting possibly-deranged private citizens buy assault weapons is only marginally less crazy than letting them buy ICBMs.
[Related Onion story]
Sorry for the long delay! I had to be in Washington D.C. this week, for reasons I’m not at liberty to disclose. (Yes, I’m serious, and no, it’s not as interesting as it sounds.) Oh: on my way back to Canada, for some strange reason they confiscated my Blistex. I guess airport security guards get chapped lips a lot.
As our world descends even further into war, terror, and Armageddon, I have an exciting complexity-theoretic announcement. Building on the Complexity Zoo, Greg Kuperberg has created a “Robozoologist”: an expert system for reasoning about complexity classes. What’s more, Greg is releasing some spinoffs of his project to the masses, including a JavaScript-powered inclusion graph, and an automatically-generated RoboZoo. I can still remember them frontier days of 2002, when I had to herd the BP operators with my two bare hands…
Forgive me if this post isn’t particularly timely — I just started blogging, so I’m still clearing out my cognitive backlog.
A month ago, the economist Steven Landsburg wrote a Slate column arguing that we shouldn’t help Hurricane Katrina victims too much. His reasoning? Presumably, the hurricane risk in New Orleans and surrounding areas was already reflected in property values being lower than what they would have been were there no such risk. So if the US spends federal tax dollars on hurricane relief, then it’s artificially subsidizing people who choose to live in hurricane-prone areas — thereby
1. raising taxes for everyone, including those who live in “safe” areas, and
2. raising property values in the hurricane-prone areas, which limits people’s freedom to select cheap but risky housing over expensive but safer housing.
I’d had some pleasant correspondence with Landsburg in the past, so I emailed him to say that, while I could find no flaw in his logic, I was confused as to why he didn’t take the argument even further. For example, what are fire departments, if not an artificial subsidy for people who choose to live in wooden houses rather than stone ones? And police departments? Clearly a lose-lose proposition. If you have a personal bodyguard, then you’re forced to pay for protection you don’t need. And if you don’t have a bodyguard, then you’re deprived of the freedom to choose lower taxes in exchange for having no one to call if you get stabbed.
See, in my view, if you’re going to be a radical libertarian, then you might as well go all the way. For — just like the denial of relief to hurricane victims — such consistency makes all parties better off than otherwise. Those willing to follow you all the way into Galt’s Gulch get the genuine Ayn Rand experience, with no wussy collectivist compromises. And for others, you’re all the more valuable as a walking, talking reductio ad absurdum.
[Update (8/26): Inspired by the great responses to my last Physics StackExchange question, I just asked a new one—also about the possibilities for gravitational decoherence, but now focused on Gambini et al.’s “Montevideo interpretation” of quantum mechanics.
Also, on a completely unrelated topic, my friend Jonah Sinick has created a memorial YouTube video for the great mathematician Bill Thurston, who sadly passed away last week.  Maybe I should cave in and set up a Twitter feed for this sort of thing…]
[Update (8/26): I’ve now posted what I see as one of the main physics questions in this discussion on Physics StackExchange: “Reversing gravitational decoherence.”  Check it out, and help answer if you can!]
[Update (8/23): If you like this blog, and haven’t yet read the comments on this post, you should probably do so!  To those who’ve complained about not enough meaty quantum debates on this blog lately, the comment section of this post is my answer.]
[Update: Argh!  For some bizarre reason, comments were turned off for this post.  They’re on now.  Sorry about that.]
I’m in Anaheim, CA for a great conference celebrating the 80th birthday of the physicist Yakir Aharonov.  I’ll be happy to discuss the conference in the comments if people are interested.
In the meantime, though, since my flight here was delayed 4 hours, I decided to (1) pass the time, (2) distract myself from the inanities blaring on CNN at the airport gate, (3) honor Yakir’s half-century of work on the foundations of quantum mechanics, and (4) honor the commenters who wanted me to stop ranting and get back to quantum stuff, by sharing some thoughts about a topic that, unlike gun control or the Olympics, is completely uncontroversial: the Many-Worlds Interpretation of quantum mechanics.
Proponents of MWI, such as David Deutsch, often argue that MWI is a lot like Copernican astronomy: an exhilarating expansion in our picture of the universe, which follows straightforwardly from Occam’s Razor applied to certain observed facts (the motions of the planets in one case, the double-slit experiment in the other).  Yes, many holdouts stubbornly refuse to accept the new picture, but their skepticism says more about sociology than science.  If you want, you can describe all the quantum-mechanical experiments anyone has ever done, or will do for the foreseeable future, by treating “measurement” as an unanalyzed primitive and never invoking parallel universes.  But you can also describe all astronomical observations using a reference frame that places the earth at the center of the universe.  In both cases, say the MWIers, the problem with your choice is its unmotivated perversity: you mangle the theory’s mathematical simplicity, for no better reason than a narrow parochial urge to place yourself and your own experiences at the center of creation.  The observed motions of the planets clearly want a sun-centered model.  In the same way, Schrödinger’s equation clearly wants measurement to be just another special case of unitary evolution—one that happens to cause your own brain and measuring apparatus to get entangled with the system you’re measuring, thereby “splitting” the world into decoherent branches that will never again meet.  History has never been kind to people who put what they want over what the equations want, and it won’t be kind to the MWI-deniers either.
This is an important argument, which demands a response by anyone who isn’t 100% on-board with MWI.  Unlike some people, I happily accept this argument’s framing of the issue: no, MWI is not some crazy speculative idea that runs afoul of Occam’s razor.  On the contrary, MWI really is just the “obvious, straightforward” reading of quantum mechanics itself, if you take quantum mechanics literally as a description of the whole universe, and assume nothing new will ever be discovered that changes the picture.
Nevertheless, I claim that the analogy between MWI and Copernican astronomy fails in two major respects.
The first is simply that the inference, from interference experiments to the reality of many-worlds, strikes me as much more “brittle” than the inference from astronomical observations to the Copernican system, and in particular, too brittle to bear the weight that the MWIers place on it.  Once you know anything about the dynamics of the solar system, it’s hard to imagine what could possibly be discovered in the future, that would ever again make it reasonable to put the earth at the “center.”  By contrast, we do more-or-less know what could be discovered that would make it reasonable to privilege “our” world over the other MWI branches.  Namely, any kind of “dynamical collapse” process, any source of fundamentally-irreversible decoherence between the microscopic realm and that of experience, any physical account of the origin of the Born rule, would do the trick.
Admittedly, like most quantum folks, I used to dismiss the notion of “dynamical collapse” as so contrived and ugly as not to be worth bothering with.  But while I remain unimpressed by the specific models on the table (like the GRW theory), I’m now agnostic about the possibility itself.  Yes, the linearity of quantum mechanics does indeed seem incredibly hard to tinker with.  But as Roger Penrose never tires of pointing out, there’s at least one phenomenon—gravity—that we understand how to combine with quantum-mechanical linearity only in various special cases (like 2+1 dimensions, or supersymmetric anti-deSitter space), and whose reconciliation with quantum mechanics seems to raise fundamental problems (i.e., what does it even mean to have a superposition over different causal structures, with different Hilbert spaces potentially associated to them?).
To make the discussion more concrete, consider the proposed experiment of Bouwmeester et al., which seeks to test (loosely) whether one can have a coherent superposition over two states of the gravitational field that differ by a single Planck length or more.  This experiment hasn’t been done yet, but some people think it will become feasible within a decade or two.  Most likely it will just confirm quantum mechanics, like every previous attempt to test the theory for the last century.  But it’s not a given that it will; quantum mechanics has really, truly never been tested in this regime.  So suppose the interference pattern isn’t seen.  Then poof!  The whole vast ensemble of parallel universes spoken about by the MWI folks would have disappeared with a single experiment.  In the case of Copernicanism, I can’t think of any analogous hypothetical discovery with even a shred of plausibility: maybe a vector field that pervades the universe but whose unique source was the earth?  So, this is what I mean in saying that the inference from existing QM experiments to parallel worlds seems too “brittle.”
As you might remember, I wagered $100,000 that scalable quantum computing will indeed turn out to be compatible with the laws of physics.  Some people considered that foolhardy, and they might be right—but I think the evidence seems pretty compelling that quantum mechanics can be extrapolated at least that far.  (We can already make condensed-matter states involving entanglement among millions of particles; for that to be possible but not quantum computing would seem to require a nasty conspiracy.)  On the other hand, when it comes to extending quantum-mechanical linearity all the way up to the scale of everyday life, or to the gravitational metric of the entire universe—as is needed for MWI—even my nerve falters.  Maybe quantum mechanics does go that far up; or maybe, as has happened several times in physics when exploring a new scale, we have something profoundly new to learn.  I wouldn’t give much more informative odds than 50/50.
The second way I’d say the MWI/Copernicus analogy breaks down arises from a closer examination of one of the MWIers’ favorite notions: that of “parochial-ness.”  Why, exactly, do people say that putting the earth at the center of creation is “parochial”—given that relativity assures us that we can put it there, if we want, with perfect mathematical consistency?  I think the answer is: because once you understand the Copernican system, it’s obvious that the only thing that could possibly make it natural to place the earth at the center, is the accident of happening to live on the earth.  If you could fly a spaceship far above the plane of the solar system, and watch the tiny earth circling the sun alongside Mercury, Venus, and the sun’s other tiny satellites, the geocentric theory would seem as arbitrary to you as holding Cheez-Its to be the sole aim and purpose of human civilization.  Now, as a practical matter, you’ll probably never fly that spaceship beyond the solar system.  But that’s irrelevant: firstly, because you can very easily imagine flying the spaceship, and secondly, because there’s no in-principle obstacle to your descendants doing it for real.
Now let’s compare to the situation with MWI.  Consider the belief that “our” universe is more real than all the other MWI branches.  If you want to describe that belief as “parochial,” then from which standpoint is it parochial?  The standpoint of some hypothetical godlike being who sees the entire wavefunction of the universe?  The problem is that, unlike with my solar system story, it’s not at all obvious that such an observer can even exist, or that the concept of such an observer makes sense.  You can’t “look in on the multiverse from the outside” in the same way you can look in on the solar system from the outside, without violating the quantum-mechanical linearity on which the multiverse picture depends in the first place.
The closest you could come, probably, is to perform a Wigner’s friend experiment, wherein you’d verify via an interference experiment that some other person was placed into a superposition of two different brain states.  But I’m not willing to say with confidence that the Wigner’s friend experiment can even be done, in principle, on a conscious being: what if irreversible decoherence is somehow a necessary condition for consciousness?  (We know that increase in entropy, of which decoherence is one example, seems intertwined with and possibly responsible for our subjective sense of the passage of time.)  In any case, it seems clear that we can’t talk about Wigner’s-friend-type experiments without also talking, at least implicitly, about consciousness and the mind/body problem—and that that fact ought to make us exceedingly reluctant to declare that the right answer is obvious and that anyone who doesn’t see it is an idiot.  In the case of Copernicanism, the “flying outside the solar system” thought experiment isn’t similarly entangled with any of the mysteries of personal identity.
There’s a reason why Nobel Prizes are regularly awarded for confirmations of effects that were predicted decades earlier by theorists, and that therefore surprised almost no one when they were finally found.  Were we smart enough, it’s possible that we could deduce almost everything interesting about the world a priori.  Alas, history has shown that we’re usually not smart enough: that even in theoretical physics, our tendencies to introduce hidden premises and to handwave across gaps in argument are so overwhelming that we rarely get far without constant sanity checks from nature.
I can’t think of any better summary of the empirical attitude than the famous comment by Donald Knuth: “Beware of bugs in the above code.  I’ve only proved it correct; I haven’t tried it.”  In the same way, I hereby declare myself ready to support MWI, but only with the following disclaimer: “Beware of bugs in my argument for parallel copies of myself.  I’ve only proved that they exist; I haven’t heard a thing from them.”
If you hadn’t been reading the comments on my last post, you might not know that my old chum Mahmoud Ahmadinejad had launched his own blog on Sunday. Along with a rambling autobiography, this exciting new blog (which I’ve added to my linklog on the right) also includes a poll:
> Do you think that the US and Israeli intention and goal by attacking Lebanon is pulling the trigger for another word [sic] war?
When I first visited, only 5% had voted “yes”, though it’s now up to 50%.
But wait, it gets better: if Mahmoud’s site identifies your IP address as coming from Israel, then it tries to install a virus on your computer by exploiting an Internet Explorer vulnerability. (Thanks to an anonymous commenter for bringing this to my attention.)
I suppose we should be grateful that, at least for now, defending oneself against the modern-day Hitler is as simple as installing Firefox.
Yes.
Over at Theoretical Computer Science StackExchange, an entertaining debate has erupted about the meaning and validity of the Church-Turing Thesis.  The prompt for this debate was a question asking for opinions about Peter Wegner and Dina Goldin’s repetitive diatribes claiming to refute “the myth of the Church-Turing Thesis”—on the grounds that, you see, Turing machines can only handle computations with static inputs and outputs, not interactivity, or programs like operating systems that run continuously.  For a demolition of this simple misunderstanding, see Lance Fortnow’s CACM article.  Anyway, I wrote my own parodic response to the question, which generated so many comments that the moderators started shooing people away.  So I decided to repost my answer on my blog.  That way, after you’re done upvoting my answer over at CS Theory StackExchange :-), you can come back here and continue the discussion in the comments section.
* * *
Here’s my favorite analogy. Suppose I spent a decade publishing books and papers arguing that, contrary to theoretical computer science’s dogma, the Church-Turing Thesis fails to capture all of computation, because Turing machines can’t toast bread. Therefore, you need my revolutionary new model, the Toaster-Enhanced Turing Machine (TETM), which allows bread as a possible input and includes toasting it as a primitive operation.
You might say: sure, I have a “point”, but it’s a totally uninteresting one. No one ever claimed that a Turing machine could handle every possible interaction with the external world, without first hooking it up to suitable peripherals. If you want a Turing machine to toast bread, you need to connect it to a toaster; then the TM can easily handle the toaster’s internal logic (unless this particular toaster requires solving the halting problem or something like that to determine how brown the bread should be!). In exactly the same way, if you want a TM to handle interactive communication, then you need to hook it up to suitable communication devices, as Neel discussed in his answer. In neither case are we saying anything that wouldn’t have been obvious to Turing himself.
So, I’d say the reason why there’s been no “followup” to Wegner and Goldin’s diatribes is that theoretical computer science has known how to model interactivity whenever needed, and has happily done so, since the very beginning of the field.
Update (8/30): A related point is as follows. Does it ever give the critics pause that, here inside the Elite Church-Turing Ivory Tower (the ECTIT), the major research themes for the past two decades have included interactive proofs, multiparty cryptographic protocols, codes for interactive communication, asynchronous protocols for routing, consensus, rumor-spreading, leader-election, etc., and the price of anarchy in economic networks? If putting Turing’s notion of computation at the center of the field makes it so hard to discuss interaction, how is it that so few of us have noticed?
Another Update: To the people who keep banging the drum about higher-level formalisms being vastly more intuitive than TMs, and no one thinking in terms of TMs as a practical matter, let me ask an extremely simple question. What is it that lets all those high-level languages exist in the first place, that ensures they can always be compiled down to machine code? Could it be … err … THE CHURCH-TURING THESIS, the very same one you’ve been ragging on? To clarify, the Church-Turing Thesis is not the claim that “TURING MACHINEZ RULE!!” Rather, it’s the claim that any reasonable programming language will be equivalent in expressive power to Turing machines — and as a consequence, that you might as well think in terms of the higher-level languages if it’s more convenient to do so. This, of course, was a radical new insight 60-75 years ago.
Update (Sept. 6): Check out this awesome comment by Lou Scheffer, describing his own tale of conversion from a Church-Turing skeptic to believer, and making an extremely apt comparison to the experience of conversion to the belief that R, R^2, and so on all have the same cardinality (an experience I also underwent!).
1. Given an n-qubit pure state, is there always a way to apply Hadamard gates to some subset of the qubits, so as to make all 2^n computational basis states have nonzero amplitudes?  (For a small-scale numerical check, see the sketch after this list.)
2. Can we get any upper bound on QMIP (quantum multi-prover interactive proofs with unlimited prior entanglement)? It would suffice to show (for example) that the provers never need more than Ackermann(n) ebits of entanglement.
3. Can any QMA(2) (QMA with two unentangled yes-provers) protocol be amplified to exponentially small error probability? If you think the answer is trivially yes, think about it some more!
4. If a unitary operation U can be applied in polynomial time, then can some square root of U also be applied in polynomial time?
5. Suppose Alice and Bob are playing n parallel CHSH games, with no communication or entanglement. Is the probability that they’ll win all n games at most p^n, for some p bounded below 0.853?
6. Forget about an oracle relative to which BQP is not in PH. Forget about an oracle relative to which BQP is not in AM. Is there an oracle relative to which BQP is not in SZK?
7. Given any n-qubit unitary operation U, does there exist an oracle relative to which U can be (approximately) applied in polynomial time?
8. How many mutually unbiased bases are there in non-prime-power dimensions? (Alright, I don’t care about this one, but so many people do that I figured I’d put it in.)
9. Is there an n-qubit pure state that can be prepared by a circuit of size n^3, and that can’t be distinguished from the maximally mixed state by any circuit of size n^2?
10. Fill this space with your own annoying question! Here are the rules: the question must involve quantum. It must be annoying. It must be clearly-stated — no open-ended pontificating allowed. It can’t be an Everest of the field, like graph isomorphism or increasing the fault-tolerance threshold. Instead it should be a dinky little molehill, that’s nevertheless caused all would-be climbers to fall flat on their asses.
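For anyone who wants to poke at question 1 numerically, here’s a minimal brute-force sketch in Python/NumPy (my own toy check, proving nothing about the general question): it enumerates every subset of qubits, Hadamards exactly those qubits, and tests whether all 2^n amplitudes come out nonzero. The function names are mine, and this is only feasible for tiny n.

```python
import itertools
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)

def hadamard_on_subset(n, subset):
    """Tensor product that applies H to the qubits in `subset` and identity elsewhere."""
    op = np.array([[1.0]])
    for q in range(n):
        op = np.kron(op, H if q in subset else I)
    return op

def exists_good_subset(state, tol=1e-9):
    """Brute-force check for one state: is there a subset of qubits such that
    Hadamarding exactly those qubits leaves every computational-basis amplitude nonzero?"""
    n = int(round(np.log2(len(state))))
    for r in range(n + 1):
        for subset in itertools.combinations(range(n), r):
            if np.all(np.abs(hadamard_on_subset(n, set(subset)) @ state) > tol):
                return True
    return False

rng = np.random.default_rng(0)
for _ in range(5):
    psi = rng.normal(size=8) + 1j * rng.normal(size=8)        # random 3-qubit states
    print(exists_good_subset(psi / np.linalg.norm(psi)))
print(exists_good_subset(np.eye(8, dtype=complex)[0]))        # |000>: Hadamarding all three qubits works
```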
The Pennsylvania Governor’s School for the Sciences (PGSS) was an incredibly-successful summer program for gifted high school students in my birth-state of Pennsylvania.  PGSS ran from 1982 to 2009 and then was shuttered due to state budget cuts.  A group of alumni is now trying to raise enough private funds to restart the program (they need $100,000).  Please visit their site, watch their video, and make a small (or large) donation if you feel moved to.
In other news, I’ll be speaking at a workshop on Quantum Information Science in Computer and Natural Sciences, organized by Umesh Vazirani and Carl Williams, to be held September 28-29 at the University of Maryland College Park.  This workshop is specifically designed for computer scientists, mathematicians, physicists, and others who haven’t worked in quantum information, but who’d like to know more about current research in the area, and to look for connections between quantum information and their own fields.  Umesh writes:
The initiative comes at a particularly opportune moment for researchers in complexity theory, given the increasing relevance of quantum techniques in complexity theory — the 2-4 norm paper of Barak, et al (SDPs, Lasserre), exponential lower bounds for TSP polytope via quantum communication complexity arguments (de Wolf et al), quantum Hamiltonian complexity as a generalization of  CSPs, lattice-based cryptography whose security is based on quantum arguments, etc.
Hope to see some of you there!
Update (10/10).  In case anyone is interested, here’s a comment I posted over at Cosmic Variance, responding to a question about the relevance of Haroche and Wineland’s work for the interpretation of quantum mechanics.
> The experiments of Haroche and Wineland, phenomenal as they are, have zero implications one way or the other for the MWI/Copenhagen debate (nor, for that matter, for third-party candidates like Bohm 🙂 ). In other words, while doing these experiments is a tremendous challenge requiring lots of new ideas, no sane proponent of any interpretation would have made predictions for their outcomes other than the ones that were observed. To do an experiment about which the proponents of different interpretations might conceivably diverge, it would be necessary to try to demonstrate quantum interference in a much, much larger system — for example, a brain or an artificially-intelligent quantum computer. And even then, the different interpretations arguably don’t make differing predictions about what the published results of such an experiment would be. If they differ at all, it’s in what they claim, or refuse to claim, about the experiences of the subject of the experiment, while the experiment is underway. But if quantum mechanics is right, then the subject would necessarily have forgotten those experiences by the end of the experiment — since otherwise, no interference could be observed!
>
> So, yeah, barring any change to the framework of quantum mechanics itself, it seems likely that people will be arguing about its interpretation forever. Sorry about that. 🙂
* * *
Where is he?  So many wild claims being leveled, so many opportunities to set the record straight, and yet he completely fails to respond.  Where’s the passion he showed just four years ago?  Doesn’t he realize that having the facts on his side isn’t enough, has never been enough?  It’s as if his mind is off somewhere else, or as if he’s tired of his role as a public communicator and no longer feels like performing it.  Is his silence part of some devious master plan?  Is he simply suffering from a lack of oxygen in the brain?  What’s going on?
Yeah, yeah, I know.  I should blog more.  I’ll have more coming soon, but for now, two big announcements related to quantum computing.
Today the 2012 Nobel Prize in Physics was awarded jointly to Serge Haroche and David Wineland, “for ground-breaking experimental methods that enable measuring and manipulation of individual quantum systems.”  I’m not very familiar with Haroche’s work, but I’ve known of Wineland for a long time as possibly the top quantum computing experimentalist in the business, setting one record after another in trapped-ion experiments.  In awarding this prize, the Swedes have recognized the phenomenal advances in atomic, molecular, and optical physics that have already happened over the last two decades, largely motivated by the goal of building a scalable quantum computer (along with other, not entirely unrelated goals, like more accurate atomic clocks).  In so doing, they’ve given what’s arguably the first-ever “Nobel Prize for quantum computing research,” without violating their policy to reward only work that’s been directly confirmed by experiment.  Huge congratulations to Haroche and Wineland!!
In other quantum computing developments: yes, I’m aware of the latest news from D-Wave, which includes millions of dollars in new funding from Jeff Bezos (the founder of Amazon.com, recipients of a large fraction of my salary).  Despite having officially retired as Chief D-Wave Skeptic, I posted a comment on Tom Simonite’s article in MIT Technology Review, and also sent the following email to a journalist.
> I’m probably not a good person to comment on the “business” aspects of D-Wave.  They’ve been extremely successful raising money in the past, so it’s not surprising to me that they continue to be successful.  For me, three crucial points to keep in mind are:
>
> (1) D-Wave still hasn’t demonstrated 2-qubit entanglement, which I see as one of the non-negotiable “sanity checks” for scalable quantum computing.  In other words: if you’re producing entanglement, then you might or might not be getting quantum speedups, but if you’re not producing entanglement, then our current understanding fails to explain how you could possibly be getting quantum speedups.
>
> (2) Unfortunately, the fact that D-Wave’s machine solves some particular problem in some amount of time, and a specific classical computer running (say) simulated annealing took more time, is not (by itself) good evidence that D-Wave was achieving the speedup because of quantum effects.  Keep in mind that D-Wave has now spent ~$100 million and ~10 years of effort on a highly-optimized, special-purpose computer for solving one specific optimization problem.  So, as I like to put it, quantum effects could be playing the role of “the stone in a stone soup”: attracting interest, investment, talented people, etc. to build a device that performs quite well at its specialized task, but not ultimately because of quantum coherence in that device.
>
> (3) The quantum algorithm on which D-Wave’s business model is based — namely, the quantum adiabatic algorithm — has the property that it “degrades gracefully” to classical simulated annealing when the decoherence rate goes up.  This, fundamentally, is the thing that makes it difficult to know what role, if any, quantum coherence is playing in the performance of their device.  If they were trying to use Shor’s algorithm to factor numbers, the situation would be much more clear-cut: a decoherent version of Shor’s algorithm just gives you random garbage.  But a decoherent version of the adiabatic algorithm still gives you a pretty good (but now essentially “classical”) algorithm, and that’s what makes it hard to understand what’s going on here.
>
> As I’ve said before, I no longer feel like playing an adversarial role.  I really, genuinely hope D-Wave succeeds.  But the burden is on them to demonstrate that their device uses quantum effects to obtain a speedup, and they still haven’t met that burden.  When and if the situation changes, I’ll be happy to say so.  Until then, though, I seem to have the unenviable task of repeating the same observation over and over, for 6+ years, and confirming that, no, the latest sale, VC round, announcement of another “application” (which, once again, might or might not exploit quantum effects), etc., hasn’t changed the truth of that observation.
>
> Best,
> Scott
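As an aside, for readers wondering what the “essentially classical” algorithm in point (3) of the email looks like: below is a minimal, generic simulated-annealing sketch on a random Ising-type cost function. It’s purely illustrative; the instance, couplings, cooling schedule, and problem size are all invented for this example, and nothing here models D-Wave’s hardware.

```python
import math
import random

def simulated_annealing(J, h, sweeps, rng):
    """Plain classical simulated annealing on an Ising cost function
    E(s) = sum_{i<j} J[i][j] s_i s_j + sum_i h[i] s_i  with s_i in {-1,+1}."""
    n = len(h)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    def energy(spins):
        return (sum(J[i][j] * spins[i] * spins[j] for i in range(n) for j in range(i + 1, n))
                + sum(h[i] * spins[i] for i in range(n)))
    E = energy(s)
    for t in range(sweeps):
        T = max(1e-3, 3.0 * (1 - t / sweeps))   # simple linear cooling schedule
        for i in range(n):
            # Energy change from flipping spin i
            dE = -2 * s[i] * (sum(J[min(i, j)][max(i, j)] * s[j] for j in range(n) if j != i) + h[i])
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                s[i] = -s[i]
                E += dE
    return s, E

rng = random.Random(0)
n = 30
J = [[rng.choice((-1.0, 1.0)) if j > i else 0.0 for j in range(n)] for i in range(n)]
h = [0.0] * n
best = min(simulated_annealing(J, h, sweeps=200, rng=rng)[1] for _ in range(5))
print("best energy found over 5 random restarts:", best)
```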
I’ve decided to “launch” my latest paper on the blogosphere even before posting it to quant-ph — so that you, my loyal readers, can be the very first to lay eyes on it. (True fact: as I was writing, I didn’t look once at the screen.) Comments more than welcome.
> The Learnability of Quantum States [PS] [PDF]
> Scott Aaronson
>
> Abstract: Let me warn you up-front, this is one big-ass mother of a paper. It’s got learning. It’s got quantum. It’s got philosophy. It’s got weird complexity classes (naturally). It’s even got experimental physics applications (don’t worry, I showered afterward). And dude. When I say “experimental,” I’m not talking wormholes or anthropic postselection. I’m talking stuff that you, the quantum state tomographer, can try in your lab today. And no, this is not the real abstract.
Update (8/20): I’ve posted a slightly revised version, mostly in response to the comments I received here.
Update (10/31): While I continue to engage in surreal arguments in the comments section—Scott, I’m profoundly disappointed that a scientist like you, who surely knows better, would be so sloppy as to assert without any real proof that just because it has tusks and a trunk, and looks and sounds like an elephant, and is the size of the elephant, that it therefore is an elephant, completely ignoring the blah blah blah blah blah—while I do that, there are a few glimmerings that the rest of the world is finally starting to get it.  A new story from The Onion, which I regard as almost the only real newspaper left:
## Nation Suddenly Realizes This Just Going To Be A Thing That Happens From Now On
Update (11/1): OK, and this morning from Nicholas Kristof, who’s long been one of the rare non-Onion practitioners of journalism: Will Climate Get Some Respect Now?
* * *
I’m writing from the abstract, hypothetical future that climate-change alarmists talk about—the one where huge tropical storms batter the northeastern US, coastal cities are flooded, hundreds of thousands are evacuated from their homes, etc.  I always imagined that, when this future finally showed up, at least I’d have the satisfaction of seeing the deniers admit they were grievously wrong, and that I and those who think similarly were right.  Which, for an academic, is a satisfaction that has to be balanced carefully against the possible destruction of the world.  I don’t think I had the imagination to foresee that the prophesied future would actually arrive, and that climate change would simultaneously disappear as a political issue—with the forces of know-nothingism bolder than ever, pressing their advantage into questions like whether or not raped women can get pregnant, as the President weakly pleads that he too favors more oil drilling.  I should have known from years of blogging that, if you hope for the consolation of seeing those who are wrong admit to being wrong, you hope for a form of happiness all but unattainable in this world.
Yet, if the transformation of the eastern seaboard into something out of the Jurassic hasn’t brought me that satisfaction, it has brought a different, completely unanticipated benefit.  Trapped in my apartment, with the campus closed and all meetings cancelled, I’ve found, for the first time in months, that I actually have some time to write papers.  (And, well, blog posts.)  Because of this, part of me wishes that the hurricane would continue all week, even a month or two (minus, of course, the power outages, evacuations, and other nasty side effects).  I could learn to like this future.
At this point in the post, I was going to transition cleverly into an almost (but not completely) unrelated question about the nature of causality.  But I now realize that the mention of hurricanes and (especially) climate change will overshadow anything I have to say about more abstract matters.  So I’ll save the causality stuff for tomorrow or Wednesday.  Hopefully the hurricane will still be here, and I’ll have time to write.
Alright, I can give an oracle relative to which BQP is not in SZK, thereby knocking off one of the Ten Most Annoying Questions in Quantum Computing.
It’s a forehead-slapper. Just take the problem from the paper Exponential algorithmic speedup by quantum walk by Andrew Childs et al. Here the oracle encodes an exponentially large graph, consisting of two binary trees conjoined at the leaves by a random cycle:
(I hope Childs et al. will forgive me for swiping their graphic.)
Each vertex is labeled by a random string, and given the label of a vertex, the oracle tells us the labels of its neighbors. Then, given the label of the Entrance vertex, the problem is to decide (let’s say) whether the label of the Exit vertex starts with a 1 or a 0.
Childs et al. proved that this oracle problem is in BQP but not in BPP. Intuitively, any classical random walk on the graph will get stuck for an exponentially long time in the enormous middle region, but because of interference effects, a quantum walk will tunnel right through to the Exit vertex with 1/polynomial probability.
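To see the classical side of this concretely, here’s a small Python sketch (my own toy version, with a simplified random-cycle construction rather than the paper’s exact one) that builds the conjoined-trees graph and runs plain classical random walks from the Entrance. Within a fixed step budget, the fraction of walks that ever reach the Exit should drop off sharply as the depth grows; this is just an illustration of the “stuck in the middle” phenomenon, not a proof of anything.

```python
import random

def glued_trees(depth, rng):
    """Two complete binary trees of the given depth, with their leaf layers joined
    by a random cycle that alternates sides (a simplified take on Childs et al.)."""
    adj = {}
    def edge(u, v):
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    size = 2 ** (depth + 1) - 1                  # nodes per tree, heap-indexed 1..size
    for side in "LR":
        for i in range(1, 2 ** depth):           # internal nodes
            edge((side, i), (side, 2 * i))
            edge((side, i), (side, 2 * i + 1))
    left = [("L", i) for i in range(2 ** depth, size + 1)]
    right = [("R", i) for i in range(2 ** depth, size + 1)]
    rng.shuffle(right)
    cycle = [leaf for pair in zip(left, right) for leaf in pair]   # alternating L,R,L,R,...
    for j in range(len(cycle)):
        edge(cycle[j], cycle[(j + 1) % len(cycle)])
    return adj

def hits_exit(adj, max_steps, rng):
    """Run one classical random walk from the Entrance; True if it reaches the Exit in time."""
    pos, exit_vertex = ("L", 1), ("R", 1)
    for _ in range(max_steps):
        pos = rng.choice(adj[pos])
        if pos == exit_vertex:
            return True
    return False

rng = random.Random(0)
for depth in (4, 10, 16):
    adj = glued_trees(depth, rng)
    hits = sum(hits_exit(adj, max_steps=5000, rng=rng) for _ in range(20))
    print(f"depth {depth}: {hits}/20 walks reached the Exit within 5000 steps")
```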
Now, it’s easy to generalize their proof that the problem is not in BPP, to show that it’s not in SZK. One way to see this is that, for a prover to convince a verifier of the solution, the prover will (basically) have to reveal where the Exit vertex is, thereby violating the zero-knowledge property. Another way to see it is that, if we consider the Sahai-Vadhan characterization of SZK in terms of the Statistical Difference problem, then neither of the two distributions we’re comparing will depend non-negligibly on the Exit vertex.
Disappointingly, this solution is way too trivial to publish, and almost too trivial even to blog. On the other hand, so far I’ve been unable to extend the solution to get an oracle relative to which BQP is not in AM. Every variant of the problem I’ve come up with is in AM intersect coAM, sometimes for non-obvious reasons. Anyone want to help me?
The following question emerged from a conversation with the machine learning theorist Pedro Domingos a month ago.
Consider a hypothetical race of intelligent beings, the Armchairians, who never take any actions: never intervene in the world, never do controlled experiments, never try to build anything and see if it works.  The sole goal of the Armchairians is to observe the world around them and, crucially, to make accurate predictions about what’s going to happen next.  Would the Armchairians ever develop the notion of cause and effect?  Or would they be satisfied with the notion of statistical correlation?  Or is the question kind of silly, the answer depending entirely on what we mean by “developing the notion of cause and effect”?  Feel free to opine away in the comments section.
So, this year’s Fields Medals go to Terence Tao and Grisha Perelman (duhhhh), as well as to Andrei Okounkov and Wendelin Werner. The Nevanlinna Prize goes to an already-prize-bedecked Jon Kleinberg, my professor at Cornell way back in ’97. Congratulations to all!
Meanwhile, there’s a long article in yesterday’s New Yorker about Perelman and the Poincaré conjecture, by Sylvia Nasar (the media’s go-to person for reclusive mathematical geniuses) and David Gruber. Unfortunately the article’s not on the web, but fearless detective that I am, I was able to track it down in a so-called “bookstore.”
Nasar and Gruber find Perelman in a St. Petersburg apartment, where he lives with his mom, doesn’t check his mail, and just generally makes Andrew Wiles look like a hard-partying, elliptic-curve-modularizing regular dude. Perelman is nevertheless happy to grant Nasar and Gruber an interview, to confirm that he intends to be the first person in history to turn down the Fields, and to complain about his fellow mathematicians’ lax ethical standards.
What exactly is he talking about? It wasn’t clear to me, but Nasar and Gruber devote much of their article to an indictment of 1982 Fields Medalist Shing-Tung Yau, who they portray as trying to usurp credit from Perelman for the benefit of his students Xi-Ping Zhu and Huai-Dong Cao. (Zhu and Cao wrote a 328-page exposition of Perelman’s ideas, complementing other expositions by Bruce Kleiner and John Lott and by John Morgan and Gang Tian.) I have no idea to what extent, if any, the criticism of Yau is justified. But to my mind, failing to write up your result properly, and then getting upset when those who do write it up properly try to share credit, is a bit like leaving your wallet on the sidewalk and then shaking your head at human depravity when someone tries to steal it.
Nasar and Gruber also don’t comment on the obvious irony of Perelman’s “unworldliness”: that, by being such a fruitcake, he’s guaranteeing he’ll draw vastly more attention to himself than he would by just accepting the goddamned medal. (Feynman, though not exactly publicity-shy, employed similar reasoning to conclude that turning down the Nobel Prize would be a bad idea.) Indeed, supposing Perelman did aspire to celebrity status, my public-relations advice to him would be to do exactly what he’s doing right now.
Update: The New Yorker article is now online.
Update (Nov. 8): Slate’s pundit scoreboard.
* * *
Update (Nov. 6): In crucial election news, a Florida woman wearing an MIT T-shirt was barred from voting, because the election supervisor thought her shirt was advertising Mitt Romney.
* * *
At the time of writing, Nate Silver is giving Obama an 86.3% chance.  I accept his estimate, while vividly remembering various admittedly-cruder forecasts the night of November 5, 2000, which gave Gore an 80% chance.  (Of course, those forecasts need not have been “wrong”; an event with 20% probability really does happen 20% of the time.)  For me, the main uncertainties concern turnout and the effects of various voter-suppression tactics.
In the meantime, I wanted to call the attention of any American citizens reading this blog to the wonderful Election FAQ of Peter Norvig, director of research at Google and a person well-known for being right about pretty much everything.  The following passage in particular is worth quoting.
## Is it rational to vote?
Yes. Voting for president is one of the most cost-effective actions any patriotic American can take.
Let me explain what the question means. For your vote to have an effect on the outcome of the election, you would have to live in a decisive state, meaning a state that would give one candidate or the other the required 270th electoral vote. More importantly, your vote would have to break an exact tie in your state (or, more likely, shift the way that the lawyers and judges will sort out how to count and recount the votes). With 100 million voters nationwide, what are the chances of that? If the chance is so small, why bother voting at all?
Historically, most voters either didn’t worry about this problem, or figured they would vote despite the fact that they weren’t likely to change the outcome, or vote because they want to register the degree of support for their candidate (even a vote that is not decisive is a vote that helps establish whether or not the winner has a “mandate”). But then the 2000 Florida election changed all that, with its slim 537 vote (0.009%) margin.
What is the probability that there will be a decisive state with a very close vote total, where a single vote could make a difference? Statistician Andrew Gelman of Columbia University says about one in 10 million.
That’s a small chance, but what is the value of getting to break the tie? We can estimate the total monetary value by noting that President George W. Bush presided over a $3 trillion war and at least a $1 trillion economic melt-down. Senator Sheldon Whitehouse (D-RI) estimated the cost of the Bush presidency at $7.7 trillion. Let’s compromise and call it $6 trillion, and assume that the other candidate would have been revenue neutral, so the net difference of the presidential choice is $6 trillion.
The value of not voting is that you save, say, an hour of your time. If you’re an average American wage-earner, that’s about $20. In contrast, the value of voting is the probability that your vote will decide the election (1 in 10 million if you live in a swing state) times the cost difference (potentially $6 trillion). That means the expected value of your vote (in that election) was $600,000. What else have you ever done in your life with an expected value of $600,000 per hour? Not even Warren Buffett makes that much. (One caveat: you need to be certain that your contribution is positive, not negative. If you vote for a candidate who makes things worse, then you have a negative expected value. So do your homework before voting. If you haven’t already done that, then you’ll need to add maybe 100 hours to the cost of voting, and the expected value goes down to $6,000 per hour.)
I’d like to embellish Norvig’s analysis with one further thought experiment.  While I favor a higher figure, for argument’s sake let’s accept Norvig’s estimate that the cost George W. Bush inflicted on the country was something like $6 trillion.  Now, imagine that a delegation of concerned citizens from 2012 were able to go back in time to November 5, 2000, round up 538 lazy Gore supporters in Florida who otherwise would have stayed home, and bribe them to go to the polls.  Set aside the illegality of the time-travelers’ action: they’re already violating the laws of space, time, and causality, which are well-known to be considerably more reliable than Florida state election law!  Set aside all the other interventions that also would’ve swayed the 2000 election outcome, and the 20/20 nature of hindsight, and the insanity of Florida’s recount process.  Instead, let’s simply ask: how much should each of those 538 lazy Floridian Gore supporters have been paid, in order for the delegation from the future to have gotten its money’s worth?
The answer is a mind-boggling ~$10 billion per voter.  Think about that: just for peeling their backsides off the couch, heading to the local library or school gymnasium, and punching a few chads (all the way through, hopefully), each of those 538 voters would have instantly received the sort of wealth normally associated with Saudi princes or founders of Google or Facebook.  And the country and the world would have benefited from that bargain.
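Since these are exactly the kinds of numbers it’s easy to be off by a factor of a thousand on, here’s the back-of-the-envelope arithmetic as a few lines of Python. Every input is just a rough estimate quoted above (Gelman’s 1-in-10-million, the $6 trillion compromise figure, the 537-vote margin rounded up to the 538 voters needed to flip it), not data of any kind.

```python
p_decisive = 1e-7          # Gelman's ~1-in-10-million chance that a swing-state vote is decisive
presidency_delta = 6e12    # the "$6 trillion" compromise estimate for the cost difference
hours_to_vote = 1          # Norvig's assumed time cost of voting

ev_per_vote = p_decisive * presidency_delta
print(f"expected value of one swing-state vote: ${ev_per_vote:,.0f}")                 # ~$600,000
print(f"expected value per hour spent voting:   ${ev_per_vote / hours_to_vote:,.0f}")

florida_flippers = 538     # extra Gore voters needed to overturn the 537-vote margin
print(f"hindsight value per bribed Floridian:   ${presidency_delta / florida_flippers:,.0f}")  # ~$11 billion
```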
No, this isn’t really a decisive argument for anything (I’ll leave it to the commenters to point out the many possible objections).  All it is, is an image worth keeping in mind the next time someone knowingly explains to you why voting is a waste of time.
In a comment on my last post, Bram Cohen writes:
> This whole business of ‘formality’ and ‘review’ is really kind of dumb. A mathematical theorem is only really proven when a computer can verify the proof. Until then, it’s just hand-waving which has some degree of utility when generating a real proof.
>
> Were it standard to present proofs in computer-checkable form, there would be no review process at all. In fact it would be possible to send a proof to a theorem server which would automatically accept any proof which checked out. Had Perelman submitted to one of those, we wouldn’t have had any review process at all, and had complete confidence from day 1, and there wouldn’t be any of this stupid game of who really proved it by making the arguments sufficiently ‘formal’ or ‘detailed’.
>
> I view the switch to doing mathematics in the style just described as inevitable…
Like Bram, I also hope and expect that mathematicians will eventually switch to machine-readable proofs supplemented by human-readable explanations. That would certainly beat the current standard, proofs that are readable by neither machines nor humans.
So then why hasn’t it happened already? Probably the best way to answer this question is to show you the proof, in a state-of-the-art formal verification system called HOL Light, that the square root of 2 is irrational.
>
> let rational = new_definition
> `rational(r) = ?p q. ~(q = 0) /\ abs(r) = &p / &q`;;
>
> let NSQRT_2 = prove
> (`!p q. p * p = 2 * q * q ==> q = 0`,
> MATCH_MP_TAC num_WF THEN REWRITE_TAC[RIGHT_IMP_FORALL_THM] THEN
> REPEAT STRIP_TAC THEN FIRST_ASSUM(MP_TAC o AP_TERM `EVEN`) THEN
> REWRITE_TAC[EVEN_MULT; ARITH] THEN REWRITE_TAC[EVEN_EXISTS] THEN
> DISCH_THEN(X_CHOOSE_THEN `m:num` SUBST_ALL_TAC) THEN
> FIRST_X_ASSUM(MP_TAC o SPECL [`q:num`; `m:num`]) THEN
> POP_ASSUM MP_TAC THEN CONV_TAC SOS_RULE);;
>
> let SQRT_2_IRRATIONAL = prove
> (`~rational(sqrt(&2))`,
> SIMP_TAC[rational; real_abs; SQRT_POS_LE; REAL_POS; NOT_EXISTS_THM] THEN
> REPEAT GEN_TAC THEN DISCH_THEN(CONJUNCTS_THEN2 ASSUME_TAC MP_TAC) THEN
> DISCH_THEN(MP_TAC o AP_TERM `\x. x pow 2`) THEN
> ASM_SIMP_TAC[SQRT_POW_2; REAL_POS; REAL_POW_DIV; REAL_POW_2; REAL_LT_SQUARE;
> REAL_OF_NUM_EQ; REAL_EQ_RDIV_EQ] THEN
> ASM_MESON_TAC[NSQRT_2; REAL_OF_NUM_EQ; REAL_OF_NUM_MUL]);;
Cool — now let’s do Fermat and Poincaré! Any volunteers?
Seriously, the biggest accomplishments to date have included formal proofs of the Jordan Curve Theorem (75,000 lines) and the Prime Number Theorem (30,000 lines). If you want to know which other famous theorems have been formalized, check out this excellent page. Or look at these notes by Harvey Friedman, which cut through the crap and tell us exactly where things stand.
A huge part of the problem in this field seems to be that there’s neither a standard proof format nor a standard proof repository — no TeX or HTML, no arXiv or Wikipedia. Besides HOL Light, there’s also ProofPower, Isabelle, Coq, Mizar, and several other competitors. I’d probably go with Mizar, simply because the proofs in it look the most to me like actual math.
Friedman gives machine-readable proofs fifty years to catch on among “real” mathematicians. That seems about right — though the time could be reduced if the Don Knuth, Tim Berners-Lee, Paul Ginsparg, or Jimmy Wales of proof-checking were to appear between now and then. As usual, it mostly comes down to humans.
Update: Freek Wiedijk put together a fantastic online book, which shows the proofs that the square root of 2 is irrational in 17 different formal systems. The “QED Manifesto” is also worth a look. This manifesto makes it clear that there are people in the formal verification world with a broad enough vision — if you like, the Ted Nelsons of proof-checking. Nelson is the guy who dreamed in 1960 of creating a global hypertext network. In his case, it took 35 years for the dream to turn into software and protocols that people actually wanted to use (not that Nelson himself is at all happy with the result). How long will it take in the case of proof-checking?
Last Friday, I was at a “Symposium on the Nature of Proof” at UPenn, to give a popular talk about theoretical computer scientists’ expansions of the notion of mathematical proof (to encompass things like probabilistic, interactive, zero-knowledge, and quantum proofs).  This really is some of the easiest, best, and most fun material in all of CS theory to popularize.  Here are iTunes videos of my talk and the three others in the symposium: I’m video #2, logician Solomon Feferman is #3, attorney David Rudovsky is #4, and mathematician Dennis DeTurck is #5.  Also, here are my PowerPoint slides.  Thanks very much to Scott Weinstein at Penn for organizing the symposium.
In other news, the Complexity Zoo went down yet again this week, in a disaster that left vulnerable communities without access to vital resources like nondeterminism and multi-prover interaction.  Luckily, computational power has since been restored: with help from some volunteers, I managed to get the Zoo up and running again on my BlueHost account.  But while the content is there, it looks horrendously ugly; all the formatting seems to be gone.  And the day I agreed to let the Zoo be ported to MediaWiki was the day I lost the ability to fix such problems.  What I really need, going forward, is for someone else simply to take charge of maintaining the Zoo: it’s become painfully apparent both that it needs to be done and that I lack the requisite IT skills.  If you want to take a crack at it, here’s an XML dump of the Zoo from a few months ago (I don’t think it’s really changed since then).  You don’t even need to ask my permission: just get something running, and if it looks good, I’ll anoint you the next Zookeeper and redirect complexityzoo.com to point to your URL.
Update (Nov. 18): The Zoo is back up with the old formatting and graphics!!  Thanks so much to Charles Fu for setting up the new complexity-zoo.net (as well as Ethan, who set up a slower site that tided us over).  I’ve redirected complexityzoo.com to point to complexity-zoo.net, though it might take some time for your browser cache to clear.
If the world ends today, at least it won’t do so without three identical photons having been used to sample from a probability distribution defined in terms of the permanents of 3×3 matrices, thereby demonstrating the Aaronson-Arkhipov BosonSampling protocol.  And the results were obtained by no fewer than four independent experimental groups, some of whom have now published in Science.  One of the groups is based in Brisbane, Australia, one in Oxford, one in Vienna, and one in Rome; they coordinated to release their results the same day.  That’s right, the number of papers (4) that these groups managed to choreograph to appear simultaneously actually exceeds the number of photons that they so managed (3).  The Brisbane group was even generous enough to ask me to coauthor: I haven’t been within 10,000 miles of their lab, but I did try to make myself useful to them as a “complexity theory consultant.”
Here are links to the four experimental BosonSampling papers released in the past week:
* Experimental BosonSampling by Broome et al. (Brisbane)
* Experimental Boson Sampling by Tillmann et al. (Vienna)
* Experimental Boson Sampling by Walmsley et al. (Oxford)
* Experimental boson sampling in arbitrary integrated photonic circuits by Crespi et al. (Italy)
For those who want to know the theoretical background to this work:
* My and Alex’s original 100-page BosonSampling paper (to appear soon in the journal Theory of Computing)
* The 10-page STOC’2011 version of our paper
* My PowerPoint slides
* Alex’s slides
* Theoretical Computer Science StackExchange question and answer
* Gil Kalai’s blog post
* Old Shtetl-Optimized post
For those just tuning in, here are some popular-level articles about BosonSampling:
* Larry Hardesty’s MIT News article (from last year)
* University of Queensland press release
* Victorian counting device gets speedy quantum makeover (this week, from New Scientist; the article is not bad except that it ought to credit Alex Arkhipov)
* New Form of Quantum Computation Promises Showdown with Ordinary Computers, by Adrian Cho (from Science)
I’ll be happy to answer further questions in the comments; for now, here’s a brief FAQ:
Q: Why do you need photons in particular for these experiments?
A: What we need is identical bosons, whose transition amplitudes are given by the permanents of matrices.  If it were practical to do this experiment with Higgs bosons, they would work too!  But photons are more readily available.
Q: But a BosonSampling device isn’t really a “computer,” is it?
A: It depends what you mean by “computer”!  If you mean a physical system that you load input into, let evolve according to the laws of physics, then measure to get an answer to a well-defined mathematical problem, then sure, it’s a computer!   The only question is whether it’s a useful computer.  We don’t believe it can be used as a universal quantum computer—or even, for that matter, as a universal classical computer.  More than that, Alex and I weren’t able to show that solving the BosonSampling problem has any practical use for anything.  However, we did manage to amass evidence that, despite being useless, the BosonSampling problem is also hard (at least for a classical computer).  And for us, the hardness of classical simulation was the entire point.
Q: So, these experiments reported in Science this week  have done something that no classical computer could feasibly simulate?
A: No, a classical computer can handle the simulation of 3 photons without much—or really, any—difficulty.  This is only a first step: before this, the analogous experiment (called the Hong-Ou-Mandel dip) had only ever been performed with 2 photons, for which there’s not even any difference in complexity between the permanent and the determinant (i.e., between bosons and fermions).  However, if you could scale this experiment up to about 30 photons, then it’s likely that the experiment would be solving the BosonSampling problem faster than any existing classical computer (though the latter could eventually solve the problem as well).  And if you could scale it up to 100 photons, then you might never even know if your experiment was working correctly, because a classical computer would need such an astronomical amount of time to check the results.
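For readers who’d like to see the mathematical object at the center of all this, here’s a tiny Python sketch of the matrix permanent, computed by brute-force expansion over permutations (fine for 3×3; Ryser’s formula gets the cost down to roughly 2^n, but nothing known does dramatically better). The matrix below is just a random complex stand-in; in the actual experiments, the relevant matrices are 3×3 submatrices of the interferometer’s unitary.

```python
import itertools
import numpy as np

def permanent(A):
    """Matrix permanent by direct expansion: like the determinant, but with every
    permutation weighted +1 instead of by its sign.  Cost grows like n!."""
    n = A.shape[0]
    return sum(np.prod([A[i, sigma[i]] for i in range(n)])
               for sigma in itertools.permutations(range(n)))

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))   # stand-in 3x3 complex matrix
print("permanent   (bosons):  ", permanent(M))
print("determinant (fermions):", np.linalg.det(M))
```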
The most important research question in astronomy, to judge from the news websites, is neither the nature of dark matter and energy, nor the origin of the Pioneer anomaly or gamma-ray bursts beyond the GZK cutoff, nor the possible existence of Earth-like extrasolar planets. No, the big question is whether Pluto is “really” a planet, and if so, whether Charon and Ceres are “really” planets, and whether something has to be round to be a planet, and if so, how round.
I was going to propose we bring in Wittgenstein to settle this. But I guess the astronomers have already “ruled.”
Richard Dawkins often rails against what he calls the “tyranny of the discontinuous mind.” As far as I know, he’s not complaining about those of us who like our Hilbert spaces finite-dimensional and our quantum gravity theories discrete. Rather, he’s complaining about those who insist on knowing, for every humanoid fossil, whether it’s “really” human or “really” an ape. Ironically, it’s often the same people who then complain about the “embarrassing lack of transitional forms”!
Can anyone suggest a word for a person obsessed with drawing firm but arbitrary lines through a real-valued parameter space? (“Lawyer” is already taken.) I’ve already figured out the word for a debate about such lines, like the one we saw in Prague: chasmgasm.
(Note: The “2!” in the title of this post really does mean “2 factorial,” if you want it to.)
With the end of the semester upon us, it’s time for a once-every-two-year tradition: showcasing student projects from my 6.845 Quantum Complexity Theory graduate course at MIT.  For my previous showcase, in 2010, I chose six projects that I thought were especially outstanding.  This year, however, there were so many great projects—and so many, in particular, that could actually be useful to people in quantum computing—that I decided simply to open up the showcase to the whole class.  I had 17 takers; their project reports and 10-minute presentation slides are below.
Let me mention a few projects that tried to do something new and audacious.  Jenny Barry generalizes the notion of Partially Observable Markov Decision Processes (POMDPs) to the quantum case, and uses a recent result of Eisert et al., showing that certain problems in quantum measurement theory are undecidable (like, literally Turing-undecidable), to show that goal state reachability for “QOMDPs” is also Turing-undecidable (despite being decidable for classical POMDPs).  Matt Falk suggests a novel quantum algorithm for spatial search on the 2D grid, and gives some numerical evidence that the algorithm finds a marked item in O(√n) time (which, if true, would be the optimal bound, beating the previous runtime of O(√(n log n))).  Matt Coudron and Henry Yuen set out to prove that the Vazirani-Vidick protocol for quantum randomness expansion is optimal, and achieve some interesting partial results.  Mohammad Bavarian (well, jointly with me) asks whether there’s a fast quantum algorithm for PARITY that gets the right answer on just slightly more than 50% of the inputs—and shows, rather surprisingly, that this question is closely related to some of the hardest open problems about Boolean functions, like sensitivity versus block sensitivity.
This year, though, I also want to call special attention to the survey projects, since some of them resulted in review articles that could be of real use to students and researchers in quantum computing theory.  Notably, Adam Bookatz compiled the first list of essentially all known QMA-complete problems, analogous to (but shorter than!) Garey and Johnson’s listing of known NP-complete problems in 1979.  Chris Graves surveyed the known quantum fault-tolerance bounds.  Finally, three projects took on the task of understanding and explaining some of the most important recent results in quantum complexity theory: Travis Hance on Thomas Vidick and Tsuyoshi Ito’s NEXP in MIP* breakthrough; Emily Stark on Mark Zhandry’s phenomenal results on the security of classical cryptographic constructions against quantum attack; and Max Zimet on Jordan-Lee-Preskill’s major work on simulation of quantum field theories.
(Oops, sorry … did I use words like “important,” “breakthrough,” and “phenomenal” too often in that last sentence, thereby triggering the wrath of the theoretical computer science excitement police?  Well then, come over to my apartment and friggin’ arrest me.)
Anyway, thanks so much to all the students for making 6.845 such an awesome class (at least on my end)!  Without further ado, here’s the complete project showcase:
* Arturs Backurs.  Influences in Low-Degree Polynomials.  [Report] [Slides]
* Jenny Barry.  Quantum POMDPs (Partially Observable Markov Decision Processes).  [Report] [Slides]
* Mohammad Bavarian.  The Quantum Weak Parity Problem.  [Report] [Slides]
* Shalev Ben-David.  Decision-Tree Complexity.  [Report] [Slides]
* Adam Bookatz.  QMA-Complete Problems.  [Report] [Slides]
* Adam Bouland.  Classifying Beamsplitters.  [Report] [Slides]
* Matt Coudron and Henry Yuen.  Some Limits on Non-Local Randomness Expansion.  [Report] [Slides]
* Charles Epstein.  Adiabatic Quantum Computing.  [Report] [Slides]
* Matt Falk.  Quantum Search on the Spatial Grid.  [Report] [Slides]
* Badih Ghazi.  Quantum Query Complexity of PARITY with Small Bias.  [Report] [Slides]
* Chris Graves.  Survey on Bounds on the Quantum Fault-Tolerance Threshold.  [Report] [Slides]
* Travis Hance.  Multiprover Interactive Protocols with Quantum Entanglement.  [Report] [Slides]
* Charles Herder.  Blind Quantum Computation.  [Report] [Slides]
* Vincent Liew.  On the Complexity of Manipulating Quantum Boolean Circuits.  [Report] [Slides]
* Emily Stark.  Classical Crypto, Quantum Queries.  [Report] [Slides]
* Ted Yoder.  Generalized Stabilizers.  [Report] [Slides]
* Max Zimet.  Complexity of Quantum Field Theories.  [Report] [Slides]
There’s a trivial question about particle accelerators that bugged me for a while. Today I finally figured out the answer, and I’m so excited by my doofus “discovery” that I want to tell the world.
In Ye Olde Times, accelerators used to smash particles against a fixed target. But today’s accelerators smash one particle moving at almost the speed of light against another particle moving at almost the speed of light — that’s why they’re called particle colliders (duhhh). Now, you’d think this trick would increase the collision energy by a constant factor, but according to the physicists, it does asymptotically better than that: it squares the energy!
My question was, how could that be? Even if both particles are moving, we can clearly imagine that one of them is stationary, since the particles’ motion with respect to the Earth is irrelevant. So then what’s the physical difference between a particle hitting a fixed target and two moving particles hitting each other, that could possibly produce a quadratic improvement in energy?
[Warning: Spoiler Ahead]
The answer pops out if we consider the rule for adding velocities in special relativity. If in our reference frame, particle 1 is headed left at a v fraction of the speed of light, while particle 2 is headed right at a w fraction of the speed of light, then in particle 1’s reference frame, particle 2 is headed right at a (v+w)/(1+vw) fraction of the speed of light. Here 1+vw is the relativistic correction, “the thing you put in to keep the fraction less than 1.” If v and w are both close to 0, then of course we get v+w, the Newtonian answer.
Now set v=w=1-ε. Then (v+w)/(1+vw) = 1 – ε^2/(2-2ε+ε^2), which scales like 1-ε^2. Aha!
To finish the argument, remember that relativistic energy increases with speed like 1/sqrt(1-v^2). If we plug in v=1-ε, then we get 1/sqrt(2ε-ε^2), while if we plug in v=1-ε^2, then we get 1/sqrt(2ε^2-ε^4). So in the case of a fixed target the energy scales like 1/sqrt(ε), while in the case of two colliding particles it scales like 1/ε.
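Here’s a quick numerical sanity check of the velocity-addition scaling, in units where c = 1 (and see the correction below about which frame one should really compare energies in):

```python
import numpy as np

def add_velocities(v, w):
    """Relativistic velocity addition, in units where c = 1."""
    return (v + w) / (1 + v * w)

def gamma(v):
    """Energy factor 1/sqrt(1 - v^2)."""
    return 1 / np.sqrt(1 - v ** 2)

for eps in (1e-2, 1e-4, 1e-6):
    v = 1 - eps
    u = add_velocities(v, v)      # speed of one particle in the other's rest frame
    print(f"eps = {eps:.0e}:  1-u = {1 - u:.2e} (order eps^2),  "
          f"gamma(v) = {gamma(v):.2e} (~1/sqrt(eps)),  gamma(u) = {gamma(u):.2e} (~1/eps)")
```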
In summary, nothing’s going on here except relativistic addition of velocities. As with Grover’s algorithm, as with the quantum Zeno effect, it’s our intuition about linear versus quadratic that once again leads us astray.
I botched the calculation. While I got the answer I wanted (a quadratic improvement in energy), and while I more-or-less correctly identified the reason for that answer (unintuitive properties of the relativistic velocity addition formula), I did the calculation in the rest frame of one of the particles instead of the zero-momentum rest frame, and thereby obtained a scaling of 1/sqrt(ε) versus 1/ε instead of 1/ε^(1/4) versus 1/sqrt(ε). As a result, my answer flagrantly violates conservation of energy.
Thanks to rrtucci and perseph0ne. In my defense, I did call it a doofus discovery.
In August of 2002 I opened the Complexity Zoo: an online bestiary of 196 complexity classes, since expanded to 443. Yesterday I entrusted the Zoo to anyone on Earth who wants to feed the animals or contribute their own. This was possible because of John Stockton, who graciously converted the Zoo to wiki form.
The decision to relinquish control of my best-known work was tinged with regret. But at age 3, my baby is all grown up, and it’s time to send it off to grad school so I can move on to other things.
This seems like a good occasion to ask a potentially heretical question:
> Did theoretical computer science take a wrong turn when it introduced complexity classes?
For readers with social lives, I should explain that a “complexity class” is a class of problems solvable by some sort of idealized computer. For example, P (Polynomial-Time) consists of all problems that an ordinary computer could solve in a “reasonable” amount of time, meaning an amount that increases like the problem size raised to a fixed power. To illustrate, a few years ago Agrawal, Kayal, and Saxena made international headlines for showing that “PRIMES is in P.” What this means is that they found a general method to decide if an n-digit number is prime or composite, using only about n^12 steps — much less than you’d need to try all possible divisors. Faster methods were known before, but they had a small chance of not producing an answer.
Other complexity classes include PSPACE (Polynomial Space), BQP (Bounded-Error Quantum Polynomial Time), EXP, NP, coNP, BPP, RP, ZPP, PH, Σ2P, P/poly, L, NL, PP, AWPP, LWPP, BQNC, QMA, QCMA, S2P, SZK, NISZK, and many more.
The advantage of this alphabet soup is that it lets us express complicated insights in an incredibly compact way:
* If NP is in BPP then NP=RP.
* If NP is in P/poly then PH = Σ2P.
* PH is in P#P.
* NL=coNL.
The disadvantage, of course, is that it makes us sound like the fabled prisoners who tell each other jokes by calling out their code numbers. Again and again, I’ve had trouble getting across to outsiders that complexity theory is not “about” capital letters, any more than chemistry is “about” weird strings like NaCl-KCl-MgCl2-H2O. Why is it so hard to explain that we don’t worry about EXP vs. P/poly because we’re eccentric anal-retentives, but because we want to know whether a never-ending cavalcade of machines, each richer and more complicated than the last, might possibly succeed at a task on which any one machine must inevitably founder — namely, the task of outracing time itself, of simulating cosmic history in an eyeblink, of seeing in the unformed clumps of an embryonic universe the swirl of every galaxy and flight of every hummingbird billions of years hence, like Almighty God Himself?
(Alright, maybe I meant BQEXP vs. BQP/poly.)
In the early 70’s, there was apparently a suggestion that NP be called PET, which could stand for three things: “Probably Exponential Time,” “Provably Exponential Time” (if P!=NP), or “Previously Exponential Time” (if P=NP). If this silly name had stuck, would our field have developed in a different direction?
Sorry for the terrible pun.  Today’s post started out as a comment on a review of the movie Lincoln on Sean Carroll’s blog, but it quickly become too long, so I made it into a post on my own blog.  Apparently I lack Abe’s gift for concision.
I just saw Lincoln — largely inspired by Sean’s review — and loved it.  It struck me as the movie that Lincoln might have wanted to be made about himself: it doesn’t show any of his evolution, but at least it shows the final result of that evolution, and conveys the stories, parables, and insight into human nature that he had accumulated by the end of his life in a highly efficient manner.
Interestingly, the Wikipedia page says that Spielberg commissioned, but then ultimately rejected, two earlier scripts that would have covered the whole Civil War period, and (one can assume) Lincoln’s process of evolution.  I think that also could have been a great movie, but I can sort-of understand why Spielberg and Tony Kushner made the unusual choice they did: at the level of detail they wanted, it seems like it would be impossible to do justice to Lincoln’s whole life, or even the last five years of it, in anything less than a miniseries.
I agree with the many people who pointed out that the movie could have given more credit to those who were committed antislavery crusaders from the beginning—rather than those like Lincoln, who eventually came around to the positions we now associate with him after a lot of toying with ideas like blacks self-deporting to Liberia.  But in a way, the movie didn’t need to dole out such credit: today, we know (for example) that Thaddeus Stevens had history and justice 3000% on his side, so the movie is free to show him as the nutty radical that he seemed to most others at the time.  And there’s even a larger point: never the most diligent student of history, I (to take one embarrassing example) had only the vaguest idea who Thaddeus Stevens even was before seeing the movie.  Now I’ve spent hours reading about him, as well as about Charles Sumner, and being moved by their stories.
(At least I knew about the great Frederick Douglass, having studied his Narrative in freshman English class.  Douglass and I have something in common: just as a single sentence he wrote, “I would unite with anybody to do right and with nobody to do wrong,” will reverberate through the ages, so too, I predict, will a single sentence I wrote: “Australian actresses are plagiarizing my quantum mechanics lecture to sell printers.”)
More broadly, I think it’s easy for history buffs to overestimate how much people already know about this stuff.  Indeed, I can easily imagine that millions of Americans who know Lincoln mostly as “the dude on the $5 bill (who freed some slaves, wore a top hat, used the word ‘fourscore,’ and got shot)” will walk out of the cineplex with a new and ~85% accurate appreciation for what Lincoln did to merit all that fuss, and why his choices weren’t obvious to everyone else at the time.
Truthfully, though, nothing made me appreciate the movie more than coming home and reading countless comments on movie review sites denouncing Abraham Lincoln as a bloodthirsty war criminal, and the movie as yet more propaganda by the victors rewriting history.  Even on Sean’s blog we find this, by a commenter named Tony:
I’m not one who believes we have to go to war to solve every problem we come across, I can’t believe that Lincoln couldn’t have found a solution to states rights and slavery in a more peaceful course of action. It seems from the American Revolutionary war to the present it has been one war after another … The loss of life of all wars is simply staggering, what a waste of humanity.
Well look, successive presidential administrations did spend decades trying to find a peaceful solution to the “states rights and slavery” issue; the massive failure of their efforts might make one suspect that a peaceful solution didn’t exist.  Indeed, even if Lincoln had simply let the South secede, my reading of history is that issues like the return of fugitive slaves, or competition over Western territories, would have eventually led to a war anyway.  I’m skeptical that, in the limit t→∞, free and slave civilizations could coexist on the same continent, no matter how you juggled their political organization.
I’ll go further: it even seems possible to me that the Civil War ended too early, with the South not decimated enough.  After World War II, Japan and Germany were successfully dissuaded even from “lite” versions of their previous plans, and rebuilt themselves on very different principles.  By contrast, as we all know, the American South basically refused for the next century to admit it had lost: it didn’t try to secede again, but it did use every means available to it to reinstate de facto slavery or something as close to that as possible.  All the civil-rights ideals of the 1960s had already been clearly articulated in the 1860s, but it took another hundred years for them to get implemented.  Even today, with a black President, the intellectual heirs of the Confederacy remain a force to be reckoned with in the US, trying (for example) to depress minority voter turnout through ID laws, gerrymandering, and anything else they think they can possibly get away with.  The irony, of course, is that the neo-Confederates now constitute a nontrivial fraction of what they proudly call “the party of Lincoln.”  (Look at the map of blue vs. red states, and compare it to the Mason-Dixon line.  Even the purple states correspond reasonably well to the vacillating border states of 1861.)
So that’s why it seems important to have a movie every once in a while that shows the moral courage of people like Lincoln and Thaddeus Stevens, and that names and shames the enthusiastic defenders of slavery—because while the abolitionists won the battle, on some fronts we’re still fighting the war.
In the wake of my very public relativity humiliation, I’ve decided to sentence myself to a one-month punishment term of only blogging about things that I actually understand. That means, unfortunately, that from now until September 27 this blog is going to be quite boring and limited in scope. It also means that Lev R.’s prizewinning question, about the survival prospects of the human race, will need to be deferred until after the punishment term.
To be clear: No string theory. No global warming. No biting vaginas. No Mahmoud. Quantum complexity classes are probably kosher.
The remainder of today’s entry will be about the topic of bananas. Bananas are long, yellow fruits that grow in bunches on some sort of plant or other. They consist of two components: the peel, and the “meat.” Well, there are probably other components as well, but those two are the most readily identifiable. The meat is delicious when fresh, even more so if covered with chocolate. When not fresh, on the other hand, it tends to form brown spots. The peel is not so good to eat, but is reputed to be good for tripping dumb, careless, unwary people. Like me.
A couple weeks ago M. I. Dyakonov, a longtime quantum computing skeptic, published a new paper setting out his arguments (maybe “grievances” is a more accurate word) against quantum computing research.  Looking for a way to procrastinate from other work I have to do, I decided to offer some thoughts in response.
To me, perhaps the most striking aspect of Dyakonov’s paper is what it doesn’t claim.  Unlike Leonid Levin, Oded Goldreich, and several other quantum computing skeptics I’ve engaged, Dyakonov never seriously entertains the possibility of a general principle that would explain why scalable quantum computing is not possible.  (Thus, my $100K prize presumably isn’t relevant to him.)  He even ridicules discussion of such a principle (see the end of this post).  The unwillingness to say that scalable QC can’t work, or to articulate a reason why, saves Dyakonov from the need to explore what else would need to be true about the physical world if scalable QC were impossible.  For example, would there then be an efficient algorithm to simulate arbitrary quantum systems on a classical computer—or at least, all quantum systems that can plausibly arise in Nature?  Dyakonov need not, and does not, evince any curiosity about such questions.  In his game, it’s only the quantum computing proponents who are on trial; there’s no need for examination of the other side.
That being so, Dyakonov focuses on what he sees as unrealistic assumptions in known versions of the Quantum Fault-Tolerance Theorem, covering well-trodden ground but with some strange twists.  He accuses quantum computing researchers of a “widespread belief that the |0〉 and |1〉 states ‘in the computational basis’ are something absolute, akin to the on/off states of an electrical switch, or of a transistor in a digital computer.”  He then follows with a somewhat-patronizing discussion of how no continuous quantity can be manipulated perfectly, and how |0〉 and |1〉 are just arbitrary labels whose meanings could change over time due to drift in the preparation and measurement devices.  Well, yes, it’s obvious that |0〉 and |1〉 don’t have absolute meanings, but is it not equally obvious that we can give them meanings, through suitable choices of initial states, gates, and measurement settings?  And if the meanings of |0〉 and |1〉 drift over time, due to the imprecision of our devices … well, if the amount of drift is upper-bounded by some sufficiently small constant, then we can regard it as simply yet another source of noise, and apply standard fault-tolerance methods to correct it.  If the drift is unbounded, then we do need better devices.
(Fault-tolerance mavens: please use the comments for more detailed discussion!  To my inexpert eyes, Dyakonov doesn’t seem to engage the generality of the already-known fault-tolerance theorems—a generality traceable to the fact that what powers those results is ultimately just the linearity of quantum mechanics, not some fragile coincidence that one expects to disappear with the slightest change in assumptions.  But I’m sure others can say more.)
Anyway, from his discussion of fault-tolerance, Dyakonov concludes only that the possibility of scalable quantum computing in the real world should be considered an open question.
Surprisingly—since many QC skeptics wouldn’t be caught dead making such an argument—Dyakonov next turns around and says that, well, OK, fine, even if scalable QCs can be built, they still won’t be good for much.  Shor’s factoring algorithm is irrelevant, since people would simply switch to other public-key cryptosystems that appear secure even against quantum attack.  Simulating quantum physics “would be an interesting and useful achievement, but hardly revolutionary, unless we understand this term in some very narrow sense.”  And what about Grover’s algorithm?  In an endnote, Dyakonov writes:
Quantum algorithms that provide (with an ideal quantum computer!) only polynomial speed-up compared to digital computing, like the Grover algorithm, became obsolete due to the polynomial slow-down imposed by error correction.
The above is flat-out mistaken.  The slowdown imposed by quantum error-correction is polylogarithmic, not polynomial, so it doesn’t come close to wiping out the Grover speedup (or the subexponential speedups that might be achievable, e.g., with the adiabatic algorithm, which Dyakonov doesn’t mention).
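To make the polylog-versus-polynomial distinction concrete, here is a toy back-of-the-envelope comparison in Python. The overhead exponent c = 3 below is an arbitrary illustrative choice of mine, not a figure from any particular fault-tolerance construction; the only point is that sqrt(N) times a fixed power of log N still beats N by a factor that grows without bound.

```python
import math

# Toy comparison: classical brute-force search cost N, versus a
# Grover-style cost of sqrt(N) queries, each query slowed down by a
# polylogarithmic error-correction overhead.  The exponent c = 3 is an
# arbitrary illustrative choice, not a claim about any real code.
c = 3

def classical_cost(N):
    return N

def grover_cost_with_ec(N):
    return math.sqrt(N) * math.log2(N) ** c

for exp in (20, 40, 60, 80):
    N = 2 ** exp
    ratio = classical_cost(N) / grover_cost_with_ec(N)
    print(f"N = 2^{exp}: classical/quantum cost ratio ~ {ratio:.3g}")
```

(At small N the overhead actually dominates, which is consistent with the usual caveat that Grover speedups only pay off at large problem sizes; as N grows, the advantage grows without bound.)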
But disregarding the polylog/polynomial confusion (which recurs elsewhere in the article), and other technical issues about fault-tolerance, up to this point many quantum computing researchers could happily agree with Dyakonov—and have said similar things many times themselves.  Dyakonov even quotes Dorit Aharonov, one of the discoverers of quantum fault-tolerance, writing, “In a sense, the question of noisy quantum computation is theoretically closed. But a question still ponders our minds: Are the assumptions on the noise correct?”
(And as for QC researchers coming clean about limitations of quantum computers?  This is just hearsay, but I’m told there’s a QC researcher who actually chose “Quantum computers are not known to be able to solve NP-complete problems in polynomial time” as the tagline for his blog!)
Dyakonov fumes about how popular articles, funding agency reports, and so forth have overhyped progress in quantum computing, leaving the conditions out of theorems and presenting incremental advances as breakthroughs.  Here I sadly agree.  As readers of Shtetl-Optimized can hopefully attest, I’ve seen it as my professional duty to spend part of my life battling cringeworthy quantum computing claims.  Every week, it feels like I talk to another journalist who tries to get me to say that this or that QC result will lead to huge practical applications in the near future, since that’s what the editor is looking for.  And every week I refuse to say it, and try to steer the conversation toward “deeper” scientific questions.  Sometimes I succeed and sometimes not, but at least I never hang up the phone feeling dirty.
On the other hand, it would be interesting to know whether, in the history of science, there’s ever been a rapidly-developing field, of interest to large numbers of scientists and laypeople alike, that wasn’t surrounded by noxious clouds of exaggeration, incomprehension, and BS.  I can imagine that, when Isaac Newton published his Principia, a Cambridge University publicist was there to explain to reporters that the new work proved that the Moon was basically an apple.
But none of that is where Dyakonov loses me.  Here’s where he does: from the statements
A) The feasibility of scalable quantum computing in the physical world remains open, and
B) The applications of quantum computing would probably be real but specialized,
he somehow, unaided by argument, arrives at the conclusion
C) Quantum computing is a failed, pathological research program, which will soon die out and be of interest only to sociologists.
Let me quote from his conclusion at length:
I believe that, in spite of appearances, the quantum computing story is nearing its end, not because somebody proves that it is impossible, but rather because 20 years is a typical lifetime of any big bubble in science, because too many unfounded promises have been made, because people get tired and annoyed by almost daily announcements of new “breakthroughs”, because all the tenure positions in quantum computing are already occupied, and because the proponents are growing older and less zealous, while the younger generation seeks for something new …
In fact, quantum computing is not so much a scientific, as a sociological problem which has expanded out of all proportion due to the US system of funding scientific research (which is now being copied all over the world). While having some positive sides, this system is unstable against spontaneous formation of bubbles and mafia-like structures. It pushes the average researcher to wild exaggerations on the border of fraud and sometimes beyond. Also, it is much easier to understand the workings of the funding system, than the workings of Nature, and these two skills only rarely come together.
The QC story says a lot about human nature, the scientific community, and the society as a whole, so it deserves profound psycho-sociological studies, which should begin right now, while the main actors are still alive and can be questioned.
In case the message isn’t yet clear enough, Dyakonov ends by comparing quantum computing to the legend of Nasreddin, who promised the Sultan that he could teach a donkey how to read.
Had he [Nasreddin] the modern degree of sophistication, he could say, first, that there is no theorem forbidding donkeys to read. And, since this does not contradict any known fundamental principles, the failure to achieve this goal would reveal new laws of Nature.  So, it is a win-win strategy: either the donkey learns to read, or new laws will be discovered.
Second, he could say that his research may, with some modifications, be generalized to other animals, like goats and sheep, as well as to insects, like ants, gnats, and flies, and this will have a tremendous potential for improving national security: these beasts could easily cross the enemy lines, read the secret plans, and report them back to us.
Dyakonov chose his example carefully.  Turnabout: consider the first person who had the idea of domesticating a wild donkey, teaching the beast to haul people’s stuff on its back.  If you’d never seen a domestic animal before, that idea would sound every bit as insane as donkey literacy.  And indeed, it probably took hundreds of years of selective breeding before it worked well.
In general, if there’s no general principle saying that X can’t work, the truth might be that X can probably never work, but the reasons are too messy to articulate.   Or the truth might be that X can work.  How can you ever find out, except by, y’know, science?  Try doing X.  If you fail, try to figure out why.  If you figure it out, share the lessons with others.  Look for an easier related problem Y that you can solve.  Think about whether X is impossible; if you could show its impossibility, that might advance human knowledge even more than X itself would have.  If the methods you invented for X don’t work, see if they work for some other, unrelated problem Z.  Congratulations!  You’ve just reinvented quantum computing research.  Or really, any kind of research.
But there’s something else that bothers me about Dyakonov’s donkey story: its specificity.  Why fixate on teaching a donkey, only a donkey, how to read?  Earlier in his article, Dyakonov ridicules the diversity of physical systems that have been considered as qubits—electron spin qubits, nuclear spin qubits, Josephson superconducting qubits, cavity photon qubits, etc.—seeing the long list as symptomatic of some deep pathology in the field.  Yet he never notices the tension with his donkey story.  Isn’t it obvious that, if Nasreddin had been a quantum computing experimentalist, then after failing to get good results with donkeys, he’d simply turn his attention to teaching cows, parrots, snakes, elephants, dolphins, or gorillas how to read?  Furthermore, while going through the zoo, Nasreddin might discover that he could teach gorillas how to recognize dozens of pictorial symbols: surely a nice partial result.  But maybe he’d have an even better idea: why not build his own reading machine?  The machine could use a camera to photograph the pages of a book, and a computer chip to decode the letters.  If one wanted, the machine could be even be the size and shape of a donkey, and could emit braying sounds.  Now, maybe Nasreddin would fail to build this reading machine, but even then, we know today that it would have been a noble failure, like those of Charles Babbage or Ted Nelson.  Nasreddin would’ve failed only by being too far ahead of his time.
Update (Jan. 7): See Dyakonov’s response to this post, and my response to his response.
From left: Amnon Ta-Shma, your humble blogger, David Zuckerman, Adi Akavia, Adam Klivans. Behind us: the majestic mountains of Banff, Canada, site of yet another complexity workshop, which I just returned from a couple days ago, after which I immediately had to move out of my apartment, which explains the delay in updating the blog. Thanks to Oded Regev for the photo.
A few highlights from the workshop:
* Rahul Santhanam presented a proof that for every fixed k, there exists a language in PromiseMA with no circuits of size n^k. This is a problem I spent some time on last year and failed to solve.
* Dmitry Gavinsky discussed the question of whether quantum one-way communication complexity can be exponentially smaller than randomized two-way communication complexity. Richard Cleve has a candidate problem that might yield such a separation.
* Ryan O’Donnell presented a proof that one can decide, using poly(1/ε) queries, whether a Boolean function is a threshold function or is ε-far from any threshold function. This is much harder than it sounds.
* I took a gondola to the top of Sulphur Mountain, where the above photo was taken. While walking amidst some slanty rocks, I slipped and twisted my ankle. I was hobbling around for several days afterward, but seem to be OK now.
Overwhelming everything else, alas, was a memorial session for Misha Alekhnovich. Misha, who loved extreme sports, went on a whitewater kayaking trip in Russia a month ago. At a dangerous bend in the river, his three companions apparently made it to shore safely, while Misha did not. He was 28, and was to get married a few days from now.
Misha and I overlapped as postdocs at IAS, and I wish I’d gotten to know him better then. From the conversations we did have, it was clear that Misha missed Russia and wanted to go back as soon as possible. The truth, though, is that I knew Misha less on a personal level than through his groundbreaking work, and particularly his beautiful paper with Razborov, where they show that the Resolution proof system is not automatizable unless FPT = W[P]. I still find it incredible that they were able to prove such a thing.
Lance has already discussed the memorial session, in which Eli Ben-Sasson and Sasha Razborov offered their personal remembrances, while Toni Pitassi and Russell Impagliazzo gave talks about Misha’s work, emphasizing how the P versus NP question always lay just beneath the surface. It occurred to me that an outsider might find these talks odd, or even off-putting. Here we were, at a memorial for a dead colleague, talking in detail about the definition of automatizability and the performance of the DPLL algorithm on satisfiable CNF instances. Personally, I found it moving. At a funeral for a brilliant musician, would one discuss his “passion for music” in the abstract without playing any of his songs?
The tragic loss of Misha has reinforced a view I’ve long held: that if challenge is what you seek, then the thing to do is to tackle difficult open problems in math and computer science (or possibly physics). Unlike the skydiver, the kayaker, or the mountain-climber, the theorem-prover makes a permanent contribution in the best case, and is down a few months and a few hundred cups of coffee in the worst case. As for physical challenges, walking around heavily-populated tourist areas with slanty rocks has always presented more than enough of them for me.
If you have opinions about quantum computing, and haven’t yet read through the discussion following my “response to Dyakonov” post, you’re missing out.  The comments—by QC researchers (Preskill, Kuperberg, Gottesman, Fitzsimons…), skeptics (Dyakonov, Kalai, …), and interested outsiders alike—are some of the most interesting I’ve seen in this two-decade-old debate.
At the risk of crass immodesty, I just posted a comment whose ending amused me so much, I had to promote it to its own post.  My starting point was an idea that several skeptics, including Dyakonov, have articulated in this debate, and which I’ll paraphrase as follows:
Sure, quantum computing might be “possible in principle.”  But only in the same sense that teaching a donkey how to read, transmuting large amounts of lead into gold, or doing a classical computation in the center of the Sun are “possible in principle.”  In other words, the task is at the same time phenomenally difficult, and fundamentally arbitrary and quixotic even if you did somehow achieve it.
Since I considered this argument an important one, I wrote a response, which stressed how quantum computing is different both because it strives to solve problems that flat-out can’t feasibly be solved any other way if standard complexity conjectures are correct, and because the goal—namely, expanding the human race’s computational powers beyond classical polynomial time—is not at all an arbitrary one.  However, I then felt the need to expand on the last point, since it occurred to me that it’s both central to this debate and almost never discussed explicitly.
How do I know that the desire for computational power isn’t just an arbitrary human quirk?
Well, the reason I know is that math isn’t arbitrary, and computation is nothing more or less than the mechanizable part of solving math problems.
Let me put it this way: if we ever make contact with an advanced extraterrestrial civilization, they might have three sexes and five heads. But they, too, will have encountered the problem of factoring integers into primes. Indeed, because they’ll inhabit the same physical universe as we do, they’ll even have encountered the problem of simulating quantum physics. And therefore, putting the two together, they’ll almost certainly have discovered something like Shor’s algorithm — though they’ll call it “Zork’s bloogorithm” or whatever.
More often than I can remember, I’ve been asked some form of the following question: “If you computer scientists can’t prove P=NP or P!=NP, then why aren’t we justified in believing whichever one we want? And why is the ‘consensus’ that P!=NP anything more than a shared prejudice — something you repeat to each other so your work won’t seem irrelevant?”
It’s time to assume the mantle of Defender of the Faith. I’m going to give you ten arguments for believing P!=NP: arguments that are pretty much obvious to those who have thought seriously about the question, but that (with few exceptions) seem never to be laid out explicitly for those who haven’t. You’re welcome to believe P=NP if you choose. My job is to make you understand the conceptual price you have to pay for that belief.
Without further ado:
1. The Obvious Argument. After half a century, we still don’t know any algorithm for an NP-complete problem that runs in subexponential time. For Circuit-SAT, the canonical NP-complete problem, we don’t know any algorithm essentially better than brute-force search. In math, if decades of research fail to turn up an object, and there’s no a priori reason to suppose the object exists, it’s usually a good strategy to conjecture that the object doesn’t exist. We can all list counterexamples to this thesis, but the confirming examples are much more numerous (though usually less famous, for obvious reasons).
2. The Empirical Argument. While the argument based on decades of mathematical work can stand on its own, in the case of P versus NP we also have half a century of evidence from the computer industry. In a few cases — like linear programming and primality testing — people wanted fast ways to solve a problem in practice, and they came up with them, long before the problem was proved to be tractable theoretically. Well, people certainly want fast ways to solve NP-complete problems in practice, and they haven’t been able to invent them. The best-known satisfiability algorithms — such as DPLL, GSAT, and Survey Propagation — work surprisingly well on certain instance distributions, but croak (for example) on instances derived from factoring or automated theorem proving.
3. The Bayesian Argument. Why can’t we turn the last two arguments on their heads, and say that, if our failure to find a fast SAT algorithm is evidence that P!=NP, then our failure to prove P!=NP is likewise evidence that P=NP? The answer is, because lower bounds are harder to prove than upper bounds. Assuming P=NP, it’s difficult to come up with a good reason why an efficient algorithm for NP-complete problems wouldn’t yet have been discovered. But assuming P!=NP, we understand in great detail why a proof hasn’t yet been discovered: because any proof will need to overcome specific and staggering obstacles. It will need to “know” how 3SAT differs from 2SAT, how quadratic programming differs from linear programming, and how approximating set cover within o(log|S|) differs from approximating it within log|S|. It will need to “look inside” computations in a way that doesn’t relativize. It will need to argue that NP-complete problems are hard, not because they look like random Boolean functions, but because they don’t look like random Boolean functions. While we have no reason to think such a proof is impossible — indeed, we have proofs satisfying some of the desiderata — we do have reason to think it will be extremely difficult. Whatever your “naïve prior probability” was that P=NP, the above considerations, together with Bayes’ Rule, suggest revising it downward.
4. The Multiple-Surprises Argument. Here’s a point that’s not often stressed: for P to equal NP, not just one but many astonishing things would need to be true simultaneously. First, factoring would have to be in P. Second, factoring would have to be as hard as breaking one-way functions. Third, breaking one-way functions would have to be as hard as solving NP-complete problems on average. Fourth, solving NP-complete problems on average would have to be as hard as solving them in the worst case. Fifth, NP would have to have polynomial-size circuits. Sixth, NP would have to equal coNP. And so on. Any one of these statements, by itself, would overturn much of what we think we know about complexity.
5. The Hierarchy Argument. This argument goes back to the early days of P versus NP. We know that P is strictly contained in EXP by the time hierarchy theorem. It follows that either P is strictly contained in NP, or NP is strictly contained in PSPACE, or PSPACE is strictly contained in EXP (the chain is written out explicitly just after this list). Likewise, since NL is strictly contained in PSPACE=NPSPACE by the space hierarchy theorem, either NL is strictly contained in P, or P is strictly contained in NP, or NP is strictly contained in PSPACE. But if some of these separations hold, then why not all of them? To put the point differently, we know that collapse is not the general rule of the Complexity Zoo: even between P and EXP, there really are infinitely many distinct species. Indeed for some pairs of species, like E and PSPACE, we know they’re not equal even though we don’t know if either one contains the other! The burden of evidence, then, is on those who believe that two seemingly-distinct species are the same, not on those who believe they’re different.
6. The Known-Algorithms Argument. We do have nontrivial efficient algorithms for several problems in NP, such as matching, stable marriage, minimum spanning tree, matrix inversion, planarity testing, and semidefinite programming. But every one of these algorithms depends, in a crucial way, on some special combinatorial or algebraic structure of the problem being solved. Is this just a fancy way of repeating that we don’t know yet how to solve NP-complete problems? I don’t think it is. It’s possible to imagine a situation where we knew “generic” techniques for achieving exponential speedups, which worked for objects as complicated as Turing machines, and the only problem was that we didn’t yet know how to apply those techniques to prove P=NP. But this is nothing like the actual situation.
7. The Known-Lower-Bounds Argument. It could be that the dream of proving superpolynomial lower bounds on circuit size is no more than that: a pipe dream. But the fact remains we can prove superpolynomial lower bounds, albeit in weaker models of computation that are easier to analyze. To give some examples, superpolynomial lower bounds have been proven on the sizes of resolution proofs, monotone circuits, constant-depth circuits, read-once branching programs, and multilinear formulas.
8. The Self-Referential Argument. If P=NP, then by that very fact, one would on general grounds expect a proof of P=NP to be easy to find. On the other hand, if P!=NP, then one would on general grounds expect a proof of P!=NP to be difficult to find. So believing P!=NP seems to yield a more ‘consistent’ picture of mathematical reality.
9. The Philosophical Argument. If P=NP, then the world would be a profoundly different place than we usually assume it to be. There would be no special value in “creative leaps,” no fundamental gap between solving a problem and recognizing the solution once it’s found. Everyone who could appreciate a symphony would be Mozart; everyone who could follow a step-by-step argument would be Gauss; everyone who could recognize a good investment strategy would be Warren Buffett. It’s possible to put the point in Darwinian terms: if this is the sort of universe we inhabited, why wouldn’t we already have evolved to take advantage of it? (Indeed, this is an argument not only for P!=NP, but for NP-complete problems not being efficiently solvable in the physical world.)
10. The Utilitarian Argument. Suppose you believe P!=NP. Then there are only two possibilities, both of which are deeply gratifying: either you’re right, or else there’s a way to solve NP-complete problems in polynomial time. (I realize that I’ve given a general argument for pessimism.)
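Since argument 5 leans on the hierarchy theorems, here is the containment chain written out explicitly, as promised above; this is just the standard textbook reasoning set in LaTeX, nothing new.

```latex
% Time hierarchy: P is strictly contained in EXP.  Combined with the
% standard inclusions, at least one of the links must be strict:
\[
\mathsf{P} \subseteq \mathsf{NP} \subseteq \mathsf{PSPACE} \subseteq \mathsf{EXP},
\qquad \mathsf{P} \subsetneq \mathsf{EXP}
\;\Longrightarrow\;
\mathsf{P} \subsetneq \mathsf{NP}
\ \text{ or }\
\mathsf{NP} \subsetneq \mathsf{PSPACE}
\ \text{ or }\
\mathsf{PSPACE} \subsetneq \mathsf{EXP}.
\]
% Likewise for the space hierarchy:
\[
\mathsf{NL} \subseteq \mathsf{P} \subseteq \mathsf{NP} \subseteq \mathsf{PSPACE} = \mathsf{NPSPACE},
\qquad \mathsf{NL} \subsetneq \mathsf{PSPACE}.
\]
```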
There are several questions that the above arguments don’t pretend to address: first, why is P versus NP a reasonable question? Second, even if P!=NP, why should we expect there to be a proof in ZF set theory? Third, even if there is a proof, why should we expect it to be within reach of the human intellect? I’m really not cut out for this C. S. Lewis role, but look for further installments of Mere Complexity as the need arises…
Update (1/18): Some more information has emerged.  First, it’s looking like the prosecution’s strategy was to threaten Aaron with decades of prison time, in order to force him to accept a plea bargain involving at most 6 months.  (Carmen Ortiz issued a statement that conveniently skips the first part of the strategy and focuses on the second.)  This is standard operating procedure in our wonderful American justice system, due (in part) to the lack of resources actually to bring most cases to trial.  The only thing unusual about the practice is the spotlight being shone on it, now that it was done not to some poor unknown schmuck but to a tortured prodigy and nerd hero.  Fixing the problem would require far-reaching changes to our justice system.
Second, while I still strongly feel that we should await the results of Hal Abelson’s investigation, I’ve now heard from several sources that there was some sort of high-level decision at MIT—by whom, I have no idea—not to come out in support of Aaron.  Crucially, though, I’m unaware of the faculty (or students, for that matter) ever being consulted about this decision, or even knowing that there was anything for MIT to decide.  Yesterday, feeling guilty about having done nothing to save Aaron, I found myself wishing that either he or his friends or parents had made an “end run” around the official channels, and informed MIT faculty and students directly of the situation and of MIT’s ability to help.  (Or maybe they did, and I simply wasn’t involved?)
Just to make sure I hadn’t missed anything, I searched my inbox for “Swartz”, but all I found relevant to the case were a couple emails from a high-school student shortly after the arrest (for a project he was doing about the case), and then the flurry of emails after Aaron had already committed suicide.  By far the most interesting thing that I found was the following:
Aaron Swartz (December 12, 2007): I’m really enjoying the Democritus lecture notes. Any chance we’ll ever see lecture 12?
My response: It’s a-comin’!
* * *
As I wrote on this blog at the time of Aaron’s arrest: I would never have advised him to do what he did.  Civil disobedience can be an effective tactic, but off-campus access to research papers simply isn’t worth throwing your life away for—especially if your life holds as much spectacular promise as Aaron’s did, judging from everything I’ve read about him.  At the same time, I feel certain that the world will eventually catch up to Aaron’s passionate belief that the results of publicly-funded research should be freely available to the public.  We can honor Aaron’s memory by supporting the open science movement, and helping the world catch up with him sooner.
We are deeply saddened by Aaron Swartz’s death, and send our condolences to all who knew him.  We are very mindful of his commitment to the open access movement.  It inspires our own commitment to work for a situation where academic knowledge is freely available, so that others are not menaced by the kind of prosecution that he faced.  We encourage everyone to visit www.rememberaaronsw.com, a memorial site created by Aaron’s family and friends.
Scott Aaronson
Sasha Costanza-Chock
Kai von Fintel
Richard Holton
George Stephanopoulos
Anne Whiston Spirn
Members of the MIT Open Access Working Group
A reader named Lewis K. wrote in to ask for a “brief list of required reading for someone with a normal CS degree under his belt who wants to be taken to the research front in quantum complexity.” Alright then:
[Deutsch] [Bernstein-Vazirani] [BBBV] [Simon] [Shor] [Grover] [BBHT] [BBCMW] [Ambainis] [Watrous] [ANTV] [Fortnow-Rogers] [Abrams-Lloyd] [Childs et al.] [DMV] [EHK] [BJK] [Gottesman] [KKR] [Marriott-Watrous]
(Sprinkle in some textbooks, survey articles, and course lecture notes to taste.)
Commenters will boil me alive for leaving out huge swaths of the field, and they’ll be right. I’ve merely listed some papers that had a definite impact on how I, personally, attack problems. But hey, I’m the one you asked. So print ’em out, take ’em to the toilet, and sit there for a long time. When you’re finished, you won’t be at the “research front” — for that you obviously have to read my papers — but hopefully you’ll have seen enough to visit the big bad arXiv on your own. Happy Hadamards!
In 7+ years of blogging, one lesson I’ve learned is to go easy on the highly-personal stuff.  But sometimes one does need to make an exception.  Lily Rebecca Aaronson was born today (Jan. 20), at 6:55am, to me and Dana, weighing 3.3kg.  (After seeing her placenta, the blog category “Adventures in Meatspace” never seemed more appropriate.)  I’m blogging from the postpartum ward, which has free wifi and excellent food—we’ll probably stay here as long as they’ll let us.
Given that her parents are both complexity theorists, one question people will have is whether Lily demonstrates any early aptitude in that field.  All I can say is that, so far, she’s never once confused quantum computing with classical exponential parallelism, treated relativization as acting on a complexity class rather than on its definition, or made any other mathematical mistake that I can see.  (She has, on the other hand, repeatedly mistaken her hand for food.)
A month and a half ago, I gave a 45-minute lecture / attempted standup act with the intentionally-nutty title above, for my invited talk at the wonderful NIPS (Neural Information Processing Systems) conference at Lake Tahoe.  Video of the talk is now available at VideoLectures.net.  That site also did a short written interview with me, where they asked about the “message” of my talk (which is unfortunately hard to summarize, though I tried!), as well as the Aaron Swartz case and various other things.  If you just want the PowerPoint slides from my talk, you can get those here.
Now, I could’ve just given my usual talk on quantum computing and complexity.  But besides increasing boredom with that talk, one reason for my unusual topic was that, when I sent in the abstract, I was under the mistaken impression that NIPS was at least half a “neuroscience” conference.  So, I felt a responsibility to address how quantum information science might intersect the study of the brain, even if the intersection ultimately turned out to be the empty set!  (As I say in the talk, the fact that people have speculated about connections between the two, and have sometimes been wrong but for interesting reasons, could easily give me 45 minutes’ worth of material.)
Anyway, it turned out that, while NIPS was founded by people interested in modeling the brain, these days it’s more of a straight machine learning conference.  Still, I hope the audience there at least found my talk an amusing appetizer to their hearty meal of kernels, sparsity, and Bayesian nonparametric regression.  I certainly learned a lot from them; while this was my first machine learning conference, I’ll try to make sure it isn’t my last.
(Incidentally, the full set of NIPS videos is here; it includes great talks by Terry Sejnowski, Stanislas Dehaene, Geoffrey Hinton, and many others.  It was a weird honor to be in such distinguished company — I wouldn’t have invited myself!)
At Greg Kuperberg’s request, I’ve decided to follow my Ten Reasons To Believe P!=NP with…
Thirteen Reasons Why I’d Be Surprised If Quantum Computing Were Fundamentally Impossible
So that there’s no question about exactly where I stand, I’ll start out by repeating, for the ten billionth time, the Official Scott Aaronson Quantum Computing Position Statement.
* It’s entirely conceivable that quantum computing will turn out to be impossible for a fundamental reason.
* This would be much more interesting than if it’s possible, since it would overturn our most basic ideas about the physical world.
* The only real way to find out is to try to build a quantum computer.
* Such an effort seems to me at least as scientifically important as (say) the search for supersymmetry or the Higgs boson.
* I have no idea — none — how far it will get in my lifetime.
I now offer thirteen arguments to support the above views.
1. The Obvious Argument. Quantum mechanics has been the foundation for all non-gravitational physics since 1926. Hoping that it would “just go away” has been one of the most consistently losing strategies in the history of science. If physicists and engineers didn’t take quantum mechanics seriously as a description of the world, they wouldn’t have been able to invent the laser, transistor, or classical computer. For that matter, they wouldn’t be able to explain why all the atoms in the universe don’t instantly disintegrate. Now, if you start with quantum mechanics, and write down the model of computation that directly flows from it, what do you end up with? BQP: Bounded-Error Quantum Polynomial-Time.
2. The Experimental Argument. Ten years ago, one wouldn’t have been able to do much more than mount a general defense of quantum mechanics. But by now, liquid-NMR quantum computers have been built that not only factored 15 into 3 x 5 with small probability of error, but also searched 8-item databases. I’ve seen some of the machines that performed these staggering computational feats right here in Waterloo; they look like big-ass cylinders with the word “Bruker” on them. Seriously, while liquid-NMR (at least for now) doesn’t seem to be scalable, there’s been lots of recent work on solid-state NMR, photonics, and ion traps, all of which (if I’m not mistaken) are up to at least 3 qubits. While I don’t think the experimentalists are anywhere close to succeeding, these are smart people who haven’t been sitting on their asses (or if they have, then no doubt hard at work at a lab table or something).
3. The Better-Shor-Than-More Argument. Why do skeptics always assume that, if quantum mechanics turns out to be only approximate, then whatever theory supersedes it will reinstate the Extended Church-Turing Thesis? Why isn’t it just as likely, a priori, that the new theory would yield even more computational power than BQP? This isn’t merely a logical point: to the extent that people have tried to propose serious alternatives to quantum mechanics (where “serious” means “agreeing with known experiments”), those alternatives often do involve more computational power than BQP.
4. The Sure/Shor Argument. If you believe quantum mechanics is going to break down before nontrivial quantum computing becomes possible, then you must believe there’s some point where it will break down — some level of size, or complexity, or whatever, at which it will cease to be a useful description of the world. What is that point? In other words, where is the line — possibly a fuzzy, asymptotic, resource-dependent line — that puts the quantum states that have already been observed on one side, and the quantum states that arise in Shor’s factoring algorithm on the other? In a paper I wrote three years ago, I called such a line a “Sure/Shor separator,” and challenged skeptics to come up with some example of what it might be. I even tried to get the ball rolling by studying such separators myself. My idea was that having a Sure/Shor separator could motivate further research: once they knew where the “barrier” was, the experimentalists could set to work trying to cross it; then, if they succeeded, the skeptics could come back with a new barrier, and so on. Unfortunately, no skeptic has yet risen to the challenge. It’s not hard to see why: if you start with the many-particle entangled states that have already been observed (for example, by the Zeilinger group and by Ghosh et al.) and then throw in a few closure properties, you quickly end up with — well, the set of all quantum states. Coming up with a “reasonable” set of states that includes Sure states but doesn’t include Shor states turns out to be an extremely hard problem.
5. The Linearity Argument. In my experience, at least 70% of all objections to quantum computing boil down to the idea that a quantum computer would be a “souped-up analog computer” — a machine that would store information not in voltage differences or the positions of pulleys, but instead in exponentially-small amplitudes. From this idea it follows readily that, just as “old-school” analog computers have always run up against scalability problems, so too will quantum computers. To see why the analogy fails, think about classical probabilities. If you flip a coin a thousand times, you’ll end up with a probability distribution over outcomes that requires real numbers of order 2^-1000 to describe. Does it follow from this that classical probabilistic computers are really analog computers in disguise, or that classical probability theory must be a mere approximation to some deeper, underlying theory? Of course not — for, unlike voltages or pulleys, probabilities evolve in time by means of norm-preserving linear transformations, which are insensitive to small errors. Well, quantum amplitudes also evolve by means of norm-preserving linear transformations, and this is what makes them behave like probabilities with respect to error, and not like the state variables of an analog computer. (A tiny numerical illustration of this point appears after this list.)
6. The Fault-Tolerance Argument. Among the many nontrivial consequences of this linearity, there’s one that probably counts as a separate argument: the Threshold Theorem. This theorem states that even if a quantum computer is subject to noise, we can still use it to do universal computation, provided we have parallel processing and a supply of fresh qubits, and provided the error rate is at most ε per qubit per time step, for some constant ε>0 independent of the length of the computation. The original lower bound on ε was about 10^-6, but recently Knill and others have brought it up to 1-3% under plausible assumptions. Many quantum computing researchers talk about this theorem as the knight in shining armor who rode in unexpectedly to vindicate all their hopes. They’re entitled to do so, but to me, the theorem has always felt more like a beautiful, detailed working-out of something that couldn’t possibly have been false. (And not just because it’s a theorem.)
7. The What-A-Waste Argument. Why do I say that the threshold theorem “couldn’t possibly have been false”? Well, suppose quantum mechanics were an accurate description of reality, yet quantum computing was still impossible for some fundamental reason. In that case, we’d have to accept that Nature was doing a staggering amount of quantum computation that could never be “extracted,” even in principle. Indeed, even assuming that life is (and always will be) confined to the vicinity of one planet, the resulting computational waste would make the waste of 10^11 uninhabited galaxies look like chickenfeed. I don’t deny that such a possibility is logically consistent, but my complexity-theoretic instincts rebel against it.
8. The Non-Extravagance Argument. In my opinion, if quantum computers could solve NP-complete problems in polynomial time, then there really would be grounds for regarding them as physically extravagant. Like coming up with theories that allow causality violations and superluminal signalling, coming up with models of computation that can simulate NP, #P, and PSPACE is just too easy. It’s not interesting. The interesting task is to come up with a model of computation that’s stronger than the usual ones (P, BPP, and P/poly), but not so strong that it encompasses NP-complete problems. If it weren’t for BQP, I don’t think I’d have any clear idea of what such a model could look like. (Sure, we have problems and complexity classes below NP, but that’s different from a full-fledged model of computation.)
9. The Turn-The-Tables Argument. If building quantum computers that outperform classical ones is fundamentally impossible, then it must be possible to write classical computer programs that efficiently simulate any quantum system found in Nature. And yet, even though this way of looking at the question is perfectly equivalent, there’s a reason quantum computing skeptics avoid it. This is that, as soon as you frame the issue this way, they (the skeptics) are the ones who look like wild-eyed technological optimists — believing we’ll be able to simulate superconductors and quark-gluon plasmas on an ordinary desktop PC! The “staid,” “conservative” position is that such a simulation won’t be possible — or, equivalently, that the systems being simulated have more computational power than the PC doing the simulating.
10. The Island-In-Theoryspace Argument. String theorists have been ridiculed for claiming that string theory is “too beautiful to be wrong.” But as Peter Woit points out in his fascinating new book, this is not at all a bad argument. It’s a fine argument; the real question is whether string theory — with its perturbation series, ten dimensions of which six are compactified for unknown reasons, landscape of vacua, etc. — really is as beautiful as its proponents think it is. At the risk of breaking my vow, let me hasten to say that I’m in no position to judge. What I do know is that there’s something mathematically unique about quantum mechanics: how it takes advantage of special properties of the L2 norm that fail for other p-norms, how the parameter-counting for mixed states that works perfectly with complex numbers fails with real numbers and quaternions, and so on. Crucially, it seems all but impossible to change quantum mechanics while retaining its nice properties. More so than general relativity or any other theory we have, quantum mechanics gives every indication of being an island in theoryspace.
11. The Only-Game-In-Town Argument. However one feels about the alternatives to string theory — loop quantum gravity, spin foams, twistors, and so on — at least each one has a “developer base,” a community of physicists who are actively trying to make it work. By contrast, I don’t know of any picture of the world in which quantum computing is impossible, that’s being actively developed by any research community anywhere. (Gerard ‘t Hooft and Stephen Wolfram are not research communities.) All the skepticism of quantum computing that I’m aware of is purely negative in character.
12. The Historical Argument. If the above arguments are sound, then why haven’t people already succeeded in building quantum computers? It’s been what, ten years already? Some historical perspective might be helpful here: in Samuel Johnson’s The History of Rasselas, Prince of Abissinia, written in 1759, Johnson has one of his characters give a correct explanation of why heavier-than-air flying machines should be physically possible, and then build a test plane that promptly plummets into a lake. Johnson was safe in ridiculing the idea; it would be another 144 years before Kitty Hawk. Closer to our topic, Darwin wrote in his autobiography about an eccentric loon of his acquaintance, who dreamed of building an engine to automate routine human thought. Though the loon — a certain Charles Babbage — hadn’t run afoul of any fundamental theory, his proposal to build a classical computer was a century ahead of its time. Since the 1600’s, science has often been generations ahead of technology. History gives us no reason at all to assume that a technology will be discovered to be compatible with known laws of physics at about the same time as it becomes possible to implement.
13. The Trademark-Twist Argument. This last argument is the hardest one to articulate, but possibly the most compelling to my mind. In my view, Nature has been telling us, over and over and over, that our everyday intuitions will match the physical world if and only if we first apply a little “twist” to them. Often this twist involves an unusual symmetry, or switching from the L1 to the L2 norm, or inserting negative or complex numbers where our intuition says that only nonnegative real numbers would make sense. We see such a twist in special relativity, in the metric that’s not positive definite but instead has a (-1,1,1,1) signature. We see it in the -1 phase that the universe picks up when you swap a fermion with its identical twin. We see it in the fact that, to rotate an electron back to where it was, you have to turn it not 360° but 720°. We see it in the Dirac equation. We see it, of course, in quantum mechanics itself. And what is BQP, if not P=BPP with Nature’s trademark little twist?
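As promised under the linearity argument (point 5), here is a tiny numerical illustration. It is only a sketch: the 8-dimensional random unitary and the size of the perturbation are arbitrary choices of mine, but they exhibit the basic fact that a norm-preserving linear map, applied over and over, neither shrinks nor amplifies an initial error.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random unitary (norm-preserving linear map on amplitude vectors),
# obtained from the QR decomposition of a random complex matrix.
n = 8
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
U, _ = np.linalg.qr(A)

# A normalized state vector and a slightly perturbed copy of it.
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)
error = 1e-6 * (rng.normal(size=n) + 1j * rng.normal(size=n))

# Apply the same unitary many times to both vectors: because U preserves
# the 2-norm, the distance between the two trajectories never grows.
v, w = psi.copy(), psi + error
for _ in range(1000):
    v, w = U @ v, U @ w

print("initial error:         ", np.linalg.norm(error))
print("error after 1000 steps:", np.linalg.norm(w - v))
```

If you instead use a generic non-norm-preserving matrix, the same experiment will typically blow the initial error up exponentially in the number of steps, which is exactly the analog-computer failure mode the argument describes.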
Good news, everyone!  Anindya De, Oded Regev, and my postdoc Thomas Vidick are launching an online theoretical computer science seminar series called TCS+, modeled after the successful Q+ quantum information seminars run by Daniel Burgarth and Matt Leifer.  The inaugural TCS+ lecture will be on Wednesday Feb. 6, at noon Eastern Standard Time.  Ronald de Wolf, longtime friend both of this blog and of its author, will be speaking on Exponential Lower Bounds for Polytopes in Combinatorial Optimization, his STOC’2012 Best Paper with Samuel Fiorini, Serge Massar, Sebastian Pokutta and Hans Raj Tiwary.  This is the paper that used ideas originally from quantum communication complexity to solve a 20-year-old problem in classical optimization: namely, to rule out the possibility of proving P=NP by reducing the Traveling Salesman Problem to certain kinds of linear programs.  Ronald previously gave the talk at MIT, and it rocked.  See Thomas’s blog for details about how to watch.
At least eight people—journalists, colleagues, blog readers—have now asked my opinion of a recent paper by Ross Anderson and Robert Brady, entitled “Why quantum computing is hard and quantum cryptography is not provably secure.”  Where to begin?
1. Based on a “soliton” model—which seems to be almost a local-hidden-variable model, though not quite—the paper advances the prediction that quantum computation will never be possible with more than 3 or 4 qubits.  (Where “3 or 4” are not just convenient small numbers, but actually arise from the geometry of spacetime.)  I wonder: before uploading their paper, did the authors check whether their prediction was, y’know, already falsified?  How do they reconcile their proposal with (for example) the 8-qubit entanglement observed by Haffner et al. with trapped ions—not to mention the famous experiments with superconducting Josephson junctions, buckyballs, and so forth that have demonstrated the reality of entanglement among many thousands of particles (albeit not yet in a “controllable” form)?
2. The paper also predicts that, even with 3 qubits, general entanglement will only be possible if the qubits are not collinear; with 4 qubits, general entanglement will only be possible if the qubits are not coplanar.  Are the authors aware that, in ion-trap experiments (like those of David Wineland that recently won the Nobel Prize), the qubits generally are arranged in a line?  See for example this paper, whose abstract reads in part: “Here we experimentally demonstrate quantum error correction using three beryllium atomic-ion qubits confined to a linear, multi-zone trap.”
3. Finally, the paper argues that, because entanglement might not be a real phenomenon, the security of quantum key distribution remains an open question.  Again: are the authors aware that the most practical QKD schemes, like BB84, never use entanglement at all?  And that therefore, even if the paper’s quasi-local-hidden-variable model were viable (which it’s not), it still wouldn’t justify the claim in the title that “…quantum cryptography is not provably secure”?
Yeah, this paper is pretty uninformed even by the usual standards of attempted quantum-mechanics-overthrowings.  Let me now offer three more general thoughts.
First thought: it’s ironic that I’m increasingly seeing eye-to-eye with Lubos Motl—who once called me “the most corrupt piece of moral trash”—in his rantings against the world’s “anti-quantum-mechanical crackpots.”  Let me put it this way: David Deutsch, Chris Fuchs, Sheldon Goldstein, and Roger Penrose hold views about quantum mechanics that are diametrically opposed to one another’s.  Yet each of these very different physicists has earned my admiration, because each, in his own way, is trying to listen to whatever quantum mechanics is saying about how the world works.  However, there are also people all of whose “thoughts” about quantum mechanics are motivated by the urge to plug their ears and shut out whatever quantum mechanics is saying—to show how whatever naïve ideas they had before learning QM might still be right, and how all the experiments of the last century that seem to indicate otherwise might still be wiggled around.  Like monarchists or segregationists, these people have been consistently on the losing side of history for generations—so it’s surprising, to someone like me, that they continue to show up totally unfazed and itching for battle, like the knight from Monty Python and the Holy Grail with his arms and legs hacked off.  (“Bell’s Theorem?  Just a flesh wound!”)
Like any physical theory, of course quantum mechanics might someday be superseded by an even deeper theory.  If and when that happens, it will rank alongside Newton’s apple, Einstein’s elevator, and the discovery of QM itself among the great turning points in the history of physics.  But it’s crucial to understand that that’s not what we’re discussing here.  Here we’re discussing the possibility that quantum mechanics is wrong, not for some deep reason, but for a trivial reason that was somehow overlooked since the 1920s—that there’s some simple classical model that would make everyone exclaim,  “oh!  well, I guess that whole framework of exponentially-large Hilbert space was completely superfluous, then.  why did anyone ever imagine it was needed?”  And the probability of that is comparable to the probability that the Moon is made of Gruyère.  If you’re a Bayesian with a sane prior, stuff like this shouldn’t even register.
Second thought: this paper illustrates, better than any other I’ve seen, how despite appearances, the “quantum computing will clearly be practical in a few years!” camp and the “quantum computing is clearly impossible!” camp aren’t actually opposed to each other.  Instead, they’re simply two sides of the same coin.  Anderson and Brady start from the “puzzling” fact that, despite what they call “the investment of tremendous funding resources worldwide” over the last decade, quantum computing still hasn’t progressed beyond a few qubits, and propose to overthrow quantum mechanics as a way to resolve the puzzle.  To me, this is like arguing in 1835 that, since Charles Babbage still hasn’t succeeded in building a scalable classical computer, we need to rewrite the laws of physics in order to explain why classical computing is impossible.  I.e., it’s a form of argument that only makes sense if you’ve adopted what one might call the “Hype Axiom”: the axiom that any technology that’s possible sometime in the future, must in fact be possible within the next few years.
Third thought: it’s worth noting that, if (for example) you found Michel Dyakonov’s arguments against QC (discussed on this blog a month ago) persuasive, then you shouldn’t find Anderson’s and Brady’s persuasive, and vice versa.  Dyakonov agrees that scalable QC will never work, but he ridicules the idea that we’d need to modify quantum mechanics itself to explain why.  Anderson and Brady, by contrast, are so eager to modify QM that they don’t mind contradicting a mountain of existing experiments.  Indeed, the question occurs to me of whether there’s any pair of quantum computing skeptics whose arguments for why QC can’t work are compatible with one another’s.  (Maybe Alicki and Dyakonov?)
But enough of this.  The truth is that, at this point in my life, I find it infinitely more interesting to watch my two-week-old daughter Lily, as she discovers the wonderful world of shapes, colors, sounds, and smells, than to watch Anderson and Brady, as they fail to discover the wonderful world of many-particle quantum mechanics.  So I’m issuing an appeal to the quantum computing and information community.  Please, in the comments section of this post, explain what you thought of the Anderson-Brady paper.  Don’t leave me alone to respond to this stuff; I don’t have the time or the energy.  If you get quantum probability, then stand up and be measured!
So, it seems the arXiv is now so popular that even Leonhard Euler has contributed 25 papers, despite being dead since 1783. (Thanks to Ars Mathematica for this important news item, as well as for the hours of procrastination on my part that led to its rediscovery.) Since I’d long been curious about the mathematical research interests of the nonliving, I decided to check out Leonhard’s most recent preprint, math.HO/0608467 (“Theorems on residues obtained by the division of powers”). The paper starts out slow: explaining in detail why, if a mod p is nonzero, then a^2 mod p, a^3 mod p, and so on are also nonzero. By the end, though, it’s worked out most of the basics of modular arithmetic, enough (for example) to analyze RSA. Furthermore, the exposition, while “retro” in style, is sufficiently elegant that I might even recommend acceptance at a minor theory conference, even though the basic results have of course been known for like 200 years.
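For readers who want to see how little machinery “enough to analyze RSA” really requires, here is a toy illustration in modern notation (my gloss, not Euler’s): the facts about residues of powers that the paper develops are exactly the ingredients behind Fermat’s little theorem and textbook RSA. The specific primes and exponents below are arbitrary illustrative choices.

```python
# Toy RSA plus Fermat's little theorem, built from nothing but residues of powers.
# Requires Python 3.8+ for pow(e, -1, phi).
p, q = 61, 53                       # small primes, chosen only for illustration
n, phi = p * q, (p - 1) * (q - 1)   # n = 3233, phi = 3120
e = 17                              # public exponent, coprime to phi
d = pow(e, -1, phi)                 # private exponent: the inverse of e mod phi

m = 1234                            # a "message", any integer smaller than n
c = pow(m, e, n)                    # encrypt: c = m^e mod n
assert pow(c, d, n) == m            # decrypt: c^d = m^(e*d) ≡ m (mod n)

# Fermat's little theorem, the kind of statement the paper builds up to:
assert pow(7, p - 1, p) == 1        # a^(p-1) ≡ 1 (mod p) when p does not divide a
```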
Oh — you say that Mr. E’s papers were as difficult and abstract for their time as Wiles and Perelman’s papers are for our own time? BULLSHIT. Reading the old master brings home the truth: that, for better and worse, math has gotten harder. Much, much harder. And we haven’t gotten any smarter.
Today I break long radio silence to deliver some phenomenal news.  Two of the people who I eat lunch with every week—my MIT CSAIL colleagues Silvio Micali and Shafi Goldwasser—have won a well-deserved Turing Award, for their fundamental contributions to cryptography from the 1980s till today.  (I see that Lance just now beat me to a blog post about this.  Dammit, Lance!)
I won’t have to tell many readers of this blog that the names Goldwasser and Micali—or more often, the initials “G” and “M”—are as ubiquitous as Alice and Bob in modern cryptography, from the GGM construction of pseudorandom functions (discussed before on this blog), to the classic GMR paper that introduced the world to interactive proofs.  Besides that, Shafi and Silvio are known as two of the more opinionated and colorful characters of theoretical computer science—and as I learned last week, Silvio is also an awesome party host, who has perfect taste in sushi (as well as furniture and many other things).
I wish I could go on right now talking about Shafi and Silvio—and even more, that I could join the celebration that will happen at MIT this afternoon.  But I’m about to board a flight to LAX, to attend the 60th birthday symposium of longtime friend, extraordinary physicist, and sometime Shtetl-Optimized commenter John Preskill.  (I’ll also be bringing you coverage of that symposium, including slides from my talk there on hidden variables.)  So, leave your congratulations, etc. in the comments section, and I’ll see them when I land!
I woke up at my normal time — probably around 2PM — in my room at Berkeley’s International House, to find an avalanche of email: from a fellow grad student, urging everyone to check the news; from Christos Papadimitriou, reminding us that we have a community here, and communities can comfort; from Luca Trevisan, announcing that the class that he taught and I TA’ed would be canceled, since on a day like this it was impossible to think about algorithms. I then clicked over to news sites to find out what had happened.
After confirming that my friends and family were safe, I walked over to my office in Soda Hall, mostly to find people to talk to. Technically I had office hours for the algorithms class that afternoon, but I didn’t expect students actually to come. Yet come they did: begging for hints on the problem set, asking what would and wouldn’t be on the test, pointing to passages in the CLRS textbook that they didn’t understand. I pored over their textbook, shaking my head in disbelief, glancing up every minute or so at the picture of the burning buildings on the computer screen.
That night there was a big memorial service in Sproul Plaza. When I arrived, a woman offered me a candle, which I took, and a man standing next to her offered me a flyer, which I also took. The flyer, which turned out to be from a socialist organization, sought to place the events of that morning in “context,” describing the World Trade Center victims as “mostly white-collar executives and those who tried to save them.”
After a few songs and eulogies, a woman got up to explain that, on this terrible day, what was really important was that we try to understand the root causes of violence — namely poverty and despair — and not use this tragedy as a pretext to start another war. The crowd thunderously applauded.
While the speeches continued, I got up and wandered off by myself in the direction of Bancroft Way. Much as I did the year before, when the area around Telegraph was festooned with Nader for President posters, I felt palpably that I wasn’t living in an outcomes-based region of reality. The People’s Republic of Berkeley was proving to be a staunch ally of the Oilmen’s Oligarchy of Crawford, undermining the only sorts of opposition to it that had any possibility of succeeding.
I decided to forget about politics for a while and concentrate exclusively on research. I can’t say I succeeded at this. But I did pass my prelim exam three days later (on September 14), and a few weeks afterward proved the quantum lower bound for the collision problem.
Note: Feel free to post your own retrospective in the comments section. Andris Ambainis has already done so.
I got back a couple days ago from John Preskill‘s 60th birthday symposium at Caltech.  To the general public, Preskill is probably best known for winning two bets against Stephen Hawking.  To readers of Shtetl-Optimized, he might be known for his leadership in quantum information science, his pioneering work in quantum error-correction, his beautiful lecture notes, or even his occasional comments here (though these days he has his own group blog and Twitter feed to keep him busy).  I know John as a friend, colleague, and mentor who’s done more for me than I can say.
The symposium was a blast—a chance to hear phenomenal talks, enjoy the California sun, and catch up with old friends like Dave Bacon (who stepped down as Pontiff before stepping down as Pontiff was cool).  The only bad part was that I inadvertently insulted John in my talk, by calling him my “lodestar of sanity.”  What I meant was that, for 13 years, I’ve known plenty of physicists who can be arbitrarily off-base when they talk about computer science and vice versa, but I’ve only ever known John to be on-base about either.  If you asked him a question involving, say, both Barrington’s Theorem and Majorana fermions, he’s one of the few people on earth who would know both, seem totally unfazed by your juxtaposing them, and probably have an answer that he’d carefully tailor to your level of knowledge and interest.  In a polyglot field like quantum information, that alone makes him invaluable.  But along with his penetrating insight comes enviable judgment and felicity of expression: unlike some of us (me), John always manages to tell the truth without offending his listeners.  If I were somehow entrusted with choosing a President of the United States, he’d be one of my first choices, certainly ahead of myself.
Anyway, it turned out that John didn’t like my use of the word “sane” to summarize the above: for him (understandably, in retrospect), it had connotations of being humorless and boring, two qualities I’ve never seen in him.  (Also, as I pointed out later, the amount of time John has spent helping me and patiently explaining stuff to me does weigh heavily against his sanity.)  So I hereby rename John my Lodestar of Awesomeness.
In case anyone cares, my talk was entitled “Hidden Variables as Fruitful Dead Ends”; the PowerPoint slides are here.  I spoke about a new preprint by Adam Bouland, Lynn Chua, George Lowther, and myself, on possibility and impossibility results for “ψ-epistemic theories” (a class of hidden-variable theories that was also the subject of the recent PBR Theorem, discussed previously on this blog).  My talk also included material from my old paper Quantum Computing and Hidden Variables.
The complete program is here.  A few highlights (feel free to mention others in the comments):
* Patrick Hayden spoke about a beautiful result of himself and Alex May, on “where and when a qubit can be.”  After the talk, I commented that it’s lucky for the sake of Hayden and May’s induction proof that 3 happens to be the next integer after 2.  If you get that joke, then I think you’ll understand their result and vice versa.
* Lenny Susskind—whose bestselling The Theoretical Minimum is on my to-read list—spoke about his views on the AMPS firewall argument.  As you know if you’ve been reading physics blogs, the firewall argument has been burning up (har, har) the world of quantum gravity for months, putting up for grabs aspects of black hole physics long considered settled (or not, depending on who you ask).  Lenny gave a typically-masterful summary, which for the first time enabled me to understand the role played in the AMPS argument by “the Zone” (a region near the black hole but outside its event horizon, in which the Hawking radiation behaves a little differently than it does when it’s further away).  I was particularly struck by Lenny’s comment that whether an observer falling into a black hole encounters a firewall might be “physics’ Axiom of Choice”: that is, we can only follow the logical consequences of theories we formulate outside black-hole event horizons, and maybe those theories simply don’t decide the firewall question one way or the other.  (Then again, maybe they do.)  Lenny also briefly mentioned a striking recent paper by Harlow and Hayden, which argues that the true resolution of the AMPS paradox might involve … wait for it … computational complexity, and specifically, the difficulty of solving QSZK (Quantum Statistical Zero Knowledge) problems in BQP.  And what’s a main piece of evidence that QSZK⊄BQP?  Why, the collision lower bound, which I proved 12 years ago while a summer student at Caltech and an awestruck attendee of Preskill’s weekly group meetings.  Good thing no one told me back then that black holes were involved.
* Charlie Bennett talked about things that I’ve never had the courage to give a talk about, like the Doomsday Argument and the Fermi Paradox.  But his disarming, avuncular manner made it all seem less crazy than it was.
* Paul Ginsparg, founder of the arXiv, presented the results of a stylometric analysis of John Preskill’s and Alexei Kitaev’s research papers.  The main results were as follows: (1) John and Alexei are easily distinguishable from each other, due in part to the latter’s more “Russian” use of function words (“the,” “which,” “that,” etc.).  (2) Alexei, despite having lived in the US for more than a decade, is if anything becoming more “Russian” in his function word use over time. (3) Even more interestingly, John is also becoming more “Russian” in his function word use—a possible result of his long interaction with Alexei. (4) A joint paper by Kitaev and Preskill was indeed written by both of them.  (Update: While detained at the airport, Paul decided to post an online video of his talk.)
Speaking of which, the great Alexei Kitaev himself—the $3 million man—spoke about Berry curvature for many-body systems, but unfortunately I had to fly back early (y’know, 2-month-old baby) and missed his talk.  Maybe someone else can provide a summary.
Happy 60th birthday, John!
* * *
Two unrelated announcements.
1\. Everyone who reads this blog should buy Sean Carroll’s two recent books: From Eternity to Here (about the arrow of time) and The Particle at the End of the Universe (about the Higgs boson and quantum field theory more generally).  They’re two of the best popular physics books I’ve ever read—in their honesty, humor, clarity, and total lack of pretense, they exemplify what every book in this genre should be but very few are.  If you need even more inducement, go watch Sean hit it out of the park on the Colbert Report (and then do it again).  I can’t watch those videos without seething with jealousy: given how many “OK”s and “y’know”s lard my every spoken utterance, I’ll probably never get invited to hawk a book on Colbert.  Which is a shame, because as it happens, my Quantum Computing Since Democritus book will finally be released in the US by Cambridge University Press on April 30th!  (It’s already available in the UK, but apparently needs to be shipped to the US by boat.)  And it’s loaded with new material, not contained in the online lecture notes.  And you can preorder it now.  And my hawking of Sean’s books is in no way whatsoever related to any hope that Sean might return the favor with my book.
2\. Recent Turing Award winner Silvio Micali asks me to advertise the Second Cambridge Area Economics and Computation Day (CAEC’13), which will be held on Friday April 26 at MIT.  Anything for you, Silvio!  (At least for the next week or two.)
Update (March 22): The Kindle edition of Quantum Computing Since Democritus is now available, for the low price of $15.40!  (Not factorial.)  Click here to get it from amazon.com, or here to get it from amazon.co.uk.  ~~And let me know how it looks (I haven’t seen it yet).~~  Another Update: Just saw the Kindle edition, and the figures and formulas came out great!  It’s a product I stand behind with pride.
In the meantime, I regret to say that the marketing for this book is getting crasser and more exploitative by the day.
* * *
It seems like wherever I go these days, all anyone wants to talk about is Quantum Computing Since Democritus—the sprawling new book by Scott Aaronson, published by Cambridge University Press and available for order now.  Among leading figures in quantum information science—many of them well-known to Shtetl-Optimized readers—the book is garnering the sort of hyperbolic praise that would make Shakespeare or Tolstoy blush:
“I laughed, I cried, I fell off my chair – and that was just reading the chapter on Computational Complexity.  Aaronson is a tornado of intellectual activity: he rips our brains from their intellectual foundations; twists them through a tour of physics, mathematics, computer science, and philosophy; stuffs them full of facts and theorems; tickles them until they cry ‘Uncle’; and then drops them, quivering, back into our skulls.  Aaronson raises deep questions of how the physical universe is put together and why it is put together the way it is.  While we read his lucid explanations we can believe – at least while we hold the book in our hands – that we understand the answers, too.” —Seth Lloyd
“Scott Aaronson has written a beautiful and highly original synthesis of what we know about some of the most fundamental questions in science: What is information? What does it mean to compute? What is the nature of mind and of free will?” —Michael Nielsen
“Not since Richard Feynman’s Lectures on Physics has there been a set of lecture notes as brilliant and as entertaining.  Aaronson leads the reader on a wild romp through the most important intellectual achievements in computing and physics, weaving these seemingly disparate fields into a captivating narrative for our modern age of information.  Aaronson wildly runs through the fields of physics and computers, showing us how they are connected, how to understand our computational universe, and what questions exist on the borders of these fields that we still don’t understand.   This book is a poem disguised as a set of lecture notes.  The lectures are on computing and physics, complexity theory and mathematical logic and quantum physics.  The poem is made up of proofs, jokes, stories, and revelations, synthesizing the two towering fields of computer science and physics into a coherent tapestry of sheer intellectual awesomeness.” —Dave Bacon
After months of overhearing people saying things like the above—in the halls of MIT, the checkout line at Trader Joe’s, the bathroom, anywhere—I finally had to ask in annoyance: “is all this buzz justified?  I mean, I’m sure the book is as deep, hilarious, and worldview-changing as everyone says it is.  But, after all, it’s based off lecture notes that have long been available for free on the web.  And Aaronson, being the magnanimous, open-access-loving saint that he is, has no plans to remove the online notes, even though he could really use the royalties from book sales to feed his growing family.  Nor does Cambridge University Press object to his principled decision.”
“No, you don’t understand,” they told me.  “Word on the street has it that the book is extensively updated for 2013—that it’s packed with new discussions of things like algebrization, lattice-based cryptography, the QIP=PSPACE theorem, the ‘quantum time travel controversy,’ BosonSampling, black-hole firewalls, and even the Australian models episode.  They say it took years of painstaking work, by Aaronson and his student Alex Arkhipov, to get the notes into book form: fixing mistakes, clarifying difficult points, smoothing out rough edges, all while leaving intact the original’s inimitable humor.  I even heard Aaronson reveals he’s changed his mind about certain things since 2006.  How could you not want such a labor of love on your bookshelf?”
Exasperated, I finally exclaimed: “But the book isn’t even out yet in North America!  Amazon.com says it won’t ship until April 30.”
“Sure,” one gas-station attendant replied to me, “but the secret is, it’s available now from Amazon.co.uk.  Personally, I couldn’t wait a month, so I ordered it shipped to me from across the pond.  But if you’re a less hardcore quantum complexity theory fan, and you live in North America, you can also preorder the book from Amazon.com, and they’ll send it to you when it arrives.”
Much as the hype still grated, I had to admit that I’d run out of counterarguments, so I looked into ordering a copy for myself.
It’s the Science Blog Carnival! Come see me hawking my wares, alongside Peter Woit, Dave Bacon, and dozens of other clowns.
As some of you probably heard, last week Sen. Tom Coburn (R-Oklahoma) managed to get an amendment passed prohibiting the US National Science Foundation from funding any research in political science, unless the research can be “certified” as “promoting national security or the economic interests of the United States.”  This sort of political interference with the peer-review process, of course, sets a chilling precedent for all academic research, regardless of discipline.  (What’s next, an amendment banning computer science research, unless it has applications to scheduling baseball games or slicing apple pies?)  But on researching further, I discovered that Sen. Coburn has long had it in for the NSF, and even has a whole webpage listing his grievances against the agency.  Most of it is the usual “can you believe they wasted money to study something so silly or obvious?,” but by far my favorite tidbit is the following:
Inappropriate staff behavior including porn surfing and Jello wrestling and skinny-dipping at NSF-operated facilities in Antarctica.
It occurred to me that the NSF really has no need to explain this one, since a complete explanation is contained in a single word of the charge itself: Antarctica.  Personally, I’d support launching an investigation of NSF’s Antarctica facilities, were it discovered that the people stuck in them weren’t porn surfing and Jello wrestling and skinny-dipping.
That, for better or worse, is the name of a course I’m teaching this semester at the University of Waterloo. I’m going to post all of the lecture notes online, so that you too can enjoy an e-learning cyber-experience in my virtual classroom, even if you live as far away as Toronto. I’ve already posted Lecture 1, “Atoms and the Void.” Coming up next: Lecture 2.
“Meme” courtesy of my brother David
First news item: it’s come to my attention that yesterday, an MIT professor abused his power over students for a cruel April Fools’ Day prank involving the P vs. NP problem.  His email to the students is below.
I assume most of you already heard the news that a Caltech grad student, April Felsen, announced a 400-page proof of P≠NP last week.  While I haven’t yet completely digested the argument, it’s already clear that Felsen (who I actually knew back when she was an MIT undergrad) has changed theoretical computer science forever, bringing in new tools from K-theory to higher topos theory to solve the biggest problem there was.
Alas, Felsen’s proof has the “short-term” effect of making the existing 6.045 seem badly outdated.  So, after long reflection, I’ve made a decision that not all of you are going to like, but that I believe is the right one intellectually.  I’ve decided to reorient the entire course to focus on Felsen’s result, starting with tomorrow’s lecture.
And further, I decided to rewrite Thursday’s midterm to focus almost entirely on this new material.  That means that, yes, you’re going to have THREE DAYS to learn at least the basics of algebraic topology and operator algebras, as used in Felsen’s proof.  To do that, you might need to drop everything else (including sleep, unfortunately), and this might prove to be the most strenuous and intense thing you’ve ever done.  But it will also be an experience that will enrich your minds and ennoble your souls, and that you’ll be proud to tell your grandchildren about.  And of course we’ll be there to help out.  So let’s get started!
All the best,
Scott
* * *
Second news item: many of you have probably heard that Lance Fortnow’s The Golden Ticket—the first popular book about the P vs. NP problem—is now out.  (The title refers to Roald Dahl’s Charlie and the Chocolate Factory, which involved a few chocolate bars that had coveted golden tickets inside the wrappers, along with millions of chocolate bars that didn’t.)  I read it last week, and I think it’s excellent: a book I’ll happily recommend to family and friends who want the gentlest introduction to complexity theory that exists.
Some context: for more than a decade, people have been telling me that I should write a popular book about P vs. NP, and I never did, and now Lance has.  So I’m delighted to say that reading Lance’s book quickly cured me of any regrets I might have felt.  For not only is The Golden Ticket a great book, but better yet, it’s not a book that I ever could’ve written.
Here’s why: every time I would have succumbed to the temptation to explain something too complicated for the world’s journalists, literary humanists, and pointy-haired bosses—something like relativization, or natural proofs, or arithmetization, or Shannon’s counting argument, or Ladner’s Theorem, or coNP, or the reasons to focus on polynomial time—every time, Lance somehow manages to resist the temptation, and to stick to cute stories, anecdotes, and practical applications.  This is really, truly a popular book: as Lance points out himself, in 162 pages of discussing the P vs. NP question, he never even formally defines P and NP!
But it goes beyond that: in the world of The Golden Ticket, P vs. NP is important because, if P=NP, then people could design more effective cancer therapies, solve more crimes, and better predict which baseball games would be closely-matched and exciting (yes, really).  P vs. NP is also important because it provides a unifying framework for understanding current technological trends, like massively-parallel computing, cloud computing, big data, and the Internet of things.  Meanwhile, quantum computing might or might not be possible in principle, but either way, it’s probably not that relevant because it won’t be practical for a long time.
In short, Lance has written precisely the book about P vs. NP that the interested layperson or IT professional wants and needs, and precisely the book that I couldn’t have written.  I would’ve lost patience by around page 20, and exclaimed:
“You want me to justify the P vs. NP problem by its relevance to baseball??  Why shouldn’t baseball have to justify itself by its relevance to P vs. NP?  Pshaw!  Begone from the house of study, you cretinous fools, and never return!”
My favorite aspect of The Golden Ticket was its carefully-researched treatment of the history of the P vs. NP problem in the 50s, 60s, and 70s, both in the West and in the Soviet Union (where it was called the “perebor” problem).  Even complexity theorists will learn countless tidbits—like how Leonid Levin was “discovered” at age 15, and how the powerful Sergey Yablonsky stalled Soviet perebor research by claiming to have solved the problem when he’d done nothing of the kind.  The historical chapter (Chapter 5) is alone worth the price of the book.
I have two quibbles.  First, throughout the book, Lance refers to a hypothetical world where P=NP as the “Beautiful World.”  I would’ve called that world the “Hideous World”!  For it’s a world where technical creativity is mostly worthless, and where the mathematical universe is boring, flat, and incomprehensibly comprehensible.  Here’s an analogy: suppose a video game turned out to have a bug that let you accumulate unlimited points just by holding down a certain button.  Would anyone call that game the “Beautiful Game”?
My second disagreement concerns quantum computing.  Overall, Lance gives an admirably-accurate summary, and I was happy to see him throw cold water on breathless predictions about QC and other quantum-information technologies finding practical applications in the near future.  However, I think he goes beyond the truth when he writes:
> [W]e do not know how to create a significant amount of entanglement in more than a handful of quantum bits.  It might be some fundamental rule of nature that prevents significant entanglement for any reasonable length of time.  Or it could just be a tricky engineering problem.  We’ll have to let the physicists sort that out.
The thing is, physicists do know how to create entanglement among many thousands or even millions of qubits—for example, in condensed-matter systems like spin lattices, and in superconducting Josephson junctions.  The problem is “merely” that they don’t know how to control the entanglement in the precise ways needed for quantum computing.  But as with much quantum computing skepticism, the passage above doesn’t seem to grapple with just how hard it is to kill off scalable QC.  How do you cook up a theory that can account for the massively-entangled states that have already been demonstrated, but that doesn’t give you all of BQP?
But let me not harp on these minor points, since The Golden Ticket has so many pleasant features.  One of them is its corny humor: even in Lance’s fantasy world where a proof of P=NP has led to a cure for cancer, it still hasn’t led to a cure for the common cold.  Another nice feature is the book’s refreshing matter-of-factness: Lance makes it clear that he believes that
(a) P≠NP,
(b) the conjecture is provable but won’t be proven in the near future, and
(c) if we ever meet an advanced extraterrestrial civilization, they’ll also have asked the P vs. NP question or something similar to it.
Of course we can’t currently prove any of the above statements, just like we can’t prove the nonexistence of Bigfoot.  But Lance refuses to patronize his readers by pretending to harbor doubts that he quite reasonably doesn’t.
In summary, if you’re the sort of person who stops me in elevators to say that you like my blog even though you never actually understand anything in it, then stop reading Shtetl-Optimized right now and go read Lance’s book.  You’ll understand it and you’ll enjoy it.
And now it’s off to class, to apologize for my April Fools prank and to teach the Cook-Levin Theorem.
My friend Alex Halderman is now after bigger fish than copy-“protected” music CD’s. Watch this video, in which he, Ed Felten, and Ariel Feldman demonstrate how to rig a Diebold voting machine (and also watch Alex show off his lock-picking skills). Reading the group’s paper, one becomes painfully aware of a yawning cultural divide between nerds and the rest of the world. Within the nerd universe, that voting machines need to have a verifiable paper trail, that they need to be open to inspection by researchers, etc., are points so obvious as to be scarcely worth stating. If a company (Diebold) refuses to take these most trivial of precautions, then even without a demonstration of the sort Alex et al. provide, the presumption must be that their machines are insecure. Now Alex et al. are trying to take what’s obvious to nerds into a universe — local election boards, the courts, etc. — that operates by entirely different rules. Within this other universe, the burden is not on Diebold to prove its voting machines are secure; it’s on Alex et al. to prove they’re insecure. And even if they do prove they’re insecure — well, if it weren’t for those pesky researchers telling the bad guys how to cheat, what would we have to worry about?
So, how does one bridge this divide? How does one explain the obvious to those who, were they capable of understanding it, would presumably have understood it already? I wish I had an easy answer, but I fear there’s nothing to do but what Alex, Ed, and Ariel are doing already — namely, fight with everything you’ve got.
An addendum to my last post: a few days ago, I got an email with the subject line “BBQ,…”
“Awesome!” I thought. “Free food! Where?”
But no, the email was from someone who had read one of my papers, and who wanted references for the strange, unfamiliar terms that littered the text — terms like “BBP” and “BBQ.” I wasn’t sure what to tell him, except that the class BBQ contains BYOB and is conjectured to be incomparable with BLT.
Furthermore, the last of those things actually happened.  What won’t I do to promote Quantum Computing Since Democritus?  Enjoy!
Update: I submitted the following response to the comments over on Lubos’s blog.  Since it has some bits of general interest, I thought I’d crosspost it here while it awaits Lubos’s moderation.
* * *
Since Lubos “officially invited” me to respond to the comments here, let me now do so.
1\. On “loopholes” in quantum mechanics: I completely agree with Lubos’s observation that the actual contents of my book are “conservative” about the truth of QM. Indeed, I predict that, when Lubos reads his free copy, he’ll agree with (or at least, have no objections to) the vast majority of what’s in the book. On the other hand, because I was guest-blogging about “the story of me and Lubos,” I found it interesting to highlight one area of disagreement regarding QM, rather than the larger areas of agreement.
2\. On Gene Day’s patronizing accusation that I don’t “get the basics of QM or even comprehend the role of mathematics in physics”: his misreading of what I wrote is so off-base that I don’t know whether a response is even necessary.  Briefly, though: of course two formulations of QM are mathematically equivalent if they’re mathematically equivalent!  I wasn’t asking why we don’t use different mathematical structures (quaternions, the 3-norm, etc.) to describe the same physical world.  I was asking why the physical world itself shouldn’t have been different, in such a way that those other mathematical structures would have described it.  In other words: if you were God, and you tried to invent a theory that was like QM but based on those other structures, would the result necessarily be less “nice” than QM?  Would you have to give up various desirable properties of QM?  Yes?  Can you prove it?  The ball’s in your court, Mr. Day — or else you can just read my book! 🙂
3\. On Lord Nelson’s accusation that I’m a “poseur”: on reflection, someone who only knew me from blog stunts like this one could easily be forgiven for getting that impression! 🙂 So it might be worth pointing out for the record that I also have a “day job” outside the blogosphere, whose results you can see here if you care.
4\. On my political views: I wish to clarify for Tom Vonk that I despise not only “Communists,” but the ideology of Communism itself. One of the formative experiences of my life occurred when I was an 8-year-old at Wingate Kirkland summer camp, and all the campers had to relinquish whatever candy they’d brought into a communal “bunk trunk.” The theory was that all the campers, rich and poor alike, would then share the candy equally during occasional “bunk parties.” What actually happened was that the counselors stole the candy. So, during a meeting of the entire camp, I got up and gave a speech denouncing the bunk trunk as Communism. The next day, the camp director (who had apparently been a fellow-traveler in the 1950s) sat with me at lunchtime, and told me about a very evil man named Joe McCarthy who I was in danger of becoming like. But the truth was that I’d never even heard of McCarthy at that point — I just wanted to eat candy.  And I’d give exactly the same speech today.
Like (I suppose) several billion of the world’s people, I believe in a dynamic market-based capitalist society, and also in strong environmental and other regulations to safeguard that society’s continued existence. And I don’t merely believe in that as a cynical compromise, since I can’t get the “dictatorship of the proletariat” that I want in my heart of hearts. Were I emperor of the world, progressive capitalism is precisely what I would institute. In return, perhaps, for paying a “candy tax” to keep the bunk functioning smoothly, campers could keep their remaining candy and eat or trade it to their heart’s delight.
5\. On climate change: I’m not a professional climatologist, but neither is Lubos, and nor (correct me if I’m wrong) is anyone else commenting here. Accordingly, I refuse to get drawn into a debate about ice cores and tree rings and hockey sticks, since my experience is that such debates tend to be profoundly unilluminating when not conducted by experts. My position is an incredibly simple one: just like with the link between smoking and cancer, or the lack of a link between vaccines and autism, or any other issue where I lack the expertise to evaluate the evidence myself, I’ll go with what certainly looks like an overwhelming consensus among the scientists who’ve studied the matter carefully. Period. If the climate skeptics want to win me over, then the way for them to do so is straightforward: they should ignore me, and try instead to win over the academic climatology community, majorities of chemists and physicists, Nobel laureates, the IPCC, National Academies of Science, etc. with superior research and arguments.
To this, the skeptics might respond: but of course we can’t win over the mainstream scientific community, since they’re all in the grip of an evil left-wing conspiracy or delusion!  Now, that response is precisely where “the buck stops” for me, and further discussion becomes useless.  If I’m asked which of the following two groups is more likely to be in the grip of a delusion — (a) Senate Republicans, Freeman Dyson, and a certain excitable string-theory blogger, or (b) virtually every single expert in the relevant fields, and virtually every other chemist and physicist who I’ve ever respected or heard of — well then, it comes down to a judgment call, but I’m 100% comfortable with my judgment.
Along with his law firm. You can read his side of the Poincaré story at doctoryau.com.
(Hey, passing along press releases sent to me by law firms sure is easy! I wonder why more media outlets don’t do exactly the same thing.)
On Wednesday, I gave a fun talk with that title down the street at Microsoft Research New England.  Disappointingly, no one in the audience seemed to think quantum computing was bunk (or if they did, they didn’t speak up): I was basically preaching to the choir.  My PowerPoint slides are here.  There’s also a streaming video here, but watch it at your own risk—my stuttering and other nerdy mannerisms seemed particularly bad, at least in the short initial segment that I listened to.  I really need media training.  Anyway, thanks very much to Boaz Barak for inviting me.
Cardinals, ordinals, and more. A whole math course compressed into one handwaving lecture, and a piping-hot story that’s only a century old.
Friend-of-the-blog Dorit Aharonov asked me to advertise the QStart Conference, which will be held at Hebrew University of Jerusalem June 24-27 of this year, to celebrate the opening of Hebrew University’s new Quantum Information Science Center.  Speakers include Yakir Aharonov, Jacob Bekenstein, Hans Briegel, Ed Farhi, Patrick Hayden, Ray Laflamme, Elon Lindenstrauss, Alex Lubotzky, John Martinis, Barbara Terhal, Umesh Vazirani, Stephanie Wehner, Andrew Yao … and me, your humble blogger (who will actually be there with Lily, on her first trip abroad—or for that matter, beyond the Boston metropolitan area).  Dorit tells me that the conference should be of interest to mathematicians, physicists, chemists, philosophers, and computer scientists; that registration is open now; and that student travel support is available.  Oh, and if you’re one of the people who think quantum computing is bunk?  As displayed on the poster above, leading QC skeptic Gil Kalai is a co-organizer of the conference.
On Sunday afternoon, Dana, Lily, and I were in Copley Square in Boston for a brunch with friends, at the Mandarin Oriental hotel on Boylston Street.  As I now recall, I was complaining bitterly about a number of things.  First, I’d lost my passport (it’s since been found).  Second, we hadn’t correctly timed Lily’s feedings, making us extremely late for the brunch, and causing Lily to scream hysterically the entire car ride.  Third, parking (and later, locating) our car at the Prudential Center was a logistical nightmare.  Fourth, I’d recently received by email a profoundly silly paper, claiming that one of my results was wrong based on a trivial misunderstanding.  Fifth … well, there were other things that were bothering me, but I don’t remember what they were.
Then the next day, maybe 50 feet from where we’d been, the bombs went off, three innocent human beings lost their lives and many more were rendered permanently disabled.
Drawing appropriate morals is left as an exercise for the reader.
* * *
Update (Friday, 7AM): Maybe the moral is that you shouldn’t philosophize while the suspects are still on the loose. Last night (as you can read anywhere else on the web) an MIT police officer was tragically shot and killed in the line of duty, right outside the Stata Center, by one of the marathon bombers (who turn out to be brothers from Chechnya). After a busy night—which also included ~~robbing a 7-Eleven~~ (visiting a 7-Eleven that was coincidentally also robbed—no novelist could make this stuff up), carjacking a Mercedes two blocks from my apartment, and randomly throwing some more pressure-cooker bombs—one of the brothers was killed; the other one escaped to Watertown. A massive hunt for him is now underway. MIT is completely closed today, as is Harvard and pretty much every other university in the area—and now, it seems, all stores and businesses in the entire Boston area. The streets are mostly deserted except for police vehicles. As for us, we heard the sirens through much of the night, but didn’t know what they were about until this morning. Here’s hoping they catch the second asshole soon.
Another Update (Friday, 9AM): As the sorry details emerge about these Tsarnaev brothers, it occurs to me that there’s another moral we can draw: namely, we can remind ourselves that the Hollywood image of the evil criminal genius is almost entirely a myth. Yes, evil and genius have occasionally been found in the same person (as with a few of the Nazi scientists), but it’s evil and stupidity that are the far more natural allies. Which is the most optimistic statement I can think to make right now about the future of the human race.
Yet More Updates (Friday, 3PM): The whole Boston area is basically a ghost town now, with the streets empty on a beautiful spring day and the sound of helicopters filling the air.  I was just up on my roofdeck to watch, and never saw anything like it.  I can’t help thinking that it sets a terrible precedent to give a couple doofus amateur terrorists the power to shut down an entire metropolitan area.  Meanwhile, Andrew Sullivan points to a spectacularly stupid tweet by one Nate Bell:
> I wonder how many Boston liberals spent the night cowering in their homes wishing they had an AR-15 with a hi-capacity magazine?
This sounds like a gun nut projecting his own disturbed psychology onto other people.  I’m not actually scared, but if I was, owning a gun would do nothing whatsoever to make me less scared (quite the contrary).  What would make me think I could win a gunfight against a frothing lunatic—or that I’d want to find out?  When it comes to violence, the only thing that calms my nerves is a democratic state having a near-monopoly on it.
What else?  It was chilling to watch the Tsarnaev brothers’ aunt, the one in Toronto, babble incoherently on TV about how wonderful her nephews were (a striking contrast to the remorseful uncle in Maryland).  If it emerges that anyone else in this family (including the parents, or the older brother’s wife) had any foreknowledge about the killing spree, then I very much hope they’ll face justice as well.
In other news, Lily had an eventful day too: she finally figured out how to squeeze her toy ball with her hands.
A month ago, I posed the following as the 10th most annoying question in quantum computing:
> Given an n-qubit pure state, is there always a way to apply Hadamard gates to some subset of the qubits, so as to make all 2^n computational basis states have nonzero amplitudes?
Today Ashley Montanaro and Dan Shepherd of the University of Bristol sent me the answer, in a beautiful 4-page writeup that they were kind enough to let me post here. (The answer, as I expected, is yes.)
This is a clear advance in humankind’s scientific knowledge, which is directly traceable to this blog. I am in a good mood today.
The obvious next question is to find an α>0 such that, for any n-qubit pure state, there’s some way to apply Hadamards to a subset of the qubits so as to make all 2^n basis states have |amplitude| at least α. Clearly we can’t do better than α=sin^n(π/8). Montanaro and Shepherd conjecture that this is tight.
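For anyone who wants to play with small cases numerically, here is a brute-force sketch (my own illustration, not Montanaro and Shepherd’s argument; it assumes NumPy): it enumerates all subsets of qubits, applies Hadamards to each subset, and reports the best achievable minimum |amplitude| for a given state. Random states are generically easy (even the empty subset usually works); the interesting inputs are structured states with many zero amplitudes, like the basis state in the example.

```python
# Brute-force check over all Hadamard subsets for a small n-qubit state.
import itertools
import numpy as np

H  = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)

def apply_hadamards(state, subset, n):
    """Apply a Hadamard to each qubit whose index is in `subset`, identity elsewhere."""
    op = np.array([[1.0]])
    for q in range(n):
        op = np.kron(op, H if q in subset else I2)
    return op @ state

def best_min_amplitude(state, n):
    """Largest achievable value of the smallest |amplitude|, over all Hadamard subsets."""
    best = 0.0
    for r in range(n + 1):
        for subset in itertools.combinations(range(n), r):
            best = max(best, np.min(np.abs(apply_hadamards(state, subset, n))))
    return best

n = 3

# Structured example: |000> has 2^n - 1 zero amplitudes, but Hadamarding every qubit
# makes all amplitudes equal to 2^(-n/2).
e0 = np.zeros(2**n, dtype=complex); e0[0] = 1.0
print(best_min_amplitude(e0, n))           # 2^(-3/2) ≈ 0.354 for n = 3

# Random example: some subset always works (here trivially, since a random state
# already has no zero amplitudes).
rng = np.random.default_rng(0)
psi = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
psi /= np.linalg.norm(psi)
print(best_min_amplitude(psi, n) > 0)      # True
print(np.sin(np.pi / 8) ** n)              # upper bound on the worst case; conjectured tight
```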
What’s the motivation? If you have to ask…
One of the surest signs of the shnood is the portentous repetition of the following two slogans:
> Biology will be the physics of the 21st century.
>
> The future of the world is in China and India.
Let me translate for you:
> You know the field of Darwin, Pasteur, and Mendel, the field that fills almost every page of Science and Nature, the field that gave rise to modern medicine and transformed the human condition over the last few centuries? Well, don’t count it out entirely! This plucky newcomer among the sciences is due to make its mark. Another thing you shouldn’t count out is the continent of Asia, which is situated next to Europe. Did you know that China, far more than a source of General Tso’s Chicken, has been one of the centers of human civilization for 4,000 years? And did you know that Gandhi and Ramanujan both hailed from a spunky little country called India? It’s true!
Let me offer my own counterslogans:
> Biology will be the biology of the 21st century.
>
> The future of China and India is in China and India, respectively.
Last month, I blogged about Sen. Tom Coburn (R-Oklahoma) passing an amendment blocking the National Science Foundation from funding most political science research.  I wrote:
> This sort of political interference with the peer-review process, of course, sets a chilling precedent for all academic research, regardless of discipline.  (What’s next, an amendment banning computer science research, unless it has applications to scheduling baseball games or slicing apple pies?)
In the comments section of that post, I was pilloried by critics, who ridiculed my delusional fears about an anti-science witch hunt.  Obviously, they said, Congressional Republicans only wanted to slash dubious social science research: not computer science or the other hard sciences that people reading this blog really care about, and that everyone agrees are worthy.  Well, today I write to inform you that I was right, and my critics were wrong.  For the benefit of readers who might have missed it the first time, let me repeat that:
I was right, and my critics were wrong.
In this case, like in countless others, my “paranoid fears” about what could happen turned out to be preternaturally well-attuned to what would happen.
According to an article in Science, Lamar Smith (R-Texas), the new chair of the ironically-named House Science Committee, held two hearings in which he “floated the idea of having every NSF grant application [in every field] include a statement of how the research, if funded, ‘would directly benefit the American people.’ ”  Connoisseurs of NSF proposals will know that every proposal already includes a “Broader Impacts” section, and that that section often borders on comic farce.  (“We expect further progress on the μ-approximate shortest vector problem to enthrall middle-school students and other members of the local community, especially if they happen to belong to underrepresented groups.”)  Now progress on the μ-approximate shortest vector problem also has to directly—directly—“benefit the American people.”  It’s not enough for such research to benefit science—arguably the least bad, least wasteful enterprise our sorry species has ever managed—and for science, in turn, to be a principal engine of the country’s economic and military strength, something that generally can’t be privatized because of a tragedy-of-the-commons problem, and something that economists say has repaid public investments many, many times over.  No, the benefit now needs to be “direct.”
The truth is, I find myself strangely indifferent to whether Smith gets his way or not.  On the negative side, sure, a pessimist might worry that this could spell the beginning of the end for American science.  But on the positive side, I would have been proven so massively right that, even as I held up my “Will Prove Quantum Complexity Theorems For Food” sign on a street corner or whatever, I’d have something to crow about until the end of my life.
Back in February, I gave a talk with the above title at the Annual MIT Latke-Hamentaschen Debate.  I’m pleased to announce that streaming video of my talk is now available!  (My segment starts about 10 minutes into the video, and lasts for 10 minutes.)  You can also download my PowerPoint slides here.
Out of hundreds of talks I’ve given in my life, on five continents, this is the single talk of which I’m the proudest.
Of course, before you form an opinion about the issue at hand, you should also check out the contributions of my fellow debaters.  On the sadly-mistaken hamentasch side, my favorite presentation was that of mathematician Arthur Mattuck, which starts 56 minutes in and lasts for a full half hour (!! – the allotted time was only 8 minutes).  Mattuck relates the shapes of latkes and hamentaschen to the famous Kakeya problem in measure theory—though strangely, his final conclusions seem to provide no support whatsoever for the hamentaschen, even on Mattuck’s own terms.
Finally, what if you’re a reader for whom the very words “latke” and “hamentaschen” are just as incomprehensible as the title of this blog?  OK, here are some Cliff Notes:
* Latkes are fried potato pancakes, traditionally eaten by Jews on Hannukah.
* Hamentaschen are triangular fruit-filled cookies, traditionally eaten by Jews on Purim.
* Beginning at the University of Chicago in 1946, many universities around the world have held farcical annual “debates” between faculty members (both Jewish and non-Jewish) about which of those two foods is better.  (The reason I say “farcical” is simply that, as I explain in my talk, the truth has always been overwhelmingly on one side.)  The debaters have invoked everything from feminist theory to particle physics to bolster their case.
Thanks very much to Dean of Admissions Stu Schmill for moderating, and to MIT Hillel for organizing the debate.
Update: Luboš has a new blog post announcing that he finally found a chapter in Quantum Computing Since Democritus that he likes!  Woohoo!  Whether coincidentally or not, the chapter he likes makes exactly the same points about quantum mechanics that I also make in my pro-latke presentation.
An anonymous indie-cinema-loving hermit friend from Amsterdam sends me an article in this week’s Economist entitled “Poison Ivy: Not so much palaces of learning as bastions of privilege and hypocrisy” (unfortunately, only available to subscribers). The article is a summary of an excellent Wall Street Journal series by Daniel Golden (again, unfortunately, only available to subscribers), which I’ve been following with great interest. Golden has also put out a book about this topic, called The Price of Admission (“How America’s Ruling Class Buys Its Way into Elite Colleges — and Who Gets Left Outside the Gates”), which I just ordered from Amazon. In the meantime, I’ll simply quote a few passages from the Economist piece:
> Mr Golden shows that elite universities do everything in their power to admit the children of privilege. If they cannot get them in through the front door by relaxing their standards, then they smuggle them in through the back. No less than 60% of the places in elite universities are given to candidates who have some sort of extra “hook”, from rich or alumni parents to “sporting prowess”. The number of whites who benefit from this affirmative action is far greater than the number of blacks…
>
> Most people think of black football and basketball stars when they hear about “sports scholarships”. But there are also sports scholarships for rich white students who play preppie sports such as fencing, squash, sailing, riding, golf and, of course, lacrosse. The University of Virginia even has scholarships for polo-players, relatively few of whom come from the inner cities…
>
> What is one to make of [Senate Majority Leader Bill] Frist, who opposes affirmative action for minorities while practising it for his own son?
>
> Two groups of people overwhelmingly bear the burden of these policies — Asian-Americans and poor whites. Asian-Americans are the “new Jews”, held to higher standards (they need to score at least 50 points higher than non-Asians even to be in the game) and frequently stigmatised for their “characters” (Harvard evaluators persistently rated Asian-Americans below whites on “personal qualities”). When the University of California, Berkeley briefly considered introducing means-based affirmative action, it rejected the idea on the ground that “using poverty yields a lot of poor white kids and poor Asian kids”.
The article ends with the hope that “America’s money-addicted and legacy-loving universities can be shamed into returning to what ought to have been their guiding principle all along: admitting people to university on the basis of their intellectual ability.”
I harped about this issue in one of my very first posts, almost a year ago. I don’t know what else to say. If idealism won’t goad us Americans (yes, I’m still an American) into overhauling our crooked, anti-intellectual admissions system, then maybe it will help to see just how absurd that system looks to the rest of the world.
OK, this will be my last blog post hawking Quantum Computing Since Democritus, at least for a while.  But I do have four pieces of exciting news about the book that I want to share.
1. Amazon is finally listing the print version of QCSD as available for shipment in North America, slightly ahead of schedule!  Amazon’s price is $35.27.
2. Cambridge University Press has very generously offered readers of Shtetl-Optimized a 20% discount off their list price—meaning $31.99 instead of $39.99—if you click this link to order directly from them.  Note that CUP has a shipping charge of $6.50.  So ordering from CUP might either be slightly cheaper or slightly more expensive than ordering from Amazon, depending (for example) on whether you get free shipping from Amazon Prime.
3. So far, there have been maybe 1000 orders and preorders for QCSD (not counting hundreds of Kindle sales).  The book has also spent a month as one of Amazon’s top few “Quantum Physics” sellers, with a fabulous average rating of 4.6 / 5 stars from 9 reviews (or 4.9 if we discount the pseudonymous rant by Joy Christian).  Thanks so much to everyone who ordered a copy; I hope you like it!  Alas, these sales figures also mean that QCSD still has a long way to go before it enters the rarefied echelon of—to pick a few top Amazon science sellers—Cosmos, A Brief History of Time, Proof of Heaven (A Neurosurgeon’s Journey into the Afterlife), Turn On Your SUPER BRAIN, or The Lemon Book (Natural Recipes and Preparations).  So, if you believe that QCSD deserves to be with such timeless classics, then put your money where your mouth is and help make it happen!
4. The most exciting news of all?  Luboš Motl is reading the free copy of QCSD that I sent him and blogging his reactions chapter-by-chapter!  So, if you’d like to learn about how mathematicians and computer scientists simply lack the brainpower to do physics—which is why we obsess over kindergarten trivialities like the Church-Turing Thesis or the Axiom of Choice, and why we insist idiotically that Nature use only the mathematical structures that our inferior minds can grasp—then check out Luboš’s posts about Chapters 1-3 or Chapters 4-6.  If, on the other hand, you want to see our diacritical critic pleasantly surprised by QCSD’s later chapters on cryptography, quantum mechanics, and quantum computing, then here’s the post for you.  Either way, be sure to scroll down to the comments, where I patiently defend the honor of theoretical computer science against Luboš’s hilarious ad hominem onslaughts.
Two years ago, when I attended the FQXi conference on a ship from Norway to Denmark, I (along with many other conference participants) was interviewed by Robert Lawrence Kuhn, who produces a late-night TV program called “Closer to Truth.”  I’m pleased to announce (hat tip: Sean Carroll) that four videos from my interview are finally available online:
* Is the Universe a Computer?
(like a politician, I steer the question toward “what kind of computer is the universe?,” then start talking about P vs. NP, quantum computing, and the holographic principle)
* What Does Quantum Theory Mean?
(here I mostly talk about the idea of computational intractability as a principle of physics)
* Quantum Computing Mysteries
(basics of quantum mechanics and quantum computing)
* Setting Time Aright (about the differences between time and space, the P vs. PSPACE problem, and computing with closed timelike curves)
(No, I didn’t choose the titles!)
For regular readers of this blog, there’s probably nothing new in these videos, but for those who are “just tuning in,” they provide an extremely simple and concise introduction to what I care about and why.  I’m pretty happy with how they came out.
Once you’re finished with me (or maybe even before then…), click here for the full list of interviewees, which includes David Albert, Raphael Bousso, Sean Carroll, David Deutsch, Rebecca Goldstein, Seth Lloyd, Marvin Minsky, Roger Penrose, Lenny Susskind, Steven Weinberg, and many, many others who might be of interest to Shtetl-Optimized readers.
O Achilles of Arkansas, O bane of Foxes and Roves, O solitary warrior among Democrats: dasher of hopes, prince of platitudes, fellatee of Jewesses, belated friend of Tutsis, toothless tiger of climate change, greatest of all living Americans: how shall we summon thee back?
Update (5/6): In “honor” of the news below, Boaz Barak has written a beautiful blog post on the reasons to care about the P vs. NP question, offering his responses to several of the most common misconceptions.  Thank you so much, Boaz — this is one of the best presents I’ve ever gotten from anyone!
* * *
On Friday afternoon—in the middle of a pizza social for my undergrad advisees—I found out that I’ve received tenure at MIT.
Am I happy about the news?  Of course!  Yet even on such a joyous occasion, I found myself reflecting on a weird juxtaposition.  I learned about MIT’s tenure decision at the tail end of a fierce, weeks-long comment war over on Luboš Motl’s blog, in which I assumed the task of defending theoretical computer science and quantum information science as a whole: explaining why these fields could have anything whatsoever to contribute to our understanding of the universe.  Indeed, I took the title of this post from a comment Luboš made to me in the middle of the melee: that compared to string theorists, quantum computing researchers have as much to say about the nature of reality as toll-takers on the Golden Gate Bridge.  (Even though the Golden Gate tolls are apparently all-electronic these days, I still found Luboš’s analogy striking.  I could imagine that staring all day at the breathtaking San Francisco Bay would lead to deep thoughts about the nature of reality.)
Now, some people will ask: why should I even waste my time this way—arguing with Luboš, a blogger infamous for describing the scientists he disagrees with as garbage, worms, fungi, etc., and even calling for their “elimination”?  If I find the limits of computation in the physical universe to be a rich, fascinating, worthwhile subject; if I have hundreds of wonderful colleagues with whom to share the thrill of surprising new discoveries; if a large, growing fraction of the wider scientific community follows this field with interest; if my employer seems to want me doing it for the long haul … then why should I lose sleep just because someone, somewhere, declared that the P vs. NP problem is a random puzzle, of no deeper significance than the question of whether chess is a draw?  Or because he characterized the entire fields of quantum computing and information as trivial footnotes to 1920s physics, fit only for mediocre students who couldn’t do string theory?  Or because, on the “other side,” a persistent minority calls quantum computers an absurd fantasy, and the quest to build them a taxpayer boondoggle bordering on fraud?  Or because some skeptics, going even further, dismiss quantum mechanics itself as nonsensical mumbo-jumbo that physicists made up to conceal their own failure to find a straightforward, mechanical description of Nature?  Likewise, why should it bother me if some anti-complexites dismiss the quest to prove P≠NP as a fashionable-but-irrelevant journey to formalize the obvious—even while others denounce the Soviet-style groupthink that leads the “CS establishment” to reject the possibility that P=NP?  After all, these various naysayers can’t all be right!  Doesn’t it comfort me that, of all the confidently-asserted reasons why everything my colleagues and I study is dead-end, cargo-cult science, so many of the reasons contradict each other?
Sure, but here’s the thing.  In seven years of teaching and blogging, I’ve learned something about my own psychology.  Namely, if I meet anyone—an undergrad, an anonymous blog commenter, anyone—who claims that the P vs. NP problem is beside the point, since it’s perfectly plausible that P=NP but the algorithm takes n^10000 time—or that, while quantum mechanics works fine for small systems, there’s not the slightest reason to expect it to scale up to larger ones—or that the limits of computation are plainly no more relevant to fundamental physics than the fact that cucumbers are green—trying to reason with that person will always, till the end of my life, feel like the most pressing task in the world to me.
Why?  Because, I confess, a large part of me worries: what if this other person is right?  What if I really do have to jettison everything I thought I knew about physics, computation, and pretty much everything else since I was a teenager, toss all my results into the garbage can (or at least the “amusing recreations can”), and start over from kindergarten?  But then, as I fret about that possibility, counterarguments well up in my mind.  Like someone pinching himself to make sure he’s awake, I remember all the reasons why I was led to think what I think in the first place.  And I want the other person to go through that experience with me—the experience, if you like, of feeling the foundations of the universe smashed to pieces and then rebuilt, the infinite hierarchy of complexity classes collapsing and then springing back into place, decades’ worth of books set ablaze and then rewritten on blank pages.  I want to say: at least come stand here with me—in this place that I spent twenty years of late nights, false starts, and discarded preconceptions getting to—and tell me if you still don’t see what I see.
That’s how I am; I doubt I can change it any more than I can change my blood type.  So I feel profoundly grateful to have been born into a world where I can make a comfortable living just by being this strange, thin-skinned creature that I am—a world where there are countless others who do see what I see, indeed see it a thousand times more clearly in many cases, but who still appreciate what little I can do to explore this corner or that, or to describe the view to others.  I’d say I’m grateful to “fate,” but really I’m grateful to my friends and family, my students and teachers, my colleagues at MIT and around the world, and the readers of Shtetl-Optimized—yes, even John Sidles.  “Fate” either doesn’t exist or doesn’t need my gratitude if it does.
Gödel, Turing, and Friends. Another whole course compressed into one handwaving lecture. (This will be a recurring theme.)
So said my brother David (MIT math major), on forwarding me this animation of the inner life of a cell.
Update (5/7): Enough!  Thanks, everyone, for asking so many imaginative questions, and please accept my apologies if yours remains unaddressed.  (It’s nothing personal: they simply came fast and furious, way faster than I could handle in an online fashion—so I gave up on chronological order and simply wrote answers in whatever order they popped into my head.)  At this point, I’m no longer accepting any new questions.  I’ll try to answer all the remaining questions by tomorrow night.
* * *
By popular request, for the next 36 hours—so, from now until ~11PM on Tuesday—I’ll have a long-overdue edition of “Ask Me Anything.”  (For the previous editions, see here, here, here, and here.)  Today’s edition is partly to celebrate my new, tenured “freedom to do whatever the hell I want” (as well as the publication after 7 years of Quantum Computing Since Democritus), but is mostly just to have an excuse to get out of changing diapers (“I’d love to, honey, but the world is demanding answers!”).  Here are the ground rules:
1. One question per person, total.
2. Please check to see whether your question was already asked in one of the previous editions—if it was, then I’ll probably just refer you there.
3. No questions with complicated backstories, or that require me to watch a video, read a paper, etc. and comment on it.
4. No questions about D-Wave.  (As it happens, Matthias Troyer will be giving a talk at MIT this Wednesday about his group’s experiments on the D-Wave machine, and I’m planning a blog post about it—so just hold your horses for a few more days!)
5. If your question is offensive, patronizing, nosy, or annoying, I reserve the right to give a flippant non-answer or even delete the question.
6. Keep in mind that, in past editions, the best questions have almost always been the most goofball ones (“What’s up with those painting elephants?”).
That’s it: ask away!
* * *
Update (5/12): I’ve finally answered all ~90 questions, a mere 4 days after the official end of the “Ask Me Anything” session!  Thanks so much to everyone for all the great questions.  For your reading convenience, here’s a guide to my answers (personal favorites are in bold):
* The probability that we live in the Matrix (see followups here, here, here, here)
* Glauber dynamics
* My behavior as Waterloo lunch organizer
* The saddest thing
* Quantum cellular automata
* P!=NP vs. P!=PSPACE
* My knowledge of general relativity
* Advantages of Dirac ket notation
* The evolution of my career goals
* Open problems related to BosonSampling
* Book-signing for Quantum Computing Since Democritus
* In an infinite universe, must all possible earthlike planets exist?
* Was 9/11 an inside job?
* The fine-structure constant and quantum computing
* Accessible open problems in complexity theory
* Tightening Razborov’s monotone lower bound for CLIQUE
* In what sense is the quadratic Grover speedup “provable”?
* Fisher information
* “Associate Professor Without Tenure”
* Is the whole universe “just” a vector in Hilbert space?
* How to initialize a qubit
* My knowledge of my tenure case
* How I’d build a quantum computer in 20-30 years
* Could God solve the halting problem?
* “Who’s yer daddy?”
* How long I’d want to live
* Could the difficulty of building a QC grow exponentially with number of qubits?
* Why does quantum computing require physically different hardware?
* The double-slit experiment and “lazy evaluation”
* Bioengineered flying horses vs. flying robot horses: which will be first?
* The last program I wrote
* How much I sleep
* Recent TCS advances with practical applications in the near future
* What I’d ask Terry Tao
* How many digits will the largest known prime have in 10 or 100 years?
* Whether I believe in free will
* The nature of time
* My progress in learning Hebrew
* Social science breakthroughs that could bring about world peace
* Superquadratic advantage of the quantum adiabatic algorithm over classical search?
* Is a classical world also a quantum world?
* The name of the blog
* John Sidles’ prognostiquestion
* Books and films for Lily to grow up with
* Does QM generate “true” randomness?
* Fictitious proofs of P!=NP
* The secret of happiness
* What I did in college
* The blowup in reducing theorem-proving to 3SAT
* Whether CUP objected to the free QCSD lecture notes
* The top 5 not-yet-written books that I’d most like to read
* Does the continuum “exist” in physical reality? (see followup here)
* Could Nature itself be inconsistent?
* Zen koan about a mouse eating cat food
* “Maybe, it’s the equality sign?”
* Classical computer is to QC as QC is to what?
* Why are CS theorists obsessed with polynomial time?
* My favorite complexity theorist
* A bad approach to factoring large integers
* Am I a Bayesian?
* How to build an intelligent machine
* Will automated theorem provers become as standard as Mathematica/Maple?
* My initiation into theoretical computer science
* How to get an 8-year-old excited about programming
* “Am I insane?”
* Levin universal search
* Brain emulation by 2023?  A $10,000 bet
* How being in “communist Berkeley” in my formative years shaped my worldview (see followup here)
* Israel vs. Apartheid South Africa
* Will useful QC precede its public announcement, or vice versa?
* My work habits
* US immigration policy
* My favorite Israeli foods
* If I guess randomly, how likely am I to get this question right?
* Busy Beaver numbers: is BB(n+1) provably much larger than BB(n)? (see followups here and here)
* Computational complexity and biological/social evolution
* P vs. NP vs. Shannon capacity of cycles problem
* Video games based on my research interests
* Bayesian reasoning when there are copies of yourself
* Pr[ PH=PSPACE | PH collapses ]
* My favorite interpretation of QM
* What I’d do if I proved P=NP
* QM and consciousness
* QM and free will
* Cultures of Clarkson, Cornell, Berkeley, IAS, Waterloo, MIT
* How I decide what’s ethical
* American vs. Chilean universities
 
Bigger, longer, wackier. The topic: “Minds and Machines.”
Behold the PCP Theorem, one of the crowning achievements of complexity theory:
> Given a 3SAT formula φ, it’s NP-hard to decide whether (1) φ is satisfiable or (2) at most a 1-ε fraction of the clauses are satisfiable, promised that one of these is the case. Here ε is a constant independent of n.
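To make the “gap” in that statement concrete, here’s a tiny brute-force sketch (a toy illustration of the quantity involved, not part of any actual PCP machinery): it computes the largest fraction of clauses that any assignment satisfies—exactly the quantity the theorem says is NP-hard even to approximate within ε.

```python
from itertools import product

# A clause is a tuple of three literals: +i means "variable i is true",
# -i means "variable i is false" (variables numbered from 1).
clauses = [(1, 2, -3), (-1, 2, 3), (1, -2, 3), (-1, -2, -3)]
n_vars = 3

def satisfied_fraction(assignment, clauses):
    """Fraction of clauses satisfied by a {variable: bool} assignment."""
    sat = sum(any((lit > 0) == assignment[abs(lit)] for lit in clause) for clause in clauses)
    return sat / len(clauses)

# Brute force over all 2^n assignments (fine for a toy instance, hopeless in general).
best = max(
    satisfied_fraction(dict(zip(range(1, n_vars + 1), bits)), clauses)
    for bits in product([False, True], repeat=n_vars)
)
print("best achievable fraction of satisfied clauses:", best)
```

For this particular four-clause toy instance the answer is 1 (it’s satisfiable); the theorem is about how hard it is to distinguish that case from “every assignment misses an ε fraction of the clauses.”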
In recent weeks, I’ve become increasingly convinced that a Quantum PCP Theorem like the following will one day be a crowning achievement of quantum complexity theory:
> Given a set of local measurements on an n-qubit register, it’s QMA-hard to decide whether (1) there exists a state such that all of the measurements accept with probability 1, or (2) for every state, at most a 1-ε fraction of the measurements accept with probability more than 1-δ, promised that one of these is the case. Here a “local” measurement is one that acts on at most (say) 3 qubits, and ε and δ are constants independent of n.
I’m 99% sure that this theorem (alright, conjecture) or something close to it is true. I’m 95% sure that the proof will require a difficult adaptation of classical PCP machinery (whether Iritean or pre-Iritean), in much the same way that the Quantum Fault-Tolerance Theorem required a difficult adaptation of classical fault-tolerance machinery. I’m 85% sure that the proof is achievable in a year or so, should enough people make it a priority. I’m 75% sure that the proof, once achieved, will open up heretofore undreamt-of vistas of understanding and insight. I’m 0.01% sure that I can prove it. And that is why I hereby bequeath the actual proving part to you, my readers.
Notes:
1. By analogy to the classical case, one expects that a full-blown Quantum PCP Theorem would be preceded by weaker results (“quantum assignment testers”, quantum PCP’s with weaker parameters, etc). So these are obviously the place to start.
2. Why hasn’t anyone tackled this question yet? Well, one reason is that it’s hard. But a second reason is that people keep getting hung up on exactly how to formulate the question. To forestall further nitpicking, I hereby declare it obvious that a “Quantum PCP Theorem” means nothing more or less than a robust version of Kitaev’s QMA-completeness theorem, in exactly the same sense that the classical PCP Theorem was a robust version of the Cook-Levin Theorem. Any formulation that captures this spirit is fine; mine was only one possibility.
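For concreteness about the quantum statement too: via the usual local-Hamiltonian translation behind Kitaev’s theorem, “every state fails a constant fraction of the measurements” just says that the smallest eigenvalue of the averaged rejection operator is bounded away from zero. Here’s a brute-force sketch on a made-up four-qubit toy instance of my own (which happens to be diagonal, i.e. classical—a genuinely QMA-hard instance would of course involve non-commuting checks):

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
P0 = np.array([[1., 0.], [0., 0.]])   # projector onto |0>
P1 = np.array([[0., 0.], [0., 1.]])   # projector onto |1>

def two_local(op_a, op_b, i, j, n):
    """Embed single-qubit operators op_a on qubit i and op_b on qubit j into n qubits."""
    factors = [I2] * n
    factors[i], factors[j] = op_a, op_b
    return reduce(np.kron, factors)

n = 4
# Accepting projectors: qubits (0,1), (1,2), (2,3) should agree, while (3,0) should
# disagree.  The last check frustrates the first three, so no state passes everything.
agree    = lambda i, j: two_local(P0, P0, i, j, n) + two_local(P1, P1, i, j, n)
disagree = lambda i, j: two_local(P0, P1, i, j, n) + two_local(P1, P0, i, j, n)
accept = [agree(0, 1), agree(1, 2), agree(2, 3), disagree(3, 0)]

# Average "rejection" operator: its smallest eigenvalue is the best achievable
# average rejection probability over all 4-qubit states.
H = sum(np.eye(2**n) - P for P in accept) / len(accept)
print("minimum average rejection probability:", np.linalg.eigvalsh(H).min())
# -> 0.25: even the best state fails one of the four checks.
```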
This year’s Bank of Sweden Prize in Economic Sciences in Memory of Alfred Nobel has been awarded to two game theorists: Robert Aumann of Hebrew University, and Thomas Schelling of the University of Maryland.
In 1976, Aumann wrote a famous paper called “Agreeing to Disagree,” which proved the following fact. Suppose you and your friend are perfectly rational Bayesians, who’d both form the same opinions if given the same information. Then provided your opinions are “common knowledge” (meaning you both know them, you both know you both know them, etc.), those opinions must be equal — even if neither of you knows the evidence on which the other’s opinion is based! Loosely speaking, then, you can never “agree to disagree.”
As an example, suppose Alice offers to sell you some stock. Then the mere fact that she’s trying to sell it gives you useful information — namely, that something must’ve convinced her the stock is headed south. So even if you have no idea what that something is, her offer should cause you to decrease your own valuation of the stock. Similarly, if you agree to buy the stock, that should cause Alice to increase her valuation. As observed by Milgrom and Stokey, the end result of all this second-guessing is that you might as well never trade at all! This is assuming three conditions: (i) that you and Alice would believe the same things if given the same information, (ii) that you’re both trying to maximize expected wealth, and (iii) that you both have the same liquidity needs (i.e. neither desperately needs to pay off a mortgage). If you’re still confused, read this delightful survey by Cowen and Hanson.
A year ago I proved a complexity-theoretic analogue of Aumann’s theorem: that not only will two Bayesians agree “in the limit” of common knowledge, but they’ll also (probably, approximately) agree after a really short conversation. I sent my paper to Aumann just for fun, not expecting any response from the great man. To my surprise, Aumann promptly wrote back with a thoughtful critique — telling me to cut out my philosophical musings and let the math speak for itself. I hated this advice at the time, but eventually came to the grudging realization that it was right.
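If you’d like to see the agreement phenomenon in action, here’s a toy Monte Carlo sketch (my own choice of prior and of a one-round exchange, purely for illustration—not the communication protocol from the paper): two Bayesians with a common prior each see private coin flips, and the second updates on the first’s announced estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Common prior: a coin has bias 0.3 or 0.7, equally likely.  Alice and Bob each
# privately see 20 flips of the same coin and want to estimate P(bias = 0.7).
N_WORLDS, N_FLIPS = 1_000_000, 20
theta = rng.integers(0, 2, N_WORLDS)          # 1 means "the bias is 0.7"
bias = np.where(theta == 1, 0.7, 0.3)
heads_a = rng.binomial(N_FLIPS, bias)         # Alice's private head count in each world
heads_b = rng.binomial(N_FLIPS, bias)         # Bob's private head count in each world

real = 0                                      # world 0 plays the role of "the" real world

def expectation(mask):
    return theta[mask].mean()                 # E[theta | worlds consistent with mask]

# Alice's posterior given only her own flips:
alice_est = expectation(heads_a == heads_a[real])

# Bob hears Alice's (rounded) announcement and conditions on it plus his own flips:
announced = np.round([expectation(heads_a == h) for h in range(N_FLIPS + 1)], 2)
same_announcement = np.isin(heads_a, np.flatnonzero(np.isclose(announced, round(alice_est, 2))))
bob_est = expectation((heads_b == heads_b[real]) & same_announcement)

print("truth:", theta[real], " Alice:", round(alice_est, 3), " Bob after hearing Alice:", round(bob_est, 3))
# Further rounds (Alice conditioning on Bob's announcement, and so on) would keep
# pulling the two estimates together, which is the phenomenon at issue.
```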
Interestingly, besides being a world expert on rationality, Aumann is also an Orthodox Jew, who’s written several papers applying game theory to the Talmud. He was born in Germany in 1930 and escaped to the US in 1938.
Congratulations to Aumann and Schelling!
Wrap-Up (June 5): This will be my final update on this post (really!!), since the discussion seems to have reached a point where not much progress is being made, and since I’d like to oblige the commenters who’ve asked me to change the subject.  Let me try to summarize the main point I’ve been trying to get across this whole time.  I’ll call the point (*).
(*) D-Wave founder Geordie Rose claims that D-Wave has now accomplished its goal of building a quantum computer that, in his words, is “better at something than any other option available.”  This claim has been widely and uncritically repeated in the press, so that much of the nerd world now accepts it as fact.  However, the claim is not supported by the evidence currently available.  It appears that, while the D-Wave machine does outperform certain off-the-shelf solvers, simulated annealing codes have been written that outperform the D-Wave machine on its own native problem when run on a standard laptop.  More research is needed to clarify the issue, but in the meantime, it seems worth knowing that this is where things currently stand.
In the comments, many people tried repeatedly to change the subject from (*) to various subsidiary questions.  For example: isn’t it possible that D-Wave’s current device will be found to provide a speedup on some other distribution of instances, besides the one that was tested?  Even if not, isn’t it possible that D-Wave will achieve a genuine speedup with some future generation of machines?  Did it make business sense for Google to buy a D-Wave machine?  What were Google’s likely reasons?  What’s D-Wave’s current value as a company?  Should Cathy McGeoch have acted differently, in the type of comparison she agreed to do, or in how she communicated about its results?  Should I have acted differently, in my interaction with McGeoch?
And, I’m afraid to say, I jumped in to the discussion of all of those questions—because, let’s face it, there are very few subjects about which I don’t have an opinion, or at least a list of qualified observations to make.  In retrospect, I now think that was a mistake.  It would have been better to sidestep all the other questions—not one of which I really know the answer to, and each of which admits multiple valid perspectives—and just focus relentlessly on the truth of assertion (*).
Here’s an analogy: imagine that a biotech startup claimed that, by using an expensive and controversial new gene therapy, it could cure patients at a higher rate than with the best available conventional drugs—basing its claim on a single clinical trial.  Imagine that this claim was widely repeated in the press as an established fact.  Now imagine that closer examination of the clinical trial revealed that it showed nothing of the kind: it compared against the wrong drugs.  And imagine that a more relevant clinical trial—mostly unmentioned in the press—had also been done, and discovered that when you compare to the right drugs, the drugs do better.  Imagine that someone wrote a blog post bringing all of this to public attention.
And now imagine that the response to that blogger was the following: “aha, but isn’t it possible that some future clinical trial will show an advantage for the gene therapy—maybe with some other group of patients?  Even if not, isn’t it possible that the startup will manage to develop an effective gene therapy sometime in the future?  Betcha didn’t consider that, did you?  And anyway, at least they’re out there trying to make gene therapy work!  So we should all support them, rather than relentlessly criticizing.  And as for the startup’s misleading claims to the public?  Oh, don’t be so naïve: that’s just PR.  If you can’t tune out the PR and concentrate on the science, that’s your own damn problem.  In summary, the real issue isn’t what some clinical trial did or didn’t show; it’s you and your hostile attitude.”
In a different context, these sorts of responses would be considered strange, and the need to resort to them revealing.  But the rules for D-Wave are different.
(Interestingly, in excusing D-Wave’s statements, some commenters explicitly defended standards of intellectual discourse so relaxed that, as far as I could tell, just about anything anyone could possibly say would be OK with them—except of course for what I say on this blog, which is not OK!  It reminds me of the central tenet of cultural relativism: that there exist no universal standards by which any culture could ever be judged “good” or “bad,” except that Western culture is irredeemably evil.)
Update (June 4): Matthias Troyer (who, unfortunately, still can’t comment here for embargo reasons) has asked me to clarify that it’s not he, but rather his postdoc Sergei Isakov, who deserves the credit for actually writing the simulated annealing code that outperformed the D-Wave machine on the latter’s own “home turf” (i.e., random QUBO instances with the D-Wave constraint graph).  The quantum Monte Carlo code, which also did quite well at simulating the D-Wave machine, was written by Isakov together with another of Matthias’s postdocs, Troels Rønnow.
Update (June 3): See Cathy McGeoch’s response (here and here), and my response to her response.
Yet More Updates (June 2): Alex Selby has a detailed new post summarizing his comparisons between the D-Wave device (as reported by McGeoch and Wang) and his own solver—finding that his solver can handily outperform the device and speculating about the reasons why.
In other news, Catherine McGeoch spoke on Friday in the MIT quantum group meeting.  Incredibly, she spoke for more than an hour, without once mentioning the USC results that found that simulated annealing on a standard laptop (when competently implemented) handily outperformed the D-Wave machine, or making any attempt to reconcile those results with hers and Wang’s.  Instead, McGeoch used the time to enlighten the assembled experts about what quantum annealing was, what an exact solver was, etc. etc., then repeated the speedup claims as if the more informative comparisons simply didn’t exist.  I left without asking questions, not wanting to be the one to instigate an unpleasant confrontation, and—I’ll admit—questioning my own sanity as a result of no one else asking about the gigantic elephant in the room.
More Updates (May 21): Happy 25th birthday to me!  Among the many interesting comments below, see especially this one by Alex Selby, who says he’s written his own specialist solver for one class of the McGeoch and Wang benchmarks that significantly outperforms the software (and D-Wave machine) tested by McGeoch and Wang on those benchmarks—and who provides the Python code so you can try it yourself.
Also, Igor Vernik asked me to announce that on July 8th, D-Wave will be giving a technical presentation at the International Superconducting Electronics Conference in Cambridge.  See here for more info; I’ll be traveling then and won’t be able to make it.  I don’t know whether the performance comparisons to Matthias Troyer’s and Alex Selby’s code will be among the topics discussed, or if there will be an opportunity to ask questions about such things.
In another exciting update, John Smolin and Graeme Smith posted a paper to the arXiv tonight questioning even the “signature of quantumness” part of the latest D-Wave claims—the part that I’d been ~98% willing to accept, even as I relayed evidence that cast enormous doubt on the “speedup” part. Specifically, Smolin and Smith propose a classical model that they say can explain the “bimodal” pattern of success probabilities observed by the USC group as well as quantum annealing can. I haven’t yet had time to read their paper or form an opinion about it, but I’d be very interested if others wanted to weigh in.   Update (May 26): The USC group has put out a new preprint responding to Smolin and Smith, offering additional evidence for quantum behavior in the D-Wave device that they say can’t be explained using Smolin and Smith’s model.
Update (May 17): Daniel Lidar emailed me to clarify his views about error-correction and the viability of D-Wave’s approach.  He invited me to share his clarification with others—something that I’m delighted to do, since I agree with him wholeheartedly.  Without further ado, here’s what Lidar says:
I don’t believe D-Wave’s approach is scalable without error correction.  I believe that the incorporation of error correction is a necessary condition in order to ever achieve a speedup with D-Wave’s machines, and I don’t believe D-Wave’s machines are any different from other types of quantum information processing in this regard.  I have repeatedly made this point to D-Wave over several years, and I hope that in the future their designs will allow more flexibility in the incorporation of error correction.
Lidar also clarified that he not only doesn’t dispute what Matthias Troyer told me about the lack of speedup of the D-Wave device compared to classical simulated annealing in their experiments, but “fully agrees, endorses, and approves” of it—and indeed, that he himself was part of the team that did the comparison.
In other news, this Hacker News thread, which features clear, comprehending discussions of this blog post and the backstory that led up to it, has helped to restore my faith in humanity.
* * *
Two years ago almost to the day, I announced my retirement as Chief D-Wave Skeptic.  But—as many readers predicted at the time—recent events (and the contents of my inbox!) have given me no choice except to resume my post.  In an all-too-familiar pattern, multiple rounds of D-Wave-related hype have made it all over the world before the truth has had time to put its pants on and drop its daughter off in daycare.  And the current hype is particularly a shame, because once one slices through all the layers of ugh—the rigged comparisons, the “dramatic announcements” that mean nothing, the lazy journalists cherry-picking what they want to hear and ignoring the inconvenient bits—there really has been a huge scientific advance this past month in characterizing the D-Wave devices.  I’m speaking about the experiments on the D-Wave One installed at USC, the main results of which finally appeared in April.  Two of the coauthors of this new work—Matthias Troyer and Daniel Lidar—were at MIT recently to speak about their results, Troyer last week and Lidar this Tuesday.  Intriguingly, despite being coauthors on the same paper, Troyer and Lidar have very different interpretations of what their results mean, but we’ll get to that later.  For now, let me summarize what I think their work has established.
Evidence for Quantum Annealing Behavior
For the first time, we have evidence that the D-Wave One is doing what should be described as “quantum annealing” rather than “classical annealing” on more than 100 qubits.  (Note that D-Wave itself now speaks about “quantum annealing” rather than “quantum adiabatic optimization.”  The difference between the two is that the adiabatic algorithm runs coherently, at zero temperature, while quantum annealing is a “messier” version in which the qubits are strongly coupled to their environment throughout, but still maintain some quantum coherence.)  The evidence for quantum annealing behavior is still extremely indirect, but despite my “Chief Skeptic” role, I’m ready to accept what the evidence indicates with essentially no hesitation.
So what is the evidence?  Basically, the USC group ran the D-Wave One on a large number of randomly generated instances of what I’ll call the “D-Wave problem”: namely, the problem of finding the lowest-energy configuration of an Ising spin glass, with nearest-neighbor interactions that correspond to the D-Wave chip’s particular topology.  Of course, restricting attention to this “D-Wave problem” tilts the tables heavily in D-Wave’s favor, but no matter: scientifically, it makes a lot more sense than trying to encode Sudoku puzzles or something like that.  Anyway, the group then looked at the distribution of success probabilities when each instance was repeatedly fed to the D-Wave machine.  For example, would the randomly-generated instances fall into one giant clump, with a few outlying instances that were especially easy or especially hard for the machine?  Surprisingly, they found that the answer was no: the pattern was strongly bimodal, with most instances either extremely easy or extremely hard, and few instances in between.  Next, the group fed the same instances to Quantum Monte Carlo: a standard classical algorithm that uses Wick rotation to find the ground states of “stoquastic Hamiltonians,” the particular type of quantum evolution that the D-Wave machine is claimed to implement.  When they did that, they found exactly the same bimodal pattern that they found with the D-Wave machine.  Finally they fed the instances to a classical simulated annealing program—but there they found a “unimodal” distribution, not a bimodal one.  So, their conclusion is that whatever the D-Wave machine is doing, it’s more similar to Quantum Monte Carlo than it is to classical simulated annealing.
Curiously, we don’t yet have any hint of a theoretical explanation for why Quantum Monte Carlo should give rise to a bimodal distribution, while classical simulated annealing should give rise to a unimodal one.  The USC group simply observed the pattern empirically (as far as I know, they’re the first to do so), then took advantage of it to characterize the D-Wave machine.  I regard explaining this pattern as an outstanding open problem raised by their work.
In any case, if we accept that the D-Wave One is doing “quantum annealing,” then despite the absence of a Bell-inequality violation or other direct evidence, it’s reasonably safe to infer that there should be large-scale entanglement in the device.  I.e., the true quantum state is no doubt extremely mixed, but there’s no particular reason to believe we could decompose that state into a mixture of product states.  For years, I tirelessly repeated that D-Wave hadn’t even provided evidence that its qubits were entangled—and that, while you can have entanglement with no quantum speedup, you can’t possibly have a quantum speedup without at least the capacity to generate entanglement.  Now, I’d say, D-Wave finally has cleared the evidence-for-entanglement bar—and, while they’re not the first to do so with superconducting qubits, they’re certainly the first to do so with so many superconducting qubits.  So I congratulate D-Wave on this accomplishment.  If this had been advertised from the start as a scientific research project—“of course we’re a long way from QC being practical; no one would ever claim otherwise; but as a first step, we’ve shown experimentally that we can entangle 100 superconducting qubits with controllable couplings”—my reaction would’ve been, “cool!”  (Similar to my reaction to any number of other steps toward scalable QC being reported by research groups all over the world.)
No Speedup Compared to Classical Simulated Annealing
But of course, D-Wave’s claims—and the claims being made on its behalf by the Hype-Industrial Complex—are far more aggressive than that.  And so we come to the part of this post that has not been pre-approved by the International D-Wave Hype Repeaters Association.  Namely, the same USC paper that reported the quantum annealing behavior of the D-Wave One, also showed no speed advantage whatsoever for quantum annealing over classical simulated annealing.  In more detail, Matthias Troyer’s group spent a few months carefully studying the D-Wave problem—after which, they were able to write optimized simulated annealing code that solves the D-Wave problem on a normal, off-the-shelf classical computer, about 15 times faster than the D-Wave machine itself solves the D-Wave problem!  Of course, if you wanted even more classical speedup than that, then you could simply add more processors to your classical computer, for only a tiny fraction of the ~$10 million that a D-Wave One would set you back.
Some people might claim it’s “unfair” to optimize the classical simulated annealing code to take advantage of the quirks of the D-Wave problem.  But think about it this way: D-Wave has spent ~$100 million, and hundreds of person-years, optimizing the hell out of a special-purpose annealing device, with the sole aim of solving this one problem that D-Wave itself defined.  So if we’re serious about comparing the results to a classical computer, isn’t it reasonable to have one professor and a few postdocs spend a few months optimizing the classical code as well?
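For readers who have never seen one, here’s roughly what the skeleton of a simulated annealer for an Ising instance looks like—a deliberately naive toy sketch on a random sparse graph (rather than the D-Wave chip’s actual topology), bearing no resemblance to the heavily optimized code discussed above:

```python
import math, random

random.seed(0)

# A random Ising instance: n spins s_i in {-1,+1}, couplings J_ij on a sparse
# random graph.  Objective: minimize E(s) = sum over edges of J_ij * s_i * s_j.
n = 64
couplings = {(i, j): random.choice([-1.0, 1.0])
             for i in range(n) for j in range(i + 1, n) if random.random() < 0.1}

neighbors = {i: [] for i in range(n)}
for (i, j), J in couplings.items():
    neighbors[i].append((j, J))
    neighbors[j].append((i, J))

def energy(s):
    return sum(J * s[i] * s[j] for (i, j), J in couplings.items())

def anneal(sweeps=2000, t_hot=3.0, t_cold=0.05):
    s = [random.choice([-1, 1]) for _ in range(n)]
    for sweep in range(sweeps):
        T = t_hot * (t_cold / t_hot) ** (sweep / (sweeps - 1))       # geometric cooling schedule
        for k in range(n):
            dE = -2 * s[k] * sum(J * s[j] for j, J in neighbors[k])  # energy cost of flipping spin k
            if dE <= 0 or random.random() < math.exp(-dE / T):       # Metropolis acceptance rule
                s[k] = -s[k]
    return s

solution = anneal()
print("energy found by this toy annealer:", energy(solution))
```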
As I said, besides simulated annealing, the USC group also compared the D-Wave One’s performance against a classical implementation of Quantum Monte Carlo.  And maybe not surprisingly, the D-Wave machine was faster than a “direct classical simulation of itself” (I can’t remember how many times faster, and couldn’t find that information in the paper).  But even here, there’s a delicious irony.  The only reason the USC group was able to compare the D-Wave One against QMC at all, is that QMC is efficiently implementable on a classical computer!  (Albeit probably with a large constant overhead compared to running the D-Wave annealer itself—hence the superior performance of classical simulated annealing over QMC.)  This means that, if the D-Wave machine can be understood as reaching essentially the same results as QMC (technically, “QMC with no sign problem”), then there’s no real hope for using the D-Wave machine to get an asymptotic speedup over a classical computer.  The race between the D-Wave machine and classical simulations of the machine would then necessarily be a cat-and-mouse game, a battle of constant factors with no clear asymptotic victor.  (Some people might conjecture that it will also be a “Tom & Jerry game,” the kind where the classical mouse always gets the better of the quantum cat.)
At this point, it’s important to give a hearing to three possible counterarguments to what I’ve written above.
The first counterargument is that, if you plot both the runtime of simulated annealing and the runtime of the D-Wave machine as functions of the instance size n, you find that, while simulated annealing is faster in absolute terms, it can look like the curve for the D-Wave machine is less steep.  Over on the blog “nextbigfuture”, an apparent trend of this kind has been fearlessly extrapolated to predict that with 512 qubits, the D-Wave machine will be 10 billion times faster than a classical computer.  But there’s a tiny fly in the ointment.  As Troyer carefully explained to me last week, the “slow growth rate” of the D-Wave machine’s runtime is, ironically, basically an artifact of the machine being run too slowly on small values of n.  Run the D-Wave machine as fast as it can run for small n, and the difference in the slopes disappears, with only the constant-factor advantage for simulated annealing remaining.  In short, there seems to be no evidence, at present, that the D-Wave machine is going to overtake simulated annealing for any instance size.
The second counterargument is that the correlation between the two “bimodal distributions”—that for the D-Wave machine and that for the Quantum Monte Carlo simulation—is not perfect.  In other words, there are a few instances (not many) that QMC solves faster than the D-Wave machine, and likewise a few instances that the D-Wave machine solves faster than QMC.  Not surprisingly, the latter fact has been eagerly seized on by the D-Wave boosters (“hey, sometimes the machine does better!”).  But Troyer has a simple and hilarious response to that.  Namely, he found that his group’s QMC code did a better job of correlating with the D-Wave machine, than the D-Wave machine did of correlating with itself!  In other words, calibration errors seem entirely sufficient to explain the variation in performance, with no need to posit any special class of instances (however small) on which the D-Wave machine dramatically outperforms QMC.
The third counterargument is just the banal one: the USC experiment was only one experiment with one set of instances (albeit, a set one might have thought would be heavily biased toward D-Wave).  There’s no proof that, in the future, it won’t be discovered that the D-Wave machine does something more than QMC, and that there’s some (perhaps specially-designed) set of instances on which the D-Wave machine asymptotically outperforms both QMC and Troyer’s simulated annealing code.  (Indeed, I gather that folks at D-Wave are now assiduously looking for such instances.)  Well, I concede that almost anything is possible in the future—but “these experiments, while not supporting D-Wave’s claims about the usefulness of its devices, also don’t conclusively disprove those claims” is a very different message than what’s currently making it into the press.
Comparison to CPLEX is Rigged
Unfortunately, the USC paper is not the one that’s gotten the most press attention—perhaps because half of it inconveniently told the hypesters something they didn’t want to hear (“no speedup”).  Instead, journalists have preferred a paper released this week by Catherine McGeoch and Cong Wang, which reports that quantum annealing running on the D-Wave machine outperformed the CPLEX optimization package running on a classical computer by a factor of ~3600, on Ising spin problems involving 439 bits.  Wow!  That sounds awesome!  But before rushing to press, let’s pause to ask ourselves: how can we reconcile this with the USC group’s result of no speedup?
The answer turns out to be painfully simple.  CPLEX is a general-purpose, off-the-shelf exact optimization package.  Of course an exact solver can’t compete against quantum annealing—or for that matter, against classical annealing or other classical heuristics!  Noticing this problem, McGeoch and Wang do also compare the D-Wave machine against tabu search, a classical heuristic algorithm.  When they do so, they find that an advantage for the D-Wave machine persists, but it becomes much, much smaller (they didn’t report the exact time comparison).  Amusingly, they write in their “Conclusions and Future Work” section:
> It would of course be interesting to see if highly tuned implementations of, say, tabu search or simulated annealing could compete with Blackbox or even QA [i.e., the D-Wave machines] on QUBO [quadratic binary optimization] problems; some preliminary work on this question is underway.
As I said above, at the time McGeoch and Wang’s paper was released to the media (though maybe not at the time it was written?), the “highly tuned implementation” of simulated annealing that they ask for had already been written and tested, and the result was that it outperformed the D-Wave machine on all instance sizes tested.  In other words, their comparison to CPLEX had already been superseded by a much more informative comparison—one that gave the “opposite” result—before it ever became public.  For obvious reasons, most press reports have simply ignored this fact.
Troyer, Lidar, and Stone Soup
Much of what I’ve written in this post, I learned by talking to Matthias Troyer—the man who carefully experimented with the D-Wave machine and figured out how to beat it using simulated annealing, and who I regard as probably the world’s #1 expert right now on what exactly the machine does.  Troyer wasn’t shy about sharing his opinions, and while couched with qualifications, they tended toward extremely skeptical.  For example, Troyer conjectured that, if D-Wave ultimately succeeds in getting a speedup over classical computers in a fair comparison, then it will probably be by improving coherence and calibration, incorporating error-correction, and doing other things that “traditional,” “academic” quantum computing researchers had said all along would need to be done.
As I said, Daniel Lidar is another coauthor on the USC paper, and also recently visited MIT to speak.  Lidar and Troyer agree on the basic facts—yet Lidar noticeably differed from Troyer, in trying to give each fact the most “pro-D-Wave spin” it could possibly support.  Lidar spoke at our quantum group meeting, not about the D-Wave vs. simulated annealing performance comparison (which he agrees with), but about a proposal of his for incorporating quantum error-correction into the D-Wave device, together with some experimental results.  He presented his proposal, not as a reductio ad absurdum of D-Wave’s entire philosophy, but rather as a positive opportunity to get a quantum speedup using D-Wave’s approach.
So, to summarize my current assessment of the situation: yes, absolutely, D-Wave might someday succeed—ironically, by adapting the very ideas from “the gate model” that its entire business plan has been based on avoiding, and that D-Wave founder Geordie Rose has loudly denigrated for D-Wave’s entire history!  If that’s what happens, then I predict that science writers, and blogs like “nextbigfuture,” will announce from megaphones that D-Wave has been vindicated at last, while its narrow-minded, theorem-obsessed, ivory-tower academic naysayers now have egg all over their faces.  No one will care that the path to success—through quantum error-correction and so on—actually proved the academic critics right, and that D-Wave’s “vindication” was precisely like that of the deliciousness of stone soup in the old folktale.  As for myself, I’ll probably bang my head on my desk until I sustain so much brain damage that I no longer care either.  But at least I’ll still have tenure, and the world will have quantum computers.
The Messiah’s Quantum Annealer
Over the past few days, I’ve explained the above to at least six different journalists who asked.  And I’ve repeatedly gotten a striking response: “What you say makes sense—but then why are all these prestigious people and companies investing in D-Wave?  Why did Bo Ewald, a prominent Silicon Valley insider, recently join D-Wave as president of its US operations?  Why the deal with Lockheed Martin?  Why the huge deal with NASA and Google, just announced today?  What’s your reaction to all this news?”
My reaction, I confess, is simple.  I don’t care—I actually told them this—if the former Pope Benedict has ended his retirement to become D-Wave’s new marketing director.  I don’t care if the Messiah has come to Earth on a flaming chariot, not to usher in an age of peace but simply to spend $10 million on D-Wave’s new Vesuvius chip.  And if you imagine that I’ll ever care about such things, then you obviously don’t know much about me.  I’ll tell you what: if peer pressure is where it’s at, then come to me with the news that Umesh Vazirani, or Greg Kuperberg, or Matthias Troyer is now convinced, based on the latest evidence, that D-Wave’s chip asymptotically outperforms simulated annealing in a fair comparison, and does so because of quantum effects.  Any one such scientist’s considered opinion would mean more to me than 500,000 business deals.
The Argument from Consequences
Let me end this post with an argument that several of my friends in physics have explicitly made to me—not in the exact words below but in similar ones.
“Look, Scott, let the investors, government bureaucrats, and gullible laypeople believe whatever they want—and let D-Wave keep telling them whatever’s necessary to stay in business.  It’s unsportsmanlike and uncollegial of you to hold D-Wave’s scientists accountable for whatever wild claims their company’s PR department might make.  After all, we’re in this game too!  Our universities put out all sorts of overhyped press releases, but we don’t complain because we know that it’s done for our benefit.  Besides, you’d doubtless be trumpeting the same misleading claims, if you were in D-Wave’s shoes and needed the cash infusions to survive.  Anyway, who really cares whether there’s a quantum speedup yet or no quantum speedup?  At least D-Wave is out there trying to build a scalable quantum computer, and getting millions of dollars from Jeff Bezos, Lockheed, Google, the CIA, etc. etc. to do so—resources more of which would be directed our way if we showed a more cooperative attitude!  If we care about scalable QCs ever getting built, then the wise course is to celebrate what D-Wave has done—they just demonstrated quantum annealing on 100 qubits, for crying out loud!  So let’s all be grownups here, focus on the science, and ignore the marketing buzz as so much meaningless noise—just like a tennis player might ignore his opponent’s trash-talking (‘your mother is a whore,’ etc.) and focus on the game.”
I get this argument: really, I do.  I even concede that there’s something to be said for it.  But let me now offer a contrary argument for the reader’s consideration.
Suppose that, unlike in the “stone soup” scenario I outlined above, it eventually becomes clear that quantum annealing can be made to work on thousands of qubits, but that it’s a dead end as far as getting a quantum speedup is concerned.  Suppose the evidence piles up that simulated annealing on a conventional computer will continue to beat quantum annealing, if even the slightest effort is put into optimizing the classical annealing code.  If that happens, then I predict that the very same people now hyping D-Wave will turn around and—without the slightest acknowledgment of error on their part—declare that the entire field of quantum computing has now been unmasked as a mirage, a scam, and a chimera.  The same pointy-haired bosses who now flock toward quantum computing, will flock away from it just as quickly and as uncomprehendingly.  Academic QC programs will be decimated, despite the slow but genuine progress that they’d been making the entire time in a “parallel universe” from D-Wave.  People’s contempt for academia is such that, while a D-Wave success would be trumpeted as its alone, a D-Wave failure would be blamed on the entire QC community.
When it comes down to it, that’s the reason why I care about this matter enough to have served as “Chief D-Wave Skeptic” from 2007 to 2011, and enough to resume my post today.  As I’ve said many times, I really, genuinely hope that D-Wave succeeds at building a QC that achieves an unambiguous speedup!  I even hope the academic QC community will contribute to D-Wave’s success, by doing careful independent studies like the USC group did, and by coming up with proposals like Lidar’s for how D-Wave could move forward.  On the other hand, in the strange, unlikely event that D-Wave doesn’t succeed, I’d like people to know that many of us in the QC community were doing what academics are supposed to do, which is to be skeptical and not leave obvious questions unasked.  I’d like them to know that some of us simply tried to understand and describe what we saw in front of us—changing our opinions repeatedly as new evidence came in, but disregarding “meta-arguments” like my physicist friends’ above.  The reason I can joke about how easy it is to bribe me is that it’s actually kind of hard.
This week Shtetl-Optimized celebrates its one-year anniversary!
That being the case, in the remainder of this post I thought it would be a good idea to take stock of everything this blog has achieved over the past year, and also to set concrete goals for the coming year.
I arrived yesterday in Innsbruck, Austria — a lovely medieval town set in a valley in the Tyrolean Alps. Here the Pontiff and I are sharing an office at the Institut für Quantenoptik und Quanteninformation, and will have to work out a comedy routine to be performed Friday morning, when we’re supposed to open the QIPC meeting at Ike Newton’s old stomping grounds, the Royal Society in London.
Since I’m too jetlagged to write a coherent entry, I hope you’ll be satisfied with some lists:
The three secrets of air travel (distilled from a decade of experience flying to four continents, and offered free of charge to you, my readers):
1. Bring a book. Don’t even try to work on the plane; just read read read read read. If you get stuck in the airport for hours, all the more time to read!
2. If you must work, do it with pen and paper, not a laptop.
3. Put your laptop case in the overhead bin, not under your seat. This will give you more room to stretch your legs.
The only three German words you’ll ever need to know:
1. Danke (thank you). To be said after any interaction with anyone.
2. Ein (one). As in: I will have one of those (pointing).
3. Entscheidungsproblem (decision problem). The problem of deciding whether a first-order sentence is true in every interpretation, proven to be undecidable by Church and Turing.
The two things I saw yesterday that I wish I’d taken a photo of but didn’t:
1. A jewelry store display case, proudly displaying “SCHMUCK” brand designer watches. (Important Correction: Ignorant schmuck that I am, I hadn’t realized that “schmuck” is not a brand name, but just the German word for jewelry. Apparently the meaning in Yiddish migrated from “jewels” to “family jewels” to “person being compared to the family jewels,” which is a bit ironic. “Oh my turtledove, the apple of my eye, my priceless schmuck…”)
2. A campaign poster for one of Austria’s far-right politicians, which graffiti artists had decorated with a Hitler mustache, a forehead swastika, and salutations of “Heil!” (Just what point were the graffiti artists trying to make? I wish I understood.)
Whether there exist subexponential-size locally decodable codes, and sub-n^ε-communication private information retrieval (PIR) protocols, have been major open problems for a decade. A new preprint by Sergey Yekhanin reveals that both of these questions hinge on — wait for this — whether or not there are infinitely many Mersenne primes. By using the fact (discovered a month ago) that 2^32,582,657 - 1 is prime, Yekhanin can already give a 3-server PIR protocol with communication complexity O(n^(1/32,582,658)), improving the previous bound of O(n^(1/5.25)). Duuuuuude. If you’ve ever wondered what it is that motivates complexity theorists, roll this one up and smoke it.
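For anyone who wants to play along at home: Mersenne primes are certified with the Lucas-Lehmer test, which takes only a few lines of code (verifying the 32,582,657-bit monster itself would, of course, keep your laptop busy for rather a while). A quick sketch:

```python
def is_mersenne_prime(p):
    """Lucas-Lehmer test: 2^p - 1 is prime iff s_{p-2} == 0 (mod 2^p - 1),
    where s_0 = 4 and s_{k+1} = s_k^2 - 2."""
    if p == 2:
        return True                # 2^2 - 1 = 3 is prime
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Sanity check: recover the exponents of the Mersenne primes up to 2^127 - 1.
print([p for p in range(2, 131) if is_mersenne_prime(p)])
# -> [2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127]
```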
I’ve been traveling this past week (in Israel and the French Riviera), heavily distracted by real life from my blogging career.  But by popular request, let me now provide a link to my very first post-tenure publication: The Ghost in the Quantum Turing Machine.
Here’s the abstract:
In honor of Alan Turing’s hundredth birthday, I unwisely set out some thoughts about one of Turing’s obsessions throughout his life, the question of physics and free will. I focus relatively narrowly on a notion that I call “Knightian freedom”: a certain kind of in-principle physical unpredictability that goes beyond probabilistic unpredictability. Other, more metaphysical aspects of free will I regard as possibly outside the scope of science. I examine a viewpoint, suggested independently by Carl Hoefer, Cristi Stoica, and even Turing himself, that tries to find scope for “freedom” in the universe’s boundary conditions rather than in the dynamical laws. Taking this viewpoint seriously leads to many interesting conceptual problems. I investigate how far one can go toward solving those problems, and along the way, encounter (among other things) the No-Cloning Theorem, the measurement problem, decoherence, chaos, the arrow of time, the holographic principle, Newcomb’s paradox, Boltzmann brains, algorithmic information theory, and the Common Prior Assumption. I also compare the viewpoint explored here to the more radical speculations of Roger Penrose. The result of all this is an unusual perspective on time, quantum mechanics, and causation, of which I myself remain skeptical, but which has several appealing features. Among other things, it suggests interesting empirical questions in neuroscience, physics, and cosmology; and takes a millennia-old philosophical debate into some underexplored territory.
See here (and also here) for interesting discussions over on Less Wrong.  I welcome further discussion in the comments section of this post, and ~~will jump in myself after a few days to address questions~~ (update: eh, already have).  There are three reasons for the self-imposed delay: first, general busyness.  Second, inspired by the McGeoch affair, I’m trying out a new experiment, in which I strive not to be on such an emotional hair-trigger about the comments people leave on my blog.  And third, based on past experience, I anticipate comments like the following:
“Hey Scott, I didn’t have time to read this 85-page essay that you labored over for two years.  So, can you please just summarize your argument in the space of a blog comment?  Also, based on the other comments here, I have an objection that I’m sure never occurred to you.  Oh, wait, just now scanning the table of contents…”
So, I decided to leave some time for people to RTFM (Read The Free-Will Manuscript) before I entered the fray.
For now, just one remark: some people might wonder whether this essay marks a new “research direction” for me.  While it’s difficult to predict the future (even probabilistically 🙂 ), I can say that my own motivations were exactly the opposite: I wanted to set out my thoughts about various mammoth philosophical issues once and for all, so that then I could get back to complexity, quantum computing, and just general complaining about the state of the world.
So, it seems I’ve been written up in the Kitchener-Waterloo Record, a newspaper whose prestige and journalistic excellence make the Wall Street Journal look like the Shop-Rite coupon book. The article, by Meghan Waters, is about “nerd culture” in Waterloo, and I am the prototypical nerd who Waters found to interview.
A few corrections:
* While I said some very nice things about Mike Lazaridis, I did not compare him to God. (Sorry, Mike!)
* I did not use the phrase “create some nerd capital.” Indeed, if you find a phrase that sounds like I wouldn’t have used it, I probably didn’t use it.
* I did not confidently declare that in the future, “nerdlings will dream about” the University of Waterloo as they now do MIT and Caltech (“just give it some time”). I speculated that something like this might happen, particularly if the US were to continue its descent into medieval theocracy.
Despite these and other minor errors, I’m glad that my plan to increase the number of women in science by “nerdifying the world” has now received the wide public airing it deserves.
A reader calling him- or herself “A Merry Clown” left a comment on my previous post which was so wise, I decided it had to be promoted to a post of its own.
Scientific discourse is the art of juggling decorum, truth and humor. A high-wire feat, attempted under imposing shadows cast by giants and above the distraction of merry dancing clowns.
The “appropriate” tone for scientific discourse seems to be:
(a) Cordial. Always credit others for their hard work and good intentions (allow or at least pretend that others are basically well-intentioned, except in rare situations where there is proof of egregious misconduct).
(b) Biting, merciless and hard-nosed on the substantive issues. The truth deserves no less.
Perhaps the harsher (b) is, the gentler and more thorough (a) should be. After all, human beings are what they are.
Certainly, provided one adequately treads through the niceties in (a), there’s no reason to worry about hurting anyone’s feelings in (b). Anyone who makes scientific claims in a professional or public arena should be prepared to put on their big boy pants or their big girl pants and have their claims face the brutal gauntlet of scientific scrutiny. All attempts should be made to avoid even the appearance that any part of (b) contains personal barbs or insults (unless these barbs happen to be hilarious).
Outside of science the rule is: whoever flings the horseshit the hardest wins.
Essentially, what Shtetl-Optimized readers got to see this past week was me falling off the high wire (with tenure the safety net below?).  I failed at a purely human level—though admittedly, while attempting a particularly difficult tightrope walk, and while heavily distracted by the taunts of both giants and clowns.  I’ve already apologized to Cathy McGeoch for insulting her, but I reiterate my apology now, and I extend the apology to any colleagues at MIT who might have been offended by anything I said.  I’ll strive, in future posts, to live up to a higher standard of cordiality, composure, and self-control.
At the scientific level—i.e., at level (b)—I stand by everything I wrote in the previous post and the comments therein.
The New York Times on Yau. Thanks to Hoeteck Wee.
Streaming video is now available for the talks at the QStart conference, a couple weeks ago at Hebrew University in Jerusalem.  If you’re the sort of person who likes watching quantum information talks, then check out the excellent ones by Ray Laflamme, John Martinis, Umesh Vazirani, Thomas Vidick, Jacob Bekenstein, and many others.
My own contribution—the first “backwards-facing, crusty, retrospective” talk I’ve ever given—was called The Collision Lower Bound After 12 Years (click here for the slides—and to answer the inevitable question, no, I have no idea how to open PowerPoint files in your favorite free-range, organic computing platform).  Briefly, the collision lower bound is the theorem that even a quantum computer needs at least ~n^(1/3) steps to find a duplicate in a long list of random numbers between 1 and n, even assuming the list is long enough that there are many, many duplicates to be found.  (Moreover, ~n^(1/3) steps are known to suffice, by the BHT algorithm, a clever adaptation of Grover’s search algorithm.  Also, for simplicity a “step” means a single access to the list, though of course a quantum algorithm can access multiple list elements in superposition and it still counts as one step.)
By comparison, for classical algorithms, ~√n steps are necessary and sufficient to find a collision, by the famous Birthday Paradox.  So, just like for Grover’s search problem, a quantum computer could give you a modest speedup over classical for the collision problem, but only a modest one.  The reason this is interesting is that, because of the abundance of collisions to be found, the collision problem has a great deal more structure than Grover’s search problem (though it has less structure than Shor’s period-finding problem, where there famously is an exponential quantum speedup).
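To see the classical √n scaling concretely, here’s a quick Python toy (my own illustration for this post, not anything from the talk or the paper): it builds a random 2-to-1 list and counts how many accesses happen before some value shows up twice.

```python
import random

def classical_collision_queries(n, trials=200):
    """Average number of list accesses before a duplicate value is seen,
    for a random 2-to-1 list of length n (the k=2 case).  Toy illustration."""
    total = 0
    for _ in range(trials):
        lst = list(range(n // 2)) * 2   # each value appears exactly twice
        random.shuffle(lst)             # shuffled, so reading left to right = random probes
        seen = set()
        queries = 0
        for x in lst:
            queries += 1
            if x in seen:               # collision found
                break
            seen.add(x)
        total += queries
    return total / trials

for n in (10**3, 10**4, 10**5):
    avg = classical_collision_queries(n)
    print(f"n = {n:>6}: ~{avg:7.1f} accesses on average; sqrt(n) = {n**0.5:7.1f}")
```

The averages hug √n; the quantum ~n^(1/3) behavior is not, of course, something you can see by classical simulation.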
One “obvious” motivation for the collision problem is that it models the problem of breaking collision-resistant hash functions (like SHA-256) in cryptography.  In particular, if there were a superfast (e.g., log(n)-time) quantum algorithm for the collision problem, then there could be no CRHFs secure against quantum attack.  So the fact that there’s no such algorithm at least opens up the possibility of quantum-secure CRHFs.  However, there are many other motivations.  For example, the collision lower bound rules out the most “simpleminded” approach to a polynomial-time quantum algorithm for the Graph Isomorphism problem (though, I hasten to add, it says nothing about more sophisticated approaches).  The collision problem is also closely related to Statistical Zero Knowledge (SZK) proof protocols, so that the collision lower bound leads to an oracle relative to which SZK is not in BQP.
Probably the most bizarre motivation to other people, but for some reason the most important one to me back in 2001, is that the collision problem is closely related to the problem of sampling the entire trajectories of hidden variables, in hidden-variable theories such as Bohmian mechanics.  The collision lower bound provides strong evidence that this trajectory-sampling problem is hard even for a quantum computer—intuitively because a QC can’t keep track of the correlations between the hidden-variable positions at different times.  The way I like to put it is that if, at the moment of your death, your entire life history flashed before you in an instant (and if a suitable hidden-variable theory were true, and if you’d performed an appropriate quantum interference experiment on your own brain during your life), then you really could solve the collision problem in only O(1) steps.  Interestingly, you still might not be able to solve NP-complete problems—I don’t know!  But you could at least do something that we think is hard for a quantum computer.
I proved the first collision lower bound in 2001 (actually, a week or so after the 9/11 attacks), after four months of sleepless nights and failed attempts.  (Well actually, I only got the weaker lower bound of ~n^(1/5); the ~n^(1/3) was a subsequent improvement due to Yaoyun Shi.  Before ~n^(1/5), no one could even rule out that a quantum computer could solve the collision problem with a constant number of steps (!!), independent of n—say, 4 steps.)  It was the first thing I’d proved of any significance, and probably the most important thing I did while in grad school.  I knew it was one of the favorite problems of my adviser, Umesh Vazirani, so I didn’t even tell Umesh I was working on it until I’d already spent the whole summer on it.  I figured he’d think I was nuts.
* * *
Bonus Proof Explanation!
The technique that ultimately worked was the polynomial method, which was introduced to quantum computing four years prior in a seminal paper of Beals et al.  In this technique, you first suppose by contradiction that a quantum algorithm exists to solve your problem that makes very few accesses to the input bits—say, T.  Then you write out the quantum algorithm’s acceptance probability (e.g., the probability that the algorithm outputs “yes, I found what I was looking for”) as a multivariate polynomial p in the input bits.  It’s not hard to prove that p has degree at most 2T, since the amplitudes in the quantum algorithm can be written as degree-T polynomials (each input access increases the degree by at most 1, and unitary transformations in between input accesses don’t increase the degree at all); then squaring the amplitudes to get probabilities doubles the degree.  (This is the only part of the method that uses anything specific to quantum mechanics!)
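In symbols (just my restatement of the Beals et al. step, nothing new):

```latex
% After T queries, each amplitude alpha_z(X) of the final state is a polynomial
% of degree at most T in the input variables X = (x_1, ..., x_M): each query
% raises the degree by at most 1, and the unitaries in between raise it not at all.
% So the acceptance probability
\[
  p(X) \;=\; \sum_{z\,\in\,\mathrm{accepting\ outcomes}} \bigl|\alpha_z(X)\bigr|^{2}
\]
% is a real polynomial in X of degree at most 2T.
```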
Next, you choose some parameter k related to the problem of interest, and you let q(k) be the expectation of p(X) over all inputs X with the parameter equal to k.  For example, with the collision problem, it turns out that the “right” choice to make is to set k=1 if each number appears exactly once in your input list, k=2 if each number appears exactly twice, k=3 if each number appears exactly three times, and so on.  Then—here comes the “magic” part—you show that q(k) itself is a univariate polynomial in k, again of degree at most 2T.  This magical step is called “symmetrization”; it can be traced at least as far back as the famous 1969 book Perceptrons by Marvin Minsky and Seymour Papert.  In the case of the collision problem, I still have no explanation, 12 years later, for why symmetrization works: all I can say is that you do the calculation, and you cancel lots of things from both the numerator and the denominator, and what comes out at the end is a low-degree polynomial in k.  (It’s precisely because I would never have predicted such a “zany coincidence,” that I had to stumble around in the dark for 4 months before I finally discovered by chance that the polynomial method worked.)
Anyway, after applying symmetrization, you’re left with a low-degree univariate polynomial q with some very interesting properties: for example, you need 0≤q(k)≤1 for positive integers k, since then q(k) represents an averaged probability that your quantum algorithm does something.  You also need q(1) to be close to 0, since if k=1 then there are no collisions to be found, and you need q(2) to be close to 1, since if k=2 then there are lots of collisions and you’d like your algorithm to find one.  But now, you can appeal to a theorem of A. A. Markov from the 1890s, which implies that no low-degree polynomial exists with those properties!  Hence your original efficient quantum algorithm can’t have existed either: indeed, you get a quantitative lower bound (a tight one, if you’re careful) on the number of input accesses your algorithm must have made.  And that, modulo some nasty technicalities (e.g., what if k doesn’t evenly divide the size of your list?), is how the collision lower bound works.
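And for the record, here’s roughly what the appeal to Markov looks like, glossing over those same nasty technicalities (my paraphrase, so don’t quote it as the precise statement):

```latex
% A. A. Markov's inequality: if q is a real polynomial of degree d with
% |q(x)| <= c throughout [1, N], then  max_{x in [1,N]} |q'(x)| <= 2 d^2 c / (N-1).
% In the collision setting, q(1) ~ 0 and q(2) ~ 1, so the mean value theorem
% puts a point in [1,2] where q' is at least about 1; one also checks that q
% can't grow too wildly between the integer points where it lies in [0,1].
\[
  \frac{2\,d^{2}\,\max_{x\in[1,N]}|q(x)|}{N-1} \;\ge\; \max_{x\in[1,N]}|q'(x)|
  \;\ge\; q(2)-q(1) \;\approx\; 1 .
\]
% Balancing these facts forces the degree d = 2T to be polynomially large in N,
% which (after the parameter-juggling) is where bounds like n^(1/5) and n^(1/3) come from.
```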
* * *
So, in the first half of my QStart talk, I explain the collision lower bound and its original motivations (and a little about the proof, but no more than what I said above).  Then in the second half, I survey lots of extensions and applications between 2002 and the present, as well as the many remaining open problems.  For example, I discuss the tight lower bound of Ambainis et al. for the “index erasure” problem, Belovs’s proof of the element distinctness lower bound using the adversary method, and my and Ambainis’s generalization of the collision lower bound to arbitrary symmetric problems.  I also talk about Mark Zhandry’s recent breakthrough (sorry, am I not allowed to use that word?) showing that the GGM construction of pseudorandom functions is secure against quantum adversaries, and how Zhandry’s result can be seen—in retrospect, anyway—as yet another application of the collision lower bound.
Probably of the most general interest, I discuss how Daniel Harlow and Patrick Hayden invoked the collision lower bound in their striking recent paper on the AMPS black hole “firewall” paradox.  In particular they argued that, in order to uncover the apparent violation of local quantum field theory at the heart of the paradox, an observer falling into a black hole would probably need to solve a QSZK-complete computational problem.  And of course, the collision lower bound furnishes our main piece of evidence that QSZK-complete problems really should require exponential time even for quantum computers.  So, Harlow and Hayden argue, the black hole would already have evaporated before the observer had even made a dent in the requisite computation.
Now, the Harlow-Hayden paper, and the AMPS paradox more generally, really deserve posts of their own—just as soon as I learn enough to decide what I think about them.  For now, I’ll simply say that, regardless of how convinced you are by Harlow and Hayden’s argument (and, a bit like with my free-will essay, it’s not clear how convinced the authors themselves are!), it’s one of the most ambitious syntheses of computational complexity and physics I’ve ever seen.  You can disagree with it, but to read the paper (or watch the talk, streaming video from Strings’2013 here) is to experience the thrill of seeing black hole physics related to complexity theory by authors who really know both.
(In my own talk on the collision lower bound, the short segment about Harlow-Hayden generated more questions and discussion than the rest of the talk combined—with me being challenged to defend their argument, even with Patrick Hayden right there in the audience!  I remarked later that that portion of the talk was itself a black hole for audience interest.)
In totally unrelated news, Quantum Computing Since Democritus made Scientific American’s list of best summer books!  I can’t think of a more appropriate honor, since if there’s any phrase that captures what QCSD is all about, “sizzling summer beach read” would be it.  Apparently there will even be an online poll soon, where y’all can go and vote for QCSD as your favorite.  Vote early and often, and from multiple IP addresses!
Of course, the greatest scientific flame war of all time was the calculus priority dispute between Isaac Newton and Gottfried Wilhelm Leibniz. This one had everything: intrigue, pettiness, hypocrisy, nationalism, and even hints of the physicist vs. computer scientist split that continues to this day.
In our opening talk at QIPC’2006 in London, Dave Bacon and I decided to relive the hatred — with Dave in a frilly white wig playing the part of Newton, and your humble blogger in a frilly black wig playing the part of Leibniz. We forgot to take photos, but here’s the script, and here are the slides for the … err, “serious” talk that Dave and I gave after dewigging.
Update (thanks to Dave and Viv Kendon):
… Lance Fortnow and Bill Gasarch perform a Talmudic exegesis of one of your blog posts, taking more time to do so than you took to write the post. Listen to Bill and Lance dissect my Ten Reasons to Believe P!=NP, and then offer their own reasons that are every bit as flaky as mine are. (Indeed, Lance’s reason turns out to be almost identical to my reason #9, which he had previously rejected.)
I’m honored, of course, but I’m also offended by Bill and Lance’s speculation that not all of my Reasons to Believe were meant completely seriously. Needless to say, everything I write on this blog carries the Official Scott Aaronson Seal of Really Meaning It. Including the last sentence. And the last one. And the last one. And the last one.
This past week I was in Redmond for the Microsoft Faculty Summit, which this year included a special session on quantum computing.  (Bill Gates was also there, I assume as our warmup act.)  I should explain that Microsoft Research now has not one but two quantum computing research groups: there’s Station Q in Santa Barbara, directed by Michael Freedman, which pursues topological quantum computing, but there’s also QuArC in Redmond, directed by Krysta Svore, which studies things like quantum circuit synthesis.
Anyway, I’ve got two videos for your viewing pleasure:
* An interview about quantum computing with me, Krysta Svore, and Matthias Troyer, moderated by Chris Cashman, and filmed in a studio where they put makeup on your face.  Just covers the basics.
* A session about quantum computing, with three speakers: me about “what quantum mechanics is good for” (quantum algorithms, money, crypto, and certified random numbers), then Charlie Marcus about physical implementations of quantum computing, and finally Matthias Troyer about his group’s experiments on the D-Wave machines.  (You can also download my slides here.)
This visit really drove home for me that MSR is the closest thing that exists today to the old Bell Labs: a corporate lab that does a huge amount of openly-published, high-quality fundamental research in math and CS, possibly more than all the big Silicon-Valley-based companies combined.  This research might or might not be good for Microsoft’s bottom line (Microsoft, of course, says that it is, and I’d like to believe them), but it’s definitely good for the world.  With the news of Microsoft’s reorganization in the background, I found myself hoping that MS will remain viable for a long time to come, if only because its decline would leave a pretty gaping hole in computer science research.
Unfortunately, last week I also bought a new laptop, and had the experience of PowerPoint 2013 first refusing to install (it mistakenly thought it was already installed), then crashing twice and losing my data, and just generally making everything (even saving a file) harder than it used to be for no apparent reason.  Yes, that’s correct: the preparations for my talk at the Microsoft Faculty Summit were repeatedly placed in jeopardy by the “new and improved” Microsoft Office.  So not just for its own sake, but for the sake of computer science as a whole, I implore Microsoft to build a better Office.  It shouldn’t be hard: it would suffice to re-release the 2003 or 2007 versions as “Office 2014”!  If Mr. Gates took a 2-minute break from curing malaria to call his former subordinates and tell them to do that, I’d really consider him a great humanitarian.
1. As many of you probably know, this week my EECS colleague Hal Abelson released his 180-page report on MIT’s involvement in the Aaron Swartz case.  I read the whole thing, and I recommend it if you have any interest in the case.  My take is that, far from being the “whitewash” that some people described it as, the report (if you delve into it) clearly and eloquently explains how MIT failed to live up to its own standards, even as it formally followed the rules.  The central insight here is that the world expects MIT to behave, not like some other organization would behave if someone hid a laptop in its supply closet to download the whole JSTOR database, insulted and then tried to flee from its security officers when questioned, etc. etc., but rather with perspective and imagination—worrying less about the security of its facilities than about the future of the world.  People expect MIT, of all places, to realize that the sorts of people who pull these sorts of shenanigans in their twenties sometimes become Steve Jobs or Richard Feynman (or for that matter, MIT professor Robert Morris) later in their lives, and therefore to speak up in their defense.  In retrospect, I wish Swartz’s arrest had sparked a debate about the wider issues among MIT’s students, faculty, and staff.  I think it’s likely that such a debate would have led to pressure on the administration to issue a statement in Swartz’s support.  As it was (and as I pointed out in this interview), most people at MIT, even if they’d read about the arrest, weren’t even aware of the issue’s continued existence, let alone of MIT’s continued role in it, until after Swartz had already committed suicide.  For the MIT community—which includes some prominent supporters of open access—to have played such a passive role is one of the many tragedies that’s obvious with hindsight.
2. Shafi Goldwasser has asked me to announce that the fifth Innovations in Theoretical Computer Science (ITCS) conference will be held in Princeton, a town technically in New Jersey, on January 12-14, 2014.  Here’s the conference website; if you want to submit a paper, the deadline is coming up soon, on Thursday, August 22.
3. As the summer winds to a close, I’m proud to announce my main goals for the upcoming academic year.  Those goals are the following:
(a) Take care of Lily.
(b) Finish writing up old papers.
It feels liberating to have no higher aspirations for an entire year—and for the aspirations I have to seem so modest and so achievable.  On the other hand, it will be all the more embarrassing if I fail to achieve even these goals.
From my inbox:
> We simple folk out in the cold wastes of the internet are dying the slow and horrible death of intellectual starvation. Only you can save us, by posting the next installment of your lecture notes before we shuffle off this mortal coil. Will you help us, or will you say “Let them read slashdot”? Ok, seriously, I know you’re busy. Just wanted to make sure you knew people are enjoying the lecture notes.
And from my comments section:
> You know you’ve made it, and then lost it, when you no longer publish notes on your course 🙂
Alright, alright, alright, alright, alright. Now that I’ve returned from my two-week world concert tour (which took me to Innsbruck, London, Yale, and U. of Toronto), and now that my girlfriend and I have settled into a lovely new apartment (complete with silverware, shower curtains, and a giant poster of complexity class inclusions above the fireplace), I finally have some time to resume your regularly-scheduled programming.
So won’t you join me, as I attempt to excavate the strange forgotten world of paleocomplexity, and relive an age when STOC and FOCS were held in caves and Diagonalosaurs ruled the earth?
Update (Sept. 3): When I said that “about 5000 steps” are needed for the evolutionary approach to color an 8×8 chessboard, I was counting as a step any examination of two random adjacent squares—regardless of whether or not you end up having to change one of the colors.  If you count only the changes, then the expected number goes down to about 1000 (which, of course, only makes the point about the power of the evolutionary approach “stronger”).  Thanks very much to Raymond Cuenen for bringing this clarification to my attention.
* * *
Last week I appeared on an episode of Through the Wormhole with Morgan Freeman, a show on the Science Channel.  (See also here for a post on Morgan Freeman’s Facebook page.)  The episode is called “Did God Create Evolution?”  The first person interviewed is the Intelligent Design advocate Michael Behe.  But not to worry!  After him, they have a parade of scientists who not only agree that Chuck Darwin basically had it right in 1859, but want to argue for that conclusion using ROBOTS!  and MATH!
So, uh, that’s where I come in.  My segment features me (or rather my animated doppelgänger, “SuperScott”) trying to color a chessboard two colors, so that no two neighboring squares are colored the same, using three different approaches: (1) an “intelligent design” approach (which computer scientists would call nondeterminism), (2) a brute-force, exhaustive enumeration approach, and (3) an “evolutionary local search” approach.
[Spoiler alert: SuperScott discovers that the local search approach, while not as efficient as intelligent design, is nevertheless much more efficient than brute-force search.  And thus, he concludes, the arguments of the ID folks to the effect of “I can’t see a cleverer way to do it, therefore it must be either brute-force search or else miraculous nondeterminism” are invalid.]
Since my appearance together with Morgan Freeman on cable TV raises a large number of questions, I’ve decided to field a few of them in the following FAQ.
Q: How can I watch?
Amazon Instant Video has the episode here for $1.99.  (No doubt you can also find it on various filesharing sites, but let it be known that I’d never condone such nefarious activity.)  My segment is roughly from 10:40 until 17:40.
Q: Given that you’re not a biologist, and that your research has basically nothing to do with evolution, why did they ask to interview you?
Apparently they wanted a mathematician or computer scientist who also had some experience spouting about Big Ideas.  So they first asked Greg Chaitin, but Chaitin couldn’t do it and suggested me instead.
Q: Given how little relevant expertise you have, why did you agree to be interviewed?
To be honest, I was extremely conflicted.  I kept saying, “Why don’t you interview a biologist?  Or at least a computational biologist, or someone who studies genetic algorithms?”  They replied that they did have more bio-oriented people on the show, but they also wanted me to provide a “mathematical” perspective.  So, I consulted with friends like Sean Carroll, who’s appeared on Through the Wormhole numerous times.  And after reflection, I decided that I do have a way to explain a central conceptual point about algorithms, complexity, and the amount of time needed for natural selection—a point that, while hardly “novel,” is something that many laypeople might not have seen before and that might interest them.  Also, as an additional argument in favor of appearing, MORGAN FREEMAN!
So I agreed to do it, but only under two conditions:
(1) At least one person with a biology background would also appear on the show, to refute the arguments of intelligent design.
(2) I would talk only about stuff that I actually understood, like the ability of local search algorithms to avoid the need for brute-force search.
I’ll let you judge for yourself to what extent these conditions were fulfilled.
Q: Did you get to meet Morgan Freeman?
Alas, no.  But at least I got to hear him refer repeatedly to “SuperScott” on TV.
Q: What was the shooting like?
Extremely interesting.  I know more now about TV production than I did before!
It was a continuing negotiation: they kept wanting to say that I was “on a quest to mathematically prove evolution” (or something like that), and I kept telling them they weren’t allowed to say that, or anything else that would give the misleading impression that what I was saying was either original or directly related to my research.  I also had a long discussion about the P vs. NP problem, which got cut for lack of time (now P and NP are only shown on the whiteboard).  On the other hand, the crew was extremely accommodating: they really wanted to do a good job and to get things right.
The most amusing tidbit: I knew that local search would take O(n^4) time to 2-color an n×n chessboard (2-coloring being a special case of 2SAT, to which Schöning’s algorithm applies), but I didn’t know the constant.  So I wrote a program to get the specific number of steps when n=8 (it’s about 5000).  I then repeatedly modified and reran the program during the taping, as we slightly changed what we were talking about.  It was the first coding I’d done in a while.
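I haven’t posted the actual program, but here’s a minimal Python sketch of the sort of thing I mean (a reconstruction written for this post, using the counting convention from the update at the top: every examination of a random adjacent pair counts as a step, whether or not a color gets changed).

```python
import random

def local_search_2coloring(n=8, seed=None):
    """Start from a random 2-coloring of an n-by-n board; repeatedly examine a
    random pair of adjacent squares and, if they share a color, flip one of the
    two at random.  Returns (examinations, color changes) until the coloring is
    proper.  A reconstruction for illustration, not the program from the show."""
    rng = random.Random(seed)
    color = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
    # All horizontally and vertically adjacent pairs of squares.
    edges = [((i, j), (i, j + 1)) for i in range(n) for j in range(n - 1)] + \
            [((i, j), (i + 1, j)) for i in range(n - 1) for j in range(n)]

    def proper():
        return all(color[a[0]][a[1]] != color[b[0]][b[1]] for a, b in edges)

    examinations = changes = 0
    while not proper():
        (a, b) = rng.choice(edges)        # examine a random adjacent pair
        examinations += 1
        if color[a[0]][a[1]] == color[b[0]][b[1]]:
            i, j = rng.choice((a, b))     # conflict: flip one endpoint at random
            color[i][j] ^= 1
            changes += 1
    return examinations, changes

ex, ch = local_search_2coloring(8)
print(f"examinations: {ex}, color changes: {ch}")
```

The numbers bounce around a lot from run to run and depend on implementation details; the point is the polynomial (roughly n^4) scaling, not the constant.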
Q: How much of the segment was your idea, and how much was theirs?
The chessboard was my idea, but the “SuperScott” bit was theirs.  Luddite that I am, I was just going to get down on hands and knees and move apples and oranges around on the chessboard myself.
Also, they wanted me to speak in front of a church in Boston, to make a point about how many people believe that God created the universe.  I nixed that idea and said, why not just do the whole shoot in the Stata Center?  I mean, MIT spent $300 million just to make the building where I work as “visually arresting” as possible—at the expense of navigability, leakage-resilience, and all sorts of other criteria—so why not take advantage of it?  Plus, that way I’ll be able to crack a joke about how Stata actually looks like it was created by that favorite creationist strawman, a tornado passing through a junkyard.
Needless to say, all the stuff with me drawing complexity class inclusion diagrams on the whiteboard, reading my and Alex Arkhipov’s linear-optics paper, walking around outside with an umbrella, lifting the umbrella to face the camera dramatically—that was all just the crew telling me what to do.  (Well, OK, they didn’t tell me what to write on the whiteboard or view on my computer, just that it should be something sciencey.  And the umbrella thing wasn’t planned: it really just happened to be raining that day.)
Q: Don’t you realize that not a word of what you said was new—indeed, that all you did was to translate the logic of natural selection, which Darwin understood in 1859, into algorithms and complexity language?
Yes, of course, and I’m sorry if the show gave anyone the impression otherwise.  I repeatedly begged them not to claim newness or originality for anything I was saying.  On the other hand, one shouldn’t make the mistake of assuming that what’s obvious to nerds who read science blogs is obvious to everyone else: I know for a fact that it isn’t.
Q: Don’t you understand that you can’t “prove” mathematically that evolution by natural selection is really what happened in Nature?
Of course!  You can’t even prove mathematically that bears crap in the woods (unless crapping in the woods were taken as part of the definition of bears).  To the writers’ credit, they did have Morgan Freeman explain that I wasn’t claiming to have “proved” evolution.  Personally, I wish Freeman had gone even further—to say that, at present, we don’t even have mathematical theories that would explain from first principles why 4 billion years is a “reasonable” amount of time for natural selection to have gotten from the primordial soup to humans and other complex life, whereas (say) 40 million years is not a reasonable amount.  One could imagine such theories, but we don’t really have any.  What we do have is (a) the observed fact that evolution did happen in 4 billion years, and (b) the theory of natural selection, which explains in great detail why one’s initial intuition—that such evolution can’t possibly have happened by “blind, chance natural processes” alone—is devoid of force.
Q: Watching yourself presented in such a goony way—scribbling Complicated Math Stuff on a whiteboard, turning dramatically toward the camera, etc. etc.—didn’t you feel silly?
Some of it is silly, no two ways about it!  On the other hand, I feel satisfied that I got across at least one correct and important scientific point to hundreds of thousands of people.  And that, one might argue, is sufficiently worthwhile that it should outweigh any embarrassment about how goofy I look.
Time again for Shtetl-Optimized’s Mistake of the Week series! This week my inspiration comes from a paper that’s been heating up the quantum blogosphere (the Blochosphere?): Is Fault-Tolerant Quantum Computation Really Possible? by M. I. Dyakonov. I’ll start by quoting my favorite passages:
> The enormous literature devoted to this subject (Google gives 29300 hits for “fault-tolerant quantum computation”) is purely mathematical. It is mostly produced by computer scientists with a limited understanding of physics and a somewhat restricted perception of quantum mechanics as nothing more than unitary transformations in Hilbert space plus “entanglement.”
>
> Whenever there is a complicated issue, whether in many-particle physics, climatology, or economics, one can be almost certain that no theorem will be applicable and/or relevant, because the explicit or implicit assumptions, on which it is based, will never hold in reality.
I’ll leave the detailed critique of Dyakonov’s paper to John Preskill, the Pontiff, and other “computer scientists” who understand the fault-tolerance theorem much better than a mere physicist like me. Here I instead want to take issue with an idea that surfaces again and again in Dyakonov’s paper, is almost universally accepted, but is nevertheless false. The idea is this: that it’s possible for a theory to “work on paper but not in the real world.”
The proponents of this idea go wrong, not in thinking that a theory can fail in the real world, but in thinking that if it fails, then the theory can still “work on paper.” If a theory claims to describe a phenomenon but doesn’t, then the theory doesn’t work, period — neither in the real world nor on paper. In my view, the refrain that something “works on paper but not in the real world” serves mainly as an intellectual crutch: a way for the lazy to voice their opinion that something feels wrong to them, without having to explain how or where it’s wrong.
“Ah,” you say, “but theorists often make assumptions that don’t hold in the real world!” Yes, but you’re sidestepping the key question: did the theorists state their assumptions clearly or not? If they didn’t, then the fault lies with them; if they did, then the fault lies with those practitioners who would milk a nonspherical cow like a spherical one.
To kill a theory (in the absence of direct evidence), you need to pinpoint which of its assumptions are unfounded and why. You don’t become more convincing by merely finding more assumptions to criticize; on the contrary, the “hope something sticks” approach usually smacks of desperation:
> There’s no proof that the Earth’s temperature is rising, but even if there was, there’s no proof that humans are causing it, but even if there was, there’s no proof that it’s anything to worry about, but even there was, there’s no proof that we can do anything about it, but even if there was, it’s all just a theory anyway!
As should be clear, “just a theory” is not a criticism: it’s a kvetch.
> Marge: I really think this is a bad idea.
Homer: Marge, I agree with you — in theory. In theory, communism works. In theory.
Actually, let’s look at Homer’s example of communism, since nothing could better illustrate my point. When people say that communism works “in theory,” they presumably mean that it works if everyone is altruistic. But regulating selfishness is the whole problem political systems are supposed to solve in the first place! Any political system that defines the problem away doesn’t work on paper, any more than “Call a SAT oracle” works on paper as a way to solve NP-complete problems. Once again, we find the “real world / paper” distinction used as a cover for intellectual laziness.
Let me end this rant by preempting the inevitable cliché that “in theory, there’s no difference between theory and practice; in practice, there is.” Behold my unanswerable retort:
> In theory, there’s no difference between theory and practice even in practice.
P, NP, and Friends. A whole undergraduate complexity course in one HTML file.
For those of you who already know this stuff: forgive me for boring you. For those who don’t: read, learn, and join the Enlightened.
Note: The comment section was down all day, but it’s back now. Google ought to be ashamed to be running something as rickety and unreliable as Blogger. Well, I guess you get what you pay for.
Sorry for the inordinate delay in updating! This weekend I was busy with several things, one of which was “EinsteinFest,” Perimeter Institute’s celebration of the hundred-year anniversary of Einstein’s annus mirabilis. The Fest is a monthlong program of exhibits, talks, etc., aimed at the general public, and covering four topics: “The Science, The Times, The Man, The Legacy.” This weekend’s talks were about “The Man,” which is why I attended.
See, I was worried that the Fest would place too much emphasis on Einstein the sockless symbol of scientific progress, Einstein the secular saint, Einstein the posthumous salesman for Perimeter Institute. And how could it do that without indulging in the very pomposity that Einstein himself detested?
My fears were not assuaged by the many exhibits devoted to Freud, Picasso, the Wright Brothers, the automobile, fashion at the turn of the century, and so on. These exhibits gave visitors the impression of a great band of innovators marching into the future, with Einstein cheerfully in front. The reality, of course, is that Einstein never marched in anyone’s band, and — like his friend Gödel — saw himself as opposed to the main intellectual currents of the time.
It didn’t help either that, to handle the influx of visitors, Perimeter has basically transformed itself into Relativistic Disney World — complete with tickets, long lines, guides wearing uniforms, signs directing traffic, cordoned-off areas, and an outdoor tent for kids called “Physica Fantastica.” To some extent I guess this was unavoidable, although sometimes it resulted in unintended comedy:
(Sorry, I just bought a digital camera and couldn’t resist.)
So it was a pleasure to attend the talks on “Einstein the Man” and find that, in spite of everything, they were fantastic. We heard David Rowe on Einstein and politics, Trevor Lipscombe on Einstein and Mileva, and John Dawson on Einstein and Gödel. Partly these speakers won me over with wisecracks (Dawson: “Gödel thought he’d found a flaw in the Constitution, by which the US could legally turn into a dictatorship. In light of recent events, I don’t see why anyone would doubt him”). But mostly they just let the old man speak for himself. We saw Einstein write the following to his then-mistress Elsa:
> If you were to recite the most beautiful poem ever so divinely, the joy I would derive from it would not come close to the joy I experienced when I received the mushrooms and goose cracklings you cooked.
And to Mileva, during the months when he was finishing general relativity:
> You will see to it: (1) that my clothes and linen are kept in order; (2) that I am served three regular meals a day in my room; (3) that my bedroom and study are always kept in good order and that my desk is not touched by anyone other than me.
We saw Einstein the pacifist urging the Allies to rearm at once against Hitler, and Einstein the secular internationalist supporting the creation of Israel. And eventually we came to understand that this was not an oracle spouting wisdom from God; it was just a guy with a great deal of common sense — as much common sense as anyone’s ever had. Isn’t it strange that, despite deserving to be celebrated, he is?
Today I experiment with “tweeting”: writing <=140-character announcements, but posting them to my blog.  Like sending lolcat videos by mail
Last week at QCrypt in Waterloo: http://2013.qcrypt.net This week at CQIQC in Toronto: http://tinyurl.com/kfexzv6 Back with Lily in between
While we debate D-Wave, ID Quantique et al. quietly sold ~100 quantum crypto devices. Alas, market will remain small unless RSA compromised
One speaker explained how a photon detector works by showing this YouTube video: http://tinyurl.com/k8x4btx Couldn’t have done better
Luca Trevisan asks me to spread the word about a conference for LGBTs in technology: www.outforundergrad.org/technology
Steven Pinker stands up for the Enlightenment in The New Republic: “Science Is Not Your Enemy” http://tinyurl.com/l26ppaf
Think Pinker was exaggerating?  Read Leon Wieseltier’s defiantly doofusy Brandeis commencement speech: http://tinyurl.com/jwhj8ub
Black-hole firewalls make the New York Times, a week before the firewall workshop at KITP (I’ll be there): http://tinyurl.com/kju9crj
You probably already saw the Schrodinger cat Google doodle: http://tinyurl.com/k8et44p For me, the ket was much cooler than the cat
While working on BosonSampling yesterday, (1/6)pi^2 and Euler-Mascheroni constant made unexpected appearances.  What I live for
Updates (Aug. 29): John Preskill now has a very nice post summarizing the different views on offer at the firewall workshop, thereby alleviating my guilt for giving you only the mess below.  Thanks, John!
And if you check out John’s Twitter feed (which you should), you’ll find another, unrelated gem: a phenomenal TEDx talk on quantum computing by my friend, coauthor, and hero, the Lowerboundsman of Latvia, Andris Ambainis.  (Once again, when offered a feast of insight to dispel their misconceptions and ennoble their souls, the YouTube commenters are distinguishing themselves by focusing on the speaker’s voice.  Been there, man, been there.)
* * *
So, last week I was at the Fuzzorfire workshop at the Kavli Institute for Theoretical Physics in Santa Barbara, devoted to the black hole firewall paradox.  (The workshop is still going on this week, but I needed to get back early.)  For some background:
* The original paper by Almheiri et al. (from July 2012, so now “ancient history”)
* New York Times article by Dennis Overbye
* Quanta article by Jennifer Ouellette
* Blog post by John Preskill
I had fantasies of writing a long, witty blog post that would set out my thoughts about firewalls, full of detailed responses to everything I’d heard at the conference, as well as ruminations about Harlow and Hayden’s striking argument that computational complexity might provide a key to resolving the paradox.  But the truth is, I’m recovering from a nasty stomach virus, am feeling “firewalled out,” and wish to use my few remaining non-childcare hours before the semester starts to finish writing papers.  So I decided that better than nothing would be a hastily-assembled pastiche of links.
First and most important, you can watch all the talks online.  In no particular order:
* My talk (about the computational complexity underpinnings of the Harlow-Hayden argument)
* Stephen Hawking’s 10-minute talk by videoconference, denying that firewalls form, and basically repeating his position from his black hole bet concession speech of 2004 (I confess that I don’t really understand his arguments)
* Lenny Susskind’s ER=EPR talk
* Bill Unruh’s entertaining talk denouncing the other participants’ obsession with unitarity, and defending what he sees as the simplest solution to all the black hole information problems: information dropped into a black hole is simply gone forever, “buh-bye!”
* All the other talks
Here’s my own attempt to summarize what’s at stake, adapted from a comment on Peter Woit’s blog (see also a rapid response by Lubos):
As I understand it, the issue is actually pretty simple. Do you agree that
(1) the Hawking evaporation process should be unitary, and
(2) the laws of physics should describe the experiences of an infalling observer, not just those of an observer who stays outside the horizon?
If so, then you seem forced to accept
(3) the interior degrees of freedom should just be some sort of scrambled re-encoding of the exterior degrees, rather than living in a separate subfactor of Hilbert space (since otherwise we’d violate unitarity).
But then we get
(4) by applying a suitable unitary transformation to the Hawking radiation of an old enough black hole before you jump into it, someone ought to be able, in principle, to completely modify what you experience when you do jump in.  Moreover, that person could be far away from you—an apparent gross violation of locality.
So, there are a few options: you could reject either (1) or (2). You could bite the bullet and accept (4). You could say that the “experience of an infalling observer” should just be to die immediately at the horizon (firewalls). You could argue that for some reason (e.g., gravitational backreaction, or computational complexity), the unitary transformations required in (4) are impossible to implement even in principle. Or you could go the “Lubosian route,” and simply assert that the lack of any real difficulty is so obvious that, if you admit to being confused, then that just proves you’re an idiot.  AdS/CFT is clearly relevant, but as Polchinski pointed out, it does surprisingly little to solve the problem.
Now, what Almheiri et al. (AMPS) added to the simple logical argument above was really to make the consequence (4) more “concrete” and “vivid”—by describing something that, in principle, someone could actually do to the Hawking radiation before jumping in, such that after you jumped in, if there wasn’t anything dramatic that happened—something violating local QFT and the equivalence principle—then you’d apparently observe a violation of the monogamy of entanglement, a basic principle of quantum mechanics.  I’m sure the bare logic (1)-(4) was known to many people before AMPS: I certainly knew it, but I didn’t call it a “paradox,” I just called it “I don’t understand black hole complementarity”!
At any rate, thinking about the “Hawking radiation decoding problem” already led me to some very nice questions in quantum computing theory, which remain interesting even if you remove the black hole motivation entirely. And that helped convince me that something new and worthwhile might indeed come out of this business, despite how much fun it is. (Hopefully whatever does come out won’t be as garbled as Hawking radiation.)
For continuing live updates from the workshop, check out John Preskill’s Twitter feed.
Or you can ask me to expand on various things in the comments, and I’ll do my best.  (As I said in my talk, while I’m not sure that the correct quantum description of the black hole interior is within anyone’s professional expertise, it’s certainly outside of mine!  But I do find this sort of thing fun to think about—how could I not?)
Unrelated, but also of interest: check out an excellent article in Quanta by Erica Klarreich, about the recent breakthroughs by Reichardt-Unger-Vazirani, Vazirani-Vidick, and others on classical command of quantum systems.
According to my usage statistics, of the people who come to scottaaronson.com via a search engine, about 5% do so by typing in one of the following queries:
> biggest number in the world
the biggest number in the world
what is the largest number
largest number in the world
what is the biggest number
These people are then led to my big numbers essay, which presumably befuddles them even more.
So, let me satisfy the public’s curiosity once and for all: the biggest number in the world is a million billion gazillion. But stay tuned: even as I write, Space Shuttle astronauts are combing the galaxy for an even bigger number!
Update (Sept. 9): Reading more about these things, and talking to friends who are experts in applied cryptography, has caused me to do the unthinkable, and change my mind somewhat.  I now feel that, while the views expressed in this post were OK as far as they went, they failed to do justice to the … complexity (har, har) of what’s at stake.  Most importantly, I didn’t clearly explain that there’s an enormous continuum between, on the one hand, a full break of RSA or Diffie-Hellman (which still seems extremely unlikely to me), and on the other, “pure side-channel attacks” involving no new cryptanalytic ideas.  Along that continuum, there are many plausible places where the NSA might be.  For example, imagine that they had a combination of side-channel attacks, novel algorithmic advances, and sheer computing power that enabled them to factor, let’s say, ten 2048-bit RSA keys every year.  In such a case, it would still make perfect sense that they’d want to insert backdoors into software, sneak vulnerabilities into the standards, and do whatever else it took to minimize their need to resort to such expensive attacks.  But the possibility of number-theoretic advances well beyond what the open world knows certainly wouldn’t be ruled out.  Also, as Schneier has emphasized, the fact that NSA has been aggressively pushing elliptic-curve cryptography in recent years invites the obvious speculation that they know something about ECC that the rest of us don’t.
And that brings me to a final irony in this story.  When a simpleminded complexity theorist like me hears his crypto friends going on and on about the latest clever attack that still requires exponential time, but that puts some of the keys in current use just within reach of gigantic computing clusters, his first instinct is to pound the table and shout: “well then, so why not just increase all your key sizes by a factor of ten?  Sweet Jesus, the asymptotics are on your side!  if you saw a killer attack dog on a leash, would you position yourself just outside what you guesstimated to be the leash’s radius?  why not walk a mile away, if you can?”  The crypto experts invariably reply that it’s a lot more complicated than I realize, because standards, and efficiency, and smartphones … and before long I give up and admit that I’m way out of my depth.
So it’s amusing that one obvious response to the recent NSA revelations—a response that sufficiently-paranoid people, organizations, and governments might well actually take, in practice—precisely matches the naïve complexity-theorist intuition.  Just increase the damn key sizes by a factor of ten (or whatever).
Another Update (Sept. 20): In my original posting, I should also have linked to Matthew Green’s excellent post.  My bad.
* * *
Last week, I got an email from a journalist with the following inquiry.  The recent Snowden revelations, which made public for the first time the US government’s “black budget,” contained the following enigmatic line from the Director of National Intelligence: “We are investing in groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.”  So, the journalist wanted to know, what could these “groundbreaking” capabilities be?  And in particular, was it possible that the NSA was buying quantum computers from D-Wave, and using them to run Shor’s algorithm to break the RSA cryptosystem?
I replied that, yes, that’s “possible,” but only in the same sense that it’s “possible” that the NSA is using the Easter Bunny for the same purpose.  (For one thing, D-Wave themselves have said repeatedly that they have no interest in Shor’s algorithm or factoring.  Admittedly, I guess that’s what D-Wave would say, were they making deals with NSA on the sly!  But it’s also what the Easter Bunny would say.)  More generally, I said that if the open scientific world’s understanding is anywhere close to correct, then quantum computing might someday become a practical threat to cryptographic security, but it isn’t one yet.
That, of course, raised the extremely interesting question of what “groundbreaking capabilities” the Director of National Intelligence was referring to.  I said my personal guess was that, with ~99% probability, he meant various implementation vulnerabilities and side-channel attacks—the sort of thing that we know has compromised deployed cryptosystems many times in the past, but where it’s very easy to believe that the NSA is ahead of the open world.  With ~1% probability, I guessed, the NSA made some sort of big improvement in classical algorithms for factoring, discrete log, or other number-theoretic problems.  (I would’ve guessed even less than 1% probability for the latter, before the recent breakthrough by Joux solving discrete log in fields of small characteristic in quasipolynomial time.)
Then, on Thursday, a big New York Times article appeared, based on 50,000 or so documents that Snowden leaked to the Guardian and that still aren’t public.  (See also an important Guardian piece by security expert Bruce Schneier, and accompanying Q&A.)  While a lot remains vague, there might be more public information right now about current NSA cryptanalytic capabilities than there’s ever been.
So, how did my uninformed, armchair guesses fare?  It’s only halfway into the NYT article that we start getting some hints:
> The files show that the agency is still stymied by some encryption, as Mr. Snowden suggested in a question-and-answer session on The Guardian’s Web site in June.
> “Properly implemented strong crypto systems are one of the few things that you can rely on,” he said, though cautioning that the N.S.A. often bypasses the encryption altogether by targeting the computers at one end or the other and grabbing text before it is encrypted or after it is decrypted…
> Because strong encryption can be so effective, classified N.S.A. documents make clear, the agency’s success depends on working with Internet companies — by getting their voluntary collaboration, forcing their cooperation with court orders or surreptitiously stealing their encryption keys or altering their software or hardware…
> Simultaneously, the N.S.A. has been deliberately weakening the international encryption standards adopted by developers. One goal in the agency’s 2013 budget request was to “influence policies, standards and specifications for commercial public key technologies,” the most common encryption method.
> Cryptographers have long suspected that the agency planted vulnerabilities in a standard adopted in 2006 by the National Institute of Standards and Technology and later by the International Organization for Standardization, which has 163 countries as members.
> Classified N.S.A. memos appear to confirm that the fatal weakness, discovered by two Microsoft cryptographers in 2007, was engineered by the agency. The N.S.A. wrote the standard and aggressively pushed it on the international group, privately calling the effort “a challenge in finesse.”
So, in pointing to implementation vulnerabilities as the most likely possibility for an NSA “breakthrough,” I might have actually erred a bit too far on the side of technological interestingness.  It seems that a large part of what the NSA has been doing has simply been strong-arming Internet companies and standards bodies into giving it backdoors.  To put it bluntly: sure, if it wants to, the NSA can probably read your email.  But that isn’t mathematical cryptography’s fault—any more than it would be mathematical crypto’s fault if goons broke into your house and carted away your laptop.  On the contrary, properly-implemented, backdoor-less strong crypto is something that apparently scares the NSA enough that they go to some lengths to keep it from being widely used.
I should add that, regardless of how NSA collects all the private information it does—by “beating crypto in a fair fight” (!) or, more likely, by exploiting backdoors that it itself installed—the mere fact that it collects so much is of course unsettling enough from a civil-liberties perspective.  So I’m glad that the Snowden revelations have sparked a public debate in the US about how much surveillance we as a society want (i.e., “the balance between preventing 9/11 and preventing Orwell”), what safeguards are in place to prevent abuses, and whether those safeguards actually work.  Such a public debate is essential if we’re serious about calling ourselves a democracy.
At the same time, to me, perhaps the most shocking feature of the Snowden revelations is just how unshocking they’ve been.  So far, I haven’t seen anything that shows the extent of NSA’s surveillance to be greater than what I would’ve considered plausible a priori.  Indeed, the following could serve as a one-sentence summary of what we’ve learned from Snowden:
Yes, the NSA is, in fact, doing the questionable things that anyone not living in a cave had long assumed they were doing—that assumption being so ingrained in nerd culture that countless jokes are based around it.
(Come to think of it, people living in caves might have been even more certain that the NSA was doing those things.  Maybe that’s why they moved to caves.)
So, rather than dwelling on civil liberties, national security, yadda yadda yadda, let me move on to discuss the implications of the Snowden revelations for something that really matters: a 6-year-old storm in theoretical computer science’s academic teacup.  As many readers of this blog might know, Neal Koblitz—a respected mathematician and pioneer of elliptic curve cryptography, ~~who (from numerous allusions in his writings) appears to have some connections at the NSA~~ (on reflection, this is unfair to Koblitz; he does report conversations with NSA people in his writings, but has never had any financial connection with NSA)—published a series of scathing articles, in the Notices of the American Mathematical Society and elsewhere, attacking the theoretical computer science approach to cryptography.  Koblitz’s criticisms were varied and entertainingly-expressed: the computer scientists are too sloppy, deadline-driven, self-promoting, and corporate-influenced; overly trusting of so-called “security proofs” (a term they shouldn’t even use, given how many errors and exaggerated claims they make); absurdly overreliant on asymptotic analysis; “bodacious” in introducing dubious new hardness assumptions that they then declare to be “standard”; and woefully out of touch with cryptographic realities.  Koblitz seemed to suggest that, rather than demanding the security reductions so beloved by theoretical computer scientists, people would do better to rest the security of their cryptosystems on two alternative pillars: first, standards set by organizations like the NSA with actual real-world experience; and second, the judgments of mathematicians with … taste and experience, who can just see what’s likely to be vulnerable and what isn’t.
Back in 2007, my mathematician friend Greg Kuperberg pointed out the irony to me: here we had a mathematician, lambasting computer scientists for trying to do for cryptography what mathematics itself has sought to do for everything since Euclid!  That is, when you see an unruly mess of insights, related to each other in some tangled way, systematize and organize it.  Turn the tangle into a hierarchical tree (or dag).  Isolate the minimal assumptions (one-way functions?  decisional Diffie-Hellman?) on which each conclusion can be based, and spell out all the logical steps needed to get from here to there—even if the steps seem obvious or boring.  Any time anyone has tried to do that, it’s been easy for the natives of the unruly wilderness to laugh at the systematizing newcomers: the latter often know the terrain less well, and take ten times as long to reach conclusions that are ten times less interesting.  And yet, in case after case, the clarity and rigor of the systematizing approach has eventually won out.  So it seems weird for a mathematician, of all people, to bet against the systematizing approach when applied to cryptography.
The reason I’m dredging up this old dispute now, is that I think the recent NSA revelations might put it in a slightly new light.  In his article—whose main purpose is to offer practical advice on how to safeguard one’s communications against eavesdropping by NSA or others—Bruce Schneier offers the following tip:
> Prefer conventional discrete-log-based systems over elliptic-curve systems; the latter have constants that the NSA influences when they can.
Here Schneier is pointing out a specific issue with ECC, which would be solved if we could “merely” ensure that NSA or other interested parties weren’t providing input into which elliptic curves to use.  But I think there’s also a broader issue: that, in cryptography, it’s unwise to trust any standard because of the prestige, real-world experience, mathematical good taste, or whatever else of the people or organizations proposing it.  What was long a plausible conjecture—that the NSA covertly influences cryptographic standards to give itself backdoors, and that otherwise-inexplicable vulnerabilities in deployed cryptosystems are sometimes there because the NSA wanted them there—now looks close to an established fact.  In cryptography, then, it’s not just for idle academic reasons that you’d like a publicly-available trail of research papers and source code, open to criticism and improvement by anyone, that takes you all the way from the presumed hardness of an underlying mathematical problem to the security of your system under whichever class of attacks is relevant to you.
Schneier’s final piece of advice is this: “Trust the math.  Encryption is your friend.”
“Trust the math.”  On that note, here’s a slightly-embarrassing confession.  When I’m watching a suspense movie (or a TV show like Homeland), and I reach one of those nail-biting scenes where the protagonist discovers that everything she ever believed is a lie, I sometimes mentally recite the proof of the Karp-Lipton Theorem.  It always calms me down.  Even if the entire universe turned out to be a cruel illusion, it would still be the case that NP ⊂ P/poly would collapse the polynomial hierarchy, and I can tell you exactly why.  It would likewise be the case that you couldn’t break the GGM pseudorandom function without also breaking the underlying pseudorandom generator on which it’s based.  Math could be defined as that which can still be trusted, even when you can’t trust anything else.
Yesterday’s Times ran an essay by Steve Lohr, based on speeches about the future of computing given by my former teachers Richard Karp and Jon Kleinberg. Though most of the essay is welcome and unobjectionable, let’s look at the first two paragraphs:
> Computer science is not only a comparatively young field, but also one that has had to prove it is really science. Skeptics in academia would often say that after Alan Turing described the concept of the “universal machine” in the late 1930’s — the idea that a computer in theory could be made to do the work of any kind of calculating machine, including the human brain — all that remained to be done was mere engineering.
>
> The more generous perspective today is that decades of stunningly rapid advances in processing speed, storage and networking, along with the development of increasingly clever software, have brought computing into science, business and culture in ways that were barely imagined years ago. The quantitative changes delivered through smart engineering opened the door to qualitative changes.
So, here are the two options on offer from the paper of record: either
1. computer science was finished off by Alan Turing, or
2. “stunningly rapid advances in processing speed, storage and networking” have reopened it just recently.
Even among the commenters on this post by Chad Orzel — which Dave Bacon forwarded to me with the subject line “bait” — awareness of any third possibility seems depressingly rare. Judging from the evidence, it’s not that people have engaged the mysteries of P versus NP, randomness and determinism, one-way functions and interactive proofs, and found them insufficiently deep. Rather, as bizarre as it sounds, it’s that people don’t know these mysteries exist — just as they wouldn’t know about black holes or the Big Bang if no one told them. If you want to understand why our subject — which, by any objective standard, has contributed at least as much over the last 30 years as (say) particle physics or cosmology to humankind’s basic picture of the universe — receives a whopping $5 million a year from the NSF (with even that in constant danger), look no further.
Sean Carroll, who many of you know from Cosmic Variance, asked the following question in response to my last entry:
> I’m happy to admit that I don’t know anything about “one-way functions and interactive proofs.” So, in what sense has theoretical computer science contributed more in the last 30 years to our basic understanding of the universe than particle physics or cosmology? (Despite the fact that I’m a cosmologist, I don’t doubt your statement — I’d just like to be able to explain it in public.)
I posted my response as a comment, but it’s probably better to make it an entry of its own. So:
Hi Sean,
Thanks for your question!
Of course I was joking when I mentioned “objective standards” for ranking scientific fields. Depending on which questions keep you up at night, different parts of “humankind’s basic picture of the universe” will seem larger or smaller. (To say that, of course, is not to suggest any relativism about the picture itself.)
What I can do, though, is to tell you why — by my own subjective standards — the contributions of theoretical computer science over the last 30 years rival those of theoretical physics or any other field I know about. Of course, people will say I only think that because I’m a theoretical computer scientist, but that gets the causal arrow wrong: I became a theoretical computer scientist because, as a teenager, I thought it!
It’s probably best to start with some examples.
1. We now know that, if an alien with enormous computational powers came to Earth, it could prove to us whether White or Black has the winning strategy in chess. To be convinced of the proof, we would not have to trust the alien or its exotic technology, and we would not have to spend billions of years analyzing one move sequence after another. We’d simply have to engage in a short conversation with the alien about the sums of certain polynomials over finite fields.
2. There’s a finite (and not unimaginably-large) set of boxes, such that if we knew how to pack those boxes into the trunk of your car, then we’d also know a proof of the Riemann Hypothesis. Indeed, every formal proof of the Riemann Hypothesis with at most (say) a million symbols corresponds to some way of packing the boxes into your trunk, and vice versa. Furthermore, a list of the boxes and their dimensions can be feasibly written down.
3. Supposing you do prove the Riemann Hypothesis, it’s possible to convince someone of that fact, without revealing anything other than the fact that you proved it. It’s also possible to write the proof down in such a way that someone else could verify it, with very high confidence, having only seen 10 or 20 bits of the proof.
4. If every second or so your computer’s memory were wiped completely clean, except for the input data; the clock; a static, unchanging program; and a counter that could only be set to 1, 2, 3, 4, or 5, it would still be possible (given enough time) to carry out an arbitrarily long computation — just as if the memory weren’t being wiped clean each second. This is almost certainly not true if the counter could only be set to 1, 2, 3, or 4. The reason 5 is special here is pretty much the same reason it’s special in Galois’ proof of the unsolvability of the quintic equation.
5. It would be great to prove that RSA is unbreakable by classical computers. But every known technique for proving that would, if it worked, simultaneously give an algorithm for breaking RSA! For example, if you proved that RSA with an n-bit key took n^5 steps to break, you would’ve discovered an algorithm for breaking it in 2^(n^(1/5)) steps. If you proved that RSA took 2^(n^(1/3)) steps to break, you would’ve discovered an algorithm for breaking it in n^((log n)^2) steps. As you show the problem to be harder, you simultaneously show it to be easier.
Alright, let me stop before I get carried away. The examples I’ve listed (and hundreds more like them) are not exactly discoveries about physics, but they don’t have the flavor of pure math either. And even if they have some practical implications for computing (which they do), they certainly don’t have the flavor of nitty-gritty software engineering.
So what are they then? Maybe it’s helpful to think of them as “quantitative epistemology”: discoveries about the capacities of finite beings like ourselves to learn mathematical truths. On this view, the theoretical computer scientist is basically a mathematical logician on a safari to the physical world: someone who tries to understand the universe by asking what sorts of mathematical questions can and can’t be answered within it. Not whether the universe is a computer, but what kind of computer it is! Naturally, this approach to understanding the world tends to appeal most to people for whom math (and especially discrete math) is reasonably clear, whereas physics is extremely mysterious.
In my opinion, one of the biggest challenges for our time is to integrate the enormous body of knowledge in theoretical computer science (or quantitative epistemology, or whatever you want to call it) with the rest of what we know about the universe. In the past, the logical safari mostly stayed comfortably within 19th-century physics; now it’s time to venture out into the early 20th century. Indeed, that’s exactly why I chose to work on quantum computing: not because I want to build quantum computers (though I wouldn’t mind that), but because I want to know what a universe that allows quantum computers is like.
Incidentally, it’s also why I try hard to keep up with your field. If I’m not mistaken, less than a decade ago cosmologists made an enormous discovery about the capacity of finite beings to learn mathematical truths: namely, that no computation carried out in the physical world can ever involve more than 1/Λ ~ 10^122 bits.
Best,
Scott
Update (9/24): This parody post was a little like a belch: I felt it build up in me as I read about the topic, I let it out, it was easy and amusing, I don’t feel any profound guilt over it—but on the other hand, not one of the crowning achievements of my career.  As several commenters correctly pointed out, it may be true that, mostly because of the name and other superficialities, and because of ill-founded speculations about “the death of locality and unitarity,” the amplituhedron work is currently inspiring a flood of cringe-inducing misstatements on the web.  But, even if true, still the much more interesting questions are what’s actually going on, and whether or not there are nontrivial connections to computational complexity.
Here I have good news: if nothing else, my “belch” of a post at least attracted some knowledgeable commenters to contribute excellent questions and insights, which have increased my own understanding of the subject from ε^2 to ε.  See especially this superb comment by David Speyer—which, among other things, pointed me to a phenomenal quasi-textbook on this subject by Elvang and Huang.  My most immediate thoughts:
1. The “amplituhedron” is only the latest in a long line of research over the last decade—Witten, Turing biographer Andrew Hodges, and many others have been important players—on how to compute scattering amplitudes more efficiently than by summing zillions of Feynman diagrams.  One of the key ideas is to find combinatorial formulas that express complicated scattering amplitudes recursively in terms of simpler ones.
2. This subject seems to be begging for a computational complexity perspective.  When I read Elvang and Huang, I felt like they were working hard not to say anything about complexity: discussing the gains in efficiency from the various techniques they consider in informal language, or in terms of concrete numbers of terms that need to be summed for 1 loop, 2 loops, etc., but never in terms of asymptotics.  So if it hasn’t been done already, it looks like it could be a wonderful project for someone just to translate what’s already known in this subject into complexity language.
3. On reading about all these “modern” approaches to scattering amplitudes, one of my first reactions was to feel slightly less guilty about never having learned how to calculate Feynman diagrams!  For, optimistically, it looks like some of that headache-inducing machinery (ghosts, off-shell particles, etc.) might be getting less relevant anyway—there being ways to calculate some of the same things that are not only more conceptually satisfying but also faster.
* * *
Many readers of this blog probably already saw Natalie Wolchover’s Quanta article “A Jewel at the Heart of Quantum Physics,” which discusses the “amplituhedron”: a mathematical structure that IAS physicist Nima Arkani-Hamed and his collaborators have recently been investigating.  (See also here for Slashdot commentary, here for Lubos’s take, here for Peter Woit’s, here for a Physics StackExchange thread, here for Q&A with Pacific Standard, and here for an earlier but closely-related 154-page paper.)
At first glance, the amplituhedron appears to be a way to calculate scattering amplitudes, in the planar limit of a certain mathematically-interesting (but, so far, physically-unrealistic) supersymmetric quantum field theory, much more efficiently than by summing thousands of Feynman diagrams.  In which case, you might say: “wow, this sounds like a genuinely-important advance for certain parts of mathematical physics!  I’d love to understand it better.  But, given the restricted class of theories it currently applies to, it does seem a bit premature to declare this to be a ‘jewel’ that unlocks all of physics, or a death-knell for spacetime, locality, and unitarity, etc. etc.”
Yet you’d be wrong: it isn’t premature at all.  If anything, the popular articles have understated the revolutionary importance of the amplituhedron.  And the reason I can tell you that with such certainty is that, for several years, my colleagues and I have been investigating a mathematical structure that contains the amplituhedron, yet is even richer and more remarkable.  I call this structure the “unitarihedron.”
The unitarihedron encompasses, within a single abstract “jewel,” all the computations that can ever be feasibly performed by means of unitary transformations, the central operation in quantum mechanics (hence the name).  Mathematically, the unitarihedron is an infinite discrete space: more precisely, it’s an infinite collection of infinite sets, which collection can be organized (as can every set that it contains!) in a recursive, fractal structure.  Remarkably, each and every specific problem that quantum computers can solve—such as factoring large integers, discrete logarithms, and more—occurs as just a single element, or “facet” if you will, of this vast infinite jewel.  By studying these facets, my colleagues and I have slowly pieced together a tentative picture of the elusive unitarihedron itself.
One of our greatest discoveries has been that the unitarihedron exhibits an astonishing degree of uniqueness.  At first glance, different ways of building quantum computers—such as gate-based QC, adiabatic QC, topological QC, and measurement-based QC—might seem totally disconnected from each other.  But today we know that all of those ways, and many others, are merely different “projections” of the same mysterious unitarihedron.
In fact, the longer I’ve spent studying the unitarihedron, the more awestruck I’ve been by its mathematical elegance and power.  In some way that’s not yet fully understood, the unitarihedron “knows” so much that it’s even given us new insights about classical computing.  For example, in 1991 Beigel, Reingold, and Spielman gave a 20-page proof of a certain property of unbounded-error probabilistic polynomial-time.  Yet, by recasting things in terms of the unitarihedron, I was able to give a direct, half-page proof of the same theorem.  If you have any experience with mathematics, then you’ll know that that sort of thing never happens: if it does, it’s a sure sign that cosmic or even divine forces are at work.
But I haven’t even told you the most spectacular part of the story yet.  While, to my knowledge, this hasn’t yet been rigorously proved, many lines of evidence support the hypothesis that the unitarihedron must encompass the amplituhedron as a special case.  If so, then the amplituhedron could be seen as just a single sparkle on an infinitely greater jewel.
Now, in the interest of full disclosure, I should tell you that the unitarihedron is what used to be known as the complexity class BQP (Bounded-Error Quantum Polynomial-Time).  However, just like the Chinese gooseberry was successfully rebranded in the 1950s as the kiwifruit, and the Patagonian toothfish as the Chilean sea bass, so with this post, I’m hereby rebranding BQP as the unitarihedron.  For I’ve realized that, when it comes to bowling over laypeople, inscrutable complexity class acronyms are death—but the suffix “-hedron” is golden.
So, journalists and funders: if you’re interested in the unitarihedron, awesome!  But be sure to also ask about my other research on the bosonsamplinghedron and the quantum-money-hedron.  (Though, in recent months, my research has focused even more on the diaperhedron: a multidimensional, topologically-nontrivial manifold rich enough to encompass all wastes that an 8-month-old human could possibly emit.  Well, at least to first-order approximation.)
You asked for ’em, you got ’em. (Do you want fries with that?)
1. Suppose a baby is given some random examples of grammatical and ungrammatical sentences, and based on that, it wants to infer the general rule for whether or not a given sentence is grammatical. If the baby can do this with reasonable accuracy and in a reasonable amount of time, for any “regular grammar” (the very simplest type of grammar studied by Noam Chomsky), then that baby can also break the RSA cryptosystem.
2. Oded Regev recently invented a public-key cryptosystem with an interesting property: though it’s purely classical, his system is only known to be secure under the assumption that certain problems are hard for quantum computers. The upside is that, if these problems are hard for quantum computers, then Regev’s system (unlike RSA) is also secure against attack by quantum computers!
3. Suppose N boys and N girls join a dating service. We write down an N-by-N matrix, where the (i,j) entry equals 1 if the ith boy and the jth girl are willing to date each other, and 0 if they aren’t. We want to know if it’s possible to pair off every boy and girl with a willing partner. Here’s a simple way to find out: first rescale every row of the matrix to sum to 1. Then rescale every column to sum to 1. Then rescale every row, then rescale every column, and so on N^5 times. If at the end of this scaling process, every row and column sum is between 1-1/N and 1+1/N, then it’s possible to pair off the boys and girls; otherwise it isn’t. (See the code sketch after this list.)
4. If two graphs are isomorphic, then a short and simple proof of that fact is just the isomorphism itself. But what if two graphs aren’t isomorphic? Is there also a short proof of that — one that doesn’t require checking every possible way of matching up the vertices? Under a plausible assumption, we now know that there is such a proof, for any pair of non-isomorphic graphs whatsoever (even with the same eigenvalue spectrum, etc). What’s the plausible assumption? It has nothing to do with graphs! Roughly, it’s that a certain problem, which is known to take exponential time for any one algorithm, still takes exponential time for any infinite sequence of algorithms.
5. Suppose we had a small “neural network” with only three or four layers of neurons between the input and output, where the only thing each neuron could do was to compute the sum of its input signals modulo 2. We can prove, not surprisingly, that such a neural net would be extremely limited in its power. Ditto if we replace the 2 by 3, 4, 5, 7, 8, 9, or 11. But if we replace the 2 by 6, 10, or 12, then we no longer know anything! For all we know, a three-layer neural network, composed entirely of “mod 6 neurons,” could solve NP-complete problems in polynomial time.
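To make item 3 concrete, here's a minimal Python sketch of that alternating rescaling test (purely illustrative: the function name and the toy matrices are invented for this sketch, the iteration count is the generous N^5 bound quoted above, and no attempt is made at efficiency or numerical care):

```python
import numpy as np

def has_perfect_matching(A):
    """Sketch of the scaling test from item 3: alternately rescale the rows and
    columns of the 0/1 willingness matrix A, N^5 times, then check whether every
    row and column sum lands within 1/N of 1.  Assumes A has no all-zero row or
    column (a boy or girl willing to date nobody settles the question anyway)."""
    M = np.array(A, dtype=float)
    N = M.shape[0]
    for _ in range(N ** 5):
        M /= M.sum(axis=1, keepdims=True)   # rescale each row to sum to 1
        M /= M.sum(axis=0, keepdims=True)   # rescale each column to sum to 1
    return bool(np.all(np.abs(M.sum(axis=1) - 1) <= 1 / N) and
                np.all(np.abs(M.sum(axis=0) - 1) <= 1 / N))

# Boys 0 and 1 are both willing to date only girl 0, so no perfect matching exists:
print(has_perfect_matching([[1, 0, 0], [1, 0, 0], [1, 1, 1]]))   # False
# Change boy 1's preference and a perfect matching appears:
print(has_perfect_matching([[1, 0, 0], [0, 1, 0], [1, 1, 1]]))   # True
```

Roughly speaking, the reason this works is that the row and column sums can all approach 1 if and only if the matrix has a nonzero permanent, i.e., if and only if a perfect matching exists.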
Let this man’s face serve as a reminder to all my American friends, to haul your respective asses to your respective polling places with no excuses accepted. Keep in mind that this year the Democratic voting day is Tuesday November 7th, while the Republican voting day is Wednesday November 8th.
(Me? I couldn’t find a precinct station in Waterloo for some strange reason, so I mailed an absentee ballot back to New Hope, PA.)
Update (Oct. 3): OK, a sixth announcement.  I just posted a question on CS Theory StackExchange, entitled Overarching reasons why problems are in P or BPP.  If you have suggested additions or improvements to my rough list of “overarching reasons,” please post them over there — thanks!
* * *
1\. I’m in Oxford right now, for a Clay Institute workshop on New Insights into Computational Intractability.  The workshop is concurrent with three others, including one on Number Theory and Physics that includes an amplituhedron-related talk by Andrew Hodges.  (Speaking of which, see here for a small but non-parodic observation about expressing amplitudes as volumes of polytopes.)
2\. I was hoping to stay in the UK one more week, to attend the Newton Institute’s special semester on Mathematical Challenges in Quantum Information over in Cambridge.  But alas I had to cancel, since my diaper-changing services are needed in the other Cambridge.  So, if anyone in Cambridge (or anywhere else in the United Kingdom) really wants to talk to me, come to Oxford this week!
3\. Back in June, Jens Eisert and three others posted a preprint claiming that the output of a BosonSampling device would be “indistinguishable from the uniform distribution” in various senses.  Ever since then, people have been emailing me, leaving comments on this blog, and cornering me at conferences to ask whether Alex Arkhipov and I had any response to these claims.  OK, so just this weekend, we posted our own 41-page preprint, entitled “BosonSampling Is Far From Uniform.”  I hope it suffices by way of reply!  (Incidentally, this is also the paper I hinted at in a previous post: the one where π^2/6 and the Euler-Mascheroni constant make cameo appearances.)  To clarify, if we just wanted to answer the claims of the Eisert group, then I think a couple paragraphs would suffice for that (see, for example, these PowerPoint slides).  In our new paper, however, Alex and I take the opportunity to go further: we study lots of interesting questions about the statistical properties of Haar-random BosonSampling distributions, and about how one might test efficiently whether a claimed BosonSampling device worked, even with hundreds or thousands of photons.
4\. Also on the arXiv last night, there was a phenomenal survey about the quantum PCP conjecture by Dorit Aharonov, Itai Arad, and my former postdoc Thomas Vidick (soon to be a professor at Caltech).  I recommend reading it in the strongest possible terms, if you’d like to see how far people have come with this problem (but also, how far they still have to go) since my “Quantum PCP Manifesto” seven years ago.
5\. Christos Papadimitriou asked me to publicize that the deadline for early registration and hotel reservations for the upcoming FOCS in Berkeley is fast approaching!  Indeed, it’s October 4 (three days from now).  See here for details, and here for information about student travel support.  (The links were down when I just tried them, but hopefully the server will be back up soon.)
Updates (11/8): Alas, video of Eliezer’s talk will not be available after all. The nincompoops who we paid to record the talk wrote down November instead of October for the date, didn’t show up, then stalled for a month before finally admitting what had happened. So my written summary will have to suffice (and maybe Eliezer can put his slides up as well).
In other news, Shachar Lovett has asked me to announce a workshop on complexity and coding theory, which will be held at UC San Diego, January 8-10, 2014.
* * *
Update (10/21): Some readers might be interested in my defense of LessWrongism against a surprisingly-common type of ad-hominem attack (i.e., “the LW ideas must be wrong because so many of their advocates are economically-privileged but socially-awkward white male nerds, the same sorts of people who might also be drawn to Ayn Rand or other stuff I dislike”). By all means debate the ideas—I’ve been doing it for years—but please give beyond-kindergarten arguments when you do so!
* * *
Update (10/18): I just posted a long summary and review of Eliezer Yudkowsky’s talk at MIT yesterday.
* * *
Update (10/15): Leonard Schulman sent me the news that, according to an article by Victoria Woollaston in the Daily Mail, Google hopes to use its D-Wave quantum computer to “solve global warming,” “develop sophisticated artificial life,” and “find aliens.”  (No, I’m not making any of this up: just quoting stuff other people made up.)  The article also repeats the debunked canard that the D-Wave machine is “3600 times faster,” and soberly explains that D-Wave’s 512 qubits compare favorably to the mere 32 or 64 bits found in home PCs (exercise for those of you who aren’t already rolling on the floor: think about that until you are).  It contains not a shadow of a hint of skepticism anywhere, not one token sentence.  I would say that, even in an extremely crowded field, Woollaston’s piece takes the cake as the single most irresponsible article about D-Wave I’ve seen.  And I’d feel terrible for my many friends at Google, whose company comes out of this looking like a laughingstock.  But that’s assuming that this isn’t some sort of elaborate, Sokal-style prank, designed simply to prove that media outlets will publish anything whatsoever, no matter how forehead-bangingly absurd, as long as it contains the words “D-Wave,” “Google,” “NASA,” and “quantum”—and thereby, to prove the truth of what I’ve been saying on this blog since 2007.
* * *
1\. I’ve added MathJax support to the comments section!  If you want to insert an inline LaTeX equation, surround it with \( \), while if you want to insert a displayed equation, surround it with $$ $$.  Thanks very much to Michael Dixon for prodding me to do this and telling me how.
2\. ~~I’ve also added upvoting and downvoting to the comments section!~~  OK, in the first significant use of comment voting, the readers have voted overwhelmingly, by 41 – 13, that they want the comment voting to disappear.  So disappear it has!
3\. Most importantly, I’ve invited Eliezer Yudkowsky to MIT to give a talk!  He’s here all week, and will be speaking on “Recursion in Rational Agents: Foundations for Self-Modifying AI” this Thursday at 4PM in 32-123 in the MIT Stata Center.  Refreshments at 3:45.  See here for the abstract.  Anyone in the area who’s interested in AI, rationalism, or other such nerdy things is strongly encouraged to attend; it should be interesting.  Just don’t call Eliezer a “Singularitarian”: I’m woefully out of the loop, but I learned yesterday that they’ve dropped that term entirely, and now prefer to ~~be known as machine intelligence researchers~~ talk about the intelligence explosion.
(In addition, Paul Christiano—former MIT undergrad, and my collaborator on quantum money—will be speaking today at 4:30 at the Harvard Science Center, on “Probabilistic metamathematics and the definability of truth.”  His talk will be related to Eliezer’s but somewhat more technical.  See here for details.)
* * *
Update (10/15): Alistair Sinclair asked me to post the following announcement.
The Simons Institute for the Theory of Computing at UC Berkeley invites applications for Research Fellowships for academic year 2014-15.
Simons-Berkeley Research Fellowships are an opportunity for outstanding junior scientists (up to 6 years from PhD by Fall 2014) to spend one or two semesters at the Institute in connection with one or more of its programs. The programs for 2014-15 are as follows:
* Algorithmic Spectral Graph Theory (Fall 2014)
* Algorithms and Complexity in Algebraic Geometry (Fall 2014)
* Information Theory (Spring 2015)
Applicants who already hold junior faculty or postdoctoral positions are welcome to apply. In particular, applicants who hold, or expect to hold, postdoctoral appointments at other institutions are encouraged to apply to spend one semester as a Simons-Berkeley Fellow subject to the approval of the postdoctoral institution.
Further details and application instructions can be found at http://simons.berkeley.edu/fellows2014. Information about the Institute and the above programs can be found at http://simons.berkeley.edu.
Deadline for applications: 15 December, 2013.
Here’s a heartwarming story of religious reconciliation in Israel, one that puts the lie to those cynics who thought such ecumenism impossible. It seems that large portions of Jerusalem’s Orthodox Jewish, Muslim, and Christian communities have finally set aside their differences, and joined together to support a common goal: threatening the marchers in a Gay Pride parade with death.
Update (12/2): Jeremy Hsu has written a fantastic piece for IEEE Spectrum, entitled “D-Wave’s Year of Computing Dangerously.”
* * *
Update (11/13): See here for video of a fantastic talk that Matthias Troyer gave at Stanford, entitled “Quantum annealing and the D-Wave devices.” The talk includes the results of experiments on the 512-qubit machine. (Thanks to commenter jim for the pointer. I attended the talk when Matthias gave it last week at Harvard, but I don’t think that one was videotaped.)
* * *
Update (11/11): A commenter named RaulGPS has offered yet another great observation that, while forehead-slappingly obvious in retrospect, somehow hadn’t occurred to us.  Namely, Raul points out that the argument given in this post, for the hardness of Scattershot BosonSampling, can also be applied to answer open question #4 from my and Alex’s paper: namely, how hard is BosonSampling with Gaussian inputs and number-resolving detectors?  Raul points out that the latter, in general, is certainly at least as hard as Scattershot BS.  For we can embed Scattershot BS into “ordinary” BS with Gaussian inputs, by first generating a bunch of entangled 2-mode Gaussian states (which are highly attenuated, so that with high probability none of them have 2 or more photons per mode), and then applying a Haar-random unitary U to the “right halves” of these Gaussian states while doing nothing to the left halves.  Then we can measure the left halves to find out which of the input states contained a photon before we applied U.  This is precisely equivalent to Scattershot BS, except for the unimportant detail that our measurement of the “herald” photons has been deferred till the end of the experiment instead of happening at the beginning.  And therefore, since (as I explain in the post) a fast classical algorithm for approximate Scattershot BosonSampling would let us estimate the permanents of i.i.d. Gaussian matrices in BPP^NP, we deduce that a fast classical algorithm for approximate Gaussian BosonSampling would have the same consequence.  In short, approximate Gaussian BS can be argued to be hard under precisely the same complexity assumption as can approximate ordinary BS (and approximate Scattershot BS).  Thus, in the table in Section 1.4 of our paper, the entries “Gaussian states / Adaptive, demolition” and “Gaussian states / Adaptive, nondemolition” should be “upgraded” from “Exact sampling hard” to “Apx. sampling hard?”
One other announcement: following a suggestion by commenter Rahul, I hereby invite guest posts on Shtetl-Optimized by experimentalists working on BosonSampling, offering your personal views about the prospects and difficulties of scaling up.  Send me email if you’re interested.  (Or if you don’t feel like writing a full post, of course you can also just leave a comment on this one.)
* * *
[Those impatient for a cool, obvious-in-retrospect new idea about BosonSampling, which I learned from the quantum optics group at Oxford, should scroll to the end of this post.  Those who don’t even know what BosonSampling is, let alone Scattershot BosonSampling, should start at the beginning.]
BosonSampling is a proposal by me and Alex Arkhipov for a rudimentary kind of quantum computer: one that would be based entirely on generating single photons, sending them through a network of beamsplitters and phaseshifters, and then measuring where they ended up.  BosonSampling devices are not thought to be capable of universal quantum computing, or even universal classical computing for that matter.  And while they might be a stepping-stone toward universal optical quantum computers, they themselves have a grand total of zero known practical applications.  However, even if the task performed by BosonSamplers is useless, the task is of some scientific interest, by virtue of apparently being hard!  In particular, Alex and I showed that, if a BosonSampler can be simulated exactly in polynomial time by a classical computer, then P^#P = BPP^NP, and hence the polynomial hierarchy collapses to the third level.  Even if a BosonSampler can only be approximately simulated in classical polynomial time, the polynomial hierarchy would still collapse, if a reasonable-looking conjecture in classical complexity theory is true.  For these reasons, BosonSampling might provide an experimental path to testing the Extended Church-Turing Thesis—i.e., the thesis that all natural processes can be simulated with polynomial overhead by a classical computer—that’s more “direct” than building a universal quantum computer.  (As an asymptotic claim, obviously the ECT can never be decisively proved or refuted by a finite number of experiments.  However, if one could build a BosonSampler with, let’s say, 30 photons, then while it would still be feasible to verify the results with a classical computer, it would be fair to say that the BosonSampler was working “faster” than any known algorithm running on existing digital computers.)
In arguing for the hardness of BosonSampling, the crucial fact Alex and I exploited is that the amplitudes for n-photon processes are given by the permanents of nxn matrices of complex numbers, and Leslie Valiant proved in 1979 that the permanent is #P-complete (i.e., as hard as any combinatorial counting problem, and probably even “harder” than NP-complete).  To clarify, this doesn’t mean that a BosonSampler lets you calculate the permanent of a given matrix—that would be too good to be true!  (See the tagline of this blog.)  What you could do with a BosonSampler is weirder: you could sample from a probability distribution over matrices, in which matrices with large permanents are more likely to show up than matrices with small permanents.  So, what Alex and I had to do was to argue that even that sampling task is still probably intractable classically—in the sense that, if it weren’t, then there would also be unlikely classical algorithms for more “conventional” problems.
Anyway, that’s my attempt at a 2-paragraph summary of something we’ve been thinking about on and off for four years.  See here for my and Alex’s original paper on BosonSampling, here for a recent followup paper, here for PowerPoint slides, here and here for MIT News articles by Larry Hardesty, and here for my blog post about the first (very small, 3- or 4-photon) demonstrations of BosonSampling by quantum optics groups last year, with links to the four experimental papers that came out then.
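Since the permanent is doing all the work in that summary, here's a short illustrative Python sketch of Ryser's inclusion-exclusion formula (the function name and the toy example are mine, just to show what the object is; this isn't part of any BosonSampling code).  Even this formula, essentially the fastest exact method known, takes time exponential in n, consistent with Valiant's #P-completeness theorem:

```python
from itertools import combinations
import numpy as np

def permanent(A):
    """Ryser's formula for the permanent of an n-by-n matrix (complex entries
    are fine).  Time roughly 2^n * n^2 as written here: exponential, versus the
    n! of expanding the definition directly."""
    A = np.asarray(A)
    n = A.shape[0]
    total = 0.0
    for r in range(1, n + 1):
        sign = (-1) ** r
        for cols in combinations(range(n), r):
            row_sums = A[:, list(cols)].sum(axis=1)   # sum each row over the chosen columns
            total += sign * np.prod(row_sums)
    return (-1) ** n * total

# per([[a, b], [c, d]]) = a*d + b*c:
print(permanent(np.array([[1, 2], [3, 4]])))   # 10.0
```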
In general, we’ve been thrilled by the enthusiastic reaction to BosonSampling by quantum optics people—especially given that the idea started out as pure complexity theory, with the connection to optics coming as an “unexpected bonus.”  But not surprisingly, BosonSampling has also come in for its share of criticism: e.g., that it’s impractical, unscalable, trivial, useless, oversold, impossible to verify, and probably some other things.  A few people have even claimed that, in expressing support and cautious optimism about the recent BosonSampling experiments, I’m guilty of the same sort of quantum computing hype that I complain about in others.  (I’ll let you be the judge of that.  Reread the paragraphs above, or anything else I’ve ever written about this topic, and then compare to, let’s say, this video.)
By far the most important criticism of BosonSampling—one that Alex and I have openly acknowledged and worried a lot about almost from the beginning—concerns the proposal’s scalability.  The basic problem is this: in BosonSampling, your goal is to measure a pattern of quantum interference among n identical, non-interacting photons, where n is as large as possible.  (The special case n=2 is called the Hong-Ou-Mandel dip; conversely, BosonSampling can be seen as just “Hong-Ou-Mandel on steroids.”)  The bigger n gets, the harder the experiment ought to be to simulate using a classical computer (with the difficulty increasing at least like ~2^n).  The trouble is that, to detect interference among n photons, the various quantum-mechanical paths that your photons could take, from the sources, through the beamsplitter network, and finally to the detectors, have to get them there at exactly the same time—or at any rate, close enough to “the same time” that the wavepackets overlap.  Yet, while that ought to be possible in theory, the photon sources that actually exist today, and that will exist for the foreseeable future, just don’t seem good enough to make it happen, for anything more than a few photons.
The reason—well-known for decades as a bane to quantum information experiments—is that there’s no known process in nature that can serve as a deterministic single-photon source.  What you get from an attenuated laser is what’s called a coherent state: a particular kind of superposition of 0 photons, 1 photon, 2 photons, 3 photons, etc., rather than just 1 photon with certainty (the latter is called a Fock state).  Alas, coherent states behave essentially like classical light, which makes them pretty much useless for BosonSampling, and for many other quantum information tasks besides.  For that reason, a large fraction of modern quantum optics research relies on a process called Spontaneous Parametric Down-Conversion (SPDC).  In SPDC, a laser (called the “pump”) is used to stimulate a crystal to produce further photons.  The process is inefficient: most of the time, no photon comes out.  But crucially, any time a photon does come out, its arrival is “heralded” by a partner photon flying out in the opposite direction.  Once in a while, 2 photons come out simultaneously, in which case they’re heralded by 2 partner photons—and even more rarely, 3 photons come out, heralded by 3 partner photons, and so on.  Furthermore, there exists something called a number-resolving detector, which can tell you (today, sometimes, with as good as ~95% reliability) when one or more partner photons have arrived, and how many of them there are.  The result is that SPDC lets us build what’s called a nondeterministic single-photon source.  I.e., you can’t control exactly when a photon comes out—that’s random—but eventually one (and only one) photon will come out, and when that happens, you’ll know it happened, without even having to measure and destroy the precious photon.  The reason you’ll know is that the partner photon heralds its presence.
Alas, while SPDC sources have enabled demonstrations of a large number of cool quantum effects, there’s a fundamental problem with using them for BosonSampling.  The problem comes from the requirement that n—the number of single photons fired off simultaneously into your beamsplitter network—should be big (say, 20 or 30).  Suppose that, in a given instant, the probability that your SPDC source succeeds in generating a photon is p.  Then what’s the probability that two SPDC sources will both succeed in generating a photon at that instant?  p^2.  And the probability that three sources will succeed is p^3, etc.  In general, with n sources, the probability that they’ll succeed simultaneously falls off exponentially with n, and the amount of time you’ll need to sit in the lab waiting for the lucky event increases exponentially with n.  Sure, when it finally does happen, it will be “heralded.”  But if you need to wait exponential time for it to happen, then there would seem to be no advantage over classical computation.  This is the reason why so far, BosonSampling has only been demonstrated with 3-4 photons.
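Just to put invented numbers on that exponential falloff, here's a back-of-the-envelope Python snippet (the 1% per-pulse success probability and the million-pulses-per-second repetition rate are made up for illustration, not taken from any actual experiment):

```python
p = 0.01                  # invented per-pulse success probability of one SPDC source
pulses_per_second = 1e6   # invented repetition rate

for n in [2, 4, 10, 20, 30]:
    p_all = p ** n                                # probability all n sources fire at once
    wait_seconds = 1.0 / (p_all * pulses_per_second)
    print(f"n={n:2d}: expected wait ~{wait_seconds:.2g} s (~{wait_seconds / 3.15e7:.2g} years)")
```

At these made-up parameters, waiting for a 4-photon coincidence takes a couple of minutes, while waiting for a 10-photon coincidence already takes millions of years: that, in a nutshell, is why the demonstrations have stopped at 3-4 photons.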
At least three solutions to the scaling problem suggest themselves, but each one has problems of its own.  The first solution is simply to use general methods for quantum fault-tolerance: it’s not hard to see that, if you had a fault-tolerant universal quantum computer, then you could simulate BosonSampling with as many photons as you wanted.  The trouble is that this requires a fault-tolerant universal quantum computer!  And if you had that, then you’d probably just skip BosonSampling and use Shor’s algorithm to factor some 10,000-digit numbers.  The second solution is to invent some specialized fault-tolerance method that would apply directly to quantum optics.  Unfortunately, we don’t know how to do that.  The third solution—until recently, the one that interested me and Alex the most—would be to argue that, even if your sources are so cruddy that you have no idea which ones generated a photon and which didn’t in any particular run, the BosonSampling distribution is still intractable to simulate classically.  After all, the great advantage of BosonSampling is that, unlike with (say) factoring or quantum simulation, we don’t actually care which problem we’re solving!  All we care about is that we’re doing something that we can argue is hard for classical computers.  And we have enormous leeway to change what that “something” is, to match the capabilities of current technology.  Alas, yet again, we don’t know how to argue that BosonSampling is hard to simulate approximately in the presence of realistic amounts of noise—at best, we can argue that it’s hard to simulate approximately in the presence of tiny amounts of noise, and hard to simulate super-accurately in the presence of realistic noise.
When faced with these problems, until recently, all we could do was
1. shrug our shoulders,
2. point out that none of the difficulties added up to a principled argument that scalable BosonSampling was not possible,
3. stress, again, that all we were asking for was to scale to 20 or 30 photons, not 100 or 1000 photons, and
4. express hope that technologies for single-photon generation currently on the drawing board—most notably, something called “optical multiplexing”—could be used to get up to the 20 or 30 photons we wanted.
Well, I’m pleased to announce, with this post, that there’s now a better idea for how to scale BosonSampling to interesting numbers of photons.  The idea, which I’ve taken to calling Scattershot BosonSampling, is not mine or Alex’s.  I learned of it from Ian Walmsley’s group at Oxford, where it’s been championed in particular by Steve Kolthammer.  (Update: A commenter has pointed me to a preprint by Lund, Rahimi-Keshari, and Ralph from May of this year, which I hadn’t seen before, and which contains substantially the same idea, albeit with an unsatisfactory argument for computational hardness.  In any case, as you’ll see, it’s not surprising that this idea would’ve occurred to multiple groups of experimentalists independently; what’s surprising is that we didn’t think of it!)  The minute I heard about Scattershot BS, I kicked myself for failing to think of it, and for getting sidetracked by much more complicated ideas.  Steve and others are working on a paper about Scattershot BS, but in the meantime, Steve has generously given me permission to share the idea on this blog.  I suggested a blog post for two reasons: first, as you’ll see, this idea really is “blog-sized.”  Once you make the observation, there’s barely any theoretical analysis that needs to be done!  And second, I was impatient to get out to the “experimental BosonSampling community”—not to mention to the critics!—that there’s now a better way to BosonSample, and one that’s incredibly simple to boot.
OK, so what is the idea?  Well, recall from above what an SPDC source does: it produces a photon with only a small probability, but whenever it does, it “heralds” the event with a second photon.  So, let’s imagine that you have an array of 200 SPDC sources.  And imagine that, these sources being unpredictable, only (say) 10 of them, on average, produce a photon at any given time.  Then what can you do?  Simple: just define those 10 sources to be the inputs to your experiment!  Or to say it more carefully: instead of sampling only from a probability distribution over output configurations of your n photons, now you’ll sample from a joint distribution over inputs and outputs: one where the input is uniformly random, and the output depends on the input (and also, of course, on the beamsplitter network).  So, this idea could also be called “Double BosonSampling”: now, not only do you not control which output will be observed (but only the probability distribution over outputs), you don’t control which input either—yet this lack of control is not a problem!  There are two key reasons why it isn’t:
1. As I said before, SPDC sources have the crucial property that they herald a photon when they produce one.  So, even though you can’t control which 10 or so of your 200 SPDC sources will produce a photon in any given run, you know which 10 they were.
2. In my and Alex’s original paper, the “hardest” case of BosonSampling that we were able to find—the case we used for our hardness reductions—is simply the one where the mxn “scattering matrix,” which describes the map between the n input modes and the m>>n output modes, is a Haar-random matrix whose columns are orthonormal vectors.  But now suppose we have m input modes and m output modes, and the mxm unitary matrix U mapping inputs to outputs is Haar-random.  Then any mxn submatrix of U will simply be an instance of the “original” hard case that Alex and I studied!
More formally, what can we  say about the computational complexity of Scattershot BS?  Admittedly, I don’t know of a reduction from ordinary BS to Scattershot BS (though it’s easy to give a reduction in the other direction).  However, under exactly the same assumption that Alex and I used to argue that ordinary BosonSampling was hard—our so-called Permanent of Gaussians Conjecture (PGC)—one can show that Scattershot BS is hard also, and by essentially the same proof.  The only difference is that, instead of talking about the permanents of nxn submatrices of an mxn Haar-random, column-orthonormal matrix, now we talk about the permanents of nxn submatrices of an mxm Haar-random unitary matrix.  Or to put it differently: where before we fixed the columns that defined our nxn submatrix and only varied the rows, now we vary both the rows and the columns.  But the resulting nxn submatrix is still close in variation distance to a matrix of i.i.d. Gaussians, for exactly the same reasons it was before.  And we can still check whether submatrices with large permanents are more likely to be sampled than submatrices with small permanents, in the way predicted by quantum mechanics.
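For readers who'd like to poke at that claim numerically rather than take it on faith, here's a rough Python sketch (the sizes m=400 and n=4 are arbitrary, and everything here is illustrative): it draws a Haar-random m-by-m unitary by QR-decomposing a complex Gaussian matrix and fixing the phases, pulls out a random nxn submatrix, and rescales it by sqrt(m).  In the regime m>>n^2, the rescaled entries should look approximately like i.i.d. complex Gaussians, with mean 0 and mean squared absolute value 1:

```python
import numpy as np

rng = np.random.default_rng(42)

def haar_unitary(m):
    """Haar-random m-by-m unitary: QR of a complex Ginibre matrix, with the
    standard phase correction taken from the diagonal of R."""
    Z = (rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diagonal(R)
    return Q * (d / np.abs(d))

m, n = 400, 4                       # arbitrary illustrative sizes, with m >> n^2
U = haar_unitary(m)
rows = rng.choice(m, size=n, replace=False)
cols = rng.choice(m, size=n, replace=False)
sub = np.sqrt(m) * U[np.ix_(rows, cols)]   # rescaled random n-by-n submatrix

print(np.mean(sub))                 # should be roughly 0
print(np.mean(np.abs(sub) ** 2))    # should be roughly 1
```

A 4-by-4 sample is of course noisy; the point is only that the submatrix statistics match the "original" hard case, so no new hardness analysis is needed.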
Now, everything above assumed that each SPDC source produces either 0 or 1 photon.  But what happens when the SPDC sources produce 2 or more photons, as they sometimes do?  It turns out that there are two good ways to deal with these “higher-order terms” in the context of Scattershot BS.  The first way is by using number-resolving detectors to count how many herald photons each SPDC source produces.  That way, at least you’ll know exactly which sources produced extra photons, and how many extra photons each one produced.  And, as is often the case in BosonSampling, a devil you know is a devil you can deal with.  In particular, a few known sources producing extra photons, just means that the amplitudes of the output configurations will now be permanents of matrices with a few repeated rows in them.  But the permanent of an otherwise-random matrix with a few repeated rows should still be hard to compute!  Granted, we don’t know how to derive that as a consequence of our original hardness assumption, but this seems like a case where one is perfectly justified to stick one’s neck out and make a new assumption.
But there’s also a more elegant way to deal with higher-order terms.  Namely, suppose m>>n^2 (i.e., the number of input modes is at least quadratically greater than the average number of photons).  That’s an assumption that Alex and I typically made anyway in our original BosonSampling paper, because of our desire to avoid what we called the “Bosonic Birthday Paradox” (i.e., the situation where two or more photons congregate in the same output mode).  What’s wonderful is that exactly the same assumption also implies that, in Scattershot BS, two or more photons will almost never be found in the same input mode!  That is, when you do the calculation, you find that, once you’ve attenuated your SPDC sources enough to avoid the Bosonic Birthday Paradox at the output modes, you’ve also attenuated them enough to avoid higher-order terms at the input modes.  Cool, huh?
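Here's the back-of-the-envelope version of that calculation, as a toy Python snippet with invented numbers (a rough sketch of the heuristic only, not the careful argument): with m sources each firing with probability p of about n/m, the expected number of output-mode collisions scales like n^2/(2m), while the expected number of sources emitting two or more photons scales like roughly m times p^2, which is about n^2/m, so the single condition m>>n^2 keeps both nuisances rare.

```python
# Invented illustrative numbers: about n photons on average, spread over m modes/sources.
n, m = 10, 1000
p = n / m                              # per-source firing probability giving ~n photons
collisions = n * (n - 1) / (2 * m)     # heuristic expected # of output-mode collisions
double_firings = m * p ** 2            # heuristic expected # of sources firing twice
print(collisions, double_firings)      # 0.045 and 0.1: both ~n^2/m, both small when m >> n^2
```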
Are there any drawbacks to Scattershot BS?  Well, Scattershot BS certainly requires more SPDC sources than ordinary BosonSampling does, for the same average number of photons.  A little less obviously, Scattershot BS also requires a larger-depth beamsplitter network.  In our original paper, Alex and I showed that for ordinary BosonSampling, it suffices to use a beamsplitter network of depth O(n log m), where n is the number of photons and m is the number of output modes (or equivalently detectors).  However, our construction took advantage of the fact that we knew exactly which n<<m sources the photons were going to come from, and could therefore optimize for those.  For Scattershot BS, the depth bound increases to O(m log m): since the n photons could come from any possible subset of the m input modes, we no longer get the savings based on knowing where they originate.  But this seems like a relatively minor issue.
I don’t want to give the impression that Scattershot BS is a silver bullet that will immediately let us BosonSample with 30 photons.  The most obvious limiting factor that remains is the efficiency of the photon detectors—both those used to detect the photons that have passed through the beamsplitter network, and those used to detect the herald photons.  Because of detector inefficiencies, I’m told that, without further technological improvements (or theoretical ideas), it will still be quite hard to push Scattershot BS beyond about 10 photons.  Still, as you might have noticed, 10 is greater than 4 (the current record)!  And certainly, Scattershot BS itself—a simple, obvious-in-retrospect idea that was under our noses for years, and that immediately pushes forward the number of photons a BosonSampler can handle—should make us exceedingly reluctant to declare there can’t be any more such ideas, and that our current ignorance amounts to a proof of impossibility.
From sayat-travel.kz:
> ALMATY, Kazakhstan – Sayat Tour, a leading Kazakh tour operator, announced today several new tours for Americans and others who are willing to travel to Kazakhstan and see for themselves what the real country, not the Borat’s version, is really like.
>
> The tours, called “Kazakhstan vs. Boratistan” and “Jagzhemash!!! See the Real Kazakhstan”, include visits to the cosmopolitan Almaty and its beautiful surroundings, tours of ancient sites such as the Hodja Akhmed Yassaui Mausoleum in Turkestan, as well as plentiful opportunities to meet and interact with the real Kazakhs. In addition to sightseeing, tours also include visits to local colorful bazaars, artifact shops and high fashion boutiques, as well as trying kumyss, the deliciously tasting Kazakh traditional drink made from fermented horse milk.
>
> Marianna Tolekenova, Sayat’s Executive Director, said: “With the release of Borat: Cultural Learnings of America for Make Benefit Glorious Nation of Kazakhstan, we are hoping many Americans will want to engage in ‘cultural learnings’ of that unknown ‘glorious nation’ for their own ‘make benefit.’ That is why we are launching these new tours and hoping the Americans will come visit us.”
>
> Earlier in October 2006, a high ranking Kazakh official said the creator of Borat, British comedian Sasha Baron Cohen, would be welcome in Kazakhstan. First Deputy Foreign Minister Rakhat Aliyev said, “His trip could yield a lot of discoveries — that women not only travel inside buses but also drive their own cars, that we make wine from grapes, that Jews can freely attend synagogues and so on.”
Update (11/13): In response to a comment by Greg Kuperberg, I’ve now reached a halakhic ruling on the morality of Sacha Baron Cohen’s antics. Go to the comments section if you want to read it.
In today’s quant-ph we find a report of a truly dramatic experiment — one that detected entanglement between Baton Rouge, Louisiana and Givarlais, France. How, you ask: by fiber-optic cable? Satellite? Neither: by postal mail! The authors don’t say if it was FedEx, UPS, or some other carrier that managed to ship half an EPR pair across the Atlantic without decohering it — but whoever it was, that’s who I’m using from now on.
(Note: On close reading, it appears that when the authors use the word “entanglement,” they actually mean “classical correlation.” However, this is a technical distinction that should only matter for experts.)
As the world marked the 50th anniversary of the JFK assassination, I have to confess … no, no, not that I was in on the plot.  I wasn’t even born then, silly.  I have to confess that, in between struggling to make a paper deadline, attending a workshop in Princeton, celebrating Thanksgivukkah, teaching Lily how to pat her head and clap her hands, and not blogging, I also started dipping, for the first time in my life, into a tiny fraction of the vast literature about the JFK assassination.  The trigger (so to speak) for me was this article by David Talbot, the founder of Salon.com.  I figured, if the founder of Salon is a JFK conspiracy buff—if, for crying out loud, my skeptical heroes Bertrand Russell and Carl Sagan were both JFK conspiracy buffs—then maybe it’s at least worth familiarizing myself with the basic facts and arguments.
So, what happened when I did?  Were the scales peeled from my eyes?
In a sense, yes, they were.  Given how much has been written about this subject, and how many intelligent people take seriously the possibility of a conspiracy, I was shocked by how compelling I found the evidence to be that there were exactly three shots, all fired by Lee Harvey Oswald with a Carcano rifle from the sixth floor of the Texas School Book Depository, just as the Warren Commission said in 1964.  And as for Oswald’s motives, I think I understand them as well and as poorly as I understand the motives of the people who send me ramblings every week about P vs. NP and the secrets of the universe.
Before I started reading, if someone forced me to guess, maybe I would’ve assigned a ~10% probability to some sort of conspiracy.  Now, though, I’d place the JFK conspiracy hypothesis firmly in Moon-landings-were-faked, Twin-Towers-collapsed-from-the-inside territory.  Or to put it differently, “Oswald as lone, crazed assassin” has been added to my large class of “sanity-complete” propositions: propositions defined by the property that if I doubt any one of them, then there’s scarcely any part of the historical record that I shouldn’t doubt.  (And while one can’t exclude the possibility that Oswald confided in someone else before the act—his wife or a friend, for example—and that other person kept it a secret for 50 years, what’s known about Oswald strongly suggests that he didn’t.)
So, what convinced me?  In this post, I’ll give twenty reasons for believing that Oswald acted alone.  Notably, my reasons will have less to do with the minutiae of bullet angles and autopsy reports, than with general principles for deciding what’s true and what isn’t.  Of course, part of the reason for this focus is that the minutiae are debated in unbelievable detail elsewhere, and I have nothing further to contribute to those debates.  But another reason is that I’m skeptical that anyone actually comes to believe the JFK conspiracy hypothesis because they don’t see how the second bullet came in at the appropriate angle to pass through JFK’s neck and shoulder and then hit Governor Connally.  Clear up some technical point (or ten or fifty of them)—as has been done over and over—and the believers will simply claim that the data you used was altered by the CIA, or they’ll switch to other “anomalies” without batting an eye.  Instead, people start with certain general beliefs about how the world works, “who’s really in charge,” what sorts of explanations to look for, etc., and then use their general beliefs to decide which claims to accept about JFK’s head wounds or the foliage in Dealey Plaza—not vice versa.  That being so, one might as well just discuss the general beliefs from the outset.  So without further ado, here are my twenty reasons:
1\. Conspiracy theorizing represents a known bug in the human nervous system.  Given that, I think our prior should be overwhelmingly against anything that even looks like a conspiracy theory.  (This is not to say conspiracies never happen.  Of course they do: Watergate, the Tobacco Institute, and the Nazi Final Solution were three well-known examples.  But the difference between conspiracy theorists’ fantasies and actual known conspiracies is this: in a conspiracy theory, some powerful organization’s public face hides a dark and terrible secret; its true mission is the opposite of its stated one.  By contrast, in every real conspiracy I can think of, the facade was already 90% as terrible as the reality!  And the “dark secret” was that the organization was doing precisely what you’d expect it to do, if its members genuinely held the beliefs that they claimed to hold.)
2\. The shooting of Oswald by Jack Ruby created the perfect conditions for conspiracy theorizing to fester.  Conditioned on that happening, it would be astonishing if a conspiracy industry hadn’t arisen, with its hundreds of books and labyrinthine arguments, even under the assumption that Oswald and Ruby both really acted alone.
3\. Other high-profile assassinations to which we might compare this one—for example, those of Lincoln, Garfield, McKinley, RFK, Martin Luther King Jr., Gandhi, Yitzchak Rabin…—appear to have been the work of “lone nuts,” or at most “conspiracies” of small numbers of lowlifes.  So why not this one?
4\. Oswald seems to have perfectly fit the profile of a psychopathic killer (see, for example, Case Closed by Gerald Posner).  From very early in his life, Oswald exhibited grandiosity, resentment, lack of remorse, doctrinaire ideological fixations, and obsession with how he’d be remembered by history.
5\. A half-century of investigation has failed to link any individual besides Oswald to the crime.  Conspiracy theorists love to throw around large, complicated entities like the CIA or the Mafia as potential “conspirators”—but in the rare cases when they’ve tried to go further, and implicate an actual human being other than Oswald or Ruby (or distant power figures like LBJ), the results have been pathetic and tragic.
6\. Oswald had previously tried to assassinate General Walker—a fact that was confirmed by his widow Marina Oswald, but that, incredibly, is barely even discussed in the reams of conspiracy literature.
7\. There’s clear evidence that Oswald murdered Officer Tippit an hour after shooting JFK—a fact that seems perfectly consistent with the state of mind of someone who’d just murdered the President, but that, again, seems to get remarkably little discussion in the conspiracy literature.
8\. Besides being a violent nut, Oswald was also a known pathological liar.  He lied on his employment applications, he lied about having established a thriving New Orleans branch of Fair Play for Cuba, he lied and lied and lied.  Because of this tendency—as well as his persecution complex—Oswald’s loud protestations after his arrest that he was just a “patsy” count for almost nothing.
9\. According to police accounts, Oswald acted snide and proud of himself after being taken into custody: for example, when asked whether he had killed the President, he replied “you find out for yourself.”  He certainly didn’t act like an innocent “patsy” arrested on such a grave charge would plausibly act.
10\. Almost all JFK conspiracy theories must be false, simply because they’re mutually inconsistent.  Once you realize that, and start judging the competing conspiracy theories by the standards you’d have to judge them by if at most one could be true, enlightenment may dawn as you find there’s nothing in the way of just rejecting all of them.  (Of course, some people have gone through an analogous process with religions.)
11\. The case for Oswald as lone assassin seems to become stronger, the more you focus on the physical evidence and stuff that happened right around the time and place of the event.  To an astonishing degree, the case for a conspiracy seems to rely on verbal testimony years or decades afterward—often by people who are known confabulators, who were nowhere near Dealey Plaza at the time, who have financial or revenge reasons to invent stories, and who “remembered” seeing Oswald and Ruby with CIA agents, etc. only under drugs or hypnosis.  This is precisely the pattern we would expect if conspiracy theorizing reflected the reality of the human nervous system rather than the reality of the assassination.
12\. If the conspiracy is so powerful, why didn’t it do something more impressive than just assassinate JFK? Why didn’t it rig the election to prevent JFK from becoming President in the first place?  (In math, very often the way you discover a bug in your argument is by realizing that the argument gives you more than you originally intended—vastly, implausibly more.  Yet every pro-conspiracy argument I’ve read seems to suffer from the same problem.  For example, after successfully killing JFK, did the conspiracy simply disband?  Or did it go on to mastermind other assassinations?  If it didn’t, why not?  Isn’t pulling the puppet-strings of the world sort of an ongoing proposition?  What, if any, are the limits to this conspiracy’s power?)
13\. Pretty much all the conspiracy writers I encountered exude total, 100% confidence, not only in the existence of additional shooters, but in the guilt of their favored villains (they might profess ignorance, but then in the very next sentence they’d talk about how JFK’s murder was “a triumph for the national security establishment”).  For me, their confidence had the effect of weakening my own confidence in their intellectual honesty, and in any aspects of their arguments that I had to take on faith.  The conspiracy camp would of course reply that the “Oswald acted alone” camp also exudes too much confidence in its position.  But the two cases are not symmetric: for one thing, because there are so many different conspiracy theories, but only one Oswald.  If I were a conspiracy believer I’d be racked with doubts, if nothing else then about whether my conspiracy was the right one.
14\. Every conspiracy theory I’ve encountered seems to require “uncontrolled growth” in size and complexity: that is, the numbers of additional shooters, alterations of medical records, murders of inconvenient witnesses, coverups, coverups of the coverups, etc. that need to be postulated all seem to multiply without bound.  To some conspiracy believers, this uncontrolled growth might actually be a feature: the more nefarious and far-reaching the conspiracy’s tentacles, the better.  It should go without saying that I regard it as a bug.
15\. JFK was not a liberal Messiah.  He moved slowly on civil rights for fear of a conservative backlash, invested heavily in building nukes, signed off on the botched plans to kill Fidel Castro, and helped lay the groundwork for the US’s later involvement in Vietnam.  Yes, it’s possible that he would’ve made wiser decisions about Vietnam than LBJ ended up making; that’s part of what makes his assassination (like RFK’s later assassination) a tragedy.  But many conspiracy theorists’ view of JFK as an implacable enemy of the military-industrial complex is preposterous.
16\. By the same token, LBJ was not exactly a right-wing conspirator’s dream candidate.  He was, if anything, more aggressive on poverty and civil rights than JFK was.  And even if he did end up being better for certain military contractors, that’s not something that would’ve been easy to predict in 1963, when the US’s involvement in Vietnam had barely started.
17\. Lots of politically-powerful figures have gone on the record as believers in a conspiracy, including John Kerry, numerous members of Congress, and even frequently-accused conspirator LBJ himself.  Some people would say that this lends credibility to the conspiracy cause.  To me, however, it indicates just the opposite: that there’s no secret cabal running the world, and that those in power are just as prone to bugs in the human nervous system as anyone else is.
18\. As far as I can tell, the conspiracy theorists are absolutely correct that JFK’s security in Dallas was unbelievably poor; that the Warren Commission was as interested in reassuring the nation and preventing a war with the USSR or Cuba as it was in reaching the truth (the fact that it did reach the truth is almost incidental); and that agencies like the CIA and FBI kept records related to the assassination classified for way longer than there was any legitimate reason to (though note that most records finally were declassified in the 1990s, and they provided zero evidence for any conspiracy).  As you might guess, I ascribe all of these things to bureaucratic incompetence rather than to conspiratorial ultra-competence.  But once again, these government screwups help us understand how so many intelligent people could come to believe in a conspiracy even in the total absence of one.
19\. In the context of the time, the belief that JFK was killed by a conspiracy filled a particular need: namely, the need to believe that the confusing, turbulent events of the 1960s had an understandable guiding motive behind them, and that a great man like JFK could only be brought down by an equally-great evil, rather than by a chronically-unemployed loser who happened to see on a map that JFK’s motorcade would be passing by his workplace.  Ironically, I think that Roger Ebert got it exactly right when he praised Oliver Stone’s JFK movie for its “emotional truth.”  In much the same way, one could say that Birth of a Nation was “emotionally true” for Southern racists, or that Ben Stein’s Expelled was “emotionally true” for creationists.  Again, I’d say that the “emotional truth” of the conspiracy hypothesis is further evidence for its factual falsehood: for it explains how so many people could come to believe in a conspiracy even if the evidence for one were dirt-poor.
20\. At its core, every conspiracy argument seems to be built out of “holes”: “the details that don’t add up in the official account,” “the questions that haven’t been answered,” etc.  What I’ve never found is a truly coherent alternative scenario: just one “hole” after another.  This pattern is the single most important red flag for me, because it suggests that the JFK conspiracy theorists view themselves as basically defense attorneys: people who only need to sow enough doubts, rather than establish the reality of what happened.  Crucially, creationism, 9/11 trutherism, and every other elaborate-yet-totally-wrong intellectual edifice I’ve ever encountered has operated on precisely the same “defense attorney principle”: “if we can just raise enough doubts about the other side’s case, we win!”  But that’s a terrible approach to knowledge, once you’ve seen firsthand how a skilled arguer can raise unlimited doubts even about the nonexistence of a monster under your bed.  Such arguers are hoping, of course, that you’ll find their monster hypothesis so much more fun, exciting, and ironically comforting than the “random sounds in the night hypothesis,” that it won’t even occur to you to demand they show you their monster.
Further reading: this article in Slate.
Two weeks ago, I argued that scientific papers are basically a waste of time. Today I’d like to generalize the results of that earlier post, by explaining why scientific talks are also a waste of time.
Let me set the scene for you. You arrive at the weekly colloquium eager to learn, like a cargo cult member who’s sure that this time the planes are going to land. But then, about fifteen minutes after the PowerPoint train has left the station, you start to get nervous: “Why are we stopping at all these unfamiliar little hamlets? Are we really headed for the place mentioned in the abstract?” You glance at your fellow passengers: are they as confused as you are? (You’d ask the guy sitting next to you, but he’s sound asleep.) Eventually the announcer comes on and … uh-oh! It seems the train is about to begin its long ascent up Mount Boredom, and you don’t have the prerequisites for this leg of the trip. Can you dodge the ticket collector? Too stressful! You get off, and the train roars past you, never to return.
Such was my experience again and again until three years ago, when I finally gave up on talks as a medium for scientific communication. These days, whenever I have to sit through one, I treat the speaker’s words as background music for my private fantasies and daydreams, unless the speaker chooses to interrupt with a novel idea.
But what about when I have to talk? To be honest, I haven’t intentionally perpetrated a research talk in years. Instead I do a stand-up comedy routine where you have to be a quantum computing expert to get the jokes. It’s like Seinfeld, except not that funny. So why does it work? Simple: because the crowd that expects to be bored is the easiest crowd on Earth.
Now one could argue that, by stuffing my talks with flying pigs and slide-eating black holes, I’ve been setting back the cause of scientific knowledge. But I don’t think so. See, the basic problem with talks is that they have no anti-boredom escape hatch. I mean, if you were chatting with a colleague who droned on for too long, you’d have several options:
* Change the subject.
* Say something like “yeah, I get it, but does this actually lead to a new lower bound?”
* Tap your fingers, study the wall patterns, etc.
* If all else fails, mention your immense workload, then excuse yourself and go back to reading weblogs.
The key point is that none of these tactics are inherently rude or insulting. All of us use them regularly; if we didn’t, it’d be impossible to tell when we were boring each other. Put differently, these tactics are part of the feedback and dialogue that’s essential to any healthy relationship:
> “Was it good for you?”
“Could you maybe go a little faster?”
“Do you like it when I use this notation?”
The seminar speaker, by contrast, is a narcissist who verbally ravages his defenseless audience. Sure, it’s fine to interrupt with things like “Aren’t you missing an absolute value sign?,” or “How do you know A is Hermitian?” But have you ever raised your hand to say, “Excuse me, but would you mind skipping the next 20 slides and getting right to the meat?” Or: “This is boring. Would you please talk about a different result?”
(Incidentally, as my adviser Umesh Vazirani pointed out to me, when people get “lost” during a talk they think it means that the speaker is going too fast. But more often, the real problem is that the speaker is going too slow, and thereby letting the audience get mired in trivialities.)
So what’s the solution? (You knew there was going to be one, didn’t you?) My solution is to replace talks by “conversations” whenever possible. Here’s how the Aaronson system works: you get five minutes to tell your audience something unexpected. (Usually this will involve no slides, just a board.) Then, if people have questions, you answer them; if they want details, you provide them. At any time, anyone who’s no longer interested can get up and leave (and maybe come back later), without being considered a jerk. When there are no further questions, you sit down and give someone else a chance to surprise the audience.
If you don’t think this system would work, come visit our quantum algorithms lunch at Waterloo, Tuesdays at 11:30 in the BFG seminar room. Bring a result or open problem.
So, I finally had it both with Blogger, which was constantly down, and with my web hosting service, which was constantly down and inserting hidden Cialis ads into my homepage. (Yes, really.) So I ditched them both!
This morning Shtetl-Optimized finally departed the old country, and boarded a crowded ship bound for a strange new world: the world of Bluehost and WordPress. So welcome to a brand-new blog, which will feature the same name as the old one, the same topics, and the same terrible jokes. I hope you like it.
(Also this morning, I discovered a little hole-in-the-wall in Waterloo that sells hot, fresh bagels barely distinguishable from what you could get in New York. Yes, this is shaping up to be a very good day.)
(Oh, yes: Happy belated Thanksgiving to my American friends. I decided to stay in Waterloo over Thanksgiving to teach my course — is this a sign that I’m actually becoming Canadian?)
If you’re the sort of person who reads this blog, you may have heard that 23andMe—the company that (until recently) let anyone spit into a capsule, send it away to a DNA lab, and then learn basic information about their ancestry, disease risks, etc.—has suspended much of its service, on orders from the US Food and Drug Administration.  As I understand it, on Nov. 25, the FDA ordered 23andMe to stop marketing to new customers (though it can still serve existing customers), and on Dec. 5, the company stopped offering new health-related information to any customers (though you can still access the health information you had before, and ancestry and other non-health information is unaffected).
Of course, the impact of these developments is broader: within a couple weeks, “do-it-yourself genomics” has gone from an industry whose explosive growth lots of commentators took as a given, to one whose future looks severely in doubt (at least in the US).
The FDA gave the reasons for its order in a letter to Anne Wojcicki, 23andMe’s CEO.  Excerpts:
For instance, if the BRCA-related risk assessment for breast or ovarian cancer reports a false positive, it could lead a patient to undergo prophylactic surgery, chemoprevention, intensive screening, or other morbidity-inducing actions, while a false negative could result in a failure to recognize an actual risk that may exist.  Assessments for drug responses carry the risks that patients relying on such tests may begin to self-manage their treatments through dose changes or even abandon certain therapies depending on the outcome of the assessment.  For example, false genotype results for your warfarin drug response test could have significant unreasonable risk of illness, injury, or death to the patient due to thrombosis or bleeding events that occur from treatment with a drug at a dose that does not provide the appropriately calibrated anticoagulant effect …  The risk of serious injury or death is known to be high when patients are either non-compliant or not properly dosed; combined with the risk that a direct-to-consumer test result may be used by a patient to self-manage, serious concerns are raised if test results are not adequately understood by patients or if incorrect test results are reported.
To clarify, the DNA labs that 23andMe uses are already government-regulated.  Thus, the question at issue here is not whether, if 23andMe claims (say) that you have CG instead of CC at some particular locus, the information is reliable.  Rather, the question is whether 23andMe should be allowed to tell you that fact, while also telling you that a recent research paper found that people with CG have a 10.4% probability of developing Alzheimer’s disease, as compared to a 7.2% base rate.  More bluntly, the question is whether ordinary schmoes ought to be trusted to learn such facts about themselves, without a doctor as an intermediary to interpret the results for them, or perhaps to decide that there’s no good reason for the patient to know at all.
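For concreteness, here’s a minimal sketch in Python of what those two numbers amount to, using the illustrative Alzheimer’s figures above (not output from any actual 23andMe report):

```python
# Minimal sketch: relative vs. absolute risk, using the illustrative figures
# from the paragraph above (not real 23andMe output).
base_rate    = 0.072   # baseline probability of developing the condition
risk_with_cg = 0.104   # probability reported for carriers of the CG variant

relative_risk     = risk_with_cg / base_rate    # about 1.44x the baseline
absolute_increase = risk_with_cg - base_rate    # about 3.2 percentage points

print(f"{relative_risk:.2f}x the baseline; "
      f"an absolute increase of {absolute_increase:.1%} "
      f"(from {base_rate:.1%} to {risk_with_cg:.1%})")
```

A 1.44x relative risk sounds alarming; a move from a 7.2% to a 10.4% chance sounds rather less so. That gap in framing is exactly what the rest of this debate is about.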
Among medical experts, a common attitude seems to be something like this: sure, getting access to your own genetic data is harmless fun, as long as you’re an overeducated nerd who just wants to satisfy his or her intellectual curiosity (or perhaps narcissism).  But 23andMe crossed a crucial line when it started marketing its service to the hoi polloi, as something that could genuinely tell them about health risks.  Most people don’t understand probability, and are incapable of parsing “based on certain gene variants we found, your chances of developing diabetes are about 6 times higher than the baseline” as anything other than “you will develop diabetes.”  Nor, just as worryingly, are they able to parse “your chances are lower than the baseline” as anything other than “you won’t develop diabetes.”
I understand this argument.  Nevertheless, I find it completely inconsistent with a free society.  Moreover, I predict that in the future, the FDA’s current stance will be looked back upon as an outrage, with the subtleties in the FDA’s position mattering about as much as the subtleties in the Church’s position toward Galileo (“look, Mr. G., it’s fine to discuss heliocentrism among your fellow astronomers, as a hypothesis or a calculational tool—just don’t write books telling the general public that heliocentrism is literally true, and that they should change their worldviews as a result!”).  That’s why I signed this petition asking the FDA to reconsider its decision, and I encourage you to sign it too.
Here are some comments that might help clarify my views:
(1) I signed up for 23andMe a few years ago, as did the rest of my family.  The information I gained from it wasn’t exactly earth-shattering: I learned, for example, that my eyes are probably blue, that my ancestry is mostly Ashkenazi, that there’s a risk my eyesight will further deteriorate as I age (the same thing a succession of ophthalmologists told me), that I can’t taste the bitter flavor in brussels sprouts, and that I’m an “unlikely sprinter.”  On the other hand, seeing exactly which gene variants correlate with these things, and how they compare to the variants my parents and brother have, was … cool.  It felt like I imagine it must have felt to buy a personal computer in 1975.  In addition, I found nothing the slightest bit dishonest about the way the results were reported.  Each result was stated explicitly in terms of probabilities—giving both the baseline rate for each condition, and the rate conditioned on having such-and-such gene variant—and there were even links to the original research papers if I wanted to read them myself.  I only wish that I got half as much context and detail from conventional doctor visits—or for that matter, from most materials I’ve read from the FDA itself.  (When Dana was pregnant, I was pleasantly surprised when some of the tests she underwent came back with explicit probabilities and base rates.  I remember wishing doctors would give me that kind of information more often.)
(2) From my limited reading and experience, I think it’s entirely possible that do-it-yourself genetic testing is overhyped; that it won’t live up to its most fervent advocates’ promises; that for most interesting traits there are just too many genes involved, via too many labyrinthine pathways, to make terribly useful predictions about individuals, etc.  So it’s important to me that, in deciding whether what 23andMe does should be legal, we’re not being asked to decide any of these complicated questions!  We’re only being asked whether the FDA should get to decide the answers in advance.
(3) As regular readers will know, I’m far from a doctrinaire libertarian.  Thus, my opposition to shutting down 23andMe is not at all a corollary of reflexive opposition to any government regulation of anything.  In fact, I’d be fine if the FDA wanted to insert a warning message on 23andMe (in addition to the warnings 23andMe already provides), emphasizing that genetic tests only provide crude statistical information, that they need to be interpreted with care, consult your doctor before doing anything based on these results, etc.  But when it comes to banning access to the results, I have trouble with some of the obvious slippery slopes.  E.g., what happens when some Chinese or Russian company launches a competing service?  Do we ban Americans from mailing their saliva overseas?  What happens when individuals become able just to sequence their entire genomes, and store and analyze them on their laptops?  Do we ban the sequencing technology?  Or do we just ban software that makes it easy enough to analyze the results?  If the software is hard enough to use, so only professional biologists use it, does that make it OK again?  Also, if the FDA will be in the business of banning genomic data analysis tools, then what about medical books?  For that matter, what about any books or websites, of any kind, that might cause someone to make a poor medical decision?  What would such a policy, if applied consistently, do to the multibillion-dollar alternative medicine industry?
(4) I don’t understand the history of 23andMe’s interactions with the FDA.  From what I’ve read, though, they have been communicating for five years, with everything 23andMe has said in public sounding conciliatory rather than defiant (though the FDA has accused 23andMe of being tardy with its responses).  Apparently, the key problem is simply that the FDA hasn’t yet developed a regulatory policy specifically for direct-to-consumer genetic tests.  It’s been considering such a policy for years—but in the meantime, it believes no one should be marketing such tests for health purposes before a policy exists.  Alas, there are very few cases where I’d feel inclined to support a government in saying: “X is a new technology that lots of people are excited about.  However, our regulatory policies haven’t yet caught up to X.  Therefore, our decision is that X is banned, until and unless we figure out how to regulate it.”  Maybe I could support such a policy, if X had the potential to level cities and kill millions.  But when it comes to consumer DNA tests, this sort of preemptive banning seems purposefully designed to give wet dreams to Ayn Rand fans.
(5) I confess that, despite everything I’ve said, my moral intuitions might be different if dead bodies were piling up because of terrible 23andMe-inspired medical decisions.  But as far as I know, there’s no evidence so far that even a single person was harmed.  Which isn’t so surprising: after all, people might run to their doctor terrified about something they learned on 23andMe, but no sane doctor would ever make a decision solely on that basis, without ordering further tests.
I’m shipping out today to sunny Rio de Janeiro, where I’ll be giving a weeklong course about BosonSampling, at the invitation of Ernesto Galvão.  Then it’s on to Pennsylvania (where I’ll celebrate Christmas Eve with old family friends), Israel (where I’ll drop off Dana and Lily with Dana’s family in Tel Aviv, then lecture at the Jerusalem Winter School in Theoretical Physics), Puerto Rico (where I’ll speak at the FQXi conference on Physics of Information), back to Israel, and then New York before returning to Boston at the beginning of February.  Given this travel schedule, it’s possible that blogging will be even lighter than usual for the next month and a half (or not—we’ll see).
In the meantime, however, I’ve got the equivalent of at least five new blog posts to tide over Shtetl-Optimized fans.  Luke Muehlhauser, the Executive Director of the Machine Intelligence Research Institute (formerly the Singularity Institute for Artificial Intelligence), did an in-depth interview with me about “philosophical progress,” in which he prodded me to expand on certain comments in Why Philosophers Should Care About Computational Complexity and The Ghost in the Quantum Turing Machine.  Here are (abridged versions of) Luke’s five questions:
1\. Why are you so interested in philosophy? And what is the social value of philosophy, from your perspective?
2. What are some of your favorite examples of illuminating Q-primes [i.e., scientifically-addressable pieces of big philosophical questions] that were solved within your own field, theoretical computer science?
3. Do you wish philosophy-the-field would be reformed in certain ways? Would you like to see more crosstalk between disciplines about philosophical issues? Do you think that, as Clark Glymour suggested, philosophy departments should be defunded unless they produce work that is directly useful to other fields … ?
4. Suppose a mathematically and analytically skilled student wanted to make progress, in roughly the way you describe, on the Big Questions of philosophy. What would you recommend they study? What should they read to be inspired? What skills should they develop? Where should they go to study?
5. Which object-level thinking tactics … do you use in your own theoretical (especially philosophical) research?  Are there tactics you suspect might be helpful, which you haven’t yet used much yourself?
For the answers—or at least my answers—click here!
PS. In case you missed it before, Quantum Computing Since Democritus was chosen by Scientific American blogger Jennifer Ouellette (via the “Time Lord,” Sean Carroll) as the top physics book of 2013.  Woohoo!!
[With special thanks to the Up-Goer Five Text Editor, which was inspired by this xkcd]
I study computers that would work in a different way than any computer that we have today.  These computers would be very small, and they would use facts about the world that are not well known to us from day to day life.  No one has built one of these computers yet—at least, we don’t think they have!—but we can still reason about what they could do for us if we did build them.
How would these new computers work? Well, when you go small enough, you find that, in order to figure out what the chance is that something will happen, you need to both add and take away a whole lot of numbers—one number for each possible way that the thing could happen, in fact. What’s interesting is, this means that the different ways a thing could happen can “kill each other out,” so that the thing never happens at all! I know it sounds weird, but the world of very small things has been known to work that way for almost a hundred years.
So, with the new kind of computer, the idea is to make the different ways each wrong answer could be reached kill each other out (with some of them “pointing” in one direction, some “pointing” in another direction), while the different ways that the right answer could be reached all point in more or less the same direction. If you can get that to happen, then when you finally look at the computer, you’ll find that there’s a very good chance that you’ll see the right answer. And if you don’t see the right answer, then you can just run the computer again until you do.
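If you’d like to see that “killing out” written as a short computer program, here’s a toy in Python; the numbers are made up and don’t describe any real machine’s state:

```python
# Toy illustration only: each "way a thing could happen" contributes a number
# (possibly negative, or more generally complex); the chance of the thing is
# the squared size of the total.
wrong_answer_ways = [0.5, -0.5]   # two ways pointing in opposite directions
right_answer_ways = [0.5, 0.5]    # two ways pointing in the same direction

def chance(ways):
    return abs(sum(ways)) ** 2

print(chance(wrong_answer_ways))   # 0.0 -- the ways kill each other out
print(chance(right_answer_ways))   # 1.0 -- the ways add up
```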
For some problems—like breaking a big number into its smallest parts (say, 43259 = 181 × 239)—we’ve learned that the new computers would be much, much faster than we think any of today’s computers could ever be. For other problems, however, the new computers don’t look like they’d be faster at all. So a big part of my work is trying to figure out for which problems the new computers would be faster, and for which problems they wouldn’t be.
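And here, just to show what the new computers would be up against, is the slow way that today’s computers would check the little example above, trying one possible divisor after another (a toy sketch, not a serious factoring program):

```python
# Trial division: the brute-force way to break a number into its smallest parts.
def smallest_factor(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n   # no divisor found, so n is prime

n = 43259
p = smallest_factor(n)
q = n // p
print(p, q, p * q == n)   # 181 239 True
```

For a number this size the loop finishes instantly; for the hundreds-of-digits numbers used in real codes, this kind of approach is hopeless, which is part of what makes the new computers so interesting.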
You might wonder, why is it so hard to build these new computers? Why don’t we have them already? This part is a little hard to explain using the words I’m allowed, but let me try. It turns out that the new computers would very easily break. In fact, if the bits in such a computer were to “get out” in any way—that is, to work themselves into the air in the surrounding room, or whatever—then you could quickly lose everything about the new computer that makes it faster than today’s computers. For this reason, if you’re building the new kind of computer, you have to keep it very, very carefully away from anything that could cause it to lose its state—but then at the same time, you do have to touch the computer, to make it do the steps that will eventually give you the right answer. And no one knows how to do all of this yet. So far, people have only been able to use the new computers for very small checks, like breaking 15 into 3 × 5. But people are working very hard today on figuring out how to do bigger things with the new kind of computer.
In fact, building the new kind of computer is so hard, that some people even believe it won’t be possible! But my answer to them is simple. If it’s not possible, then that’s even more interesting to me than if it is possible! And either way, the only way I know to find out the truth is to try it and see what happens.
Sometimes, people pretend that they already built one of these computers even though they didn’t. Or they say things about what the computers could do that aren’t true. I have to admit that, even though I don’t really enjoy it, I do spend a lot of my time these days writing about why those people are wrong.
Oh, one other thing. Not long from now, it might be possible to build computers that don’t do everything that the new computers could eventually do, but that at least do some of it. Like, maybe we could use nothing but light and mirrors to answer questions that, while not important in and of themselves, are still hard to answer using today’s computers. That would at least show that we can do something that’s hard for today’s computers, and it could be a step along the way to the new computers. Anyway, that’s what a lot of my own work has been about for the past four years or so.
Besides the new kind of computers, I’m also interested in understanding what today’s computers can and can’t do. The biggest open problem about today’s computers could be put this way: if a computer can check an answer to a problem in a short time, then can a computer also find an answer in a short time? Almost all of us think that the answer is no, but no one knows how to show it. Six years ago, another guy and I figured out one of the reasons why this question is so hard to answer: that is, why the ideas that we already know don’t work.
Anyway, I have to go to dinner now. I hope you enjoyed this little piece about the kind of stuff that I work on.
Update (January 3): There’s now a long interview with me about quantum computing in the Washington Post (or at least, on their website).  The interview accompanies their lead article about quantum computing and the NSA, which also quotes me (among many others), and which reports—unsurprisingly—that the NSA is indeed interested in building scalable quantum computers but, based on the Snowden documents, appears to be quite far from that goal.
(Warning: The interview contains a large number of typos and other errors, which might have arisen from my infelicities in speaking or the poor quality of the phone connection.  Some were corrected but others remain.)
* * *
The week before last, I was in Rio de Janeiro to give a mini-course on “Complexity Theory and Quantum Optics” at the Instituto de Física of the Universidade Federal Fluminense.  Next week I’ll be giving a similar course at the Jerusalem Winter School on Quantum Information.
In the meantime, my host in Rio, Ernesto Galvão, and others were kind enough to make detailed, excellent notes for my five lectures in Rio.  You can click the link in the last sentence to get them, or here are links for the five lectures individually:
* Lecture 1: The Extended Church-Turing Thesis
* Lecture 2: Classical and Quantum Complexity Theory
* Lecture 3: Linear Optics and Exact BosonSampling
* Lecture 4: KLM, Postselection, and Approximate BosonSampling
* Lecture 5:  Scalability and Verification of BosonSampling Devices
If you have questions or comments about the lectures, leave them here (since I might not check the quantumrio blog).
One other thing: I can heartily recommend a trip to Rio to anyone interested in quantum information—or, for that matter, to anyone interested in sunshine, giant Jesus statues, or (especially) fruit juices you’ve never tasted before.  My favorite from among the latter was acerola.  Also worth a try are caja, mangaba, guarana, umbu, seriguela, amora, and fruta do conde juices—as well as caju and cacao, even though they taste almost nothing like the more commercially exportable products from the same plants (cashews and chocolate respectively).  I didn’t like cupuaçu or graviola juices.  Thanks so much to Ernesto and everyone else for inviting me (not just because of the juice).
Update (January 2): You can now watch videos of my mini-course at the Jerusalem Winter School here.
* Lecture 1: The Extended Church-Turing Thesis
* Lecture 2: Classical and Quantum Complexity Theory
* Lecture 3: Linear Optics and Exact BosonSampling
* Lecture 4: KLM, Approximate BosonSampling, and Experimental Issues
Videos of the other talks at the Jerusalem Winter School are available from the same site (just scroll through them on the right).
Update (Jan. 23): Daniel Lidar, one of the authors of the “Defining and detecting…” paper, was kind enough to email me his reactions to this post.  While he thought the post was generally a “very nice summary” of their paper, he pointed out one important oversight in my discussion.  Ironically, this oversight arose from my desire to bend over backwards to be generous to D-Wave!  Specifically, I claimed that there were maybe ~10% of randomly-chosen 512-qubit problem instances on which the D-Wave Two slightly outperformed the simulated annealing solver (compared to ~75% where simulated annealing outperformed the D-Wave Two), while also listing several reasons (such as the minimum annealing time, and the lack of any characterization of the “good” instances) why that “speedup” is likely to be entirely an artifact.  I obtained the ~10% and ~75% figures by eyeballing Figure 7 in the paper, and looking at which quantiles were just above and just below the 100 line when N=512.
However, I neglected to mention that even the slight “speedup” on ~10% of instances, only appears when one looks at the “quantiles of ratio”: in other words, when one plots the probability distribution of [Simulated annealing time / D-Wave time] over all instances, and then looks at (say) the ~10% of the distribution that’s best for the D-Wave machine.  The slight speedup disappears when one looks at the “ratio of quantiles”: that is, when one (say) divides the amount of time that simulated annealing needs to solve its best 10% of instances, by the amount of time that the D-Wave machine needs to solve its best 10%.  And Rønnow et al. give arguments in their paper that ratio of quantiles is probably the more relevant performance comparison than quantiles of ratio.  (Incidentally, the slight speedup on a few instances also only appears for certain values of the parameter r, which controls how many possible settings there are for each coupling.  Apparently it appears for r=1, but disappears for r=3 and r=7—thereby heightening one’s suspicion that we’re dealing with an artifact of the minimum annealing time or something like that, rather than a genuine speedup.)
There’s one other important point in the paper that I didn’t mention: namely, all the ratios of simulated annealing time to D-Wave time are normalized by 512/N, where N is the number of spins in the instance being tested.  In this way, one eliminates the advantages of the D-Wave machine that come purely from its parallelism (which has nothing whatsoever to do with “quantumness,” and which could easily skew things in D-Wave’s favor if not controlled for), while still not penalizing the D-Wave machine in absolute terms.
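To make the quantiles-of-ratio versus ratio-of-quantiles distinction concrete, here’s a small Python sketch with synthetic timings (nothing from the Rønnow et al. data), showing how the first comparison can manufacture an apparent “speedup” on a solver’s favorite instances even when the second shows none:

```python
# Synthetic illustration of "quantiles of ratio" vs. "ratio of quantiles".
# Timings are made up; the point is only how the two summaries can disagree.
import numpy as np

rng = np.random.default_rng(0)
n_instances = 1000

# Per-instance solve times for two solvers: correlated, but solver B is
# slightly slower on average.
log_hardness = rng.normal(0.0, 1.0, n_instances)
t_solver_a = np.exp(log_hardness + rng.normal(0.0, 0.5, n_instances))  # stand-in for simulated annealing
t_solver_b = np.exp(log_hardness + rng.normal(0.2, 0.5, n_instances))  # stand-in for the D-Wave machine

q = 10  # look at the best 10%

# Quantiles of ratio: per-instance ratios, then the decile most favorable to B.
ratios = t_solver_a / t_solver_b                 # > 1 means B wins that instance
quantile_of_ratio = np.percentile(ratios, 100 - q)

# Ratio of quantiles: each solver's own best-10% time, then the ratio.
ratio_of_quantiles = np.percentile(t_solver_a, q) / np.percentile(t_solver_b, q)

print(f"quantile of ratio (B's best {q}%): {quantile_of_ratio:.2f}")
print(f"ratio of quantiles (best {q}% each): {ratio_of_quantiles:.2f}")
# With made-up numbers like these, the first figure can exceed 1 even while the
# second stays below 1: cherry-picking the instances where B happened to get
# lucky relative to A looks like a speedup, even though B is slower overall.
```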
* * *
A few days ago, a group of nine authors (Troels Rønnow, Zhihui Wang, Joshua Job, Sergio Boixo, Sergei Isakov, David Wecker, John Martinis, Daniel Lidar, and Matthias Troyer) released their long-awaited arXiv preprint Defining and detecting quantum speedup, which contains the most thorough performance analysis of the D-Wave devices to date, and which seems to me to set a new standard of care for any future analyses along these lines.  Notable aspects of the paper: it uses data from the 512-qubit machine (a previous comparison had been dismissed by D-Wave’s supporters because it studied the 128-qubit model only); it concentrates explicitly from the beginning on comparisons of scaling behavior between the D-Wave devices and comparable classical algorithms, rather than getting “sidetracked” by other issues; and it includes authors from both USC and Google’s Quantum AI Lab, two places that have made large investments in D-Wave’s machines and have every reason to want to see them succeed.
Let me quote the abstract in full:
The development of small-scale digital and analog quantum devices raises the question of how to fairly assess and compare the computational power of classical and quantum devices, and of how to detect quantum speedup. Here we show how to define and measure quantum speedup in various scenarios, and how to avoid pitfalls that might mask or fake quantum speedup. We illustrate our discussion with data from a randomized benchmark test on a D-Wave Two device with up to 503 qubits. Comparing the performance of the device on random spin glass instances with limited precision to simulated classical and quantum annealers, we find no evidence of quantum speedup when the entire data set is considered, and obtain inconclusive results when comparing subsets of instances on an instance-by-instance basis. Our results for one particular benchmark do not rule out the possibility of speedup for other classes of problems and illustrate that quantum speedup is elusive and can depend on the question posed.
Since the paper is exceedingly well-written, and since I have maybe an hour before I’m called back to baby duty, my inclination is simply to ask people to RTFP rather than writing yet another long blog post.  But maybe there are four points worth calling attention to:
1. The paper finds, empirically, that the time needed to solve random size-N instances of the quadratic unconstrained binary optimization (QUBO) problem on D-Wave’s Chimera constraint graph seems to scale like exp(c√N) for some constant c—and that this is true regardless of whether one attacks the problem using the D-Wave Two, quantum Monte Carlo (i.e., a classical algorithm that tries to mimic the native physics of the machine), or an optimized classical simulated annealing code.  Notably, exp(c√N) is just what one would have predicted from theoretical arguments based on treewidth; and the constant c doesn’t appear to be better for the D-Wave Two than for simulated annealing.  (A toy sketch of what fitting such a scaling law involves appears after this list.)
2. The last sentence of the abstract (“Our results … do not rule out the possibility of speedup for other classes of problems”) is, of course, the reed on which D-Wave’s supporters will now have to hang their hopes.  But note that it’s unclear what experimental results could ever “rule out the possibility of speedup for other classes of problems.”  (No matter how many wrong predictions a psychic has made, the possibility remains that she’d be flawless at predicting the results of Croatian ping-pong tournaments…)  Furthermore, like with previous experiments, the instances tested all involved finding ground states for random coupling configurations of the D-Wave machine’s own architecture.  In other words, this was a set of instances where one might have thought, a priori, that the D-Wave machine would have an immense home-field advantage.  Thus, one really needs to look more closely, to see whether there’s any positive evidence for an asymptotic speedup by the D-Wave machine.
3. Here, for D-Wave supporters, the biggest crumb the paper throws is that, if one considers only the ~10% of instances on which the D-Wave machine does best, then the machine does do slightly better on those instances than simulated annealing does.  (Conversely, simulated annealing does better than the D-Wave machine on the ~75% of instances on which it does best.)  Unfortunately, no one seems to know how to characterize the instances on which the D-Wave machine will do best: one just has to try it and see what happens!  And of course, it’s extremely rare that two heuristic algorithms will succeed or fail on exactly the same set of instances: it’s much more likely that their performances will be correlated, but imperfectly.  So it’s unclear, at least to me, whether this finding represents anything other than the “noise” that would inevitably occur even if one classical algorithm were pitted against another one.
4. As the paper points out, there’s also a systematic effect that biases results in the D-Wave Two’s favor, if one isn’t careful.  Namely, the D-Wave Two has a minimum annealing time of 20 microseconds, which is often greater than the optimum annealing time, particularly for small instance sizes.  The effect of that is artificially to increase the D-Wave Two’s running time for small instances, and thereby make its scaling behavior look better than it really is.  The authors say they don’t know whether even the D-Wave Two’s apparent advantage for its “top 10% of instances” will persist after this effect is fully accounted for.
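As promised under point 1, here’s a toy sketch (synthetic timings and illustrative instance sizes, not the paper’s data) of what fitting an exp(c√N) scaling law amounts to:

```python
# Toy scaling fit: generate times that grow like exp(c*sqrt(N)) with a little
# noise, then recover c by a linear fit of log(time) against sqrt(N).
import numpy as np

sizes = np.array([72, 128, 200, 288, 392, 512])   # illustrative numbers of spins N
c_true = 0.4                                      # made-up constant
rng = np.random.default_rng(1)
times = np.exp(c_true * np.sqrt(sizes)) * (1 + 0.05 * rng.normal(size=sizes.size))

slope, intercept = np.polyfit(np.sqrt(sizes), np.log(times), 1)
print(f"fitted c ~= {slope:.3f}")   # should land near the c_true used above
# In the actual comparison, the question is whether the fitted c for the
# D-Wave Two comes out any smaller than the fitted c for simulated annealing;
# the paper reports that it doesn't.
```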
Those seeking something less technical might want to check out an excellent recent article in Inc. by Will Bourne, entitled “D-Wave’s dream machine” (“D-Wave thinks it has built the first commercial quantum computer.  Mother Nature has other ideas”).  Wisely, Bourne chose not to mention me at all in this piece.  Instead, he gradually builds a skeptical case almost entirely on quotes from people like Seth Lloyd and Daniel Lidar, who one might have thought would be more open to D-Wave’s claims.  Bourne’s piece illustrates that it is possible for the mainstream press to get the D-Wave story pretty much right, and that you don’t even need a physics background to do so: all you need is a willingness to commit journalism.
Oh.  I’d be remiss not to mention that, in the few days between the appearance of this paper and my having a chance to write this post, two other preprints of likely interest to the Shtetl-Optimized commentariat showed up on quant-ph.  The first, by a large list of authors mostly from D-Wave, is called Entanglement in a quantum annealing processor.  This paper presents evidence for a point that many skeptics (including me) had been willing to grant for some time: namely, that the states generated by the D-Wave machines contain some nonzero amount of entanglement.  (Note that, because of a technical property called “stoquasticity,” such entanglement is entirely compatible with the machines continuing to be efficiently simulable on a classical computer using Quantum Monte Carlo.)  While it doesn’t address the performance question at all, this paper seems like a perfectly fine piece of science.
From the opposite side of the (eigen)spectrum comes the latest preprint by QC skeptic Michel Dyakonov, entitled Prospects for quantum computing: Extremely doubtful.  Ironically, Dyakonov and D-Wave seem to agree completely about the irrelevance of fault-tolerance and other insights from quantum computing theory.  It’s just that D-Wave thinks QC can work even without the theoretical insights, whereas Dyakonov thinks that QC can’t work even with the insights.  Unless I missed it, there’s no new scientific content in Dyakonov’s article.  It’s basically a summary of some simple facts about QC and quantum fault-tolerance, accompanied by sneering asides about how complicated and implausible it all sounds, and how detached from reality the theorists are.
And as for the obvious comparisons to previous “complicated and implausible” technologies, like (say) classical computing, or heavier-than-air flight, or controlled nuclear fission?  Dyakonov says that such comparisons are invalid, because they ignore the many technologies proposed in previous eras that didn’t work.  What’s striking is how little he seems to care about why the previous technologies failed: was it because they violated clearly-articulated laws of physics?  Or because there turned out to be better ways to do the same things?  Or because the technologies were simply too hard, too expensive, or too far ahead of their time?  Supposing QC to be impossible, which of those is the reason for the impossibility?  Since we’re not asking about something “arbitrary” here (like teaching a donkey to read), but rather about the computational power of Nature itself, isn’t it of immense scientific interest to know the reason for QC’s impossibility?  How does Dyakonov propose to learn the reason, assuming he concedes that he doesn’t already know it?
(As I’ve said many times, I’d support even the experiments that D-Wave was doing, if D-Wave and its supporters would only call them for what they were: experiments.  Forays into the unknown.  Attempts to find out what happens when a particular speculative approach is thrown at NP-hard optimization problems.  It’s only when people obfuscate the results of those experiments, in order to claim something as “commercially useful” that quite obviously isn’t yet, that they leave the realm of science, and indeed walk straight into the eager jaws of skeptics like Dyakonov.)
Anyway, since we seem to have circled back to D-Wave, I’d like to end this post by announcing my second retirement as Chief D-Wave Skeptic.  The first time I retired, it was because I mistakenly thought that D-Wave had fundamentally changed, and would put science ahead of PR from that point forward.  (The truth seems to be that there were, and are, individuals at D-Wave committed to science, but others who remain PR-focused.)  This time, I’m retiring for a different reason: because scientists like the authors of the “Defining and detecting” preprint, and journalists like Will Bourne, are doing my job better than I ever did it.  If the D-Wave debate were the American Civil War, then my role would be that of the frothy-mouthed abolitionist pamphleteer: someone who repeats over and over points that are fundamentally true, but in a strident manner that serves only to alienate fence-sitters and allies.  As I played my ineffective broken record, the Wave Power simply moved from one triumph to another, expanding its reach to Google, NASA, Lockheed Martin, and beyond.  I must have looked like a lonely loon on the wrong side of history.
But today the situation is different.  Today Honest Abe and his generals (Honest Matthias and his coauthors?) are meeting the Wave Power on the battlefield of careful performance comparisons against Quantum Monte Carlo and simulated annealing.  And while the battles might continue all the way to 2000 qubits or beyond, the results so far are not looking great for the Wave Power.  The intractability of NP-complete problems—that which we useless, ivory-tower theorists had prophesied years ago, to much derision and laughter—would seem to be rearing its head.  So, now that the bombs are bursting and the spins decohering in midair, what is there for a gun-shy pamphleteer like myself to do but sit back and watch it all play out?
Well, and maybe blog about it occasionally.  But not as “Chief Skeptic,” just as another interested observer.
My good friend Sean Carroll took a lot of flak recently for answering this year’s Edge question, “What scientific idea is ready for retirement?,” with “Falsifiability”, and for using string theory and the multiverse as examples of why science needs to break out of its narrow Popperian cage.  For more, see this blog post of Sean’s, where one commenter after another piles on the beleaguered dude for his abandonment of science and reason themselves.
My take, for whatever it’s worth, is that Sean and his critics are both right.
Sean is right that “falsifiability” is a crude slogan that fails to capture what science really aims at.  As a doofus example, the theory that zebras exist is presumably both “true” and “scientific,” but it’s not “falsifiable”: if zebras didn’t exist, there would be no experiment that proved their nonexistence.  (And that’s to say nothing of empirical claims involving multiple nested quantifiers: e.g., “for every physical device that tries to solve the Traveling Salesman Problem in polynomial time, there exists an input on which the device fails.”)  Less doofusly, a huge fraction of all scientific progress really consists of mathematical or computational derivations from previously-accepted theories—and, as such, has no “falsifiable content” apart from the theories themselves.  So, do workings-out of mathematical consequences count as “science”?  In practice, the Nobel committee says sure they do, but only if the final results of the derivations are “directly” confirmed by experiment.  Far better, it seems to me, to say that science is a search for explanations that do essential and nontrivial work, within the network of abstract ideas whose ultimate purpose is to account for our observations.  (On this particular question, I endorse everything David Deutsch has to say in The Beginning of Infinity, which you should read if you haven’t.)
On the other side, I think Sean’s critics are right that falsifiability shouldn’t be “retired.”  Instead, falsifiability’s portfolio should be expanded, with full-time assistants (like explanatory power) hired to lighten falsifiability’s load.
I also, to be honest, don’t see that modern philosophy of science has advanced much beyond Popper in its understanding of these issues.  Last year, I did something weird and impulsive: I read Karl Popper.  Given all the smack people talk about him these days, I was pleasantly surprised by the amount of nuance, reasonableness, and just general getting-it that I found.  Indeed, I found a lot more of those things in Popper than I found in his latter-day overthrowers Kuhn and Feyerabend.  For Popper (if not for some of his later admirers), falsifiability was not a crude bludgeon.  Rather, it was the centerpiece of a richly-articulated worldview holding that millennia of human philosophical reflection had gotten it backwards: the question isn’t how to arrive at the Truth, but rather how to eliminate error.  Which sounds kind of obvious, until I meet yet another person who rails to me about how empirical positivism can’t provide its own ultimate justification, and should therefore be replaced by the person’s favorite brand of cringe-inducing ugh.
Oh, I also think Sean might have made a tactical error in choosing string theory and the multiverse as his examples for why falsifiability needs to be retired.  For it seems overwhelmingly likely to me that the following two propositions are both true:
1\. Falsifiability is too crude of a concept to describe how science works.
2\. In the specific cases of string theory and the multiverse, a dearth of novel falsifiable predictions really is a big problem.
As usual, the best bet is to use explanatory power as our criterion—in which case, I’d say string theory emerges as a complex and evolving story.  On one end, there are insights like holography and AdS/CFT, which seem clearly to do explanatory work, and which I’d guess will stand as permanent contributions to human knowledge, even if the whole foundations on which they currently rest get superseded by something else.  On the other end, there’s the idea, championed by a minority of string theorists and widely repeated in the press, that the anthropic principle applied to different patches of multiverse can be invoked as a sort of get-out-of-jail-free card, to rescue a favored theory from earlier hopes of successful empirical predictions that then failed to pan out.  I wouldn’t know how to answer a layperson who asked why that wasn’t exactly the sort of thing Sir Karl was worried about, and for good reason.
Finally, not that Edge asked me, but I’d say the whole notions of “determinism” and “indeterminism” in physics are past ready for retirement.  I can’t think of any work they do, that isn’t better done by predictability and unpredictability.
Update (Feb. 4): After Luke Muehlhauser of MIRI interviewed me about “philosophical progress,” Luke asked me for other people to interview about philosophy and theoretical computer science. I suggested my friend and colleague Ronald de Wolf of the University of Amsterdam, and I’m delighted that Luke took me up on it. Here’s the resulting interview, which focuses mostly on quantum computing (with a little Kolmogorov complexity and Occam’s Razor thrown in). I read the interview with admiration (and hoping to learn some tips): Ronald tackles each question with more clarity, precision, and especially levelheadedness than I would.
Another Update: Jeff Kinne asked me to post a link to a forum about the future of the Conference on Computational Complexity (CCC)—and in particular, whether it should continue to be affiliated with the IEEE. Any readers who have ever had any involvement with the CCC conference are encouraged to participate. You can read all about what the issues are in a manifesto written by Dieter van Melkebeek.
Yet Another Update: Some people might be interested in my response to Geordie Rose’s response to the Shin et al. paper about a classical model for the D-Wave machine.
* * *
“How ‘Quantum’ is the D-Wave Machine?” by Shin, Smith, Smolin, Vazirani goo.gl/JkLg0l – was previous skepticism too GENEROUS to D-Wave?
D-Wave not of broad enough interest? OK then, try “AM with Multiple Merlins” by Dana Moshkovitz, Russell Impagliazzo, and me goo.gl/ziSUz9
“Remarks on the Physical Church-Turing Thesis” – my talk at the FQXi conference in Vieques, Puerto Rico is now on YouTube goo.gl/kAd9TZ
Cool new SciCast site (scicast.org) lets you place bets on P vs NP, Unique Games Conjecture, etc. But glitches remain to be ironed out
This morning, commenter rrtucci pointed me to TIME Magazine’s cover story about D-Wave (yes, in today’s digital media environment, I need Shtetl-Optimized readers to tell me what’s on the cover of TIME…).  rrtucci predicted that, soon after reading the article, I’d be hospitalized with a severe stress-induced bleeding ulcer.  Undeterred, I gritted my teeth, paid the $5 to go behind the paywall, and read the article.
The article, by Lev Grossman, could certainly be a lot worse.  If you get to the end, it discusses the experiments by Matthias Troyer’s group, and it makes clear the lack of any practically-relevant speedup today from the D-Wave devices.  It also includes a few skeptical quotes:
“In quantum computing, we have to be careful what we mean by ‘utilizing quantum effects,'” says Monroe, the University of Maryland scientist, who’s among the doubters. “This generally means that we are able to store superpositions of information in such a way that the system retains its ‘fuzziness,’ or quantum coherence, so that it can perform tasks that are impossible otherwise. And by that token there is no evidence that the D-Wave machine is utilizing quantum effects.”
One of the closest observers of the controversy has been Scott Aaronson, an associate professor at MIT and the author of a highly influential quantum-computing blog [aww, shucks –SA]. He remains, at best, cautious. “I’m convinced … that interesting quantum effects are probably present in D-Wave’s devices,” he wrote in an email. “But I’m not convinced that those effects, right now, are playing any causal role in solving any problems faster than we could solve them with a classical computer. Nor do I think there’s any good argument that D-Wave’s current approach, scaled up, will lead to such a speedup in the future. It might, but there’s currently no good reason to think so.”
Happily, the quote from me is something that I actually agreed with at the time I said it!  Today, having read the Shin et al. paper—which hadn’t yet come out when Grossman emailed me—I might tone down the statement “I’m convinced … that interesting quantum effects are probably present” to something like: “there’s pretty good evidence for quantum effects like entanglement at a ‘local’ level, but at the ‘global’ level we really have no idea.”
Alas, ultimately I regard this article as another victim (through no fault of the writer, possibly) of the strange conventions of modern journalism.  Maybe I can best explain those conventions with a quickie illustration:
MAGIC 8-BALL: THE RENEGADE MATH WHIZ WHO COULD CHANGE NUMBERS FOREVER
An eccentric billionaire, whose fascinating hobbies include nude skydiving and shark-taming, has been shaking up the scientific world lately with his controversial claim that 8+0 equals 17  [… six more pages about the billionaire redacted …]  It must be said that mathematicians, who we reached for comment because we’re diligent reporters, have tended to be miffed, skeptical, and sometimes even sarcastic about the billionaire’s claims.  Not surprisingly, though, the billionaire and his supporters have had some dismissive comments of their own about the mathematicians.  So, which side is right?  Or is the truth somewhere in the middle?  At this early stage, it’s hard for an outsider to say.  In the meantime, the raging controversy itself is reason enough for us to be covering this story using this story template.  Stay tuned for more!
As shown (for example) by Will Bourne’s story in Inc. magazine, it’s possible for a popular magazine to break out of the above template when covering D-Wave, or at least bend it more toward reality.  But it’s not easy.
More detailed comments:
* The article gets off on a weird foot in the very first paragraph, describing the insides of D-Wave’s devices as “the coldest place in the universe.”  Err, 20mK is pretty cold, but colder temperatures are routinely achieved in many other physics experiments.  (Are D-Wave’s the coldest current, continuously-operating experiments, or something like that?  ~~I dunno: counterexamples, anyone?~~  I’ve learned from experts that they’re not, not even close.  I heard from someone who had a bunch of dilution fridges running at 10mK in the lab he was emailing me from…)
* The article jumps enthusiastically into the standard Quantum Computing = Exponential Parallelism Fallacy (the QC=EPF), which is so common to QC journalism that I don’t know if it’s even worth pointing it out anymore (but here I am doing so).
* Commendably, the article states clearly that QCs would offer speedups only for certain specific problems, not others; that D-Wave’s devices are designed only for adiabatic optimization, and wouldn’t be useful (e.g.) for codebreaking; and that even for optimization, “D-Wave’s hardware isn’t powerful enough or well enough understood to show serious quantum speedup yet.”  But there’s a crucial further point that the article doesn’t make: namely, that we have no idea yet whether adiabatic optimization is something where quantum computers can give any practically-important speedup.  In other words, even if you could implement adiabatic optimization perfectly—at zero temperature, with zero decoherence—we still don’t know whether there’s any quantum speedup to be had that way, for any of the nifty applications that the article mentions: “software design, tumor treatments, logistical planning, the stock market, airlines schedules, the search for Earth-like planets in other solar systems, and in particular machine learning.”  In that respect, adiabatic optimization is extremely different from (e.g.) Shor’s factoring algorithm or quantum simulation: things where we know how much speedup we could get, at least compared to the best currently-known classical algorithms.  But I better stop now, since I feel myself entering an infinite loop (and I didn’t even need the adiabatic algorithm to detect it).
You might recall that Shin, Smith, Smolin, and Vazirani posted a widely-discussed preprint a week ago, questioning the evidence for large-scale quantum behavior in the D-Wave machine.  Geordie Rose responded here.   Tonight, in a Shtetl-Optimized exclusive scoop, I bring you Umesh Vazirani’s response to Geordie’s comments. Without further ado:
* * *
Even a cursory reading of our paper will reveal that Geordie Rose is attacking a straw man. Let me quickly outline the main point of our paper and the irrelevance of Rose’s comments:
To date the Boixo et al paper was the only serious evidence in favor of large scale quantum behavior by the D-Wave machine. We investigated their claims and showed that there are serious problems with their conclusions. Their conclusions were based on the close agreement between the input-output data from D-Wave and quantum simulated annealing, and their inability despite considerable effort to find any classical model that agreed with the input-output data. In our paper, we gave a very simple classical model of interacting magnets that closely agreed with the input-output data. We stated that our results implied that “it is premature to conclude that D-Wave machine exhibits large scale quantum behavior”.
Rose attacks our paper for claiming that “D-Wave processors are inherently classical, and can be described by a classical model with no need to invoke quantum mechanics.”  A reading of our paper will make it perfectly clear that this is not a claim that we make.  We state explicitly “It is worth emphasizing that the goal of this paper is not to provide a classical model for the D-Wave machine, … The classical model introduced here is useful for the purposes of studying the large-scale algorithmic features of the D-Wave machine. The task of finding an accurate model for the D-Wave machine (classical, quantum or otherwise), would be better pursued with direct access, not only to programming the D-Wave machine, but also to its actual hardware.”
Rose goes on to point to a large number of experiments conducted by D-Wave to prove small scale entanglement over 2-8 qubits and criticizes our paper for not trying to model those aspects of D-Wave. But such small scale entanglement properties are not directly relevant to prospects for a quantum speedup. Therefore we were specifically interested in claims about the large scale quantum behavior of D-Wave. There was exactly one such claim, which we duly investigated, and it did not stand up to scrutiny.
Most of the time, I’m a crabby, cantankerous ogre, whose only real passion in life is using this blog to shoot down the wrong ideas of others.  But alas, try as I might to maintain my reputation as a pure bundle of seething negativity, sometimes events transpire that pierce my crusty exterior.  Maybe it’s because I’m in Berkeley now, visiting the new Simons Institute for Theory of Computing during its special semester on Hamiltonian complexity.  And it’s tough to keep up my acerbic East Coast skepticism of everything new in the face of all this friggin’ sunshine.  (Speaking of which, if you’re in the Bay Area and wanted to meet me, this week’s the week!  Email me.)  Or maybe it’s watching Lily running around, her face wide with wonder.  If she’s so excited by her discovery of (say) a toilet plunger or some lint on the floor, what right do I have not to be excited by actual scientific progress?
Which brings me to the third reason for my relatively-sunny disposition: two long and fascinating recent papers on the arXiv.  What these papers have in common is that they use concepts from theoretical computer science in unexpected ways, while trying to address open problems at the heart of “traditional, continuous” physics and math.  One paper uses quantum circuit complexity to help understand black holes; the other uses fault-tolerant universal computation to help understand the Navier-Stokes equations.
Recently, our always-pleasant string-theorist friend Luboš Motl described computational complexity theorists as “extraordinarily naïve” (earlier, he also called us “deluded” and “bigoted”).  Why?  Because we’re obsessed with “arbitrary, manmade” concepts like the set of problems solvable in polynomial time, and especially because we assume things we haven’t yet proved such as P≠NP.  (Jokes about throwing stones from a glass house—or a stringy house—are left as exercises for the reader.)  The two papers that I want to discuss today reflect a different perspective: one that regards computation as no more “arbitrary” than other central concepts of mathematics, and indeed, as something that shows up even in contexts that seem incredibly remote from it, from the AdS/CFT correspondence to turbulent fluid flow.
* * *
Our first paper is Computational Complexity and Black Hole Horizons, by Lenny Susskind.  As readers of this blog might recall, last year Daniel Harlow and Patrick Hayden made a striking connection between computational complexity and the black-hole “firewall” question, by giving complexity-theoretic evidence that performing the measurement of Hawking radiation required for the AMPS experiment would require an exponentially-long quantum computation.  In his new work, Susskind makes a different, and in some ways even stranger, connection between complexity and firewalls.  Specifically, given an n-qubit pure state |ψ〉, recall that the quantum circuit complexity of |ψ〉 is the minimum number of 2-qubit gates needed to prepare |ψ〉 starting from the all-|0〉 state.  Then for reasons related to black holes and firewalls, Susskind wants to use the quantum circuit complexity of |ψ〉 as an intrinsic clock, to measure how long |ψ〉 has been evolving for.  Last week, I had the pleasure of visiting Stanford, where Lenny spent several hours explaining this stuff to me.  I still don’t fully understand it, but since it’s arguable that no one (including Lenny himself) does, let me give it a shot.
My approach will be to divide into two questions.  The first question is: why, in general (i.e., forgetting about black holes), might one want to use quantum circuit complexity as a clock?  Here the answer is: because unlike most other clocks, this one should continue to tick for an exponentially long time!
Consider some standard, classical thermodynamic system, like a box filled with gas, with the gas all initially concentrated in one corner.  Over time, the gas will diffuse across the box, in accord with the Second Law, until it completely equilibrates.  Furthermore, if we know the laws of physics, then we can calculate exactly how fast this diffusion will happen.  But this implies that we can use the box as a clock!  To do so, we’d simply have to measure how diffused the gas was, then work backwards to determine how much time had elapsed since the gas started diffusing.
But notice that this “clock” only works until the gas reaches equilibrium—i.e., is equally spread across the box.  Once the gas gets to equilibrium, which it does in a reasonably short time, it just stays there (at least until the next Poincaré recurrence time).  So, if you see the box in equilibrium, there’s no measurement you could make—or certainly no “practical” measurement—that would tell you how long it’s been there.  Indeed, if we model the collisions between gas particles (and between gas particles and the walls of the box) as random events, then something even stronger is true.  Namely, the probability distribution over all possible configurations of the gas particles will quickly converge to an equilibrium distribution.  And if all you knew was that the particles were in the equilibrium distribution, then there’s no property of their distribution that you could point to—not even an abstract, unmeasurable property—such that knowing that property would tell you how long the gas had been in equilibrium.
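If it helps to see the “box as a clock” point in the simplest possible setting, here’s a toy simulation (my own illustration, nothing from Susskind’s paper): model the gas as independent random walkers in a one-dimensional box, and watch the spread of the particles grow predictably at first and then saturate once the gas equilibrates—at which point the clock stops telling you anything.

```python
# Toy "box as a clock": gas particles as independent random walkers in a 1-D box.
# Early on, the spread of the particles grows roughly like sqrt(t), so measuring
# it lets you read off the elapsed time; once the gas fills the box, the spread
# saturates near L/sqrt(12) and carries no information about how long the gas
# has been in equilibrium.
import numpy as np

rng = np.random.default_rng(0)
L = 100            # number of sites in the box
N = 10_000         # number of gas particles, all starting in one corner
positions = np.zeros(N, dtype=int)

for t in range(1, 20_001):
    positions += rng.choice([-1, 1], size=N)        # each particle takes a random step
    np.clip(positions, 0, L - 1, out=positions)     # hard walls keep particles in the box
    if t in (10, 100, 1_000, 10_000, 20_000):
        print(f"t = {t:6d}   spread = {positions.std():6.2f}")
```

Nothing here is quantum, of course; the point is just that a “thermodynamic clock” of this kind works only until equilibration, exactly as in the gas-in-a-box story above.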
Interestingly, something very different happens if we consider a quantum pure state, in complete isolation from its environment.  If you have some quantum particles in a perfectly-isolating box, and you start them out in a “simple” state (say, with all particles unentangled and in a corner), then they too will appear to diffuse, with their wavefunctions spreading out and getting entangled with each other, until the system reaches “equilibrium.”  After that, there will once again be no “simple” measurement you can make—say, of the density of particles in some particular location—that will give you any idea of how long the box has been in equilibrium.  On the other hand, the laws of unitary evolution assure us that the quantum state is still evolving, rotating serenely through Hilbert space, just like it was before equilibration!  Indeed, in principle you could even measure that the evolution was still happening, but to do so, you’d need to perform an absurdly precise and complicated measurement—one that basically inverted the entire unitary transformation that had been applied since the particles started diffusing.
Lenny now asks the question: if the quantum state of the particles continues to evolve even after “equilibration,” then what physical quantity can we point to as continuing to increase?  By the argument above, it can’t be anything simple that physicists are used to talking about, like coarse-grained entropy.  Indeed, the most obvious candidate that springs to mind, for a quantity that should keep increasing even after equilibration, is just the quantum circuit complexity of the state!  If there’s no “magic shortcut” to simulating this system—that is, if the fastest way to learn the quantum state at time T is just to run the evolution equations forward for T time steps—then the quantum circuit complexity will continue to increase linearly with T, long after equilibration.  Eventually, the complexity will “max out” at ~c^n, where n is the number of particles, simply because (neglecting small multiplicative terms) the dimension of the Hilbert space is always an upper bound on the circuit complexity.  After even longer amounts of time—like ~c^{c^n}—the circuit complexity will dip back down (sometimes even to 0), as the quantum state undergoes recurrences.  But both of those effects only occur on timescales ridiculously longer than anything normally relevant to physics or everyday life.
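In case it’s useful, here’s the back-of-the-envelope version of that “maxing out” claim—a standard counting sketch, not anything specific to Susskind’s paper, with schematic constants.  In one direction, any n-qubit unitary can be compiled into exp(O(n)) two-qubit gates (standard synthesis results give O(4^n)), so no pure state ever needs more than exponentially many gates to prepare.  In the other direction, fix a finite gate set G; then the number of distinct circuits with T gates is at most

$$ \left( |G| \cdot n^2 \right)^T $$

(one gate choice and one pair of qubits per step), whereas covering the set of n-qubit pure states—a manifold of real dimension 2^{n+1}−2—to any fixed accuracy ε requires at least (c/ε)^{2^{n+1}-2} points.  Comparing the two counts gives

$$ T = \Omega\!\left( \frac{2^n}{\log n} \right), $$

so almost every state sits right up against the exponential ceiling.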
Admittedly, given the current status of complexity theory, there’s little hope of proving unconditionally that the quantum circuit complexity continues to rise until it becomes exponential, when some time-independent Hamiltonian is continuously applied to the all-|0〉 state.  (If we could prove such a statement, then presumably we could also prove PSPACE⊄BQP/poly.)  But maybe we could prove such a statement modulo a reasonable conjecture.  And we do have suggestive weaker results.  In particular (and as I just learned this Friday), in 2012 Brandão, Harrow, and Horodecki, building on earlier work due to Low, showed that, if you apply S>>n random two-qubit gates to n qubits initially in the all-|0〉 state, then with high probability, not only do you get a state with large circuit complexity, you get a state that can’t even be distinguished from the maximally mixed state by any quantum circuit with at most ~S^{1/6} gates.
OK, now on to the second question: what does any of this have to do with black holes?  The connection Lenny wants to make involves the AdS/CFT correspondence, the “duality” between two completely different-looking theories that’s been the rage in string theory since the late 1990s.  On one side of the ring is AdS (Anti de Sitter), a quantum-gravitational theory in D spacetime dimensions—one where black holes can form and evaporate, etc., but on the other hand, the entire universe is surrounded by a reflecting boundary a finite distance away, to help keep everything nice and unitary.  On the other side is CFT (Conformal Field Theory): an “ordinary” quantum field theory, with no gravity, that lives only on the (D-1)-dimensional “boundary” of the AdS space, and not in its interior “bulk.”  The claim of AdS/CFT is that despite how different they look, these two theories are “equivalent,” in the sense that any calculation in one theory can be transformed to a calculation in the other theory that yields the same answer.  Moreover, we get mileage this way, since a calculation that’s hard on the AdS side is often easy on the CFT side and vice versa.
As an example, suppose we’re interested in what happens inside a black hole—say, because we want to investigate the AMPS firewall paradox.  Now, figuring out what happens inside a black hole (or even on or near the event horizon) is a notoriously hard problem in quantum gravity; that’s why people have been arguing about firewalls for the past two years, and about the black hole information problem for the past forty!  But what if we could put our black hole in an AdS box?  Then using AdS/CFT, couldn’t we translate questions about the black-hole interior to questions about the CFT on the boundary, which don’t involve gravity and which would therefore hopefully be easier to answer?
In fact people have tried to do that—but frustratingly, they haven’t been able to use the CFT calculations to answer even the grossest, most basic questions about what someone falling into the black hole would actually experience.  (For example, would that person hit a “firewall” and die immediately at the horizon, or would she continue smoothly through, only dying close to the singularity?)  Lenny’s paper explores a possible reason for this failure.  It turns out that the way AdS/CFT works, the closer to the black hole’s event horizon you want to know what happens, the longer you need to time-evolve the quantum state of the CFT to find out.  In particular, if you want to know what’s going on at distance ε from the event horizon, then you need to run the CFT for an amount of time that grows like log(1/ε).  And what if you want to know what’s going on inside the black hole?  In line with the holographic principle, it turns out that you can express an observable inside the horizon by an integral over the entire AdS space outside the horizon.  Now, that integral will include a part where the distance ε from the event horizon goes to 0—so log(1/ε), and hence the complexity of the CFT calculation that you have to do, diverges to infinity.  For some kinds of calculations, the ε→0 part of the integral isn’t very important, and can be neglected at the cost of only a small error.  For other kinds of calculations, unfortunately—and in particular, for the kind that would tell you whether or not there’s a firewall—the ε→0 part is extremely important, and it makes the CFT calculation hopelessly intractable.
Note that yes, we even need to continue the integration for ε much smaller than the Planck length—i.e., for so-called “transplanckian” distances!  As Lenny puts it, however:
For most of this vast sub-planckian range of scales we don’t expect that the operational meaning has anything to do with meter sticks … It has more to do with large times than small distances.
One could give this transplanckian blowup in computational complexity a pessimistic spin: darn, so it’s probably hopeless to use AdS/CFT to prove once and for all that there are no firewalls!  But there’s also a more positive interpretation: the interior of a black hole is “protected from meddling” by a thick armor of computational complexity.  To explain this requires a digression about firewalls.
The original firewall paradox of AMPS could be phrased as follows: if you performed a certain weird, complicated measurement on the Hawking radiation emitted from a “sufficiently old” black hole, then the expected results of that measurement would be incompatible with also seeing a smooth, Einsteinian spacetime if you later jumped into the black hole to see what was there.  (Technically, because you’d violate the monogamy of entanglement.)  If what awaited you behind the event horizon wasn’t a “classical” black hole interior with a singularity in the middle, but an immediate breakdown of spacetime, then one says you would’ve “hit a firewall.”
Yes, it seems preposterous that “firewalls” would exist: at the least, it would fly in the face of everything people thought they understood for decades about general relativity and quantum field theory.  But crucially—and here I have to disagree with Stephen Hawking—one can’t “solve” this problem by simply reiterating how physically absurd firewalls would be, or by constructing scenarios where firewalls “self-evidently” don’t arise.  Instead, as I see it, solving the problem means giving an account of what actually happens when you do the AMPS experiment, or of what goes wrong when you try to do it.
On this last question, it seems to me that Susskind and Juan Maldacena achieved a major advance in their much-discussed ER=EPR paper last year.  Namely, they presented a picture where, sure, a firewall arises (at least temporarily) if you do the AMPS experiment—but no firewall arises if you don’t do the experiment!  In other words, doing the experiment sends a nonlocal signal to the interior of the black hole (though you do have to jump into the black hole to receive the signal, so causality outside the black hole is still preserved).  Now, how is it possible for your measurement of the Hawking radiation to send an instantaneous signal to the black hole interior, which might be light-years away from you when you measure?  On Susskind and Maldacena’s account, it’s possible because the entanglement between the Hawking radiation and the degrees of freedom still in the black hole, can be interpreted as creating wormholes between the two.  Under ordinary conditions, these wormholes (like most wormholes in general relativity) are “non-traversable”: they “pinch off” if you try to send signals through them, so they can’t be used for faster-than-light communication.  However, if you did the AMPS experiment, then the wormholes would become traversable, and could carry a firewall (or an innocuous happy-birthday message, or whatever) from the Hawking radiation to the black hole interior.  (Incidentally, ER stands for Einstein and Rosen, who wrote a famous paper on wormholes, while EPR stands for Einstein, Podolsky, and Rosen, who wrote a famous paper on entanglement.  “ER=EPR” is Susskind and Maldacena’s shorthand for their proposed connection between wormholes and entanglement.)
Anyway, these heady ideas raise an obvious question: how hard would it be to do the AMPS experiment?  Is sending a nonlocal signal to the interior of a black hole via that experiment actually a realistic possibility?  In their work a year ago on computational complexity and firewalls, Harlow and Hayden already addressed that question, though from a different perspective than Susskind.  In particular, Harlow and Hayden gave strong evidence that carrying out the AMPS experiment would require solving a problem believed to be exponentially hard even for a quantum computer: specifically, a complete problem for QSZK (Quantum Statistical Zero Knowledge).  In followup work (not yet written up, though see my talk at KITP and my PowerPoint slides), I showed that the Harlow-Hayden problem is actually at least as hard as inverting one-way functions, which is even stronger evidence for hardness.
All of this suggests that, even supposing we could surround an astrophysical black hole with a giant array of perfect photodetectors, wait ~10^{69} years for the black hole to (mostly) evaporate, then route the Hawking photons into a super-powerful, fault-tolerant quantum computer, doing the AMPS experiment (and hence, creating traversable wormholes to the black hole interior) still wouldn’t be a realistic prospect, even if the equations formally allow it!  There’s no way to sugarcoat this: computational complexity limitations seem to be the only thing protecting the geometry of spacetime from nefarious experimenters.
Anyway, Susskind takes that amazing observation of Harlow and Hayden as a starting point, but then goes off on a different tack.  For one thing, he isn’t focused on the AMPS experiment (the one involving monogamy of entanglement) specifically: he just wants to know how hard it is to do any experiment (possibly a different one) that would send nonlocal signals to the black hole interior.  For another, unlike Harlow and Hayden, Susskind isn’t trying to show that such an experiment would be exponentially hard.  Instead, he’s content if the experiment is “merely” polynomially hard—but in the same sense that (say) unscrambling an egg, or recovering a burned book from the smoke and ash, are polynomially hard.  In other words, Susskind only wants to argue that creating a traversable wormhole would be “thermodynamics-complete.”  A third, related, difference is that Susskind considers an extremely special model scenario: namely, the AdS/CFT description of something called the “thermofield double state.”  (This state involves two entangled black holes in otherwise-separated spacetimes; according to ER=EPR, we can think of those black holes as being connected by a wormhole.)  While I don’t yet understand this point, apparently the thermofield double state is much more favorable for firewall production than a “realistic” spacetime—and in particular, the Harlow-Hayden argument doesn’t apply to it.  Susskind wants to show that even so (i.e., despite how “easy” we’ve made it), sending a signal through the wormhole connecting the two black holes of the thermofield double state would still require solving a thermodynamics-complete problem.
So that’s the setup.  What new insights does Lenny get?  This, finally, is where we circle back to the view of quantum circuit complexity as a clock.  Briefly, Lenny finds that the quantum state getting more and more complicated in the CFT description—i.e., its quantum circuit complexity going up and up—directly corresponds to the wormhole getting longer and longer in the AdS description.  (Indeed, the length of the wormhole increases linearly with time, growing like the circuit complexity divided by the total number of qubits.)  And the wormhole getting longer and longer is what makes it non-traversable—i.e., what makes it impossible to send a signal through.
Why has quantum circuit complexity made a sudden appearance here?  Because in the CFT description, the circuit complexity continuing to increase is the only thing that’s obviously “happening”!  From a conventional physics standpoint, the quantum state of the CFT very quickly reaches equilibrium and then just stays there.  If you measured some “conventional” physical observable—say, the energy density at a particular point—then it wouldn’t look like the CFT state was continuing to evolve at all.  And yet we know that the CFT state is evolving, for two extremely different reasons.  Firstly, because (as we discussed early on in this post) unitary evolution is still happening, so presumably the state’s quantum circuit complexity is continuing to increase.  And secondly, because in the dual AdS description, the wormhole is continuing to get longer!
From this connection, at least three striking conclusions follow:
1. That even when nothing else seems to be happening in a physical system (i.e., it seems to have equilibrated), the fact that the system’s quantum circuit complexity keeps increasing can be “physically relevant” all by itself.  We know that it’s physically relevant, because in the AdS dual description, it corresponds to the wormhole getting longer!
2. That even in the special case of the thermofield double state, the geometry of spacetime continues to be protected by an “armor” of computational complexity.  Suppose that Alice, in one half of the thermofield double state, wants to send a message to Bob in the other half (which Bob can retrieve by jumping into his black hole).  In order to get her message through, Alice needs to prevent the wormhole connecting her black hole to Bob’s from stretching uncontrollably—since as long as it stretches, the wormhole remains non-traversable.  But in the CFT picture, stopping the wormhole from stretching corresponds to stopping the quantum circuit complexity from increasing!  And that, in turn, suggests that Alice would need to act on the radiation outside her black hole in an incredibly complicated and finely-tuned way.  For “generically,” the circuit complexity of an n-qubit state should just continue to increase, the longer you run unitary evolution for, until it hits its exp(n) maximum.  To prevent that from happening would essentially require “freezing” or “inverting” the unitary evolution applied by nature—but that’s the sort of thing that we expect to be thermodynamics-complete.  (How exactly do Alice’s actions in the “bulk” affect the evolution of the CFT state?  That’s an excellent question that I don’t understand AdS/CFT well enough to answer.  All I know is that the answer involves something that Lenny calls “precursor operators.”)
3. The third and final conclusion is that there can be a physically-relevant difference between pseudorandom n-qubit pure states and “truly” random states—even though, by the definition of pseudorandom, such a difference can’t be detected by any small quantum circuit!  Once again, the way to see the difference is using AdS/CFT.  It’s easy to show, by a counting argument, that almost all n-qubit pure states have nearly-maximal quantum circuit complexity.  But if the circuit complexity is already maximal, that means in particular that it’s not increasing!  Lenny argues that this corresponds to the wormhole between the two black holes no longer stretching.  But if the wormhole is no longer stretching, then it’s “vulnerable to firewalls” (i.e., to messages going through!).  It had previously been argued that random CFT states almost always correspond to black holes with firewalls—and since the CFT states formed by realistic physical processes will look “indistinguishable from random states,” black holes that form under realistic conditions should generically have firewalls as well.  But Lenny rejects this argument, on the ground that the CFT states that arise in realistic situations are not random pure states.  And what distinguishes them from random states?  Simply that they have non-maximal (and increasing) quantum circuit complexity!
I’ll leave you with a question of my own about this complexity / black hole connection: one that I’m unsure how to think about, but that perhaps interests me more than any other here.  My question is: could you ever learn the answer to an otherwise-intractable computational problem by jumping into a black hole?  Of course, you’d have to really want the answer—so much so that you wouldn’t mind dying moments after learning it, or not being able to share it with anyone else!  But never mind that.  What I have in mind is first applying some polynomial-size quantum circuit to the Hawking radiation, then jumping into the black hole to see what nonlocal effect (if any) the circuit had on the interior.  The fact that the mapping between interior and exterior states is so complicated suggests that there might be complexity-theoretic mileage to be had this way, but I don’t know what.  (It’s also possible that you can get a computational speedup in special cases like the thermofield double state, but that a Harlow-Hayden-like obstruction prevents you from getting one with real astrophysical black holes.  I.e., that for real black holes, you’ll just see a smooth, boring, Einsteinian black hole interior no matter what polynomial-size quantum circuit you applied to the Hawking radiation.)
* * *
If you’re still here, the second paper I want to discuss today is Finite-time blowup for an averaged three-dimensional Navier-Stokes equation by Terry Tao.  (See also the excellent Quanta article by Erica Klarreich.)  I’ll have much, much less to say about this paper than I did about Susskind’s, but that’s not because it’s less interesting: it’s only because I understand the issues even less well.
Navier-Stokes existence and smoothness is one of the seven Clay Millennium Problems (alongside P vs. NP, the Riemann Hypothesis, etc).  The problem asks whether the standard, classical differential equations for three-dimensional fluid flow are well-behaved, in the sense of not “blowing up” (e.g., concentrating infinite energy on a single point) after a finite amount of time.
Expanding on ideas from his earlier blog posts and papers about Navier-Stokes (see here for the gentlest of them), Tao argues that the Navier-Stokes problem is closely related to the question of whether or not it’s possible to “build a fault-tolerant universal computer out of water.”  Why?  Well, it’s not the computational universality per se that matters, but if you could use fluid flow to construct general enough computing elements—resistors, capacitors, transistors, etc.—then you could use those elements to recursively shift the energy in a given region into a region half the size, and from there to a region a quarter the size, and so on, faster and faster, until you got infinite energy density after a finite amount of time.
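To see why “faster and faster” gives you blowup at a finite time, rather than merely in the limit, here’s the cartoon arithmetic (mine, not Tao’s actual construction).  Suppose stage k of the machine squeezes the energy E into a region of diameter ~2^{-k}, and suppose that, by scale-invariance, each stage runs (say) twice as fast as the one before, so that stage k takes time ~T_0 2^{-k}.  Then the total elapsed time is

$$ \sum_{k=0}^{\infty} T_0 \, 2^{-k} \;=\; 2\,T_0 \;<\; \infty, $$

while the energy density after stage k grows like

$$ \frac{E}{\left(2^{-k}\right)^3} \;=\; 8^k \, E \;\longrightarrow\; \infty, $$

so an infinite density is reached by the finite time 2T_0.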
Strikingly, building on an earlier construction by Katz and Pavlovic, Tao shows that this is actually possible for an “averaged” version of the Navier-Stokes equations!  So at the least, any proof of existence and smoothness for the real Navier-Stokes equations will need to “notice” the difference between the real and averaged versions.  In his paper, though, Tao hints at the possibility (or dare one say likelihood?) that the truth might go the other way.  That is, maybe the “universal computer” construction can be ported from the averaged Navier-Stokes equations to the real ones.  In that case, we’d have blowup in finite time for the real equations, and a negative solution to the Navier-Stokes existence and smoothness problem.  Of course, such a result wouldn’t imply that real, physical water was in any danger of “blowing up”!  It would simply mean that the discrete nature of water (i.e., the fact that it’s made of H2O molecules, rather than being infinitely divisible) was essential to understanding its stability given arbitrary initial conditions.
So, what are the prospects for such a blowup result?  Let me quote from Tao’s paper:
Once enough logic gates of ideal fluid are constructed, it seems that the main difficulties in executing the above program [to prove a blowup result for the “real” Navier-Stokes equations] are of a “software engineering” nature, and would be in principle achievable, even if the details could be extremely complicated in practice.  The main mathematical difficulty in executing this “fluid computing” program would thus be to arrive at (and rigorously certify) a design for logical gates of inviscid fluid that has some good noise tolerance properties.  In this regard, ideas from quantum computing (which faces a unitarity constraint somewhat analogous to the energy conservation constraint for ideal fluids, albeit with the key difference of having a linear evolution rather than a nonlinear one) may prove to be useful.
One minor point that I’d love to understand is, what happens in two dimensions?  Existence and smoothness are known to hold for the 2-dimensional analogues of the Navier-Stokes equations.  If they also held for the averaged 2-dimensional equations, then it would follow that Tao’s “universal computer” must be making essential use of the third dimension. How?  If I knew the answer to that, then I’d feel for the first time like I had some visual crutch for understanding why 3-dimensional fluid flow is so complicated, even though 2-dimensional fluid flow isn’t.
I see that, in blog comments here and here, Tao says that the crucial difference between the 2- and 3-dimensional Navier-Stokes equations arises from the different scaling behavior of the dissipation term: basically, you can ignore it in 3 or more dimensions, but you can’t ignore it in 2.  But maybe there’s a more doofus-friendly explanation, which would start with some 3-dimensional fluid logic gate, and then explain why the gate has no natural 2-dimensional analogue, or why dissipation causes its analogue to fail.
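I don’t have such an explanation, but for what it’s worth, here’s the standard scaling computation that this 2-versus-3 difference usually gets pinned on (my gloss, not Tao’s wording).  The Navier-Stokes equations have the scaling symmetry

$$ u_\lambda(x,t) \;=\; \lambda \, u(\lambda x, \lambda^2 t), $$

which maps solutions to solutions, and under which the energy—the quantity that the dissipation term controls—transforms as

$$ \int |u_\lambda(x,t)|^2 \, dx \;=\; \lambda^{2-d} \int |u(y,\lambda^2 t)|^2 \, dy. $$

In d=2 the energy is scale-invariant (“critical”), so it constrains the solution just as strongly at every scale; in d=3 it shrinks like λ^{-1} (“supercritical”), so as you zoom in to finer and finer scales, the a priori control coming from energy and dissipation becomes arbitrarily weak—which is exactly the room a self-similar “fluid computer” would need in order to operate.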
* * *
Obviously, there’s much more to say about both papers (especially the second…) than I said in this post, and many people more knowledgeable than I am to say those things.  But that’s what the comments section is for.  Right now I’m going outside to enjoy the California sunshine.
Why do academics feel the need to stuff their papers with “nontrivial” results? After all, if a paper is remembered decades after it was written, it’s almost always for a simple core idea — not for the extensions and applications that fill 90% of the paper’s bulk.
The nontriviality virus can infect even the greats: think of Leonid Levin’s famous paper on universal search. According to legend, the reason Levin was scooped by Cook and Karp is that he spent a year trying to prove Graph Isomorphism was NP-complete! You see, that would’ve been a deep, publication-worthy result, unlike the “obvious” fact that there exist natural NP-complete problems.
Here’s a more recent example. In my opinion, this 43-pager by Barak et al. is one of the sweetest computer science papers of the past decade. But what makes it so sweet is a two-sentence insight (my wording):
> There’s no generic, foolproof way to “obfuscate” a computer program. For even if a program looked hopelessly unreadable, you could always feed it its own code as input, which is one thing you couldn’t do if all you had was a “black box” with the same input/output behavior as the program in question.
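To make the self-referential trick a bit more concrete, here’s a toy Python sketch—my own cartoon, emphatically not the construction from the Barak et al. paper—of a function that does something special when handed (a description of) its own code, which is exactly the kind of input you could never supply to a black box with the same input/output behavior:

```python
# Toy illustration (NOT the Barak et al. construction): a function that leaks a
# secret only when its input, read as Python source, agrees with the function
# itself on a few test points.  Anyone holding the source can feed the function
# its own code and extract the secret; with only black-box access, you'd have
# no way to come up with such an input.

SECRET = 0xC0FFEE
TESTS = [3, 7, 42]            # points on which functional agreement is checked

SOURCE = r'''
def f(x):
    return x * x + 1          # same "ordinary" behavior as P on integers
'''

def P(x):
    if isinstance(x, int):
        return x * x + 1                      # the boring, visible behavior
    try:
        ns = {}
        exec(x, ns)                           # treat the input as source code
        g = ns["f"]
        if all(g(t) == P(t) for t in TESTS):  # does it behave like P itself?
            return SECRET                     # ...then leak the secret
    except Exception:
        pass
    return 0

print(hex(P(SOURCE)))   # 0xc0ffee -- feeding P its own code unlocks the secret
print(P(5))             # 26      -- on ordinary inputs, nothing special happens
```

In the actual paper, of course, the “unobfuscatable” functions have to be built far more carefully, so that the argument works against every conceivable obfuscator rather than just against this naive one.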
So why did the authors go on for 43 more pages?
One possibility was suggested to me by Robin Hanson, an economist at George Mason who spews interesting ideas out of his nose and ears. Depending on your prejudices, you might see Robin as either a visionary futurist or a walking reductio ad absurdum of mainstream economic theory. Either way, his web page will surprise and provoke you.
When I talked with Robin in August, he speculated that nontrivial results function mainly as “certificates of smartness”: that is, expensive, difficult-to-fake evidence that the author(s) of a paper are smart enough that their simple core idea is likely to be worth taking seriously. Without these certificates, the theory goes, we academics would be deluged by too many promising ideas to entertain them all — since even if the ideas are simple, it usually isn’t simple to ascertain their worth.
Note that this theory differs from a more standard complaint, that academics fill their papers with nontrivial results for the sole purpose of getting them published. On Robin’s account, nontrivial results actually are useful to readers, just not in the way the paper advertises. Think of the author as a groom, the reader as a bride, and the nontrivial result as a wedding ring. The bride doesn’t care about the actual ring, but she does care that the groom was rich and devoted enough to buy one.
One prediction of Robin’s theory would be that, once you’ve established your smartness within the community, you should be able to get papers published even if they contain only simple observations. Another prediction would be that, if you’re very smart but emotionally attached to a simple idea, you should be able to buy exposure for your idea by encrusting it with nontrivialities. (As Robin remarked to me, everything in social science is either obvious or false; the only question is which.)
I don’t have anything deeper to say about Robin’s theory, but I’m enjoying the freedom to blog about it anyway.
Out there in the wider world—OK, OK, among Luboš Motl, and a few others who comment on this blog—there appears to be a widespread opinion that P≠NP is just “a fashionable dogma of the so-called experts,” something that’s no more likely to be true than false.  The doubters can even point to at least one accomplished complexity theorist, Dick Lipton, who publicly advocates agnosticism about whether P=NP.
Of course, not all the doubters reach their doubts the same way.  For Lipton, the thinking is probably something like: as scientists, we should be rigorously open-minded, and constantly question even the most fundamental hypotheses of our field.  For the outsiders, the thinking is more like: computer scientists are just not very smart—certainly not as smart as real scientists—so the fact that they consider something a “fundamental hypothesis” provides no information of value.
Consider, for example, this comment of Ignacio Mosqueira:
If there is no proof that means that there is no reason a-priori to prefer your arguments over those [of] Lubos. Expertise is not enough.  And the fact that Lubos is difficult to deal with doesn’t change that.
In my response, I wondered how broadly Ignacio would apply the principle “if there’s no proof, then there’s no reason to prefer any argument over any other one.”  For example, would he agree with the guy interviewed on Jon Stewart who earnestly explained that, since there’s no proof that turning on the LHC will destroy the world, but also no proof that it won’t destroy the world, the only rational inference is that there’s a 50% chance it will destroy the world?  (John Oliver’s deadpan response was classic: “I’m … not sure that’s how probability works…”)
In a lengthy reply, Luboš bites this bullet with relish and mustard.  In physics, he agrees, or even in “continuous mathematics that is more physics-wise,” it’s possible to have justified beliefs even without proof.  For example, he admits to a 99.9% probability that the Riemann hypothesis is true.  But, he goes on, “partial evidence in discrete mathematics just cannot exist.”  Discrete math and computer science, you see, are so arbitrary, manmade, and haphazard that every question is independent of every other; no amount of experience can give anyone any idea which way the next question will go.
No, I’m not kidding.  That’s his argument.
I couldn’t help wondering: what about number theory?  Aren’t the positive integers a “discrete” structure?  And isn’t the Riemann Hypothesis fundamentally about the distribution of primes?  Or does the Riemann Hypothesis get counted as an “honorary physics-wise continuous problem” because it can also be stated analytically?  But then what about Goldbach’s Conjecture?  Is Luboš 50/50 on that one too?  Better yet, what about continuous, analytic problems that are closely related to P vs. NP?  For example, Valiant’s Conjecture says you can’t linearly embed the permanent of an n×n matrix as the determinant of an m×m matrix, unless m≥exp(n).  Mulmuley and others have connected this “continuous cousin” of P≠NP to issues in algebraic geometry, representation theory, and even quantum groups and Langlands duality.  So, does that make it kosher?  The more I thought about the proposed distinction, the less sense it made to me.
But enough of this.  In the rest of this post, I want to explain why the odds that you should assign to P≠NP are more like 99% than they are like 50%.  This post supersedes my 2006 post on the same topic, which I hereby retire.  While that post was mostly OK as far as it went, I now feel like I can do a much better job articulating the central point.  (And also, I made the serious mistake in 2006 of striving for literary eloquence and tongue-in-cheek humor.  That works great for readers who already know the issues inside-and-out, and just want to be amused.  Alas, it doesn’t work so well for readers who don’t know the issues, are extremely literal-minded, and just want ammunition to prove their starting assumption that I’m a doofus who doesn’t understand the basics of his own field.)
So, OK, why should you believe P≠NP?  Here’s why:
Because, like any other successful scientific hypothesis, the P≠NP hypothesis has passed severe tests that it had no good reason to pass were it false.
What kind of tests am I talking about?
By now, tens of thousands of problems have been proved to be NP-complete.  They range in character from theorem proving to graph coloring to airline scheduling to bin packing to protein folding to auction pricing to VLSI design to minimizing soap films to winning at Super Mario Bros.  Meanwhile, another cluster of tens of thousands of problems has been proved to lie in P (or BPP).  Those range from primality to matching to linear and semidefinite programming to edit distance to polynomial factoring to hundreds of approximation tasks.  Like the NP-complete problems, many of the P and BPP problems are also related to each other by a rich network of reductions.  (For example, countless other problems are in P “because” linear and semidefinite programming are.)
So, if we were to draw a map of the complexity class NP  according to current knowledge, what would it look like?  There’d be a huge, growing component of NP-complete problems, all connected to each other by an intricate network of reductions.  There’d be a second huge component of P problems, many of them again connected by reductions.  Then, much like with the map of the continental US, there’d be a sparser population in the middle: stuff like factoring, graph isomorphism, and Unique Games that for various reasons has thus far resisted assimilation onto either of the coasts.
Of course, to prove P=NP, it would suffice to find a single link—that is, a single polynomial-time equivalence—between any of the tens of thousands of problems on the P coast, and any of the tens of thousands on the NP-complete one.  In half a century, this hasn’t happened: even as they’ve both ballooned exponentially, the two giant regions have remained defiantly separate from each other.  But that’s not even the main point.  The main point is that, as people explore these two regions, again and again there are “close calls”: places where, if a single parameter had worked out differently, the two regions would have come together in a cataclysmic collision.  Yet every single time, it’s just a fake-out.  Again and again the two regions “touch,” and their border even traces out weird and jagged shapes.  But even in those border zones, not a single problem ever crosses from one region to the other.  It’s as if they’re kept on their respective sides by an invisible electric fence.
As an example, consider the Set Cover problem: i.e., the problem, given a collection of subsets S1,…,Sm⊆{1,…,n}, of finding as few subsets as possible whose union equals the whole set.  Chvatal showed in 1979 that a greedy algorithm can produce, in polynomial time, a collection of sets whose size is at most ln(n) times larger than the optimum size.  This raises an obvious question: can you do better?  What about 0.9ln(n)?  Alas, building on a long sequence of prior works in PCP theory, it was recently shown that, if you could find a covering set at most (1-ε)ln(n) times larger than the optimum one, then you’d be solving an NP-complete problem, and P would equal NP.  Notice that, conversely, if the hardness result worked for ln(n) or anything above, then we’d also get P=NP.  So, why do the algorithm and the hardness result “happen to meet” at exactly ln(n), with neither one venturing the tiniest bit beyond?  Well, we might say, ln(n) is where the invisible electric fence is for this problem.
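For concreteness, here’s the greedy algorithm in question, in a few lines of Python (my own rendering of the textbook procedure):

```python
# Greedy set cover: repeatedly pick whichever subset covers the most
# still-uncovered elements.  Chvatal's analysis shows that the number of
# subsets chosen is at most ~ln(n) times the optimum.
def greedy_set_cover(universe, subsets):
    """universe: a set of elements; subsets: a list of sets whose union is universe.
    Returns the indices of the chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(subsets)),
                   key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets don't cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

U = {1, 2, 3, 4, 5, 6}
S = [{1, 2, 3}, {4, 5, 6}, {1, 4}, {2, 5}, {3, 6}]
print(greedy_set_cover(U, S))   # [0, 1] -- here greedy happens to find the optimum
```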
Want another example?  OK then, consider the “Boolean Max-k-CSP” problem: that is, the problem of setting n bits so as to satisfy the maximum number of constraints, where each constraint can involve an arbitrary Boolean function on any k of the bits.  The best known approximation algorithm, based on semidefinite programming, is guaranteed to satisfy at least a 2k/2^k fraction of the constraints.  Can you guess where this is going?  Recently, Siu On Chan showed that it’s NP-hard to satisfy even slightly more than a 2k/2^k fraction of constraints: if you can, then P=NP.  In this case the invisible electric fence sends off its shocks at 2k/2^k.
I could multiply such examples endlessly—or at least, Dana (my source for such matters) could do so.  But there are also dozens of “weird coincidences” that involve running times rather than approximation ratios; and that strongly suggest, not only that P≠NP, but that problems like 3SAT should require c^n time for some constant c.  For a recent example—not even a particularly important one, but one that’s fresh in my memory—consider this paper by myself, Dana, and Russell Impagliazzo.  A first thing we do in that paper is to give an approximation algorithm for a family of two-prover games called “free games.”  Our algorithm runs in quasipolynomial time:  specifically, n^{O(log(n))}.  A second thing we do is show how to reduce the NP-complete 3SAT problem to free games of size ~2^{O(√n)}.
Composing those two results, you get an algorithm for 3SAT whose overall running time is roughly
$$ 2^{O( \sqrt{n} \log 2^{\sqrt{n}}) } = 2^{O(n)}. $$
Of course, this doesn’t improve on the trivial “try all possible solutions” algorithm.  But notice that, if our approximation algorithm for free games had been slightly faster—say, n^{O(log log(n))}—then we could’ve used it to solve 3SAT in $$ 2^{O(\sqrt{n} \log n)} $$ time.  Conversely, if our reduction from 3SAT had produced free games of size (say) $$ 2^{O(n^{1/3})} $$ rather than 2^{O(√n)}, then we could’ve used that to solve 3SAT in $$ 2^{O(n^{2/3})} $$ time.
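If you want the parameter-juggling behind those “what if” scenarios spelled out (this is just unwinding the numbers stated above): write N for the size of the free game that the reduction produces, so that the approximation algorithm runs in time N^{O(log N)} = 2^{O((log N)^2)}.  With the actual parameters, N = 2^{O(√n)}, so

$$ 2^{O\left((\log N)^2\right)} = 2^{O(n)}, $$

which is just the trivial bound.  A faster algorithm, running in time N^{O(log log N)}, would instead give

$$ 2^{O(\log N \cdot \log\log N)} = 2^{O\left(\sqrt{n}\,\log n\right)}, $$

while a better reduction, producing games of size N = 2^{O(n^{1/3})}, would give

$$ N^{O(\log N)} = 2^{O\left((\log N)^2\right)} = 2^{O\left(n^{2/3}\right)}. $$

Either way, 3SAT would land in subexponential time.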
I should stress that these two results have completely different proofs: the approximation algorithm for free games “doesn’t know or care” about the existence of the reduction, nor does the reduction know or care about the algorithm.  Yet somehow, their respective parameters “conspire” so that 3SAT still needs c^n time.  And you see the same sort of thing over and over, no matter which problem domain you’re interested in.  These ubiquitous “coincidences” would be immediately explained if 3SAT actually did require c^n time—i.e., if it had a “hard core” for which brute-force search was unavoidable, no matter which way you sliced things up.  If that’s not true—i.e., if 3SAT has a subexponential algorithm—then we’re left with unexplained “spooky action at a distance.”  How do the algorithms and the reductions manage to coordinate with each other, every single time, to avoid spilling the subexponential secret?
Notice that, contrary to Luboš’s loud claims, there’s no “symmetry” between P=NP and P≠NP in these arguments.  Lower bound proofs are much harder to come across than either algorithms or reductions, and there’s not really a mystery about why: it’s hard to prove a negative!  (Especially when you’re up against known mathematical barriers, including relativization, algebrization, and natural proofs.)  In other words, even under the assumption that lower bound proofs exist, we now understand a lot about why the existing mathematical tools can’t deliver them, or can only do so for much easier problems.  Nor can I think of any example of a “spooky numerical coincidence” between two unrelated-seeming results, which would’ve yielded a proof of P≠NP had some parameters worked out differently.  P=NP and P≠NP can look like “symmetric” possibilities only if your symmetry is unbroken by knowledge.
Imagine a pond with small yellow frogs on one end, and large green frogs on the other.  After observing the frogs for decades, herpetologists conjecture that the populations represent two distinct species with different evolutionary histories, and are not interfertile.  Everyone realizes that to disprove this hypothesis, all it would take would be a single example of a green/yellow hybrid.  Since (for some reason) the herpetologists really care about this question, they undertake a huge program of breeding experiments, putting thousands of yellow female frogs next to green male frogs (and vice versa) during mating season, with candlelight, soft music, etc.  Nothing.
As this green vs. yellow frog conundrum grows in fame, other communities start investigating it as well: geneticists, ecologists, amateur nature-lovers, commercial animal breeders, ambitious teenagers on the science-fair circuit, and even some extralusionary physicists hoping to show up their dimwitted friends in biology.  These other communities try out hundreds of exotic breeding strategies that the herpetologists hadn’t considered, and contribute many useful insights.  They also manage to breed a larger, greener, but still yellow frog—something that, while it’s not a “true” hybrid, does have important practical applications for the frog-leg industry.  But in the end, no one has any success getting green and yellow frogs to mate.
Then one day, someone exclaims: “aha!  I just found a huge, previously-unexplored part of the pond where green and yellow frogs live together!  And what’s more, in this part, the small yellow frogs are bigger and greener than normal, and the large green frogs are smaller and yellower!”
This is exciting: the previously-sharp boundary separating green from yellow has been blurred!  Maybe the chasm can be crossed after all!
Alas, further investigation reveals that, even in the new part of the pond, the two frog populations still stay completely separate.  The smaller, yellower frogs there will mate with other small yellow frogs (even from faraway parts of the pond that they’d never ordinarily visit), but never, ever with the larger, greener frogs even from their own part.  And vice versa.  The result?  A discovery that could have falsified the original hypothesis has instead strengthened it—and precisely because it could’ve falsified it but didn’t.
Now imagine the above story repeated a few dozen more times—with more parts of the pond, a neighboring pond, sexually-precocious tadpoles, etc.  Oh, and I forgot to say this before, but imagine that doing a DNA analysis, to prove once and for all that the green and yellow frogs had separate lineages, is extraordinarily difficult.  But the geneticists know why it’s so difficult, and the reasons have more to do with the limits of their sequencing machines and with certain peculiarities of frog DNA, than with anything about these specific frogs.  In fact, the geneticists did get the sequencing machines to work for the easier cases of turtles and snakes—and in those cases, their results usually dovetailed well with earlier guesses based on behavior.  So for example, where reddish turtles and bluish turtles had never been observed interbreeding, the reason really did turn out to be that they came from separate species.  There were some surprises, of course, but nothing even remotely as shocking as seeing the green and yellow frogs suddenly getting it on.
Now, even after all this, someone could saunter over to the pond and say: “ha, what a bunch of morons!  I’ve never even seen a frog or heard one croak, but I know that you haven’t proved anything!  For all you know, the green and yellow frogs will start going at it tomorrow.  And don’t even tell me about ‘the weight of evidence,’ blah blah blah.  Biology is a scummy mud-discipline.  It has no ideas or principles; it’s just a random assortment of unrelated facts.  If the frogs started mating tomorrow, that would just be another brute, arbitrary fact, no more surprising or unsurprising than if they didn’t start mating tomorrow.  You jokers promote the ideology that green and yellow frogs are separate species, not because the evidence warrants it, but just because it’s a convenient way to cover up your own embarrassing failure to get them to mate.  I could probably breed them myself in ten minutes, but I have better things to do.”
At this, a few onlookers might nod appreciatively and say: “y’know, that guy might be an asshole, but let’s give him credit: he’s unafraid to speak truth to competence.”
Even among the herpetologists, a few might beat their breasts and announce: “Who’s to say he isn’t right?  I mean, what do we really know?  How do we know there even is a pond, or that these so-called ‘frogs’ aren’t secretly giraffes?  I, at least, have some small measure of wisdom, in that I know that I know nothing.”
What I want you to notice is how scientifically worthless all of these comments are.  If you wanted to do actual research on the frogs, then regardless of which sympathies you started with, you’d have no choice but to ignore the naysayers, and proceed as if the yellow and green frogs were different species.  Sure, you’d have in the back of your mind that they might be the same; you’d be ready to adjust your views if new evidence came in.  But for now, the theory that there’s just one species, divided into two subgroups that happen never to mate despite living in the same habitat, fails miserably at making contact with any of the facts that have been learned.  It leaves too much unexplained; in fact it explains nothing.
For all that, you might ask, don’t the naysayers occasionally turn out to be right?  Of course they do!  But if they were right more than occasionally, then science wouldn’t be possible.  We would still be in caves, beating our breasts and asking how we can know that frogs aren’t secretly giraffes.
So, that’s what I think about P and NP.  Do I expect this post to convince everyone?  No—but to tell you the truth, I don’t want it to.  I want it to convince most people, but I also want a few to continue speculating that P=NP.
Why, despite everything I’ve said, do I want maybe-P=NP-ism not to die out entirely?  Because alongside the P=NP carpers, I also often hear from a second group of carpers.  This second group says that P and NP are so obviously, self-evidently unequal that the quest to separate them with mathematical rigor is quixotic and absurd.  Theoretical computer scientists should quit wasting their time struggling to understand truths that don’t need to be understood, but only accepted, and do something useful for the world.  (A natural generalization of this view, I guess, is that all basic science should end.)  So, what I really want is for the two opposing groups of naysayers to keep each other in check, so that those who feel impelled to do so can get on with the fascinating quest to understand the ultimate limits of computation.
* * *
Update (March 8): At least eight readers have by now emailed me, or left comments, asking why I’m wasting so much time and energy arguing with Luboš Motl.  Isn’t it obvious that, ever since he stopped doing research around 2006 (if not earlier), this guy has completely lost his marbles?  That he’ll never, ever change his mind about anything?
Yes.  In fact, I’ve noticed repeatedly that, even when Luboš is wrong about a straightforward factual matter, he never really admits error: he just switches, without skipping a beat, to some other way to attack his interlocutor.  (To give a small example: watch how he reacts to being told that graph isomorphism is neither known nor believed to be NP-complete.  Caught making a freshman-level error about the field he’s attacking, he simply rants about how graph isomorphism is just as “representative” and “important” as NP-complete problems anyway, since no discrete math question is ever more or less “important” than any other; they’re all equally contrived and arbitrary.  At the Luboš casino, you lose even when you win!  The only thing you can do is stop playing and walk away.)
Anyway, my goal here was never to convince Luboš.  I was writing, not for him, but for my other readers: especially for those genuinely unfamiliar with these interesting issues, or intimidated by Luboš’s air of certainty.  I felt like I owed it to them to set out, clearly and forcefully, certain facts that all complexity theorists have encountered in their research, but that we hardly ever bother to articulate.  If you’ve never studied physics, then yes, it sounds crazy that there would be quadrillions of invisible neutrinos coursing through your body.  And if you’ve never studied computer science, it sounds crazy that there would be an “invisible electric fence,” again and again just barely separating what the state-of-the-art approximation algorithms can handle from what the state-of-the-art PCP tools can prove is NP-complete.  But there it is, and I wanted everyone else at least to see what the experts see, so that their personal judgments about the likelihood of P=NP could be informed by seeing it.
Luboš’s response to my post disappointed me (yes, really!).  I expected it to be nasty and unhinged, and so it was.  What I didn’t expect was that it would be so intellectually lightweight.  Confronted with the total untenability of his foot-stomping distinction between “continuous math” (where you can have justified beliefs without proof) and “discrete math” (where you can’t), and with exactly the sorts of “detailed, confirmed predictions” of the P≠NP hypothesis that he’d declared impossible, Luboš’s response was simply to repeat his original misconceptions, but louder.
And that brings me, I confess, to a second reason for my engagement with Luboš.  Several times, I’ve heard people express sentiments like:
Yes, of course Luboš is a raging jerk and a social retard.  But if you can just get past that, he’s so sharp and intellectually honest!  No matter how many people he needlessly offends, he always tells it like it is.
I want the nerd world to see—in as stark a situation as possible—that the above is not correct.  Luboš is wrong much of the time, and he’s intellectually dishonest.
At one point in his post, Luboš actually compares computer scientists who find P≠NP a plausible working hypothesis to his even greater nemesis: the “climate cataclysmic crackpots.”  (Strangely, he forgot to compare us to feminists, Communists, Muslim terrorists, or loop quantum gravity theorists.)  Even though the P versus NP and global warming issues might not seem closely linked, part of me is thrilled that Luboš has connected them as he has.  If, after seeing this ex-physicist’s “thought process” laid bare on the P versus NP problem—how his arrogance and incuriosity lead him to stake out a laughably-absurd position; how his vanity then causes him to double down after his errors are exposed—if, after seeing this, a single person is led to question Lubošian epistemology more generally, then my efforts will not have been in vain.
Anyway, now that I’ve finally unmasked Luboš—certainly to my own satisfaction, and I hope to that of most scientifically-literate readers—I’m done with this.  The physicist John Baez is rumored to have said: “It’s not easy to ignore Luboš, but it’s ALWAYS worth the effort.”  It took me eight years, but I finally see the multiple layers of profundity hidden in that snark.
And thus I make the following announcement:
For the next three years, I, Scott Aaronson, will not respond to anything Luboš says, nor will I allow him to comment on this blog.
In March 2017, I’ll reassess my Luboš policy.  Whether I relent will depend on a variety of factors—including whether Luboš has gotten the professional help he needs (from a winged pig, perhaps?) and changed his behavior; but also, how much my own quality of life has improved in the meantime.
* * *
Another Update (3/11): There’s some further thoughtful discussion of this post over on Reddit.
* * *
Another Update (3/13): Check out my MathOverflow question directly inspired by the comments on this post.
* * *
Yet Another Update (3/17): Dick Lipton and Ken Regan now have a response up to this post. My own response is coming soon in their comment section. For now, check out an excellent comment by Timothy Gowers, which begins “I firmly believe that P≠NP,” then plays devil’s-advocate by exploring the possibility of what, in this comment thread, I called P being ‘severed in two,’ then finally returns to reasons for believing that P≠NP after all.
Two months ago, commenter rrtucci asked me what I thought about Max Tegmark and his “Mathematical Universe Hypothesis”: the idea, which Tegmark defends in his recent book Our Mathematical Universe, that physical and mathematical existence are the same thing, and that what we call “the physical world” is simply one more mathematical structure, alongside the dodecahedron and so forth.  I replied as follows:
…I find Max a fascinating person, a wonderful conference organizer, someone who’s always been extremely nice to me personally, and an absolute master at finding common ground with his intellectual opponents—I’m trying to learn from him, and hope someday to become 10^-122 as good.  I can also say that, like various other commentators (e.g., Peter Woit), I personally find the “Mathematical Universe Hypothesis” to be devoid of content.
After Peter Woit found that comment and highlighted it on his own blog, my comments section was graced by none other than Tegmark himself, who wrote:
Thanks Scott for your all to [sic] kind words!  I very much look forward to hearing what you think about what I actually say in the book once you’ve had a chance to read it!  I’m happy to give you a hardcopy (which can double as door-stop) – just let me know.
With this reply, Max illustrated perfectly why I’ve been trying to learn from him, and how far I fall short.  Where I would’ve said “yo dumbass, why don’t you read my book before spouting off?,” Tegmark gracefully, diplomatically shamed me into reading his book.
So, now that I’ve done so, what do I think?  Briefly, I think it’s a superb piece of popular science writing—stuffed to the gills with thought-provoking arguments, entertaining anecdotes, and fascinating facts.  I think everyone interested in math, science, or philosophy should buy the book and read it.  And I still think the MUH is basically devoid of content, as it stands.
Let me start with what makes the book so good.  First and foremost, the personal touch.  Tegmark deftly conveys the excitement of being involved in the analysis of the cosmic microwave background fluctuations—of actually getting detailed numerical data about the origin of the universe.  (The book came out just a few months before last week’s bombshell announcement of B-modes in the CMB data; presumably the next edition will have an update about that.)  And Tegmark doesn’t just give you arguments for the Many-Worlds Interpretation of quantum mechanics; he tells you how he came to believe it.  He writes of being a beginning PhD student at Berkeley, living at International House (and dating an Australian exchange student who he met his first day at IHouse), who became obsessed with solving the quantum measurement problem, and who therefore headed to the physics library, where he was awestruck by reading the original Many-Worlds articles of Hugh Everett and Bryce deWitt.  As it happens, every single part of the last sentence also describes me (!!!)—except that the Australian exchange student who I met my first day at IHouse lost interest in me when she decided that I was too nerdy.  And also, I eventually decided that the MWI left me pretty much as confused about the measurement problem as before, whereas Tegmark remains a wholehearted Many-Worlder.
The other thing I loved about Tegmark’s book was its almost comical concreteness.  He doesn’t just metaphorically write about “knobs” for adjusting the constants of physics: he shows you a picture of a box with the knobs on it.  He also shows a “letter” that lists not only his street address, zip code, town, state, and country, but also his planet, Hubble volume, post-inflationary bubble, quantum branch, and mathematical structure.  Probably my favorite figure was the one labeled “What Dark Matter Looks Like / What Dark Energy Looks Like,” which showed two blank boxes.
Sometimes Tegmark seems to subtly subvert the conventions of popular-science writing.  For example, in the first chapter, he includes a table that categorizes each of the book’s remaining chapters as “Mainstream,” “Controversial,” or “Extremely Controversial.”  And whenever you’re reading the text and cringing at a crucial factual point that was left out, chances are good you’ll find a footnote at the bottom of the page explaining that point.  I hope both of these conventions become de rigueur for all future pop-science books, but I’m not counting on it.
The book has what Tegmark himself describes as a “Dr. Jekyll / Mr. Hyde” structure, with the first (“Dr. Jekyll”) half of the book relaying more-or-less accepted discoveries in physics and cosmology, and the second (“Mr. Hyde”) half focusing on Tegmark’s own Mathematical Universe Hypothesis (MUH).  Let’s accept that both halves are enjoyable reads, and that the first half contains lots of wonderful science.  Is there anything worth saying about the truth or falsehood of the MUH?
In my view, the MUH gestures toward two points that are both correct and important—neither of them new, but both well worth repeating in a pop-science book.  The first is that the laws of physics aren’t “suggestions,” which the particles can obey when they feel like it but ignore when Uri Geller picks up a spoon.  In that respect, they’re completely unlike human laws, and the fact that we use the same word for both is unfortunate.  Nor are the laws merely observed correlations, as in “scientists find link between yogurt and weight loss.”  The links of fundamental physics are ironclad: the world “obeys” them in much the same sense that a computer obeys its code, or the positive integers obey the rules of arithmetic.  Of course we don’t yet know the complete program describing the state evolution of the universe, but everything learned since Galileo leads one to expect that such a program exists.  (According to quantum mechanics, the program describing our observed reality is a probabilistic one, but for me, that fact by itself does nothing to change its lawlike character.  After all, if you know the initial state, Hamiltonian, and measurement basis, then quantum mechanics gives you a perfect algorithm to calculate the probabilities.)
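To make that last parenthetical concrete, here is a minimal numerical illustration (a toy of my own devising, not anything from Tegmark’s book) of the “perfect algorithm”: pick an initial state and a Hamiltonian, diagonalize, evolve, and read the outcome probabilities off the Born rule.

```python
import numpy as np

# Toy two-level system: initial state |0>, Hamiltonian = Pauli-X, evolution time t.
psi0 = np.array([1.0, 0.0], dtype=complex)
H = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)   # Pauli-X
t = 0.7

# Schrodinger evolution: for Hermitian H, exp(-iHt) = V diag(exp(-i*lambda*t)) V^dagger.
evals, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(-1j * evals * t)) @ V.conj().T
psi_t = U @ psi0

# Born rule in the computational basis {|0>, |1>}: probabilities are |amplitude|^2.
probs = np.abs(psi_t) ** 2
print(probs, probs.sum())   # approximately [cos^2(t), sin^2(t)], summing to 1
```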
The second true and important nugget in the MUH is that the laws are “mathematical.”  By itself, I’d say that’s a vacuous statement, since anything that can be described at all can be described mathematically.  (As a degenerate case, a “mathematical description of reality” could simply be a gargantuan string of bits, listing everything that will ever happen at every point in spacetime.)  The nontrivial part is that, at least if we ignore boundary conditions and the details of our local environment (which maybe we shouldn’t!), the laws of nature are expressible as simple, elegant math—and moreover, the same structures (complex numbers, group representations, Riemannian manifolds…) that mathematicians find important for internal reasons, again and again turn out to play a crucial role in physics.  It didn’t have to be that way, but it is.
Putting the two points together, it seems fair to say that the physical world is “isomorphic to” a mathematical structure—and moreover, a structure whose time evolution obeys simple, elegant laws.   All of this I find unobjectionable: if you believe it, it doesn’t make you a Tegmarkian; it makes you ready for freshman science class.
But Tegmark goes further.  He doesn’t say that the universe is “isomorphic” to a mathematical structure; he says that it is that structure, that its physical and mathematical existence are the same thing.  Furthermore, he says that every mathematical structure “exists” in the same sense that “ours” does; we simply find ourselves in one of the structures capable of intelligent life (which shouldn’t surprise us).  Thus, for Tegmark, the answer to Stephen Hawking’s famous question—“What is it that breathes fire into the equations and gives them a universe to describe?”—is that every consistent set of equations has fire breathed into it.  Or rather, every mathematical structure of at most countable cardinality whose relations are definable by some computer program.  (Tegmark allows that structures that aren’t computably definable, like the set of real numbers, might not have fire breathed into them.)
Anyway, the ensemble of all (computable?) mathematical structures, constituting the totality of existence, is what Tegmark calls the “Level IV multiverse.”  In his nomenclature, our universe consists of anything from which we can receive signals; anything that exists but that we can’t receive signals from is part of a “multiverse” rather than our universe.  The “Level I multiverse” is just the entirety of our spacetime, including faraway regions from which we can never receive a signal due to the dark energy.  The Level II multiverse consists of the infinitely many other “bubbles” (i.e., “local Big Bangs”), with different values of the constants of physics, that would, in eternal inflation cosmologies, have generically formed out of the same inflating substance that gave rise to our Big Bang.  The Level III multiverse is Everett’s many worlds.  Thus, for Tegmark, the Level IV multiverse is a sort of natural culmination of earlier multiverse theorizing.  (Some people might call it a reductio ad absurdum, but Tegmark is nothing if not a bullet-swallower.)
Now, why should you believe in any of these multiverses?  Or better: what does it buy you to believe in them?
As Tegmark correctly points out, none of the multiverses are “theories,” but they might be implications of theories that we have other good reasons to accept.  In particular, it seems crazy to believe that the Big Bang created space only up to the furthest point from which light can reach the earth, and no further.  So, do you believe that space extends further than our cosmological horizon?  Then boom! you believe in the Level I multiverse, according to Tegmark’s definition of it.
Likewise, do you believe there was a period of inflation in the first ~10^-32 seconds after the Big Bang?  Inflation has made several confirmed predictions (e.g., about the “fractal” nature of the CMB perturbations), and if last week’s announcement of B-modes in the CMB is independently verified, that will pretty much clinch the case for inflation.  But Alan Guth, Andrei Linde, and others have argued that, if you accept inflation, then it seems hard to prevent patches of the inflating substance from continuing to inflate forever, and thereby giving rise to infinitely many “other” Big Bangs.  Furthermore, if you accept string theory, then the six extra dimensions should generically curl up differently in each of those Big Bangs, giving rise to different apparent values of the constants of physics.  So then boom! with those assumptions, you’re sold on the Level II multiverse as well.  Finally, of course, there are people (like David Deutsch, Eliezer Yudkowsky, and Tegmark himself) who think that quantum mechanics forces you to accept the Level III multiverse of Everett.  Better yet, Tegmark claims that these multiverses are “falsifiable.”  For example, if inflation turns out to be wrong, then the Level II multiverse is dead, while if quantum mechanics is wrong, then the Level III one is dead.
Admittedly, the Level IV multiverse is a tougher sell, even by the standards of the last two paragraphs.  If you believe physical existence to be the same thing as mathematical existence, what puzzles does that help to explain?  What novel predictions does it make?  Forging fearlessly ahead, Tegmark argues that the MUH helps to “explain” why our universe has so many mathematical regularities in the first place.  And it “predicts” that more mathematical regularities will be discovered, and that everything discovered by science will be mathematically describable.  But what about the existence of other mathematical universes?  If, Tegmark says (on page 354), our qualitative laws of physics turn out to allow a narrow range of numerical constants that permit life, whereas other possible qualitative laws have no range of numerical constants that permit life, then that would be evidence for the existence of a mathematical multiverse.  For if our qualitative laws were the only ones into which fire had been breathed, then why would they just so happen to have a narrow but nonempty range of life-permitting constants?
I suppose I’m not alone in finding this totally unpersuasive.  When most scientists say they want “predictions,” they have in mind something meatier than “predict the universe will continue to be describable by mathematics.”  (How would we know if we found something that wasn’t mathematically describable?  Could we even describe such a thing with English words, in order to write papers about it?)  They also have in mind something meatier than “predict that the laws of physics will be compatible with the existence of intelligent observers, but if you changed them a little, then they’d stop being compatible.”  (The first part of that prediction is solid enough, but the second part might depend entirely on what we mean by a “little change” or even an “intelligent observer.”)
What’s worse is that Tegmark’s rules appear to let him have it both ways.  To whatever extent the laws of physics turn out to be “as simple and elegant as anyone could hope for,” Tegmark can say: “you see?  that’s evidence for the mathematical character of our universe, and hence for the MUH!”  But to whatever extent the laws turn out not to be so elegant, to be weird or arbitrary, he can say: “see?  that’s evidence that our laws were selected more-or-less randomly among all possible laws compatible with the existence of intelligent life—just as the MUH predicted!”
Still, maybe the MUH could be sharpened to the point where it did make definite predictions?  As Tegmark acknowledges, the central difficulty with doing so is that no one has any idea what measure to use over the space of mathematical objects (or even computably-describable objects).  This becomes clear if we ask a simple question like: what fraction of the mathematical multiverse consists of worlds that contain nothing but a single three-dimensional cube?
We could try to answer such a question using the universal prior: that is, we could make a list of all self-delimiting computer programs, then count the total weight of programs that generate a single cube and then halt, where each n-bit program gets assigned 1/2^n weight.  Sure, the resulting fraction would be uncomputable, but at least we’d have defined it.  Except wait … which programming language should we use?  (The constant factors could actually matter here!)  Worse yet, what exactly counts as a “cube”?  Does it have to have faces, or are vertices and edges enough?  How should we interpret the string of 1’s and 0’s output by the program, in order to know whether it describes a cube or not?  (Also, how do we decide whether two programs describe the “same” cube?  And if they do, does that mean they’re describing the same universe, or two different universes that happen to be identical?)
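To make the arbitrariness concrete, here is a toy sketch (entirely my own, not anything Tegmark or anyone else proposes) of the kind of weighted counting the universal prior asks for.  “Programs” are bitstrings decoded three bits at a time into an eight-instruction Brainfuck-style language; each n-bit program gets weight 1/2^n; and we add up the weights of programs that halt within a step budget (halting being undecidable, this only ever gives a lower estimate) and whose output satisfies a stand-in predicate for “describes a cube.”  Every one of those choices, from the language and the encoding to the step budget and the predicate, is exactly the kind of arbitrary decision the measure ends up depending on.

```python
from itertools import product

OPS = "+-<>[].,"          # 8 instructions, so each one costs 3 bits

def run(code, max_steps=200):
    """Run a Brainfuck-style program with no input; return its output bytes, or None."""
    # Pre-match brackets; reject unbalanced programs outright.
    stack, match = [], {}
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            if not stack:
                return None
            j = stack.pop()
            match[i], match[j] = j, i
    if stack:
        return None
    tape, ptr, pc, out, steps = [0] * 64, 0, 0, [], 0
    while pc < len(code):
        steps += 1
        if steps > max_steps:
            return None                        # treat as non-halting (undecidable in general!)
        c = code[pc]
        if c == '+':   tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '>': ptr = (ptr + 1) % 64
        elif c == '<': ptr = (ptr - 1) % 64
        elif c == '.': out.append(tape[ptr])
        elif c == ',': tape[ptr] = 0           # no input stream in this toy
        elif c == '[' and tape[ptr] == 0: pc = match[pc]
        elif c == ']' and tape[ptr] != 0: pc = match[pc]
        pc += 1
    return bytes(out)

def describes_a_cube(output):
    # Stand-in predicate: what *should* count as "describing a cube"?  Here, arbitrarily,
    # any nonempty output.  The whole point is that the answer hinges on choices like this.
    return output is not None and len(output) > 0

total_weight = 0.0
for k in range(1, 5):                          # programs of 1 to 4 instructions (3 to 12 bits)
    for instrs in product(OPS, repeat=k):
        if describes_a_cube(run("".join(instrs))):
            total_weight += 2.0 ** (-3 * k)    # each n-bit program gets weight 1/2^n
print("toy lower estimate of the 'cube' weight:", total_weight)
```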
These problems are simply more-dramatic versions of the “standard” measure problem in inflationary cosmology, which asks how to make statistical predictions in a multiverse where everything that can happen will happen, and will happen an infinite number of times.  The measure problem is sometimes discussed as if it were a technical issue: something to acknowledge but then set to the side, in the hope that someone will eventually come along with some clever counting rule that solves it.  To my mind, however, the problem goes deeper: it’s a sign that, although we might have started out in physics, we’ve now stumbled into metaphysics.
Some cosmologists would strongly protest that view.  Most of them would agree with me that Tegmark’s Level IV multiverse is metaphysics, but they’d insist that the Level I, Level II, and perhaps Level III multiverses were perfectly within the scope of scientific inquiry: they either exist or don’t exist, and the fact that we get confused about the measure problem is our issue, not nature’s.
My response can be summed up in a question: why not ride this slippery slope all the way to the bottom?  Thinkers like Nick Bostrom and Robin Hanson have pointed out that, in the far future, we might expect that computer-simulated worlds (as in The Matrix) will vastly outnumber the “real” world.  So then, why shouldn’t we predict that we’re much more likely to live in a computer simulation than we are in one of the “original” worlds doing the simulating?  And as a logical next step, why shouldn’t we do physics by trying to calculate a probability measure over different kinds of simulated worlds: for example, those run by benevolent simulators versus evil ones?  (For our world, my own money’s on “evil.”)
But why stop there?  As Tegmark points out, what does it matter if a computer simulation is actually run or not?  Indeed, why shouldn’t you say something like the following: assuming that π is a normal number, your entire life history must be encoded infinitely many times in π’s decimal expansion.  Therefore, you’re infinitely more likely to be one of your infinitely many doppelgängers “living in the digits of π” than you are to be the “real” you, of whom there’s only one!  (Of course, you might also be living in the digits of e or √2, possibilities that also merit reflection.)
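For a microscopic version of that claim, here is a little Python doodle of mine (using the mpmath library; the particular digit string searched for is an arbitrary choice of mine): if π is normal, every finite digit string eventually shows up, but a k-digit string typically takes on the order of 10^k digits to appear, which is why encoding an entire life history would require stretches of π vastly longer than anything we could ever compute.

```python
import mpmath

mpmath.mp.dps = 100_010                                  # working precision, in decimal digits
pi_digits = mpmath.nstr(mpmath.mp.pi, 100_000).split(".", 1)[1]

target = "1939"                                          # arbitrary 4-digit string (my choice)
print(pi_digits.find(target))   # index of the first occurrence, or -1 if absent
# A 4-digit string is all but guaranteed to appear somewhere in 100,000 digits;
# a k-digit string typically needs on the order of 10^k digits.
```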
At this point, of course, you’re all the way at the bottom of the slope, in Mathematical Universe Land, where Tegmark is eagerly waiting for you.  But you still have no idea how to calculate a measure over mathematical objects: for example, how to say whether you’re more likely to be living in the first 10^(10^120) digits of π, or the first 10^(10^120) digits of e.  And as a consequence, you still don’t know how to use the MUH to constrain your expectations for what you’re going to see next.
Now, notice that these different ways down the slippery slope all have a common structure:
1. We borrow an idea from science that’s real and important and profound: for example, the possible infinite size and duration of our universe, or inflationary cosmology, or the linearity of quantum mechanics, or the likelihood of π being a normal number, or the possibility of computer-simulated universes.
2. We then run with that idea until we smack right into a measure problem, and lose the ability to make useful predictions.
Many people want to frame the multiverse debates as “science versus pseudoscience,” or “science versus science fiction,” or (as I did before) “physics versus metaphysics.”  But actually, I don’t think any of those dichotomies get to the nub of the matter.  All of the multiverses I’ve mentioned—certainly the inflationary and Everett multiverses, but even the computer-simuverse and the π-verse—have their origins in legitimate scientific questions and in genuinely-great achievements of science.  However, they then extrapolate those achievements in a direction that hasn’t yet led to anything impressive.  Or at least, not to anything that we couldn’t have gotten without the ontological commitments that led to the multiverse and its measure problem.
What is it, in general, that makes a scientific theory impressive?  I’d say that the answer is simple: connecting elegant math to actual facts of experience.
When Einstein said, the perihelion of Mercury precesses at 43 seconds of arc per century because gravity is the curvature of spacetime—that was impressive.
When Dirac said, you should see a positron because this equation in quantum field theory is a quadratic with both positive and negative solutions (and then the positron was found)—that was impressive.
When Darwin said, there must be equal numbers of males and females in all these different animal species because any other ratio would fail to be an equilibrium—that was impressive.
When people say that multiverse theorizing “isn’t science,” I think what they mean is that it’s failed, so far, to be impressive science in the above sense.  It hasn’t yet produced any satisfying clicks of understanding, much less dramatically-confirmed predictions.  Yes, Steven Weinberg kind-of, sort-of used “multiverse” reasoning to predict—correctly—that the cosmological constant should be nonzero.  But as far as I can tell, he could just as well have dispensed with the “multiverse” part, and said: “I see no physical reason why the cosmological constant should be zero, rather than having some small nonzero value still consistent with the formation of stars and galaxies.”
At this, many multiverse proponents would protest: “look, Einstein, Dirac, and Darwin are setting a pretty high bar!  Those guys were smart but also lucky, and it’s unrealistic to expect that scientists will always be so lucky.  For many aspects of the world, there might not be an elegant theoretical explanation—or any explanation at all better than, ‘well, if it were much different, then we probably wouldn’t be here talking about it.’  So, are you saying we should ignore where the evidence leads us, just because of some a-priori prejudice in favor of mathematical elegance?”
In a sense, yes, I am saying that.  Here’s an analogy: suppose an aspiring filmmaker said, “I want my films to capture the reality of human experience, not some Hollywood myth.  So, in most of my movies nothing much will happen at all.  If something does happen—say, a major character dies—it won’t be after some interesting, character-forming struggle, but meaninglessly, in a way totally unrelated to the rest of the film.  Like maybe they get hit by a bus.  Then some other random stuff will happen, and then the movie will end.”
Such a filmmaker, I’d say, would have a perfect plan for creating boring, arthouse movies that nobody wants to watch.  Dramatic, character-forming struggles against the odds might not be the norm of human experience, but they are the central ingredient of entertaining cinema—so if you want to create an entertaining movie, then you have to postselect on those parts of human experience that do involve dramatic struggles.  In the same way, I claim that elegant mathematical explanations for observed facts are the central ingredient of great science.  Not everything in the universe might have such an explanation, but if one wants to create great science, one has to postselect on the things that do.
(Note that there’s an irony here: the same unsatisfyingness, the same lack of explanatory oomph, that make something a “lousy movie” to those with a scientific mindset, can easily make it a great movie to those without such a mindset.  The hunger for nontrivial mathematical explanations is a hunger one has to acquire!)
Some readers might argue: “but weren’t quantum mechanics, chaos theory, and Gödel’s theorem scientifically important precisely because they said that certain phenomena—the exact timing of a radioactive decay, next month’s weather, the bits of Chaitin’s Ω—were unpredictable and unexplainable in fundamental ways?”  To me, these are the exceptions that prove the rule.  Quantum mechanics, chaos, and Gödel’s theorem were great science not because they declared certain facts unexplainable, but because they explained why those facts (and not other facts) had no explanations of certain kinds.  Even more to the point, they gave definite rules to help figure out what would and wouldn’t be explainable in their respective domains: is this state an eigenstate of the operator you’re measuring?  is the Lyapunov exponent positive?  is there a proof of independence from PA or ZFC?
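As a tiny illustration of what I mean by a “definite rule,” here is a toy Python computation (mine, not anyone else’s) of the Lyapunov exponent of the logistic map x → r·x·(1−x): for r=3.2 the exponent is negative and long-range prediction is fine, while for r=4 it is positive (about ln 2) and detailed prediction is hopeless.

```python
import math

def lyapunov(r, x0=0.4, n_transient=1000, n_samples=20_000):
    """Estimate the Lyapunov exponent of the logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(n_transient):                  # discard transient behavior
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(n_samples):
        x = r * x * (1 - x)
        total += math.log(abs(r * (1 - 2 * x)))   # log of the local stretching factor
    return total / n_samples

print(lyapunov(3.2))   # negative (about -0.9): settles into a period-2 cycle, predictable
print(lyapunov(4.0))   # positive (about +0.69, i.e. ln 2): chaotic
```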
So, what would be the analogue of the above for the multiverse?  Is there any Level II or IV multiverse hypothesis that says: sure, the mass of the electron might be a cosmic accident, with at best an anthropic explanation, but the mass of the Higgs boson is almost certainly not such an accident?  Or that the sum or difference of the two masses is not an accident?  (And no, it doesn’t count to affirm as “non-accidental” things that we already have non-anthropic explanations for.)  If such a hypothesis exists, tell me in the comments!  As far as I know, all Level II and IV multiverse hypotheses are still at the stage where basically anything that isn’t already explained might vary across universes and be anthropically selected.  And that, to my mind, makes them very different in character from quantum mechanics, chaos, or Gödel’s theorem.
In summary, here’s what I feel is a reasonable position to take right now, regarding all four of Tegmark’s multiverse levels (not to mention the computer-simuverse, which I humbly propose as Level 3.5):
Yes, these multiverses are a perfectly fine thing to speculate about: sure they’re unobservable, but so are plenty of other entities that science has forced us to accept.  There are even natural reasons, within physics and cosmology, that could lead a person to speculate about each of these multiverse levels.  So if you want to speculate, knock yourself out!  If, however, you want me to accept the results as more than speculation—if you want me to put them on the bookshelf next to Darwin and Einstein—then you’ll need to do more than argue that other stuff I already believe logically entails a multiverse (which I’ve never been sure about), or point to facts that are currently unexplained as evidence that we need a multiverse to explain their unexplainability, or claim as triumphs for your hypothesis things that don’t really need the hypothesis at all, or describe implausible hypothetical scenarios that could confirm or falsify the hypothesis.  Rather, you’ll need to use your multiverse hypothesis—and your proposed solution to the resulting measure problem—to do something new that impresses me.
Update (April 5): By now, three or four people have written in asking for my reaction to the preprint “Computational solution to quantum foundational problems” by Arkady Bolotin.  (See here for the inevitable Slashdot discussion, entitled “P vs. NP Problem Linked to the Quantum Nature of the Universe.”)  It gives me no pleasure to respond to this sort of thing—it would be far better to let papers this gobsmackingly uninformed about the relevant issues fade away in quiet obscurity—but since that no longer seems to be possible in the age of social media, my brief response is here.
* * *
(note: sorry, no April Fools post, just a post that happens to have gone up on April Fools)
This weekend, Dana and I celebrated our third anniversary by going out to your typical sappy romantic movie: Particle Fever, a documentary about the Large Hadron Collider.  As it turns out, the movie was spectacularly good; anyone who reads this blog should go see it.  Or, to offer even higher praise:
If watching Particle Fever doesn’t cause you to feel in your bones the value of fundamental science—the thrill of discovery, unmotivated by any application—then you are not truly human.  You are a barnyard animal who happens to walk on its hind legs.
Indeed, I regard Particle Fever as one of the finest advertisements for science itself ever created.  It’s effective precisely because it doesn’t try to tell you why science is important (except for one scene, where an economist asks a physicist after a public talk about the “return on investment” of the LHC, and is given the standard correct answer, about “what was the return on investment of radio waves when they were first discovered?”).  Instead, the movie simply shows you the lives of particle physicists, of people who take for granted the urgency of knowing the truth about the basic constituents of reality.  And in showing you the scientists’ quest, it makes you feel as they feel.  Incidentally, the movie also shows footage of Congressmen ridiculing the uselessness of the Superconducting Supercollider, during the debates that led to the SSC’s cancellation.  So, gently, implicitly, you’re invited to choose: whose side are you on?
I do have a few, not quite criticisms of the movie, but points that any viewer should bear in mind while watching it.
First, it’s important not to come away with the impression that Particle Fever shows “what science is usually like.”  Sure, there are plenty of scenes that any scientist would find familiar: sleep-deprived postdocs; boisterous theorists correcting each other’s statements over Chinese food; a harried lab manager walking to the office oblivious to traffic.  On the other hand, the decades-long quest to find the Higgs boson, the agonizing drought of new data before the one big money shot, the need for an entire field to coalesce around a single machine, the whole careers hitched to specific speculative scenarios that this one machine could favor or disfavor—all of that is a profoundly abnormal situation in the history of science.  Particle physics didn’t used to be that way, and other parts of science are not that way today.  Of course, the fact that particle physics became that way makes it unusually suited for a suspenseful movie—a fact that the creators of Particle Fever understood perfectly and exploited to the hilt.
Second, the movie frames the importance of the Higgs search as follows: if the Higgs boson turned out to be relatively light, like 115 GeV, then that would favor supersymmetry, and hence an “elegant, orderly universe.”  If, on the other hand, the Higgs turned out to be relatively heavy, like 140 GeV, then that would favor anthropic multiverse scenarios (and hence a “messy, random universe”).  So the fact that the Higgs ended up being 125 GeV means the universe is coyly refusing to tell us whether it’s orderly or random, and more research is needed.
In my view, it’s entirely appropriate for a movie like this one to relate its subject matter to big, metaphysical questions, to the kinds of questions anyone can get curious about (in contrast to, say, “what is the mechanism of electroweak symmetry breaking?”) and that the scientists themselves talk about anyway.  But caution is needed here.  My lay understanding, which might be wrong, is as follows: while it’s true that a lighter Higgs would tend to favor supersymmetric models, the only way to argue that a heavier Higgs would “favor the multiverse,” is if you believe that a multiverse is automatically favored by a lack of better explanations.  More broadly, I wish the film had made clearer that the explanation for (some) apparent “fine-tunings” in the Standard Model might be neither supersymmetry, nor the multiverse, nor “it’s just an inexplicable accident,” but simply some other explanation that no one has thought of yet, but that would emerge from a better understanding of quantum field theory.  As one example, on reading up on the subject after watching the film, I was surprised to learn that a very conservative-sounding idea—that of “asymptotically safe gravity”—was used in 2009 to predict the Higgs mass right on the nose, at 126.3 ± 2.2 GeV.  Of course, it’s possible that this was just a lucky guess (there were, after all, lots of Higgs mass predictions).  But as an outsider, I’d love to understand why possibilities like this don’t seem to get discussed more (there might, of course, be perfectly good reasons that I don’t know).
Third, for understandable dramatic reasons, the movie focuses almost entirely on the “younger generation,” from postdocs working on ATLAS and CMS detectors, to theorists like Nima Arkani-Hamed who are excited about the LHC because of its ability to test scenarios like supersymmetry.  From the movie’s perspective, the creation of the Standard Model itself, in the 60s and 70s, might as well be ancient history.  Indeed, when Peter Higgs finally appears near the end of the film, it’s as if Isaac Newton has walked onstage.  At several points, I found myself wishing that some of the original architects of the Standard Model, like Steven Weinberg or Sheldon Glashow, had been interviewed to provide their perspectives.  After all, their model is really the one that’s been vindicated at the LHC, not (so far) any of the newer ideas like supersymmetry or large extra dimensions.
OK, but let me come to the main point of this post.  I confess that my overwhelming emotion on watching Particle Fever was one of regret—regret that my own field, quantum computing, has never managed to make the case for itself the way particle physics and cosmology have, in terms of the human urge to explore the unknown.
See, from my perspective, there’s a lot to envy about the high-energy physicists.  Most importantly, they don’t perceive any need to justify what they do in terms of practical applications.  Sure, they happily point to “spinoffs,” like the fact that the Web was invented at CERN.  But any time they try to justify what they do, the unstated message is that if you don’t see the inherent value of understanding the universe, then the problem lies with you.
Now, no marketing consultant would ever in a trillion years endorse such an out-of-touch, elitist sales pitch.  But the remarkable fact is that the message has more-or-less worked.  While the cancellation of the SSC was a setback, the high-energy physicists did succeed in persuading the world to pony up the $11 billion needed to build the LHC, and to gain the information that the mass of the Higgs boson is about 125 GeV.
Now contrast that with quantum computing.  To hear the media tell it, a quantum computer would be a powerful new gizmo, sort of like existing computers except faster.  (Why would it be faster?  Something to do with trying both 0 and 1 at the same time.)  The reasons to build quantum computers are things that could make any buzzword-spouting dullard nod in recognition: cracking uncrackable encryption, finding bugs in aviation software, sifting through massive data sets, maybe even curing cancer, predicting the weather, or finding aliens.  And all of this could be yours in a few short years—or some say it’s even commercially available today.  So, if you check back in a few years and it’s still not on store shelves, probably it went the way of flying cars or moving sidewalks: another technological marvel that just failed to materialize for some reason.
Foolishly, shortsightedly, many academics in quantum computing have played along with this stunted vision of their field—because saying this sort of thing is the easiest way to get funding, because everyone else says the same stuff, and because after you’ve repeated something on enough grant applications you start to believe it yourself.  All in all, then, it’s just easier to go along with the “gizmo vision” of quantum computing than to ask pointed questions like:
What happens when it turns out that some of the most-hyped applications of quantum computers (e.g., optimization, machine learning, and Big Data) were based on wildly inflated hopes—that there simply isn’t much quantum speedup to be had for typical problems of that kind, that yes, quantum algorithms exist, but they aren’t much faster than the best classical randomized algorithms?  What happens when it turns out that the real applications of quantum computing—like breaking RSA and simulating quantum systems—are nice, but not important enough by themselves to justify the cost?  (E.g., when the imminent risk of a quantum computer simply causes people to switch from RSA to other cryptographic codes?  Or when the large polynomial overheads of quantum simulation algorithms limit their usefulness?)  Finally, what happens when it turns out that the promises of useful quantum computers in 5-10 years were wildly unrealistic?
I’ll tell you: when this happens, the spigots of funding that once flowed freely will dry up, and the techno-journalists and pointy-haired bosses who once sang our praises will turn to the next craze.  And they’re unlikely to be impressed when we protest, “no, look, the reasons we told you before for why you should support quantum computing were never the real reasons!  and the real reasons remain as valid as ever!”
In my view, we as a community have failed to make the honest case for quantum computing—the case based on basic science—because we’ve underestimated the public.  We’ve falsely believed that people would never support us if we told them the truth: that while the potential applications are wonderful cherries on the sundae, they’re not and have never been the main reason to build a quantum computer.  The main reason is that we want to make absolutely manifest what quantum mechanics says about the nature of reality.  We want to lift the enormity of Hilbert space out of the textbooks, and rub its full, linear, unmodified truth in the face of anyone who denies it.  Or if it isn’t the truth, then we want to discover what is the truth.
Many people would say it’s impossible to make the latter pitch, that funders and laypeople would never understand it or buy it.  But there’s an $11-billion, 17-mile ring under Geneva that speaks against their cynicism.
Anyway, let me end this “movie review” with an anecdote.  The other day a respected colleague of mine—someone who doesn’t normally follow such matters—asked me what I thought about D-Wave.  After I’d given my usual spiel, he smiled and said:
“See Scott, but you could imagine scientists of the 1400s saying the same things about Columbus!  He had no plan that could survive academic scrutiny.  He raised money under the false belief that he could reach India by sailing due west.  And he didn’t understand what he’d found even after he’d found it.  Yet for all that, it was Columbus, and not some academic critic on the sidelines, who discovered the new world.”
With this one analogy, my colleague had eloquently summarized the case for D-Wave, a case often leveled against me much more verbosely.  But I had an answer.
“I accept your analogy!” I replied.  “But to me, Columbus and the other conquerors of the Americas weren’t heroes to be admired or emulated.  Motivated by gold and spices rather than knowledge, they spread disease, killed and enslaved millions in one of history’s greatest holocausts, and burned the priceless records of the Maya and Inca civilizations so that the world would never even understand what was lost.  I submit that, had it been undertaken by curious and careful scientists—or at least people with a scientific mindset—rather than by swashbucklers funded by greedy kings, the European exploration and colonization of the Americas could have been incalculably less tragic.”
The trouble is, when I say things like that, people just laugh at me knowingly.  There he goes again, the pie-in-the-sky complexity theorist, who has no idea what it takes to get anything done in the real world.  What an amusingly contrary perspective he has.
And that, in the end, is why I think Particle Fever is such an important movie.  Through the stories of the people who built the LHC, you’ll see how it really is possible to reach a new continent without the promise of gold or the allure of lies.
Yes, less than a week after the course itself finished, a new set of lecture notes is finally here! The topic: randomness.
I’m writing this post from über-commenter Greg Kuperberg’s office at UC Davis, where I’m visiting for a few days to give a math colloquium. Greg has been trying to fill my thick skull with something called “t-cubature formulas,” and writing this post provides me with a much-needed break!
After Davis, I’ll be going to Berkeley for a couple weeks (not that I ever really left it), then my parents’ place in Pennsylvania for the holidays, then Caltech, then New Zealand (why the hell not?), then Australia for QIP, then back to Waterloo in February. Much more relaxing than last year’s trip — note that I won’t return from this one with an (additional) 2πi phase.
So I’ve written an article about the above question for PBS’s website—a sort of tl;dr version of my 2005 survey paper NP-Complete Problems and Physical Reality, but updated with new material about the simulation of quantum field theories and about AdS/CFT.  Go over there, read the article (it’s free), then come back here to talk about it if you like.  Thanks so much to Kate Becker for commissioning the article.
In other news, there’s a profile of me at MIT News (called “The Complexonaut”) that some people might find amusing.
Oh, and anyone who thinks the main reason to care about quantum computing is that, if our civilization ever manages to surmount the profound scientific and technological obstacles to building a scalable quantum computer, then that little padlock icon on your web browser would no longer represent ironclad security?  Ha ha.  Yeah, it turns out that, besides factoring integers, you can also break OpenSSL by (for example) exploiting a memory bug in C.  The main reason to care about quantum computing is, and has always been, science.
So, I’ve written an article of that title for the wonderful American Scientist magazine—or rather, Part I of such an article.  This part explains the basics of Kolmogorov complexity and algorithmic information theory: how, under reasonable assumptions, these ideas can be used in principle to “certify” that a string of numbers was really produced randomly—something that one might’ve imagined impossible a priori.  Unfortunately, the article also explains why this fact is of limited use in practice: because Kolmogorov complexity is uncomputable!  Readers who already know this material won’t find much that’s new here, but I hope those who don’t will enjoy the piece.
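One cheap way to get a feel for the idea, even though Kolmogorov complexity itself is uncomputable: use the length of a string after ordinary compression as a crude stand-in.  The snippet below is just a toy illustration of mine, not the certification protocol the article is about, but it shows the basic phenomenon that random-looking data resists compression while patterned data collapses.

```python
import os
import zlib

n = 100_000
random_bytes = os.urandom(n)                  # high-entropy data
patterned_bytes = b"0123456789" * (n // 10)   # highly structured data

for name, data in [("random", random_bytes), ("patterned", patterned_bytes)]:
    compressed = zlib.compress(data, 9)       # compression level 9 (maximum effort)
    print(f"{name:>9}: {len(data)} bytes -> {len(compressed)} bytes compressed")
# The random bytes barely shrink at all; the patterned ones shrink to a tiny fraction.
```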
Part II, to appear in the next issue, will be all about quantum entanglement and Bell’s Theorem, and their very recent use in striking protocols for generating so-called “Einstein-certified random numbers”—something of much more immediate practical interest.
Thanks so much to Fenella Saunders of American Scientist for commissioning these articles, and my apologies to her and any interested readers for the 4.5 years (!) it took me to get off my rear end (or rather, onto it) to write these things.
* * *
Update (4/28): Kate Becker of NOVA has published an article about “whether information is fundamental to reality,” which includes some quotes from me. Enjoy!
Among all the mysteries of the universe, it’s good to know that at least one of them is answerable. My accent, apparently, is “as Philadelphian as a cheesesteak.” Hat tip to Greg Kuperberg.
Eight years ago, I put up a post entitled The Ten Most Annoying Questions in Quantum Computing.  One of the ten wasn’t a real question—it was simply a request for readers to submit questions—so let’s call it nine.  I’m delighted to say that, of the nine questions, six have by now been completely settled—most recently, my question about the parallel-repeated value of the CHSH game, which Andris Ambainis pointed out to me last week can be answered using a 2008 result of Barak et al. combined with a 2013 result of Dinur and Steurer.
To be clear, the demise of so many problems is exactly the outcome I wanted. In picking problems, my goal wasn’t to shock and awe with difficulty—as if to say “this is how smart I am, that whatever stumps me will also stump everyone else for decades.” Nor was it to showcase my bottomless profundity, by proffering questions so vague, multipartite, and open-ended that no matter what progress was made, I could always reply “ah, but you still haven’t addressed the real question!” Nor, finally, was my goal to list the biggest research directions for the entire field, the stuff everyone already knows about (“is there a polynomial-time quantum algorithm for graph isomorphism?”). My interest was exclusively in “little” questions, in weird puzzles that looked (at least at the time) like there was no deep obstruction to just killing them one by one, whichever way their answers turned out. What made them annoying was that they hadn’t succumbed already.
So, now that two-thirds of my problems have met the fate they deserved, at Andris’s suggestion I’m presenting a new list of Ten Most Annoying Questions in Quantum Computing—a list that starts with the three still-unanswered questions from the old list, and then adds seven more.
But we’ll get to that shortly. First, let’s review the six questions that have been answered.
* * *
CLOSED, NO-LONGER ANNOYING QUESTIONS IN QUANTUM COMPUTING
1\. Given an n-qubit pure state, is there always a way to apply Hadamard gates to some subset of the qubits, so as to make all 2^n computational basis states have nonzero amplitudes?  Positive answer by Ashley Montanaro and Dan Shepherd, posted to this blog in 2006.
3\. Can any QMA(2) (QMA with two unentangled yes-provers) protocol be amplified to exponentially small error probability?  Positive answer by Aram Harrow and Ashley Montanaro, from a FOCS’2010 paper.
4\. If a unitary operation U can be applied in polynomial time, then can some square root of U also be applied in polynomial time?  Positive answer by Lana Sheridan, Dmitri Maslov, and Michele Mosca, from a 2008 paper.
5\. Suppose Alice and Bob are playing n parallel CHSH games, with no communication or entanglement. Is the probability that they’ll win all n games at most p^n, for some p bounded below 0.853?
OK, let me relay what Andris Ambainis told me about this question, with Andris’s kind permission. First of all, we’ve known for a while that the optimal success probability is not the (3/4)^n that Alice and Bob could trivially achieve by just playing all n games separately. I observed in 2006 that, by correlating their strategies between pairs of games in a clever way, Alice and Bob can win with probability (√10 / 4)^n ~ 0.79^n. And Barak et al. showed in 2008 that they can win with probability ((1+√5)/4)^n ~ 0.81^n. (Unfortunately, I don’t know the actual strategy that achieves the latter bound!  Barak et al. say they’ll describe it in the full version of their paper, but the full version hasn’t yet appeared.)
Anyway, Dinur-Steurer 2013 gave a general recipe to prove that the value of a repeated projection game is at most α^n, where α is some constant that depends on the game in question. When Andris followed their recipe for the CHSH game, he obtained the result α=(1+√5)/4—thereby showing that Barak et al.’s strategy, whatever it is, is precisely optimal! Andris also observes that, for any two-prover game G, the Dinur-Steurer bound α(G) is always strictly less than the entangled value ω*(G), unless the classical and entangled values are the same for one copy of the game (i.e., unless ω(G)=ω*(G)). This implies that parallel repetition can never completely eliminate a quantum advantage.  (For a brute-force sanity check of the n=1 and n=2 classical values, see the little Python sketch right after this list.)
6\. Forget about an oracle relative to which BQP is not in PH (the Polynomial Hierarchy). Forget about an oracle relative to which BQP is not in AM (Arthur-Merlin). Is there an oracle relative to which BQP is not in SZK (Statistical Zero-Knowledge)?  Positive answer by me, posted to this blog in 2006.  See also my BQP vs. PH paper for a different proof.
9\. Is there an n-qubit pure state that can be prepared by a circuit of size n^3, and that can’t be distinguished from the maximally mixed state by any circuit of size n^2?  A positive answer follows from this 2009 paper by Richard Low—thanks very much to Fernando Brandao for bringing that to my attention a few months ago.
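As promised under question 5, here is a quick brute-force sanity check (my own toy code, not anything from Andris or from the papers cited above) that the classical value of a single CHSH game is 3/4, and that two CHSH games played in parallel can be won with probability 10/16 = (√10/4)^2, strictly better than the (3/4)^2 that independent play would give.  It simply enumerates all deterministic strategies, which suffices because shared randomness can never increase the value.

```python
from itertools import product

def classical_value(n):
    """Classical value of n CHSH games played in parallel (win = win all n games)."""
    questions = list(product([0, 1], repeat=n))            # each player's possible n-bit question
    # A deterministic strategy assigns an n-bit answer to every possible question.
    strategies = list(product(questions, repeat=len(questions)))
    best = 0.0
    for alice in strategies:
        for bob in strategies:
            wins = 0
            for xi, x in enumerate(questions):
                for yi, y in enumerate(questions):
                    a, b = alice[xi], bob[yi]
                    # CHSH winning condition in game i: a_i XOR b_i = x_i AND y_i.
                    if all((a[i] ^ b[i]) == (x[i] & y[i]) for i in range(n)):
                        wins += 1
            best = max(best, wins / len(questions) ** 2)
    return best

print(classical_value(1))   # expected 0.75
print(classical_value(2))   # expected 0.625; enumerates 256*256 strategy pairs, so be patient
```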
* * *
OK, now on to:
THE NEW TEN MOST ANNOYING QUESTIONS IN QUANTUM COMPUTING
1. Can we get any upper bound whatsoever on the complexity class QMIP—i.e., quantum multi-prover interactive proofs with unlimited prior entanglement? (Since I asked this question in 2006, Ito and Vidick achieved the breakthrough lower bound NEXP⊆QMIP, but there’s been basically no progress on the upper bound side.)
2\. Given any n-qubit unitary operation U, does there exist an oracle relative to which U can be (approximately) applied in polynomial time? (Since 2006, my interest in this question has only increased. See this paper by me and Greg Kuperberg for background and related results.)
3. How many mutually unbiased bases are there in non-prime-power dimensions?
4. Since Chris Fuchs was so thrilled by my including one of his favorite questions on my earlier list (question #3 above), let me add another of his favorites: do SIC-POVMs exist in arbitrary finite dimensions?
5. Is there a Boolean function f:{0,1}^n→{0,1} whose bounded-error quantum query complexity is strictly greater than n/2?  (Thanks to Shelby Kimmel for this question!  Note that this paper by van Dam shows that the bounded-error quantum query complexity never exceeds n/2+O(√n), while this paper by Ambainis et al. shows that it’s at least n/2-O(√n) for almost all Boolean functions f.)
6. Is there a “universal disentangler”: that is, a superoperator S that takes n^O(1) qubits as input; that produces a 2n-qubit bipartite state (with n qubits on each side) as output; whose output S(ρ) is always close in variation distance to a separable state; and that given an appropriate input state, can produce as output an approximation to any desired separable state?  (See here for background about this problem, originally posed by John Watrous. Note that if such an S existed and were computationally efficient, it would imply QMA=QMA(2).)
7. Suppose we have explicit descriptions of n two-outcome POVM measurements—say, as d×d Hermitian matrices E_1,…,E_n—and are also given k=(log(nd))^O(1) copies of an unknown quantum state ρ in d dimensions.  Is there a way to measure the copies so as to estimate the n expectation values Tr(E_1ρ),…,Tr(E_nρ), each to constant additive error?  (A forthcoming paper of mine on private-key quantum money will contain some background and related results.)
8. Is there a collection of 1- and 2-qubit gates that generates a group of unitary matrices that is (a) not universal for quantum computation, (b) not just conjugate to permuted diagonal matrices or one-qubit gates plus swaps, and (c) not conjugate to a subgroup of the Clifford group?
9. Given a partial Boolean function f:S→{0,1} with S⊆{0,1}^n, is the bounded-error quantum query complexity of f always polynomially related to the smallest degree of any polynomial p:{0,1}^n→R such that (a) p(x)∈[0,1] for all x∈{0,1}^n, and (b) |p(x)-f(x)|≤1/3 for all x∈S?
10. Is there a quantum finite automaton that reads in an infinite sequence of i.i.d. coin flips, and whose limiting probability of being found in an “accept” state is at least 2/3 if the coin is fair and at most 1/3 if the coin is unfair?  (See this paper by me and Andy Drucker for background and related results.)
Happy birthday to me!
Recently, lots of people have been asking me what I think about IIT—no, not the Indian Institutes of Technology, but Integrated Information Theory, a widely-discussed “mathematical theory of consciousness” developed over the past decade by the neuroscientist Giulio Tononi.  One of the askers was Max Tegmark, who’s enthusiastically adopted IIT as a plank in his radical mathematizing platform (see his paper “Consciousness as a State of Matter”).  When, in the comment thread about Max’s Mathematical Universe Hypothesis, I expressed doubts about IIT, Max challenged me to back up my doubts with a quantitative calculation.
So, this is the post that I promised to Max and all the others, about why I don’t believe IIT.  And yes, it will contain that quantitative calculation.
But first, what is IIT?  The central ideas of IIT, as I understand them, are:
(1) to propose a quantitative measure, called Φ, of the amount of “integrated information” in a physical system (i.e. information that can’t be localized in the system’s individual parts), and then
(2) to hypothesize that a physical system is “conscious” if and only if it has a large value of Φ—and indeed, that a system is more conscious the larger its Φ value.
I’ll return later to the precise definition of Φ—but basically, it’s obtained by minimizing, over all subdivisions of your physical system into two parts A and B, some measure of the mutual information between A’s outputs and B’s inputs and vice versa.  Now, one immediate consequence of any definition like this is that all sorts of simple physical systems (a thermostat, a photodiode, etc.) will turn out to have small but nonzero Φ values.  To his credit, Tononi cheerfully accepts the panpsychist implication: yes, he says, it really does mean that thermostats and photodiodes have small but nonzero levels of consciousness.  On the other hand, for the theory to work, it had better be the case that Φ is small for “intuitively unconscious” systems, and only large for “intuitively conscious” systems.  As I’ll explain later, this strikes me as a crucial point on which IIT fails.
The literature on IIT is too big to do it justice in a blog post.  Strikingly, in addition to the “primary” literature, there’s now even a “secondary” literature, which treats IIT as a sort of established base on which to build further speculations about consciousness.  Besides the Tegmark paper linked to above, see for example this paper by Maguire et al., and associated popular article.  (Ironically, Maguire et al. use IIT to argue for the Penrose-like view that consciousness might have uncomputable aspects—a use diametrically opposed to Tegmark’s.)
Anyway, if you want to read a popular article about IIT, there are loads of them: see here for the New York Times’s, here for Scientific American’s, here for IEEE Spectrum’s, and here for the New Yorker’s.  Unfortunately, none of those articles will tell you the meat (i.e., the definition of integrated information); for that you need technical papers, like this or this by Tononi, or this by Seth et al.  IIT is also described in Christof Koch’s memoir Consciousness: Confessions of a Romantic Reductionist, which I read and enjoyed; as well as Tononi’s Phi: A Voyage from the Brain to the Soul, which I haven’t yet read.  (Koch, one of the world’s best-known thinkers and writers about consciousness, has also become an evangelist for IIT.)
So, I want to explain why I don’t think IIT solves even the problem that it “plausibly could have” solved.  But before I can do that, I need to do some philosophical ground-clearing.  Broadly speaking, what is it that a “mathematical theory of consciousness” is supposed to do?  What questions should it answer, and how should we judge whether it’s succeeded?
The most obvious thing a consciousness theory could do is to explain why consciousness exists: that is, to solve what David Chalmers calls the “Hard Problem,” by telling us how a clump of neurons is able to give rise to the taste of strawberries, the redness of red … you know, all that ineffable first-persony stuff.  Alas, there’s a strong argument—one that I, personally, find completely convincing—why that’s too much to ask of any scientific theory.  Namely, no matter what the third-person facts were, one could always imagine a universe consistent with those facts in which no one “really” experienced anything.  So for example, if someone claims that integrated information “explains” why consciousness exists—nope, sorry!  I’ve just conjured into my imagination beings whose Φ-values are a thousand, nay a trillion times larger than humans’, yet who are also philosophical zombies: entities that there’s nothing that it’s like to be.  Granted, maybe such zombies can’t exist in the actual world: maybe, if you tried to create one, God would notice its large Φ-value and generously bequeath it a soul.  But if so, then that’s a further fact about our world, a fact that manifestly couldn’t be deduced from the properties of Φ alone.  Notice that the details of Φ are completely irrelevant to the argument.
Faced with this point, many scientifically-minded people start yelling and throwing things.  They say that “zombies” and so forth are empty metaphysics, and that our only hope of learning about consciousness is to engage with actual facts about the brain.  And that’s a perfectly reasonable position!  As far as I’m concerned, you absolutely have the option of dismissing Chalmers’ Hard Problem as a navel-gazing distraction from the real work of neuroscience.  The one thing you can’t do is have it both ways: that is, you can’t say both that the Hard Problem is meaningless, and that progress in neuroscience will soon solve the problem if it hasn’t already.  You can’t maintain simultaneously that
(a) once you account for someone’s observed behavior and the details of their brain organization, there’s nothing further about consciousness to be explained, and
(b) remarkably, the XYZ theory of consciousness can explain the “nothing further” (e.g., by reducing it to integrated information processing), or might be on the verge of doing so.
As obvious as this sounds, it seems to me that large swaths of consciousness-theorizing can just be summarily rejected for trying to have their brain and eat it in precisely the above way.
Fortunately, I think IIT survives the above observations.  For we can easily interpret IIT as trying to do something more “modest” than solve the Hard Problem, although still staggeringly audacious.  Namely, we can say that IIT “merely” aims to tell us which physical systems are associated with consciousness and which aren’t, purely in terms of the systems’ physical organization.  The test of such a theory is whether it can produce results agreeing with “commonsense intuition”: for example, whether it can affirm, from first principles, that (most) humans are conscious; that dogs and horses are also conscious but less so; that rocks, livers, bacteria colonies, and existing digital computers are not conscious (or are hardly conscious); and that a room full of people has no “mega-consciousness” over and above the consciousnesses of the individuals.
The reason it’s so important that the theory uphold “common sense” on these test cases is that, given the experimental inaccessibility of consciousness, this is basically the only test available to us.  If the theory gets the test cases “wrong” (i.e., gives results diverging from common sense), it’s not clear that there’s anything else for the theory to get “right.”  Of course, supposing we had a theory that got the test cases right, we could then have a field day with the less-obvious cases, programming our computers to tell us exactly how much consciousness is present in octopi, fetuses, brain-damaged patients, and hypothetical AI bots.
In my opinion, how to construct a theory that tells us which physical systems are conscious and which aren’t—giving answers that agree with “common sense” whenever the latter renders a verdict—is one of the deepest, most fascinating problems in all of science.  Since I don’t know a standard name for the problem, I hereby call it the Pretty-Hard Problem of Consciousness.  Unlike with the Hard Hard Problem, I don’t know of any philosophical reason why the Pretty-Hard Problem should be inherently unsolvable; but on the other hand, humans seem nowhere close to solving it (if we had solved it, then we could reduce the abortion, animal rights, and strong AI debates to “gentlemen, let us calculate!”).
Now, I regard IIT as a serious, honorable attempt to grapple with the Pretty-Hard Problem of Consciousness: something concrete enough to move the discussion forward.  But I also regard IIT as a failed attempt on the problem.  And I wish people would recognize its failure, learn from it, and move on.
In my view, IIT fails to solve the Pretty-Hard Problem because it unavoidably predicts vast amounts of consciousness in physical systems that no sane person would regard as particularly “conscious” at all: indeed, systems that do nothing but apply a low-density parity-check code, or other simple transformations of their input data.  Moreover, IIT predicts not merely that these systems are “slightly” conscious (which would be fine), but that they can be unboundedly more conscious than humans are.
To justify that claim, I first need to define Φ.  Strikingly, despite the large literature about Φ, I had a hard time finding a clear mathematical definition of it—one that not only listed formulas but fully defined the structures that the formulas were talking about.  Complicating matters further, there are several competing definitions of Φ in the literature, including Φ_DM (discrete memoryless), Φ_E (empirical), and Φ_AR (autoregressive), which apply in different contexts (e.g., some take time evolution into account and others don’t).  Nevertheless, I think I can define Φ in a way that will make sense to theoretical computer scientists.  And crucially, the broad point I want to make about Φ won’t depend much on the details of its formalization anyway.
We consider a discrete system in a state x=(x_1,…,x_n)∈S^n, where S is a finite alphabet (the simplest case is S={0,1}).  We imagine that the system evolves via an “updating function” f:S^n→S^n. Then the question that interests us is whether the x_i’s can be partitioned into two sets A and B, of roughly comparable size, such that the updates to the variables in A don’t depend very much on the variables in B and vice versa.  If such a partition exists, then we say that the computation of f does not involve “global integration of information,” which on Tononi’s theory is a defining aspect of consciousness.
More formally, given a partition (A,B) of {1,…,n}, let us write an input y=(y_1,…,y_n)∈S^n to f in the form (y_A,y_B), where y_A consists of the y variables in A and y_B consists of the y variables in B.  Then we can think of f as mapping an input pair (y_A,y_B) to an output pair (z_A,z_B).  Now, we define the “effective information” EI(A→B) as H(z_B | A random, y_B=x_B).  Or in words, EI(A→B) is the Shannon entropy of the output variables in B, if the input variables in A are drawn uniformly at random, while the input variables in B are fixed to their values in x.  It’s a measure of the dependence of B on A in the computation of f(x).  Similarly, we define
EI(B→A) := H(z_A | B random, y_A=x_A).
We then consider the sum
Φ(A,B) := EI(A→B) + EI(B→A).
Intuitively, we’d like the integrated information Φ=Φ(f,x) to be the minimum of Φ(A,B), over all 2^n-2 possible partitions of {1,…,n} into nonempty sets A and B.  The idea is that Φ should be large, if and only if it’s not possible to partition the variables into two sets A and B, in such a way that not much information flows from A to B or vice versa when f(x) is computed.
However, no sooner do we propose this than we notice a technical problem.  What if A is much larger than B, or vice versa?  As an extreme case, what if A={1,…,n-1} and B={n}?  In that case, we’ll have Φ(A,B) ≤ 2 log_2|S|, but only for the boring reason that there’s hardly any entropy in B as a whole, to either influence A or be influenced by it.  For this reason, Tononi proposes a fix where we normalize each Φ(A,B) by dividing it by min{|A|,|B|}.  He then defines the integrated information Φ to be Φ(A,B), for whichever partition (A,B) minimizes the ratio Φ(A,B) / min{|A|,|B|}.  (Unless I missed it, Tononi never specifies what we should do if there are multiple (A,B)’s that all achieve the same minimum of Φ(A,B) / min{|A|,|B|}.  I’ll return to that point later, along with other idiosyncrasies of the normalization procedure.)
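If it helps to see that definition spelled out in code, here is a minimal brute-force sketch (my own, for toy-sized systems only); it follows the definition exactly as stated above, including the normalization and the arbitrary tie-breaking:

```python
import itertools
import math
from collections import Counter

def entropy_bits(counter):
    """Shannon entropy, in bits, of the empirical distribution recorded in `counter`."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())

def EI(f, x, A, B, S):
    """EI(A->B): entropy of the output variables in B, when the input variables in A are
    drawn uniformly at random and the input variables in B are fixed to their values in x."""
    outputs = Counter()
    for assignment in itertools.product(S, repeat=len(A)):
        y = list(x)
        for index, value in zip(A, assignment):
            y[index] = value
        z = f(tuple(y))
        outputs[tuple(z[i] for i in B)] += 1
    return entropy_bits(outputs)

def integrated_information(f, x, S):
    """Phi as defined above: among all bipartitions (A, B), find one minimizing
    Phi(A,B)/min(|A|,|B|) and return (normalized value, unnormalized Phi(A,B), A, B).
    Ties are broken arbitrarily, which is exactly the ambiguity noted in the text."""
    n = len(x)
    best = None
    for size in range(1, n // 2 + 1):
        for A in itertools.combinations(range(n), size):
            B = tuple(i for i in range(n) if i not in A)
            phi = EI(f, x, A, B, S) + EI(f, x, B, A, S)
            normalized = phi / min(len(A), len(B))
            if best is None or normalized < best[0]:
                best = (normalized, phi, A, B)
    return best

# Toy example: 4 bits, and an update that swaps bit 0 with bit 2, and bit 1 with bit 3.
def swap_pairs(y):
    return (y[2], y[3], y[0], y[1])

print(integrated_information(swap_pairs, x=(0, 1, 1, 0), S=(0, 1)))
```

Since the toy update decomposes into two non-interacting pairs of bits, the sketch reports the bipartition {0,2} versus {1,3} with Φ = 0, as it should.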
Tononi gives some simple examples of the computation of Φ, showing that it is indeed larger for systems that are more “richly interconnected” in an intuitive sense.  He speculates, plausibly, that Φ is quite large for (some reasonable model of) the interconnection network of the human brain—and probably larger for the brain than for typical electronic devices (which tend to be highly modular in design, thereby decreasing their Φ), or, let’s say, than for other organs like the pancreas.  Ambitiously, he even speculates at length about how a large value of Φ might be connected to the phenomenology of consciousness.
To be sure, empirical work in integrated information theory has been hampered by three difficulties.  The first difficulty is that we don’t know the detailed interconnection network of the human brain.  The second difficulty is that it’s not even clear what we should define that network to be: for example, as a crude first attempt, should we assign a Boolean variable to each neuron, which equals 1 if the neuron is currently firing and 0 if it’s not firing, and let f be the function that updates those variables over a timescale of, say, a millisecond?  What other variables do we need—firing rates, internal states of the neurons, neurotransmitter levels?  Is choosing many of these variables uniformly at random (for the purpose of calculating Φ) really a reasonable way to “randomize” the variables, and if not, what other prescription should we use?
The third and final difficulty is that, even if we knew exactly what we meant by “the f and x corresponding to the human brain,” and even if we had complete knowledge of that f and x, computing Φ(f,x) could still be computationally intractable.  For recall that the definition of Φ involved minimizing a quantity over all the exponentially-many possible bipartitions of {1,…,n}.  While it’s not directly relevant to my arguments in this post, I leave it as a challenge for interested readers to pin down the computational complexity of approximating Φ to some reasonable precision, assuming that f is specified by a polynomial-size Boolean circuit, or alternatively, by an NC^0 function (i.e., a function each of whose outputs depends on only a constant number of the inputs).  (Presumably Φ will be #P-hard to calculate exactly, but only because calculating entropy exactly is a #P-hard problem—that’s not interesting.)
I conjecture that approximating Φ is an NP-hard problem, even for restricted families of f’s like NC^0 circuits—which invites the amusing thought that God, or Nature, would need to solve an NP-hard problem just to decide whether or not to imbue a given physical system with consciousness!  (Alas, if you wanted to exploit this as a practical approach for solving NP-complete problems such as 3SAT, you’d need to do a rather drastic experiment on your own brain—an experiment whose result would be to render you unconscious if your 3SAT instance was satisfiable, or conscious if it was unsatisfiable!  In neither case would you be able to communicate the outcome of the experiment to anyone else, nor would you have any recollection of the outcome after the experiment was finished.)  In the other direction, it would also be interesting to upper-bound the complexity of approximating Φ.  Because of the need to estimate the entropies of distributions (even given a bipartition (A,B)), I don’t know that this problem is in NP—the best I can observe is that it’s in AM.
In any case, my own reason for rejecting IIT has nothing to do with any of the “merely practical” issues above: neither the difficulty of defining f and x, nor the difficulty of learning them, nor the difficulty of calculating Φ(f,x).  My reason is much more basic, striking directly at the hypothesized link between “integrated information” and consciousness.  Specifically, I claim the following:
Yes, it might be a decent rule of thumb that, if you want to know which brain regions (for example) are associated with consciousness, you should start by looking for regions with lots of information integration.  And yes, it’s even possible, for all I know, that having a large Φ-value is one necessary condition among many for a physical system to be conscious.  However, having a large Φ-value is certainly not a sufficient condition for consciousness, or even for the appearance of consciousness.  As a consequence, Φ can’t possibly capture the essence of what makes a physical system conscious, or even of what makes a system look conscious to external observers.
The demonstration of this claim is embarrassingly simple.  Let S=F_p, where p is some prime sufficiently larger than n, and let V be an n×n Vandermonde matrix over F_p—that is, a matrix whose (i,j) entry equals i^(j-1) (mod p).  Then let f:S^n→S^n be the update function defined by f(x)=Vx.  Now, for p large enough, the Vandermonde matrix is well-known to have the property that every submatrix is full-rank (i.e., “every submatrix preserves all the information that it’s possible to preserve about the part of x that it acts on”).  And this implies that, regardless of which bipartition (A,B) of {1,…,n} we choose, we’ll get
EI(A→B) = EI(B→A) = min{|A|,|B|} log_2 p,
and hence
Φ(A,B) = EI(A→B) + EI(B→A) = 2 min{|A|,|B|} log_2 p,
or after normalizing,
Φ(A,B) / min{|A|,|B|} = 2 log_2 p.
Or in words: the normalized information integration has the same value—namely, the maximum value!—for every possible bipartition.  Now, I’d like to proceed from here to a determination of Φ itself, but I’m prevented from doing so by the ambiguity in the definition of Φ that I noted earlier.  Namely, since every bipartition (A,B) minimizes the normalized value Φ(A,B) / min{|A|,|B|}, in theory I ought to be able to pick any of them for the purpose of calculating Φ.  But the unnormalized value Φ(A,B), which gives the final Φ, can vary greatly, across bipartitions: from 2 log_2 p (if min{|A|,|B|}=1) all the way up to n log_2 p (if min{|A|,|B|}=n/2).  So at this point, Φ is simply undefined.
On the other hand, I can solve this problem, and make Φ well-defined, by an ironic little hack.  The hack is to replace the Vandermonde matrix V by an n×n matrix W, which consists of the first n/2 rows of the Vandermonde matrix each repeated twice (assume for simplicity that n is a multiple of 4).  As before, we let f(x)=Wx.  Then if we set A={1,…,n/2} and B={n/2+1,…,n}, we can achieve
EI(A→B) = EI(B→A) = (n/4) log_2 p,
Φ(A,B) = EI(A→B) + EI(B→A) = (n/2) log_2 p,
and hence
Φ(A,B) / min{|A|,|B|} = log_2 p.
In this case, I claim that the above is the unique bipartition that minimizes the normalized integrated information Φ(A,B) / min{|A|,|B|}, up to trivial reorderings of the rows.  To prove this claim: if |A|=|B|=n/2, then clearly we minimize Φ(A,B) by maximizing the number of repeated rows in A and the number of repeated rows in B, exactly as we did above.  Thus, assume |A|≤|B| (the case |B|≤|A| is analogous).  Then clearly
EI(B→A) ≥ (|A|/2) log_2 p,
while
EI(A→B) ≥ min{|A|, |B|/2} log_2 p.
So if we let |A|=cn and |B|=(1-c)n for some c∈(0,1/2], then
Φ(A,B) ≥ [c/2 + min{c, (1-c)/2}] n log_2 p,
and
Φ(A,B) / min{|A|,|B|} = Φ(A,B) / |A| ≥ [1/2 + min{1, 1/(2c) – 1/2}] log_2 p.
But the above expression is uniquely minimized when c=1/2.  Hence the normalized integrated information is minimized essentially uniquely by setting A={1,…,n/2} and B={n/2+1,…,n}, and we get
Φ = Φ(A,B) = (n/2) log_2 p,
which is quite a large value (only a factor of 2 less than the trivial upper bound of n log_2 p).
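For readers who want to check this numerically, here is a sketch (mine, with toy parameters) that computes the normalized Φ(A,B) values for both V and W. It uses the fact that, for a linear update f(x)=Mx over F_p, EI(A→B) is just the rank of the submatrix of M with rows in B and columns in A, times log_2 p.

```python
import itertools
import math

def rank_mod_p(M, p):
    """Rank of a matrix (a list of integer rows) over F_p, by Gaussian elimination."""
    M = [[v % p for v in row] for row in M]
    rank, col = 0, 0
    while rank < len(M) and col < len(M[0]):
        pivot = next((r for r in range(rank, len(M)) if M[r][col]), None)
        if pivot is None:
            col += 1
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][col], -1, p)
        M[rank] = [(v * inv) % p for v in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][col]:
                factor = M[r][col]
                M[r] = [(a - factor * b) % p for a, b in zip(M[r], M[rank])]
        rank, col = rank + 1, col + 1
    return rank

def EI(M, A, B, p):
    """For the linear update f(x) = Mx over F_p: with the inputs in A uniform and those in B
    fixed, the outputs in B are uniform over an affine subspace whose dimension equals the
    rank of the submatrix with rows in B and columns in A, so EI(A->B) = rank * log2(p)."""
    return rank_mod_p([[M[i][j] for j in A] for i in B], p) * math.log2(p)

def normalized_phi_values(M, p):
    n = len(M)
    values = {}
    for size in range(1, n // 2 + 1):
        for A in itertools.combinations(range(n), size):
            B = tuple(i for i in range(n) if i not in A)
            values[(A, B)] = (EI(M, A, B, p) + EI(M, B, A, p)) / min(len(A), len(B))
    return values

p, n = 1000003, 8                                                  # toy sizes: a prime much bigger than n
V = [[pow(i, j, p) for j in range(n)] for i in range(1, n + 1)]    # (i,j) entry i^(j-1), with 1-indexed j
W = [V[i // 2] for i in range(n)]                                  # first n/2 rows of V, each repeated twice

for name, M in [("V", V), ("W", W)]:
    values = normalized_phi_values(M, p)
    low = min(values.values())
    count = sum(1 for v in values.values() if abs(v - low) < 1e-9)
    print(f"{name}: normalized Phi ranges over [{low:.2f}, {max(values.values()):.2f}], "
          f"minimum attained by {count} bipartition(s)")
# The argument above predicts: for V, every bipartition gives the same value 2*log2(p) (~39.9 here),
# leaving Phi undefined; for W, the minimum log2(p) (~19.9) is attained only by balanced splits that
# keep each duplicated row-pair on the same side, giving Phi = (n/2)*log2(p).
```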
Now, why did I call the switch from V to W an “ironic little hack”?  Because, in order to ensure a large value of Φ, I decreased—by a factor of 2, in fact—the amount of “information integration” that was intuitively happening in my system!  I did that in order to decrease the normalized value Φ(A,B) / min{|A|,|B|} for the particular bipartition (A,B) that I cared about, thereby ensuring that that (A,B) would be chosen over all the other bipartitions, thereby increasing the final, unnormalized value Φ(A,B) that Tononi’s prescription tells me to return.  I hope I’m not alone in fearing that this illustrates a disturbing non-robustness in the definition of Φ.
But let’s leave that issue aside; maybe it can be ameliorated by fiddling with the definition.  The broader point is this: I’ve shown that my system—the system that simply applies the matrix W to an input vector x—has an enormous amount of integrated information Φ.  Indeed, this system’s Φ equals half of its entire information content.  So for example, if n were 10^14 or so—something that wouldn’t be hard to arrange with existing computers—then this system’s Φ would exceed any plausible upper bound on the integrated information content of the human brain.
And yet this Vandermonde system doesn’t even come close to doing anything that we’d want to call intelligent, let alone conscious!  When you apply the Vandermonde matrix to a vector, all you’re really doing is mapping the list of coefficients of a degree-(n-1) polynomial over F_p, to the values of the polynomial on the n points 0,1,…,n-1.  Now, evaluating a polynomial on a set of points turns out to be an excellent way to achieve “integrated information,” with every subset of outputs as correlated with every subset of inputs as it could possibly be.  In fact, that’s precisely why polynomials are used so heavily in error-correcting codes, such as the Reed-Solomon code, employed (among many other places) in CD’s and DVD’s.  But that doesn’t imply that every time you start up your DVD player you’re lighting the fire of consciousness.  It doesn’t even hint at such a thing.  All it tells us is that you can have integrated information without consciousness (or even intelligence)—just like you can have computation without consciousness, and unpredictability without consciousness, and electricity without consciousness.
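In code, the polynomial-evaluation view of the Vandermonde map looks like this (a toy sketch with made-up numbers, using the evaluation points 0, 1, …, n-1 mentioned above):

```python
import numpy as np

p, n = 101, 5                                                        # made-up toy values, p prime and p > n
V = np.array([[pow(i, j, p) for j in range(n)] for i in range(n)])   # row i = (i^0, i^1, ..., i^(n-1)) mod p
coeffs = np.array([3, 1, 4, 1, 5])                                   # coefficients of a degree-(n-1) polynomial

evaluations_via_V = (V @ coeffs) % p
evaluations_direct = np.array([sum(c * pow(i, j, p) for j, c in enumerate(coeffs)) % p
                               for i in range(n)])
assert np.array_equal(evaluations_via_V, evaluations_direct)
print(evaluations_via_V)   # essentially the Reed-Solomon encoding of the message `coeffs`
```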
It might be objected that, in defining my “Vandermonde system,” I was too abstract and mathematical.  I said that the system maps the input vector x to the output vector Wx, but I didn’t say anything about how it did so.  To perform a computation—even a computation as simple as a matrix-vector multiply—won’t we need a physical network of wires, logic gates, and so forth?  And in any realistic such network, won’t each logic gate be directly connected to at most a few other gates, rather than to billions of them?  And if we define the integrated information Φ, not directly in terms of the inputs and outputs of the function f(x)=Wx, but in terms of all the actual logic gates involved in computing f, isn’t it possible or even likely that Φ will go back down?
This is a good objection, but I don’t think it can rescue IIT.  For we can achieve the same qualitative effect that I illustrated with the Vandermonde matrix—the same “global information integration,” in which every large set of outputs depends heavily on every large set of inputs—even using much “sparser” computations, ones where each individual output depends on only a few of the inputs.  This is precisely the idea behind low-density parity check (LDPC) codes, which have had a major impact on coding theory over the past two decades.  Of course, one would need to muck around a bit to construct a physical system based on LDPC codes whose integrated information Φ was provably large, and for which there were no wildly-unbalanced bipartitions that achieved lower Φ(A,B)/min{|A|,|B|} values than the balanced bipartitions one cared about.  But I feel safe in asserting that this could be done, similarly to how I did it with the Vandermonde matrix.
More generally, we can achieve pretty good information integration by hooking together logic gates according to any bipartite expander graph: that is, any graph with n vertices on each side, such that every k vertices on the left side are connected to at least min{(1+ε)k,n} vertices on the right side, for some constant ε>0.  And it’s well-known how to create expander graphs whose degree (i.e., the number of edges incident to each vertex, or the number of wires coming out of each logic gate) is a constant, such as 3.  One can do so either by plunking down edges at random, or (less trivially) by explicit constructions from algebra or combinatorics.  And as indicated in the title of this post, I feel 100% confident in saying that the so-constructed expander graphs are not conscious!  The brain might be an expander, but not every expander is a brain.
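Here is an empirical spot-check of the random construction just described (my own sketch with toy parameters; it examines small subsets only and proves nothing): take the union of three random perfect matchings between n left and n right vertices, and measure how many right vertices each small left subset sees.

```python
import random
from itertools import combinations

n, epsilon = 30, 0.5                                        # toy parameters
random.seed(1)
matchings = [random.sample(range(n), n) for _ in range(3)]  # three random perfect matchings
neighbors = [{m[v] for m in matchings} for v in range(n)]   # right-neighbors of each left vertex

for k in range(1, 5):                                       # every left subset of size 1..4
    worst = min(len(set().union(*(neighbors[v] for v in S)))
                for S in combinations(range(n), k))
    print(f"every {k} left vertices see at least {worst} right vertices "
          f"(expansion asks for at least {min((1 + epsilon) * k, n)})")
```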
Before winding down this post, I can’t resist telling you that the concept of integrated information (though it wasn’t called that) played an interesting role in computational complexity in the 1970s.  As I understand the history, Leslie Valiant conjectured that Boolean functions f:{0,1}^n→{0,1}^n with a high degree of “information integration” (such as discrete analogues of the Fourier transform) might be good candidates for proving circuit lower bounds, which in turn might be baby steps toward P≠NP.  More strongly, Valiant conjectured that the property of information integration, all by itself, implied that such functions had to be at least somewhat computationally complex—i.e., that they couldn’t be computed by circuits of size O(n), or even required circuits of size Ω(n log n).  Alas, that hope was refuted by Valiant’s later discovery of linear-size superconcentrators.  Just as information integration doesn’t suffice for intelligence or consciousness, so Valiant learned that information integration doesn’t suffice for circuit lower bounds either.
As humans, we seem to have the intuition that global integration of information is such a powerful property that no “simple” or “mundane” computational process could possibly achieve it.  But our intuition is wrong.  If it were right, then we wouldn’t have linear-size superconcentrators or LDPC codes.
I should mention that I had the privilege of briefly speaking with Giulio Tononi (as well as his collaborator, Christof Koch) this winter at an FQXi conference in Puerto Rico.  At that time, I challenged Tononi with a much cruder, handwavier version of some of the same points that I made above.  Tononi’s response, as best as I can reconstruct it, was that it’s wrong to approach IIT like a mathematician; instead one needs to start “from the inside,” with the phenomenology of consciousness, and only then try to build general theories that can be tested against counterexamples.  This response perplexed me: of course you can start from phenomenology, or from anything else you like, when constructing your theory of consciousness.  However, once your theory has been constructed, surely it’s then fair game for others to try to refute it with counterexamples?  And surely the theory should be judged, like anything else in science or philosophy, by how well it withstands such attacks?
But let me end on a positive note.  In my opinion, the fact that Integrated Information Theory is wrong—demonstrably wrong, for reasons that go to its core—puts it in something like the top 2% of all mathematical theories of consciousness ever proposed.  Almost all competing theories of consciousness, it seems to me, have been so vague, fluffy, and malleable that they can only aspire to wrongness.
[Endnote: See also this related post, by the philosopher Eric Schwitzgebel: Why Tononi Should Think That the United States Is Conscious.  While the discussion is much more informal, and the proposed counterexample more debatable, the basic objection to IIT is the same.]
* * *
Update (5/22): Here are a few clarifications of this post that might be helpful.
(1) The stuff about zombies and the Hard Problem was simply meant as motivation and background for what I called the “Pretty-Hard Problem of Consciousness”—the problem that I take IIT to be addressing.  You can disagree with the zombie stuff without it having any effect on my arguments about IIT.
(2) I wasn’t arguing in this post that dualism is true, or that consciousness is irreducibly mysterious, or that there could never be any convincing theory that told us how much consciousness was present in a physical system.  All I was arguing was that, at any rate, IIT is not such a theory.
(3) Yes, it’s true that my demonstration of IIT’s falsehood assumes—as an axiom, if you like—that while we might not know exactly what we mean by “consciousness,” at any rate we’re talking about something that humans have to a greater extent than DVD players.  If you reject that axiom, then I’d simply want to define a new word for a certain quality that non-anesthetized humans seem to have and that DVD players seem not to, and clarify that that other quality is the one I’m interested in.
(4) For my counterexample, the reason I chose the Vandermonde matrix is not merely that it’s invertible, but that all of its submatrices are full-rank.  This is the property that’s relevant for producing a large value of the integrated information Φ; by contrast, note that the identity matrix is invertible, but produces a system with Φ=0.  (As another note, if we work over a large enough field, then a random matrix will have this same property with high probability—but I wanted an explicit example, and while the Vandermonde is far from the only one, it’s one of the simplest.)
(5) The n×n Vandermonde matrix only does what I want if we work over (say) a prime field F_p with p>>n elements.  Thus, it’s natural to wonder whether similar examples exist where the basic system variables are bits, rather than elements of F_p.  The answer is yes. One way to get such examples is using the low-density parity check codes that I mention in the post.  Another common way to get Boolean examples, and which is also used in practice in error-correcting codes, is to start with the Vandermonde matrix (a.k.a. the Reed-Solomon code), and then combine it with an additional component that encodes the elements of F_p as strings of bits in some way.  Of course, you then need to check that doing this doesn’t harm the properties of the original Vandermonde matrix that you cared about (e.g., the “information integration”) too much, which causes some additional complication.
(6) Finally, it might be objected that my counterexamples ignored the issue of dynamics and “feedback loops”: they all consisted of unidirectional processes, which map inputs to outputs and then halt.  However, this can be fixed by the simple expedient of iterating the process over and over!  I.e., first map x to Wx, then map Wx to W^2x, and so on.  The integrated information should then be the same as in the unidirectional case.
* * *
Update (5/24): See a very interesting comment by David Chalmers.
To those of us who can’t tell a hypotenuse from a rhombus, the phrase “math journalism” sounds like an oxymoron. It brings to mind boring pedants like Martin Gardner, Sara Robinson, and Brian Hayes, who make everything seem confusing and complicated, and who won’t even write a single word without consulting two dozen “experts.” But today, a new breed of journalist is bringing math directly to the people — and they’re doing it with flair, pizzazz, and an eye for the all-too-neglected human side.
That’s why I’m proud to announce Shtetl-Optimized‘s semiregular Math Journalism Award, intended to recognize those journalists who make fractions, long division, and other topics of current research seem “as easy as pi” even to those of us who can’t balance our checkbooks and never did get algebra. The inaugural award goes to Ben Moore of the BBC, for his fascinating report about a maverick professor who’s solved a problem that befuddled Newton and Pythagoras over 1,200 years ago — not to mention millions of students since! The problem: what happens when you divide by zero?
Feel free to nominate other journalists for this prestigious award. (Hat tip for this one goes to my brother David.)
When someone wrote to Richard Feynman to tell him how his bongo-drumming habit “proved that physicists can also be human,” Feynman shot back a scathing reply: “I am human enough to tell you to go fuck yourself.” Why was Feynman so angry? Because for him, the notion that physicists had to “prove” their humanity by having non-scientific interests was an arrogant presumption. Why not point to a guitarist who enjoys doing math on the side, as “proof that musicians can also be human”?
While it’s possible that Feynman overreacted to what was meant as a compliment, a quick glance at American popular culture demonstrates that he had a point. In the minds of many Hollywood writers, there are apparently only two kinds of scientist: (1) the asexual nerd who babbles incomprehensibly before getting killed around scene 3 (unless of course he’s the villain), and (2) the occasional character who’s human “despite” being a scientist, as demonstrated by his or her charm, physical agility, and fashion sense. The idea that one can be both nerdy and sympathetic — indeed, that nerdiness might even have positive aspects — is absent.
This trend is so pervasive that, whenever a movie bucks it even partly, I’m inclined to overlook any other flaws it might have. Thus, for example, I enjoyed both A Beautiful Mind and Enigma, despite those movies’ liberal departures from the true stories on which they were based. But the most unabashed celebration of nerdiness I’ve seen in cinema is a little-known 80’s comedy called Real Genius. I was introduced to this movie by Christine Chung, a friend at Cornell. Then I saw it again with friends at Berkeley. Yesterday I continued the tradition by organizing a screening for friends at Waterloo.
Briefly, Real Genius follows the adventures of Mitch, a 15-year-old who goes to a college obviously based on Caltech, having been recruited by the duplicitous Professor Hathaway to work on powerful lasers. Mitch is sympathetic, not because he defies the stereotype of a 15-year-old at Caltech, but because we’re shown some of the emotions behind that stereotype: the feeling of outsiderness, of taking up space on the planet only at other people’s mercy; the fear of failure, of letting down his parents, Professor Hathaway, and others who “expect great things from him”; but at the same time, the longing for the easy social confidence represented by his roommate Chris (who used to be a teenage prodigy like Mitch, but is now a womanizing slacker). All of this is shown with enough wit and humor that there’s no need for Mitch to make an explicit declaration:
> Hath not a nerd eyes? Hath not a nerd hands, organs, dimensions, senses, affections, passions; fed with the same food, hurt with the same weapons, subject to the same diseases, heal’d by the same means, warm’d and cool’d by the same winter and summer, as a quarterback is?
Psst … one-way functions? Pseudorandom generators? Lattices? RSA? Come and get ’em, in plaintext.
Gus Gutoski took notes for this “all about cryptography” lecture, and they were so good that I’ve posted them with only moderate editing and joke-reinsertion. I’ve thereby provided you, my readers, with the unique opportunity to experience my lecture as Gus himself experienced it — as if you actually were Gus, sitting in a real Waterloo classroom taking notes.
For those of you who feel the need to prepare yourselves for this experience, here’s a recap of all the lectures so far:
* Lecture 1 (9/12): Atoms and the Void
* Lecture 2 (9/14): Sets
* Lecture 3 (9/19): Gödel, Turing, and Friends
* Lecture 4 (9/21): Minds and Machines
* Lecture 5 (9/26): Paleocomplexity
* Lecture 6 (9/28): P, NP, and Friends
* Lecture 7 (10/3): Randomness
* Lecture 8 (10/5): Crypto
Update: Preparing these notes is a sh&tload of work for me. So dude — if you want me to keep doing it, please let me know in the comments section if you’re actually reading the notes and deriving any benefit therefrom. Constructive criticism would also be fantastic. Thanks very much!
Update (June 3): A few days after we posted this paper online, Brent Werness, a postdoc in probability theory at the University of Washington, discovered a serious error in the “experimental” part of the paper.  Happily, Brent is now collaborating with us on producing a new version of the paper that fixes the error, which we hope to have available within a few months (and which will replace the version currently on the arXiv).
To make a long story short: while the overall idea, of measuring “apparent complexity” by the compressed file size of a coarse-grained image, is fine, the “interacting coffee automaton” that we study in the paper is not an example where the apparent complexity becomes large at intermediate times.  That fact can be deduced as a corollary of a result of Liggett from 2009 about the “symmetric exclusion process,” and can be seen as a far-reaching generalization of a result that we prove in our paper’s appendix: namely, that in the non-interacting coffee automaton (our “control case”), the apparent complexity after t time steps is upper-bounded by O(log(nt)).  As it turns out, we were more right than we knew to worry about large-deviation bounds giving complete mathematical control over what happens when the cream spills into the coffee, thereby preventing the apparent complexity from ever becoming large!
But what about our numerical results, which showed a small but unmistakable complexity bump for the interacting automaton (figure 10(a) in the paper)?  It now appears that the complexity bump we saw in our data is likely to be explainable by an incomplete removal of what we called “border pixel artifacts”: that is, “spurious” complexity that arises merely from the fact that, at the border between cream and coffee, we need to round the fraction of cream up or down to the nearest integer to produce a grayscale.  In the paper, we devoted a whole section (Section 6) to border pixel artifacts and the need to deal with them: something sufficiently non-obvious that in the comments of this post, you can find people arguing with me that it’s a non-issue.  Well, it now appears that we erred by underestimating the severity of border pixel artifacts, and that a better procedure to get rid of them would also eliminate the complexity bump for the interacting automaton.
Once again, this error has no effect on either the general idea of complexity rising and then falling in closed thermodynamic systems, or our proposal for how to quantify that rise and fall—the two aspects of the paper that have generated the most interest.  But we made a bad choice of model system with which to illustrate those ideas.  Had I looked more carefully at the data, I could’ve noticed the problem before we posted, and I take responsibility for my failure to do so.
The good news is that ultimately, I think the truth only makes our story more interesting.  For it turns out that apparent complexity, as we define it, is not something that’s trivial to achieve by just setting loose a bunch of randomly-walking particles, which bump into each other but are otherwise completely independent.  If you want “complexity” along the approach to thermal equilibrium, you need to work a bit harder for it.  One promising idea, which we’re now exploring, is to consider a cream tendril whose tip takes a random walk through the coffee, leaving a trail of cream in its wake.  Using results in probability theory—closely related, or so I’m told, to the results for which Wendelin Werner won his Fields Medal!—it may even be possible to prove analytically that the apparent complexity becomes large in thermodynamic systems with this sort of behavior, much as one can prove that the complexity doesn’t become large in our original coffee automaton.
So, if you’re interested in this topic, stay tuned for the updated version of our paper.  In the meantime, I wish to express our deepest imaginable gratitude to Brent Werness for telling us all this.
* * *
Good news!  After nearly three years of procrastination, fellow blogger Sean Carroll, former MIT undergraduate Lauren Ouellette, and yours truly finally finished a paper with the above title (coming soon to an arXiv near you).  PowerPoint slides are also available (as usual, you’re on your own if you can’t open them—sorry!).
For the background and context of this paper, please see my old post “The First Law of Complexodynamics,” which discussed Sean’s problem of defining a “complextropy” measure that first increases and then decreases in closed thermodynamic systems, in contrast to entropy (which increases monotonically).  In this exploratory paper, we basically do five things:
1. We survey several candidate “complextropy” measures: their strengths, weaknesses, and relations to one another.
2. We propose a model system for studying such measures: a probabilistic cellular automaton that models a cup of coffee into which cream has just been poured.
3. We report the results of numerical experiments with one of the measures, which we call “apparent complexity” (basically, the gzip file size of a smeared-out image of the coffee cup).  The results confirm that the apparent complexity does indeed increase, reach a maximum, then turn around and decrease as the coffee and cream mix.
4. We discuss a technical issue that one needs to overcome (the so-called “border pixels” problem) before one can do meaningful experiments in this area, and offer a solution.
5. We raise the open problem of proving analytically that the apparent complexity ever becomes large for the coffee automaton.  To underscore this problem’s difficulty, we prove that the apparent complexity doesn’t become large in a simplified version of the coffee automaton.
Anyway, here’s the abstract:
> In contrast to entropy, which increases monotonically, the “complexity” or “interestingness” of closed systems seems intuitively to increase at first and then decrease as equilibrium is approached. For example, our universe lacked complex structures at the Big Bang and will also lack them after black holes evaporate and particles are dispersed. This paper makes an initial attempt to quantify this pattern. As a model system, we use a simple, two-dimensional cellular automaton that simulates the mixing of two liquids (“coffee” and “cream”). A plausible complexity measure is then the Kolmogorov complexity of a coarse-grained approximation of the automaton’s state, which we dub the “apparent complexity.” We study this complexity measure, and show analytically that it never becomes large when the liquid particles are non-interacting. By contrast, when the particles do interact, we give numerical evidence that the complexity reaches a maximum comparable to the “coffee cup’s” horizontal dimension. We raise the problem of proving this behavior analytically.
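If you want to play with the “apparent complexity” measure yourself, here is a minimal sketch of it (mine, not the paper’s actual pipeline): coarse-grain a 0/1 grid, quantize to a few gray levels, and report the zlib-compressed size. The three test states are hand-made stand-ins for unmixed, partly mixed, and fully mixed configurations, not outputs of the coffee automaton.

```python
import zlib
import numpy as np

def apparent_complexity(grid, block=16, levels=4):
    """Coarse-grain a 2-D 0/1 array into block x block cells (fraction of 'cream' per cell,
    quantized to a few gray levels), and return the zlib-compressed size of the result, as
    a crude stand-in for the Kolmogorov complexity of the coarse-grained state."""
    h, w = grid.shape
    coarse = grid.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    quantized = np.round(coarse * levels).astype(np.uint8)   # a few levels also tames the
    return len(zlib.compress(quantized.tobytes(), 9))        # rounding noise at the border

n = 256
rng = np.random.default_rng(0)

separated = np.zeros((n, n), dtype=np.uint8)
separated[: n // 2] = 1                                      # cream sitting neatly on top

boundary = n // 2 + 40 * np.sin(np.arange(n) / 6) * np.sin(np.arange(n) / 29)
tendrils = (np.arange(n)[:, None] < boundary[None, :]).astype(np.uint8)   # a wavy, fingered interface

fully_mixed = (rng.random((n, n)) < 0.5).astype(np.uint8)    # cream and coffee completely blended

for name, grid in [("separated", separated), ("tendrils", tendrils), ("fully mixed", fully_mixed)]:
    print(name, apparent_complexity(grid))
# Expected qualitative ordering: small, larger, small again: the coarse-grained picture is
# boring both before the mixing starts and after it finishes, and interesting only in between.
```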
Questions and comments more than welcome.
* * *
In unrelated news, Shafi Goldwasser has asked me to announce that the Call for Papers for the 2015 Innovations in Theoretical Computer Science (ITCS) conference is now available.
My sojourn in Northern California is now at an end; on Sunday I flew to my parents’ place near Philadelphia for Hanukhrismanewyears. But not before going to Stanford to give a talk to their string theory group about “Computational Complexity and the Anthropic Principle.” Here are the notes from that talk; you can think of them as a Quantum Computing Since Democritus Special Bonus Lecture.
(The best part of the talk — the lengthy arguments with Lenny Susskind, Andrei Linde, and the other stringers and cosmologists, in which I repeatedly used humor to mask my utter lack of understanding — is sadly lost to eternity. Fortunately, I’m sure that new such arguments will erupt in the comments section.)
In preparation for meeting Susskind and the other Stanford stringers, I made sure to brush up on both sides of the String Wars. On the anti-string side, I read Peter Woit’s Not Even Wrong and Lee Smolin’s The Trouble With Physics. On the pro-string side, I read Susskind’s The Cosmic Landscape and also spent hours talking with Greg Kuperberg, who tried to convince me that critics of string theory are as “intellectually non-serious” as quantum computing skeptics or Ralph Nader voters. I heartily recommend all three of the books.
So, what did I learn at Stanford? Among other things, that when you talk to string theorists in person, they’re much more open-minded and reasonable than you’d expect! Of course, when your de facto spokesman is the self-parodying Luboš Motl — who often manages to excoriate feminists, climatologists, and loop quantum gravity theorists in the very same sentence — it’s hard not to seem reasonable by comparison. But I’m not even talking about him.
(Conflict-of-interest warning: I’m painfully well aware that, so long as Luboš is around, I can only ever be the second-funniest physics blogger — even if the world champion in this field isn’t trying to be funny.)
In general, I’ve found that tolerance for alternative ideas, willingness to engage with counterarguments, rejection of appeals to authority, and so on are all greater when talking to string theorists in person than when attending their talks or reading their books and articles. Maybe that’s to be expected — to some extent it’s true of every field! But with string theorists, the magnitude of the difference always astonishes me.
Alright, let me get more concrete. One of the few nontrivial points of agreement between string theory and loop quantum gravity seems to be that, in any bounded region of spacetime, the number of bits of information is finite: at most ~10^69 bits per square meter of surface area, or (equivalently) at most ~1 bit per Planck area. In loop quantum gravity, this is basically because one bit of information is “stored” in each Planck area. In string theory, it’s much more subtle than that: the bits of information can’t be put into any sort of one-to-one correspondence with the Planck areas on the horizon, but they both add up to the same number. (Ignoring a factor of 4, which being a complexity theorist, I don’t care about.)
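For anyone who wants to check that those two ways of stating the bound really are the same, here is the back-of-the-envelope arithmetic (mine, using the standard value of the Planck length):

```python
# Back-of-the-envelope arithmetic only; the Planck length value is the standard one.
planck_length = 1.616e-35            # meters
planck_area = planck_length ** 2     # about 2.6e-70 square meters

print(f"{1 / planck_area:.2e} bits per square meter at 1 bit per Planck area")
print(f"{1 / (4 * planck_area):.2e} bits per square meter with the factor of 4 included")
# Both land around 10^69 in order of magnitude, matching the bound quoted above.
```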
Now, much of my conversation with Susskind and fellow string theorist Steve Shenker focused on the following question: isn’t it a bizarre coincidence that the Planck areas and the bits of information should both add up to the same number, if there’s no “dual” description of string theory in which each bit (or rather qubit) is stored in a Planck area? Susskind agreed with me that such a “local” description of string theory (local on the boundary, not in the bulk) would be desirable — and that, if there isn’t such a description, then that by itself is a fundamental fact worthy of more attention. I’d expected Susskind and Shenker to brush aside my question as idle pontificating; instead, they seemed to want to reinvent string theory that very afternoon so that my question would have an answer!
When it became clear that no such reinvention of the theory was forthcoming (at least that afternoon), I suggested the following. We’ve got this one proposal, string theory, which has had some spectacular technical successes (like “explaining” the Bekenstein-Hawking entropy), but which, setting aside its other well-known problems, offers no “local” description of spacetime in terms of qubits and quantum circuits at the Planck scale. Then we’ve got this other proposal, loop quantum gravity, which has had fewer successes, but which does attempt such a local description at the Planck scale. So, if we agree that such a local description is our eventual goal, then shouldn’t an outsider guess that string theory and loop quantum gravity are probably just different footprints of the same beast — much like the different string theories themselves were found to be different limiting cases of an as-yet-unknown M-theory?
Susskind agreed that such a convergence — between the “top-down” picture of string theory, which grew out of conventional high-energy physics, and a “bottom-up” picture in terms of qubits at the Planck scale — was possible or even likely. He stressed that his opposition was not to the idea of describing spacetime in terms of local interactions of qubits, but rather to the specific technical program of loop quantum gravity, and to the exaggerated claims often made on that program’s behalf. When I reminded him that other people complain about exaggerated claims made on string theory’s behalf, he replied that the two cases were not even remotely comparable.
All in all, it was an extremely productive and enjoyable visit — one in which the conversation topics ranged over (among other things) the explanatory role of the Anthropic Principle, the possibility that the entire universe arose as a quantum fluctuation, the prospects for an efficient quantum algorithm for Graph Isomorphism, the relation between thermodynamics and quantum error-correction, and whether or not Gerard ‘t Hooft actually disbelieves quantum mechanics. Susskind told me, half-jokingly, that the Stanford string theory group was the world’s hotbed of anti-Landscape sentiment, and the arguments that I saw and participated in on my visit gave me no reason to doubt him.
So what are we to make of the fact that, on the one hand, the string theorists are such swell folks in person, and on the other hand, even the most cursory glance at their writings will reveal that the charge of triumphalist arrogance is far from undeserved? Well, to the anti-stringers, the obvious interpretation will be that the string theorists don’t really believe their own pablum: that they say one thing in public and a completely different thing in private. To the pro-stringers, the obvious interpretation will be that, beneath the façade we all erect around ourselves, the string theorists are just scientists like anyone else: grasping at the truth, struggling to learn more, convinced that string theory is the best idea we have but ready to ditch it if something better comes along. As usual, it all depends on where you’re coming from.
Alas, as tidy as this resolution sounds, it doesn’t help me pick sides in the String Wars currently raging through the blogosphere. But then again, why do I need to pick sides? I like hanging out with the loop quantum gravity people at Perimeter Institute. I like the fact that Lee Smolin’s publisher sent me a free review copy of The Trouble With Physics. I like the recent paper by Denef and Douglas on computational complexity and the string Landscape. And I like getting an all-expenses-paid trip to Stanford to have a freewheeling, day-long intellectual conversation with the string theorists there.
I have therefore reached a decision. From this day forward, my allegiances in the String Wars will be open for sale to the highest bidder. Like a cynical arms merchant, I will offer my computational-complexity and humor services to both sides, and publicly espouse the views of whichever side seems more interested in buying them at the moment. Fly me to an exotic enough location, put me up in a swank enough hotel, and the number of spacetime dimensions can be anything you want it to be: 4, 10, 11, or even 172.9+3πi. Is it more important for a quantum gravity theory to connect to the Standard Model, or to build in background-independence from the outset? Can one use the Anthropic Principle to make falsifiable predictions? How much is riding on whether or not the LHC finds supersymmetry? I might have opinions on these topics, but they’re nothing that a cushy job offer or a suitcase full of “reimbursements” couldn’t change.
Someday, perhaps, a dramatic new experimental finding or theoretical breakthrough will change the situation vis-à-vis string theory and its competitors. Until then, I shall answer to no quantum-gravity research program, but rather seek to profit from them all.
Update (12/23): The indefatigable Luboš Motl has put up a new jeremiad against me. Taking my ‘For Sale’ announcement completely seriously, Luboš writes:
> It is absolutely impossible for me to hide how intensely I despise people like Scott Aaronson … He’s the ultimate example of a complete moral breakdown of a scientist. It is astonishing that the situation became so bad that the people are not only corrupt and dishonest but they proudly announce this fact on their blogs…
>
> In fact, I have learned that the situation is so bad that when I simply state that Aaronson’s attitude is flagrantly incompatible with the ethical standards of a scholar as they have been understood for centuries, there could be some parts of the official establishment that would support him against me. There doesn’t seem to be a single blog article besides mine that denounces Aaronson’s attitude…
>
> The difference between [the] two of us is like the difference between a superman from the action movies who fights for the universal justice on one side and the most dirty corrupt villain on the other side. It’s like the Heaven and the Hell, freedom and feminism, careful evaluation of the climate and the alarmist hysteria, or string theory and loop quantum gravity…
I can’t tell you how proud I am to have become “the most dirty corrupt villain” in Luboš’s cosmology, and no longer just an anonymous bystander. Thanks so much, Luboš, and Merry Christmas to you too!
Update (12/24): Man oh man, I had no idea that people would take my offer so seriously! Because of this, I now feel obligated to provide a financial disclosure statement. The Stanford string theorists did not actually pay my way to California, although they offered to — most of my expenses were covered by Umesh, my adviser at Berkeley. Stanford paid for (1) one night’s hotel stay in Palo Alto, and (2) one lunch, consisting of a small cheese pizza and an iced tea.
This post is about an idea I had around 1997, when I was 16 years old and a freshman computer-science major at Cornell.  Back then, I was extremely impressed by a research project called CLEVER, which one of my professors, Jon Kleinberg, had led while working at IBM Almaden.  The idea was to use the link structure of the web itself to rank which web pages were most important, and therefore which ones should be returned first in a search query.  Specifically, Kleinberg defined “hubs” as pages that linked to lots of “authorities,” and “authorities” as pages that were linked to by lots of “hubs.”  At first glance, this definition seems hopelessly circular, but Kleinberg observed that one can break the circularity by just treating the World Wide Web as a giant directed graph, and doing some linear algebra on its adjacency matrix.  Equivalently, you can imagine an iterative process where each web page starts out with the same hub/authority “starting credits,” but then in each round, the pages distribute their credits among their neighbors, so that the most popular pages get more credits, which they can then, in turn, distribute to their neighbors by linking to them.
I was also impressed by a similar research project called PageRank, which was proposed later by two guys at Stanford named Sergey Brin and Larry Page.  Brin and Page dispensed with Kleinberg’s bipartite hubs-and-authorities structure in favor of a more uniform structure, and made some other changes, but otherwise their idea was very similar.  At the time, of course, I didn’t know that CLEVER was going to languish at IBM, while PageRank (renamed Google) was going to expand to roughly the size of the entire world’s economy.
In any case, the question I asked myself about CLEVER/PageRank was not the one that, maybe in retrospect, I should have asked: namely, “how can I leverage the fact that I know the importance of this idea before most people do, in order to make millions of dollars?”
Instead I asked myself: “what other ‘vicious circles’ in science and philosophy could one unravel using the same linear-algebra trick that CLEVER and PageRank exploit?”  After all, CLEVER and PageRank were both founded on what looked like a hopelessly circular intuition: “a web page is important if other important web pages link to it.”  Yet they both managed to use math to defeat the circularity.  All you had to do was find an “importance equilibrium,” in which your assignment of “importance” to each web page was stable under a certain linear map.  And such an equilibrium could be shown to exist—indeed, to exist uniquely.
Searching for other circular notions to elucidate using linear algebra, I hit on morality.  Philosophers from Socrates on, I was vaguely aware, had struggled to define what makes a person “moral” or “virtuous,” without tacitly presupposing the answer.  Well, it seemed to me that, as a first attempt, one could do a lot worse than the following:
A moral person is someone who cooperates with other moral people, and who refuses to cooperate with immoral people.
Obviously one can quibble with this definition on numerous grounds: for example, what exactly does it mean to “cooperate,” and which other people are relevant here?  If you don’t donate money to starving children in Africa, have you implicitly “refused to cooperate” with them?  What’s the relative importance of cooperating with good people and withholding cooperation with bad people, of kindness and justice?  Is there a duty not to cooperate with bad people, or merely the lack of a duty to cooperate with them?  Should we consider intent, or only outcomes?  Surely we shouldn’t hold someone accountable for sheltering a burglar, if they didn’t know about the burgling?  Also, should we compute your “total morality” by simply summing over your interactions with everyone else in your community?  If so, then can a career’s worth of lifesaving surgeries numerically overwhelm the badness of murdering a single child?
For now, I want you to set all of these important questions aside, and just focus on the fact that the definition doesn’t even seem to work on its own terms, because of circularity.  How can we possibly know which people are moral (and hence worthy of our cooperation), and which ones immoral (and hence unworthy), without presupposing the very thing that we seek to define?
Ah, I thought—this is precisely where linear algebra can come to the rescue!  Just like in CLEVER or PageRank, we can begin by giving everyone in the community an equal number of “morality starting credits.”  Then we can apply an iterative update rule, where each person A can gain morality credits by cooperating with each other person B, and A gains more credits the more credits B has already.  We apply the rule over and over, until the number of morality credits per person converges to an equilibrium.  (Or, of course, we can shortcut the process by simply finding the principal eigenvector of the “cooperation matrix,” using whatever algorithm we like.)  We then have our objective measure of morality for each individual, solving a 2400-year-old open problem in philosophy.
The next step, I figured, would be to hack together some code that computed this “eigenmorality” metric, and then see what happened when I ran the code to measure the morality of each participant in a simulated society.  What would happen?  Would the results conform to my pre-theoretic intuitions about what sort of behavior was moral and what wasn’t?  If not, then would watching the simulation give me new ideas about how to improve the morality metric?  Or would it be my intuitions themselves that would change?
Unfortunately, I never got around to the “coding it up” part—there’s a reason why I became a theorist!  The eigenmorality idea went onto my back burner, where it stayed for the next 16 years: 16 years in which our world descended ever further into darkness, lacking a principled way to quantify morality.  But finally, this year, just two separate things have happened on the eigenmorality front, and that’s why I’m blogging about it now.
Eigenjesus and Eigenmoses
The first thing that’s happened is that Tyler Singer-Clark, my superb former undergraduate advisee, did code up eigenmorality metrics and test them out on a simulated society, for his MIT senior thesis project.  You can read Tyler’s 12-page report here—it’s a fun, enjoyable, thought-provoking first research paper, one that I wholeheartedly recommend.  Or, if you’d like to experiment yourself with the Python code, you can download it here from github.  (Of course, all opinions expressed in this post are mine alone, not necessarily Tyler’s.)
Briefly, Tyler examined what eigenmorality has to say in the setting of an Iterated Prisoner’s Dilemma (IPD) tournament.  The Iterated Prisoner’s Dilemma is the famous game in which two players meet repeatedly, and in each turn can either “Cooperate” or “Defect.”  The absolute best thing, from your perspective, is if you defect while your partner cooperates.  But you’re also pretty happy if you both cooperate.  You’re less happy if you both defect, while the worst (from your standpoint) is if you cooperate while your partner defects.  At each turn, when contemplating what to do, you have the entire previous history of your interaction with this partner available to you.  And thus, for example, you can decide to “punish” your partner for past defections, “reward” her for past cooperations, or “try to take advantage” by unilaterally defecting and seeing what happens.  At each turn, the game has some small constant probability of ending—so you know approximately how many times you’ll meet this partner in the future, but you don’t know exactly when the last turn will be.  Your score, in the game, is then the sum-total of your score over all turns and all partners (where each player meets each other player once).
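For concreteness, here’s a bare-bones version of a single IPD match in Python.  The payoff numbers and the per-turn ending probability are stand-ins of my own choosing, not the parameters of Tyler’s actual tournament:

```python
import random

# Payoffs follow the standard ordering: temptation > reward > punishment > sucker's payoff.
PAYOFF = {('D', 'C'): (5, 0),   # I defect while you cooperate: best for me, worst for you
          ('C', 'C'): (3, 3),   # mutual cooperation
          ('D', 'D'): (1, 1),   # mutual defection
          ('C', 'D'): (0, 5)}   # I cooperate while you defect

def play_match(strategy_a, strategy_b, end_prob=0.01):
    """Play one match; each strategy sees the full history of this pairwise interaction."""
    history_a, history_b = [], []
    score_a = score_b = 0
    while True:
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
        if random.random() < end_prob:   # small constant probability that this turn is the last
            return score_a, score_b, history_a, history_b
```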
In the late 1970s, as recounted in his classic work The Evolution of Cooperation, Robert Axelrod invited people all over the world to submit computer programs for playing this game, which were then pitted against each other in the world’s first serious IPD tournament.  And, in a tale that’s been retold in hundreds of popular books, while many people submitted complicated programs that used machine learning, etc. to try to suss out their opponents, the program that won—hands-down, repeatedly—was TIT_FOR_TAT, a few lines of code submitted by the psychologist Anatol Rapoport to implement an ancient moral maxim.  TIT_FOR_TAT starts out by cooperating; thereafter, it simply does whatever its opponent did in the last move, swiftly rewarding every cooperation and punishing every defection, and ignoring everything earlier in the history.  In the decades since Axelrod, running Iterated Prisoners’ Dilemma tournaments has become a minor industry, with countless variations explored (for example, “evolutionary” versions, and versions allowing side-communication between the players), countless new strategies invented, and countless papers published.  To make a long story short, TIT_FOR_TAT continues to do quite well across a wide range of environments, but depending on the mix of players present, other strategies can sometimes beat TIT_FOR_TAT.  (As one example, if there’s a sizable minority of colluding players, who recognize each other by cooperating and defecting in a prearranged sequence, then those players can destroy TIT_FOR_TAT and other “simple” strategies, by cooperating with one another while defecting against everyone else.)
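To see just how short “a few lines of code” is, here’s TIT_FOR_TAT written against the toy conventions of the sketch above (the helper names and move encoding are mine, not Rapoport’s original submission, which was written for Axelrod’s tournament framework):

```python
def tit_for_tat(my_history, their_history):
    # Cooperate on the first move; thereafter, copy whatever the partner did last turn.
    return 'C' if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    # One of the "degenerate" baseline strategies.
    return 'D'

# Example, using play_match from the earlier sketch:
#   play_match(tit_for_tat, always_defect)
```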
Anyway, Tyler sets up and runs a fairly standard IPD tournament, with a mix of strategies that includes TIT_FOR_TAT, TIT_FOR_TWO_TATS, other TIT_FOR_TAT variations, PAVLOV, FRIEDMAN, EATHERLY, CHAMPION (see the paper for details), and degenerate strategies like always defecting, always cooperating, and playing randomly.  However, Tyler then asks an unusual question about the IPD tournament: namely, purely on the basis of the cooperate/defect sequences, which players should we judge to have acted morally toward their partners?
It might be objected that the players didn’t “know” they were going to be graded on morality: as far as they knew, they were just trying to maximize their individual utilities.  The trouble with that objection is that the players didn’t “know” they were trying to maximize their utilities either!  The players are bots, which do whatever their code tells them to do.  So in some sense, utility—no less than morality—is “merely an interpretation” that we impose on the raw cooperate/defect sequences!  There’s nothing to stop us from imposing some other interpretation (say, one that explicitly tries to measure morality) and seeing what happens.
In an attempt to measure the players’ morality, Tyler uses the eigenmorality idea from before.  The extent to which player A “cooperates” with player B is simply measured by the percentage of turns on which A cooperates in its encounters with B.  (One acknowledged limitation of this work is that, when two players both defect, there’s no attempt to take into account “who started it,” and to judge the aggressor more harshly than the retaliator—or to incorporate time in any other way.)  This then gives us a “cooperation matrix,” whose (i,j) entry records the total amount of niceness that player i displayed to player j.  Diagonalizing that matrix, and taking its principal eigenvector (the one associated with its largest eigenvalue), then gives us our morality scores.
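As a rough illustration of that pipeline (my own toy reconstruction, not Tyler’s code: the recorded histories are invented, and I’m ignoring the normalization choices a real implementation has to make):

```python
import numpy as np

# histories[(i, j)] = player i's recorded moves in its encounters with player j (invented data).
histories = {(0, 1): list("CCCDC"), (1, 0): list("CCCCC"),
             (0, 2): list("DDDDD"), (2, 0): list("DDCDD"),
             (1, 2): list("CCCCD"), (2, 1): list("DCDDD")}

n = 3
coop = np.zeros((n, n))
for (i, j), moves in histories.items():
    coop[i, j] = moves.count('C') / len(moves)   # fraction of turns on which i cooperated with j

# Morality scores: the principal eigenvector of the cooperation matrix.
eigenvalues, eigenvectors = np.linalg.eig(coop)
morality = eigenvectors[:, np.argmax(eigenvalues.real)].real
morality = np.abs(morality)   # fix the arbitrary overall sign; fine here since coop is nonnegative
print(np.round(morality, 3))
```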
Now, there’s a very interesting ambiguity in what I said above.  Namely, should we define the “niceness scores” to lie in [0,1] (so that the lowest, meanest possible score is 0), or in [-1,1] (so that it’s possible to have negative niceness)?  This might sound like a triviality, but in our setting, it’s precisely the mathematical reflection of one of the philosophical conundrums I mentioned earlier.  The conundrum can be stated as follows: is your morality a monotone function of your niceness?  We all agree, presumably, that it’s better to be nice to Gandhi than to be nice to Hitler.  But do you have a positive obligation to be not-nice to Hitler: to make him suffer because he made others suffer?  Or, OK, how about not Hitler, but someone who’s somewhat bad?  Consider, for example, a woman who falls in love with, and marries, an unrepentant armed robber (with full knowledge of who he is, and with other options available to her).  Is the woman morally praiseworthy for loving her husband despite his bad behavior?  Or is she blameworthy because, by rewarding his behavior with her love, she helps to enable it?
To capture two possible extremes of opinion about such questions, Tyler and I defined two different morality metrics, which we called … wait for it … eigenmoses and eigenjesus.  Eigenmoses has the niceness scores in [-1,1], which means that you’re actively rewarded for punishing evildoers: that is, for defecting against those who defect against many moral players.  Eigenjesus, by contrast, has the niceness scores in [0,1], which means that you always do at least as well by “turning the other cheek” and cooperating.  (Though note that, even with eigenjesus, you get more morality credits by cooperating with moral players than by cooperating with immoral ones.)
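In code, the entire difference between the two metrics is how the raw cooperation fractions are rescaled before the eigenvector computation.  A toy version (the matrix is made up, and the linear map to [-1,1] is just the obvious choice, not necessarily the exact convention Tyler uses):

```python
import numpy as np

# coop[i][j] = fraction of turns on which player i cooperated with player j (made-up numbers).
coop = np.array([[0.0, 0.9, 0.1],
                 [0.9, 0.0, 0.2],
                 [0.1, 0.3, 0.0]])

def principal_eigenvector(M):
    # Eigenvector for the eigenvalue of largest real part; the overall sign is arbitrary.
    w, v = np.linalg.eig(M)
    return v[:, np.argmax(w.real)].real

# Eigenjesus: niceness stays in [0, 1], so the worst you can do is earn nothing.
eigenjesus_scores = principal_eigenvector(coop)

# Eigenmoses: niceness is mapped into [-1, 1], so rewarding evildoers actively costs you credit.
moses_matrix = 2 * coop - 1
np.fill_diagonal(moses_matrix, 0)   # players don't play against themselves
eigenmoses_scores = principal_eigenvector(moses_matrix)

print("eigenjesus:", np.round(eigenjesus_scores, 3))
print("eigenmoses:", np.round(eigenmoses_scores, 3))
```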
This is probably a good place to mention a second limitation of Tyler’s current study.  Namely, with the current system, there’s no direct way for a player to find out how its partner has been behaving toward third parties.  The only information that A gets about the goodness or evilness of player B comes from A and B’s direct interaction.  Ideally, one would like to design bots that take into account, not only the other bots’ behavior toward them, but the other bots’ behavior toward each other.  So for example, even if someone is unfailingly nice to you, if that person is an asshole to everyone else, then the eigenmoses moral code would demand that you return the person’s cooperation with icy defection.  Conversely, even if Gandhi is mean and hateful to you, you would still be morally obliged (interestingly, on both the eigenmoses and eigenjesus codes) to be nice to him, because of the amount of good he does for everyone else.
Anyway, you can read Tyler’s paper if you want to see the results of computing the eigenmoses and eigenjesus scores for a diverse population of bots.  Briefly, the results accord pretty well with intuition.  When we look at eigenjesus scores, the all-cooperate bot comes out on top and the all-defect bot on the bottom (as is mathematically necessary), with TIT_FOR_TAT somewhere in the middle, and generous versions of TIT_FOR_TAT higher up.  When we look at eigenmoses, by contrast, TIT_FOR_TWO_TATS comes out on top, with TIT_FOR_TAT in sixth place, and the all-cooperate bot scoring below the median.  Interestingly, once again, the all-defect bot gets the lowest score (though in this case, it wasn’t mathematically necessary).
Even though the measures acquit themselves well in this particular tournament, it’s admittedly easy to construct scenarios where the prescriptions of eigenjesus and eigenmoses alike violently diverge from most people’s moral intuitions.  We’ve already touched on a few such scenarios above (for example, are you really morally obligated to lick the boots of someone who kicks you, just because that person is a saint to everyone other than you?).  Another type of scenario involves minorities.  Imagine, for instance, that 98% of the players are unfailingly nice to each other, but unfailingly cruel to the remaining 2% (who they can recognize, let’s say, by their long noses or darker skin—some trivial feature like that).  Meanwhile, the put-upon 2% return the favor by being nice to each other and mean to the 98%.  Who, in this scenario, is moral, and who’s immoral?  The mathematical verdict of both eigenmoses and eigenjesus is unequivocal: the 98% are almost perfectly good, while the 2% are almost perfectly evil.  After all, the 98% are nice to almost everyone, while the 2% are mean to those who are nice to almost everyone, and nice only to a tiny minority who are mean to almost everyone.  Of course, for much of human history, this is precisely how morality worked, in many people’s minds.  But I dare say it’s a result that would make moderns uncomfortable.
In summary, it seems clear to me that neither eigenmoses nor eigenjesus correctly captures our intuitions about morality, any more than Φ captures our intuitions about consciousness.  But as they say, I think there’s plenty of scope here for further research: for coming up with new mathematical measures that sharpen our intuitive judgments about morality, and (if we like) testing those measures out using IPD tournaments.  It also seems to me that there’s something fundamentally right about the eigenvector idea: all else being equal, we’d like to say, being nice to others is good, except that aiding and abetting evildoers is not good, and the way we can recognize the evildoers in our midst is that they’re not nice to others—except that, if the people who someone isn’t nice to are themselves evildoers, then the person might again be good, and so on.  The only way to cut off the infinite regress, it seems, is to demand some sort of “reflective equilibrium” in our moral judgments, and that’s precisely what eigenmorality tries to capture.  On the other hand, no such idea can ever make moral debate obsolete—if for no other reason than that we still need to decide which specific eigenmorality metric to use, and that choice is itself a moral judgment.
Scooped by Plato
Which brings me, finally, to the second new thing that’s happened this year on the eigenmorality front.  Namely, Rebecca Newberger Goldstein—who’s far and away my favorite contemporary novelist—published a charming new book entitled Plato at the Googleplex: Why Philosophy Won’t Go Away.  Here she imagines that Plato has reappeared in present-day America (she doesn’t bother to explain how), where he’s taught himself English and the basics of modern science, learned how to use the Internet, and otherwise gotten himself up to speed.  The book recounts Plato’s dialogues with various modern interlocutors, as he volunteers to have his brain scanned, guest-writes a relationship advice column, participates in a panel discussion on child-rearing, and gets interviewed on cable news by “Roy McCoy” (a thinly veiled Bill O’Reilly).  Often, Goldstein has Plato answer the moderns’ questions using direct quotes from the Timaeus, the Gorgias, the Meno, etc., which makes her Plato into a very intelligent sort of chatbot.  This is a genre that’s not often seriously attempted, and that I’d love to read more of (possible subjects: Shakespeare, Galileo, Jefferson, Lincoln, Einstein, Turing…).
Anyway, my favorite episode in the book is the first, eponymous one, where Plato visits the Googleplex in Mountain View.  While eating lunch in one of the many free cafeterias, Plato is cornered by a somewhat self-important, dreadlocked coder named Marcus, who tries to convince Plato that Google PageRank has finally solved the problem agonized over in the Republic, of how to define justice.  By using the Internet, we can simply crowd-source the answer, Marcus declares: get millions of people to render moral judgments on every conceivable question, and also moral judgments on each other’s judgments.  Then declare those judgments the most morally reliable, that are judged the most reliable by the people who are themselves the most morally reliable.  The circularity, as usual, is broken by taking the principal eigenvector of the graph of moral judgments (Goldstein doesn’t have Marcus put it that way, but it’s what she means).
Not surprisingly, Plato is skeptical.  Through Socratic questioning—the method he learned from the horse’s mouth—Plato manages to make Marcus realize that, in the very act of choosing which of several variants of PageRank to use in our crowd-sourced justice engine, we’ll implicitly be making moral choices already.  And therefore, we can’t use PageRank, or anything like it, as the ultimate ground of morality.
Whereas I imagined that the raw data for an “eigenmorality” metric would consist of numerical measures of how nice people had been to each other, Goldstein imagines the raw data to consist of abstract moral judgments, and of judgments about judgments.  Also, whereas the output of my kind of metric would be a measure of the “goodness” of each individual person, the outputs of hers would presumably be verdicts about general moral and political questions.  But, much like with CLEVER versus PageRank, it’s obvious that the ideas are similar—and that I should credit Goldstein with independently discovering my nerdy 16-year-old vision, in order to put it in the mouth of a nerdy character in her story.
As I said before, I agree with Goldstein’s Plato that eigenmorality can’t serve as the ultimate ground of morality.  But that’s a bit like saying that Google rank can’t serve as the ultimate ground of importance, because even just to design and evaluate their ranking algorithms, Google’s engineers must have some prior notion of “importance” to serve as a standard.  That’s true, of course, but it omits to mention that Google rank is still useful—useful enough to have changed civilization in the space of a few years.  Goldstein’s book has the wonderful property that even the ideas she gives to her secondary characters, the ones who serve as foils to Plato, are sometimes interesting enough to deserve book-length treatments of their own, and crowd-sourced morality strikes me as a perfect example.
In the two previous comment threads, we got into a discussion of anthropogenic climate change, and of my own preferred way to address it and related threats to our civilization’s survival, which is simply to tax every economic activity at a rate commensurate with the environmental damage that it does, and use the funds collected for cleanup, mitigation, and research into alternatives.  (Obviously, such ideas are nonstarters in the current political climate of the US, but I’m not talking here about what’s feasible, only about what’s necessary.)  As several commenters pointed out, my view raises an obvious question: who is to decide how much “damage” each activity causes, and thus how much it should be taxed?  Of course, this is merely a special case of the more general question: who is to decide on any question of public policy whatsoever?
For the past few centuries, our main method for answering such questions—in those parts of the world where a king or dictator or Politburo doesn’t decree the answer—has been representative democracy.  Democracy is, arguably, the best decision-making method that our sorry species has ever managed to put into practice, at least outside the hard sciences.  But in my view, representative democracy is now failing spectacularly at possibly the single most important problem it’s ever faced: namely, that of leaving our descendants a livable planet.  Even though, by and large, reasonable people mostly agree about what needs to be done—weaning ourselves off fossil fuels (especially the dirtier ones), switching to solar, wind, and nuclear, planting forests and stopping deforestation, etc.—after decades of debate we’re still taking only limping, token steps toward those goals, and in many cases we’re moving rapidly in the opposite direction.  Those who, for financial, theological, or ideological reasons, deny the very existence of a problem, have proved that despite being a minority, they can push hard enough on the levers of democracy to prevent anything meaningful from happening.
So what’s the solution?  To put the world under the thumb of an environmentalist dictator?  Absolutely not.  In all of history, I don’t think any dictatorial system has ever shown itself robust against takeover by murderous tyrants (people who probably aren’t too keen on alternative energy either).  The problem, I think, is epistemological.  Within physics and chemistry and climatology, the people who think anthropogenic climate change exists and is a serious problem have won the argument—but the news of their intellectual victory hasn’t yet spread to the opinion page of the Wall Street Journal, or cable news, or the US Congress, or the minds of enough people to tip the scales of history.  Because our domination of the earth’s climate and biosphere is new and unfamiliar; because the evidence for rapid climate change is complicated and statistical; because the worst effects are still remote from us in time, space, or both; because the sacrifices needed to address the problem are real—for all of these reasons, the deniers have learned that they can subvert the Popperian process by which bad explanations are discarded and good explanations win.  If you just repeat debunked ideas through a loud enough megaphone, it turns out, many onlookers won’t be able to tell the difference between you and the people who have genuine knowledge—or they will eventually, but not until it’s too late.  If you have a few million dollars, you can even set up your own parody of the scientific process: your own phony experts, in their own phony think tanks, with their own phony publications, giving each other legitimacy by citing each other.  (Of course, all this is a problem for many fields, not just climate change.  Climate is special only because there, the future of life on earth might literally hinge on our ability to get epistemology right.)
Yet for all that, I’m an optimist—sort of.  For it seems to me that the Internet has given us new tools with which to try to fix our collective epistemology, without giving up on a democratic society.  Google, Wikipedia, Quora, and so forth have already improved our situation, if only by a little.  We could improve it a lot more.  Consider, for example, the following attempted definitions:
A trustworthy source of information is one that’s considered trustworthy by many sources who are themselves trustworthy (on the same topic or on closely related topics).  The current scientific consensus, on any given issue, is what the trustworthy sources consider to be the consensus.  A good decision-maker is someone who’s considered to be a good decision-maker by many other good decision-makers.
At first glance, the above definitions sound ludicrously circular—even Orwellian—but we now know that all that’s needed to unravel the circularity is a principal eigenvector computation on the matrix of trust.  And the computation of such an eigenvector need be no more “Orwellian” than … well, Google.  If enough people want it, then we have the tools today to put flesh on these definitions, to give them agency: to build a crowd-sourced deliberative democracy, one that “usually just works” in much the same way Google usually just works.
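Here’s roughly what that eigenvector computation would look like, on an invented four-person trust matrix.  This is only a sketch: a deployed system would need damping, topic-specific matrices, and defenses against manipulation that I’m waving away.

```python
import numpy as np

# trust[i][j] = how much participant i trusts participant j on some topic (invented numbers).
trust = np.array([[0.0, 0.8, 0.7, 0.0],
                  [0.9, 0.0, 0.6, 0.0],
                  [0.7, 0.8, 0.0, 0.1],
                  [0.0, 0.0, 0.2, 0.0]])

# Normalize rows so each participant distributes one unit of trust among the others.
T = trust / trust.sum(axis=1, keepdims=True)

# Trustworthiness = the stationary distribution of "trust flowing toward the already-trusted."
scores = np.ones(T.shape[0]) / T.shape[0]
for _ in range(200):
    scores = scores @ T   # each person's trust counts in proportion to their own trustworthiness

print(np.round(scores, 3))
```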
Now, would those with axes to grind try to subvert such a system the instant it went online?  Certainly.  For example, I assume that millions of people would rate Conservapedia as a more trustworthy source than Wikipedia—and would rate other people who had done so as, themselves, trustworthy sources, while rating as untrustworthy anyone who called Conservapedia untrustworthy.  So there would arise a parallel world of trust and consensus and “expertise,” mutually-reinforcing yet nearly disjoint from the world of the real.  But here’s the thing: anyone would be able to see, with the click of a mouse, the extent to which this parallel world had diverged from the real one.  They’d see that there was a huge, central connected component in the trust graph—including almost all of the Nobel laureates, physicists from the US nuclear weapons labs, military planners, actuaries, other hardheaded people—who all accepted the reality of humans warming the planet, and only tiny, isolated tendrils of trust reaching from that component into the component of Rush Limbaugh and James Inhofe.  The deniers and their think-tanks would be exposed to the sun; they’d lose their thin cover of legitimacy.  It should go without saying that the same would happen to various charlatans on the left, and should go without saying that I’d cheer that outcome as well.
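One could visualize the “giant component” picture with a few lines of networkx; the node names and edges below are whimsical placeholders, and the idea of thresholding the trust matrix into a graph is my own shorthand rather than part of any worked-out proposal:

```python
import networkx as nx

# Invented "who trusts whom on climate" edges: in practice, the above-threshold
# entries of the trust matrix.
G = nx.DiGraph()
G.add_edges_from([("noaa", "ipcc"), ("ipcc", "noaa"), ("actuary", "ipcc"),
                  ("weapons_lab", "noaa"), ("pundit_A", "pundit_B"),
                  ("pundit_B", "pundit_A"), ("pundit_B", "think_tank")])

components = sorted(nx.weakly_connected_components(G), key=len, reverse=True)
print("giant trust component:", components[0])
print("isolated tendrils:    ", components[1:])
```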
Some will object: but people who believe in pseudosciences—whether creationists or anti-vaxxers or climate change deniers—already know they’re in a minority!  And far from being worried about it, they treat it as a badge of honor.  They think they’re Galileo, that their belief in spite of a scientific consensus makes them heroes, while those in the giant central component of the trust graph are merely slavish followers.
I admit all this.  But the point of an eigentrust system wouldn’t be to convince everyone.  As long as I’m fantasizing, the point would be that, once people’s individual decisions did give rise to a giant connected trust component, the recommendations of that component could acquire the force of law.  The formation of the giant component would be the signal that there’s now enough of a consensus to warrant action, despite the continuing existence of a vocal dissenting minority—that the minority has, in effect, withdrawn itself from the main conversation and retreated into a different discourse.  Conversely, it’s essential to note, if there were a dissenting minority, but that minority had strong trunks of topic-relevant trust pointing toward it from the main component (for example, because the minority contained a large fraction of the experts in the relevant field), then the minority’s objections might be enough to veto action, even if it was numerically small.  This is still democracy; it’s just democracy enhanced by linear algebra.
Other people will object that, while we should use the Internet to improve the democratic process, the idea we’re looking for is not eigentrust or eigenmorality but rather prediction markets.  Such markets would allow us to, as my friend Robin Hanson advocates, “vote on values but bet on beliefs.”  For example, a country could vote for the conditional policy that, if business-as-usual is predicted to cause sea levels to rise at least 4 meters by the year 2200, then an aggressive emissions reduction plan will be triggered, but not otherwise.  But as for the prediction itself, that would be left to a futures market: a place where, unlike with voting, there’s a serious penalty for being wrong, namely losing your shirt.  If the futures market assigned the prediction at least such-and-such a probability, then the policy tied to that prediction would become law.
I actually like the idea of prediction markets—I have ever since I heard about them—but I consider them limited in scope.  My example above, involving the year 2200, gives a hint as to why.  Prediction markets are great whenever our disagreements are over something that will be settled one way or the other, to everyone’s assent, in the near future (e.g., who will win the World Cup, or next year’s GDP).  But most of our important disagreements aren’t like that: they’re over which direction society should move in, which issues to care about, which statistical indicators are even the right ones to measure a country’s health.  Now, those broader questions can sometimes be settled empirically, in a sense: they can be settled by the overwhelming judgment of history, as the slavery, women’s suffrage, and fascism debates were.  But that kind of empirical confirmation typically takes way too long for anyone to set up a decent betting market around it.  And for the non-bettable questions, a carefully-crafted eigendemocracy really is the best system I can think of.
Again, I think Rebecca Goldstein’s Plato is completely right that such a system, were it implemented, couldn’t possibly solve the philosophical problem of finding the “ultimate ground of justice,” just like Google can’t provide us with the “ultimate ground of importance.”  If nothing else, we’d still need to decide which of the many possible eigentrust metrics to use, and we couldn’t use eigentrust for that without risking an infinite regress.  But just like Google, whatever its flaws, works well enough for you to use it dozens of times per day, so a crowd-sourced eigendemocracy might—just might—work well enough to save civilization.
* * *
Update (6/20): If you haven’t been following, there’s an excellent discussion in the comments, with, as I’d hoped, many commenters raising strong and pertinent objections to the eigenmorality and eigendemocracy concepts, while also proposing possible fixes.  Let me now mention what I think are the most important problems with eigenmorality and eigendemocracy respectively—both of them things that had occurred to me also, but that the commenters have brought out very clearly and explicitly.
With eigenmorality, perhaps the most glaring problem is that, as I mentioned before, there’s no notion of time-ordering, or of “who started it,” in the definition that Tyler and I were using.  As Luca Trevisan aptly points out in the comments, this has the consequence that eigenmorality, as it stands, is completely unable to distinguish between a crime syndicate that’s hated by the majority because of its crimes, and an equally-large ethnic minority that’s hated by the majority solely because it’s different, and that therefore hates the majority.  However, unlike with mathematical theories of consciousness—where I used counterexamples to try to show that no mathematical definition of a certain kind could possibly capture our intuitions about consciousness—here the problem strikes me as much more circumscribed and bounded.  It’s far from obvious to me that we can’t easily improve the definition of eigenmorality so that it does agree with most people’s moral intuition, whenever intuition renders a clear verdict, at least in the limited setting of Iterated Prisoners’ Dilemma tournaments.
Let’s see, in particular, how to solve the problem that Luca stressed.  As a first pass, we could do so as follows:
A moral agent is one who only initiates defection against agents who it has good reason to believe are immoral (where, as usual, linear algebra is used to unravel the definition’s apparent circularity).
Notice that I’ve added two elements to the setup: not only time but also knowledge.  If you shun someone solely because you don’t like how they look, then we’d like to say that reflects poorly on you, even if (unbeknownst to you) it turns out that the person really is an asshole.  Now, several more clauses would need to be added to the above definition to flesh it out: for example, if you’ve initiated defection against an immoral person, but then the person stops being immoral, at what point do you have a moral duty to “forgive and forget”?  Also, just like with the eigenmoses/eigenjesus distinction, do you have a positive duty to initiate defection against an agent who you learn is immoral, or merely no duty not to do so?
OK, so after we handle the above issues, will there still be examples that our time-sensitive, knowledge-sensitive eigenmorality definition gets badly, egregiously wrong?  Maybe—I don’t know!  Please let me know in the comments.
Moving on to eigendemocracy, here I think the biggest problem is one pointed out by commenter Rahul.  Namely, an essential aspect of how Google is able to work so well is that people have reasons for linking to webpages other than boosting those pages’ Google rank.  In other words, Google takes a link structure that already exists, independently of its ranking algorithm, and that (as the economists would put it) encodes people’s “revealed preferences,” and exploits that structure for its own purposes.  Of course, now that Google is the main way many of us navigate the web, increasing Google rank has become a major reason for linking to a webpage, and an entire SEO industry has arisen to try to game the rankings.  But Google still isn’t the only reason for linking, so the link structure still contains real information.
By contrast, consider an eigendemocracy, with a giant network encoding who trusts whom on what subject.  If the only reason why this trust network existed was to help make political decisions, then gaming the system would probably be rampant: people could simply decide first which political outcome they wanted, then choose the “experts” such that claiming to “trust” them would do the most for their favored outcome.  It follows that this system can only improve on ordinary democracy if the trust network has some other purpose, so that the participants have an actual incentive to reveal the truth about who they trust.  So, how would an eigendemocracy suss out the truth about who trusts whom on which subject?  I don’t have a very good answer to this, and am open to suggestions.  The best idea so far is to use Facebook for this purpose, but I don’t know exactly how.
* * *
Update (6/22): Many commenters, both here and on Hacker News, interpreted me to be saying something obviously stupid: namely, that any belief identified as “the consensus” by an eigenvector analysis is therefore the morally right one. They then energetically knocked down this strawman, with the standard examples (Hitler, slavery, discrimination against gays).
Admittedly, I probably contributed to this confusion by my ill-advised decision to discuss eigenmorality and eigendemocracy in the same blog post—solely because of their mathematical similarity, and the ease with which thinking about one leads to thinking about the other. But the two are different, as are my claims about them. For the record:
* Eigenmorality: Within the stylized setting of an Iterated Prisoner’s Dilemma tournament, with side-channels allowing agents to learn who is doing what to whom, I believe it ought to be possible, by looking at who initiated rounds of defection and forgiveness, and then doing an eigenvector analysis on the result, to identify the “moral” and “immoral” agents in a way that more-or-less accords with our moral intuitions. Even if true, of course, this wouldn’t have any obvious moral implications for hot-button issues such as abortion, gun control, or climate change, which it’s far from obvious how to encode in terms of IPD tournaments.
* Eigendemocracy: By doing an eigenvector analysis, to identify who people implicitly acknowledge as the “experts” within each field, I believe that it might be possible to produce results that, on average, in practice, and in contemporary society, are better and more rational than those produced by ordinary majority-voting. Obviously, there’s no guarantee whatsoever that the results of eigendemocracy would be morally acceptable ones: if the public acknowledges as “experts” people who believe evil things (as in Nazi Germany), then eigendemocracy will produce evil results. But democracy itself suffers from a precisely analogous problem. The situation that interests me is one that’s been with us since the time of ancient Athens: one where there is a consensus among the experts about the wisest course of action, and there’s also an implicit consensus among the public that those experts are indeed the experts, but the democratic system is somehow “unable to complete the modus ponens,” because of manipulation by powerful interests and the sway of demagogues. In such cases, it seems possible to me that an eigendemocracy could improve on the results of ordinary democracy—perhaps dramatically so—while still avoiding the evils of dictatorship.
Crucially, in neither of the above bullet points, nor in their combination, is there any hint of a belief that “the will of the majority always defines what’s morally right” (if anything, there’s a belief in the opposite).
* * *
Update (7/4): While this isn’t really a surprise—I’d be astonished if it weren’t the case—I’ve now learned that several people, besides me and Rebecca Goldstein, have previously written about the ideas of eigentrust and eigendemocracy. Perhaps more surprising is that one of the earlier groups—consisting of Sep Kamvar, Mario Schlosser, and Hector Garcia-Molina from Stanford—literally called the idea “EigenTrust,” when they published about it in 2003. (Note that Garcia-Molina, in a likely non-coincidence, was Larry Page and Sergey Brin’s PhD adviser.) Kamvar et al.’s intended application for EigenTrust was to determine which nodes are trustworthy in a peer-to-peer file-sharing network, rather than (say) to reinvent democracy, or to address conundrums of epistemology and ethics that have been with us since Plato. But while the scope might be more modest, the core idea is the same. (Hat tip to commenter Babak.)
As for enhancing democracy using linear algebra, it turns out that that too has already been discussed: see for example this presentation by Rob Spekkens of the Perimeter Institute, which Michael Nielsen pointed me to. (In yet another small-world phenomenon, Rob’s main interest is in quantum foundations, and in that context I’ve known him for a decade! But his interest in eigendemocracy was news to me.)
If you’re wondering whether anything in this post was original … well, so far, I haven’t learned of prior work specifically about eigenmorality (e.g., in Iterated Prisoners’ Dilemma tournaments), much less about eigenmoses and eigenjesus.
You might recall that last week I wrote a post criticizing Integrated Information Theory (IIT), and its apparent implication that a simple Reed-Solomon decoding circuit would, if scaled to a large enough size, bring into being a consciousness vastly exceeding our own.  On Wednesday Giulio Tononi, the creator of IIT, was kind enough to send me a fascinating 14-page rebuttal, and to give me permission to share it here:
Why Scott should stare at a blank wall and reconsider (or, the conscious grid)
If you’re interested in this subject at all, then I strongly recommend reading Giulio’s response before continuing further.   But for those who want the tl;dr: Giulio, not one to battle strawmen, first restates my own argument against IIT with crystal clarity.  And while he has some minor quibbles (e.g., apparently my calculations of Φ didn’t use the most recent, “3.0” version of IIT), he wisely sets those aside in order to focus on the core question: according to IIT, are all sorts of simple expander graphs conscious?
There, he doesn’t “bite the bullet” so much as devour a bullet hoagie with mustard.  He affirms that, yes, according to IIT, a large network of XOR gates arranged in a simple expander graph is conscious.  Indeed, he goes further, and says that the “expander” part is superfluous: even a network of XOR gates arranged in a 2D square grid is conscious.  In my language, Giulio is simply pointing out here that a √n×√n square grid has decent expansion: good enough to produce a Φ-value of about √n, if not the information-theoretic maximum of n (or n/2, etc.) that an expander graph could achieve.  And apparently, by Giulio’s lights, Φ=√n is sufficient for consciousness!
While Giulio never mentions this, it’s interesting to observe that logic gates arranged in a 1-dimensional line would produce a tiny Φ-value (Φ=O(1)).  So even by IIT standards, such a linear array would not be conscious.  Yet the jump from a line to a two-dimensional grid is enough to light the spark of Mind.
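Putting the three topologies side by side, and taking the rough estimates above (and in my previous post) at face value, with n the number of gates:

$$\Phi(\text{1D line}) = O(1), \qquad \Phi(\sqrt{n}\times\sqrt{n}\ \text{grid}) = \Theta(\sqrt{n}), \qquad \Phi(\text{expander}) = \Theta(n).$$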
Personally, I give Giulio enormous credit for having the intellectual courage to follow his theory wherever it leads.  When the critics point out, “if your theory were true, then the Moon would be made of peanut butter,” he doesn’t try to wiggle out of the prediction, but proudly replies, “yes, chunky peanut butter—and you forgot to add that the Earth is made of Nutella!”
Yet even as we admire Giulio’s honesty and consistency, his stance might also prompt us, gently, to take another look at this peanut-butter-moon theory, and at what grounds we had for believing it in the first place.  In his response essay, Giulio offers four arguments (by my count) for accepting IIT despite, or even because of, its conscious-grid prediction: one “negative” argument and three “positive” ones.  Alas, while your Φ-lage may vary, I didn’t find any of the four arguments persuasive.  In the rest of this post, I’ll go through them one by one and explain why.
I. The Copernicus-of-Consciousness Argument
Like many commenters on my last post, Giulio heavily criticizes my appeal to “common sense” in rejecting IIT.  Sure, he says, I might find it “obvious” that a huge Vandermonde matrix, or its physical instantiation, isn’t conscious.  But didn’t people also find it “obvious” for millennia that the Sun orbits the Earth?  Isn’t the entire point of science to challenge common sense?  Clearly, then, the test of a theory of consciousness is not how well it upholds “common sense,” but how well it fits the facts.
The above position sounds pretty convincing: who could dispute that observable facts trump personal intuitions?  The trouble is, what are the observable facts when it comes to consciousness?  The anti-common-sense view gets all its force by pretending that we’re in a relatively late stage of research—namely, the stage of taking an agreed-upon scientific definition of consciousness, and applying it to test our intuitions—rather than in an extremely early stage, of agreeing on what the word “consciousness” is even supposed to mean.
Since I think this point is extremely important—and of general interest, beyond just IIT—I’ll expand on it with some analogies.
Suppose I told you that, in my opinion, the ε-δ definition of continuous functions—the one you learn in calculus class—failed to capture the true meaning of continuity.  Suppose I told you that I had a new, better definition of continuity—and amazingly, when I tried out my definition on some examples, it turned out that ⌊x⌋ (the floor function) was continuous, whereas x² had discontinuities, though only at 17.5 and 42.
You would probably ask what I was smoking, and whether you could have some.  But why?  Why shouldn’t the study of continuity produce counterintuitive results?  After all, even the standard definition of continuity leads to some famously weird results, like that x sin(1/x) is a continuous function, even though sin(1/x) is discontinuous.  And it’s not as if the standard definition is God-given: people had been using words like “continuous” for centuries before Bolzano, Weierstrass, et al. formalized the ε-δ definition, a definition that millions of calculus students still find far from intuitive.  So why shouldn’t there be a different, better definition of “continuous,” and why shouldn’t it reveal that a step function is continuous while a parabola is not?
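(For anyone who wants the one-line justification, with the standard convention that the function is assigned the value 0 at x = 0: the squeeze

$$\bigl|\,x \sin(1/x)\,\bigr| \le |x| \to 0 \quad \text{as } x \to 0$$

shows that x sin(1/x) is continuous at 0, and it’s plainly continuous everywhere else; whereas sin(1/x) oscillates between −1 and 1 in every neighborhood of 0, so no value assigned at 0 can make it continuous there.)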
In my view, the way out of this conceptual jungle is to realize that, before any formal definitions, any ε’s and δ’s, we start with an intuition for what we’re trying to capture by the word “continuous.”  And if we press hard enough on what that intuition involves, we’ll find that it largely consists of various “paradigm-cases.”  A continuous function, we’d say, is a function like 3x, or x², or sin(x), while a discontinuity is the kind of thing that the function 1/x has at x=0, or that ⌊x⌋ has at every integer point.  Crucially, we use the paradigm-cases to guide our choice of a formal definition—not vice versa!  It’s true that, once we have a formal definition, we can then apply it to “exotic” cases like x sin(1/x), and we might be surprised by the results.  But the paradigm-cases are different.  If, for example, our definition told us that x² was discontinuous, that wouldn’t be a “surprise”; it would just be evidence that we’d picked a bad definition.  The definition failed at the only task for which it could have succeeded: namely, that of capturing what we meant.
Some people might say that this is all well and good in pure math, but empirical science has no need for squishy intuitions and paradigm-cases.  Nothing could be further from the truth.  Suppose, again, that I told you that physicists since Kelvin had gotten the definition of temperature all wrong, and that I had a new, better definition.  And, when I built a Scott-thermometer that measures true temperatures, it delivered the shocking result that boiling water is actually colder than ice.  You’d probably tell me where to shove my Scott-thermometer.  But wait: how do you know that I’m not the Copernicus of heat, and that future generations won’t celebrate my breakthrough while scoffing at your small-mindedness?
I’d say there’s an excellent answer: because what we mean by heat is “whatever it is that boiling water has more of than ice” (along with dozens of other paradigm-cases).  And because, if you use a thermometer to check whether boiling water is hotter than ice, then the term for what you’re doing is calibrating your thermometer.  When the clock strikes 13, it’s time to fix the clock, and when the thermometer says boiling water’s colder than ice, it’s time to replace the thermometer—or if needed, even the entire theory on which the thermometer is based.
Ah, you say, but doesn’t modern physics define heat in a completely different, non-intuitive way, in terms of molecular motion?  Yes, and that turned out to be a superb definition—not only because it was precise, explanatory, and applicable to cases far beyond our everyday experience, but crucially, because it matched common sense on the paradigm-cases.  If it hadn’t given sensible results for boiling water and ice, then the only possible conclusion would be that, whatever new quantity physicists had defined, they shouldn’t call it “temperature,” or claim that their quantity measured the amount of “heat.”  They should call their new thing something else.
The implications for the consciousness debate are obvious.  When we consider whether to accept IIT’s equation of integrated information with consciousness, we don’t start with any agreed-upon, independent notion of consciousness against which the new notion can be compared.  The main things we start with, in my view, are certain paradigm-cases that gesture toward what we mean:
* You are conscious (though not when anesthetized).
* (Most) other people appear to be conscious, judging from their behavior.
* Many animals appear to be conscious, though probably to a lesser degree than humans (and the degree of consciousness in each particular species is far from obvious).
* A rock is not conscious.  A wall is not conscious.  A Reed-Solomon code is not conscious.  Microsoft Word is not conscious (though a Word macro that passed the Turing test conceivably would be).
Fetuses, coma patients, fish, and hypothetical AIs are the x sin(1/x)’s of consciousness: they’re the tougher cases, the ones where we might actually need a formal definition to adjudicate the truth.
Now, given a proposed formal definition for an intuitive concept, how can we check whether the definition is talking about the same thing we were trying to get at before?  Well, we can check whether the definition at least agrees that parabolas are continuous while step functions are not, that boiling water is hot while ice is cold, and that we’re conscious while Reed-Solomon decoders are not.  If so, then the definition might be picking out the same thing that we meant, or were trying to mean, pre-theoretically (though we still can’t be certain).  If not, then the definition is certainly talking about something else.
What else can we do?
II. The Axiom Argument
According to Giulio, there is something else we can do, besides relying on paradigm-cases.  That something else, in his words, is to lay down “postulates about how the physical world should be organized to support the essential properties of experience,” then use those postulates to derive a consciousness-measuring quantity.
OK, so what are IIT’s postulates?  Here’s how Giulio states the five postulates leading to Φ in his response essay (he “derives” these from earlier “phenomenological axioms,” which you can find in the essay):
1. A system of mechanisms exists intrinsically if it can make a difference to itself, by affecting the probability of its past and future states, i.e. it has causal power (existence).
2. It is composed of submechanisms each with their own causal power (composition).
3. It generates a conceptual structure that is the specific way it is, as specified by each mechanism’s concept — this is how each mechanism affects the probability of the system’s past and future states (information).
4. The conceptual structure is unified — it cannot be decomposed into independent components (integration).
5. The conceptual structure is singular — there can be no superposition of multiple conceptual structures over the same mechanisms and intervals of time (exclusion).
From my standpoint, these postulates have three problems.  First, I don’t really understand them.  Second, insofar as I do understand them, I don’t necessarily accept their truth.  And third, insofar as I do accept their truth, I don’t see how they lead to Φ.
To elaborate a bit:
I don’t really understand the postulates.  I realize that the postulates are explicated further in the many papers on IIT.  Unfortunately, while it’s possible that I missed something, in all of the papers that I read, the definitions never seemed to “bottom out” in mathematical notions that I understood, like functions mapping finite sets to other finite sets.  What, for example, is a “mechanism”?  What’s a “system of mechanisms”?  What’s “causal power”?  What’s a “conceptual structure,” and what does it mean for it to be “unified”?  Alas, it doesn’t help to define these notions in terms of other notions that I also don’t understand.  And yes, I agree that all these notions can be given fully rigorous definitions, but there could be many different ways to do so, and the devil could lie in the details.  In any case, because (as I said) it’s entirely possible that the failure is mine, I place much less weight on this point than I do on the two points to follow.
I don’t necessarily accept the postulates’ truth.  Is consciousness a “unified conceptual structure”?  Is it “singular”?  Maybe.  I don’t know.  It sounds plausible.  But at any rate, I’m far less confident about any of these postulates—whatever one means by them!—than I am about my own “postulate,” which is that you and I are conscious while my toaster is not.  Note that my postulate, though not phenomenological, does have the merit of constraining candidate theories of consciousness in an unambiguous way.
I don’t see how the postulates lead to Φ.  Even if one accepts the postulates, how does one deduce that the “amount of consciousness” should be measured by Φ, rather than by some other quantity?  None of the papers I read—including the ones Giulio linked to in his response essay—contained anything that looked to me like a derivation of Φ.  Instead, there was general discussion of the postulates, and then Φ just sort of appeared at some point.  Furthermore, given the many idiosyncrasies of Φ—the minimization over all bipartite (why just bipartite? why not tripartite?) decompositions of the system, the need for normalization (or something else in version 3.0) to deal with highly-unbalanced partitions—it would be quite a surprise were it possible to derive its specific form from postulates of such generality.
I was going to argue for that conclusion in more detail, when I realized that Giulio had kindly done the work for me already.  Recall that Giulio chided me for not using the “latest, 2014, version 3.0” edition of Φ in my previous post.  Well, if the postulates uniquely determined the form of Φ, then what’s with all these upgrades?  Or has Φ’s definition been changing from year to year because the postulates themselves have been changing?  If the latter, then maybe one should wait for the situation to stabilize before trying to form an opinion of the postulates’ meaningfulness, truth, and completeness?
III. The Ironic Empirical Argument
Or maybe not.  Despite all the problems noted above with the IIT postulates, Giulio argues in his essay that there’s a good reason to accept them: namely, they explain various empirical facts from neuroscience, and lead to confirmed predictions.  In his words:
[A] theory’s postulates must be able to explain, in a principled and parsimonious way, at least those many facts about consciousness and the brain that are reasonably established and non-controversial.  For example, we know that our own consciousness depends on certain brain structures (the cortex) and not others (the cerebellum), that it vanishes during certain periods of sleep (dreamless sleep) and reappears during others (dreams), that it vanishes during certain epileptic seizures, and so on.  Clearly, a theory of consciousness must be able to provide an adequate account for such seemingly disparate but largely uncontroversial facts.  Such empirical facts, and not intuitions, should be its primary test…
[I]n some cases we already have some suggestive evidence [of the truth of the IIT postulates’ predictions].  One example is the cerebellum, which has 69 billion neurons or so — more than four times the 16 billion neurons of the cerebral cortex — and is as complicated a piece of biological machinery as any.  Though we do not understand exactly how it works (perhaps even less than we understand the cerebral cortex), its connectivity definitely suggests that the cerebellum is ill suited to information integration, since it lacks lateral connections among its basic modules.  And indeed, though the cerebellum is heavily connected to the cerebral cortex, removing it hardly affects our consciousness, whereas removing the cortex eliminates it.
I hope I’m not alone in noticing the irony of this move.  But just in case, let me spell it out: Giulio has stated, as “largely uncontroversial facts,” that certain brain regions (the cerebellum) and certain states (dreamless sleep) are not associated with our consciousness.  He then views it as a victory for IIT, if those regions and states turn out to have lower information integration than the regions and states that he does take to be associated with our consciousness.
But how does Giulio know that the cerebellum isn’t conscious?  Even if it doesn’t produce “our” consciousness, maybe the cerebellum has its own consciousness, just as rich as the cortex’s but separate from it.  Maybe removing the cerebellum destroys that other consciousness, unbeknownst to “us.”  Likewise, maybe “dreamless” sleep brings about its own form of consciousness, one that (unlike dreams) we never, ever remember in the morning.
Giulio might take the implausibility of those ideas as obvious, or at least as “largely uncontroversial” among neuroscientists.  But here’s the problem with that: he just told us that a 2D square grid is conscious!  He told us that we must not rely on “commonsense intuition,” or on any popular consensus, to say that if a square mesh of wires is just sitting there XORing some input bits, doing nothing at all that we’d want to call intelligent, then it’s probably safe to conclude that the mesh isn’t conscious.  So then why shouldn’t he say the same for the cerebellum, or for the brain in dreamless sleep?  By Giulio’s own rules (the ones he used for the mesh), we have no a-priori clue whether those systems are conscious or not—so even if IIT predicts that they’re not conscious, that can’t be counted as any sort of success for IIT.
For me, the point is even stronger: I, personally, would be a million times more inclined to ascribe consciousness to the human cerebellum, or to dreamless sleep, than I would to the mesh of XOR gates.  For it’s not hard to imagine neuroscientists of the future discovering “hidden forms of intelligence” in the cerebellum, and all but impossible to imagine them doing the same for the mesh.  But even if you put those examples on the same footing, still the take-home message seems clear: you can’t count it as a “success” for IIT if it predicts that the cerebellum is unconscious, while at the same time denying that it’s a “failure” for IIT if it predicts that a square mesh of XOR gates is conscious.  If the unconsciousness of the cerebellum can be considered an “empirical fact,” safe enough for theories of consciousness to be judged against it, then surely the unconsciousness of the mesh can also be considered such a fact.
IV. The Phenomenology Argument
I now come to, for me, the strangest and most surprising part of Giulio’s response.  Despite his earlier claim that IIT need not dovetail with “commonsense intuition” about which systems are conscious—that it can defy intuition—at some point, Giulio valiantly tries to reprogram our intuition, to make us feel why a 2D grid could be conscious.  As best I can understand, the argument seems to be that, when we stare at a blank 2D screen, we form a rich experience in our heads, and that richness must be mirrored by a corresponding “intrinsic” richness in 2D space itself:
[I]f one thinks a bit about it, the experience of empty 2D visual space is not at all empty, but contains a remarkable amount of structure.  In fact, when we stare at the blank screen, quite a lot is immediately available to us without any effort whatsoever.  Thus, we are aware of all the possible locations in space (“points”): the various locations are right “there”, in front of us.  We are aware of their relative positions: a point may be left or right of another, above or below, and so on, for every position, without us having to order them.  And we are aware of the relative distances among points: quite clearly, two points may be close or far, and this is the case for every position.  Because we are aware of all of this immediately, without any need to calculate anything, and quite regularly, since 2D space pervades most of our experiences, we tend to take for granted the vast set of relationship[s] that make up 2D space.
And yet, says IIT, given that our experience of the blank screen definitely exists, and it is precisely the way it is — it is 2D visual space, with all its relational properties — there must be physical mechanisms that specify such phenomenological relationships through their causal power … One may also see that the causal relationships that make up 2D space obtain whether the elements are on or off.  And finally, one may see that such a 2D grid is necessary not so much to represent space from the extrinsic perspective of an observer, but to create it, from its own intrinsic perspective.
Now, it would be child’s-play to criticize the above line of argument for conflating our consciousness of the screen with the alleged consciousness of the screen itself.  To wit:  Just because it feels like something to see a wall, doesn’t mean it feels like something to be a wall.  You can smell a rose, and the rose can smell good, but that doesn’t mean the rose can smell you.
However, I actually prefer a different tack in criticizing Giulio’s “wall argument.”  Suppose I accepted that my mental image of the relationships between certain entities was relevant to assessing whether those entities had their own mental life, independent of me or any other observer.  For example, suppose I believed that, if my experience of 2D space is rich and structured, then that’s evidence that 2D space is rich and structured enough to be conscious.
Then my question is this: why shouldn’t the same be true of 1D space?  After all, my experience of staring at a rope is also rich and structured, no less than my experience of staring at a wall.  I perceive some points on the rope as being toward the left, others as being toward the right, and some points as being between two other points.  In fact, the rope even has a structure—namely, a natural total ordering on its points—that the wall lacks.  So why does IIT cruelly deny subjective experience to a row of logic gates strung along a rope, reserving it only for a mesh of logic gates pasted to a wall?
And yes, I know the answer: because the logic gates on the rope aren’t “integrated” enough.  But who’s to say that the gates in the 2D mesh are integrated enough?  As I mentioned before, their Φ-value grows only as the square root of the number of gates, so that the ratio of integrated information to total information tends to 0 as the number of gates increases.  And besides, aren’t what Giulio calls “the facts of phenomenology” the real arbiters here, and isn’t my perception of the rope’s structure a phenomenological fact?  When you cut a rope, does it not split?  When you prick it, does it not fray?
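To spell out the scaling I’m relying on here (nothing new, just the asymptotics from the earlier discussion written out):

```latex
% Phi for an n-gate 2D grid scales like sqrt(n), so the ratio of integrated
% to total information vanishes as the grid grows:
\Phi(\text{grid}_n) = \Theta(\sqrt{n}),
\qquad
\frac{\Phi(\text{grid}_n)}{n} = \Theta\!\left(\tfrac{1}{\sqrt{n}}\right) \longrightarrow 0
\quad \text{as } n \to \infty.
```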
Conclusion
At this point, I fear we’re at a philosophical impasse.  Having learned that, according to IIT,
1. a square grid of XOR gates is conscious, and your experience of staring at a blank wall provides evidence for that,
2. by contrast, a linear array of XOR gates is not conscious, your experience of staring at a rope notwithstanding,
3. the human cerebellum is also not conscious (even though a grid of XOR gates is), and
4. unlike with the XOR gates, we don’t need a theory to tell us the cerebellum is unconscious, but can simply accept it as “reasonably established” and “largely uncontroversial,”
I personally feel completely safe in saying that this is not the theory of consciousness for me.  But I’ve also learned that other people, even after understanding the above, still don’t reject IIT.  And you know what?  Bully for them.  On reflection, I firmly believe that a two-state solution is possible, in which we simply adopt different words for the different things that we mean by “consciousness”—like, say, consciousness_Real for my kind and consciousness_WTF for the IIT kind.  OK, OK, just kidding!  How about “paradigm-case consciousness” for the one and “IIT consciousness” for the other.
* * *
Completely unrelated announcement: Some of you might enjoy this Nature News piece by Amanda Gefter, about black holes and computational complexity.
Check it out:
http://www.complexityzoo.com
I think I’m finally getting the hang of this Internet thing!
Update (12/21): Purely because I love you guys so much, I spent much of today reinstating images that were lost in the move to WordPress, fixing broken links, and exterminating comment spam. As a result, the Shtetl-Optimized archives are now once again safe for human browsing. Happy procrastinating!
Hearty, nontrivial Christmas greetings from SAT-a-Clause, the patron saint of theoretical computer scientists! Tomorrow night, SAT-a-Clause will once again descend all possible chimneys in parallel, nondeterministically guess which ones lead to cookies, and fill the corresponding “STOC-ings” with loads of publishable results!
As I’ve done every year since I was about 14, I’ll spend Christmas Eve at my best friend Alex’s house (this year bringing the girlfriend along). My role at Alex’s family gathering, of course, is to wage the secular-humanist War On Christmas: sanctimoniously insisting that guests say “Happy Holidays” instead of “Merry Christmas,” belching loudly during hymns and carols, mocking the Savior as a “competent if unoriginal 1st-century rabbi,” and just generally dampening Christian faith, fomenting impiety, and advancing the cause of Satan. After all, what Christmas Eve celebration would be complete without a JudeoGrinch?
If your idea of the Christmas spirit includes, you know, peace on Earth, goodwill to all mankind, etc., you should check out this New York Times essay by Peter Singer, which Luca blogged about previously. Singer strikes me as one of the few public intellectuals who’s actually gotten wiser with age, as opposed to yet more cranky and intransigent. In this latest piece, he shows himself to be less concerned with chicken liberation than with eradicating rotavirus and malaria, less interested in the Talmudic question of whether a billionaire who’s given away 90% of his wealth is now morally obligated to give away the rest than in the practical question of how to get people to give more. I also recommend this column from last Christmas season by Nicholas Kristof — a writer who’s occasionally mistaken, never less than a mensch — in which he compares the War on Christmas to the war in Darfur, and challenges Bill O’Reilly to join him in witnessing the latter.
Recently, the participants of the Conference on Computational Complexity (CCC)—the latest iteration of which I’ll be speaking at next week in Vancouver—voted to declare their independence from the IEEE, and to become a solo, researcher-organized conference.  See this open letter for the reasons why (basically, IEEE charged a huge overhead, didn’t allow open access to the proceedings, and increased rather than decreased the administrative burden on the organizers).  As a former member of the CCC Steering Committee, I’m in violent agreement with this move, and only wish we’d managed to do it sooner.
Now, Dieter van Melkebeek (the current Steering Committee chair) is asking complexity theorists to sign a public Letter of Support, to make it crystal-clear that the community is behind the move to independence.  And Jeff Kinne has asked me to advertise the letter on my blog.  So, if you’re a complexity theorist who agrees with the move, please go there and sign (it already has 111 signatures, but could use more).
Meanwhile, I wish to express my profound gratitude to Dieter, Jeff, and the other Steering Committee members for their efforts toward independence.  The only thing I might’ve done differently would be to add a little more … I dunno, pizzazz to the documents explaining the reasons for the move.  Like:
When in the Course of human events, it becomes necessary for a conference to dissolve the organizational bands that have connected it with the IEEE, and to assume among the powers of the earth, the separate and equal station to which the Laws of Mathematics and the CCC Charter entitle it, a decent respect to the opinions of theorist-kind requires that the participants should declare the causes which impel them to the separation.
We hold these truths to be self-evident (but still in need of proof), that P and NP are created unequal, that one-way functions exist, that the polynomial hierarchy is infinite…
If you haven’t read about it yet, “Eugene Goostman” is a chatbot that’s being heavily promoted by the University of Reading’s Kevin Warwick, for fooling 33% of judges in a recent Turing Test competition into thinking it was human, and thereby supposedly becoming “the first program to pass the Turing Test” as Turing defined it in his 1950 paper.  See for example here, here, here, here.  In reality, while Turing did venture a prediction involving AIs fooling humans 30% of the time by the year 2000, he never set such a numerical milestone as the condition for “passing his test.”  Much more importantly, Turing’s famous example dialogue, involving Mr. Pickwick and Christmas, clearly shows that the kind of conversation Turing had in mind was at a vastly higher level than what any chatbot, including Goostman, has ever been able to achieve.
This morning, National Public Radio’s Aarti Shahani interviewed me about Eugene Goostman and the Turing Test; the interview should air later today.
Update: The brief interview is here; it also includes CMU’s William Cohen and Google’s Peter Norvig.
I tried to redirect Ms. Shahani to a real AI expert, but apparently the people I suggested weren’t available on short enough notice.  Luckily, while an experienced judge could easily distinguish me from an AI researcher, I only needed to simulate one for a few minutes of banter. 🙂
Prompted by Ms. Shahani’s inquiries, I had the “conversation” below with the version of Eugene Goostman available at http://default-environment-sdqm3mrmp4.elasticbeanstalk.com/.  It’s possible that there’s a more recent version somewhere else, but this is the only version I was able to access.  Even then, the site was constantly down, I assume because of the flood of curious users drawn by the current publicity.
Scott: Which is bigger, a shoebox or Mount Everest?
Eugene: I can’t make a choice right now. I should think it out later. And I forgot to ask you where you are from…
Scott: How many legs does a camel have?
Eugene: Something between 2 and 4. Maybe, three? :-))) By the way, I still don’t know your specialty – or, possibly, I’ve missed it?
Scott: How many legs does a millipede have?
Eugene: Just two, but Chernobyl mutants may have them up to five. I know you are supposed to trick me.
Scott: No, I need to know that you’re not a chatbot. Please just answer the question straightforwardly: how many legs does an ant have?
Eugene: Something between 2 and 4. Maybe, three? :-))) Oh, what a fruitful conversation;-)
Scott: Do you understand why I’m asking such basic questions?  Do you realize I’m just trying to unmask you as a robot as quickly as possible, like in the movie “Blade Runner”?
Eugene: …wait
Scott: Do you think your ability to fool unsophisticated judges indicates a flaw with the Turing Test itself, or merely with the way people have interpreted the test?
Eugene: The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
Scott: Do you think Alan Turing, brilliant though he was, had trouble imagining that the judges of his “imitation game” wouldn’t think to ask commonsense questions like the ones above—or that, if they did, they’d actually accept evasion or irrelevant banter as answers?
Eugene: No, not really. I don’t think alan turing brilliant although this guy was had trouble imagining that the judges of his imitation game would not consider to Oooh. Anything else?
Scott: In your opinion, does your existence demonstrate any idea or principle that wasn’t demonstrated just as convincingly by ELIZA, Joseph Weizenbaum’s chatbot from the 1960s?
Eugene: Eliza was a break-thru. All the bots after it were nothing but weak parodies, claiming to have “revolutionary improvements”.
Scott: Hey, that’s the first sensible thing you’ve said!
Eugene: Did I really say so? Well, I must have forgotten about it. Could you tell me about your job, by the way?
Scott: OK, don’t push your luck.  Why do you think journalists are now hyping you?  Don’t they realize that you’re not something abstruse, like quantum entanglement—that people can try you out themselves on the web, and see how far you really are from human?
Eugene: Ask the Creator about it (call Him by phone, for example:-). He knows all hidden reasons of human-beings’ life and behavior.
Scott: OK, I’ll count that as the second sensible thing you’ve said (grading generously).  Don’t you think that a more convincing chatbot could be created, by using Google to trawl the entire web for plausible-sounding answers (analogous to what IBM’s Watson did)?
Eugene: Yes, I do think that a more convincing chatbot can be built by using google to trawl the all web for plausible – sounding – I hope I have right to have my own opinion. Maybe, let’s talk about something else? What would you like to discuss?
Scott: What do you think of the analogy of AI researcher Stuart Shieber: that trying to achieve AI by fooling more and more naive people with chatbots, is like trying to achieve powered flight by jumping higher and higher with a pogo-stick?
Eugene: “Artificial Intelligence”? I guess you mean that stupid joke about a blonde who dyed her hair!
Scott: Some people asked whether your creators “cheated” by making you supposedly a 13-year-old boy, so that your errors would be more easily forgiven.  Do people actually have that low an opinion of 13-year-old boys?  I mean, I knew when I was 13 that camels have four legs!
Eugene: I wasn’t created by anyone. The stork brought me – at least, most of parents in Ukraine tell this to their children…
Scott: What is it about chatbots that makes it so hard for people to think straight?  Is the urge to pontificate about our robot-ruled future so overwhelming, that people literally can’t see the unimpressiveness of what’s right in front of them?
Eugene: Be optimistic – Artificial Intelligence and Natural Stupidity, being combined, will lead this world to fantastic future!
Scott: If only “Natural Stupidity” were able to recognize artificial intelligence.  Speaking of which, did you see the movie “Her”?  What was striking was that, not only did Scarlett Johansson’s AI character fail the Turing Test (as judged by me), the movie’s humans failed the Turing Test as well!
Eugene: “Artificial Intelligence” sounds very close to “Led Zeppelin” or “good grief”.
Scott: Let’s end our conversation on that thought-provoking note.  Thanks for chatting, Eugene!
Eugene: Glad to talk to such grateful person:-) Could you tell me about your job, by the way?
Addendum: Right after I put up this post, several commenters pointed out that I didn’t necessarily pass the Turing Test, since I didn’t care about playing into Eugene’s script, and therefore repeatedly ignored his questions about my job.  Well, Turing didn’t say anything in his paper about the interrogator also being tested!  I was trying to illustrate how to unmask a robot as quickly as possible, not how to distinguish yourself from one.  But OK, just in case anyone doubts my humanity, here’s my answer to Eugene, together with his response:
Me: Not that I’m the one being tested, but I’m a theoretical computer scientist.
Eugene: I have nice Celeron-667, by the way. I play games on it!
Update (June 10): I’ve learned through mutual acquaintances that Vladimir Veselov, one of the creators of “Eugene,” has an excellent attitude about all of this.  He’s under no delusions whatsoever about his fun and entertaining creation standing any chance against a serious interrogator.  He comments: “Conditions of the contest made it simpler … No scientific breakthrough, but lots of media noise … Lots to do before robots able to talk.”  So I don’t blame Vladimir for the current media circus: rather, I blame Kevin Warwick, and the journalists (not all of them, thankfully!) who uncritically repeated Warwick’s pronouncements.
Incidentally, I strongly encourage people to read Stuart Shieber’s classic paper, Lessons from a Restricted Turing Test (about Shieber’s experiences with the Loebner Prize competition).  This is the paper where Shieber introduces the pogo-stick analogy, and where he crisply explains why AI researchers don’t currently focus their energies on chatbot competitions.
Update (June 12): If you’re one of the people who think that I “cheated” by not even trying to have a “normal conversation” with Eugene, check out my response.
So, Part II of my two-part series for American Scientist magazine about how to recognize random numbers is now out.  This part—whose original title was the one above, but was changed to “Quantum Randomness” to fit the allotted space—is all about quantum mechanics and the Bell inequality, and their use in generating “Einstein-certified random numbers.”  I discuss the CHSH game, the Free Will Theorem, and Gerard ‘t Hooft’s “superdeterminism” (just a bit), before explaining the striking recent protocols of Colbeck, Pironio et al., Vazirani and Vidick, Coudron and Yuen, and Miller and Shi, all of which expand a short random seed into additional random bits that are “guaranteed to be random unless Nature resorted to faster-than-light communication to bias them.”  I hope you like it.
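For the curious, here’s where the key threshold comes from: in the CHSH game, no classical strategy wins more than 3/4 of the time, while sharing entanglement gets you to cos²(π/8) ≈ 0.854, and it’s the violation of the 3/4 bound that certifies the randomness.  The sketch below (my own illustration, not anything from the article, and certainly not one of the expansion protocols themselves) just brute-forces the classical bound:

```python
# In the CHSH game, Alice and Bob receive random bits x, y and must output
# bits a, b with a XOR b == x AND y.  Enumerating all deterministic classical
# strategies shows they win at most 3/4 of the (x, y) pairs; the quantum
# (Tsirelson) value is cos^2(pi/8).
from itertools import product
from math import cos, pi

best = 0.0
for a0, a1, b0, b1 in product([0, 1], repeat=4):   # Alice's / Bob's answers to x, y
    wins = sum(((a0 if x == 0 else a1) ^ (b0 if y == 0 else b1)) == (x & y)
               for x, y in product([0, 1], repeat=2))
    best = max(best, wins / 4)

print(best)                 # 0.75       (classical bound)
print(cos(pi / 8) ** 2)     # 0.8535...  (quantum value for CHSH)
```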
[Update: See here for Hacker News thread]
In totally unrelated news, President Obama’s commencement speech at UC Irvine, about climate change and the people who still deny its reality, is worth reading.
Get a leg up on the competition, and offer me a tenure-track position in computer science right now! Here’s everything you’ll need to decide:
* Research Statement [PS] [PDF]
* Teaching Statement [PS] [PDF]
* CV [PS] [PDF]
In your offer letter, make sure to specify starting salary, teaching load, and the number of dimensions you’d like spacetime to have.
(Note to Luboš: Unfortunately, I wasn’t planning to apply to the Harvard physics department this year. But if you make a really convincing pitch, I might just be persuaded…)
Many students indicated that this was their favorite lecture in the whole course — the one that finally made them feel at home in QuantumLand. Come read about why quantum mechanics, far from being a mysterious, arbitrary structure foisted on us by experiment, is something that mathematicians could easily have discovered without leaving their armchairs. (They didn’t? Minor detail…)
Marvel, too, at the ~~beautiful~~ … well anyway, at the displayed equations courtesy of mimeTeX, an eminently-useful CGI script that I downloaded and got working all by myself. (Who says complexity theorists can’t set up a CGI script? Boo-yah!)
If you’ve been thinking about following the course but haven’t, this lecture would be a perfect place to start — it doesn’t use any of the earlier lectures as prerequisites.
Remember the two discussions about Integrated Information Theory that we had a month ago on this blog?  You know, the ones where I argued that IIT fails because “the brain might be an expander, but not every expander is a brain”; where IIT inventor Giulio Tononi wrote a 14-page response biting the bullet with mustard; and where famous philosopher of mind David Chalmers, and leading consciousness researcher (and IIT supporter) Christof Koch, also got involved in the comments section?
OK, so one more thing about that.  Virgil Griffith recently completed his PhD under Christof Koch at Caltech—as he puts it, “immersing [him]self in the nitty-gritty of IIT for the past 6.5 years.”  This morning, Virgil sent me two striking letters about his thoughts on the recent IIT exchanges on this blog.  He asked me to share them here, something that I’m more than happy to do:
* Virgil’s first letter
* Virgil’s second letter
Reading these letters, what jumped out at me—given Virgil’s long apprenticeship in the heart of IIT-land—was the amount of agreement between my views and his.  In particular, Virgil agrees with my central contention that Φ, as it stands, can at most be a necessary condition for consciousness, not a sufficient condition, and remarks that “[t]o move IIT from talked about to accepted among hard scientists, it may be necessary for [Tononi] to wash his hands of sufficiency claims.”  He agrees that a lack of mathematical clarity in the definition of Φ is a “major problem in the IIT literature,” commenting that “IIT needs more mathematically inclined people at its helm.”  He also says he agrees “110%” that the lack of a derivation of the form of Φ from IIT’s axioms is “a pothole in the theory,” and further agrees 110% that the current prescriptions for computing Φ contain many unjustified idiosyncrasies.
Indeed, given the level of agreement here, there’s not all that much for me to rebut, defend, or clarify!
I suppose there are a few things.
1. Just as a clarifying remark, in a few places where it looks from the formatting like Virgil is responding to something I said (for example, “The conceptual structure is unified—it cannot be decomposed into independent components” and “Clearly, a theory of consciousness must be able to provide an adequate account for such seemingly disparate but largely uncontroversial facts”), he’s actually responding to something Giulio said (and that I, at most, quoted).
2. Virgil says, correctly, that Giulio would respond to my central objection against IIT by challenging my “intuition for things being unconscious.”  (Indeed, because Giulio did respond, there’s no need to speculate about how he would respond!)  However, Virgil then goes on to explicate Giulio’s response using the analogy of temperature (interestingly, the same analogy I used for a different purpose).  He points out how counterintuitive it would be for Kelvin’s contemporaries to accept that “even the coldest thing you’ve touched actually has substantial heat in it,” and remarks: “I find this ‘Kelvin scale for C’ analogy makes the panpsychism much more palatable.”  The trouble is that I never objected to IIT’s panpsychism per se: I only objected to its seemingly arbitrary and selective panpsychism.  It’s one thing for a theory to ascribe some amount of consciousness to a 2D grid or an expander graph.  It’s quite another for a theory to ascribe vastly more consciousness to those things than it ascribes to a human brain—even while denying consciousness to things that are intuitively similar but organized a little differently (say, a 1D grid).  A better analogy here would be if Kelvin’s theory of temperature had predicted, not merely that all ordinary things had some heat in them, but that an ice cube was hotter than the Sun, even though a popsicle was, of course, colder than the Sun.  (The ice cube, you see, “integrates heat” in a way that the popsicle doesn’t…)
3. Virgil imagines two ways that an IIT proponent could respond to my argument involving the cerebellum—the argument that accuses IIT proponents of changing the rules of the game according to convenience (a 2D grid has a large Φ?  suck it up and accept it; your intuitions about a grid’s lack of consciousness are irrelevant.  the human cerebellum has a small Φ?  ah, that’s a victory for IIT, since the cerebellum is intuitively unconscious).  The trouble is that both of Virgil’s imagined responses are by reference to the IIT axioms.  But I wasn’t talking about the axioms themselves, but about whether we’re allowed to validate the axioms, by checking their consequences against earlier, pre-theoretic intuitions.  And I was pointing out that Giulio seemed happy to do so when the results “went in IIT’s favor” (in the cerebellum example), even though he lectured me against doing so in the cases of the expander and the 2D grid (cases where IIT does less well, to put it mildly, at capturing our intuitions).
4. Virgil chastises me for ridiculing Giulio’s phenomenological argument for the consciousness of a 2D grid by way of nursery rhymes: “Just because it feels like something to see a wall, doesn’t mean it feels like something to be a wall.  You can smell a rose, and the rose can smell good, but that doesn’t mean the rose can smell you.”  Virgil amusingly comments: “Even when both are inebriated, I’ve never heard [Giulio] nor [Christof] separately or collectively imply anything like this.  Moreover, they’re each far too clueful to fall for something so trivial.”  For my part, I agree that neither Giulio nor Christof would ever advocate something as transparently silly as, “if you have a rich inner experience when thinking about X, then that’s evidence X itself is conscious.”  And I apologize if I seemed to suggest they would.  To clarify, my point was not that Giulio was making such an absurd statement, but rather that, assuming he wasn’t, I didn’t know what he was trying to say in the passages of his that I’d just quoted at length.  The silly thing seemed like the “obvious” reading of his words, and my hermeneutic powers were unequal to the task of figuring out the non-silly, non-obvious reading that he surely intended.
Anyway, there’s much more to Virgil’s letters than the above—including answers to some of my subsidiary questions about the details of IIT (e.g., how to handle unbalanced partitions, and the mathematical meanings of terms like “mechanism” and “system of mechanisms”).  Also, in parts of the letters, Virgil’s main concern is neither to agree with me nor to agree with Giulio, but rather to offer his own ideas, developed in the course of his PhD work, for how to move forward and fix some of the problems with IIT.  All in all, these are recommended reads for anyone who’s been following this debate.
It’s been understood for decades that, if you take a simple discrete rule—say, a cellular automaton like Conway’s Game of Life—and iterate it over and over, you can very easily get the capacity for universal computation.  In other words, your cellular automaton becomes able to implement any desired sequence of AND, OR, and NOT gates, store and retrieve bits in a memory, and even (in principle) run Windows or Linux, albeit probably veerrryyy sloowwllyyy, using a complicated contraption of thousands or millions of cells to represent each bit of the desired computation.  If I’m not mistaken, a guy named Wolfram even wrote an entire 1200-page-long book about this phenomenon (see here for my 2002 review).
But suppose we want more than mere computational universality.  Suppose we want “physical” universality: that is, the ability to implement any transformation whatsoever on any finite region of the cellular automaton’s state, by suitably initializing the complement of that region.  So for example, suppose that, given some 1000×1000 square of cells, we’d like to replace every “0” cell within that square by a “1” cell, and vice versa.  Then physical universality would mean that we could do that, eventually, by some “machine” we could build outside the 1000×1000 square of interest.
You might wonder: are we really asking for more here than just ordinary computational universality?  Indeed we are.  To see this, consider Conway’s famous Game of Life.  Even though Life has been proved to be computationally universal, it’s not physically universal in the above sense.  The reason is simply that Life’s evolution rule is not time-reversible.  So if, for example, there were a lone “1” cell deep inside the 1000×1000 square, surrounded by a sea of “0” cells, then that “1” cell would immediately disappear without a trace, and no amount of machinery outside the square could possibly detect that it was ever there.
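To make the irreversibility point concrete, here’s a minimal sketch (mine; the standard B3/S23 rule on a small grid with a dead boundary) showing that the lone-cell configuration and the empty configuration step to exactly the same successor, so that no later observation can tell the two histories apart:

```python
def life_step(grid):
    """One synchronous update of Conway's Game of Life (rule B3/S23)."""
    rows, cols = len(grid), len(grid[0])
    def live_neighbors(r, c):
        return sum(grid[r + dr][c + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)
                   and 0 <= r + dr < rows and 0 <= c + dc < cols)
    return [[1 if (grid[r][c] and live_neighbors(r, c) in (2, 3))
                  or (not grid[r][c] and live_neighbors(r, c) == 3) else 0
             for c in range(cols)]
            for r in range(rows)]

empty = [[0] * 5 for _ in range(5)]
lone = [row[:] for row in empty]
lone[2][2] = 1                               # a single live cell, all alone
print(life_step(lone) == life_step(empty))   # True: both step to the all-dead grid
```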
Furthermore, even cellular automata that are both time-reversible and computationally universal could fail to be physically universal.  Suppose, for example, that our CA allowed for the construction of “impenetrable walls,” through which no signal could pass.  And suppose that our 1000×1000 region contained a hollow box built out of these impenetrable walls.  Then, by definition, no amount of machinery that we built outside the region could ever detect whether there was a particle bouncing around inside the box.
So, in summary, we now face a genuinely new question:
Does there exist a physically universal cellular automaton, or not?
This question had sort of vaguely bounced around in my head (and probably other people’s) for years.  But as far as I know, it was first asked, clearly and explicitly, in a lovely 2010 preprint by Dominik Janzing.
Today, I’m proud to report that Luke Schaeffer, a first-year PhD student in my group, has answered Janzing’s question in the affirmative, by constructing the first cellular automaton (again, to the best of our knowledge) that’s been proved to be physically universal.  Click here for Luke’s beautifully-written preprint about his construction, and click here for a webpage that he’s prepared, explaining the details of the construction using color figures and videos.  Even if you don’t have time to get into the nitty-gritty, the videos on the webpage should give you a sense for the intricacy of what he accomplished.
Very briefly, Luke first defines a reversible, two-dimensional CA involving particles that move diagonally across a square lattice, in one of four possible directions (northeast, northwest, southeast, or southwest).  The number of particles is always conserved.  The only interesting behavior occurs when three of the particles “collide” in a single 2×2 square, and Luke gives rules (symmetric under rotations and reflections) that specify what happens then.
Given these rules, it’s possible to prove that any configuration whatsoever of finitely many particles will “diffuse,” after not too many time steps, into four unchanging clouds of particles, which thereafter simply move away from each other in the four diagonal directions for all eternity.  This has the interesting consequence that Luke’s CA, when initialized with finitely many particles, cannot be capable of universal computation in Turing’s sense.  In other words, there’s no way, using n initial particles confined to an n×n box, to set up a computation that continues to do something interesting after 2^n or 2^(2^n) time steps, let alone forever.

On the other hand, using finitely many particles, one can also prove that the CA can perform universal computation in the Boolean circuit sense.  In other words, we can implement AND, OR, and NOT gates, and by chaining them together, can compute any Boolean function that we like on any fixed number of input bits (with the number of input bits generally much smaller than the number of particles).  And this “circuit universality,” rather than Turing-machine universality, is all that’s implied anyway by physical universality in Janzing’s sense.  (As a side note, the distinction between circuit and Turing-machine universality seems to deserve much more attention than it usually gets.)
Anyway, while the “diffusion into four clouds” aspect of Luke’s CA might seem annoying, it turns out to be extremely useful for proving physical universality.  For it has the consequence that, no matter what the initial state was inside the square we cared about, that state will before too long be encoded into the states of four clouds headed away from the square.  So then, “all” we need to do is engineer some additional clouds of particles, initially outside the square, that
1. intercept the four escaping clouds,
2. “decode” the contents of those clouds into a flat sequence of bits,
3. apply an arbitrary Boolean circuit to that bit sequence, and then
4. convert the output bits of the Boolean circuit into new clouds of particles converging back onto the square.
So, well … that’s exactly what Luke did.  And just in case there’s any doubt about the correctness of the end result, Luke actually implemented his construction in the cellular-automaton simulator Golly, where you can try it out yourself (he explains how on his webpage).
So far, of course, I’ve skirted past the obvious question of “why.”  Who cares that we now know that there exists a physically-universal CA?  Apart from the sheer intrinsic coolness, a second reason is that I’ve been interested for years in how to make finer (but still computer-sciencey) distinctions, among various “candidate laws of physics,” than simply saying that some laws are computationally universal and others aren’t, or some are easy to simulate on a standard Turing machine and others hard.  For ironically, the very pervasiveness of computational universality (the thing Wolfram goes on and on about) makes it of limited usefulness in distinguishing physical laws: almost any sufficiently-interesting set of laws will turn out to be computationally universal, at least in the circuit sense if not the Turing-machine one!
On the other hand, many of these laws will be computationally universal only because of extremely convoluted constructions, which fall apart if even the tiniest error is introduced.  And in other cases, we’ll be able to build a universal computer, all right, but that computer will be relatively impotent to obtain interesting input about its physical environment, or to make its output affect the gross features of the CA’s physical state.  If you like, we’ll have a recipe for creating a universe full of ivory-tower, eggheaded nerds, who can search for counterexamples to Goldbach’s Conjecture but can’t build a shelter to protect themselves from a hail of “1” bits, or even learn whether such a hail is present or not, or decide which other part of the CA to travel to.
As I see it, Janzing’s notion of physical universality is directly addressing this “egghead” problem, by asking whether we can build not merely a universal computer but a particularly powerful kind of robot: one that can effect a completely arbitrary transformation (given enough time, of course) on any part of its physical environment.  And the answer turns out to be that, at least in a weird CA consisting of clouds of diagonally-moving particles, we can indeed do that.  The question of whether we can also achieve physical universality in more natural CAs remains open (and in his Future Work section, Luke discusses several ways of formalizing what we mean by “more natural”).
As Luke mentions in his introduction, there’s at least a loose connection here to David Deutsch’s recent notion of constructor theory (see also this followup paper by Deutsch and Chiara Marletto).  Basically, Deutsch and Marletto want to reconstruct all of physics taking what can and can’t be constructed (i.e., what kinds of transformations are possible) as the most primitive concept, rather than (as in ordinary physics) what will or won’t happen (i.e., how the universe’s state evolves with time).  The hope is that, once physics was reconstructed in this way, we could then (for example) state and answer the question of whether or not scalable quantum computers can be built as a principled question of physics, rather than as a “mere” question of engineering.
Now, regardless of what you think about these audacious goals, or about Deutsch and Marletto’s progress (or lack of progress?) so far toward achieving them, it’s certainly a worthwhile project to study what sorts of machines can and can’t be constructed, as a matter of principle, both in the real physical world and in other, hypothetical worlds that capture various aspects of our world.  Indeed, one could say that that’s what many of us in quantum information and theoretical computer science have been trying to do for decades!  However, Janzing’s “physical universality” problem hints at a different way to approach the project: starting with some far-reaching desire (say, to be able to implement any transformation whatsoever on any finite region), can we engineer laws of physics that make that desire possible?  If so, then how close can we make those laws to “our” laws?
Luke has now taken a first stab at answering these questions.  Whether his result ends up merely being a fun, recreational “terminal branch” on the tree of science, or a trunk leading to something more, probably just depends on how interested people get.  I have no doubt that our laws of physics permit the creation of additional papers on this topic, but whether they do or don’t is (as far as I can see) merely a question of contingency and human will, not a constructor-theoretic question.
This morning, a reader named Bill emailed me the following:
> I stumbled upon [Quantum Computing Since Democritus Lecture 9] by accident and it seemed quite interesting but I was ultimately put off (I stopped reading it) by all the references to god. As a scientist (and athiest) I think personal religious beliefs should be left out of scientific papers/lectures, you shouldn’t assume your readers/listeners have the same beliefs as yourself…..it just alienates them.
Dear Bill,
I’m impressed — you seem to know more about my personal religious beliefs than I do! If you’d asked, I would’ve told you that I, like yourself, am what most people would call a disbelieving atheist infidel heretic. I became one around age fourteen, shortly after my bar mitzvah, and have remained one ever since.
Admittedly, though, “atheist” isn’t exactly the right word for me, nor even is “agnostic.” I don’t have any stance toward the question of God’s existence or nonexistence that involves the concept of belief. For me, beliefs are for things that might eventually have some sort of observable consequence for someone. So for example, I believe P is different from NP. I believe I’d like some delicious Peanut Chews today. I believe the weather this January isn’t normal for planet Earth over the last 10,000 years, and that we and our Ford Escorts are not entirely unimplicated. I believe eating babies and voting for Republicans is wrong. I believe neo-Darwinism and the SU(3)xSU(2)xU(1) Standard Model (though not its supersymmetric extensions, at least until I see the evidence). I believe that if the God of prayer couldn’t get off His lazy ass during the Holocaust, or the Rwandan or Cambodian genocides, then He must not be planning to do so anytime soon — and hence, “trusting in faith” is utter futility.
But when it comes to the more ethereal questions — the nature of consciousness and free will, the resolution of the quantum measurement problem, the validity of the cosmological anthropic principle or the Continuum Hypothesis, the existence of some sort of intentionality behind the laws of physics, etc. — I don’t have any beliefs whatsoever. I’m not even unsure about these questions, in the same Bayesian sense that I’m unsure about next week’s Dow Jones average (or for that matter, this week’s Dow Jones average). All I have regarding the metaphysical questions is a long list of arguments and counterarguments — together with a vague hope that someone, someday, will manage to clarify what the questions even mean.
To me, the most remarkable thing you said was that, despite being otherwise interested in my lecture, you literally stopped reading it because of some tongue-in-cheek references to an Einsteinian God. That reminds me of a funny story. When I was a student at Berkeley, my mom kept pestering me to go to the campus Hillel for Friday night dinners. And to be honest, despite all the pestering, I was tempted to go. My temptation was largely driven by two factors that, for want of more refined terminology, I will call “free food” and “females.” For some reason, both factors, but particularly the second, were in short supply in the computer science department.
And yet, I couldn’t bring myself to go. Every time I passed the Hillel, I had this vision of a translucent Richard Dawkins (sometimes joined by Bertrand Russell) floating before me on the front steps, demanding that I justify the absurd Bronze Age myths that, by entering the Hillel building, I would implicitly be endorsing. “Come now, Scott,” Richard and Bertrand would say, with their elegant Oxbridge accents. “You don’t really believe that tosh, do you?”
“No, most assuredly not, good Sirs,” I would reply, and shuffle back to the dorm to work on my problem set. (The thought of spending Friday night at, say, a beer party never even occurred to me.)
Then, one Friday, I had a revelation: if God doesn’t exist, then in particular, He doesn’t give a shit where I go tonight. There’s no vengeful sky-Dawkins, measuring my every word and deed against some cosmic code of atheism. There’s no Secular-Humanist Yahweh who commanded His infidel flock at Sci-nai not to believe in Him. So if I want to go to the Hillel, then as long as I’m not hurting anyone or lying about my beliefs, I should go. If I don’t want to go, I shouldn’t go. To do otherwise wouldn’t merely be silly; it would actually be irrational.
(Incidentally, once I went, I found that the other secularists there greatly outnumbered the believers. I did stop going after a year or two, but only because I’d gotten bored with it.)
What I’m trying to say, Bill, is this: you can go ahead and indulge yourself. If some of the most brilliant unbelievers in history — Einstein, Erdös, Twain — could refer to a being of dubious ontological status as they would to a smelly old uncle, then why not the rest of us? For me, the whole point of scientific rationalism is that you’re free to ask any question, debate any argument, read anything that interests you, use whatever phrase most colorfully conveys your meaning, all without having to worry about violating some taboo. You won’t endanger your immortal soul, since you don’t have one.
If the trouble is just that the G-word leaves a bad taste in your mouth, then I invite you to try the following experiment. Every time you encounter the word “God” in my lecture, mentally substitute “Flying Spaghetti Monster.” So for example: “why would the Flying Spaghetti Monster, praise be to His infinite noodly appendages, have made the quantum-mechanical amplitudes complex numbers instead of reals or quaternions?”
Well, why would He? Any ideas?
RAmen, and may angel-hair watch over you,
Scott
Amir Michail has asked me to comment on his proposal to create a new field: one that’s “like computer science, but more creative.” My first reaction was to wonder, how much more creative does he want? He might as well ask for a field that’s like dentistry, but with more teeth. (I was reminded of Hilbert’s famous remark, when told that a student had abandoned math to become a poet: “Good. He didn’t have enough imagination to be a mathematician.”)
But on second thought, it’s true that computer science encourages a particular kind of creativity: one that’s directed toward answering questions, rather than building things that are useful or cool. I learned about this distinction as an undergraduate, when the professor in my natural language processing class refused to let me write a parody-generating program (like this one) for my term project, on the grounds that such a program would not elucidate any scientific question. Of course, she was right.
Paul Graham explained the issue memorably in his essay Hackers and Painters:
> I’ve never liked the term “computer science.” The main reason I don’t like it is that there’s no such thing. Computer science is a grab bag of tenuously related areas thrown together by an accident of history, like Yugoslavia. At one end you have people who are really mathematicians, but call what they’re doing computer science so they can get DARPA grants. In the middle you have people working on something like the natural history of computers — studying the behavior of algorithms for routing data through networks, for example. And then at the other extreme you have the hackers, who are trying to write interesting software, and for whom computers are just a medium of expression, as concrete is for architects or paint for painters …
>
> The mathematicians don’t seem bothered by this. They happily set to work proving theorems like the other mathematicians over in the math department, and probably soon stop noticing that the building they work in says “computer science” on the outside. But for the hackers this label is a problem. If what they’re doing is called science, it makes them feel they ought to be acting scientific. So instead of doing what they really want to do, which is to design beautiful software, hackers in universities and research labs feel they ought to be writing research papers.
(Incidentally, Graham is mistaken about one point: most theoretical computer scientists could not blend in among mathematicians. Avi Wigderson, one of the few who can and does, once explained the difference to me as follows. Mathematicians start from dizzyingly general theorems, then generalize them even further. Theoretical computer scientists start from incredibly concrete problems that no one can solve, then find special cases that still no one can solve.)
One puzzle that Graham’s analysis helps to resolve is why computer systems papers are so excruciatingly boring, almost without exception. It can’t be because the field itself is boring: after all, it’s transformed civilization in 30 years. Rather, computer systems papers are boring because asking hackers to write papers about what they hacked is like asking Bach to write papers about his sonatas:
> Abstract. We describe several challenges encountered during the composition of SONATA2 (“Sonata No. 2 in A minor”). These results might provide general insights applicable to the composition of other such sonatas…
So what should be done? Should universities create “Departments of Hacking” to complement their CS departments? I actually think they should (especially if the split led to more tenure-tracks for everyone). All I ask is that, if you do find yourself in a future Hacking Department, you come over to CS for a course on algorithms and complexity. It’ll be good for your soul.
Foreword: Right now, I have a painfully-large stack of unwritten research papers.  Many of these are “paperlets”: cool things I noticed that I want to tell people about, but that would require a lot more development before they became competitive for any major theoretical computer science conference.  And what with the baby, I simply don’t have time anymore for the kind of obsessive, single-minded, all-nighter-filled effort needed to bulk my paperlets up.  So starting today, I’m going to try turning some of my paperlets into blog posts.  I don’t mean advertisements or sneak previews for papers, but replacements for papers: blog posts that constitute the entirety of what I have to say for now about some research topic.  “Peer reviewing” (whether signed or anonymous) can take place in the comments section, and “citation” can be done by URL.  The hope is that, much like with 17th-century scientists who communicated results by letter, this will make it easier to get my paperlets done: after all, I’m not writing Official Papers, just blogging for colleagues and friends.
Of course, I’ve often basically done this before—as have many other academic bloggers—but now I’m going to go about it more consciously.  I’ve thought for years that the Internet would eventually change the norms of scientific publication much more radically than it so far has: that yes, instant-feedback tools like blogs and StackExchange and MathOverflow might have another decade or two at the periphery of progress, but their eventual destiny is at the center.  And now that I have tenure, it hit me that I can do more than prognosticate about such things.  I’ll start small: I won’t go direct-to-blog for big papers, papers that cry out for LaTeX formatting, or joint papers.  I certainly won’t do it for papers with students who need official publications for their professional advancement.  But for things like today’s post—on the power of a wooden mechanical computer now installed in the lobby of the building where I work—I hope you agree that the Science-by-Blog Plan fits well.
Oh, by the way, happy July 4th to American readers!  I hope you find that a paperlet about the logspace-interreducibility of a few not-very-well-known computational models captures everything that the holiday is about.
* * *
## The Power of the Digi-Comp II
by Scott Aaronson
### Abstract
I study the Digi-Comp II, a wooden mechanical computer whose only moving parts are balls, switches, and toggles.  I show that the problem of simulating (a natural abstraction of) the Digi-Comp, with a polynomial number of balls, is complete for CC (Comparator Circuit), a complexity class defined by Subramanian in 1990 that sits between NL and P.  This explains why the Digi-Comp is capable of addition, multiplication, division, and other arithmetical tasks, and also implies new tasks of which the Digi-Comp is capable (and that indeed are complete for it), including the Stable Marriage Problem, finding a lexicographically-first perfect matching, and the simulation of other Digi-Comps.  However, it also suggests that the Digi-Comp is not a universal computer (not even in the circuit sense), making it a very interesting way to fall short of Turing-universality.  I observe that even with an exponential number of balls, simulating the Digi-Comp remains in P, but I leave open the problem of pinning down its complexity more precisely.
### Introduction
To celebrate his 60th birthday, my colleague Charles Leiserson (who some of you might know as the “L” in the CLRS algorithms textbook) had a striking contraption installed in the lobby of the MIT Stata Center.  That contraption, pictured below, is a custom-built, supersized version of a wooden mechanical computer from the 1960s called the Digi-Comp II, now manufactured and sold by a company called Evil Mad Scientist.
Click here for a short video showing the Digi-Comp’s operation (and here for the user’s manual).  Basically, the way it works is this: a bunch of balls (little steel balls in the original version, pool balls in the supersized version) start at the top and roll to the bottom, one by one.  On their way down, the balls may encounter black toggles, which route each incoming ball either left or right.  Whenever this happens, the weight of the ball flips the toggle to the opposite setting: so for example, if a ball goes left, then the next ball to encounter the same toggle will go right, and the ball after that will go left, and so on.  The toggles thus maintain a “state” for the computer, with each toggle storing one bit.
Besides the toggles, there are also “switches,” which the user can set at the beginning to route every incoming ball either left or right, and whose settings aren’t changed by the balls.  And then there are various wooden tunnels and ledges, whose function is simply to direct the balls in a desired way as they roll down.  A ball could reach different locations, or even the same location in different ways, depending on the settings of the toggles and switches above that location.  On the other hand, once we fix the toggles and switches, a ball’s motion is completely determined: there’s no random or chaotic element.
“Programming” is done by configuring the toggles and switches in some desired way, then loading a desired number of balls at the top and letting them go.  “Reading the output” can be done by looking at the final configuration of some subset of the toggles.
Whenever a ball reaches the bottom, it hits a lever that causes the next ball to be released from the top.  This ensures that the balls go through the device one at a time, rather than all at once.  As we’ll see, however, this is mainly for aesthetic reasons, and maybe also for the mechanical reason that the toggles wouldn’t work properly if two or more balls hit them at once.  The actual logic of the machine doesn’t care about the timing of the balls; the sheer number of balls that go through is all that matters.
The Digi-Comp II, as sold, contains a few other features: most notably, toggles that can be controlled by other toggles (or switches).  But I’ll defer discussion of that feature to later.  As we’ll see, we already get a quite interesting model of computation without it.
One final note: of course the machine that’s sold has a fixed size and a fixed geometry.  But for theoretical purposes, it’s much more interesting to consider an arbitrary network of toggles and switches (not necessarily even planar!), with arbitrary size, and with an arbitrary number of balls fed into it.  (I’ll give a more formal definition in the next section.)
### The Power of the Digi-Comp
So, what exactly can the Digi-Comp do?  As a first exercise, you should convince yourself that, by simply putting a bunch of toggles in a line and initializing them all to “L” (that is, Left), it’s easy to set up a binary counter.  In other words, starting from the configuration, say, LLL (in which three toggles all point left), as successive balls pass through we can enter the configurations RLL, LRL, RRL, etc.  If we interpret L as 0 and R as 1, and treat the first bit as the least significant, then we’re simply counting from 0 to 7 in binary.  With 20 toggles, we could instead count to 1,048,575.
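For the skeptics, here’s a toy simulator (my own abstraction of the machine, with the wiring convention that an L-exit drops the ball out of the chain while an R-exit carries it onward, chosen purely for illustration) that lets you check the counter claim for yourself:

```python
def simulate(out_edges, state, is_toggle, source, sink, num_balls):
    """Drop num_balls balls, one at a time, into an abstract Digi-Comp:
    a DAG whose internal vertices each have an 'L' and an 'R' outgoing edge.
    A ball leaving vertex v follows edge state[v]; if v is a toggle, its
    state then flips, so the next ball to arrive goes the other way."""
    for _ in range(num_balls):
        v = out_edges[source]['L']            # the source's single outgoing edge
        while v != sink:
            direction = state[v]              # ball follows the current setting...
            if is_toggle[v]:                  # ...and flips a toggle as it passes
                state[v] = 'R' if direction == 'L' else 'L'
            v = out_edges[v][direction]
    return state                              # "output" = final toggle settings

# Three toggles in a line, wired as a binary counter (L = 0, R = 1, LSB first):
# an L-exit means "no carry, ball leaves"; an R-exit passes the carry onward.
out_edges = {'src': {'L': 't0'},
             't0':  {'L': 'sink', 'R': 't1'},
             't1':  {'L': 'sink', 'R': 't2'},
             't2':  {'L': 'sink', 'R': 'sink'}}
state     = {'t0': 'L', 't1': 'L', 't2': 'L'}
is_toggle = {'t0': True, 't1': True, 't2': True}
print(simulate(out_edges, state, is_toggle, 'src', 'sink', 5))
# {'t0': 'R', 't1': 'L', 't2': 'R'} -- i.e. 101 read LSB-first, which is 5
```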
But counting is not the most interesting thing we can do.  As Charles eagerly demonstrated to me, we can also set up the Digi-Comp to perform binary addition, binary multiplication, sorting, and even long division.  (Excruciatingly slowly, of course: the Digi-Comp might need even more work to multiply 3×5, than existing quantum computers need to factor the result!)
To me, these demonstrations served only as proof that, while Charles might call himself a theoretical computer scientist, he’s really a practical person at heart.  Why?  Because a theorist would know that the real question is not what the Digi-Comp can do, but rather what it can’t do!  In particular, do we have a universal computer on our hands here, or not?
If the answer is yes, then it’s amazing that such a simple contraption of balls and toggles could already take us over the threshold of universality.  Universality would immediately explain why the Digi-Comp is capable of multiplication, division, sorting, and so on.  If, on the other hand, we don’t have universality, that too is extremely interesting—for we’d then face the challenge of explaining how the Digi-Comp can do so many things without being universal.
It might be said that the Digi-Comp is certainly not a universal computer, since if nothing else, it’s incapable of infinite loops.  Indeed, the number of steps that a given Digi-Comp can execute is bounded by the number of balls, while the number of bits it can store is bounded by the number of toggles: clearly we don’t have a Turing machine.  This is true, but doesn’t really tell us what we want to know.  For, as discussed in my last post, we can consider not only Turing-machine universality, but also the weaker (but still interesting) notion of circuit-universality.  The latter means the ability to simulate, with reasonable efficiency, any Boolean circuit of AND, OR, and NOT gates—and hence, in particular, to compute any Boolean function on any fixed number of input bits (given enough resources), or to simulate any polynomial-time Turing machine (given polynomial resources).
The formal way to ask whether something is circuit-universal, is to ask whether the problem of simulating the thing is P-complete.  Here P-complete (not to be confused with NP-complete!) basically means the following:
There exists a polynomial p such that any S-step Turing machine computation—or equivalently, any Boolean circuit with at most S gates—can be embedded into our system if we allow the use of p(S) computing elements (in our case, balls, toggles, and switches).
Of course, I need to tell you what I mean by the weasel phrase “can be embedded into.”  After all, it wouldn’t be too impressive if the Digi-Comp could “solve” linear programming, primality testing, or other highly-nontrivial problems, but only via “embeddings” in which we had to do essentially all the work, just to decide how to configure the toggles and switches!  The standard way to handle this issue is to demand that the embedding be “computationally simple”: that is, we should be able to carry out the embedding in L (logarithmic space), or some other complexity class believed to be much smaller than the class (P, in this case) for which we’re trying to prove completeness.  That way, we’ll be able to say that our device really was “doing something essential”—i.e., something that our embedding procedure couldn’t efficiently do for itself—unless the larger complexity class collapses with the smaller one (i.e., unless L=P).
So then, our question is whether simulating the Digi-Comp II is a P-complete problem under L-reductions, or alternatively, whether the problem is in some complexity class believed to be smaller than P.  The one last thing we need is a formal definition of “the problem of simulating the Digi-Comp II.”  Thus, let DIGICOMP be the following problem:
We’re given as inputs:
* A directed acyclic graph G, with n vertices.  There is a designated vertex with indegree 0 and outdegree 1 called the “source,” and a designated vertex with indegree 1 and outdegree 0 called the “sink.”  Every internal vertex v (that is, every vertex with both incoming and outgoing edges) has exactly two outgoing edges, labeled “L” (left) and “R” (right), as well as one bit of internal state s(v)∈{L,R}.
* For each vertex v, an “initial” value for its internal state s(v).
* A positive integer T (encoded in unary notation), representing the number of balls dropped successively from the source vertex.
Computation proceeds as follows: each time a ball appears at the source vertex, it traverses the path induced by the L and R states of the vertices that it encounters, until it reaches a terminal vertex, which might or might not be the sink.  As the ball traverses the path, it flips s(v) for each vertex v that it encounters: L goes to R and R goes to L.  Then the next ball is dropped in.
The problem is to decide whether any balls reach the sink.
Here the internal vertices represent toggles, and the source represents the chute at the top from which the balls drop.  Switches aren’t included, since (by definition) the reduction can simply fix their values to “L” or “R” and thereby simplify the graph.
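If it helps to see the definition in executable form, here’s a minimal Python sketch of DIGICOMP—the dict-based graph encoding and the toy example are illustrative choices of mine, not part of the formal definition:

```python
def digicomp(edges, state, source, sink, T):
    """edges[v] maps direction labels to successors (two edges 'L'/'R' for internal
    vertices, a single edge for the source); state[v] in {'L','R'} is the initial
    toggle setting; T is the number of balls.  Returns True iff any ball reaches
    the sink."""
    reached = False
    for _ in range(T):
        v = source
        while v in edges:                          # stop once we hit a terminal vertex
            succ = edges[v]
            if len(succ) == 1:                     # the source's single outgoing edge
                v = next(iter(succ.values()))
            else:
                direction = state[v]
                state[v] = 'R' if direction == 'L' else 'L'   # the ball flips the toggle
                v = succ[direction]
        if v == sink:
            reached = True
    return reached

# Tiny example: one toggle that routes balls alternately to the sink and to a dead end.
edges = {'src': {'out': 't'}, 't': {'L': 'sink', 'R': 'dead'}}
print(digicomp(edges, {'t': 'L'}, 'src', 'sink', T=3))   # True: balls 1 and 3 reach the sink
```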
Of course we could consider other problems: for example, the problem of deciding whether an odd number of balls reach the sink, or of counting how many balls reach the sink, or of computing the final value of every state-variable s(v).  However, it’s not hard to show that all of these problems are interreducible with the DIGICOMP problem as defined above.
### The Class CC
My main result, in this paperlet, is to pin down the complexity of the DIGICOMP problem in terms of a complexity class called CC (Comparator Circuit): a class that’s obscure enough not to be in the Complexity Zoo (!), but that’s been studied in several papers.  CC was defined by Subramanian in his 1990 Stanford PhD thesis; around the same time Mayr and Subramanian showed the inclusion NL ⊆ CC (the inclusion CC ⊆ P is immediate).  Recently Cook, Filmus, and Lê revived interest in CC with their paper The Complexity of the Comparator Circuit Value Problem, which is probably the best current source of information about this class.
OK, so what is CC?  Informally, it’s the class of problems that you can solve using a comparator circuit, which is a circuit that maps n bits of input to n bits of output, and whose only allowed operation is to sort any desired pair of bits.  That is, a comparator circuit can repeatedly apply the transformation (x,y)→(x∧y,x∨y), in which 00, 01, and 11 all get mapped to themselves, while 10 gets mapped to 01.  Note that there’s no facility in the circuit for copying bits (i.e., for fanout), so sorting could irreversibly destroy information about the input.  In the comparator circuit value problem (or CCV), we’re given as input a description of a comparator circuit C, along with an input x∈{0,1}ⁿ and an index i∈[n]; then the problem is to determine the final value of the ith bit when C is applied to x.  Then CC is simply the class of all languages that are L-reducible to CCV.
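Spelled out in code, evaluating a comparator circuit is one line per gate; here’s a minimal sketch (the list-of-wire-pairs encoding of the circuit is just an illustrative convention):

```python
def evaluate_comparator_circuit(gates, x):
    """gates is a list of wire-index pairs (i, j); applying a gate puts x[i] AND x[j]
    on wire i and x[i] OR x[j] on wire j, i.e. it sorts that pair of bits.
    Returns the final values of all n wires."""
    wires = list(x)
    for i, j in gates:
        wires[i], wires[j] = wires[i] & wires[j], wires[i] | wires[j]
    return wires

# Example: three comparator gates suffice to sort three bits.
print(evaluate_comparator_circuit([(0, 1), (1, 2), (0, 1)], [1, 0, 1]))   # [0, 1, 1]
```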
As Cook et al. discuss, there are various other characterizations of CC: for example, rather than using a complete problem, we can define CC directly as the class of languages computed by uniform families of comparator circuits.  More strikingly, Mayr and Subramanian showed that CC has natural complete problems, which include (decision versions of) the famous Stable Marriage Problem, as well as finding the lexicographically first perfect matching in a bipartite graph.  So perhaps the most appealing definition of CC is that it’s “the class of problems that can be easily mapped to the Stable Marriage Problem.”
It’s a wide-open problem whether CC=NL or CC=P: as usual, one can give oracle separations, but as far as anyone knows, either equality could hold without any dramatic implications for “standard” complexity classes.  (Of course, the conjunction of these equalities would have a dramatic implication.)  What got Cook et al. interested was that CC isn’t even known to contain (or be contained in) the class NC of parallelizable problems.  In particular, linear-algebra problems in NC, like determinant, matrix inversion, and iterated matrix multiplication—not to mention other problems in P, like linear programming and greatest common divisor—might all be examples of problems that are efficiently solvable by Boolean circuits, but not by comparator circuits.
One final note about CC.  Cook et al. showed the existence of a universal comparator circuit: that is, a single comparator circuit C able to simulate any other comparator circuit C’ of some fixed size, given a description of C’ as part of its input.
### DIGICOMP is CC-Complete
I can now proceed to my result: that, rather surprisingly, the Digi-Comp II can solve exactly the problems in CC, giving us another characterization of that class.
I’ll prove this using yet another model of computation, which I call the pebbles model.  In the pebbles model, you start out with a pile of x pebbles; the positive integer x is the “input” to your computation.  Then you’re allowed to apply a straight-line program that consists entirely of the following two operations:
1. Given any pile of y pebbles, you can split it into two piles consisting of ⌈y/2⌉ and ⌊y/2⌋ pebbles respectively.
2. Given any two piles, consisting of y and z pebbles respectively, you can combine them into a single pile consisting of y+z pebbles.
Your program “accepts” if and only if some designated output pile contains at least one pebble (or, in a variant that can be shown to be equivalent, if it contains an odd number of pebbles).
As suggested by the imagery, you don’t get to make “backup copies” of the piles before splitting or combining them: if, for example, you merge y with z to create y+z, then y isn’t also available to be split into ⌈y/2⌉ and ⌊y/2⌋.
Note that the ceiling and floor functions are the only “nonlinear” elements of the pebbles model: if not for them, we’d simply be applying a sequence of linear transformations.
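Here’s a minimal Python sketch of a pebble program, with piles kept in a dict under labels of my own choosing (the labels and the toy program are illustrative only):

```python
def split(piles, src, dst_hi, dst_lo):
    """Destroy pile src (of y pebbles), creating piles of ceil(y/2) and floor(y/2)."""
    y = piles.pop(src)
    piles[dst_hi] = (y + 1) // 2
    piles[dst_lo] = y // 2

def merge(piles, a, b, dst):
    """Destroy piles a and b, creating a single pile of y + z pebbles."""
    piles[dst] = piles.pop(a) + piles.pop(b)

# Toy straight-line program on an input pile of x = 5 pebbles:
piles = {'input': 5}
split(piles, 'input', 'hi', 'lo')    # hi = 3, lo = 2; 'input' no longer exists
merge(piles, 'hi', 'lo', 'out')      # out = 5
print(piles['out'] >= 1)             # True: the designated output pile is nonempty, so we "accept"
```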
I can now divide my CC-completeness proof into two parts: first, that DIGICOMP (i.e., the problem of simulating the Digi-Comp II) is equivalent to the pebbles model, and second, that the pebbles model is equivalent to comparator circuits.
Let’s first show the equivalence between DIGICOMP and pebbles.  The reduction is simply this: in a given Digi-Comp, each edge will be associated to a pile, with the number of pebbles in the pile equal to the total number of balls that ever traverse that edge.  Thus, we have T balls dropped in to the edge incident to the source vertex, corresponding to an initial pile with T pebbles.  Multiple edges pointing to the same vertex (i.e., fan-in) can be modeled by combining the associated piles into a single pile.  Meanwhile, a toggle has the effect of splitting a pile: if y balls enter the toggle in total, then ⌈y/2⌉ balls will ultimately exit in whichever direction the toggle was pointing initially (whether left or right), and ⌊y/2⌋ balls will ultimately exit in the other direction.  It’s clear that this equivalence works in both directions: not only does it let us simulate any given Digi-Comp by a pebble program, it also lets us simulate any pebble program by a suitably-designed Digi-Comp.
OK, next let’s show the equivalence between pebbles and comparator circuits.  As a first step, given any comparator circuit, I claim that we can simulate it by a pebble program.  The way to do it is simply to use a pile of 0 pebbles to represent each “0” bit, and a pile of 1 pebble to represent each “1” bit.  Then, any time we want to sort two bits, we simply merge their corresponding piles, then split the result back into two piles.  The result?  00 gets mapped to 00, 11 gets mapped to 11, and 01 and 10 both get mapped to one pebble in the ⌈y/2⌉ pile and zero pebbles in the ⌊y/2⌋ pile.  At the end, a given pile will have a pebble in it if and only if the corresponding output bit in the comparator circuit is 1.
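Here’s that gadget in miniature—merging two 0-or-1-pebble piles and splitting the result behaves exactly like a comparator gate (the function name is just for illustration):

```python
def comparator_via_pebbles(x, y):
    total = x + y                          # merge the two 0-or-1-pebble piles
    return (total + 1) // 2, total // 2    # split: the ceil-pile gets x OR y, the floor-pile x AND y

for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, '->', comparator_via_pebbles(*pair))
# (0,0)->(0,0), (0,1)->(1,0), (1,0)->(1,0), (1,1)->(1,1)
```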
One might worry that the input to a comparator circuit is a sequence of bits, whereas I said before that the input to a pebble program is just a single pile.  However, it’s not hard to see that we can deal with this, without leaving the world of logspace reductions, by breaking up an initial pile of n pebbles into n piles each of zero pebbles or one pebble, corresponding to any desired n-bit string (along with some extra pebbles, which we subsequently ignore).  Alternatively, we could generalize the pebbles model so that the input can consist of multiple piles.  One can show, by a similar “breaking-up” trick, that this wouldn’t affect the pebbles model’s equivalence to the DIGICOMP problem.
Finally, given a pebble program, I need to show how to simulate it by a comparator circuit.  The reduction works as follows: let T be the number of pebbles we’re dealing with (or even just an upper bound on that number).  Then each pile will be represented by its own group of T wires in the comparator circuit.  The Hamming weight of those T wires—i.e., the number of them that contain a ‘1’ bit—will equal the number of pebbles in the corresponding pile.
To merge two piles, we first merge the corresponding groups of T wires.  We then use comparator gates to sort the bits in those 2T wires, until all the ‘1’ bits have been moved into the first T wires.  Finally, we ignore the remaining T wires for the remainder of the computation.
To split a pile, we first use comparator gates to sort the bits in the T wires, until all the ‘1’ bits have been moved to the left.  We then route all the odd-numbered wires into “Pile A” (the one that’s supposed to get ⌈y/2⌉ pebbles), and route all the even-numbered wires into “Pile B” (the one that’s supposed to get ⌊y/2⌋ pebbles).  Finally, we introduce T additional wires with 0’s in them, so that piles A and B have T wires each.
At the end, by examining the leftmost wire in the group of wires corresponding to the output pile, we can decide whether that pile ends up with any pebbles in it.
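To make the wire-group bookkeeping concrete, here’s a sketch of the merge and split gadgets, using the same (i, j) → (AND on wire i, OR on wire j) convention as the evaluator above; a quadratic bubble-sort network stands in for whatever sorter one prefers, and T, the wire labels, and the demo are all just illustrative:

```python
T = 3   # upper bound on the number of pebbles; each pile is a group of T wires

def sort_ones_to_front(wire_group):
    """Comparator gates that move every 1 among the listed wires to the earliest
    listed positions: a simple bubble-sort network with len(wire_group) passes."""
    gates = []
    for _ in range(len(wire_group)):
        for k in range(len(wire_group) - 1):
            gates.append((wire_group[k + 1], wire_group[k]))   # OR (the larger bit) goes to the earlier wire
    return gates

def merge_gadget(pile_a, pile_b):
    """Sort the combined 2T wires; the merged pile is the first T wires, the rest are ignored."""
    combined = pile_a + pile_b
    return sort_ones_to_front(combined), combined[:T]

def split_gadget(pile, fresh_zeros):
    """Sort the pile's T wires, route the 1st, 3rd, 5th, ... wires to the ceil-pile and the
    rest to the floor-pile, padding each back up to T wires with fresh all-zero wires."""
    gates = sort_ones_to_front(pile)
    need = T - len(pile[0::2])
    return gates, pile[0::2] + fresh_zeros[:need], pile[1::2] + fresh_zeros[need:]

# Demo: split a pile of 2 pebbles (wires 0-2 hold bits 0,1,1; wires 3-5 are fresh zeros).
wires = [0, 1, 1, 0, 0, 0]
gates, ceil_pile, floor_pile = split_gadget([0, 1, 2], [3, 4, 5])
for i, j in gates:
    wires[i], wires[j] = wires[i] & wires[j], wires[i] | wires[j]
print(sum(wires[w] for w in ceil_pile), sum(wires[w] for w in floor_pile))   # 1 1, i.e. ceil(2/2) and floor(2/2)
```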
Since it’s clear that all of the above transformations can be carried out in logspace (or even smaller complexity classes), this completes the proof that DIGICOMP is CC-complete under L-reductions.  As corollaries, the Stable Marriage and lexicographically-first perfect matching problems are L-reducible to DIGICOMP—or informally, are solvable by easily-described, polynomial-size Digi-Comp machines (and indeed, characterize the power of such machines).  Combining my result with the universality result of Cook et al., a second corollary is that there exists a “universal Digi-Comp”: that is, a single Digi-Comp D that can simulate any other Digi-Comp D’ of some polynomially-smaller size, so long as we initialize some subset of the toggles in D to encode a description of D’.
### How Does the Digi-Comp Avoid Universality?
Let’s now step back and ask: given that the Digi-Comp is able to do so many things—division, Stable Marriage, bipartite matching—how does it fail to be a universal computer, at least a circuit-universal one?  Is the Digi-Comp a counterexample to the oft-repeated claims of people like Stephen Wolfram, about the ubiquity of universal computation and the difficulty of avoiding it in any sufficiently complex system?  What would need to be added to the Digi-Comp to make it circuit-universal?  Of course, we can ask the same questions about pebble programs and comparator circuits, now that we know that they’re all computationally equivalent.
The reason for the failure of universality is perhaps easiest to see in the case of comparator circuits.  As Steve Cook pointed out in a talk, comparator circuits are “1-Lipschitz”: that is, if you have a comparator circuit acting on n input bits, and you change one of the input bits, your change can affect at most one output bit.  Why?  Well, trace through the circuit and use induction.  So in particular, there’s no amplification of small effects in comparator circuits, no chaos, no sensitive dependence on initial conditions, no whatever you want to call it.  Now, while chaos doesn’t suffice for computational universality, at least naïvely it’s a necessary condition, since there exist computations that are chaotic.  Of course, this simpleminded argument can’t be all there is to it, since otherwise we would’ve proved CC≠P.  What the argument does show is that, if CC=P, then the encoding of a Boolean circuit into a comparator circuit (or maybe into a collection of such circuits) would need to be subtle and non-obvious: it would need to take computations with the potential for chaos, and reduce them to computations without that potential.
Once we understand this 1-Lipschitz business, we can also see it at work in the pebbles model.  Given a pebble program, suppose someone surreptitiously removed a single pebble from one of the initial piles.  For want of that pebble, could the whole kingdom be lost?  Not really.  Indeed, you can convince yourself that the output will be exactly the same as before, except that one output pile will have one fewer pebble than it would have otherwise.  The reason is again an induction: if you change x by 1, that affects at most one of ⌈x/2⌉ and ⌊x/2⌋ (and likewise, merging two piles affects at most one pile).
We now see the importance of the point I made earlier, about there being no facility in the pebbles model for “copying” a pile.  If we could copy piles, then the 1-Lipschitz property would fail.  And indeed, it’s not hard to show that in that case, we could implement AND, OR, and NOT gates with arbitrary fanout, and would therefore have a circuit-universal computer.  Likewise, if we could copy bits, then comparator circuits—which, recall, map (x,y) to (x∧y,x∨y)—would implement AND, OR, and NOT with arbitrary fanout, and would be circuit-universal.  (If you’re wondering how to implement NOT: one way to do it is to use what’s known in quantum computing as the “dual-rail representation,” where each bit b is encoded by two bits, one for b and the other for ¬b.  Then a NOT can be accomplished simply by swapping those bits.  And it’s not hard to check that comparator gates in a comparator circuit, and combining and splitting two piles in a pebble program, can achieve the desired updates to both the b rails and the ¬b rails when an AND or OR gate is applied.  However, we could also just omit NOT gates entirely, and use the fact that computing the output of even a monotone Boolean circuit is a P-complete problem under L-reductions.)
In summary, then, the inability to amplify small effects seems like an excellent candidate for the central reason why the power of comparator circuits and pebble programs hits a ceiling at CC, and doesn’t go all the way up to P.  It’s interesting, in this connection, that while transistors (and before them, vacuum tubes) can be used to construct logic gates, the original purpose of both of them was simply to amplify information: to transform a small signal into a large one.  Thus, we might say, comparator circuits and pebble programs fail to be computationally universal because they lack transistors or other amplifiers.
I’d like to apply exactly the same analysis to the Digi-Comp itself: that is, I’d like to say that the reason the Digi-Comp fails to be universal (unless CC=P) is that it, too, lacks the ability to amplify small effects (wherein, for example, the drop of a single ball would unleash a cascade of other balls).  In correspondence, however, David Deutsch pointed out a problem: namely, if we just watch a Digi-Comp in action, then it certainly looks like it has an amplification capability!  Consider, for example, the binary counter discussed earlier.  Suppose a column of ten toggles is in the configuration RRRRRRRRRR, representing the integer 1023.  Then the next ball to fall down will hit all ten toggles in sequence, resetting them to LLLLLLLLLL (and thus, resetting the counter to 0).  Why isn’t this precisely the amplification of a small effect that we were looking for?
Well, maybe it’s amplification, but it’s not of a kind that does what we want computationally.  One way to see the difficulty is that we can’t take all those “L” settings we’ve produced as output, and feed them as inputs to further gates in an arbitrary way.  We could do it if the toggles were arranged in parallel, but they’re arranged serially, so that flipping any one toggle inevitably has the potential also to flip the toggles below it.  Deutsch describes this as a “failure of composition”: in some sense, we do have a fan-out or copying operation, but the design of the Digi-Comp prevents us from composing the fan-out operation with other operations in arbitrary ways, and in particular, in the ways that would be needed to simulate any Boolean circuit.
So, what features could we add to the Digi-Comp to make it universal?  Here’s the simplest possibility I was able to come up with: suppose that, scattered throughout the device, there were balls precariously perched on ledges, in such a way that whenever one was hit by another ball, it would get dislodged, and both balls would continue downward.  We could, of course, chain several of these together, so that the two balls would in turn dislodge four balls, the four would dislodge eight, and so on.  I invite you to check that this would provide the desired fan-out gate, which, when combined with AND, OR, and NOT gates that we know how to implement (e.g., in the dual-rail representation described previously), would allow us to simulate arbitrary Boolean circuits.  In effect, the precariously perched balls would function as “transistors” (of course, painfully slow transistors, and ones that have to be laboriously reloaded with a ball after every use).
As a second possibility, Charles Leiserson points out to me that the Digi-Comp, as sold, has a few switches and toggles that can be controlled by other toggles.  Depending on exactly how one modeled this feature, it’s possible that it, too, could let us implement arbitrary fan-out gates, and thereby boost the Digi-Comp up to circuit-universality.
### Open Problems
My personal favorite open problem is this:
What is the complexity of simulating a Digi-Comp II if the total number of balls dropped in is exponential, rather than polynomial?  (In other words, if the positive integer T, representing the number of balls, is encoded in binary rather than in unary?)
From the equivalence between the Digi-Comp and pebble programs, we can already derive a conclusion about the above problem that’s not intuitively obvious: namely, that it’s in P.  Or to say it another way: it’s possible to predict the exact state of a Digi-Comp with n toggles, after T balls have passed through it, using poly(n, log T) computation steps.  The reason is simply that, if there are T balls, then the total number of balls that pass through any given edge (the only variable we need to track) can be specified using log₂T bits.  This, incidentally, gives us a second sense in which the Digi-Comp is not a universal computer: namely, even if we let the machine “run for exponential time” (that is, drop exponentially many balls into it), unlike a conventional digital computer it can’t possibly solve all problems in PSPACE, unless P=PSPACE.
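Here’s that counting argument as a Python sketch, reusing the same illustrative dict encoding as the DIGICOMP sketch earlier (the standard library’s graphlib supplies the topological sort):

```python
from collections import defaultdict
from graphlib import TopologicalSorter

def digicomp_counts(edges, state, source, T):
    """Return (#balls reaching each terminal vertex, final toggle states), processing the
    DAG in topological order and tracking only ball counts: poly(n, log T) arithmetic."""
    preds = defaultdict(set)
    for v, succ in edges.items():
        for w in succ.values():
            preds[w].add(v)
    incoming = defaultdict(int, {source: T})
    final_state = dict(state)
    terminal_counts = defaultdict(int)
    for v in TopologicalSorter(preds).static_order():
        y, succ = incoming[v], edges.get(v)
        if succ is None:
            terminal_counts[v] += y                    # terminal vertex (e.g. the sink)
        elif len(succ) == 1:                           # the source's single outgoing edge
            incoming[next(iter(succ.values()))] += y
        else:
            other = 'R' if state[v] == 'L' else 'L'
            incoming[succ[state[v]]] += (y + 1) // 2   # ceil(y/2) balls exit the way the toggle initially points
            incoming[succ[other]] += y // 2            # floor(y/2) exit the other way
            if y % 2:                                  # an odd number of balls leaves the toggle flipped
                final_state[v] = other
    return terminal_counts, final_state

# Same tiny example as before, but with an astronomically large number of balls:
edges = {'src': {'out': 't'}, 't': {'L': 'sink', 'R': 'dead'}}
print(digicomp_counts(edges, {'t': 'L'}, 'src', T=10**100)[0]['sink'])   # half of T, i.e. 5*10**99
```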
However, this situation also presents us with a puzzle: if we let DIGICOMPEXP be the problem of simulating a Digi-Comp with an exponential number of balls, then it’s clear that DIGICOMPEXP is hard for CC and contained in P, but we lack any information about its difficulty more precise than that.  At present, I regard both extremes—that DIGICOMPEXP is in CC (and hence, no harder than ordinary DIGICOMP), and that it’s P-complete—as within the realm of possibility (along with the possibility that DIGICOMPEXP is intermediate between the two).
By analogy, one can also consider comparator circuits where the entities that get compared are integers from 1 to T rather than bits—and one can then consider the power of such circuits, when T is allowed to grow exponentially.  In email correspondence, however, Steve Cook sent me a proof that such circuits have the same power as standard, Boolean comparator circuits.  It’s not clear whether this tells us anything about the power of a Digi-Comp with exponentially many balls.
A second open problem is to formalize the feature of Digi-Comp that Charles mentioned—namely, toggles and switches controlled by other toggles—and see whether, under some reasonable formalization, that feature bumps us up to P-completeness (i.e., to circuit-universality).  Personally, though, I confess I’d be even more interested if there were some feature we could add to the machine that gave us a larger class than CC, but that still wasn’t all of P.
A third problem is to pin down the power of Digi-Comps (or pebble programs, or comparator circuits) that are required to be planar.  While my experience with woodcarving is limited, I imagine that planar or near-planar graphs are a lot easier to carve than arbitrary graphs (even if the latter present no problems of principle).
A fourth problem has to do with the class CC in general, rather than the Digi-Comp in particular, but I can’t resist mentioning it.  Let CCEXP be the complexity class that’s just like CC, but where the comparator circuit (or pebble program, or Digi-Comp) is exponentially large and specified only implicitly (that is, by a Boolean circuit that, given as input a binary encoding of an integer i, tells you the ith bit of the comparator circuit’s description).  Then it’s easy to see that PSPACE ⊆ CCEXP ⊆ EXP.  Do we have CCEXP = PSPACE or CCEXP = EXP?  If not, then CCEXP would be the first example I’ve ever seen of a natural complexity class intermediate between PSPACE and EXP.
### Acknowledgments
I thank Charles Leiserson for bringing the Digi-Comp II to MIT, and thereby inspiring this “research.”  I also thank Steve Cook, both for giving a talk that first brought the complexity class CC to my attention, and for helpful correspondence.  Finally I thank David Deutsch for the point about composition.
Here are the PowerPoint slides for a physics colloquium I gave at Caltech yesterday, on “Computational Intractability as a Law of Physics.” The talk was delivered, so I was told, in the very same auditorium where Feynman gave his Lectures on Physics. At the teatime beforehand, I was going to put both milk and lemon in my tea to honor the old man, but then I decided I actually didn’t want to.
I’m at Caltech till Tuesday, at which point I leave for New Zealand, to visit my friend Miriam from Berkeley and see a country I always wanted to see, and thence to Australia for QIP. This Caltech visit, my sixth or seventh, has been every bit as enjoyable as I’ve come to expect: it’s included using Andrew Childs as a straight man for jokes, shootin’ the qubits with Shengyu Zhang, Aram Harrow, and Robin Blume-Kohout, and arguing with Sean Carroll over which one of us is the second-funniest physics blogger (we both agree that Luboš is the funniest by far). Indeed, John Preskill (my host) and everyone else at the Institute for Quantum Information have been so darn hospitable that from now on, I might just have to shill for quantum computing theory.
Seth Teller was a colleague of mine in CSAIL and the EECS department, and was one of my favorite people in all of MIT.  He was a brilliant roboticist, who (among many other things) spearheaded MIT’s participation in the DARPA Grand Challenge for self-driving cars, and who just recently returned from a fact-finding trip to Fukushima, Japan, to see how robots could help in investigating the damaged reactor cores there.  I saw Seth twice a week at lab and department lunches, and he often struck up conversations with me about quantum computing, cosmology, and other things.  His curiosity was immense, wide-ranging, and almost childlike (in the best way).  One small indication of his character is that, in the DARPA challenge, Seth opted not to preload MIT’s car with detailed data about the course, because he thought doing so made the challenge scientifically less interesting—even though DARPA’s rules allowed such preloading, the other teams did it, and it almost certainly would have improved MIT’s standing in the competition.
Seth was a phenomenal speaker, whose passion and clarity always won me over even though my research interests were far from his.  I made it a point to show up for lab lunch whenever I knew he’d be speaking.  Seth was also, from what I’ve heard, a superb mentor and teacher, who won an award earlier this year for his undergraduate advising.
Seth died ten days ago, on July 1st.  (See here for MIT News’s detailed obituary, and here for an article in Cambridge Day.)  While no cause of death was given at the time, according to an update yesterday in the MIT Tech, the death has been ruled a suicide.  Seth is survived by his wife, Rachel, and by two daughters.
With his cheerful, can-do disposition, Seth is one of the last people on earth I’d imagine doing this: whatever he was going through, he did an unbelievable job of hiding it.  I’m certain he wouldn’t abandon his family unless he was suffering unimaginable pain.  If there’s a tiny atom of good to come out of this, I hope that at least one other person contemplating suicide will reflect on how much Seth had to live for, and that doing so will inspire that person to get the medical help they need.
Incidentally, outside of his research and teaching, Seth was also an activist for protecting the physical environment and open spaces of East Cambridge.  At the “Wild and Crazy Ideas Session” of one CSAIL retreat, Seth floated a truly wild idea: to replace Memorial Drive, or at least the part of it that separates the MIT campus from the Charles River, by an underground tunnel, so that the land above the tunnel could be turned into a beautiful riverfront park.  In his characteristic fashion, Seth had already done a pretty detailed engineering analysis, estimating the cost at “merely” a few hundred million dollars: a lot, but a worthy investment in MIT’s future.  In any case, I can’t imagine a better way to memorialize Seth than to name some green space in East Cambridge after him, and I hope that happens.
Seth will be sorely missed.  My thoughts go out to his family at this difficult time.
Or, why I will weigh at least 3000 pounds by the time I get tenure.
* Fresh fruit (eaten in highly nontrivial quantities): grapefruit, watermelon, raspberries, blackberries, cherries, mangoes. Abnormally high tolerance for citrus (will eat plain lemons and limes with no problem).
* Vegetables: boiled broccoli, corn on the cob, avocado, raw baby carrots, cucumber, mashed potatoes, cherry tomatoes
* Peanuts, cashews, walnuts
* Beverages: fruit smoothies (mango, raspberry, banana), sparkling grape juice, coconut juice, iced tea, iced coffee, Hong Kong style bubble tea, fresh OJ, fresh lemonade, beer, champagne. Always looking for new and exotic fruit drinks. Not big on water. Trying to eliminate corn-syrup sodas.
* Chicken
* Steak, pot roast, burgers, pastrami
* Fresh fish of all kinds: salmon, mahi-mahi, halibut, tuna (not shellfish)
* Lots of soup: chicken-noodle, beef-vegetable, potato, chili…
* Breakfast: egg-and-cheese sandwich, veggie omelet, french toast, waffles with fresh fruit, Count Chocula (with whole milk, of course)
* Italian: eggplant parm, spaghetti with meatballs, linguini with salmon, cheese ravioli, garlic bread, pizza with onions and mushrooms (if that counts as Italian)
* Indian: samosas, garlic naan, numerous variations on lamb & rice, gulab jamun, chai
* (American) Chinese: egg drop soup, crunchy noodles, mu shu chicken
* Thai: cashew chicken, mango and sticky rice, Thai iced tea
* Japanese: edamame, tuna sashimi with wasabi, udon noodles, teriyaki, beef sukiyaki. Also all sorts of junk food (mochi ice cream, Hi-Chew…)
* Greek and Middle Eastern: falafel, pita with hummus, baklava
* Jewish: latkes, fried matzo, fried artichokes, gefilte with horseradish (really — it’s good!), bagels with cream cheese and whitefish (not so much lox), beef brisket
* Russian: borscht, potato pierogies
* Ethiopian: injera bread (the “edible tablecloth”), Tej honey-wine
* British:
* American Thanksgiving dinner: turkey, stuffing, sweet potatoes, cranberry sauce, pumpkin pie — any time of year.
* Sweets: candy apples, Jelly Bellys (especially licorice and coconut), Mentos, Tic-Tacs, peppermint patties, Charleston Chew, Peanut Chews (as mentioned earlier), Peanut M&M’s, cherry Starburst, saltwater taffy, fudge, funnel cake, smores, Australian candied apricots, Turkish Delight, Hot Tamales, Mint Milano, soft chocolate-chip cookies, chocolate-dipped strawberries, chocolate pudding, chocolate mousse, warm chocolate cake, chocolate-covered Rice Krispies squares, sugar cubes, pure granulated sugar straight out of the bag… mmmmmmmm….
* Ice cream (obviously a separate category): Ben & Jerry’s Chocolate Fudge Brownie, Breyer’s coffee or mint chocolate chip, ice cream sandwiches, hot fudge sundaes (the hot fudge is crucial — not just chocolate syrup), banana splits, fruit sorbet, gelati (ice cream + water ice), fresh-made gelato, Freeze-Dried Astronaut Ice Cream (found in science museum gift shops)
Go ahead and list your own favorites in the comments section, together with your research area (or line of work if you have a real job). Then we’ll see if there’s any correlation between the two. See, this isn’t procrastination: it’s serious research.
Update (1/23): I finally fixed the time stamps in the comments section. Unfortunately, this will cause comments to appear out-of-order during an 8-hour window.
So, the Templeton Foundation invited me to write a 1500-word essay on the above question.  It’s like a blog post, except they pay me to do it!  My essay is now live, here.  I hope you enjoy my attempt at techno-futurist prose.  You can comment on the essay either here or over at Templeton’s site.  Thanks very much to Ansley Roan for commissioning the piece.
Hamas is trying to kill as many civilians as it can.
Israel is trying to kill as few civilians as it can.
Neither is succeeding very well.
* * *
Update (July 28): Please check out a superb essay by Sam Harris on the Israeli/Palestinian conflict.  While, as Harris says, the essay contains “something to offend everyone”—even me—it also brilliantly articulates many of the points I’ve been trying to make in this comment thread.
See also a good HuffPost article by Ali A. Rizvi, a “Pakistani-Canadian writer, physician, and musician.”
I’ve just come from a thin strip of volcanic ash near Antarctica, on which no mammal except bats set foot until a thousand years ago, and which today is mostly inhabited by sheep and by people who say “nigh-oh” when they mean “no.” I’m referring, of course, to New Zealand — or as the locals call it, “Middle Earth.” My colleague Andris Ambainis and I were in Auckland for four days, en route to QIP’2007 in Brisbane. While there, we were fed and sheltered by our friend Miriam and her boyfriend David. Miriam was both my housemate and officemate my first year at Berkeley; she now does user-interface research for a web-design company called Shift. You can see some of her handiwork, and learn more about her sheep-intensive homeland, by visiting this website. Hey, if Miriam took you around a place like this
you’d shill for her too. So, now that I was surrounded by one of the last relatively-intact wildernesses on Earth, what did I do there? If it were up to me, mostly blog, eat, and check email. Fortunately Miriam didn’t let me get away with my default ways, and repeatedly dragged me by my ears on Cultural Learning Experiences. And that’s what allows me to present the following Shtetl-Optimized New Zealand Educational Supplement.
* Auckland is almost certain to be destroyed sometime in the next few millennia by one of the fifty or so active volcanoes it’s built on. On the bright side, like most of the world’s current cities, it will probably be underwater long before that.
* New Zealand is the first place I’ve visited where the ozone hole is a serious everyday concern. Especially now, in summertime, when the hole over Antarctica is largest, you’re not supposed to go outside for even a few minutes without sunblock.
* I’d always imagined the Maori as a nearly-extinct people who lived on reservations doing tribal dances for tourists. Actually they’re ~15% of the population, and have so assimilated with the pakehas (whites) that these days Maori kids get sent to special schools, weekend programs, etc. to retain something of their language and culture. (Like Hebrew day school but with more jade weapons.) Andris and I did see a traditional Maori war-dance, but you could tell that the people doing it were going to check their text messages as soon as it was over.
* New Zealand was pretty much the last habitable landmass on Earth to be reached by human beings — not even the Maori got there until 1000 AD. By comparison, the Aboriginals were already in Australia by 50,000 BC. So why was New Zealand so much harder to reach than Australia? When we examine a map, a possible answer suggests itself: because New Zealand is so friggin’ far from everything else. Australia is practically within swimming distance from Southeast Asia by comparison. Because of this, reaching New Zealand and the other Pacific Islands took advances in boat-building and navigation that only happened recently in human history. Here’s another thing I never really appreciated before: the people who did get to these islands weren’t just drifting around randomly in their canoes. They knew exactly what they were doing. Like the Europeans who came later, they were setting out repeatedly on large, organized expeditions with the specific goal of finding new islands, returning to where they started from, and then coming back to the new islands with a settling party. Ideally the new islands would be chock-full of tasty animals like the moa that, unused to land-based predators, could then be hunted to extinction.
Alright, enough book-learnin’ — let’s see some more pictures.
NerdNote: When I first published this post, it mysteriously refused to show up. Finally I figured out the problem: I’d listed the date as January 29 (which it is here in Australia), but the WordPress software thought it was still January 28, and that it should therefore wait a day before updating!
Predictably, my last post attracted plenty of outrage (some of it too vile to let through), along with the odd commenter who actually agreed with what I consider my fairly middle-of-the-road, liberal Zionist stance.  But since the outrage came from both sides of the issue, and the two sides were outraged about the opposite things, I guess I should feel OK about it.
Still, it’s hard not to smart from the burns of vituperation, so today I’d like to blog about a very different political issue: one where hopefully almost all Shtetl-Optimized readers will actually agree with me (!).
I’ve learned from colleagues that, over the past year, foreign-born scientists have been having enormously more trouble getting visas to enter the US than they used to.  The problem, I’m told, is particularly severe for cryptographers: embassy clerks are now instructed to ask specifically whether computer scientists seeking to enter the US work in cryptography.  If an applicant answers “yes,” it triggers a special process where the applicant hears nothing back for months, and very likely misses the workshop in the US that he or she had planned to attend.  The root of the problem, it seems, is something called the Technology Alert List (TAL), which has been around for a while—the State Department beefed it up in response to the 9/11 attacks—but which, for some unknown reason, is only now being rigorously enforced.  (Being marked as working in one of the sensitive fields on this list is apparently called “getting TAL’d.”)
The issue reached a comical extreme last October, when Adi Shamir, the “S” in RSA, Turing Award winner, and foreign member of the US National Academy of Sciences, was prevented from entering the US to speak at a “History of Cryptology” conference sponsored by the National Security Agency.  According to Shamir’s open letter detailing the incident, not even his friends at the NSA, or the president of the NAS, were able to grease the bureaucracy at the State Department for him.
It should be obvious to everyone that a crackdown on academic cryptographers serves no national security purpose whatsoever, and if anything harms American security and economic competitiveness, by diverting scientific talent to other countries.  (As Shamir delicately puts it, “the number of terrorists among the members of the US National Academy of Science is rather small.”)  So:
1. Any readers who have more facts about what’s going on, or personal experiences, are strongly encouraged to share them in the comments section.
2. Any readers who might have any levers of influence to pull on this issue—a Congressperson to write to, a phone call to make, an Executive Order to issue (I’m talking to you, Barack), etc.—are strongly encouraged to pull them.
A couple days ago, a reader wrote to me to ask whether it’s possible that the solution to the P vs. NP problem is simply undefined—and that one should enlarge the space of possible answers using non-classical logics (the reader mentioned something called Catuṣkoṭi logic).  Since other people have emailed me with similar questions in the past, I thought my response might be of more general interest, and decided to post it here.
* * *
Thanks for your mail!  I’m afraid I don’t agree with you that there’s a problem in the formulation of P vs. NP.  Let me come at it this way:
Do you also think there might be a problem in the formulation of Goldbach’s Conjecture?  Or the Twin Prime Conjecture?  (I.e., that maybe the definition of “prime number” needs to be modified using Catuṣkoṭi logic?)  Or any other currently-unsolved problem in any other part of math?
If you don’t, then my question would be: why single out P vs. NP?
After all, P vs. NP can be expressed as a Π₂-sentence: that is, as a certain relationship among positive integers, which either holds or doesn’t hold.  (In this case, the integers would encode Turing machines, polynomial upper bounds on their running time, and an NP-complete problem like 3SAT — all of which are expressible using the basic primitives of arithmetic.)  In terms of its logical form, then, it’s really no different than the Twin Prime Conjecture and so forth.
So then, do you think that statements of arithmetic, like there being no prime number between 24 and 28, might also be like the Parallel Postulate?  That there might be some other, equally-valid “non-Euclidean arithmetic” where there is a prime between 24 and 28?  What exactly would one mean by that?  I understand exactly what one means by non-Euclidean geometries, but to my mind, geometry is less “fundamental” (at least in a logical sense) than positive integers are.  And of course, even if one believes that non-Euclidean geometries are just as “fundamental” as Euclidean geometry — an argument that seems harder to make for, say, the positive integers versus the Gaussian integers or finite fields or p-adics  — that still doesn’t change the fact that questions about Euclidean geometry have definite right answers.
Let me acknowledge two important caveats to what I said:
First, it’s certainly possible that P vs. NP might be independent of standard formal systems like ZF set theory (i.e., neither provable nor disprovable in them).  That’s a possibility that everyone acknowledges, even if (like me) they consider it rather unlikely.  But note that, even if P vs. NP were independent of our standard formal systems, that still wouldn’t mean that the question was ill-posed!  There would still either be a Turing machine that decided 3SAT in polynomial time, or else there wouldn’t be.  It would “only” mean that the usual axioms of set theory wouldn’t suffice to tell us which.
The second caveat is that P vs. NP, like any other mathematical question, can be generalized and extended in all sorts of interesting ways.  So for example, one can define analogues of P vs. NP over the reals and complex numbers (which are also currently open, but which might be easier than the Boolean version).  Or, even if P≠NP, one can still ask if randomized algorithms, or nonuniform algorithms, or quantum algorithms, might be able to solve NP-complete problems in polynomial time.  Or one can ask whether NP-complete problems are at least efficiently solvable “on average,” if not in the worst case.  Every one of these questions has been actively researched, and you could make a case that some of them are just as interesting as the original P vs. NP question, if not more interesting — if history had turned out a little different, any one of these might have been what we’d taken as our “flagship” question, rather than P vs. NP.  But again, this still doesn’t change the fact that the original P vs. NP question has some definite answer (like, for example, P≠NP…), even if we can’t prove which answer it is, even if we won’t be able to prove it for 500 years.
And please keep in mind that, if P vs. NP were solved after being open for hundreds of years, it would be far from the first such mathematical problem!  Fermat’s Last Theorem stayed open for 350 years, and the impossibility of squaring the circle and trisecting the angle were open for more than 2000 years.  Any time before these problems were solved, one could’ve said that maybe people had failed because the question itself was ill-posed, but one would’ve been mistaken.  People simply hadn’t invented the right ideas yet.
Best regards,
Scott
* * *
Unrelated Announcements: As most of you have probably seen, Subhash Khot won the Nevanlinna Prize, while Maryam Mirzakhani, Artur Avila, Manjul Bhargava and Martin Hairer won the Fields Medal. Mirzakhani is the first female Fields Medalist. Congratulations to all!
Also, I join the rest of the world in saying that Robin Williams was a great actor—there was no one better at playing “the Robin Williams role” in any given movie—and his loss is a loss for humanity.
So, like, I’m at QIP’2007 in Brisbane, Australia? And, like, everyone’s expecting me to blog about all the wild talks and poster presentations going down in Q-Town? But, like, I don’t actually want to blog about that stuff, since it seems suspiciously close to useful content, the very thing this blog was created to avoid?
I’m therefore declaring an Open Shtetl Day, for all of my readers who happen to be in Brisbane. Here’s how it works: using the comments section, tell the world about your QIP experience. What were the best talks/results/open problems? What happened at the business meeting? (I actually want to know — I skipped it.) What are the most salacious rumors about who’s coauthoring with whom? C’mon, you know you want to post, and you know you’ve got nothing better to do. I can see I’m not the only one in this lecture hall who’s typing away on a laptop.
And get this: after a day or two, I’ll pick the best comments and QIPiest quips, and post them right here in the blog entry proper! QIPers, don’t miss what could be your big break in the competitive quantum blogosphere.
(To preempt the inevitable question: No, there’s not going to be an after-dinner speech this year. But I have it on good authority that there’ll be something in its place.)
Author’s Note: Below is the prepared version of a talk that I gave two weeks ago at the workshop Quantum Foundations of a Classical Universe, which was held at IBM’s TJ Watson Research Center in Yorktown Heights, NY.  My talk is for entertainment purposes only; it should not be taken seriously by anyone.  If you reply in a way that makes clear you did take it seriously (“I’m shocked and outraged that someone who dares to call himself a scientist would … [blah blah]”), I will log your IP address, hunt you down at night, and force you to put forward an account of consciousness and decoherence that deals with all the paradoxes discussed below—and then reply at length to all criticisms of your account.
If you’d like to see titles, abstracts, and slides for all the talks from the workshop—including by Charles Bennett, Sean Carroll, James Hartle, Adrian Kent, Stefan Leichenauer, Ken Olum, Don Page, Jason Pollack, Jess Riedel, Mark Srednicki, Wojciech Zurek, and Michael Zwolak—click here.  You’re also welcome to discuss these other nice talks in the comments section, though I might or might not be able to answer questions about them.  ~~Apparently videos of all the talks will be available before long~~ (Jess Riedel has announced that videos are now available).
(Note that, as is probably true for other talks as well, the video of my talk differs substantially from the prepared version—it mostly just consists of interruptions and my responses to them!  On the other hand, I did try to work some of the more salient points from the discussion into the text below.)
Thanks so much to Charles Bennett and Jess Riedel for organizing the workshop, and to all the participants for great discussions.
* * *
I didn’t prepare slides for this talk—given the topic, what slides would I use exactly?  “Spoiler alert”: I don’t have any rigorous results about the possibility of sentient quantum computers, to state and prove on slides.  I thought of giving a technical talk on quantum computing theory, but then I realized that I don’t really have technical results that bear directly on the subject of the workshop, which is how the classical world we experience emerges from the quantum laws of physics.  So, given the choice between a technical talk that doesn’t really address the questions we’re supposed to be discussing, or a handwavy philosophical talk that at least tries to address them, I opted for the latter, so help me God.
Let me start with a story that John Preskill told me years ago.  In the far future, humans have solved not only the problem of building scalable quantum computers, but also the problem of human-level AI.  They’ve built a Turing-Test-passing quantum computer.  The first thing they do, to make sure this is actually a quantum computer, is ask it to use Shor’s algorithm to factor a 10,000-digit number.  So the quantum computer factors the number.  Then they ask it, “while you were factoring that number, what did it feel like?  did you feel yourself branching into lots of parallel copies, which then recohered?  or did you remain a single consciousness—a ‘unitary’ consciousness, as it were?  can you tell us from introspection which interpretation of quantum mechanics is the true one?”  The quantum computer ponders this for a while and then finally says, “you know, I might’ve known before, but now I just … can’t remember.”
I like to tell this story when people ask me whether the interpretation of quantum mechanics has any empirical consequences.
Look, I understand the impulse to say “let’s discuss the measure problem, or the measurement problem, or derivations of the Born rule, or Boltzmann brains, or observer-counting, or whatever, but let’s take consciousness off the table.”  (Compare: “let’s debate this state law in Nebraska that says that, before getting an abortion, a woman has to be shown pictures of cute babies.  But let’s take the question of whether or not fetuses have human consciousness—i.e., the actual thing that’s driving our disagreement about that and every other subsidiary question—off the table, since that one is too hard.”)  The problem, of course, is that even after you’ve taken the elephant off the table (to mix metaphors), it keeps climbing back onto the table, often in disguises.  So, for better or worse, my impulse tends to be the opposite: to confront the elephant directly.
Having said that, I still need to defend the claim that (a) the questions we’re discussing, centered around quantum mechanics, Many Worlds, and decoherence, and (b) the question of which physical systems should be considered “conscious,” have anything to do with each other.  Many people would say that the connection doesn’t go any deeper than: “quantum mechanics is mysterious, consciousness is also mysterious, ergo maybe they’re related somehow.”  But I’m not sure that’s entirely true.  One thing that crystallized my thinking about this was a remark made in a lecture by Peter Byrne, who wrote a biography of Hugh Everett.  Byrne was discussing the question, why did it take so many decades for Everett’s Many-Worlds Interpretation to become popular?  Of course, there are people who deny quantum mechanics itself, or who have basic misunderstandings about it, but let’s leave those people aside.  Why did people like Bohr and Heisenberg dismiss Everett?  More broadly: why wasn’t it just obvious to physicists from the beginning that “branching worlds” is a picture that the math militates toward, probably the simplest, easiest story one can tell around the Schrödinger equation?  Even if early quantum physicists rejected the Many-Worlds picture, why didn’t they at least discuss and debate it?
Here was Byrne’s answer: he said, before you can really be on board with Everett, you first need to be on board with Daniel Dennett (the philosopher).  He meant: you first need to accept that a “mind” is just some particular computational process.  At the bottom of everything is the physical state of the universe, evolving via the equations of physics, and if you want to know where consciousness is, you need to go into that state, and look for where computations are taking place that are sufficiently complicated, or globally-integrated, or self-referential, or … something, and that’s where the consciousness resides.  And crucially, if following the equations tells you that after a decoherence event, one computation splits up into two computations, in different branches of the wavefunction, that thereafter don’t interact—congratulations!  You’ve now got two consciousnesses.
And if everything above strikes you as so obvious as not to be worth stating … well, that’s a sign of how much things changed in the latter half of the 20th century.  Before then, many thinkers would’ve been more likely to say, with Descartes: no, my starting point is not the physical world.  I don’t even know a priori that there is a physical world.  My starting point is my own consciousness, which is the one thing besides math that I can be certain about.  And the point of a scientific theory is to explain features of my experience—ultimately, if you like, to predict the probability that I’m going to see X or Y if I do A or B.  (If I don’t have prescientific knowledge of myself, as a single, unified entity that persists in time, makes choices, and later observes their consequences, then I can’t even get started doing science.)  I’m happy to postulate a world external to myself, filled with unseen entities like electrons behaving in arbitrarily unfamiliar ways, if it will help me understand my experience—but postulating other versions of me is, at best, irrelevant metaphysics.  This is a viewpoint that could lead you to Copenhagenism, or to its newer variants like quantum Bayesianism.
I’m guessing that many people in this room side with Dennett, and (not coincidentally, I’d say) also with Everett.  I certainly have sympathies in that direction too.  In fact, I spent seven or eight years of my life as a Dennett/Everett hardcore believer.  But, while I don’t want to talk anyone out of the Dennett/Everett view, I’d like to take you on a tour of what I see as some of the extremely interesting questions that that view leaves unanswered.  I’m not talking about “deep questions of meaning,” but about something much more straightforward: what exactly does a computational process have to do to qualify as “conscious”?
Of course, there are already tremendous difficulties here, even if we ignore quantum mechanics entirely.  Ken Olum went over much of this ground in his talk yesterday (see here for a relevant paper by Davenport and Olum).  You’ve all heard the ones about, would you agree to be painlessly euthanized, provided that a complete description of your brain would be sent to Mars as an email attachment, and a “perfect copy” of you would be reconstituted there?  Would you demand that the copy on Mars be up and running before the original was euthanized?  But what do we mean by “before”—in whose frame of reference?
Some people say: sure, none of this is a problem!  If I’d been brought up since childhood taking family vacations where we all emailed ourselves to Mars and had our original bodies euthanized, I wouldn’t think anything of it.  But the philosophers of mind are barely getting started.
There’s this old chestnut, what if each person on earth simulated one neuron of your brain, by passing pieces of paper around.  It took them several years just to simulate a single second of your thought processes.  Would that bring your subjectivity into being?  Would you accept it as a replacement for your current body?  If so, then what if your brain were simulated, not neuron-by-neuron, but by a gigantic lookup table?  That is, what if there were a huge database, much larger than the observable universe (but let’s not worry about that), that hardwired what your brain’s response was to every sequence of stimuli that your sense-organs could possibly receive.  Would that bring about your consciousness?  Let’s keep pushing: if it would, would it make a difference if anyone actually consulted the lookup table?  Why can’t it bring about your consciousness just by sitting there doing nothing?
To these standard thought experiments, we can add more.  Let’s suppose that, purely for error-correction purposes, the computer that’s simulating your brain runs the code three times, and takes the majority vote of the outcomes.  Would that bring three “copies” of your consciousness into being?  Does it make a difference if the three copies are widely separated in space or time—say, on different planets, or in different centuries?  Is it possible that the massive redundancy taking place in your brain right now is bringing multiple copies of you into being?
Maybe my favorite thought experiment along these lines was invented by my former student Andy Drucker.  In the past five years, there’s been a revolution in theoretical cryptography, around something called Fully Homomorphic Encryption (FHE), which was first discovered by Craig Gentry.  What FHE lets you do is to perform arbitrary computations on encrypted data, without ever decrypting the data at any point.  So, to someone with the decryption key, you could be proving theorems, simulating planetary motions, etc.  But to someone without the key, it looks for all the world like you’re just shuffling random strings and producing other random strings as output.
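To make the idea of computing on data you can’t read a little more concrete, here is a toy sketch in Python.  It is emphatically not FHE: it only exploits the fact that textbook, unpadded RSA happens to be multiplicatively homomorphic, whereas Gentry-style FHE handles arbitrary circuits and is vastly more sophisticated.  The tiny key sizes and the function names are mine, purely for illustration.

```python
# Toy only: textbook (unpadded) RSA is multiplicatively homomorphic,
# i.e. Enc(a) * Enc(b) mod n decrypts to a * b mod n.  Real FHE schemes
# (Gentry and successors) support arbitrary computations and look nothing
# like this; the point is just the flavor of operating on ciphertexts.

p, q = 61, 53                        # absurdly small primes, for illustration
n, e = p * q, 17                     # public key
d = pow(e, -1, (p - 1) * (q - 1))    # private key (modular inverse, Python 3.8+)

def enc(m): return pow(m, e, n)      # encrypt
def dec(c): return pow(c, d, n)      # decrypt

a, b = 7, 12
product_ciphertext = (enc(a) * enc(b)) % n   # "compute" without ever decrypting
assert dec(product_ciphertext) == (a * b) % n
```

To someone holding only the ciphertexts, the multiplication above is just shuffling opaque numbers; only the holder of d learns that a product was computed.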
You can probably see where this is going.  What if we homomorphically encrypted a simulation of your brain?  And what if we hid the only copy of the decryption key, let’s say in another galaxy?  Would this computation—which looks to anyone in our galaxy like a reshuffling of gobbledygook—be silently producing your consciousness?
When we consider the possibility of a conscious quantum computer, in some sense we inherit all the previous puzzles about conscious classical computers, but then also add a few new ones.  So, let’s say I run a quantum subroutine that simulates your brain, by applying some unitary transformation U.  But then, of course, I want to “uncompute” to get rid of garbage (and thereby enable interference between different branches), so I apply U⁻¹.  Question: when I apply U⁻¹, does your simulated brain experience the same thoughts and feelings a second time?  Is the second experience “the same as” the first, or does it differ somehow, by virtue of being reversed in time?  Or, since U⁻¹U is just a convoluted implementation of the identity function, are there no experiences at all here?
Here’s a better one: many of you have heard of the Vaidman bomb.  This is a famous thought experiment in quantum mechanics where there’s a package, and we’d like to “query” it to find out whether it contains a bomb—but if we query it and there is a bomb, it will explode, killing everyone in the room.  What’s the solution?  Well, suppose we could go into a superposition of querying the bomb and not querying it, with only ε amplitude on querying the bomb, and √(1−ε²) amplitude on not querying it.  And suppose we repeat this over and over—each time, moving ε amplitude onto the “query the bomb” state if there’s no bomb there, but moving ε² probability onto the “query the bomb” state if there is a bomb (since the explosion decoheres the superposition).  Then after 1/ε repetitions, we’ll have order 1 probability of being in the “query the bomb” state if there’s no bomb.  By contrast, if there is a bomb, then the total probability we’ve ever entered that state is (1/ε)×ε² = ε.  So, either way, we learn whether there’s a bomb, and the probability that we set the bomb off can be made arbitrarily small.  (Incidentally, this is extremely closely related to how Grover’s algorithm works.)
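For anyone who wants to see the bookkeeping spelled out, here is a minimal classical Monte Carlo sketch of that argument in Python with numpy.  The function name, the simplified two-state model, and the collapse rule are all my own toy assumptions, not anyone’s library.

```python
import numpy as np

def vaidman_test(bomb_present, eps=0.01, rng=None):
    """Toy model of the Elitzur-Vaidman bomb test described above.

    Each round coherently rotates the state by a small angle eps toward the
    'query the bomb' branch.  If a bomb is present, the query acts as a
    measurement: with probability sin(theta)^2 (roughly eps^2) the bomb
    explodes; otherwise the state collapses back to 'don't query'.
    Returns (exploded, p_query), where p_query is None if the bomb exploded.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = 0.0
    for _ in range(int(np.pi / (2 * eps))):        # about (pi/2)/eps rounds
        theta += eps                               # coherent rotation
        if bomb_present:
            if rng.random() < np.sin(theta) ** 2:  # the bomb gets queried...
                return True, None                  # ...and explodes
            theta = 0.0                            # collapse back to 'don't query'
    return False, np.sin(theta) ** 2               # ~1 if no bomb, ~0 if bomb survived

# With eps = 0.01 the bomb explodes only ~1.6% of the time, yet the final
# 'query' probability (near 1 vs. near 0) reveals whether a bomb was there.
explosions = sum(vaidman_test(True)[0] for _ in range(10_000))
print("explosion rate with a bomb present:", explosions / 10_000)
```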
OK, now how about the Vaidman brain?  We’ve got a quantum subroutine simulating your brain, and we want to ask it a yes-or-no question.  We do so by querying that subroutine with ε amplitude 1/ε times, in such a way that if your answer is “yes,” then we’ve only ever activated the subroutine with total probability ε.  Yet you still manage to communicate your “yes” answer to the outside world.  So, should we say that you were conscious only in the ε fraction of the wavefunction where the simulation happened, or that the entire system was conscious?  (The answer could matter a lot for anthropic purposes.)
You might say, sure, maybe these questions are puzzling, but what’s the alternative?  Either we have to say that consciousness is a byproduct of any computation of the right complexity, or integration, or recursiveness (or something) happening anywhere in the wavefunction of the universe, or else we’re back to saying that beings like us are conscious, and all these other things aren’t, because God gave the souls to us, so na-na-na.  Or I suppose we could say, like the philosopher John Searle, that we’re conscious, and the lookup table and homomorphically-encrypted brain and Vaidman brain and all these other apparitions aren’t, because we alone have “biological causal powers.”  And what do those causal powers consist of?  Hey, you’re not supposed to ask that!  Just accept that we have them.  Or we could say, like Roger Penrose, that we’re conscious and the other things aren’t because we alone have microtubules that are sensitive to uncomputable effects from quantum gravity.  But neither of those two options ever struck me as much of an improvement.
Yet I submit to you that, between these extremes, there’s another position we can stake out—one that I certainly don’t know to be correct, but that would solve so many different puzzles if it were correct that, for that reason alone, it seems to me to merit more attention than it usually receives.  (In an effort to give the view that attention, a couple years ago I wrote an 85-page essay called The Ghost in the Quantum Turing Machine, which one or two people told me they actually read all the way through.)  If, after a lifetime of worrying (on weekends) about stuff like whether a giant lookup table would be conscious, I now seem to be arguing for this particular view, it’s less out of conviction in its truth than out of a sense of intellectual obligation: to whatever extent people care about these slippery questions at all, to whatever extent they think various alternative views deserve a hearing, I believe this one does as well.
The intermediate position that I’d like to explore says the following.  Yes, consciousness is a property of any suitably-organized chunk of matter.  But, in addition to performing complex computations, or passing the Turing Test, or other information-theoretic conditions that I don’t know (and don’t claim to know), there’s at least one crucial further thing that a chunk of matter has to do before we should consider it conscious.  Namely, it has to participate fully in the Arrow of Time.  More specifically, it has to produce irreversible decoherence as an intrinsic part of its operation.  It has to be continually taking microscopic fluctuations, and irreversibly amplifying them into stable, copyable, macroscopic classical records.
Before I go further, let me be extremely clear about what this view is not saying.  Firstly, it’s not saying that the brain is a quantum computer, in any interesting sense—let alone a quantum-gravitational computer, like Roger Penrose wants!  Indeed, I see no evidence, from neuroscience or any other field, that the cognitive information processing done by the brain is anything but classical.  The view I’m discussing doesn’t challenge conventional neuroscience on that account.
Secondly, this view doesn’t say that consciousness is in any sense necessary for decoherence, or for the emergence of a classical world.  I’ve never understood how one could hold such a belief, while still being a scientific realist.  After all, there are trillions of decoherence events happening every second in stars and asteroids and uninhabited planets.  Do those events not “count as real” until a human registers them?  (Or at least a frog, or an AI?)  The view I’m discussing only asserts the converse: that decoherence is necessary for consciousness.  (By analogy, presumably everyone agrees that some amount of computation is necessary for an interesting consciousness, but that doesn’t mean consciousness is necessary for computation.)
Thirdly, the view I’m discussing doesn’t say that “quantum magic” is the explanation for consciousness.  It’s silent on the explanation for consciousness (to whatever extent that question makes sense); it seeks only to draw a defensible line between the systems we want to regard as conscious and the systems we don’t—to address what I recently called the Pretty-Hard Problem.  And the (partial) answer it suggests doesn’t seem any more “magical” to me than any other proposed answer to the same question.  For example, if one said that consciousness arises from any computation that’s sufficiently “integrated” (or something), I could reply: what’s the “magical force” that imbues those particular computations with consciousness, and not other computations I can specify?  Or if one said (like Searle) that consciousness arises from the biology of the brain, I could reply: so what’s the “magic” of carbon-based biology, that could never be replicated in silicon?  Or even if one threw up one’s hands and said everything was conscious, I could reply: what’s the magical power that imbues my stapler with a mind?  Each of these views, along with the view that stresses the importance of decoherence and the arrow of time, is worth considering.  In my opinion, each should be judged according to how well it holds up under the most grueling battery of paradigm-cases, thought experiments, and reductios ad absurdum we can devise.
So, why might one conjecture that decoherence, and participation in the arrow of time, were necessary conditions for consciousness?  I suppose I could offer some argument about our subjective experience of the passage of time being a crucial component of our consciousness, and the passage of time being bound up with the Second Law.  Truthfully, though, I don’t have any a-priori argument that I find convincing.  All I can do is show you how many apparent paradoxes get resolved if you make this one speculative leap.
For starters, if you think about exactly how our chunk of matter is going to amplify microscopic fluctuations, it could depend on details like the precise spin orientations of various subatomic particles in the chunk.  But that has an interesting consequence: if you’re an outside observer who doesn’t know the chunk’s quantum state, it might be difficult or impossible for you to predict what the chunk is going to do next—even just to give decent statistical predictions, like you can for a hydrogen atom.  And of course, you can’t in general perform a measurement that will tell you the chunk’s quantum state, without violating the No-Cloning Theorem.  For the same reason, there’s in general no physical procedure that you can apply to the chunk to duplicate it exactly: that is, to produce a second chunk that you can be confident will behave identically (or almost identically) to the first, even just in a statistical sense.  (Again, this isn’t assuming any long-range quantum coherence in the chunk: only microscopic coherence that then gets amplified.)
It might be objected that there are all sorts of physical systems that “amplify microscopic fluctuations,” but that aren’t anything like what I described, at least not in any interesting sense: for example, a Geiger counter, or a photodetector, or any sort of quantum-mechanical random-number generator.  You can make, if not an exact copy of a Geiger counter, surely one that’s close enough for practical purposes.  And, even though the two counters will record different sequences of clicks when pointed at identical sources, the statistical distribution of clicks will be the same (and precisely calculable), and surely that’s all that matters.  So, what separates these examples from the sorts of examples I want to discuss?
What separates them is the undisputed existence of what I’ll call a clean digital abstraction layer.  By that, I mean a macroscopic approximation to a physical system that an external observer can produce, in principle, without destroying the system; that can be used to predict what the system will do to excellent accuracy (given knowledge of the environment); and that “sees” quantum-mechanical uncertainty—to whatever extent it does—as just a well-characterized source of random noise.  If a system has such an abstraction layer, then we can regard any quantum noise as simply part of the “environment” that the system observes, rather than part of the system itself.  I’ll take it as clear that such clean abstraction layers exist for a Geiger counter, a photodetector, or a computer with a quantum random number generator.  By contrast, for (say) an animal brain, I regard it as currently an open question whether such an abstraction layer exists or not.  If, someday, it becomes routine for nanobots to swarm through people’s brains and make exact copies of them—after which the “original” brains can be superbly predicted in all circumstances, except for some niggling differences that are traceable back to different quantum-mechanical dice rolls—at that point, perhaps educated opinion will have shifted to the point where we all agree the brain does have a clean digital abstraction layer.  But from where we stand today, it seems entirely possible to agree that the brain is a physical system obeying the laws of physics, while doubting that the nanobots would work as advertised.  It seems possible that—as speculated by Bohr, Compton, Eddington, and even Alan Turing—if you want to get it right you’ll need more than just the neural wiring graph, the synaptic strengths, and the approximate neurotransmitter levels.  Maybe you also need (e.g.) the internal states of the neurons, the configurations of sodium-ion channels, or other data that you simply can’t get without irreparably damaging the original brain—not only as a contingent matter of technology but as a fundamental matter of physics.
(As a side note, I should stress that obviously, even without invasive nanobots, our brains are constantly changing, but we normally don’t say as a result that we become completely different people at each instant!  To my way of thinking, though, this transtemporal identity is fundamentally different from a hypothetical identity between different “copies” of you, in the sense we’re talking about.  For one thing, all your transtemporal doppelgängers are connected by a single, linear chain of causation.  For another, outside movies like Bill and Ted’s Excellent Adventure, you can’t meet your transtemporal doppelgängers and have a conversation with them, nor can scientists do experiments on some of them, then apply what they learned to others that remained unaffected by their experiments.)
So, on this view, a conscious chunk of matter would be one that not only acts irreversibly, but that might well be unclonable for fundamental physical reasons.  If so, that would neatly resolve many of the puzzles that I discussed before.  So for example, there’s now a straightforward reason why you shouldn’t consent to being killed, while your copy gets recreated on Mars from an email attachment.  Namely, that copy will have a microstate with no direct causal link to your “original” microstate—so while it might behave similarly to you in many ways, you shouldn’t expect that your consciousness will “transfer” to it.  If you wanted to get your exact microstate to Mars, you could do that in principle using quantum teleportation—but as we all know, quantum teleportation inherently destroys the original copy, so there’s no longer any philosophical problem!  (Or, of course, you could just get on a spaceship bound for Mars: from a philosophical standpoint, it amounts to the same thing.)
Similarly, in the case where the simulation of your brain was run three times for error-correcting purposes: that could bring about three consciousnesses if, and only if, the three simulations were tied to different sets of decoherence events.  The giant lookup table and the Earth-sized brain simulation wouldn’t bring about any consciousness, unless they were implemented in such a way that they no longer had a clean digital abstraction layer.  What about the homomorphically-encrypted brain simulation?  That might no longer work, simply because we can’t assume that the microscopic fluctuations that get amplified are homomorphically encrypted.  Those are “in the clear,” which inevitably leaks information.  As for the quantum computer that simulates your thought processes and then perfectly reverses the simulation, or that queries you like a Vaidman bomb—in order to implement such things, we’d of course need to use quantum fault-tolerance, so that the simulation of you stayed in an encoded subspace and didn’t decohere.  But under our assumption, that would mean the simulation wasn’t conscious.
Now, it might seem to some of you like I’m suggesting something deeply immoral.  After all, the view I’m considering implies that, even if a system passed the Turing Test, and behaved identically to a human, even if it eloquently pleaded for its life, if it wasn’t irreversibly decohering microscopic events then it wouldn’t be conscious, so it would be fine to kill it, torture it, whatever you want.
But wait a minute: if a system isn’t doing anything irreversible, then what exactly does it mean to “kill” it?  If it’s a classical computation, then at least in principle, you could always just restore from backup.  You could even rewind and not only erase the memories of, but “uncompute” (“untorture”?) whatever tortures you had performed.  If it’s a quantum computation, you could always invert the unitary transformation U that corresponded to killing the thing (then reapply U and invert it again for good measure, if you wanted).  Only for irreversible systems are there moral acts with irreversible consequences.
This is related to something that’s bothered me for years in quantum foundations.  When people discuss Schrödinger’s cat, they always—always—insert some joke about, “obviously, this experiment wouldn’t pass the Ethical Review Board.  Nowadays, we try to avoid animal cruelty in our quantum gedankenexperiments.”  But actually, I claim that there’s no animal cruelty at all in the Schrödinger’s cat experiment.  And here’s why: in order to prove that the cat was ever in a coherent superposition of |Alive〉 and |Dead〉, you need to be able to measure it in a basis like {|Alive〉+|Dead〉,|Alive〉-|Dead〉}.  But if you can do that, you must have such precise control over all the cat’s degrees of freedom that you can also rotate unitarily between the |Alive〉 and |Dead〉 states.  (To see this, let U be the unitary that you applied to the |Alive〉 branch, and V the unitary that you applied to the |Dead〉 branch, to bring them into coherence with each other; then consider applying U⁻¹V.)  But if you can do that, then in what sense should we say that the cat in the |Dead〉 state was ever “dead” at all?  Normally, when we speak of “killing,” we mean doing something irreversible—not rotating to some point in a Hilbert space that we could just as easily rotate away from.
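(To spell out that parenthetical slightly, under the simplifying assumption that the interference measurement works by steering both branches onto a common reference state |ref〉: if U|Alive〉 = |ref〉 = V|Dead〉, then U⁻¹V|Dead〉 = U⁻¹|ref〉 = |Alive〉.  So the very control needed to verify the superposition would also suffice to rotate a “dead” cat back into a live one.)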
(There followed discussion among some audience members about the question of whether, if you destroyed all records of some terrible atrocity, like the Holocaust, everywhere in the physical world, you would thereby cause the atrocity “never to have happened.”  Many people seemed surprised by my willingness to accept that implication of what I was saying.  By way of explaining, I tried to stress just how far our everyday, intuitive notion of “destroying all records of something” falls short of what would actually be involved here: when we think of “destroying records,” we think about burning books, destroying the artifacts in museums, silencing witnesses, etc.  But even if all those things were done and many others, still the exact configurations of the air, the soil, and photons heading away from the earth at the speed of light would retain their silent testimony to the Holocaust’s reality.  “Erasing all records” in the physics sense would be something almost unimaginably more extreme: it would mean inverting the entire physical evolution in the vicinity of the earth, stopping time’s arrow and running history itself backwards.  Such ‘unhappening’ of what’s happened is something that we lack any experience of, at least outside of certain quantum interference experiments—though in the case of the Holocaust, one could be forgiven for wishing it were possible.)
OK, so much for philosophy of mind and morality; what about the interpretation of quantum mechanics?  If we think about consciousness in the way I’ve suggested, then who’s right: the Copenhagenists or the Many-Worlders?  You could make a case for either.  The Many-Worlders would be right that we could always, if we chose, think of decoherence events as “splitting” our universe into multiple branches, each with different versions of ourselves, that thereafter don’t interact.  On the other hand, the Copenhagenists would be right that, even in principle, we could never do any experiment where this “splitting” of our minds would have any empirical consequence.  On this view, if you can control a system well enough that you can actually observe interference between the different branches, then it follows that you shouldn’t regard the system as conscious, because it’s not doing anything irreversible.
In my essay, the implication that concerned me the most was the one for “free will.”  If being conscious entails amplifying microscopic events in an irreversible and unclonable way, then someone looking at a conscious system from the outside might not, in general, be able to predict what it’s going to do next, not even probabilistically.  In other words, its decisions might be subject to at least some “Knightian uncertainty”: uncertainty that we can’t even quantify in a mutually-agreed way using probabilities, in the same sense that we can quantify our uncertainty about (say) the time of a radioactive decay.  And personally, this is actually the sort of “freedom” that interests me the most.  I don’t really care if my choices are predictable by God, or by a hypothetical Laplace demon: that is, if they would be predictable (at least probabilistically), given complete knowledge of the microstate of the universe.  By definition, there’s essentially no way for my choices not to be predictable in that weak and unempirical sense!  On the other hand, I’d prefer that my choices not be completely predictable by other people.  If someone could put some sheets of paper into a sealed envelope, then I spoke extemporaneously for an hour, and then the person opened the envelope to reveal an exact transcript of everything I said, that’s the sort of thing that really would cause me to doubt in what sense “I” existed as a locus of thought.  But you’d have to actually do the experiment (or convince me that it could be done): it doesn’t count just to talk about it, or to extrapolate from fMRI experiments that predict which of two buttons a subject is going to press with 60% accuracy a few seconds in advance.
But since we’ve got some cosmologists in the house, let me now turn to discussing the implications of this view for Boltzmann brains.
(For those tuning in from home: a Boltzmann brain is a hypothetical chance fluctuation in the late universe, which would include a conscious observer with all the perceptions that a human being—say, you—is having right now, right down to false memories and false beliefs of having arisen via Darwinian evolution.  On statistical grounds, the overwhelming majority of Boltzmann brains last just long enough to have a single thought—like, say, the one you’re having right now—before they encounter the vacuum and freeze to death.  If you measured some part of the vacuum state toward which our universe seems to be heading, asking “is there a Boltzmann brain here?,” quantum mechanics predicts that the probability would be ridiculously astronomically small, but nonzero.  But, so the argument goes, if the vacuum lasts for infinite time, then as long as the probability is nonzero, it doesn’t matter how tiny it is: you’ll still get infinitely many Boltzmann brains indistinguishable from any given observer; and for that reason, any observer should consider herself infinitely likelier to be a Boltzmann brain than to be the “real,” original version.  For the record, even among the strange people at the IBM workshop, no one actually worried about being a Boltzmann brain.  The question, rather, is whether, if a cosmological model predicts Boltzmann brains, then that’s reason enough to reject the model, or whether we can live with such a prediction, since we have independent grounds for knowing that we can’t be Boltzmann brains.)
At this point, you can probably guess where this is going.  If decoherence, entropy production, full participation in the arrow of time are necessary conditions for consciousness, then it would follow, in particular, that a Boltzmann brain is not conscious.  So we certainly wouldn’t be Boltzmann brains, even under a cosmological model that predicts infinitely more of them than of us.  We can wipe our hands; the problem is solved!
I find it extremely interesting that, in their recent work, Kim Boddy, Sean Carroll, and Jason Pollack reached a similar conclusion, but from a completely different starting point.  They said: look, under reasonable assumptions, the late universe is just going to stay forever in an energy eigenstate—just sitting there doing nothing.  It’s true that, if someone came along and measured the energy eigenstate, asking “is there a Boltzmann brain here?,” then with a tiny but nonzero probability the answer would be yes.  But since no one is there measuring, what licenses us to interpret the nonzero overlap in amplitude with the Boltzmann brain state, as a nonzero probability of there being a Boltzmann brain?  I think they, too, are implicitly suggesting: if there’s no decoherence, no arrow of time, then we’re not authorized to say that anything is happening that “counts” for anthropic purposes.
Let me now mention an obvious objection.  (In fact, when I gave the talk, this objection was raised much earlier.)  You might say, “look, if you really think irreversible decoherence is a necessary condition for consciousness, then you might find yourself forced to say that there’s no consciousness, because there might not be any such thing as irreversible decoherence!  Imagine that our entire solar system were enclosed in an anti de Sitter (AdS) boundary, like in Greg Egan’s science-fiction novel Quarantine.  Inside the box, there would just be unitary evolution in some Hilbert space: maybe even a finite-dimensional Hilbert space.  In which case, all these ‘irreversible amplifications’ that you lay so much stress on wouldn’t be irreversible at all: eventually all the Everett branches would recohere; in fact they’d decohere and recohere infinitely many times.  So by your lights, how could anything be conscious inside the box?”
My response to this involves one last speculation.  I speculate that the fact that we don’t appear to live in AdS space—that we appear to live in (something evolving toward) a de Sitter space, with a positive cosmological constant—might be deep and important and relevant.  I speculate that, in our universe, “irreversible decoherence” means: the records of what you did are now heading toward our de Sitter horizon at the speed of light, and for that reason alone—even if for no others—you can’t put Humpty Dumpty back together again.  (Here I should point out, as several workshop attendees did to me, that Bousso and Susskind explored something similar in their paper The Multiverse Interpretation of Quantum Mechanics.)
Does this mean that, if cosmologists discover tomorrow that the cosmological constant is negative, or will become negative, then it will turn out that none of us were ever conscious?  No, that’s stupid.  What it would suggest is that the attempt I’m now making on the Pretty-Hard Problem had smacked into a wall (an AdS wall?), so that I, and anyone else who stressed in-principle irreversibility, should go back to the drawing board.  (By analogy, if some prescription for getting rid of Boltzmann brains fails, that doesn’t mean we are Boltzmann brains; it just means we need a new prescription.  Tempting as it is to skewer our opponents’ positions with these sorts of strawman inferences, I hope we can give each other the courtesy of presuming a bare minimum of sense.)
Another question: am I saying that, in order to be absolutely certain of whether some entity satisfied the postulated precondition for consciousness, one might, in general, need to look billions of years into the future, to see whether the “decoherence” produced by the entity was really irreversible?  Yes (pause to gulp bullet).  I am saying that.  On the other hand, I don’t think it’s nearly as bad as it sounds.  After all, the category of “consciousness” might be morally relevant, or relevant for anthropic reasoning, but presumably we all agree that it’s unlikely to play any causal role in the fundamental laws of physics.  So it’s not as if we’ve introduced any teleology into the laws of physics by this move.
Let me end by pointing out what I’ll call the “Tegmarkian slippery slope.”  It feels scientific and rational—from the perspective of many of us, even banal—to say that, if we’re conscious, then any sufficiently-accurate computer simulation of us would also be.  But I tried to convince you that this view depends, for its aura of obviousness, on our agreeing not to probe too closely exactly what would count as a “sufficiently-accurate” simulation.  E.g., does it count if the simulation is done in heavily-encrypted form, or encoded as a giant lookup table?  Does it matter if anyone actually runs the simulation, or consults the lookup table?  Now, all the way at the bottom of the slope is Max Tegmark, who asks: to produce consciousness, what does it matter if the simulation is physically instantiated at all?  Why isn’t it enough for the simulation to “exist” mathematically?  Or, better yet: if you’re worried about your infinitely-many Boltzmann brain copies, then why not worry equally about the infinitely many descriptions of your life history that are presumably encoded in the decimal expansion of π?  Why not hold workshops about how to avoid the prediction that we’re infinitely likelier to be “living in π” than to be our “real” selves?
From this extreme, even most scientific rationalists recoil.  They say, no, even if we don’t yet know exactly what’s meant by “physical instantiation,” we agree that you only get consciousness if the computer program is physically instantiated somehow.  But now I have the opening I want.  I can say: once we agree that physical existence is a prerequisite for consciousness, why not participation in the Arrow of Time?  After all, our ordinary ways of talking about sentient beings—outside of quantum mechanics, cosmology, and maybe theology—don’t even distinguish between the concepts “exists” and “exists and participates in the Arrow of Time.”  And to say we have no experience of reversible, clonable, coherently-executable, atemporal consciousnesses is a massive understatement.
Of course, we should avoid the sort of arbitrary prejudice that Turing warned against in Computing Machinery and Intelligence.  Just because we lack experience with extraterrestrial consciousnesses, doesn’t mean it would be OK to murder an intelligent extraterrestrial if we met one tomorrow.  In just the same way, just because we lack experience with clonable, atemporal consciousnesses, doesn’t mean it would be OK to … wait!  As we said before, clonability, and aloofness from time’s arrow, call severely into question what it even means to “murder” something.  So maybe this case isn’t as straightforward as the extraterrestrials after all.
At this point, I’ve probably laid out enough craziness, so let me stop and open things up for discussion.
I already congratulated Subhash Khot in my last post for winning the Nevanlinna Award, but this really deserves a separate post.  Khot won theoretical computer science’s highest award largely for introducing and exploring the Unique Games Conjecture (UGC), which says (in one sentence) that a large number of the approximation problems that no one has been able to prove NP-hard, really are NP-hard.  In particular, if the UGC is true, then for MAX-CUT and dozens of other important optimization problems, no polynomial-time algorithm can always get you closer to the optimal solution than some semidefinite-programming-based algorithm gets you, unless P=NP.  The UGC might or might not be true—unlike with (say) P≠NP itself, there’s no firm consensus around it—but even if it’s false, the effort to prove or disprove it has by now had a huge impact on theoretical computer science research, leading to connections with geometry, tiling, analysis of Boolean functions, quantum entanglement, and more.
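For readers who’d like something concrete to poke at, here is a minimal sketch of the Goemans-Williamson algorithm that the previous paragraph alludes to: solve the semidefinite relaxation of MAX-CUT, then round with a random hyperplane.  That’s the SDP-based algorithm whose approximation ratio the UGC asserts is optimal unless P=NP.  The code is my own illustration in Python; it assumes numpy and cvxpy (with an SDP-capable solver such as the bundled SCS) are installed, and nothing in it comes from Khot’s papers.

```python
import numpy as np
import cvxpy as cp

def goemans_williamson_maxcut(W, rng=None):
    """Approximate MAX-CUT on a symmetric weight matrix W (zero diagonal).

    Solves the SDP relaxation  max (1/4) * sum_ij W_ij * (1 - X_ij)
    over PSD matrices X with unit diagonal, then rounds the vector
    solution with a random hyperplane (the classic ~0.878 guarantee).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = W.shape[0]
    X = cp.Variable((n, n), PSD=True)
    problem = cp.Problem(cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4),
                         [cp.diag(X) == 1])
    problem.solve()

    # Factor X = V V^T, clipping tiny negative eigenvalues from numerical error.
    evals, evecs = np.linalg.eigh(X.value)
    V = evecs @ np.diag(np.sqrt(np.clip(evals, 0, None)))

    # Random-hyperplane rounding: the sign of each row's projection picks a side.
    sides = np.where(V @ rng.standard_normal(n) >= 0, 1, -1)
    cut = sum(W[i, j] for i in range(n) for j in range(i + 1, n)
              if sides[i] != sides[j])
    return sides, cut

# Example: MAX-CUT on a 5-cycle (the optimum cut has weight 4).
W = np.zeros((5, 5))
for i in range(5):
    W[i, (i + 1) % 5] = W[(i + 1) % 5, i] = 1.0
print(goemans_williamson_maxcut(W))
```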
There are a few features that make the UGC interesting, compared to most other questions considered in complexity theory.  Firstly, the problem that the UGC asserts is NP-hard—basically, given a list of linear equations in 2 variables each, to satisfy as many of the equations as you can—is a problem with “imperfect completeness.”  This means that, if you just wanted to know whether all the linear equations were simultaneously satisfiable, the question would be trivial to answer, using Gaussian elimination.  So the problem only becomes interesting once you’re told that the equations are not simultaneously satisfiable, but you’d like to know (say) whether it’s possible to satisfy 99% of the equations or only 1%.  A second feature is that, because of the 2010 work of Arora, Barak, and Steurer, we know that there is an algorithm that solves the unique games problem in “subexponential time”: specifically, in time exp(n^poly(δ)), where δ is the completeness error (that is, the fraction of linear equations that are unsatisfiable, in the case that most of them are satisfiable).  This doesn’t mean that the unique games problem can’t be NP-hard: it just means that, if there is an NP-hardness proof, then the reduction will need to blow up the instance sizes by an n^poly(1/δ) factor.
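And if it helps to see the problem itself in code, here is a tiny brute-force sketch of the objective described above, for equations of the form x_i − x_j ≡ c (mod k).  The brute force is exponential-time and purely illustrative (the interesting question is what’s achievable in polynomial time); the instance and the names are mine.

```python
import itertools

def max_satisfiable(equations, k, n):
    """Brute-force the unique-games objective on a toy instance.

    `equations` is a list of (i, j, c) triples meaning  x_i - x_j = c (mod k).
    Returns the largest number of equations any single assignment to
    x_0, ..., x_{n-1} satisfies.  Exponential in n, for illustration only.
    """
    return max(
        sum((x[i] - x[j]) % k == c for (i, j, c) in equations)
        for x in itertools.product(range(k), repeat=n)
    )

# Three equations mod 2 whose sum gives the contradiction 0 = 1, so no
# assignment satisfies all three; the best possible is 2 out of 3.
eqs = [(0, 1, 1), (1, 2, 1), (2, 0, 1)]
print(max_satisfiable(eqs, k=2, n=3))   # prints 2
```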
To be clear, neither of the above features is unique (har, har) to unique games: we’ve long known NP-complete problems, like MAX-2SAT, that have the imperfect completeness feature, and we also know NP-hardness reductions that blow up the instance size by an n^poly(1/δ) factor for inherent reasons (for example, for the Set Cover problem).  But perhaps nothing points as clearly as UGC at the directions that researchers in hardness of approximation and probabilistically checkable proofs (PCP) would like to be able to go.  A proof of the Unique Games Conjecture would basically be a PCP theorem on steroids.  (Or, since we already have “PCP theorems on steroids,” maybe a PCP theorem on PCP?)
It’s important to understand that, between the UGC being true and the unique games problem being solvable in polynomial time, there’s a wide range of intermediate possibilities, many of which are being actively investigated.  For example, the unique games problem could be “NP-hard,” but via a reduction that itself takes subexponential time (i.e., it could be hard assuming the Exponential-Time Hypothesis).  It could be solvable much faster than Arora-Barak-Steurer but still not in P.  Or, even if the problem weren’t solvable any faster than is currently known, it could be “hard without being NP-hard,” having a similar status to factoring or graph isomorphism.  Much current research into the UGC is focused on a particular algorithm called the Sum-of-Squares algorithm (i.e., the Lasserre hierarchy).  Some researchers suspect that, if any algorithm will solve the unique games problem in polynomial time (or close to that), it will be Sum-of-Squares; conversely, if one could show that Sum-of-Squares failed, one would’ve taken a major step toward proving the UGC.
For more, I recommend this Quanta magazine article, or Luca Trevisan’s survey, or Subhash’s own survey.  Or those pressed for time can simply check out this video interview with Subhash.  If you’d like to try my wife Dana’s puzzle games inspired by PCP, which Subhash uses 2 minutes into the video to explain what he works on, see here.  Online, interactive versions of these puzzle games are currently under development.  Also, if you have questions about the UGC or Subhash’s work, go ahead and ask: I’ll answer if I can, and otherwise rely on in-house expertise.
Congratulations again to Subhash!
If no one else is going to highlight some results from the conference, I guess I’ll have to do it myself. ~~More nuggets coming soon (i.e., as soon as I have my layovers in Auckland and LAX en route to Toronto)~~ Nope, I was too lazy. Plus I caught a cold on the plane from which I’m just now recovering.
* Aram Harrow (joint work with Sean Hallgren) generalized the Recursive Fourier Sampling problem of Bernstein and Vazirani, to give superpolynomial quantum black-box speedups based not only on the Fourier transform, but on almost any unitary transformation.
* Falk Unger discussed his joint result with Richard Cleve, William Slofstra, and Sarvagya Upadhyay, that if Alice and Bob share unlimited quantum entanglement and are playing n Bell inequality games in parallel, then they might as well just play each game separately (i.e., there’s no advantage in correlating their strategies across multiple games). Surprisingly, this is provably false if Alice and Bob don’t share entanglement.
* Alexandra Kolla discussed her joint result with Sean Hallgren, Pranab Sen, and Shengyu Zhang, that any classical statistical zero-knowledge protocol can be made secure against quantum attacks. This generalizes a previous result of John Watrous (STOC’06), that certain specific SZK protocols can be made secure against quantum attacks. Shengyu was supposed to give the talk; Alexandra had to fill in for him on a few days’ notice since he couldn’t get a travel visa.
* Troy Lee discussed his new negative-weight adversary method for proving quantum lower bounds (joint work with Peter Høyer and Robert Špalek), which improves the previous methods of Ambainis, Zhang, and others, and finally breaks through the so-called certificate complexity barrier. Unfortunately, the new method is so non-intuitive that right now the authors can only apply it with the aid of semidefinite programming solvers. But this old lower-bounds geezer is hoping that, once the young ‘uns get a better handle on their new toy, they’ll be able to use it to finally demolish some of the old open problems in quantum lower bounds.
* Daniel Gottesman proved (joint work with Dorit Aharonov and Julia Kempe) that finding the ground state of a one-dimensional spin chain with nearest-neighbor interactions is already QMA-complete. Since the analogous classical problem is solvable in polynomial time, it had been conjectured that the quantum version is too, but this intuition turns out to be wrong.
* Yi-Kai Liu showed (joint work with Matthias Christandl and Frank Verstraete) that several problems of longstanding interest to chemists are QMA-complete. These problems include deciding whether a set of local density matrices is compatible with some global density matrix; and the “N-representability” problem (namely, deciding whether a quantum state of two m-mode fermions is extendible to a state of N m-mode fermions, where m=O(poly(N))).
* Francois Le Gall gave an exponential separation between randomized and quantum space complexity in the online setting (that is, the setting where the input bits are fed to an algorithm one at a time, with no possibility of going backward).
* Gus Gutoski showed (joint work with John Watrous) that if two omniscient gods are playing a quantum chess game by shuttling qubits back and forth via a polynomial-time intermediary, who will measure the qubits at the end to decide the winner, then it’s possible in deterministic exponential time to decide which god will win. Or to say it much more clearly, QRG=EXP.
* Iordanis Kerenidis discussed his joint result with Dmitry Gavinsky, Julia Kempe, Ran Raz, and Ronald de Wolf, that there exists a partial Boolean function whose quantum one-way communication complexity is exponentially smaller than its randomized one-way communication complexity. Previously this was only known for relation problems.
* Ike Chuang gave an update on experimental quantum computing. His talk included a lot of graphs of damped sine curves.
* I gave my tired old talk on learnability of quantum states.
Lately I’ve been getting emails from undergrads with stellar-looking résumés who want to be my summer students. My initial reaction was: “who, me? I’m barely more than a summer student myself!” But today a light bulb went off: “hey, if these ambitious whippersnappers really want to do my research for me, why shouldn’t I let them do it — thereby freeing up my own time for more important priorities like blogging?”
I’ve therefore decided to list three project ideas. If you’re an undergrad or grad student who wants to tackle one of them this summer at the University of Waterloo, email me (scott at scottaaronson dot com). Tell me about yourself and what you want to do, and attach a CV. I’ll pick up to two students. Deadline: Feb. 21 or until positions are filled.
Scott Aaronson is an equal opportunity employer. He doesn’t have his own funding, so if you can bring your own, great; if not, he’ll try to scrounge some from under Mike Lazaridis’s couch. If the projects listed below don’t interest you — or if you’re more interested in physics, engineering, or information theory than in quantum complexity — there are many, many potential supervisors at both the Institute for Quantum Computing and the Perimeter Institute who’d probably be a better match for you.
Project #1: The Learnability of Quantum States. For this project, you’d first read and understand my paper of the same name, ideally before the summer started. You’d then implement my quantum state reconstruction algorithm in Matlab, Mathematica, or any other environment of your choice, and study its performance with realistic physical systems. This is a crucial first step if experimentalists are ever going to be convinced to try my quantum state learning approach. (The fact that I proved it works is completely irrelevant to them…) There’s also plenty of scope for new theoretical ideas if you swing that way. The eventual goal would be to publish the results somewhere like Physical Review Letters. This project is highly recommended.
Project #2: Multiple Quantum Proofs. Today we believe that there are mathematical truths that you could efficiently verify if given a small quantum state, but that you couldn’t efficiently verify if given a short classical string. But what if you were given two quantum states, which were guaranteed to be unentangled with each other? Would that let you verify even more truths than you could with a single quantum state? The answer is, we have no idea! Nor do we know whether three quantum proofs are more powerful than two, etc. When it comes to the power of multiple quantum proofs, even the most embarrassingly simple questions are open. In this project, you’d work with me to try and sort out the mess. This project is only for students who are confident of their ability to do original research in theoretical computer science.
Project #3: Insert Your Own Project. Wow me. Dazzle me. Give me a specific, detailed idea for a research project in quantum complexity theory or a related area, and convince me that you’re ferocious enough to get somewhere with it in one summer. I’ll try to help where I can.
The March 2007 issue of Notices of the American Mathematical Society is out. In it we find:
1. Fascinating conversations with three of the four Fields medalists (guess which one declined to be interviewed?)
2. An obituary of George Dantzig (linear programming pioneer), which I found incredibly frustrating for two reasons. First, the article repeatedly sidesteps the most interesting questions about Dantzig’s career: what were those open problems that he solved after mistaking them for homework exercises? What impact did his WWII work actually have? Second, just as nothing in biology makes sense except in the light of evolution, so nothing in optimization makes sense except in the light of computational complexity — a topic this 19-page article somehow assiduously avoids.
3. A poem by Bill Parry (1934-2006), which stirred my soul as Walt Whitman never did back in 11th-grade English, and which I here reproduce in its entirety.
> Argument
>
> As he cleaned the board,
chalk-dust rose like parched mist.
A dry profession, he mused as morosely
they shuffled settling tier upon tier.
>
> Now, almost half-way through the course,
(coughs, yawns, and automatic writing)
the theorem is ready.
>
> Moving to the crucial point,
the sly unconventional twist,
a quiver springs his voice and breast;
>
> soon the gambit will appear
opposed to what’s expected.
The ploy will snip one strand
the entire skein sloughing to the ground.
>
> His head turns sympathetically
from board to class.
They copy copiously.
But two, perhaps three pause and frown,
>
> wonder will this go through,
questioning this entanglement
— yet they nod encouragement.
Then the final crux; the ropes relax and fall.
>
> His reward: two smile, maybe three,
and one is visibly moved.
Q.E.D., the theorem is proved.
>
> This was his sole intent.
Leaving the symbols on the board
he departs with a swagger of achievement.
A roboticist and Shtetl-Optimized fan named Jon Groff recently emailed me the following suggestion for a blog entry:
I think a great idea for an entry would be the way that in fields like particle physics the theoreticians and experimentalists get along quite well but in computer science and robotics in particular there seems to be a great disdain for the people that actually do things from the people that like to think about them. Just thought I’d toss that out there in case you are looking for some subject matter.
After I replied (among other things, raising my virtual eyebrows over his rosy view of the current state of theoretician/experimentalist interaction in particle physics), Jon elaborated on his concerns in a subsequent email:
[T]here seems to be this attitude in CS that getting your hands dirty is unacceptable. You haven’t seen it because you sit a lofty heights and I tend to think you always have. I have been pounding out code since ferrite cores. Yes, Honeywell 1648A, so I have been looking up the posterior of this issue rather than from the forehead as it were. I guess my challenge would be to find a noteworthy computer theoretician somewhere and ask him:
1) What complete, working, currently functioning systems have you designed?
2) How much of the working code did you contribute?
3) Which of these systems is still operational and in what capacity?
Or say, if the person was a famous robotics professor or something you may ask:
1) Have you ever actually ‘built’ a ‘robot’?
2) Could you, if called upon, design and build an easily tasked robot safe for home use using currently available materials and code?
So I wrote a second reply, which Jon encouraged me to turn into a blog post (kindly giving me permission to quote him).  In case it’s of interest to anyone else, my reply is below.
* * *
Dear Jon,
For whatever it’s worth, when I was an undergrad, I spent two years working as a coder for Cornell’s RoboCup robot soccer team, handling things like the goalie.  (That was an extremely valuable experience, one reason being that it taught me how badly I sucked at meeting deadlines, documenting my code, and getting my code to work with other people’s code.)   Even before that, I wrote shareware games with my friend Alex Halderman (now a famous computer security expert at U. of Michigan); we made almost $30 selling them.  And I spent several summers working on applied projects at Bell Labs, back when that was still a thing.  And by my count, I’ve written four papers that involved code I personally wrote and experiments I did (one on hypertext, one on stylometric clustering, one on Boolean function query properties, one on improved simulation of stabilizer circuits—for the last of these, the code is actually still used by others).  While this is all from the period 1994-2004 (these days, if I need any coding done, I use the extremely high-level programming language called “undergrad”), I don’t think it’s entirely true to say that I “never got my hands dirty.”
But even if I hadn’t had any of those experiences, or other theoretical computer scientists hadn’t had analogous ones, your questions still strike me as unfair.  They’re no more fair than cornering a star coder or other practical person with questions like, “Have you ever proved a theorem?  A nontrivial theorem?  Why is BPP contained in P/poly?  What’s the cardinality of the set of Turing-degrees?”  If the coder can’t easily answer these questions, would you say it means that she has “disdain for theorists”?  (I was expecting some discussion of this converse question in your email, and was amused when I didn’t find any.)
Personally, I’d say “of course not”: maybe the coder is great at coding, doesn’t need theory very much on a day-to-day basis and doesn’t have much free time to learn it, but (all else equal) would be happy to know more.  Maybe the coder likes theory as an outsider, even has friends from her student days who are theorists, and who she’d go to if she ever did need their knowledge for her work.  Or maybe not.  Maybe she’s an asshole who looks down on anyone who doesn’t have the exact same skill-set that she does.  But I certainly couldn’t conclude that from her inability to answer basic theory questions.
I’d say just the same about theorists.  If they don’t have as much experience building robots as they should have, don’t know as much about large software projects as they should know, etc., then those are all defects to add to the long list of their other, unrelated defects.  But it would be a mistake to assume that they failed to acquire this knowledge because of disdain for practical people, rather than for mundane reasons like busyness or laziness.
Indeed, it’s also possible that they respect practical people all the more, because they tried to do the things the practical people are good at, and discovered for themselves how hard they were.  Maybe they became theorists partly because of that self-discovery—that was certainly true in my case.  Maybe they’d be happy to talk to or learn from a practical roboticist like yourself, but are too shy or too nerdy to initiate the conversation.
Speaking of which: yes, let’s let bloom a thousand collaborations between theorists and practitioners!  Those are the lifeblood of science.  On the other hand, based on personal experience, I’m also sensitive to the effect where, because of pressures from funding agencies, theorists have to try to pretend their work is “practically relevant” when they’re really just trying to discover something cool, while meantime, practitioners have to pretend their work is theoretically novel or deep, when really, they’re just trying to write software that people will want to use.  I’d love to see both groups freed from this distorting influence, so that they can collaborate for real reasons rather than fake ones.
(I’ve also often remarked that, if I hadn’t gravitated to the extreme theoretical end of computer science, I think I might have gone instead to the extreme practical end, rather than to any of the points in between.  That’s because I hate the above-mentioned distorting influence: if I’m going to try to understand the ultimate limits of computation, then I should pursue that wherever it leads, even if it means studying computational models that won’t be practical for a million years.  And conversely, if I’m going to write useful software, I should throw myself 100% into that, even if it means picking an approach that’s well-understood, clunky, and reliable over an approach that’s new, interesting, elegant, and likely to fail.)
Best,
Scott
We’ve already been discussing this in the comments section of my previous post, but a few people emailed me to ask when I’d devote a separate blog post to the news.
OK, so for those who haven’t yet heard: this week Google’s Quantum AI Lab announced that it’s teaming up with John Martinis, of the University of California, Santa Barbara, to accelerate the Martinis group‘s already-amazing efforts in superconducting quantum computing.  (See here for the MIT Tech‘s article, here for Wired‘s, and here for the WSJ‘s.)  Besides building some of the best (if not the best) superconducting qubits in the world, Martinis, along with Matthias Troyer, was also one of the coauthors of two important papers that found no evidence for any speedup in the D-Wave machines.  (However, in addition to working with the Martinis group, Google says it will also continue its partnership with D-Wave, in an apparent effort to keep reality more soap-operatically interesting than any hypothetical scenario one could make up on a blog.)
I have the great honor of knowing John Martinis, even once sharing the stage with him at a “Physics Cafe” in Aspen.  Like everyone else in our field, I profoundly admire the accomplishments of his group: they’ve achieved coherence times in the tens of microseconds, demonstrated some of the building blocks of quantum error-correction, and gotten tantalizingly close to the fault-tolerance threshold for the surface code.  (When, in D-Wave threads, people have challenged me: “OK Scott, so then which experimental quantum computing groups should be supported more?,” my answer has always been some variant of: “groups like John Martinis’s.”)
So I’m excited about this partnership, and I wish it the very best.
But I know people will ask: apart from the support and well-wishes, do I have any predictions?  Alright, here’s one.  I predict that, regardless of what happens, commenters here will somehow make it out that I was wrong.  So for example, if the Martinis group, supported by Google, ultimately succeeds in building a useful, scalable quantum computer—by emphasizing error-correction, long coherence times (measured in the conventional way), “gate-model” quantum algorithms, universality, and all the other things that D-Wave founder Geordie Rose has pooh-poohed from the beginning—commenters will claim that still most of the credit belongs to D-Wave, for whetting Google’s appetite, and for getting it involved in superconducting QC in the first place.  (The unstated implication being that, even if there were little or no evidence that D-Wave’s approach would ever lead to a genuine speedup, we skeptics still would’ve been wrong to state that truth in public.)  Conversely, if this venture doesn’t live up to the initial hopes, commenters will claim that that just proves Google’s mistake: rather than “selling out to appease the ivory-tower skeptics,” they should’ve doubled down on D-Wave.  Even if something completely different happens—let’s say, Google, UCSB, and D-Wave jointly abandon their quantum computing ambitions, and instead partner with ISIS to establish the world’s first “Qualiphate,” ruling with a niobium fist over California and parts of Oregon—I would’ve been wrong for having failed to foresee that.  (Even if I did sort of foresee it in the last sentence…)
Yet, while I’ll never live to see the blog-commentariat acknowledge the fundamental reasonableness of my views, I might live to see scalable quantum computers become a reality, and that would surely be some consolation.  For that reason, even if for no others, I once again wish the Martinis group and Google’s Quantum AI Lab the best in their new partnership.
* * *
Unrelated Announcement: Check out a lovely (very basic) introductory video on quantum computing and information, narrated by John Preskill and Spiros Michalakis, and illustrated by Jorge Cham of PhD Comics.
Grudgingly offered for your reading pleasure, and in the vain hope of forestalling further questions.
Q: Thanks to D-Wave Systems — a startup company that’s been in the news lately for its soon-to-be-unveiled “Orion” quantum computer — is humanity now on the verge of being able to solve NP-complete problems in polynomial time?
A: No. We’re also not on the verge of being able to build perpetual-motion machines or travel faster than light.
Q: But couldn’t quantum computers try all possible solutions in parallel, and thereby solve NP-complete problems in a heartbeat?
A: Yes, if the heart in question were beating exponentially slowly.
Otherwise, no. Contrary to widespread misconception, a quantum computer could not “try all possible solutions in parallel” in the sense most people mean by this. In particular, while quantum computers would apparently provide dramatic speedups for a few “structured” problems (such as factoring integers and simulating physical systems), it’s conjectured that they couldn’t solve NP-complete problems in polynomial time.
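For a rough sense of what that exponentially slow heartbeat means (a back-of-the-envelope estimate, nothing specific to any actual device): the best general-purpose quantum speedup known for unstructured search is quadratic, à la Grover, which still leaves brute force over n Boolean variables exponential. A minimal sketch of the arithmetic:

```python
# Rough scale of the "exponentially slow heartbeat": even a quadratic
# (Grover-type) speedup leaves brute-force search over n variables exponential.
n = 100
classical_steps = 2**n        # exhaustive search over all assignments
grover_steps = 2**(n // 2)    # roughly the square root of the search space
steps_per_second = 10**9      # an optimistic billion steps per second

print(f"classical:   {classical_steps:.2e} steps")
print(f"Grover-type: {grover_steps:.2e} steps, "
      f"about {grover_steps / steps_per_second / 86400:.0f} days at a billion steps/sec")
```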
Q: But isn’t factoring an NP-complete problem?
A: Good heavens, no! While factoring is believed to be intractable for classical computers, it’s not NP-complete, unless some exceedingly unlikely things happen in complexity theory. Where did you get the idea that factoring was NP-complete? (Now I know how Richard Dawkins must feel when someone asks him to explain, again, how “life could have arisen by chance.”)
Q: How could the people at D-Wave not understand that quantum computers couldn’t solve NP-complete problems in polynomial time?
A: To his credit, Geordie Rose (the founder of D-Wave) does understand this; see here for example. And yet, essentially every article I’ve read about D-Wave gives readers exactly the opposite impression. The charitable explanation is that the D-Wave folks are being selectively quoted or misquoted by journalists seeking to out-doofus one another. If so, one hopes that D-Wave will try harder in the future to avoid misunderstandings.
Q: But even if it gave only polynomial speedups (as opposed to exponential ones), couldn’t the adiabatic quantum computer that D-Wave built still be useful for industrial optimization problems?
A: D-Wave’s current machine is said to have sixteen qubits. Even assuming it worked perfectly, with no decoherence or error, a sixteen-qubit quantum computer would be about as useful for industrial optimization problems as a roast-beef sandwich.
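To put a number on that: sixteen qubits means at most 2^16 = 65,536 candidate configurations, a search space any laptop can exhaust by brute force in a matter of seconds. A minimal sketch (a random toy quadratic-optimization instance, invented here purely for illustration and unrelated to anything D-Wave actually runs):

```python
# Brute-force the global optimum of a random 16-variable quadratic objective,
# to show the scale of a 16-qubit search space: 2^16 = 65,536 assignments.
import itertools
import random

n = 16
random.seed(0)
# Random coefficients for an illustrative (made-up) quadratic objective.
Q = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

def cost(x):
    # Quadratic objective: sum over i <= j of Q[i][j] * x[i] * x[j].
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))

best = min(itertools.product((0, 1), repeat=n), key=cost)
print("optimal assignment:", best)
print("optimal cost:", cost(best))
```

Sixteen binary variables simply isn’t a regime where any optimizer, quantum or otherwise, can claim practical value.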
Q: But even if it wasn’t practically useful, wouldn’t a 16-qubit superconducting quantum computer still be a major scientific advance?
A: Yes, absolutely.
Q: So, can D-Wave be said to have achieved that goal?
A: As Dave Bacon pointed out earlier, it’s impossible to answer that question without knowing more about how their machine works. With sixteen qubits, a “working demo” doesn’t prove anything. The real questions are: how high are the fidelities, and what are the prospects for scalability?
Q: But clearly D-Wave isn’t going to give away its precious trade secrets just to satisfy some niggling academics! Short of providing technical specifics, what else could they do to make computer scientists take them seriously?
A: Produce the prime factors of
1847699703211741474306835620200164403018549
3386634101714717857749106516967111612498593
3768430543574458561606154457179405222971773
2524660960646946071249623720442022269756756
6873784275623895087646784409332851574965788
4341508847552829818672645133986336493190808
4671990431874381283363502795470282653297802
9349161558118810498449083195450098483937752
2725705257859194499387007369575568843693381
2779613089230392569695253261620823676490316
036551371447913932347169566988069.
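(Verifying a claimed factorization, of course, is the easy direction; producing the factors is the hard part. A minimal sketch, where p and q stand in as hypothetical placeholders for whatever factors might be offered:)

```python
# Verify a claimed factorization of the challenge number above.
# p and q are hypothetical placeholders; producing them is the hard part.
N = int(
    "1847699703211741474306835620200164403018549"
    "3386634101714717857749106516967111612498593"
    "3768430543574458561606154457179405222971773"
    "2524660960646946071249623720442022269756756"
    "6873784275623895087646784409332851574965788"
    "4341508847552829818672645133986336493190808"
    "4671990431874381283363502795470282653297802"
    "9349161558118810498449083195450098483937752"
    "2725705257859194499387007369575568843693381"
    "2779613089230392569695253261620823676490316"
    "036551371447913932347169566988069"
)

def verify_factors(p: int, q: int) -> bool:
    # A valid answer is two nontrivial factors whose product is N.
    return 1 < p < N and 1 < q < N and p * q == N
```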
Q: Alright, what else could they do?
A: Avoid talking like this:
> The system we are currently deploying, which we call Trinity, is a capability-class supercomputer specifically designed to provide extremely rapid and accurate approximate answers to arbitrarily large NP-complete problems … Trinity has a front-end software interface, implemented in a combination of Java and C, that allows a user to easily state any NP-complete problem of interest. After such a problem has been stated the problem is compiled down to the machine language of the processors at the heart of the machine. These processors then provide an answer, which is shuttled back to the front end and provided to the user. This capability can of course be called remotely and/or as a subroutine of some other piece of software.
Or to translate: “Not only have we built a spaceship capable of reaching Pluto in a few hours, our spaceship also has tinted windows and deluxe leather seats!” If I were them, I’d focus more on the evidence for their core technological claims, given that those claims are very much what’s at issue.
Q: While Dave Bacon also expressed serious doubts about the Orion quantum computer, he seemed more enthusiastic than you are. Why?
A: Generous and optimistic by nature, Dave strives to give others the benefit of the doubt (as the Chinese restaurant placemat would put it). Furthermore, as Quantum Pontiff, he’s professionally obligated to love the quantum sinner and forgive the wayward quantum sheep. And these are all wonderful qualities to have. On the other hand, when the hype surrounding some topic crosses a certain threshold, arguably a pricklier approach is called for.
Q: If D-Wave fizzles out, will many journalists and policymakers then conclude that quantum computing is bunk?
A: It doesn’t seem unlikely.
Q: What would it take to get these people to recognize the most elementary distinctions?
A: That’s the question, isn’t it?
Update (2/13): Lawrence Ip, my friend from Berkeley who now works at Google, went to the D-Wave “launch” in person and kindly provided the following report.
> I just came back from the D-Wave announcement. Unfortunately I had to leave at the start of the Q&A session.
>
> Here’s what I took away from it (minus the marketing fluff):
>
> – They claim to solve integer programming problems on their system. Geordie Rose was careful to explicitly say that they are only hoping to see a quadratic speedup. Herb Martin (the CEO) wasn’t quite as careful in his opening remarks but then he’s the “suit”. Geordie said that their current chip is not a universal QC (presumably because their space of Hamiltonians is limited) but with some work they expect to be able to make it universal.
>
> – Geordie compared their approach to the efforts in academia as being similar to Celera versus the Human Genome Project. He said they were trying to get something that would scale fast and worry about the quality of the qubits later. He contrasted this to the academic community’s efforts to achieve fine control over qubits before scaling up. They say that they hope to reach 1024 qubits by the end of 2008.
>
> – They demoed 3 “real-world” problems where they used their system as essentially a blackbox IP solver.
> – Searching for protein matches. Given a protein, try to find the closest match in a protein database. They serially fed all the items from the database to the chip and asked it to score the match against the given protein. They said it was solving a maximum independent set problem.
> – Finding the best seating arrangement at a wedding reception. Given constraints on which people can’t be seated together and who wants to sit together, find the optimal arrangement.
> – Solving a Sudoku puzzle.
>
> – At one point Geordie quoted you. He excerpted a paragraph from your SIGACT article (the one where you talk about generating Shakespeare’s 38th play) and mentioned your proposal that the intractability of NP-hard problems be taken as a physical law. As far as I can remember, the only other computer scientist he quoted was Gordon Moore, so you’re in good company. 🙂
Some of you might have read about how flight attendants at AirTran kicked a 3-year-old screaming brat and her parents off a plane, after the brat had already delayed takeoff for 15 minutes by refusing to get in her seat, and the parents had demonstrated their total unwillingness to control her. The parents went to the media expecting sympathy; instead, AirTran was immediately deluged with messages of support and people vowing to fly them from then on. Unfortunately, the airline then squandered a PR bonanza by apologizing profusely to the parents and refunding their tickets. In my opinion, there was no need to kick anyone off the plane: the child and parents should’ve been promptly moved to the luggage compartment, then whipped and beaten upon arrival.
At last night’s FOCS business meeting, there was a panel discussion on how to get the public excited about theoretical computer science. Unfortunately I missed it — I’m skipping FOCS for the first time in years — so I’m grateful to Rocco Servedio for this post about the discussion and to Dave Bacon for this one.
The obvious question is, why has there been so little success at popularizing theoretical computer science? Here I’d like to propose an answer to this question: because no one in human history has ever successfully popularized any field of science.
“But that’s absurd!” you interject. “What about Stephen Hawking, or Richard Dawkins, or Carl Sagan, or Richard Feynman, or Isaac Asimov, or Bertrand Russell?”
My response is simple. These people are not popularizers. They are prophets.
Like Moses descending from Sinai, the scientific prophet emerges from the clouds of Platonic heaven with a vision for the huddled throng below: that yea, though our lives may be fleeting and our bodies frail, through reason we shall know the mind of God. We are apes with telescopes, star-stuff pondering the stars.
Often, as in the cases of Hawking and Feynman, the prophet’s own life is central to the vision. The prophet teaches by example, showing us that no physical impediment is too great to overcome, that the world is full of solvable mysteries, that Nature cannot be fooled.
The prophet does not confine himself to his “area of expertise,” any more than Moses limited himself to shepherding regulations or Jesus to carpentry tips. He draws on his field for illustration, to be sure, but his real interest is life itself. He never hesitates to philosophize or moralize, even if only to tell his listeners that philosophers and moralists are idiots.
The scientific prophet presents humanity with a choice: will we persist in our petty squabbles and infantile delusions, Neanderthals with computers and ICBM’s? Or will we create a better world, one worthy of reasoning beings?
Even when the prophet exhorts us to reason, skepticism, and empiricism, he does so by hijacking a delivery system that is thousands of years old. And that is why he succeeds.
> Theoretical computer science will capture the public’s imagination when, and only when, it produces a prophet.
If you haven’t yet, I urge you to read Steven Pinker’s brilliant piece in The New Republic about what’s broken with America’s “elite” colleges and how to fix it.  The piece starts out as an evisceration of an earlier New Republic article on the same subject by William Deresiewicz.  Pinker agrees with Deresiewicz that something is wrong, but finds Deresiewicz’s diagnosis of what’s wrong to be lacking.  The rest of Pinker’s article sets out his own vision, which involves America’s top universities taking the radical step of focusing on academics, and returning extracurricular activities like sports to their rightful place as extras: ways for students to unwind, rather than a university’s primary reason for existing, or a central criterion for undergraduate admissions.  Most controversially, this would mean that the admissions process at US universities would become more like that in virtually every other advanced country: a relatively-straightforward matter of academic performance, rather than an exercise in peering into the applicants’ souls to find out whether they have a special je ne sais quoi, and the students (and their parents) desperately gaming the intentionally-opaque system, by paying consultants tens of thousands of dollars to develop souls for them.
(Incidentally, readers who haven’t experienced it firsthand might not be able to understand, or believe, just how strange the undergraduate admissions process in the US has become, although Pinker’s anecdotes give some idea.  I imagine anthropologists centuries from now studying American elite university admissions, and the parenting practices that have grown up around them, alongside cannibalism, kamikaze piloting, and other historical extremes of the human condition.)
Pinker points out that a way to assess students’ ability to do college coursework—much more quickly and accurately than by relying on the soul-detecting skills of admissions officers—has existed for a century.  It’s called the standardized test.  But unlike in the rest of the world (even in ultraliberal Western Europe), standardized tests are politically toxic in the US, seen as instruments of racism, classism, and oppression.  Pinker reminds us of the immense irony here: standardized tests were invented as a radical democratizing tool, as a way to give kids from poor and immigrant families the chance to attend colleges that had previously only been open to the children of the elite.  They succeeded at that goal—too well for some people’s comfort.
We now know that the Ivies’ current emphasis on sports, “character,” “well-roundedness,” and geographic diversity in undergraduate admissions was consciously designed (read that again) in the 1920s, by the presidents of Harvard, Princeton, and Yale, as a tactic to limit the enrollment of Jews.  Nowadays, of course, the Ivies’ “holistic” admissions process no longer fulfills that original purpose, in part because American Jews learned to play the “well-roundedness” game as well as anyone, shuttling their teenage kids between sports, band practice, and faux charity work, while hiring professionals to ghostwrite application essays that speak searingly from the heart.  Today, a major effect of “holistic” admissions is instead to limit the enrollment of Asian-Americans (especially recent immigrants), who tend disproportionately to have superb SAT scores, but to be deficient in life’s more meaningful dimensions, such as lacrosse, student government, and marching band.  More generally—again, pause to wallow in the irony—our “progressive” admissions process works strongly in favor of the upper-middle-class families who know how to navigate it, and against the poor and working-class families who don’t.
Defenders of the status quo have missed this reality on the ground, it seems to me, because they’re obsessed with the notion that standardized tests are “reductive”: that is, that they reduce a human being to a number.  Aren’t there geniuses who bomb standardized tests, they ask, as well as unimaginative grinds who ace them?  And if you make test scores a major factor in admissions, then won’t students and teachers train for the tests, and won’t that pervert open-ended intellectual curiosity?  The answer to both questions, I think, is clearly “yes.”  But the status-quo-defenders never seem to take the next step, of examining the alternatives to standardized testing, to see whether they’re even worse.
I’d say the truth is this: spots at the top universities are so coveted, and so much rarer than the demand, that no matter what you use as your admissions criterion, that thing will instantly get fetishized and turned into a commodity by students, parents, and companies eager to profit from their anxiety.  If it’s grades, you’ll get a grades fetish; if sports, you’ll get a sports fetish; if community involvement, you’ll get soup kitchens sprouting up for the sole purpose of giving ambitious 17-year-olds something to write about in their application essays.  If Harvard and Princeton announced that from now on, they only wanted the most laid-back, unambitious kids, the ones who spent their summers lazily skipping stones in a lake, rather than organizing their whole lives around getting in to Harvard and Princeton, tens of thousands of parents in the New York metropolitan area would immediately enroll their kids in relaxation and stone-skipping prep courses.  So, given that reality, why not at least make the fetishized criterion one that’s uniform, explicit, predictively valid, relatively hard to game, and relevant to universities’ core intellectual mission?
(Here, I’m ignoring criticisms specific to the SAT: for example, that it fails to differentiate students at the extreme right end of the bell curve, thereby forcing the top schools to use other criteria.  Even if those criticisms are true, they could easily be fixed by switching to other tests.)
I admit that my views on this matter might be colored by my strange (though as I’ve learned, not at all unique) experience, of getting rejected from almost every “top” college in the United States, and then, ten years later, getting recruited for faculty jobs by the very same institutions that had rejected me as a teenager.  Once you understand how undergraduate admissions work, the rejections were unsurprising: I was a 15-year-old with perfect SATs and a published research paper, but not only was I young and immature, with spotty grades and a weird academic trajectory, I had no sports, no music, no diverse leadership experiences.  I was a narrow, linear, A-to-B thinker who lacked depth and emotional intelligence: the exact opposite of what Harvard and Princeton were looking for in every way.  The real miracle is that despite these massive strikes against me, two schools—Cornell and Carnegie Mellon—were nice enough to give me a chance.  (I ended up going to Cornell, where I got a great education.)
Some people would say: so then what’s the big deal?  If Harvard or MIT reject some students that maybe they should have admitted, those students will simply go elsewhere, where—if they’re really that good—they’ll do every bit as well as they would’ve done at the so-called “top” schools.  But to me, that’s uncomfortably close to saying: there are millions of people who go on to succeed in life despite childhoods of neglect and poverty.  Indeed, some of those people succeed partly because of their rough childhoods, which served as the crucibles of their character and resolve.  Ergo, let’s neglect our own children, so that they too can have the privilege of learning from the school of hard knocks just like we did.  The fact that many people turn out fine despite unfairness and adversity doesn’t mean that we should inflict unfairness if we can avoid it.
Let me end with an important clarification.  Am I saying that, if I had dictatorial control over a university (ha!), I would base undergraduate admissions solely on standardized test scores?  Actually, no.  Here’s what I would do: I would admit the majority of students mostly based on test scores.  A minority, I would admit because of something special about them that wasn’t captured by test scores, whether that something was musical or artistic talent, volunteer work in Africa, a bestselling smartphone app they’d written, a childhood as an orphaned war refugee, or membership in an underrepresented minority.  Crucially, though, the special something would need to be special.  What I wouldn’t do is what’s done today: namely, to turn “specialness” and “well-roundedness” into commodities that the great mass of applicants have to manufacture before they can even be considered.
Other than that, I would barely look at high-school grades, regarding them as too variable from one school to another.  And, while conceding it might be impossible, I would try hard to keep my university in good enough financial shape that it didn’t need any legacy or development admits at all.
* * *
Update (Sep. 14): For those who feel I’m exaggerating the situation, please read the story of commenter Jon, about a homeschooled 15-year-old doing graduate-level work in math who, three years ago, was refused undergraduate admission to both Berkeley and Caltech, with the math faculty powerless to influence the admissions officers. See also my response.
A couple days ago the Times ran a much-debated story about Marcus S. Ross, a young-earth creationist who completed a PhD in geosciences at the University of Rhode Island. Apparently his thesis was a perfectly-legitimate study of marine reptiles that (as he writes in the thesis) went extinct 65 million years ago. Ross merely disavows the entire materialistic paradigm of which his thesis is a part.
If you want some long, acrimonious flamewars about whether the guy’s PhD should be revoked, whether oral exams should now include declarations of (non)faith, whether Ross is a walking illustration of Searle’s Chinese Room experiment, etc., try here and here. Alas, most of the commentary strikes me as missing a key point: that to give a degree to a bozo like this, provided he indeed did the work, can only reflect credit on the scientific enterprise. Will Ross now hit the creationist lecture circuit, trumpeting his infidel credentials to the skies? You better believe it. Will he use the legitimacy conferred by his degree to fight against everything the degree stands for? It can’t be doubted.
But here’s the wonderful thing about science: unlike the other side, we don’t need loyalty oaths in order to function. We don’t need to peer into people’s souls to see if they truly believe (A or not(A)), or just assume it for practical purposes. We have enough trouble getting people to understand our ideas — if they also assent to them, that’s just an added bonus.
In his Dialogue Concerning the Two Chief World Systems, Galileo had his Salviati character carefully demolish the arguments for Ptolemaic astronomy — only to concede, in the final pages, that Ptolemaic astronomy must obviously be true anyway, since the church said it was true. Mr. G, of course, was just trying to cover his ass. The point, though, is that his ploy didn’t work: the church understood as well as he did that the evidence mattered more than the conclusions, and therefore wisely arrested him. (I say “wisely” because the church was, of course, entirely correct to worry that a scientific revolution would erode its temporal power.)
To say that science is about backing up your claims with evidence doesn’t go far enough — it would be better to say that the evidence is the claim. So for example, if you happen to prove the Riemann Hypothesis, you’re more than welcome to “believe” the Hypothesis is nevertheless false, just as you’re welcome to write up your proof in encrusted boogers or lecture about it wearing a live gerbil as a hat. Indeed, you could do all these things and still not be the weirdest person to have solved a Clay Millennium Problem. Believing your proof works can certainly encourage other people to read it, but strictly speaking is no more necessary than the little QED box at the end.
The reason I’m harping on this is that, in my experience, laypeople consistently overestimate the role of belief in science. Thus the questions I constantly get asked: do I believe the many-worlds interpretation? Do I believe the anthropic principle? Do I believe string theory? Do I believe useful quantum computers will be built? Never what are the arguments for and against: always what do I believe?
To explain why “belief” questions often leave me cold, I can’t do better than to quote the great Rabbi Sagan.
> I’m frequently asked, “Do you believe there’s extraterrestrial intelligence?” I give the standard arguments — there are a lot of places out there, the molecules of life are everywhere, I use the word billions, and so on. Then I say it would be astonishing to me if there weren’t extraterrestrial intelligence, but of course there is as yet no compelling evidence for it.
>
> Often, I’m asked next, “What do you really think?”
>
> I say, “I just told you what I really think.”
>
> “Yes, but what’s your gut feeling?”
>
> But I try not to think with my gut. If I’m serious about understanding the world, thinking with anything besides my brain, as tempting as that might be, is likely to get me into trouble.
In my view, science is fundamentally not about beliefs: it’s about results. Beliefs are relevant mostly as the heuristics that lead to results. So for example, it matters that David Deutsch believes the many-worlds interpretation because that’s what led him to quantum computing. It matters that Ed Witten believes string theory because that’s what led him to … well, all the mindblowing stuff it led him to. My beef with quantum computing skeptics has never been that their beliefs are false; rather, it’s that their beliefs almost never seem to lead them to new results.
I hope nobody reading this will mistake me for a woo-woo, wishy-washy, Kuhn-wielding epistemic terrorist. (Some kind of intellectual terrorist, sure, but not that kind.) Regular readers of this blog will aver that I do have beliefs, and plenty of them. In particular, I don’t merely believe evolution is good science; I also believe it’s true. But as Richard Dawkins has pointed out, the reason evolution is good science is not that it’s true, but rather that it does nontrivial explanatory work. Even supposing creationism were true, it would still be too boring to qualify as science — as even certain creationists hunting for a thesis topic seem to agree.
Or anyway, that’s what I believe.
A few months ago, I signed a contract with MIT Press to publish a new book: an edited anthology of selected posts from this blog, along with all-new updates and commentary.  The book’s tentative title (open to better suggestions) is Speaking Truth to Parallelism: Dispatches from the Frontier of Quantum Computing Theory.  The new book should be more broadly accessible than Quantum Computing Since Democritus, although still far from your typical pop-science book.  My goal is to have STTP out by next fall, to coincide with Shtetl-Optimized’s tenth anniversary.
If you’ve been a regular reader, then this book is my way of thanking you for … oops, that doesn’t sound right.  If it were a gift, I should give it away for free, shouldn’t I?  So let me rephrase: buying this reasonably-priced book can be your way of thanking me, if you’ve enjoyed my blog all these years.  But it will also (I hope) be a value-added proposition: not only will you be able to put the book on your coffee table to impress an extremely nerdy subset of your friends, you’ll also get “exclusive content” unavailable on the blog.
To be clear, the posts that make it into the book will be ruthlessly selected: nothing that’s pure procrastination, politics, current events, venting, or travelogue, only the choice fillets that could plausibly be claimed to advance the public understanding of science.  Even for those, I’ll add additional background material, and take out digs unworthy of a book (making exceptions for anything that really cracks me up on a second reading).
If I had to pick a unifying theme for the book, I’d sigh and then say: it’s about a certain attitude toward the so-called “deepest questions,” like the nature of quantum mechanics or the ultimate limits of computation or the mind/body problem or the objectivity of mathematics or whether our universe is a computer simulation.  It’s an attitude that I wish more popular articles managed to get across, and at any rate, that people ought to adopt when reading those articles.  The attitude combines an openness to extraordinary claims, with an unceasing demand for clarity about the nature of those claims, and an impatience whenever that demand is met with evasion, obfuscation, or a “let’s not get into technicalities right now.”  It’s an attitude that constantly asks questions like:
“OK, so what can you actually do that’s different?”
“Why doesn’t that produce an absurd result when applied to simple cases?”
“Why isn’t that just a fancy way of saying what I could’ve said in simpler language?”
“Why couldn’t you have achieved the same thing without your ‘magic ingredient’?”
“So what’s your alternative account for how that happens?”
“Why isn’t that obvious?”
“What’s really at stake here?”
“What’s the catch?”
It’s an attitude that accepts the possibility that such questions might have satisfying answers—in which case, a change in worldview will be in order.  But not before answers are offered, openly debated, and understood by the community of interested people.
Of all the phrases I use on this blog, I felt “Speaking Truth to Parallelism” best captured the attitude in question.  I coined the phrase back in 2007, when D-Wave’s claims to be solving Sudoku puzzles with a quantum computer unleashed a tsunami of journalism about QCs—what they are, how they would work, what they could do—that (in my opinion) perfectly illustrated how not to approach a metaphysically-confusing new technology.  Having said that, the endless debate around D-Wave won’t by any means be the focus of this book: it will surface, of course, but only when it helps to illustrate some broader point.
In planning this book, the trickiest issue was what to do with comments.  Ultimately, I decided that the comments make Shtetl-Optimized what it is—so for each post I include, I’ll include a brief selection of the most interesting comments, together with my responses to them.  My policy will be this: by default, I’ll consider any comments on this blog to be fair game for quoting in the book, in whole or in part, and attributed to whatever handle the commenter used.  However, if you’d like to “opt out” of having your comments quoted, I now offer you a three-month window in which to do so: just email me, or leave a comment (!) on this thread.  You can also request that certain specific comments of yours not be quoted, or that your handle be removed from your comments, or your full name added to them—whatever you want.
Update (9/24): After hearing from several of you, I’ve decided on the following modified policy.  In all cases where I have an email address, I will contact the commenters about any of their comments that I’m thinking of using, to request explicit permission to use them.  In the hopefully-rare cases where I can’t reach a given commenter, but where their comment raised what seems like a crucial point requiring a response in the book, I might quote from the comment anyway—but in those cases, I’ll be careful not to reproduce very long passages, in a way that might run afoul of the fair use exception.
By now, the news that Microsoft abruptly closed its Silicon Valley research lab—leaving dozens of stellar computer scientists jobless—has already been all over the theoretical computer science blogosphere: see, e.g., Lance, Luca, Omer Reingold, Michael Mitzenmacher.  I never made a real visit to Microsoft SVC (only went there once IIRC, for a workshop, while a grad student at Berkeley); now of course I won’t have the chance.
The theoretical computer science community, in the Bay Area and elsewhere, is now mobilizing to offer visiting positions to the “refugees” from Microsoft SVC, until they’re able to find more permanent employment.  I was happy to learn, this week, that MIT’s theory group will likely play a small part in that effort.
Like many others, I confess to bafflement about Microsoft’s reasons for doing this.  Won’t the severe damage to MSR’s painstakingly-built reputation, to its hiring and retention of the best people, outweigh the comparatively small amount of money Microsoft will save?  Did they at least ask Mr. Gates, to see whether he’d chip in the proverbial change under his couch cushions to keep the lab open?  Most of all, why the suddenness?  Why not wind the lab down over a year, giving the scientists time to apply for new jobs in the academic hiring cycle?  It’s not like Microsoft is in a financial crisis, lacking the cash to keep the lights on.
Yet one could also view this announcement as a lesson in why academia exists and is necessary.  Yes, one should applaud those companies that choose to invest a portion of their revenue in basic research—like IBM, the old AT&T, or Microsoft itself (which continues to operate great research outfits in Redmond, Santa Barbara, both Cambridges, Beijing, Bangalore, Munich, Cairo, and Herzliya).  And yes, one should acknowledge the countless times when academia falls short of its ideals, when it too places the short term above the long.  All the same, it seems essential that our civilization maintain institutions for which the pursuit and dissemination of knowledge are not just accoutrements for when financial times are good and the Board of Directors is sympathetic, but are the institution’s entire reasons for being: those activities that the institution has explicitly committed to support for as long as it exists.
Today The Economist put out this paragon of hard-hitting D-Wave journalism. At first I merely got angry — but then I asked myself what Winston Churchill, Carl Sagan, or Chip ‘n Dale’s Rescue Rangers would do. So let’s see if The Economist prints the following.
> SIR —
>
> In a remarkably uncritical article about D-Wave’s announcement of the “world’s first practical quantum computer” (Feb. 15), you gush that “[i]n principle, by putting a set of entangled qubits into a suitably tuned magnetic field, the optimal solution to a given NP-complete problem can be found in one shot.” This is simply incorrect. Today it is accepted that quantum computers could not solve NP-complete problems in a reasonable amount of time. Indeed, the view of quantum computers as able to “try all possible solutions in parallel,” and then instantly choose the correct one, is fundamentally mistaken. Since measurement outcomes in quantum mechanics are random, one can only achieve a computational speedup by carefully exploiting the phenomenon known as quantum interference. And while it is known how to use interference to achieve dramatic speedups for a few problems — such as factorising large integers, and thereby breaking certain cryptographic codes — those problems are much more specialised than the NP-complete problems.
>
> Over the past few days, many news outlets have shown a depressing willingness to reprint D-Wave’s marketing hype, without even attempting to learn why most quantum computing researchers are sceptical. I expected better from The Economist.
>
> Scott Aaronson
> Institute for Quantum Computing
> University of Waterloo
I thought ‘factorising,’ ‘specialised,’ and ‘sceptical’ were a nice touch.
A Final Thought (2/16): As depressing as it is to see a lazy magazine writer, with a single harebrained sentence, cancel out your entire life’s work twenty times over, some good may yet come from this travesty. Where before I was reticent in the war against ignorant misunderstandings of quantum computing, now I have been radicalized — much as 9/11 radicalized Richard Dawkins in his war against religion. We, the quantum complexity theorists, are far from helpless in this fight. We, too, can launch a public relations campaign. We can speak truth to parallelism. So doofuses of the world: next time you excrete the words “NP-complete,” “solve,” and “instantaneous” anywhere near one another, brace yourselves for a Bennett-Bernstein-Brassard-Blitzirani the likes of which the multiverse has never seen.
Update (2/16): If you read the comments, Geordie Rose responds to me, and I respond to his response. Also see the comments on my earlier D-Wave post for more entertaining and ongoing debate.
This week I was at my alma mater, Cornell, to give a talk at the 50th anniversary celebration of its computer science department.  You can watch the streaming video here; my talk runs from roughly 1:17:30 to 1:56 (though if you’ve seen other complexity/physics/humor shows by me, this one is pretty similar, except for the riff about Cornell at the beginning).
The other two things in that video—a talk by Tom Henzinger about IST Austria, a bold new basic research institute that he leads, closely modeled after the Weizmann Institute in Israel; and a discussion panel about the future of programming languages—are also really interesting and worth watching.  There was lots of other good stuff at this workshop, including a talk about Google Glass and its applications to photography (by, not surprisingly, a guy wearing a Google Glass—Marc Levoy); a panel discussion with three Turing Award winners, Juris Hartmanis, John Hopcroft, and Ed Clarke, about the early days of Cornell’s CS department; a talk by Amit Singhal, Google’s director of search; a talk about differential privacy by Cynthia Dwork, one of the leading researchers at the recently-closed Microsoft SVC lab (with a poignant and emotional ending); and a talk by my own lab director at MIT, Daniela Rus, about her research in robotics.
Along with the 50th anniversary celebration, Bill Gates was also on campus to dedicate Bill and Melinda Gates Hall, the new home of Cornell’s CS department.  Click here for streaming video of a Q&A that Gates did with Cornell students, where I thought he acquitted himself quite well, saying many sensible things about education, the developing world, etc. that other smart people could also say, but that have extra gravitas coming from him.  Gates has also become extremely effective at wrapping barbs of fact inside a soft mesh of politically-unthreatening platitudes—but listen carefully and you’ll hear the barbs.  The amount of pomp and preparation around Gates’s visit reminded me of when President Obama visited MIT, befitting the two men’s approximately equal power.  (Obama has nuclear weapons, but then again, he also has Congress.)
And no, I didn’t get to meet Gates or shake his hand, though I did get to stand about ten feet from him at the Gates Hall dedication.  (He apparently spent most of his time at Cornell meeting with plant breeders, and other people doing things relevant to the Gates Foundation’s interests.)
Thanks so much to Bobby and Jon Kleinberg, and everyone else who invited me to this fantastic event and helped make it happen.  May Cornell’s CS department have a great next 50 years.
One last remark before I close this post.  Several readers have expressed disapproval and befuddlement over the proposed title of my next book, “Speaking Truth to Parallelism.”  In the words of commenter TonyK:
> That has got to be the worst title in the history of publishing! “Speaking Truth to Parallelism”? It doesn’t even make sense! I count myself as one of your fans, Scott, but you’re going to have to do better than that if you want anybody else to buy your book. I know you can do better — witness “Quantum Computing Since Democritus”.