@breck7
Created June 21, 2017 20:49
> Please don't mistake this question as offensive, but: How many people did you show this to before putting it out in the open?
Not offended at all! I showed the paper to somewhere between 20 and 30 people before putting it out there. The paper improved 100x thanks entirely to them,
but still, the response was lukewarm (at best). I only have a B.S. in Economics (and barely at that), and have never written
a "paper" before, so that might explain some of its "quirks".
> The biggest problem of the whole thing is that it doesn't even get its motivation across to me. I read through the blogpost and the paper and I have no idea why I should care.
Haha, great point! For many years I've been wrestling with a few core problems:
- visual programming
- better data visualization
- easier data munging/ETL
Finally, last year I quit my job to dive in full time and try to solve the data vis problem. To solve it I had to find
TN and ETNs. So the app is the motivation. But the theory enables the app. And I realized that the theory is much
bigger than just the app, in terms of the benefits it will offer other programmers. And from what I can tell,
the way to "launch" a theory is to write a paper. Hence the motivation for the paper.
> The paper is in dire need of some peer review
Agreed! As I said, I had over 20 folks help, and thanks to them it's much better, but the "hubris" and tone and any mistakes
are mine and mine alone. I worked at Microsoft for a few years and am used to all my code getting reviewed. I asked
the ~20 folks to basically do "code reviews" of the paper. Is that substantially different than peer review? Not sure.
I also have the benefit of doing this on my own dime, and not having any institutional concerns or motives, so am
perhaps more lax with style than is normal.
> (also proofreading, but that's not why I'm here).
Haha. Nits are welcome too, but yes, I appreciate your big picture comments.
> Here's some points that came to my mind while reading it. (Looking over it after writing it all down makes this appear like a rant. Please don't take it that way. I want you to improve, and you can only do so with honest, constructive feedback, which is what I'm aiming to provide.)
As I said, I'm used to constant, harsh code reviews, so no offense is taken (on the contrary, I really, genuinely appreciate the harsher comments).
> Our current high-level antique languages (HALs) add to this complexity.
> It doesn't inspire confidence when you use inflammatory language ("antique") in the second sentence. You should present your ideas in a neutral way and let readers arrive at judgments themselves.
This is a great point. I needed an acronym for the non-ETN languages, and it seemed to fit. It was also meant to push
the envelope a bit. Of course, I owe everything to what I call the ALs, so I have no ill will. And in fact, even if ETNs are
superior, it will be a long time before we have ETNs all the way down.
> ASTs have enabled local minimum gains in developer productivity.
> What do you mean? ASTs are not a tool for application developers, they're a tool for compilers and compiler developers.
I would say that's part of the problem. Developers have to wrestle with the "black box" that is the AST. When developers have visibility
into, and access to, the tree structure of their programs, programming becomes more delightful.
> But programmers lose time and insight due to discrepancies between HAL code and ASTs.
> Unsubstantiated claim.
Correct. I have shown almost no evidence to this point. Coming soon.
> GitHub ES6 repo
> What's an ES6?
It's the modern-day version of JavaScript, and a fantastic language (something I would not say about earlier versions of JS).
> TN [...] makes reading code locally and globally easy, and ETN programs use far fewer source nodes than equivalent HAL programs. ETNs let programmers write much shorter, perfect programs.
> A litany of unsubstantiated claims. You should either use a much more neutral language (e.g. "Our goal was to devise a notation that does X" instead of "ETN does X") or present evidence to support your claims (e.g. "we wrote 100 functioning programs in ETN and in $common_language and found that ETN programs use 35.6% less source nodes"). Also, you didn't explain why "number of source nodes" is a useful metric.
Great points! More evidence will soon come to light. I wanted to keep this paper short. Two pages was a hard limit I set for myself.
On # of source nodes being a useful metric: http://www.paulgraham.com/power.html
> There are a lot more unsubstantiated claims in the subsequent sections. I'm not going to point these out individually because I think you will get what I'm going for.
Yup!
> Every text string is a valid TN program. [...]
> This whole section is conceptually fuzzy. I kind of get why you would consider this property useful, but I don't see that having any value in practice. Why should I care about the program being invalid for the 1 second while I type out this particular variable name? I only care if the program is valid once it's being passed into the compiler/interpreter. If the momentary invalidity of a source code during character-wise editing really were a problem, we would have switched to AST-based editors decades ago (at least once we had access to GUIs).
This is important in Ohayo. Sorry, will get that launched shortly.
> But my bigger gripe is this:
> "Errors" still can exist at the ETN level, but ETN microparsers can handle errors independently
> So you're just moving the problem to a different layer. Also, this same behavior can be had with any "antique" programming language whose grammar defines synchronization points. For example, in C/C++, if a parser error occurs within a statement, the parser can jump ahead to the next semicolon token and continue parsing the next statement from there. How is TN/ETN superior to this?
It's incredibly easy to implement innovative new error-handling strategies in ETNs. That's a very fair point, though, and
I'm benefiting from evidence that right now only I have access to. Again, coming soon.
> C. Semantic diffs
> I don't get the point of this section. When I have a merge conflict between two changes, it's because they touch the same stuff, and therefore human-level intelligence is required to merge them. What is (E)TN bringing to the table to make merges more automatic? The best way to clarify this would probably be to present a concrete example.
Occasionally one person's editor will use tabs instead of spaces, or apply some other formatting that conflicts with others' tools. Even when
a tool like Prettier is used (which itself is brand new--and fantastic, I might add), stuff still happens. In TN all whitespace is semantic: an extra YI (TN's node separator, a newline) creates an extra node, and an extra XI (TN's edge symbol, a space) changes a parent-child relationship.
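Here's a quick sketch of what I mean, in Python. The names (`parse_tn`, `Node`) and exact conventions are mine and just illustrative, assuming one XI (space) of indentation per tree level and one node per line:

```python
# A minimal Tree Notation-style parser sketch. Every input string yields
# a tree -- there is no error state, which is also the "every text string
# is a valid TN program" property discussed above.

class Node:
    def __init__(self, line, depth):
        self.line = line          # the node's content (line minus indentation)
        self.depth = depth
        self.children = []

def parse_tn(text):
    """Parse text into a list of root nodes. Never raises."""
    roots, stack = [], []         # stack holds the current ancestor chain
    for raw in text.split("\n"):  # YI (newline) separates nodes
        depth = len(raw) - len(raw.lstrip(" "))   # XI (space) count = depth
        node = Node(raw.lstrip(" "), depth)
        while stack and stack[-1].depth >= depth:
            stack.pop()           # climb back up to this node's parent
        if stack:
            stack[-1].children.append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots

roots = parse_tn("parent\n child1\n child2\n  grandchild")
print([c.line for c in roots[0].children])  # → ['child1', 'child2']
```

Note what an extra space does: it moves a node one level deeper, changing a parent-child relationship, and nothing else. That's the kind of diff that stays semantic.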
> HALs ignore whitespace
> I would have replied that you apparently never used Python, but the next example uses Python, so I don't know how to explain this.
Python treats some whitespace as significant (leading indentation) and ignores other whitespace (spacing within expressions, line breaks inside brackets).
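A two-line illustration of that split (the function name here is just an example):

```python
# Indentation is significant in Python:
def add_one(x):
    return x + 1      # this indentation is required

# ...but whitespace inside expressions and brackets is ignored:
assert add_one( 1 ) == add_one(1)
assert [1,
        2] == [1, 2]

print(add_one(41))  # → 42
```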
> JSON program
> JSON is not a programming language.
I agree, "document" would have been a better word here. I accidentally swapped those. But I don't think it's a nit. TN/ETNs work for declarative notations as well as Turing Complete languages.
> For example, the JSON program below requires extra nodes to encode Python
> Why on earth would you wrap Python in JSON (or ETN or anything)? Related:
This example is actually taken directly from IPython (https://ipython.org/), a very popular tool. Because the words are arbitrary,
I took liberties and added some allusions to various things, but it's a real thing.
> multi-lingual programs
> I assume you mean that the code contains the same program in multiple programming languages. Why would I want to do this? This whole section looks like a solution in want of a problem. Why would I need to wrap Python into any of these structures?
Sorry, I guess that's not too clear. I mean a document that contains sections written in different languages. For example, an HTML document often contains HTML, CSS, and JavaScript in one file.
> The GER demonstrates how...
> So this paper is only a teaser and I should look at the Github repo for the real deal? Sorry, but that's not how research papers work. I feel like I'm reading a blog post here. It's okay to link to references (e.g. websites or other papers) when you're introducing an existing idea. But when presenting a novel idea, a paper should be reasonably selfcontained.
Gotcha, this is good feedback! There's a whole lot of evidence and demos and stuff to talk about; this is just the tip of the iceberg, and if I hadn't
limited it to 2 pages it could have been 400.
> Prediction 1: no structure will be found that cannot serialize to TN.
> That's not a prediction, that's a tautology. In a world where I can serialize everything as a string of 1s and 0s, of course I can serialize everything as TN, too. This could be a prediction if you defined the serialization(s) in some way (at least by their properties).
Great way of putting it! Thank you for this comment. Perhaps something along these lines: 1s and 0s will inevitably need to represent trees to encode a structure, and TN is a minimal way to do that. Or: "you can encode everything in 1s and 0s, but you will inevitably have to build a notation for trees, and ETNs (2D Turing machines) that can operate on those trees." Anyway, something along those lines. But I think the general,
informal prediction will hold.
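To make the informal version concrete, here's a tiny sketch of serializing an arbitrary nested structure to a TN-style text. The encoding conventions (one space of indentation per level, one line per node) and the name `to_tn` are mine, purely for illustration:

```python
# A hypothetical serializer from nested Python dicts/lists to TN-style
# indented text: one YI (line) per node, one XI (space) per level.

def to_tn(value, depth=0):
    indent = " " * depth
    if isinstance(value, dict):
        lines = []
        for key, val in value.items():
            lines.append(indent + str(key))       # key becomes a node
            lines.append(to_tn(val, depth + 1))   # value becomes its subtree
        return "\n".join(lines)
    if isinstance(value, list):
        return "\n".join(to_tn(v, depth) for v in value)
    return indent + str(value)                    # scalar leaf node

print(to_tn({"person": {"name": "Ada", "age": 36}}))
# person
#  name
#   Ada
#  age
#   36
```

The point isn't that this particular encoding is special, just that any structure you can build out of nesting and sequence falls out as a tree of lines and indents.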
> Below is some code in a simple ETN, JsonETN:
> This snippet looks like a rendering error. If you're not going to describe what's in it, I'd rather cut it.
Sorry, instead of "lorem ipsums" I took liberties to make references to people and things that have helped me over the years.
> ETNs will be found for great ALs including C, RISC-V, ES6 and Arc.
> Now that's just namedropping. What particular relevance do these languages have that warrants their mention over others, except for (in the case of RISC-V or ES6) appealing to the HN crowd?
Again, me taking liberties to, in place of lorem ipsums, pay tribute to some of my favorite things.
> And by the way, RISC-V is not a language. The RISC-V ISA's machine code is a language. This lack of rigorousness is one of the most pervasive patterns in the paper, and one of the reasons why I said earlier that I feel like reading a blog post instead of a paper.
Sorry if that wasn't clear, I was indeed referring to the ISA.
> Some ETNs have already been found
> discovering an ETN for an AL
> I take issue with how you use the word "find". Why are you so insistent in denying that languages are invented?
Oh, "invented" works too. I just found it interesting that, by the conclusions of this paper, one can pick any arbitrary Turing
complete language and be sure that an ETN could be found for it. Or invented. Either word works for me. Thanks for the feedback on that.
> Tree Oriented Programming will supersede Object Oriented Programming.
> These are two entirely separate concerns, as far as I can see. TOP, as you show it in the paper, happens on the syntactic level only. OOP happens on the semantic level. To witness, observe how most nontrivial C libraries are object-oriented, even though C itself is not object-oriented in any way.
I think the tree structure is the important thing when looking at the best programs. Objects (or node types, in my terms) are quite important too, but it's more important to get the tree right than the objects right.
> High level ETNs will be found to translate machine written programs into understandable trees.
> I said before that I won't point out any unsubstantiated claim, but this one gets an honorable mention for being particularly egregious.
Agreed. More than a hunch, though.
> At time of publication, 312 programming languages and notations are searched. Over 10,000 pages of research was read.
> Similar to namedropping, this is numberdropping. Again, this is something for a blog post, not for a paper. It doesn't add any useful information for the reader.
I think it's somewhat useful, no? If I had said 30, wouldn't the paper be more worthy of skepticism?
> [Figure 2]
> What is this image supposed to tell me?
> [References] http://sci-hub.cc/[...]
> Pro-tip: Don't link to SciHub in a paper.
Haha. Fair point.
Without SciHub though, this work wouldn't have been possible, so I think it would be cowardly of me to not give her & them credit.
It's a damn shame we live in a world where something like SciHub isn't encouraged, and is instead attacked.