jlongster/codemirror-opt.md

## codemirror-opt.md

      
I did some poking around in CodeMirror to see how it works, particularly its current code folding implementation. I wanted to see if there was a way to keep only part of the string in memory, render those parts, and still have correct line numbers. The editor would display something like this:
1 
2 
3  function() {
27 }
28

Note how the function body is missing but the line numbers jump from 3 to 27. If I could get this rendering with only the string function() {\n} in memory, we could build what we need on top of it.
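To make the line-number jump concrete, here is a minimal sketch of the mapping from a visible line to its number in the full source. The helper `realLineNumber` and the `folds` shape are hypothetical, made up for illustration; nothing here is CodeMirror code.

```javascript
// Hypothetical helper: map a 1-based index into the visible (collapsed)
// lines to the line number it has in the full source.
// Each fold says how many source lines are hidden after a given visible line.
function realLineNumber(visibleLine, folds) {
  let hiddenBefore = 0;
  for (const fold of folds) {
    if (fold.afterVisibleLine < visibleLine) hiddenBefore += fold.hiddenCount;
  }
  return visibleLine + hiddenBefore;
}

// Source lines 4..26 (23 lines) are missing after visible line 3.
const folds = [{ afterVisibleLine: 3, hiddenCount: 23 }];
const gutter = [1, 2, 3, 4, 5].map((n) => realLineNumber(n, folds));
console.log(gutter.join(" ")); // 1 2 3 27 28
```

CodeMirror 5 documents a lineNumberFormatter option that could probably call something like this, but I haven't tried whether that alone is enough.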
Unfortunately, CodeMirror assumes that the entire source string is available in memory, full stop. There's no elegant way to make this work, but we might have a few options if we fully understand CodeMirror's internals. So let's take a look.
I found where it actually slices the source string and inserts each individual line: code. The function insertLineContent appends a line to the DOM (with whatever builder is). You'll see the allText.slice call on that line, and that's where it gets the text.
That code is wrapped in an if (!spans) { condition, so what is that? "Spans" are the internal representation of what is created when you call markText. That function allows you to "mark" ranges of text (which can cross lines) and do something different with them. The code folding functionality (which is just a plugin) is built on top of this builtin behavior. A span can be marked as "collapsed", and you'll notice here that collapsed spans don't actually add anything to the DOM.
It seems like a very specific builtin feature for the sole purpose of something like code folding. I wouldn't be surprised if code folding was the reason collapsed spans exist.
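As a rough sketch of how collapsed spans get created, the helper below calls markText with the collapsed option for a list of ranges. `collapseRanges` and `fakeDoc` are hypothetical names; the stub only records calls so the snippet runs without CodeMirror, and with a real editor you would pass the document object instead.

```javascript
// Hypothetical helper: collapse each { from, to } range on anything that
// exposes a markText(from, to, options) method, as a CodeMirror 5 doc does.
function collapseRanges(doc, ranges) {
  return ranges.map(({ from, to }) =>
    doc.markText(from, to, { collapsed: true })
  );
}

// Stand-in for a real document: it records calls and returns a marker-like
// object with a clear() method.
const calls = [];
const fakeDoc = {
  markText(from, to, options) {
    calls.push({ from, to, options });
    return { clear() {} };
  },
};

// Positions are zero-based { line, ch } pairs: hide everything between the
// "{" on source line 3 and the "}" on source line 27.
const markers = collapseRanges(fakeDoc, [
  { from: { line: 2, ch: 12 }, to: { line: 26, ch: 0 } },
]);
```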
We'll talk more about spans later. Now, take a look at the line where it inserts each line into the DOM again. It gets the line content with the call allText.slice(at, at = styles[i]), so what is this styles array?
It's passed into insertLineContent, so moving up a call you see it's created from getLineStyles, which gets the style array from highlightLine. This function has the meat. It generates an array of offsets into the original string. (It's called highlightLine because this is where, if any mode other than plain text exists, it tokenizes the line and splits it into individual tokens to allow highlighting.)
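Here is a simplified model of how an offsets array can drive the slicing. It assumes the array alternates an end offset with a style name; the real array may carry extra bookkeeping entries, and `sliceByStyles` is a made-up name, not a CodeMirror function.

```javascript
// Simplified model: `styles` is [endOffset, styleName, endOffset, styleName, ...].
// Each token runs from the previous end offset up to the next one.
function sliceByStyles(allText, styles) {
  const tokens = [];
  let at = 0;
  for (let i = 0; i < styles.length; i += 2) {
    const end = styles[i];
    // Same effect as the allText.slice(at, at = styles[i]) idiom.
    tokens.push({ text: allText.slice(at, end), style: styles[i + 1] });
    at = end;
  }
  return tokens;
}

const tokens = sliceByStyles("var x = 1;", [
  3, "keyword", 4, null, 5, "variable", 8, null, 9, "number", 10, null,
]);
console.log(tokens.map((t) => t.text)); // [ 'var', ' ', 'x', ' = ', '1', ';' ]
```

The point is that the renderer never looks at the text directly; it only follows the offsets, which is why controlling the offsets is interesting.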
OK, with all of this context, here are a few solutions:


Generate the styles (offsets) array ourselves
If you look in getLineStyles, it first checks if there is a styles object attached and will use that if so (it caches the result). We might be able to "post-process" the offsets by iterating over the lines and manually patching each line's styles array with the right text offsets. Or somehow create the styles array ourselves.
However, this doesn't solve how to actually get the expanding/collapsing working... but the real solution will probably involve a bit of all of these ideas.
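The cache-first behavior can be modeled like this. It is a sketch of the idea only, not CodeMirror's actual code: `getLineStylesSketch` is hypothetical, and the real cache may well be invalidated on edits or mode changes, which would throw away any patched array.

```javascript
// Simplified model: use the styles array attached to the line if there is
// one, otherwise compute it with the tokenizer and cache it on the line.
function getLineStylesSketch(line, tokenize) {
  if (!line.styles) line.styles = tokenize(line.text);
  return line.styles;
}

let tokenizerCalls = 0;
const tokenize = (text) => {
  tokenizerCalls++;
  return [text.length, null]; // one unstyled token covering the whole line
};

const fresh = { text: "}" };
const patched = { text: "}", styles: [1, "bracket"] }; // array we supplied ourselves

getLineStylesSketch(fresh, tokenize);   // runs the tokenizer
getLineStylesSketch(patched, tokenize); // uses our array, tokenizer is skipped
```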


Create a string proxy
We could try to create an object that looks and acts like a string, but would be aware of which sections are collapsed, and methods like slice would appropriately translate a "raw" location to one that looks like the right line in the "collapsed" string. We would simply pass this in as the value option.
In this case, the editor would still create millions of lines. The string proxy would return the real length of the source, so that could still be a problem. The only difference is that we don't have the full string locally (which saves memory) and most of the lines would be empty at first in CodeMirror's internal representation.
We'd also have to manually walk through the collapsed sections and call markText to collapse those regions so it matches our collapsed string.
In this case, I think everything works (line numbers, etc), but I'm not sure if it actually solves the perf problem.
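A minimal sketch of what such a proxy could look like, assuming only length and slice matter. `CollapsedStringProxy` and the gap format are hypothetical, and I haven't verified which string methods CodeMirror actually calls on the value; a real proxy would have to cover all of them.

```javascript
// Hypothetical string-like proxy over a partially loaded source.
// `collapsed` is the text we actually hold. Each gap records where text is
// missing (`at`, an offset into `collapsed`) and how many raw characters it
// stands for (`rawLength`). Gaps must be sorted by `at`.
class CollapsedStringProxy {
  constructor(collapsed, gaps) {
    this.collapsed = collapsed;
    this.gaps = gaps;
  }

  // Report the length of the full source, not of the text held locally.
  get length() {
    return this.gaps.reduce((sum, gap) => sum + gap.rawLength, this.collapsed.length);
  }

  // Translate an offset in the full source to an offset in `collapsed`.
  toCollapsedOffset(raw) {
    let hidden = 0; // raw characters covered by the gaps passed so far
    for (const gap of this.gaps) {
      const rawStart = gap.at + hidden;
      if (raw <= rawStart) break;
      if (raw < rawStart + gap.rawLength) return gap.at; // inside the gap
      hidden += gap.rawLength;
    }
    return raw - hidden;
  }

  // Takes raw offsets; text inside a gap comes back as an empty string.
  slice(rawStart, rawEnd = this.length) {
    return this.collapsed.slice(
      this.toCollapsedOffset(rawStart),
      this.toCollapsedOffset(rawEnd)
    );
  }
}

// The function body (100 characters in the full source) is not loaded.
const proxy = new CollapsedStringProxy("function() {\n}", [{ at: 13, rawLength: 100 }]);
console.log(proxy.length);        // 114
console.log(proxy.slice(13, 113)); // "" (the missing body)
console.log(proxy.slice(113));     // "}"
```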


Fork CodeMirror?
The more I think about this, the less hopeful I am. If the string proxy (#2) does not actually solve the perf problem, then we need CodeMirror to not create the internal representation for lines that don't exist yet. This would require forking CodeMirror and customizing it to our needs.