@justinbmeyer
Last active August 19, 2022 04:50
JS Memory

JavaScript Code

var str = "hi";

Memory allocation:

Address  Value        Description
......   ...
0x1001   call object  Start of a call object's memory
0x1002   0x00af       Reference to invoked function
0x1003   1            Number of references in this call object
0x1004   str          Name of variable (in practice would not be in a single address)
0x1005   0x2001       Memory address of "hi"
......   ...
0x2001   string       type identifier
0x2002   2            number of bytes
0x2003   h            byte of first character
0x2004   i            byte of second character

Explanation

When JS runs var str = "hi"; as part of some function call, it first hoists all variable declarations and reserves a spot in memory for each variable. This might leave memory looking something like:

Address  Value        Description
......   ...
0x1001   call object  Start of a call object's memory
0x1002   0x00af       Reference to invoked function
0x1003   1            Number of references in this call object
0x1004   str          Name of variable
0x1005   empty
......   ...

In practice, the name of the variable, str, would not fit in a single memory address. Also, variable names and locations would probably not be stored in a fixed-size array; a hash table is more likely.

Next, the string "hi" would be created in memory like:

Address  Value   Description
......   ...
0x2001   string  type identifier
0x2002   2       number of bytes
0x2003   h       byte of first character
0x2004   i       byte of second character

Finally, the pointer address of str would be set to the memory address of "hi" (0x2001), leaving memory as indicated at the top of the page.
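
The hoisting step above can be observed from plain JavaScript. This is a minimal sketch; the function name hoistingDemo is made up for illustration, and it only shows that str already exists (holding undefined) before the assignment runs. It says nothing about how a particular engine lays out memory.

```javascript
// Hypothetical demo function (not part of the original gist).
function hoistingDemo() {
  // The declaration of `str` is hoisted to the top of the function,
  // so reading it here yields undefined instead of throwing.
  var before = str;
  var str = "hi";
  var after = str;
  return { before: before, after: after };
}

var result = hoistingDemo();
console.log(result.before); // undefined
console.log(result.after);  // hi
```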

@mraleph

mraleph commented Jan 29, 2013

Let's take a look at what V8 actually does when it runs something like:

function foo(x) {
  if (x) {
    var hi = "str";
  }
  return hi;
}

First of all, the string "str" is not allocated every time you run this function. It is allocated once by the parser. Every time you execute this code the very same string object is used again. It looks like this:

hidden class
length
hash
str■

where hidden class is the first word of every object; it points to a structure describing the object's type and layout. The first three fields all have the size of a normal pointer: 4 bytes on 32-bit and 8 bytes on 64-bit architectures. The rest holds the string's content, which is padded on the right so that the overall size of the object is divisible by the pointer size. In the example above I assume 4-byte pointers, so I added 1 byte of padding (■).
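
The size arithmetic for this layout can be sketched as a small calculation. The helper name sequentialStringSize and its parameters are assumptions made for illustration, not V8 API, and the one-byte-per-character payload is inferred from the single byte of padding in the example.

```javascript
// Hypothetical helper: size in bytes of a sequential string laid out as
// described above (three pointer-sized header fields plus padded payload).
function sequentialStringSize(length, bytesPerChar, pointerSize) {
  var header = 3 * pointerSize;        // hidden class, length, hash
  var payload = length * bytesPerChar; // raw character data
  var unpadded = header + payload;
  // Round up to the next multiple of the pointer size.
  return Math.ceil(unpadded / pointerSize) * pointerSize;
}

var size32 = sequentialStringSize(3, 1, 4); // "str" with 4-byte pointers
var size64 = sequentialStringSize(3, 1, 8); // same string with 8-byte pointers
console.log(size32); // 16 (12 header + 3 payload + 1 padding)
console.log(size64); // 32 (24 header + 3 payload + 5 padding)
```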

There are actually numerous ways to represent a string in V8; the one above is called a sequential string because it carries its payload in its body.

Now let's get back to function parsing. As I said, the string itself is allocated once by the parser. The same, in some sense, is true for variables as well. V8 does not recompute the scope every time you enter the function; it does so once, when it parses foo. For the simple function above it computes a scope with a parameter x (parameter index 1) and a single local variable hi (local index 0). When the JIT compiler starts emitting code, it uses these indices, not names, to access variables.

The call object that you are describing above does not really exist in the heap. Instead the native machine stack is used.

When you enter the function foo you end up with something like this:

...
receiver    tagged pointer to the receiver of the invocation (this)
x           tagged pointer to the passed value of x
retaddr     return address into the caller
caller ebp  frame pointer for the caller                             ← ebp points here
function    tagged pointer to the invoked function (foo)
context     tagged pointer to the function context
undefined   tagged pointer to the value of variable hi               ← esp points here

(Every slot in the "picture" above is of pointer size. I am using → to highlight that the value itself is not stored on the stack. Only pointer to it is.)

As you can see, there are no variable names here (the mapping is retained in the scope object produced by the parser, but it is not used by the code the JIT compiler generates). There is also no explicit size: it can be computed from the stack registers (ebp and esp) when required.

If the initialisation var hi = "str" is executed, then the pointer to undefined is replaced with a pointer to the string "str", allocated as described above, and the stack looks like this:

...
receiver    tagged pointer to the receiver of the invocation (this)
x           tagged pointer to the passed value of x
retaddr     return address into the caller
caller ebp  frame pointer for the caller                             ← ebp points here
function    tagged pointer to the invoked function (foo)
context     tagged pointer to the function context
"str"       tagged pointer to the value of variable hi               ← esp points here

[Things look different if x is a captured variable or if x was introduced by eval, but let's leave these complexities aside for now]
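
The two stack pictures above correspond to the two observable outcomes of foo. A minimal sketch; the variable names skipped and taken are made up for illustration.

```javascript
function foo(x) {
  if (x) {
    var hi = "str";
  }
  return hi;
}

// The slot for hi exists on every call; it only changes if the branch runs.
var skipped = foo(false); // slot still holds undefined
var taken = foo(true);    // slot was overwritten with the string
console.log(skipped); // undefined
console.log(taken);   // str
```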

@jaredwy

jaredwy commented Jan 29, 2013

If my knowledge is still relevant, Mozilla still uses up to a 128-bit payload for their equivalent. The choice was made so they could fit floating-point values into their equivalent of a sequential https://bugzilla.mozilla.org/show_bug.cgi?id=549143

and as always wingo has a writeup on this http://wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations

@mraleph

mraleph commented Jan 29, 2013

@jaredwy: they use 64-bit values, which is enough to fit a double-precision floating-point value and other stuff.

[sequential in my comment above referred to the string representation and has nothing to do with the tagging scheme]

@jaredwy

jaredwy commented Jan 29, 2013

As discussed on IM you seem to be correct. Linking the same link here for future reference http://hg.mozilla.org/mozilla-central/file/6cca454559c8/js/src/jsval.h#l253

@taitems

taitems commented Jan 30, 2013

Heading typo: Explination

@kulte

kulte commented Jan 30, 2013

Another possible typo: You state the memory address of "hi" is 0x2001 in your explanation at the end, but in the table state it as 0x1001

@msankhala

Can you please explain "Start of a call object's memory", "Reference to invoked function" and "Number of references in this call object"? Sorry if this seems like a dumb question; I am a newbie in JavaScript. I know that in JavaScript every function is an object. What is the difference between the call object and the invoked function in this case? Aren't they the same?

@mraleph

mraleph commented Jan 30, 2013

@msankhala If I'm guessing @justinbmeyer's intentions right, then the call object is what is usually called an activation record.

Function and activation record are not the same because every time you call a function you get a new activation record that exists until the call has returned. There can be multiple activation records for the same function if it is called recursively.
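
A small sketch of the recursive case: one function, several simultaneous activation records, each with its own copy of a local variable. The function name depthLabels is made up for illustration.

```javascript
// Hypothetical example: every call gets its own activation record,
// and therefore its own `label` variable.
function depthLabels(n) {
  var label = "level " + n;
  if (n === 0) {
    return [label];
  }
  var deeper = depthLabels(n - 1); // nested activations run and return here
  deeper.push(label);              // our own `label` is still intact
  return deeper;
}

var labels = depthLabels(2);
console.log(labels.join(", ")); // level 0, level 1, level 2
```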

@CotunaAurelian

I have a question. What happens with another encoding? For instance, I have a string written in Unicode. I'm asking because I'm curious where the encoding type is stored. Thanks,

@mraleph

mraleph commented Jan 31, 2013

@CotunaAurelian String encoding is stored in the hidden class (called a map internally in V8). Strings are either one-byte (Latin-1) or two-byte (UTF-16) encoded, but the particular encoding is not visible to JavaScript code. ECMA-262 5th edition actually specifies that a string is a sequence of 16-bit integers.
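
The "sequence of 16-bit integers" view is what JavaScript code actually observes, whatever the engine stores internally. A minimal sketch; the variable names are made up for illustration.

```javascript
var ascii = "hi";
var astral = "\uD83D\uDE00"; // U+1F600, outside the Basic Multilingual Plane

// length counts 16-bit code units, not bytes and not visible characters.
console.log(ascii.length);  // 2
console.log(astral.length); // 2 (one visible character, two code units)

// charCodeAt exposes the individual 16-bit units.
var high = astral.charCodeAt(0).toString(16);
var low = astral.charCodeAt(1).toString(16);
console.log(high); // d83d
console.log(low);  // de00
```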

@justinbmeyer
Author

@mraleph,

(Question Summary: If the native call stack is used, and a function's activation frame is popped when the function returns, how do inner functions "walk up" to a parent function's activation frame to get a value?)

Thank you VERY much for your explanation! I thought I had some time when I started this post to work on my JSConf talk, but that time evaporated, so I am resuming it now. I owe you a dinner / beer / etc. If you are coming to JSConf, I can pay that debt sooner than later.

By call object I'm referring to whatever mechanism allows closures to work. For the following example:

var counter = function(){
  var i = 0;
  // The returned function keeps access to i after counter has returned.
  return function(){
    return i++;
  };
};
var index = counter();
index();

Some record (possibly stack frame / activation record) needs to exist for the inner function to find the value of i. I've assumed that an inner function's call object (or activation record) has a reference to its parent call object (or activation record), that i is retrieved by checking the current call object for i and walking up to the parent call objects until i is found.
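
The behaviour in question can be checked directly: i outlives the call to counter, and each call to counter produces a separate i. A minimal sketch; the names first and second are made up for illustration.

```javascript
var counter = function () {
  var i = 0;
  return function () {
    return i++;
  };
};

var first = counter();
var second = counter();

var a = first();  // reads and increments the i created by the first call
var b = first();
var c = second(); // a separate i, created by the second call
console.log(a, b, c); // 0 1 0
```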

I'm not sure how this would work with the native call stack. It's my (mis)understanding that the call stack is popped when the current function has completed running. If that's true, then I'm missing something else. Perhaps when a function is run and its activation record is created, the new activation record gets all of its parent references?

Thanks again for your help!

@justinbmeyer
Author

@mraleph

I doubt that the new activation record gets all of its parent references, otherwise Chrome's and FF's dev tools would not give "Scope Variables" grouped by closure. Each "closure" contains the same information as what I was calling call object.

Is this structure part of the native call stack? How is it that they exist after the function runs? Thanks.

@mraleph

mraleph commented May 21, 2013

If we are talking about V8, then the part of the activation record that needs to survive after the function returns is "detached" from the stack and allocated on the normal heap in a structure called a context. In my first comment above there is a slot for it on the native stack.

Just after execution enters counter, the native stack looks like this:

...
receiver    this (global object)
retaddr     return address into the caller
caller ebp  frame pointer for the caller                             ← ebp points here
function    tagged pointer to the invoked function (counter)
context     tagged pointer to the function context (empty context)   ← esp points here

As you can see, there is no space reserved for i, unlike in the previous example. Where does it go? The next thing that happens, just after entering this code, is that V8 creates a local context object.

Let's call it Context_5fff02b (random digits at the end to reflect that each time we enter function counter we allocate a new one on the heap). It looks like this:

...
map            pointer to a type descriptor (aka map, hidden class)
closure        pointer to a function that created this context      counter
previous       pointer to a previous context used by with contexts  null
extension      dynamic scope data for with or eval                  null
global object  global object for quick access
i              place for a variable i                               undefined

You can see there are quite a few internal fields because contexts are used for multiple purposes. You can look through source code to get a grasp of the details involved.

After the local context is created, the stack slot where the context is stored is updated:

...
receiver    this (global object)
retaddr     return address into the caller
caller ebp  frame pointer for the caller                                ← ebp points here
function    tagged pointer to the invoked function (counter)
context     tagged pointer to the function context ( Context_5fff02b )  ← esp points here

This context pointer is used when working with variable i inside function counter, and every closure allocated in counter gets this context as its outer context.

Inside the function's code the current context is always cached in the register esi, so storing 0 into i looks like this:

mov [esi + (5 * 4 - 1)], 0

I decomposed the immediate offset into parts to make it clearer where each comes from (5th field, 4 bytes per field, -1 to untag the tagged pointer); the generated code will just say offset 19.
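
The same decomposition can be written as a tiny calculation. This is only a sketch of the arithmetic in the instruction above; the helper name taggedFieldOffset and the constant HEAP_OBJECT_TAG are made up for illustration and are not V8 API.

```javascript
// Assumption: a tagged heap pointer is the real address plus 1,
// so field accesses subtract 1 to compensate.
var HEAP_OBJECT_TAG = 1;

// Hypothetical helper: byte offset of a field relative to a tagged pointer.
function taggedFieldOffset(fieldIndex, pointerSize) {
  return fieldIndex * pointerSize - HEAP_OBJECT_TAG;
}

// Field 5 with 4-byte fields, as in the mov instruction above.
var offsetOfI = taggedFieldOffset(5, 4);
console.log(offsetOfI);
```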

The allocated function looks approximately like this (in fact there are more fields in closures; I skip the irrelevant ones):

...
map      pointer to a type descriptor (aka map, hidden class)
code     pointer to the code
context  pointer to the outer context                          Context_5fff02b
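
The relationship between the closure and its context can be imitated in plain JavaScript. This is a toy model only: ordinary objects stand in for engine-internal structures, and the names makeCounterModel and invoke are made up for illustration. It is not how V8 is implemented, just the same shape of data.

```javascript
// Toy stand-in for what happens on entry to counter.
function makeCounterModel() {
  // Stand-in for the heap-allocated context holding the captured variable.
  var context = { previous: null, i: undefined };
  context.i = 0; // corresponds to `var i = 0`

  // Stand-in for the allocated closure: code plus a pointer to its outer context.
  return {
    code: function (ctx) { return ctx.i++; },
    context: context
  };
}

// Calling a closure means running its code against its own context.
function invoke(closure) {
  return closure.code(closure.context);
}

var index = makeCounterModel();
var r0 = invoke(index);
var r1 = invoke(index);
var other = makeCounterModel(); // a fresh context, like a second counter() call
var r2 = invoke(other);
console.log(r0, r1, r2); // 0 1 0
```

Because the context lives in an ordinary object rather than in the call's stack frame, it survives after makeCounterModel returns, which is the point of the explanation above.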

Hope this helps.

@martianmartian

I would really appreciate it if someone could help me check my understanding of this whole process. I wrote it down here, here, and here. Just wondering because these things have been bugging me for a while.
