How memory allocation happens in javascript?
A few weeks ago we started a series aimed at digging deeper into JavaScript and how it actually works: we thought that by knowing the building blocks of JavaScript and how they come to play together you’ll be able to write better code and apps. Show
The first post of the series focused on providing an overview of the engine, the runtime, and the call stack. Thе second post examined closely the internal parts of Google’s V8 JavaScript engine and also provided a few tips on how to write better JavaScript code. In this third post, we’ll discuss another critical topic that’s getting ever more neglected by developers due to the increasing maturity and complexity of programming languages that are being used on a daily basis — memory management. We’ll also provide a few tips on how to handle memory leaks in JavaScript that we at SessionStack follow as we need to make sure SessionStack causes no memory leaks or doesn’t increase the memory consumption of the web app in which we are integrated. OverviewLanguages, like C, have low-level memory management primitives such as At the same time, JavaScript allocates memory when things (objects, strings, etc.) are created and “automatically” frees it up when they are not used anymore, a process called garbage collection. This seemingly “automatical” nature of freeing up resources is a source of confusion and gives JavaScript (and other high-level-language) developers the false impression they can choose not to care about memory management. This is a big mistake. Even when working with high-level languages, developers should have an understanding of memory management (or at least the basics). Sometimes there are issues with the automatic memory management (such as bugs or implementation limitations in the garbage collectors, etc.) which developers have to understand in order to handle them properly (or to find a proper workaround, with a minimum trade off and code debt). Memory life cycleNo matter what programming language you’re using, memory life cycle is pretty much always the same: Here is an overview of what happens at each step of the cycle:
For a quick overview of the concepts of the call stack and the memory heap, you can read our first post on the topic. What is memory?Before jumping straight to memory in JavaScript, we’ll briefly discuss what memory is in general and how it works in a nutshell. On a hardware
level, computer memory consists of a large number of Since as humans, we are not that good at doing all of our thinking and arithmetic in bits, we organize them into larger groups, which together can be used to represent numbers. 8 bits are called 1 byte. Beyond bytes, there are words (which are sometimes 16, sometimes 32 bits). A lot of things are stored in this memory:
The compiler and the operating system work together to take care of most of the memory management for you, but we recommend that you take a look at what’s going on under the hood. When you compile your code, the compiler can examine primitive data types and calculate ahead of time how much memory they will need. The required amount is then allocated to the program in the call stack space. The space in which these variables are allocated is called the stack space because as functions get called, their memory gets added on top of the existing memory. As they terminate, they are removed in a LIFO (last-in, first-out) order. For example, consider the following declarations: int n; // 4 bytes The compiler can immediately see that the code requires
The compiler will insert code that will interact with the operating system to request the necessary number of bytes on the stack for your variables to be stored. In the example above, the compiler knows the exact memory address of each variable. In fact, whenever we write to the variable Notice that if we attempted to access When functions call other functions, each gets its own chunk of the stack when it is called. It keeps all its local variables there, but also a program counter that remembers where in its execution it was. When the function finishes, its memory block is once again made available for other purposes. Dynamic allocationUnfortunately, things aren’t quite as easy when we don’t know at compile time how much memory a variable will need. Suppose we want to do something like the following: int n = readInput(); // reads input from the user...// create an array with "n" elements Here, at compile time, the compiler does not know how much memory the array will need because it is determined by the value provided by the user. It, therefore, cannot allocate room for a variable on the stack. Instead, our program needs to explicitly ask the operating system for the right amount of space at run-time. This memory is assigned from the heap space. The difference between static and dynamic memory allocation is summarized in the following table: Differences between statically and dynamically allocated memoryTo fully understand how dynamic memory allocation works, we need to spend more time on pointers, which might be a bit too much of a deviation from the topic of this post. If you’re interested in learning more, just let me know in the comments and we can go into more details about pointers in a future post. Allocation in JavaScriptNow we’ll explain how the first step (allocate memory) works in JavaScript. JavaScript relieves developers from the responsibility to handle memory allocations — JavaScript does it by itself, alongside declaring values. var n = 374; // allocates memory for a number Some function calls result in object allocation as well: var d = new Date(); // allocates a Date objectvar e = document.createElement('div'); // allocates a DOM element Methods can allocate new values or objects: var s1 = 'sessionstack'; Using memory in JavaScriptUsing the allocated memory in JavaScript basically, means reading and writing in it. This can be done by reading or writing the value of a variable or an object property or even passing an argument to a function. Release when the memory is not needed anymoreMost of the memory management issues come at this stage. The hardest task here is to figure out when the allocated memory is not needed any longer. It often requires the developer to determine where in the program such piece of memory is not needed anymore and free it. High-level languages embed a piece of software called garbage collector which job is to track memory allocation and use in order to find when a piece of allocated memory is not needed any longer in which case, it will automatically free it. Unfortunately, this process is an approximation since the general problem of knowing whether some piece of memory is needed is undecidable (can’t be solved by an algorithm). Most garbage collectors work by collecting memory which can no longer be accessed, e.g. all variables pointing to it went out of scope. That’s, however, an under-approximation of the set of memory spaces that can be collected, because at any point a memory location may still have a variable pointing to it in scope, yet it will never be accessed again. Garbage collectionDue to the fact that finding whether some memory is “not needed anymore” is undecidable, garbage collections implement a restriction of a solution to the general problem. This section will explain the necessary notions to understand the main garbage collection algorithms and their limitations. Memory referencesThe main concept garbage collection algorithms rely on is the one of reference. Within the context of memory management, an object is said to reference another object if the former has an access to the latter (can be implicit or explicit). For instance, a JavaScript object has a reference to its prototype (implicit reference) and to its properties’ values (explicit reference). In this context, the idea of an “object” is extended to something broader than regular JavaScript objects and also contains function scopes (or the global lexical scope).
Reference-counting garbage collectionThis is the simplest garbage collection algorithm. An object is considered “garbage collectible” if there are zero references pointing to it. Take a look at the following code: var o1 = { Cycles are creating problemsThere is a limitation when it comes to cycles. In the following example, two objects are created and reference one another, thus creating a cycle. They will go out of scope after the function call, so they are effectively useless and could be freed. However, the reference-counting algorithm considers that since each of the two objects is referenced at least once, neither can be garbage-collected. function f() { Mark-and-sweep algorithmIn order to decide whether an object is needed, this algorithm determines whether the object is reachable. The Mark-and-sweep algorithm goes through these 3 steps:
This algorithm is better than the previous one since “an object has zero reference” leads to this object being unreachable. The opposite is not true as we have seen with cycles. As of 2012, all modern browsers ship a mark-and-sweep garbage-collector. All improvements made in the field of JavaScript garbage collection (generational/incremental/concurrent/parallel garbage collection) over the last years are implementation improvements of this algorithm (mark-and-sweep), but not improvements over the garbage collection algorithm itself, nor its goal of deciding whether an object is reachable or not. In this article, you can read in a greater detail about tracing garbage collection that also covers mark-and-sweep along with its optimizations. Cycles are not a problem anymoreIn the first example above, after the function call returns, the two objects are not referenced anymore by something reachable from the global object. Consequently, they will be found unreachable by the garbage collector. Even though there are references between the objects, they’re not reachable from the root. Counter intuitive behavior of Garbage CollectorsAlthough Garbage Collectors are convenient they come with their own set of trade-offs. One of them is non-determinism. In other words, GCs are unpredictable. You can’t really tell when a collection will be performed. This means that in some cases programs use more memory that it’s actually required. In other cases, short-pauses may be noticeable in particularly sensitive applications. Although non-determinism means one cannot be certain when a collection will be performed, most GC implementations share the common pattern of doing collection passes during allocation. If no allocations are performed, most GCs stay idle. Consider the following scenario:
In this scenario, most GCs will not run any further collection passes. In other words, even though there are unreachable references available for collection, these are not claimed by the collector. These are not strictly leaks but still, result in higher-than-usual memory usage. What are memory leaks?Just like the memory suggests, memory leaks are pieces of memory that the application have used in the past but is not needed any longer but has not yet been return back to the OS or the pool of free memory. Programming languages favor different ways of managing memory. However, whether a certain piece of memory is used or not is actually an undecidable problem. In other words, only developers can make it clear whether a piece of memory can be returned to the operating system or not. Certain programming languages provide features that help developers do this. Others expect developers to be completely explicit about when a piece of memory is unused. Wikipedia has good articles on manual and automatic memory management. The four types of common JavaScript leaks1: Global variablesJavaScript handles
undeclared variables in an interesting way: when a undeclared variable is referenced, a new variable gets created in the global object. In a browser, the global object would be function foo(arg) { is the equivalent of: function foo(arg) { Let’s say the purpose of You can also accidentally create a global variable using function foo() {
Unexpected globals is certainly an issue, however, more often than not your code would be infested with explicit global variables which by definition cannot be collected by the garbage collector. Special attention needs to be given to global variables used to temporarily store and process large bits of information. Use global variables to store data if you must but when you do, make sure to assign it as null or reassign it once you are done with it. 2: Timers or callbacks that are forgottenLet’s take Libraries which provide observers and other instruments that accept callbacks usually make sure all references to the callbacks become unreachable once their instances are unreachable too. Still, the code below is not a rare find: var serverData = loadData(); The snippet above shows the consequences of using timers that reference nodes or data that’s no longer needed. The When using observers, you need to make sure you make an explicit call to remove them once you are done with them (either the observer is not needed anymore, or the object will become unreachable). Luckily, most modern browsers would do the job for you: they’ll automatically collect the observer handlers once the observed object becomes unreachable even if you forgot to remove the listener. In the past some browsers were unable to handle these cases (good old IE6). Still, though, it’s in line with best practices to remove the observers once the object becomes obsolete. See the following example: var element = document.getElementById('launch-button'); You no longer need to call If you leverage the 3: ClosuresA key aspect of JavaScript development are closures: an inner function that has access to the outer (enclosing) function’s variables. Due to the implementation details of the JavaScript runtime, it is possible to leak memory in the following way: var theThing = null;var replaceThing = function () { var originalThing = theThing; Once In this case, the scope created for the closure In the above example, the scope created for the closure All this can result in a considerable memory leak. You can expect to see a spike in memory usage when the above snippet is run over and over again. Its size won’t shrink when the garbage collector runs. A linked list of closures is created (its root is This issue was found by the Meteor team and they have a great article that describes the issue in great detail. 4: Out of DOM referencesThere are cases in which developers store DOM nodes inside data structures. Suppose you want to rapidly update the contents of several rows in a table. If you store a reference to each DOM row in a dictionary or an array, there will be two references to the same DOM element: one in the DOM tree and another in the dictionary. If you decide to get rid of these rows, you need to remember to make both references unreachable. var elements = { There’s an additional consideration that has to be taken into account when it comes to references to inner or leaf nodes inside a DOM tree. If you keep a reference to a table cell (a We at
SessionStack try to follow these best practices in writing code that handles memory allocation properly, and here’s why: Once you integrate SessionStack into your production web app, it starts recording everything: all DOM changes, user interactions, JavaScript exceptions, stack traces, failed network requests, debug messages, etc.
There is a free plan so
you can give it a try now. |