What is a JavaScript engine

Rainer Hahnekamp

JavaScript is often seen as slow by users of other languages. The poor performance has historical reasons, and today's browser engines can compensate for it.

Users of languages like Java or C# still smile at JavaScript. Because of its poor language design and poor performance, they do not consider it a "full-fledged" language. That may have been true 15 years ago. In the meantime, JavaScript has improved continuously on all fronts. A new and improved standard appears annually, and JavaScript no longer needs to hide in terms of performance. The engine that executes the code is responsible for the speed of execution. Leading the way for JavaScript is Google with its V8 engine.

True to the motto "To understand the present, you have to know the past", here is a look into the past of JavaScript.

A little code example

(() => {
  const han = {firstname: "Han", lastname: "Solo"};
  const luke = {firstname: "Luke", lastname: "Skywalker"};
  const leia = {firstname: "Leia", lastname: "Organa"};
  const obi = {firstname: "Obi-Wan", lastname: "Kenobi"};
  const yoda = {firstname: "", lastname: "Yoda"};
  const people = [
    han, luke, leia, obi,
    yoda, luke, leia, obi
  ];

  const getName = (person) => person.lastname;

  console.time("engine");
  for (var i = 0; i < 1000 * 1000 * 1000; i++) {
    getName(people[i & 7]);
  }
  console.timeEnd("engine");
})();

1.328 seconds (Dell XPS 15 (9570) with i7-8750H, Windows 10 Pro, Chrome 71)

The example is relatively simple. It creates five objects that represent characters from the Star Wars films with their first and last names. A loop then passes them one billion (1,000,000,000) times to a function that returns the last name. Depending on the speed of the computer, the execution takes up to two seconds. That changes abruptly, however, when each object receives an additional property or, in the case of "Yoda", one is omitted:

(() => {
  const han = {firstname: "Han", lastname: "Solo", spacecraft: "Falcon"};
  const luke = {firstname: "Luke", lastname: "Skywalker", job: "Jedi"};
  const leia = {firstname: "Leia", lastname: "Organa", gender: "female"};
  const obi = {firstname: "Obi-Wan", lastname: "Kenobi", retired: true};
  const yoda = {lastname: "Yoda"};

  const people = [
    han, luke, leia, obi,
    yoda, luke, leia, obi
  ];

  const getName = (person) => person.lastname;

  console.time("engine");
  for (var i = 0; i < 1000 * 1000 * 1000; i++) {
    getName(people[i & 7]);
  }
  console.timeEnd("engine");
})();

11.386 seconds (Dell XPS 15 (9570) with i7-8750H, Windows 10 Pro, Chrome 71)

The example is specially tailored for Chrome and is not a benchmark. It is relatively unlikely that such code would appear in a practical application. Nonetheless, it is perfect for illustrating the inner workings of an engine.

Historical development and fundamental design decisions

JavaScript was never designed for large application development. When it first saw the light of day in 1995 in Netscape Navigator 2.0, under the direction of Brendan Eich, there was already a top dog in that space: Java. At that time there was huge hype around Java: it was considered the successor to C++ and thus the new "lingua franca".

However, Netscape saw the need for a "lightweight" alternative. Use cases were in particular form validations or small animations. Under no circumstances did Netscape want to challenge Java as a competitor, because at the same time they entered into an alliance with Sun, the then owner of Java, in order to be able to assert themselves together against the overwhelming power of Microsoft.

That is why the new programming language was designed as a little brother to Java. It was given the trendy name JavaScript, and Netscape thus had a duo on the web similar to Microsoft's Visual Basic and Visual C++ on the desktop. These choices were reflected in the design of JavaScript. If you only want to develop small applications, it is important that this can be done quickly and easily. Performance is of secondary importance for small scripts. The course set back then later turned out to be a stumbling block in terms of performance.

Easy handling through interpreter and dynamic typing

Two essential factors in JavaScript make getting started relatively easy: the abandonment of a compiler in favor of an interpreter, and dynamic typing. In a nutshell, the interpreter executes the source code directly at runtime. That means you need neither a compiler nor the setup of the associated tooling. You can therefore deliver an application directly as source code, which is an immense simplification.

Dynamic typing means that a type is only determined at runtime. Omitting the type information in the source code leads to shorter code, and developers can, for example, use several types such as String or Number for the same variable or function parameter. However, dynamic typing and interpreters do not only have advantages: both cost performance.

A compiler generates machine code that can be executed directly without an interpreter. Machine code executes significantly faster than interpreted code. In addition, the compiler can analyze and optimize the source code (including inline caches). It is similar with dynamic types. At runtime, an interpreter has to check again and again which type is actually assigned to a variable, for example. Since the type can change during runtime, the check has to be repeated, which of course costs valuable time. With static typing, this step is omitted, since the types are fixed and cannot change.

Compiler and static typing combined offer another advantage. Certain categories of errors, such as incorrect syntax or the use of non-existent properties or functions (for example due to typos), cause compilation to abort. Since this happens on the developer's machine, such code never reaches the user.
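The cost of dynamic typing can be made visible with a few lines of JavaScript. The following is only a minimal sketch; the function name `add` and the sample values are chosen for illustration and do not come from the article's benchmark.

```javascript
// `add` is an illustrative name. The same function is called with
// different operand types, so the type check cannot be done once up front.
function add(a, b) {
  // For `+`, the engine has to inspect the operands on every call:
  // two numbers are added, but if one operand is a string, they are concatenated.
  return a + b;
}

const sum = add(1, 2);     // number + number
const text = add("1", 2);  // string + number

// A variable may also change its type during its lifetime.
let value = "Solo";
const typeBefore = typeof value; // "string"
value = 42;
const typeAfter = typeof value;  // "number"

console.log(sum, text, typeBefore, typeAfter);
```

In a statically typed language, the second call and the reassignment would be rejected at compile time, so no check would be needed at runtime.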

JavaScript takes over the role of Java

So JavaScript was designed as an easily accessible language, which had a negative impact on performance. Unfortunately, further development did not go according to plan. Java did not survive as a "web language", and JavaScript remained. Due to the increasing penetration of the Internet, the browser was chosen more and more frequently as the preferred platform for applications instead of the desktop. JavaScript, as the only option, was now also used for large applications. These applications did not perform well, and the search for answers in the area of engines began.

What is an engine?

The engine executes the program code. It is therefore an integral part of every browser and of course also runs in Node.js (server-side) or Electron (desktop-side). Every browser manufacturer has its own engine. It should come as no surprise that a browser that executes JavaScript quickly also comes out on top with users. In this way, browser developers can gain a decisive advantage in the hotly contested browser market with a fast engine.

Four major manufacturers currently dominate the market: Apple (Safari), Google (Chrome), Microsoft (Edge) and Mozilla (Firefox). Each manufacturer has given its own engine a "hip" name. The best known is certainly V8, Chrome's engine. The rest are JavaScriptCore (Safari), Chakra (Edge) and SpiderMonkey (Firefox). However, Microsoft announced at the beginning of December 2018 that it would switch to Chromium, and thus to Google's V8 engine, in the future. A first preview is planned for the beginning of 2019. Then there would only be three.

V8 has a special place. It was not only the first "modern engine" that brought JavaScript to the performance level required for large applications, but it is also the standard engine in Node.js and Electron.

Modern engines with compiler and static typing

The various engines work very differently in detail. Nonetheless, they are based on the same principles. The engine tries to make up after the fact for the two "shortcomings" of the programming language discussed above. During runtime, it integrates a compiler and static types, or tries to simulate their functions.

From a technical point of view, this kind of retrofitting during runtime is a challenge, comparable to the idea of rewriting the software of an aircraft during flight. Why does the engine have to do that? Why can't you make changes to JavaScript directly instead?

Static typing cannot simply be added to a programming language that is built on dynamic typing. Changing the language afterwards would make most existing applications incompatible. That option is therefore ruled out. The same applies to turning JavaScript into a compiled language. One would rather have to design a new programming language from scratch: greetings from WebAssembly.

The JIT compiler

First of all, the question arises of how a compiler could be integrated retroactively: should the program start quickly or execute quickly?

The first problem is that the compiler needs some time to compile an application. The interpreter, on the other hand, only has to parse the source code and can immediately start executing it line by line. The compiler takes much longer. As a result, users would have to wait for the compiler to finish when opening a website. For a complete Angular application with all sorts of additional libraries, for example, that takes a few seconds.

Interpreter and compiler in one

That's why engine developers help themselves with a little trick. They combine the best of both worlds: the fast program start of an interpreter and the fast execution of the machine code generated by the compiler. From a bird's eye view, the interpreter executes each application at startup. For users, this means that they can use the application quickly. At the same time, the engine compiles and optimizes the source code. Users do not have to wait until the entire program is available in compiled form. Rather, the engine uses those parts that are already available as machine code.

The strategies differ depending on the engine. It can be, for example, that an engine first creates a profile of the application and generates program code optimized for the profile. Or it recompiles the compiled code over and over again over several phases and becomes faster after each compilation.

How does the engine decide which parts of the code to optimize? While the interpreter is busy doing its job, the engine is watching it. Pieces of code that it executes frequently are candidates for the compiler. These parts are called "hot code" or the "hot path". As a result, the compiler only intervenes when necessary. This is why the procedure is also known as "just-in-time compilation".
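The idea of watching for hot code can be modelled in a few lines. This is a toy sketch, not how V8 or any other engine is implemented: `HOT_THRESHOLD` and `makeTiered` are hypothetical names, the threshold of 3 is arbitrary, and the two arrow functions merely stand in for "interpreted" and "compiled" versions of the same code.

```javascript
// Hypothetical threshold: after this many calls a function counts as "hot".
const HOT_THRESHOLD = 3;

// Wraps a slow and a fast version of the same function and switches
// from the slow to the fast one once the call counter exceeds the threshold.
function makeTiered(slowVersion, fastVersion) {
  let calls = 0;
  const state = { tier: "interpreted" };

  const run = (...args) => {
    calls += 1;
    if (state.tier === "interpreted" && calls > HOT_THRESHOLD) {
      state.tier = "compiled"; // the code became hot
    }
    return state.tier === "compiled"
      ? fastVersion(...args)
      : slowVersion(...args);
  };

  return { run, state };
}

// Both versions return the same result; only the tier label differs.
const tieredGetName = makeTiered(
  (person) => person["lastname"], // stands in for the interpreter
  (person) => person.lastname     // stands in for generated machine code
);

const solo = { firstname: "Han", lastname: "Solo" };
const tiers = [];
for (let i = 0; i < 5; i++) {
  tieredGetName.run(solo);
  tiers.push(tieredGetName.state.tier);
}
console.log(tiers);
```

In this model the first three calls run in the "interpreted" tier and the fourth and fifth in the "compiled" tier. Real engines base the decision on much richer profiling data than a simple call counter.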

It may well happen that the same source code is both interpreted and executed in different versions in the form of machine code during the runtime of a web application.

Static types through inline caches

After the first "weak point", the second one follows: dynamic typing. Here, engines use a technique called inline caches. The basic assumption behind inline caches is that the same types are passed every time, for example as arguments to function calls. The engine determines whether this is the case while creating the profile or while observing the program flow.

If it turns out that the same types are actually present over and over again, the engine rewrites the code so that the property or function can be accessed without lengthy type checking. As with static typing, the memory address is known and can be called up directly.

If things don't go according to plan: de-optimization

The engine must protect itself in the event that a different type suddenly occurs. That is why it includes a type check before access. If it is not a type that exists in the inline cache, a de-optimization takes place and it interprets the program part until it is compiled again.
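The combination of fast path, type check and fallback can be sketched as a toy model. It assumes, as a simplification, that an object's "type" is the ordered list of its property names. The names `shapeOf`, `makeCachedGetter` and `stats` are hypothetical, and the model only illustrates the control flow; it is of course slower than a plain property access.

```javascript
// Simplified stand-in for the engine's internal notion of a type.
const shapeOf = (obj) => Object.keys(obj).join(",");

// Toy monomorphic inline cache for reading one property.
function makeCachedGetter(property) {
  let cachedShape = null; // the single type this cache knows
  let cachedIndex = -1;   // position of the property within that type
  const stats = { hits: 0, misses: 0 };

  const get = (obj) => {
    // Guard: the fast path is only valid for the cached type.
    if (cachedShape !== null && shapeOf(obj) === cachedShape) {
      stats.hits += 1;
      return Object.values(obj)[cachedIndex]; // known position, no lookup by name
    }
    // Fallback ("de-optimization"): generic lookup, then re-learn the type.
    stats.misses += 1;
    cachedShape = shapeOf(obj);
    cachedIndex = Object.keys(obj).indexOf(property);
    return obj[property];
  };

  return { get, stats };
}

const lastnameOf = makeCachedGetter("lastname");
const han = { firstname: "Han", lastname: "Solo" };
const leia = { firstname: "Leia", lastname: "Organa" };
const yoda = { lastname: "Yoda" };

const names = [han, leia, yoda].map((person) => lastnameOf.get(person));
console.log(names, lastnameOf.stats);
```

Here the first call fills the cache, the second call hits it because `leia` has the same properties in the same order, and `yoda` fails the guard and forces the fallback.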

When does the engine consider two objects to be of the same type? In V8, the rule applies that this is only the case if exactly the same properties are available and they are defined in the same order. The following declarations therefore generate two different types:

const han = {firstname: "Han", lastname: "Solo"};
const luke = {lastname: "Skywalker", firstname: "Luke"};

Note that the same properties are defined, but not in the same order. The reason is simply that the properties are arranged differently in the internal memory structure and are therefore located at different memory addresses.
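The engine's internal type is not accessible from JavaScript, but the rule can be approximated. The sketch below assumes that the ordered list of property names is a reasonable stand-in; `shapeKey` is a hypothetical helper, not an engine API.

```javascript
// Hypothetical helper: Object.keys returns string keys in insertion order,
// so the joined list differs when the same properties are defined in a different order.
const shapeKey = (obj) => Object.keys(obj).join(",");

const han = { firstname: "Han", lastname: "Solo" };
const luke = { lastname: "Skywalker", firstname: "Luke" };
const leia = { firstname: "Leia", lastname: "Organa" };

const sameAsLeia = shapeKey(han) === shapeKey(leia); // true: same properties, same order
const sameAsLuke = shapeKey(han) === shapeKey(luke); // false: same properties, different order

console.log(shapeKey(han), shapeKey(luke), sameAsLeia, sameAsLuke);
```

Such a helper can be handy during debugging to spot objects that are built in different ways, although it says nothing definitive about what a particular engine does internally.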

Inline caches in several variants

Unfortunately, in JavaScript it is rare that really only a single type occurs. So-called "duck typing" is particularly widespread in the context of functional programming. What matters is not that the type is always the same, but only that a certain method or property is available, so that a function can be applied to different types. In the example, this means that for the function getName the only thing that matters is that the arguments passed have a property called lastname. It doesn't matter whether the objects otherwise differ from one another.

Unfortunately, this leads to different types. Fortunately, inline caches can store several types. One speaks of a polymorphic inline cache, in contrast to a monomorphic one, which really only stores one type.

With Google's V8 engine, the polymorphic cache is only available for up to four different types. If there are more than four, it uses a megamorphic inline cache. However, that variant stores the information about the types in a global cache, which makes it much slower than the monomorphic or polymorphic variants.

Inline caches !== types

It is important to note that inline caches do not really introduce static typing. Rather, they try to achieve the speed advantages of static typing within a dynamically typed language. For the sake of better understanding, the author has always used the term "type". In fact, these are not really types in the classic sense, like the JavaScript data types. The technical terms commonly used are "object shapes" or "hidden classes", and each engine has its own name for them. Nevertheless, the principle of a type should be familiar to the reader and serve to better understand how inline caches work.
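The three states can be illustrated with a small classification function. This is a sketch under assumptions: the limit of four comes from the paragraph above, the ordered property names stand in for the engine's internal type, and `classifyCallSite` is a hypothetical name.

```javascript
// Limit taken from the text: up to four types are still polymorphic in V8.
const POLYMORPHIC_LIMIT = 4;

// Counts the distinct "types" (ordered property lists) that reach a call site
// and names the kind of inline cache that would result in this simplified model.
function classifyCallSite(objects) {
  const shapes = new Set(objects.map((obj) => Object.keys(obj).join(",")));
  if (shapes.size <= 1) return "monomorphic";
  if (shapes.size <= POLYMORPHIC_LIMIT) return "polymorphic";
  return "megamorphic";
}

// Objects from the first listing: all share firstname, lastname.
const uniform = [
  { firstname: "Han", lastname: "Solo" },
  { firstname: "Luke", lastname: "Skywalker" },
  { firstname: "", lastname: "Yoda" }
];

// Objects from the second listing: five different property lists.
const mixed = [
  { firstname: "Han", lastname: "Solo", spacecraft: "Falcon" },
  { firstname: "Luke", lastname: "Skywalker", job: "Jedi" },
  { firstname: "Leia", lastname: "Organa", gender: "female" },
  { firstname: "Obi-Wan", lastname: "Kenobi", retired: true },
  { lastname: "Yoda" }
];

const uniformKind = classifyCallSite(uniform);
const mixedKind = classifyCallSite(mixed);
console.log(uniformKind, mixedKind);
```

In this model the objects of the first listing yield a monomorphic call site, while the five different property lists of the second listing exceed the limit and yield a megamorphic one.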

Round two of the code sample

It should now be clear that the code example created five different "types" or object shapes by adding more properties. It switched from a monomorphic to a megamorphic inline cache. How can developers still get a monomorphic inline cache? The solution to this is relatively simple.

You have to make sure that all objects have the same properties and that they are always defined in the same order. The classes introduced with ECMAScript 6 help to achieve this. In the constructor, developers can define all properties and use default values to ensure that every property really is initialized with the correct type:

(() => {
  class Person {
    // Default values guarantee that every instance gets all properties.
    constructor({
      firstname = "",
      lastname = "",
      spaceship = "",
      job = "",
      gender = "",
      retired = false
    } = {}) {
      // Properties are always assigned in the same order.
      Object.assign(this, {
        firstname,
        lastname,
        spaceship,
        job,
        gender,
        retired
      });
    }
  }
  const han = new Person({
    firstname: "Han",
    lastname: "Solo",
    spaceship: "Falcon"
  });
  const luke = new Person({
    firstname: "Luke",
    lastname: "Skywalker",
    job: "Jedi"
  });
  const leia = new Person({
    firstname: "Leia",
    lastname: "Organa",
    gender: "female"
  });
  const obi = new Person({
    firstname: "Obi-Wan",
    lastname: "Kenobi",
    retired: true
  });
  const yoda = new Person({lastname: "Yoda"});
  const people = [han, luke, leia, obi, yoda, luke, leia, obi];
  const getName = person => person.lastname;
  console.time("engine");
  for (var i = 0; i < 1000 * 1000 * 1000; i++) {
    getName(people[i & 7]);
  }
  console.timeEnd("engine");
})();

1.319 seconds (Dell XPS 15 (9570) with i7-8750H, Windows 10 Pro, Chrome 71)

If you now run the script, it will run just as quickly as in the first example.

Conclusion

JavaScript is currently the most widely used programming language. In addition to the simplicity of the language and the ubiquity of the Internet, a major reason is that the browser manufacturers managed to increase the execution speed to a level comparable to compiled languages such as C#.

This is all the more remarkable since JavaScript was not designed for performance. Dynamic typing and interpretation enable a quick introduction to the language, but slow down the execution speed considerably. For the sake of compatibility, the language itself cannot be changed in these respects, which is why engines try to eliminate the two weaknesses directly during execution.

The inline caches and JIT compilation techniques presented in the article represent essential improvement measures. Modern engines also have a whole range of other optimization techniques. Using TypeScript or similar tools such as Flow is helpful. They "educate" or "urge" the developer to use static types. They cannot force it, so in the end it is the responsibility of the developer.

[Update: 01/14/2018: Mention of TypeScript added in the conclusion, times added to the code examples.] (bbo)

Rainer Hahnekamp
is an independent software developer. He is always on the lookout for new tools that can help improve code quality.
