The main threat of a single thread

Apr 10, 2020 • 12 min read • javascript nodejs

When doing interviews with Node.js candidates at The Cocktail, one of the questions I usually ask is the following:

Imagine you have a Node server started with node index.js. In this file, we have a route that runs some code which isn’t I/O and takes 1 minute to complete. Something like the following:

function longTask() {
  // Imaging a task like processing each pixel of a very large image
  // or sorting a large array
  for (let i = 0; i <= 10_000_000_000; i++) {}
}

app.get('/foo', () => {
  longTask()
})

What is the problem we face when 3 concurrent users try to access this route? How can we solve it?

This question helps me to understand how well they know how JavaScript and Node.js work, in addition to how much knowledge they have working with distributed systems, message queues and background tasks.

I found that some candidates have a common answer:

JavaScript is asynchronous, so there is nothing to worry about.

This is kind of true for I/O operations (calling a REST API, a database, reading a file…), but not for CPU-bound tasks, and before getting into this I’d like to review some concepts.

Some background

Language specification != implementation

First of all, I think it’s important to note that a language specification is not the same as the language implementation.

When you create a programming language, you use a language specification to know what to implement, but how you implement the compiling or the runtime of that language is up to you.

The language specification, in the case of JavaScript, is the ECMAScript® Language Specification.

Then, there are several engines that implement JavaScript, some of them are Rhino, JavaScriptCore, SpiderMonkey and V8 (the engine Node.js uses).

You may have heard something about the Event Loop – if not, I really recommend you to watch this talk or this one –, but maybe, what you don’t know is that the Event Loop is not part of the JavaScript specification. In fact, the Event Loop belongs to the Web application APIs of the HTML specification.

The ECMAScript specification talks about execution contexts, agents and the executing thread of those agents, but nothing about an Event Loop.

Finally, Node.js is a JavaScript runtime (using V8 underneath) which wasn’t meant to run in within a browser, so it can avoid having an Event Loop. However, using an Event Loop was the primary motivation to create Node.js, in that talk you can see that each instance of Node.js has only one thread – usually in production applications you’d use something like PM2 in order to have more than one process of Node.js in the same machine.

With that said, if we were to create a JavaScript runtime that doesn’t have an Event Loop and runs in multiple threads, it would be a totally valid JavaScript runtime.

Concurrent vs parallel execution

Now that we know that Node.js has only one thread and an Event Loop, we need to know the difference between a concurrent and parallel execution of tasks. The easiest way to understand this is with an example:

Imagine that we are in a kitchen and we need to chop some onions and some potatoes. So:

In a concurrent execution, I chop some onions, then some potatoes, then again some onions, then some potatoes, and so forth. In no particular order.
In a parallel execution, I chop all the onions and another person chops all the potatoes, at the same time.

In summary, for a concurrent execution to be parallel, it needs more than one unit of processing – or chopping in this example 😂. So, if our runtime only uses one thread to run our code, it can never be parallel.

Run-to-completion scheduling

In a concurrent execution of tasks, we can have two different schemes of how to schedule tasks: non-preemptive (or run-to-completion) scheduling or preemetive scheduling.

Using the example above, the difference will be the following:

With a run-to-completion scheduling, we chop all the onions and then we can chop all the potatoes.
With a preemptive scheduling, we start chopping onions, after 5 minutes, we start chopping potatoes, then after 5 minutes we go back to chop onions, and so forth.

Both have benefits and downsides:

If I have too many onions, with run-to-completion my colleagues will have to wait too long to have potatoes ready to cook.
But with preemptive scheduling, I will have to worry about using different tables and knives to chop each item (no shared memory between processes) or cleaning them each time and being extremely careful.

Erlang, for instance, uses preemptive scheduling, and as you may guess from the question, Node.js uses run-to-completion scheduling.

We could achieve some kind of preemptive scheduling in Node.js using task partitioning, but that would make your code a lot harder to read.

Synchronous vs asynchronous

One of the features of JavaScript is that functions are first-class citizens of the language, meaning that we can assign functions to variables or pass them as arguments to another function, for example:

function hello() {
  return 'Hello'
}

function print(callback) {
  console.log(callback())
}

print(hello)
print(() => 'world')

const anotherFunction = hello

The previous code, although it is using callbacks, everything happens within a synchronous execution, each line is executed in order. To make this code asynchronous, we need to delay the execution of some lines.

In Node.js (and in browsers), we can use the setTimeout API to enqueue a task to be executed after at least some amount of time. When we call this function, the runtime will enqueue this task to the Event Loop, and once the current execution finishes and the time since it was queued is equal or greater than the time specified, the queued task will be run.

For example, this code is asynchronous:

function print() {
  console.log('world')
}

setTimeout(print, 1000);

console.log('Hello')

// Hello # This is printed immediately
// world # This is printed after the last line
//         is ran and at least 1 second
//         has passed since print was called.

However, the main execution and the enqueued task are run in the same single thread.

I/O tasks

Bear in mind that, for input/output operations, asynchrony is achieved by offloading that operating system call (such as reading a file, perform TCP request, handling a socket connection and so on) to another process.

For example, in the following code:

function myCallback(err, data) {
  console.log(data)
}

fs.readFile('/my-file.txt', myCallback)

Reading the file can take hours to be performed, but Node.js will run this task in an OS process, outside the main JavaScript runtime thread and will continue the execution of the remaining code. Once the reading of the file is finished myCallback will be executed in the main thread.

This is similar in the browsers, when we perform an HTTP request, it is performed by the browser outside the main thread.

Recap

Node.js runs JavaScript concurrently in a single thread using an Event Loop and run-to-completion scheduling of tasks.

The question

After reviewing all of this, we can go back to the question:

Imagine you have a Node server started with node index.js. In this file, we have a route that runs some code which isn’t I/O and takes 1 minute to complete. Something like the following:

function longTask() {
  for (let i = 0; i <= 10_000_000_000; i++) {}
}

app.get('/foo', () => {
  longTask()
})

What is the problem we face when 3 concurrent users try to access this route?

The first user will block for 1 minute the start of the request of the next user, and so on.

We now know that app.get('/foo') is part of an I/O task, so it will be processed outside of the main thread. But, once a request arrives, the main thread will run () => longTask() and, as that function is a CPU expensive computation and there is no I/O involved, this function will block the main thread for 1 minute while it is being executed.

How can we solve it?

There are several answers here, some wrong and some valid. It’s worth reviewing them all.

Common pitfalls

As it is a callback it doesn’t block the main thread

We saw before this isn’t true.

We wrap it within a setTimeout

This doesn’t work, as we mention before, Node.js execution isn’t parallel, so this only delays the blocking for the future.

If we do something like this:

function longTask() {
  for (let i = 0; i <= 10_000_000_000; i++) {}
}

app.get('/foo', () => {
  setTimeout(longTask, 1000)
})

The users that arrive within one second of the first user, will have their request accepted. However, once the second has passed, the server will be blocked.

We make the route an async function

This doesn’t work either. Async functions are like Promises, when the function is called, it is enqueued in the Event Loop, but once it pops out of the Event Loop and it is executed, it runs in the main thread and it will block it anyway.

So, something like this won’t work:

function longTask() {
  for (let i = 0; i <= 10_000_000_000; i++) {}
}

app.get('/foo', async () => {
  longTask()
})

Possible solutions to our problem

To solve this problem, we need to offload the computation of longTask to the background. We can find different approaches here:

Transform the operation to an external I/O call

If we implement this function as a background job or another service, we can transform CPU-bound work to an I/O call.

For example, the main server will have the following code:

app.get('/foo', () => {
  console.log('Request received')
  // We don't await this call in order to avoid
  // keeping open user connections waiting for a result
  https.get('https://another-service/longTask')
  console.log('Request finished but task still running')
})

The server that will be blocked will be another-service (which we can scale as we want), and our main server can take as users as we want because https.get is an I/O task and it won’t block any incoming request.

Note that we can use a message queue to communicate between services or background jobs, it is up to you.

Use Node.js Child Process

This approach, with its benefits and downsides, is really well explained here.

Using Node.js Worker threads

Ok, fine, I’ve hidden some information about Node.js threads 🙈. Since v10.5.0 of Node.js we can use what is called Worker threads, in browsers we have something similar called WebWorkers.

Using a worker thread, we can offload the execution of this task to another thread (sharing memory space) within the same machine.

Final thoughts

Usually, in the majority of programs that we write, it’s hard to reach a situation where we accidentally block the main thread with a CPU-intensive task.

However, I hope this article helps you in the future finding possible bottlenecks or slowdowns in your web server or frontend application.