In the past, Node.js was often not an option for applications that require CPU-intensive computation. This is because Node.js runs JavaScript on a single-threaded event loop: its non-blocking, event-driven architecture suits I/O well, but a long computation stalls everything else. With the advent of worker threads in Node.js, it can now be used for CPU-intensive applications as well. In this article, we will look at some use cases of worker threads in a Node.js application.
Before continuing with the use cases of worker threads in Node.js, let’s quickly compare I/O-bound and CPU-bound programs in Node:
I/O bound in Node.js:
A program is said to be bound by a resource if an increase in that resource leads to improved performance of the program. An increase in the speed of the I/O subsystem (such as memory, hard disk or network connection) increases the performance of an I/O-bound program. This is typical of Node.js applications, as the event loop often spends time waiting for network, filesystem and perhaps database I/O to complete before continuing with code execution or returning a response. A faster hard disk or network connection would therefore usually improve the overall performance of such an application.
CPU bound in Node.js:
A program is CPU bound if its processing time decreases as CPU speed increases. For instance, a program that calculates the hash of a file will run faster on a 2.2 GHz processor than on a 1.2 GHz one.
For CPU-bound applications, the majority of the time is spent using the CPU to do calculations. In Node.js, CPU-bound work blocks the event loop and causes other requests to be held up.
Node.js Important Rule:
Node runs in a single-threaded event loop and uses non-blocking I/O calls, which allows it to support tens of thousands of concurrent connections, for example when serving many incoming HTTP requests. This works well and is fast as long as the work associated with each client at any given time is small. But if you perform CPU-intensive calculations, your concurrent Node.js server will come to a screeching halt. Other incoming requests will have to wait, because only one request is being served at a time.
Certain strategies have been used to cope with CPU-intensive tasks in Node.js: running multiple processes (for example with the cluster API) so that the CPU is used optimally, and spawning child processes to handle blocking tasks.
These strategies are advantageous because the event loop is not blocked. They also keep processes separate, so if something goes wrong in one process, it does not affect the others. However, since child processes run in isolation, they cannot share memory with each other, and data is passed between them as serialized messages (JSON by default), which requires serialization and deserialization.
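The effect is easy to observe. The following minimal sketch, assuming a hypothetical busyWait helper that simply spins the CPU, schedules a 10 ms timer and then blocks the main thread for about 200 ms. The timer cannot fire until the blocking work is done, which is what happens to every other request while a CPU-heavy one is being served.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper: burn CPU synchronously for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // Spin: nothing else can run on this thread in the meantime.
  }
}

// Resolves with the observed delay of a timer scheduled for `timerMs`,
// while the main thread is blocked for `blockMs` right after scheduling it.
function measureTimerDelay(timerMs, blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = performance.now();
    setTimeout(() => resolve(performance.now() - scheduledAt), timerMs);
    busyWait(blockMs); // blocks the event loop before the timer can fire
  });
}

measureTimerDelay(10, 200).then((delay) => {
  console.log(`timer asked for 10 ms, fired after ~${Math.round(delay)} ms`);
});
```

The exact number printed varies from run to run, but it will be at least the blocking time rather than the 10 ms that was requested.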
The best solution for CPU-intensive computation in Node.js is to run multiple Node.js instances inside the same process, where memory can be shared (for example through a SharedArrayBuffer) and data does not have to be passed as JSON. This is exactly what worker threads do in Node.js.
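As a minimal sketch of the shared-memory point, the example below lets a worker write directly into a SharedArrayBuffer that the main thread allocated. The helper name fillSquaresInWorker and the inline worker source (loaded with the eval option so that everything fits in one file) are assumptions made for illustration.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file.
const squaresWorkerSource = `
  const { workerData, parentPort } = require('worker_threads');
  // A view over the same memory the main thread allocated; no copy is made.
  const cells = new Int32Array(workerData.shared);
  for (let i = 0; i < cells.length; i++) {
    Atomics.store(cells, i, i * i);
  }
  parentPort.postMessage('done');
`;

// Hypothetical helper: let a worker fill a shared buffer with squares.
function fillSquaresInWorker(length) {
  return new Promise((resolve, reject) => {
    const shared = new SharedArrayBuffer(length * Int32Array.BYTES_PER_ELEMENT);
    const worker = new Worker(squaresWorkerSource, {
      eval: true,
      workerData: { shared },
    });
    // The message only signals completion; the data itself is read
    // straight out of the shared buffer.
    worker.once('message', () => resolve(Array.from(new Int32Array(shared))));
    worker.once('error', reject);
  });
}

fillSquaresInWorker(5).then((squares) => console.log(squares));
// prints [ 0, 1, 4, 9, 16 ]
```

Only the small 'done' message crosses the thread boundary here; the results never have to be serialized.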
Real-world CPU Tasks with Threads:
We will look at a few use cases of worker threads in a Node.js application. The focus is on where worker threads are useful rather than on a full tour of the worker_threads API.
Image Resizing:
Let’s say you are building an application that allows users to upload a profile image, and you then generate multiple sizes of that image (e.g. 100 x 100 and 64 x 64) for various uses within the application. Resizing an image is CPU intensive, and producing two different sizes increases the time the CPU spends on it. The resizing task can be handed off to a separate thread while the main thread handles other lightweight tasks.
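The following sketch shows the shape of such a hand-off. It makes several assumptions: the image is already decoded into raw RGBA pixels, a naive nearest-neighbour resize stands in for a real image library, and the helper name resizeInWorker is hypothetical.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file.
const resizeWorkerSource = `
  const { parentPort, workerData } = require('worker_threads');

  // Naive nearest-neighbour resize of raw RGBA pixels (illustration only).
  function resize(src, srcW, srcH, dstW, dstH) {
    const dst = new Uint8Array(dstW * dstH * 4);
    for (let y = 0; y < dstH; y++) {
      const srcY = Math.floor((y * srcH) / dstH);
      for (let x = 0; x < dstW; x++) {
        const srcX = Math.floor((x * srcW) / dstW);
        const from = (srcY * srcW + srcX) * 4;
        dst.set(src.subarray(from, from + 4), (y * dstW + x) * 4);
      }
    }
    return dst;
  }

  const { pixels, width, height, sizes } = workerData;
  const results = sizes.map(([w, h]) => ({
    width: w,
    height: h,
    pixels: resize(pixels, width, height, w, h),
  }));
  parentPort.postMessage(results);
`;

// Hypothetical helper: produce every requested size in one worker.
function resizeInWorker(pixels, width, height, sizes) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(resizeWorkerSource, {
      eval: true,
      workerData: { pixels, width, height, sizes },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Usage: a 200 x 200 opaque red image resized to the two sizes from the text.
const source = new Uint8Array(200 * 200 * 4);
for (let i = 0; i < source.length; i += 4) {
  source[i] = 255;     // R
  source[i + 3] = 255; // A
}
resizeInWorker(source, 200, 200, [[100, 100], [64, 64]]).then((images) => {
  for (const img of images) {
    console.log(`${img.width} x ${img.height}: ${img.pixels.length} bytes`);
  }
});
```

In this sketch the pixel data is copied to and from the worker. For large images, transferring the underlying ArrayBuffer through a transfer list would avoid that copy.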
Video Compression:
Video compression is another CPU-intensive task that can be handed off to a worker thread. Most video streaming applications keep multiple variations of a single video and show users the one that suits their network connection. Worker threads can do the job of compressing the video to these various sizes.
fluent-ffmpeg is a commonly used module for video processing in Node.js applications. It depends on ffmpeg, which is a complete, cross-platform solution to record, convert and stream audio and video.
Because of the overhead of creating a worker each time you need a new thread, it is recommended that you create a pool of workers and reuse them, as opposed to creating workers on the fly. To create a worker pool, we can use an NPM module, node-worker-threads-pool, which creates a pool of worker threads using Node’s worker_threads module.
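To make the idea of a pool concrete, here is a minimal hand-rolled sketch built only on worker_threads. It is not the API of node-worker-threads-pool. The class name TinyPool and the naive Fibonacci task are assumptions chosen for illustration. A fixed set of workers is created once, and queued tasks are handed to whichever worker is idle.

```javascript
const { Worker } = require('worker_threads');

// Each pooled worker stays alive and answers one message per task.
// The CPU-heavy stand-in here is a naive Fibonacci.
const poolWorkerSource = `
  const { parentPort } = require('worker_threads');
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.on('message', (n) => parentPort.postMessage(fib(n)));
`;

class TinyPool {
  constructor(size) {
    this.idle = [];
    this.queue = [];
    for (let i = 0; i < size; i++) {
      this.idle.push(new Worker(poolWorkerSource, { eval: true }));
    }
  }

  // Queue a task; the promise resolves with the worker's reply.
  exec(input) {
    return new Promise((resolve, reject) => {
      this.queue.push({ input, resolve, reject });
      this.drain();
    });
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  drain() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { input, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // worker is free again
        resolve(result);
        this.drain();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        reject(err); // the failed worker is dropped, not returned to the pool
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(input);
    }
  }

  // Terminate all workers; call this once no tasks are in flight.
  destroy() {
    return Promise.all(this.idle.map((w) => w.terminate()));
  }
}

const pool = new TinyPool(2);
Promise.all([20, 21, 22, 23].map((n) => pool.exec(n)))
  .then((results) => console.log(results)) // [ 6765, 10946, 17711, 28657 ]
  .finally(() => pool.destroy());
```

Four tasks run on two workers here, so no thread is created per task. A production pool would also need to replace crashed workers and handle timeouts, which is the kind of work a dedicated module takes care of.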
File integrity:
Suppose you have to store your files on cloud storage, and you want to be sure that the stored files have not been tampered with by any third party. You can do this by computing the hash of each file using a cryptographic hash algorithm. You save these hashes and their storage locations in your database. When you download a file, you compute its hash again and check that it matches the stored one. Computing the hash is CPU intensive and can be done in a worker thread.
A common pattern is to keep the worker thread code and the main thread code in the same file. The isMainThread value exported by the worker_threads module tells us which thread is currently running, so each thread can execute the code appropriate for it. The main thread creates a new worker and listens for events from it, while the worker thread calculates the hash of a stream of data using the Node.js crypto module.