Parallelizing tasks in Node.js and Java

Recently we built a deep learning model that predicts whether a user who visits our sites will buy a car and, if so, which model. It gives good results on test data, so we decided to test it for real. For that, we need to run the model for all active leads (users) and collect the predictions.

We know that the application serving our deep learning model can handle 3 requests in parallel. We don't want to deploy multiple servers, because this is a one-time task and each response comes back in a couple of milliseconds. We want to run the model for 1 lakh (100,000) leads. So we need a program that hits the REST API of that application while keeping at most 3 requests active at any time.
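Here is a minimal Java sketch of that constraint, under stated assumptions: the class name BoundedRunner, the runAll helper, and the sleep that stands in for the REST call are all hypothetical and not part of our actual application. It runs one task per lead on a fixed pool of 3 threads and reports the highest number of tasks that were ever active together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class BoundedRunner {

    // Runs the task once per item with at most maxParallel tasks active,
    // and returns the peak number of tasks observed running together.
    static <T> int runAll(List<T> items, int maxParallel, Consumer<T> task)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (T item : items) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    task.accept(item);
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();                          // no new tasks; queued ones still run
        pool.awaitTermination(1, TimeUnit.HOURS); // wait for the queue to drain
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> leadIds = new ArrayList<>();
        for (int i = 1; i <= 20; i++) {
            leadIds.add(i);
        }
        // ASSUMPTION: the sleep is a stand-in for the real REST call.
        int peak = runAll(leadIds, 3, id -> {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("peak concurrent calls: " + peak);
    }
}
```

The pool size is the only knob here: tasks beyond the limit wait in the executor's queue until a thread frees up.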

I run into this problem a lot: I want to perform tasks in parallel, but in a controlled way. Let me show the code that gets this kind of job done in Node.js and Java.

As an example, I want to fetch 10 images and store them on my hard disk. For this task, I am going to use the Adorable Avatars API.

In Node.js

To do this in Node.js you need the async and request packages. To install them, run the following commands.

npm install async
npm install request

Let me explain the code line by line.

1) Imports for the required packages. We need the request package for hitting the API and getting an avatar in the response. We need the built-in fs module for writing that avatar to the hard disk. We need the async package for parallelizing the download and write tasks.

2) A variable that holds the Avatar file names.

3) A function that takes an Avatar name and a callback function as arguments. It requests the Avatar's URL and streams the response to the hard disk. Once the write finishes, or if an error occurs, the callback is called.

4) The async function, which takes four arguments: the array of Avatar file names, the number of tasks that may run in parallel, the function that does the business logic, and the callback that is called after all tasks are done.

async.forEachLimit (an alias of async.eachLimit) runs tasks in parallel but in a controlled way. We want at most 2 tasks running at a time, so we pass 2 here.

In Java

Let me explain the code line by line.

1) Imports required for the program: Arrays for creating the List; Executors and ThreadPoolExecutor for creating the thread pool; URL for the avatar URL; InputStream, Files, Paths and StandardCopyOption for writing the avatar image from the response to the hard disk.

2) Create a thread pool executor with size 2, so at most 2 threads are active at a time.

3) A variable that holds the Avatar file names.

4) Iterate over the Avatar file names and submit one task per name to the executor. Each task internally calls the getAndStoreAvatar function to do the work.

5) executor.shutdown() stops the thread pool executor from accepting new tasks and lets it shut down once the submitted tasks have finished.

6) In the getAndStoreAvatar function, we create the URL for hitting the API, then open the stream and copy it into a file, replacing the file if it already exists.

7) The catch block handles checked exceptions such as MalformedURLException and IOException. If something goes wrong, we just catch the exception and print the stack trace.
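The steps above can be sketched roughly as follows. This is a minimal reconstruction, not the original listing: the class name AvatarDownloader, the downloadAll helper, the placeholder BASE_URL, and the avatar file names are assumptions made for illustration. The awaitTermination call is an addition so that the caller waits for the downloads to finish.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class AvatarDownloader {

    // ASSUMPTION: placeholder base URL, not the real Adorable Avatars endpoint.
    static final String BASE_URL = "https://example.com/avatars/";

    public static void main(String[] args) throws InterruptedException {
        // 3) Avatar file names (hypothetical values).
        List<String> avatars = Arrays.asList("one.png", "two.png", "three.png", "four.png");
        downloadAll(BASE_URL, avatars, Paths.get("."), 2);
    }

    static void downloadAll(String baseUrl, List<String> names, Path dir, int poolSize)
            throws InterruptedException {
        // 2) Fixed-size pool: at most poolSize downloads run at a time.
        //    The cast relies on newFixedThreadPool returning a ThreadPoolExecutor,
        //    which it does in OpenJDK.
        ThreadPoolExecutor executor =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(poolSize);

        // 4) One task per avatar; extra tasks wait in the executor's queue.
        for (String name : names) {
            executor.execute(() -> getAndStoreAvatar(baseUrl, name, dir));
        }

        // 5) Accept no new tasks, then wait for the queued ones to finish.
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
    }

    static void getAndStoreAvatar(String baseUrl, String name, Path dir) {
        // 6) Open the stream and copy it to disk, replacing any existing file.
        //    new URL(String) is deprecated since Java 20;
        //    URI.create(...).toURL() is the newer form.
        try (InputStream in = new URL(baseUrl + name).openStream()) {
            Files.copy(in, dir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // 7) MalformedURLException is a subclass of IOException.
            e.printStackTrace();
        }
    }
}
```

Passing the base URL and target directory as parameters is a choice made for this sketch, so that the same method can be pointed at a local file: URL when trying it out without network access.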

Peace. Happy Coding.
