Optimizing speed (in Node.js)

This is an article about optimizing speed in Node.js. Every article I found on the subject was about optimizing memory usage or tracking down memory leaks, and none of those methods helped me resolve a bug that was causing my docxtemplater library to run in 5 minutes instead of 3 seconds. This article is about the tools I used and created to find the source of the issue.

The issue was created on GitHub. It explained that with inputs containing many elements in an array, the generation became very slow (taking more than 5 minutes to generate a single document). The creator of the issue sent me a document to reproduce it. At the same time, I tried to use `node --prof`, but it didn't give me more information than the fact that "LazyCompile" was taking more than 99% of the time. That is why I decided to create a speed test to find out how the speed of the library depended on the size of the input.
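To give an idea of what that speed test looked like, here is a minimal sketch; generate is only a placeholder for the actual docxtemplater rendering call, and the data shape is an example, not the document from the issue:

// A minimal speed-test harness: time one generation per input size.
// "generate" is a placeholder for the real document generation call.
function makeData(n) {
    var items = [];
    for (var i = 0; i < n; i++) {
        items.push({name: "item " + i, value: i});
    }
    return {items: items};
}

function benchmark(generate, sizes) {
    sizes.forEach(function (n) {
        var data = makeData(n);
        var start = Date.now();
        generate(data);
        console.log(n + "\t" + (Date.now() - start) + " ms");
    });
}

// benchmark(generate, [50, 100, 200, 350, 500, 750, 1000, 1500, 2000, 2500]);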

This is the data given by the program I created (n is the length of the array), before optimization:


n     time(ms)

50    94
100   160
200   473
350   1407
500   2931
750   6918
1000  11106
1250  17345
1500  23982
1750  32912
2000  41297
2250  51642
2500  66122

This is the graph of the data:

[Graph: raw timing data before optimization, time (ms) vs. n]

The graph doesn't look linear, so we do a linear regression on a log-log scale to find the order of the curve.

[Graph: log-log plot with linear regression, before optimization]

We find that ln(time) = 2*ln(n) - 4.5. This gives an order of 2, which means our algorithm is O(n^2). However, since the algorithm should loop over the data only once, it should be O(n).
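As a side note, the regression itself only takes a few lines of JavaScript. This small script (not part of the original test program) fits a line to ln(time) versus ln(n) for the measurements above; the slope of the fitted line estimates the order of growth:

// Least-squares fit of ln(time) = slope * ln(n) + intercept.
// A slope near 2 indicates O(n^2) growth, a slope near 1 indicates O(n).
function fitOrder(points) {
    var xs = points.map(function (p) { return Math.log(p[0]); });
    var ys = points.map(function (p) { return Math.log(p[1]); });
    var count = points.length;
    var meanX = xs.reduce(function (a, b) { return a + b; }, 0) / count;
    var meanY = ys.reduce(function (a, b) { return a + b; }, 0) / count;
    var num = 0;
    var den = 0;
    for (var i = 0; i < count; i++) {
        num += (xs[i] - meanX) * (ys[i] - meanY);
        den += (xs[i] - meanX) * (xs[i] - meanX);
    }
    var slope = num / den;
    return {slope: slope, intercept: meanY - slope * meanX};
}

console.log(fitOrder([
    [50, 94], [100, 160], [200, 473], [350, 1407], [500, 2931],
    [750, 6918], [1000, 11106], [1250, 17345], [1500, 23982],
    [1750, 32912], [2000, 41297], [2250, 51642], [2500, 66122]
]));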

It wasn't easy to find where the bottleneck was. I searched for tools for profiling Node.js programs, and finally found a pull request on the code coverage tool istanbul that adds a new "profile" command: Pull request to add profiling command to istanbul

I didn't like the output format of that code, so I modified it to write the data as JSON to a file, so that I could analyse it afterwards.

Here is my version of the profiler on GitHub.
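The core idea behind that kind of profiler is simple enough to sketch. The following is only an illustration of the approach, not the actual istanbul-based code: wrap each function of interest, accumulate call counts and elapsed milliseconds per function, and write everything to a JSON file when the process exits (the profile.json path is just an example):

// Sketch of the approach only, not the actual istanbul-based profiler.
var fs = require("fs");

var stats = {};

function profile(filename, name, fn) {
    var key = filename + "@" + name;
    stats[key] = stats[key] || {filename: filename, name: name, calls: 0, ms: 0};
    return function () {
        var start = process.hrtime();
        var result = fn.apply(this, arguments);
        var diff = process.hrtime(start);
        stats[key].calls++;
        stats[key].ms += diff[0] * 1000 + diff[1] / 1e6;
        return result;
    };
}

process.on("exit", function () {
    // "profile.json" is just an example output path
    fs.writeFileSync("profile.json", JSON.stringify(stats, null, 2));
});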

I then used a small jq script to analyse the JSON file:


jqq='. | to_entries | sort_by(.value.ms) | .[].value | .filename + "@" + .name + ":" + (.calls|tostring) + ":" + (.ms|tostring) + "ms"'

jq "$jqq" | sed "s:$(pwd)/prof/::g"

This gives us the function calls, ordered by the amount of time taken by each function. The result is the following:


scopeManager.js@functorIfNotInverted:403:55462.439ms
scopeManager.js@loopOverValue:11:55463.694ms
scopeManager.js@loopOver:11:55464.606ms
xmlTemplater.js@dashLoop:10:55787.118ms
xmlTemplater.js@calcSubXmlTemplater:414:69817.831ms
xmlTemplater.js@forLoop:11:69825.111ms
xmlTemplater.js@replaceLoopTag:11:69855.047ms
xmlTemplater.js@loopClose:1754:70233.641ms
xmlTemplater.js@handleCharacter:24270:70334.91ms
xmlTemplater.js@forEachCharacter:416:70490.983ms
xmlTemplater.js@compile:416:70513.029ms
xmlTemplater.js@render:416:70514.29ms
docUtils.js@DocUtils.clone:4086081:82243.956ms

We can see that we go into DocUtils.clone more than 4 million times, for about 80 seconds. We are actually cloning the array of data far too many times. The clone function is itself recursive, which is why there are so many function calls. If we replace the implementation of clone with JSON.parse(JSON.stringify(a)), we already get a huge speed gain. I ended up removing the clone entirely, since it wasn't useful enough to justify its cost.
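To make the change concrete, here is a sketch (not the actual DocUtils.clone implementation) of a recursive clone next to the JSON-based replacement:

// Not the actual DocUtils.clone code, just an illustration of why a
// recursive deep clone generates millions of calls on large inputs.
function clone(obj) {
    if (obj === null || typeof obj !== "object") {
        return obj;
    }
    var copy = Array.isArray(obj) ? [] : {};
    Object.keys(obj).forEach(function (key) {
        copy[key] = clone(obj[key]); // one extra call per nested value
    });
    return copy;
}

// Much cheaper for plain data (no functions, dates or circular references):
function fastClone(obj) {
    return JSON.parse(JSON.stringify(obj));
}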

After optimizing, we get the following data:


n     time(ms)
50    57
100   49
200   56
350   92
500   105
750   130
1000  178
1250  222
1500  257
1750  287
2000  326
2250  384
2500  430

[Graph: raw timing data after optimization, time (ms) vs. n]

The linear regression for this graph shows:

[Graph: log-log plot with linear regression, after optimization]

ln(time) = ln(n) - 1.8

This validates that the order is indeed 1, which is what we wanted.

Here is a graph showing the difference between the two versions, before and after the speed improvement. The difference is so large that the new version looks completely flat in comparison with the old one.

[Graph: comparison of the timings before and after the optimization]

I'm really happy that docxtemplater is now much faster, and I learnt something new about finding bottlenecks: if the profiling tools don't fit your needs, something very simple can do the job!
