1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
|
# WebWorker Threads
This is based on @xk (jorgechamorro)'s [Threads A GoGo for Node.js](https://github.com/audreyt/node-threads-a-gogo), but with an API conforming to the [Web Worker standard](http://www.w3.org/TR/workers/).
This module provides an asynchronous, evented and/or continuation passing style API for moving blocking/longish CPU-bound tasks out of Node's event loop to JavaScript threads that run in parallel in the background and that use all the available CPU cores automatically; all from within a single Node process.
This module requires Node.js 0.8.0+.
On Unix (including Linux and OS X), this module requires a working node-gyp toolchain, which in turn requires make and C/C++.
For example, on OS X, you could install XCode from Apple, and then use it to install the command line tools (under Preferences -> Downloads).
On Windows, this module requires Node.js 0.9.3+ and a working [node-gyp toolchain](http://dailyjs.com/2012/05/17/windows-and-node-3/).
## Installing the module
With [npm](http://npmjs.org/):
npm install webworker-threads
Sample usage (adapted from [MDN](https://developer.mozilla.org/en-US/docs/DOM/Using_web_workers#Passing_data)):
```js
var Worker = require('webworker-threads').Worker;
// var w = new Worker('worker.js'); // Standard API
// You may also pass in a function:
var worker = new Worker(function(){
postMessage("I'm working before postMessage('ali').");
onmessage = function(event) {
postMessage('Hi ' + event.data);
self.close();
};
});
worker.onmessage = function(event) {
console.log("Worker said : " + event.data);
};
worker.postMessage('ali');
```
A more involved example in [LiveScript](http://livescript.net/) syntax, with five threads:
```coffee
{ Worker } = require \webworker-threads
for til 5 => (new Worker ->
fibo = (n) -> if n > 1 then fibo(n - 1) + fibo(n - 2) else 1
@onmessage = ({ data }) -> postMessage fibo data
)
..onmessage = ({ data }) ->
console.log "[#{ @thread.id }] #data"
@postMessage Math.ceil Math.random! * 30
..postMessage Math.ceil Math.random! * 30
do spin = -> process.nextTick spin
```
## Introduction
After the initialization phase of a Node program, whose purpose is to setup listeners and callbacks to be executed in response to events, the next phase, the proper execution of the program, is orchestrated by the event loop whose duty is to [juggle events, listeners and callbacks quickly and without any hiccups nor interruptions that would ruin its performance](http://youtube.com/v/D0uA_NOb0PE?autoplay=1)
Both the event loop and said listeners and callbacks run sequentially in a single thread of execution, Node's main thread. If any of them ever blocks, nothing else will happen for the duration of the block: no more events will be handled, no more callbacks nor listeners nor timeouts nor nextTick()ed functions will have the chance to run and do their job, because they won't be called by the blocked event loop, and the program will turn sluggish at best, or appear to be frozen and dead at worst.
### What is WebWorker-Threads
`webworker-threads` provides an asynchronous API for CPU-bound tasks that's missing in Node.js:
``` javascript
var Worker = require('webworker-threads').Worker;
require('http').createServer(function (req,res) {
var fibo = new Worker(function() {
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
onmessage = function (event) {
postMessage(fibo(event.data));
}
});
fibo.onmessage = function (event) {
res.end('fib(40) = ' + event.data);
};
fibo.postMessage(40);
}).listen(port);
```
And it won't block the event loop because for each request, the `fibo` worker will run in parallel in a separate background thread.
## API
### Module API
``` javascript
var Threads= require('webworker-threads');
```
##### .Worker
`new Threads.Worker( [ file | function ] )` returns a Worker object.
##### .create()
`Threads.create( /* no arguments */ )` returns a thread object.
##### .createPool( numThreads )
`Threads.createPool( numberOfThreads )` returns a threadPool object.
---
### Web Worker API
``` javascript
var worker= new Threads.Worker('worker.js');
var worker= new Threads.Worker(function(){ ... });
var worker= new Threads.Worker();
```
##### .postMessage( data )
`worker.postMessage({ x: 1, y: 2 })` sends a data structure into the worker. The worker can receive it using the `onmessage` handler.
##### .onmessage
`worker.onmessage = function (event) { console.log(event.data) };` receives data from the worker's `postMessage` calls.
##### .terminate()
`worker.terminate()` terminates the worker thread.
##### .addEventListener( type, cb )
`worker.addEventListener('message', callback)` is equivalent to setting `worker.onmesssage = callback`.
##### .dispatchEvent( event )
Currently unimplemented.
##### .removeEventListener( type )
Currently unimplemented.
##### .thread
Returns the underlying `thread` object; see the next section for details.
Note that this attribute is implementation-specific, and not part of W3C Web Worker API.
---
### Thread API
``` javascript
var thread= Threads.create();
```
##### .id
`thread.id` is a sequential thread serial number.
##### .load( absolutePath [, cb] )
`thread.load( absolutePath [, cb] )` reads the file at `absolutePath` and `thread.eval(fileContents, cb)`.
##### .eval( program [, cb])
`thread.eval( program [, cb])` converts `program.toString()` and eval()s it in the thread's global context, and (if provided) returns the completion value to `cb(err, completionValue)`.
##### .on( eventType, listener )
`thread.on( eventType, listener )` registers the listener `listener(data)` for any events of `eventType` that the thread `thread` may emit.
##### .once( eventType, listener )
`thread.once( eventType, listener )` is like `thread.on()`, but the listener will only be called once.
##### .removeAllListeners( [eventType] )
`thread.removeAllListeners( [eventType] )` deletes all listeners for all eventTypes. If `eventType` is provided, deletes all listeners only for the event type `eventType`.
##### .emit( eventType, eventData [, eventData ... ] )
`thread.emit( eventType, eventData [, eventData ... ] )` emits an event of `eventType` with `eventData` inside the thread `thread`. All its arguments are .toString()ed.
##### .destroy( /* no arguments */ )
`thread.destroy( /* no arguments */ )` destroys the thread.
---
### Thread pool API
``` javascript
threadPool= Threads.createPool( numberOfThreads );
```
##### .load( absolutePath [, cb] )
`threadPool.load( absolutePath [, cb] )` runs `thread.load( absolutePath [, cb] )` in all the pool's threads.
##### .any.eval( program, cb )
`threadPool.any.eval( program, cb )` is like `thread.eval()`, but in any of the pool's threads.
##### .any.emit( eventType, eventData [, eventData ... ] )
`threadPool.any.emit( eventType, eventData [, eventData ... ] )` is like `thread.emit()`, but in any of the pool's threads.
##### .all.eval( program, cb )
`threadPool.all.eval( program, cb )` is like `thread.eval()`, but in all the pool's threads.
##### .all.emit( eventType, eventData [, eventData ... ] )
`threadPool.all.emit( eventType, eventData [, eventData ... ] )` is like `thread.emit()`, but in all the pool's threads.
##### .on( eventType, listener )
`threadPool.on( eventType, listener )` is like `thread.on()`, registers listeners for events from any of the threads in the pool.
##### .totalThreads()
`threadPool.totalThreads()` returns the number of threads in this pool: as supplied in `.createPool( number )`
##### .idleThreads()
`threadPool.idleThreads()` returns the number of threads in this pool that are currently idle (sleeping)
##### .pendingJobs()
`threadPool.pendingJobs()` returns the number of jobs pending.
##### .destroy( [ rudely ] )
`threadPool.destroy( [ rudely ] )` waits until `pendingJobs()` is zero and then destroys the pool. If `rudely` is truthy, then it doesn't wait for `pendingJobs === 0`.
---
### Global Web Worker API
Inside every Worker instance from webworker-threads, there's a global `self` object with these properties:
##### .postMessage( data )
`postMessage({ x: 1, y: 2 })` sends a data structure back to the main thread.
##### .onmessage
`onmessage = function (event) { ... }` receives data from the main thread's `.postMessage` calls.
##### .close()
`close()` stops the current thread.
##### .addEventListener( type, cb )
`addEventListener('message', callback)` is equivalent to setting `self.onmesssage = callback`.
##### .dispatchEvent( event )
`dispatchEvent({ type: 'message', data: data })` is the same as `self.postMessage(data)`.
##### .removeEventListener( type )
Currently unimplemented.
##### .importScripts( file [, file...] )
`importScripts('a.js', 'b.js')` loads one or more files from the disk and `eval()` them in the worker's instance scope.
##### .thread
The underlying `thread` object; see the next section for details.
Note that this attribute is implementation-specific, and not part of W3C Web Worker API.
---
### Global Thread API
Inside every thread .create()d by webworker-threads, there's a global `thread` object with these properties:
##### .id
`thread.id` is the serial number of this thread
##### .on( eventType, listener )
`thread.on( eventType, listener )` is just like `thread.on()` above.
##### .once( eventType, listener )
`thread.once( eventType, listener )` is just like `thread.once()` above.
##### .emit( eventType, eventData [, eventData ... ] )
`thread.emit( eventType, eventData [, eventData ... ] )` is just like `thread.emit()` above.
##### .removeAllListeners( [eventType] )
`thread.removeAllListeners( [eventType] )` is just like `thread.removeAllListeners()` above.
##### .nextTick( function )
`thread.nextTick( function )` is like `process.nextTick()`, but much faster.
---
### Global Helper API
Inside every thread .create()d by webworker-threads, there are some helpers:
##### console.log(arg1 [, arg2 ...])
Same as `console.log` on the main process.
##### console.error(arg1 [, arg2 ...])
Same as `console.log`, except it prints to stderr.
##### puts(arg1 [, arg2 ...])
`puts(arg1 [, arg2 ...])` converts .toString()s and prints its arguments to stdout.
-----------
WIP WIP WIP
-----------
Note that everything below this line is under construction and subject to change.
-----------
## Examples
**A.-** Here's a program that makes Node's event loop spin freely and as fast as possible: it simply prints a dot to the console in each turn:
cat examples/quickIntro_loop.js
``` javascript
(function spinForever () {
process.nextTick(spinForever);
})();
```
**B.-** Here's another program that adds to the one above a fibonacci(35) call in each turn, a CPU-bound task that takes quite a while to complete and that blocks the event loop making it spin slowly and clumsily. The point is simply to show that you can't put a job like that in the event loop because Node will stop performing properly when its event loop can't spin fast and freely due to a callback/listener/nextTick()ed function that's blocking.
cat examples/quickIntro_blocking.js
``` javascript
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
(function fiboLoop () {
process.stdout.write(fibo(35).toString());
process.nextTick(fiboLoop);
})();
(function spinForever () {
process.nextTick(spinForever);
})();
```
**C.-** The program below uses `webworker-threads` to run the fibonacci(35) calls in a background thread, so Node's event loop isn't blocked at all and can spin freely again at full speed:
cat examples/quickIntro_oneThread.js
``` javascript
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
function cb (err, data) {
process.stdout.write(data);
this.eval('fibo(35)', cb);
}
var thread= require('webworker-threads').create();
thread.eval(fibo).eval('fibo(35)', cb);
(function spinForever () {
process.nextTick(spinForever);
})();
```
**D.-** This example is almost identical to the one above, only that it creates 5 threads instead of one, each running a fibonacci(35) in parallel and in parallel too with Node's event loop that keeps spinning happily at full speed in its own thread:
cat examples/quickIntro_fiveThreads.js
``` javascript
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
function cb (err, data) {
process.stdout.write(" ["+ this.id+ "]"+ data);
this.eval('fibo(35)', cb);
}
var Threads= require('webworker-threads');
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
(function spinForever () {
process.nextTick(spinForever);
})();
```
**E.-** The next one asks `webworker-threads` to create a pool of 10 background threads, instead of creating them manually one by one:
cat examples/multiThread.js
``` javascript
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
var numThreads= 10;
var threadPool= require('webworker-threads').createPool(numThreads).all.eval(fibo);
threadPool.all.eval('fibo(35)', function cb (err, data) {
process.stdout.write(" ["+ this.id+ "]"+ data);
this.eval('fibo(35)', cb);
});
(function spinForever () {
process.nextTick(spinForever);
})();
```
**F.-** This is a demo of the `webworker-threads` eventEmitter API, using one thread:
cat examples/quickIntro_oneThreadEvented.js
``` javascript
var thread= require('webworker-threads').create();
thread.load(__dirname + '/quickIntro_evented_childThreadCode.js');
/*
This is the code that's .load()ed into the child/background thread:
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
thread.on('giveMeTheFibo', function onGiveMeTheFibo (data) {
this.emit('theFiboIs', fibo(+data)); //Emits 'theFiboIs' in the parent/main thread.
});
*/
//Emit 'giveMeTheFibo' in the child/background thread.
thread.emit('giveMeTheFibo', 35);
//Listener for the 'theFiboIs' events emitted by the child/background thread.
thread.on('theFiboIs', function cb (data) {
process.stdout.write(data);
this.emit('giveMeTheFibo', 35);
});
(function spinForever () {
process.nextTick(spinForever);
})();
```
**G.-** This is a demo of the `webworker-threads` eventEmitter API, using a pool of threads:
cat examples/quickIntro_multiThreadEvented.js
``` javascript
var numThreads= 10;
var threadPool= require('webworker-threads').createPool(numThreads);
threadPool.load(__dirname + '/quickIntro_evented_childThreadCode.js');
/*
This is the code that's .load()ed into the child/background threads:
function fibo (n) {
return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}
thread.on('giveMeTheFibo', function onGiveMeTheFibo (data) {
this.emit('theFiboIs', fibo(+data)); //Emits 'theFiboIs' in the parent/main thread.
});
*/
//Emit 'giveMeTheFibo' in all the child/background threads.
threadPool.all.emit('giveMeTheFibo', 35);
//Listener for the 'theFiboIs' events emitted by the child/background threads.
threadPool.on('theFiboIs', function cb (data) {
process.stdout.write(" ["+ this.id+ "]"+ data);
this.emit('giveMeTheFibo', 35);
});
(function spinForever () {
process.nextTick(spinForever);
})();
```
## More examples
The `examples` directory contains a few more examples:
* [ex01_basic](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex01_basic.md): Running a simple function in a thread.
* [ex02_events](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex02_events.md): Sending events from a worker thread.
* [ex03_ping_pong](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex03_ping_pong.md): Sending events both ways between the main thread and a worker thread.
* [ex04_main](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex04_main.md): Loading the worker code from a file.
* [ex05_pool](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex05_pool.md): Using the thread pool.
* [ex06_jason](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex06_jason.md): Passing complex objects to threads.
## Rationale
[Node.js](http://nodejs.org) is the most awesome, cute and super-sexy piece of free, open source software.
Its event loop can spin as fast and smooth as a turbo, and roughly speaking, **the faster it spins, the more power it delivers**. That's why [@ryah](http://twitter.com/ryah) took great care to ensure that no -possibly slow- I/O operations could ever block it: a pool of background threads (thanks to [Marc Lehmann's libeio library](http://software.schmorp.de/pkg/libeio.html)) handle any blocking I/O calls in the background, in parallel.
In Node it's verboten to write a server like this:
``` javascript
http.createServer(function (req,res) {
res.end( fs.readFileSync(path) );
}).listen(port);
```
Because synchronous I/O calls **block the turbo**, and without proper boost, Node.js begins to stutter and behaves clumsily. To avoid it there's the asynchronous version of `.readFile()`, in continuation passing style, that takes a callback:
``` javascript
fs.readfile(path, function cb (err, data) { /* ... */ });
```
It's cool, we love it (*), and there's hundreds of ad hoc built-in functions like this in Node to help us deal with almost any variety of possibly slow, blocking I/O.
### But what's with longish, CPU-bound tasks?
How do you avoid blocking the event loop, when the task at hand isn't I/O bound, and lasts more than a few fractions of a millisecond?
``` javascript
http.createServer(function cb (req,res) {
res.end( fibonacci(40) );
}).listen(port);
```
You simply can't, because there's no way... well, there wasn't before `webworker-threads`.
### Why Threads
Threads (kernel threads) are very interesting creatures. They provide:
1.- Parallelism: All the threads run in parallel. On a single core processor, the CPU is switched rapidly back and forth among the threads providing the illusion that the threads are running in parallel, albeit on a slower CPU than the real one. With 10 compute-bound threads in a process, the threads would appear to be running in parallel, each one on a CPU with 1/10th the speed of the real CPU. On a multi-core processor, threads are truly running in parallel, and get time-sliced when the number of threads exceed the number of cores. So with 12 compute bound threads on a quad-core processor each thread will appear to run at 1/3rd of the nominal core speed.
2.- Fairness: No thread is more important than another, cores and CPU slices are fairly distributed among threads by the OS scheduler.
3.- Threads fully exploit all the available CPU resources in your system. On a loaded system running many tasks in many threads, the more cores there are, the faster the threads will complete. Automatically.
4.- The threads of a process share exactly the same address space, that of the process they belong to. Every thread can access every memory address within the process' address space. This is a very appropriate setup when the threads are actually part of the same job and are actively and closely cooperating with each other. Passing a reference to a chunk of data via a pointer is many orders of magnitude faster than transferring a copy of the data via IPC.
### Why not multiple processes.
The "can't block the event loop" problem is inherent to Node's evented model. No matter how many Node processes you have running as a [Node-cluster](http://blog.nodejs.org/2011/10/04/an-easy-way-to-build-scalable-network-programs/), it won't solve its issues with CPU-bound tasks.
Launch a cluster of N Nodes running the example B (`quickIntro_blocking.js`) above, and all you'll get is N -instead of one- Nodes with their event loops blocked and showing a sluggish performance.
|