Strange that I’m not seeing the error on the public instance that also runs on Windows though memory use has been the bane of the node agent for a while. What version of node are you running? IIRC the max-old-space-size flag was a fairly recent addition.
There are a couple of things that might be able to help the issue:
1 - I recently changed the timeline code to write to disk instead of sending it all over IPC. We should be able to have the agent write the events to disk as they come in instead of all at once which might help reduce the overhead though it still loads the whole file to POST the result so there are a lot of places that would need to be fixed. If we wrote it out gzipped then it might be enough to keep the rest of the pipeline happy.
2 - I’m working on using netlog events for generating the waterfalls. If that works out then we’ll be able to turn off most of the tracing events and just get the network events (when timeline and trace are not enabled for a test). That will result in MUCH smaller traces. Probably a few weeks before the switch is done completely though.
Also, I can only get accurate results when running “First View and Repeat View” on the longest-running page.
When running “First View Only” the test works perfectly (can view results and explore them), then upon finishing the test, the NodeJs agent crashes or hangs with a few final words (below; full version attached as first-view-only-…txt). SIGINT to the process causes the test to mark itself as failed as the process exits.
D Sep_08_14:05:16.817 C:\xyzzy\wpt\agent\js\src\multipart.js:92 Multipart.exports.Multipart.addPart : addPart: name=error body=Page load timeout headers=undefined
D Sep_08_14:05:16.822 C:\xyzzy\wpt\agent\js\src\multipart.js:115 Multipart.exports.Multipart.addFilePart : addFilePart: name=file filename=1_results.zip size=31707
I Sep_08_14:06:10.489 C:\xyzzy\wpt\agent\js\src\agent_main.js:149 Agent.<anonymous> : wd_server child process exit code 0 signal null
[At 14:10 I could explore the results and view the result timeline on /result/150908_AR_J9V. Eventually, I send a SIGINT.]
A Sep_08_14:15:40.286 C:\xyzzy\wpt\agent\js\src\wpt_client.js:422 Client.onSignal_ : Received SIGINT, will exit after the current run is killed.
D Sep_08_14:18:11.194 C:\xyzzy\wpt\agent\js\src\multipart.js:92 Multipart.exports.Multipart.addPart : addPart: name=testerror body=Page load timeout headers=undefined
A Sep_08_14:18:12.673 C:\xyzzy\wpt\agent\js\src\wpt_client.js:417 Client.<anonymous> : Exiting due to SIGINT.
I have only seen this happen so far when doing “First View Only” against the NodeJs agent.
Out of curiosity, any reason you’re running the node command directly instead of the batch file? It shouldn’t be an issue but I’m not 100% sure about all of the flags that get passed through since the batch file does a bit of path discovery (like for the webdriver location).
In the debug output before you send the SIGINT, was there a call stack anywhere? One of my biggest pain points with the node agent is random exceptions throwing the state machine out of whack. I’m not sure why kids these days love functional programming so much, at least for a procedural app like the agent it is a nightmare.
[quote=“pmeenan, post:5, topic:9547”]
Out of curiosity, any reason you’re running the node command directly instead of the batch file?[/quote]
I reasoned to give different inputs to node, I should try calling node directly.
It does appear to be working. I exported/imported the environment variables between command prompts and checked that the variables made sense, as well.
[quote=“pmeenan, post:5, topic:9547”]
In the debug output before you send the SIGINT, was there a call stack anywhere? One of my biggest pain points with the node agent is random exceptions throwing the state machine out of whack.[/quote]
Call stack anywhere?
In my full command-line output there was never a call stack shown.
From when I was debugging:
If I run node in “–debug” mode, the application does hit a call stack and exit early (attached). Perhaps that is interesting.
As I understand, the V8 command line arguments. must occur before the script, otherwise, they are treated as command line arguments to the script. So in this case, the --max-old-space-size has no effect.
BTW: I ran into a similar OOM issue that I addressed by removing the command line argument V8 command line argument --max-old-space-size.