A Case Study: Load Testing with Galaxy, Meteor

One of the many ways RedLine13 helps our clients is by continuously developing our services to meet the growing demands of developers and performance testers alike.

In a thorough and informative case study published on the Meteor Forums, evolross describes his experience running a Meteor 1.5.1 production app deployed on Galaxy, where hundreds, sometimes thousands, of users hit the app in simultaneous bursts.

For a full read of the thread on the Meteor forums, click here.

In evolross’s own words:

“Over time and growing use of my app, I’ve noticed that Galaxy and/or Meteor do not handle very many users simultaneously hitting the app in a burst and the servers quickly get pegged at 100% causing huge delays in the loading of the app, especially in data retrieval from the database in both pub/sub and Meteor Methods. The performance problems started happening after about forty simultaneous logins.”

The above reminds us that it doesn’t take hundreds or thousands of users to cause performance issues!

Faced with a large event, evolross upgraded his containers to Quad size (4.1 ECU) and increased the container count to 12, which worked well for him. Along the way he discovered that once your Meteor CPU gets pegged, you quickly run into problems such as:

Kadira/Meteor APM metrics can become misleading.
Galaxy metrics can go bonkers (especially connection counts).
Response times go through the roof on a linear scale (see graphs below), especially for database queries from both Meteor Methods and pub/sub.
The initial/static HTML still loads fairly quickly, but the population of data into the page, even with the use of non-reactive Meteor Methods, hangs for an unusually long period of time.
Users refresh the page while they wait, which causes even more load.

evolross then set out to load test his app to verify that this “bursting” was actually the problem. He needed to simulate hundreds (if not thousands) of realistic users hitting his app at the same time (read: not ramping up over five minutes – which is unfortunately what a lot of cloud load testers offer when you need to scale up users).
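To make the difference between a burst and a ramp-up concrete, here is a small illustrative sketch in Python. It is not taken from evolross's thread, and the function names are our own. It simply counts how many simulated users would start in each one-second slot, assuming a ramp spreads users evenly across its window.

```python
from collections import Counter


def starts_per_second(user_count, ramp_seconds):
    """Count how many simulated users start in each one-second slot.

    ramp_seconds == 0 models a burst: every user starts in slot 0.
    Otherwise users are assumed to be spread evenly across the ramp window.
    """
    if ramp_seconds == 0:
        return Counter({0: user_count})
    # Integer arithmetic keeps the slot assignment exact.
    return Counter((i * ramp_seconds) // user_count for i in range(user_count))


def peak_starts(user_count, ramp_seconds):
    """Largest number of users that start within any single second."""
    return max(starts_per_second(user_count, ramp_seconds).values())


burst_peak = peak_starts(1000, 0)       # all users released at once
ramp_peak = peak_starts(1000, 5 * 60)   # same users spread over five minutes
print(burst_peak, ramp_peak)
```

Under these assumptions, a five-minute ramp of 1000 users never starts more than 4 users in the same second, while a burst starts all 1000 at once. That gap is why a ramped test can pass while a real burst still pegs the servers.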

The problem evolross found with load testing in Meteor was reproducing the failure. He found he had to have real (or at least headless) browsers hitting his app, and lots of them. JMeter, Gatling, and a whole variety of web/cloud load testers (even a lot of major providers) were unusable because they only test HTTP traffic.

They don’t simulate button clicks calling JS functions, or the resource downloads, JavaScript execution, database calls, reactivity, Oplog work, and everything else involved in actually loading your Meteor app.

evolross soon understood:

Meteor performed quite well in these tests when it was just serving the HTTP of the app. A JMeter test with 1000 simultaneous hits, run from evolross’s desktop against a Compact container, performed great, but that only exercises serving the HTTP of the page. Galaxy doesn’t even register these hits as “connections”, though it did show a minor CPU hit.

While these types of HTTP tests work, they don’t come close to actually reproducing the problem.
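For readers who want to see what such an HTTP-only test amounts to, here is a minimal sketch in Python using only the standard library. It is our illustration, not the JMeter plan evolross ran, and the function names and the example URL are hypothetical. Each hit downloads the initial page and nothing more: no JavaScript runs, so the client-side code that opens Meteor's DDP connection and drives pub/sub and Methods is never exercised.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_status(url):
    """Fetch a page over plain HTTP and return its status code.

    Only the HTML is downloaded; no JavaScript is executed.
    urlopen raises an exception for 4xx/5xx responses.
    """
    with urlopen(url, timeout=30) as response:
        response.read()
        return response.status


def http_burst(url, hits, fetch=fetch_status):
    """Send `hits` concurrent requests to `url` and collect the results.

    `fetch` is injectable so the function can be tried without a network.
    """
    with ThreadPoolExecutor(max_workers=hits) as pool:
        return list(pool.map(fetch, [url] * hits))


if __name__ == "__main__":
    # Hypothetical URL; point this only at an app you own.
    statuses = http_burst("http://localhost:3000/", 50)
    print(statuses.count(200), "of", len(statuses), "hits returned 200")
```

A run like this can report every hit as successful while the app would still stall for real users, which matches what evolross observed.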

evolross found:

The only way to reproduce the problem was using a cloud service that actually launches browser instances across distributed machines. He found there are cloud load tester apps that will charge hundreds to thousands of dollars per month to perform tests like this.

“I looked at a lot of them. 95% of them are too expensive for my app. Amazon Mechanical Turk is also too expensive when you need hundreds/thousands of users. I can’t afford $999 per month and I also can’t afford $50 per test.”

evolross went on to mention:

“The very best solution I found for my use-case (which I know is kind of a weird edge-case) is www.redline13.com. Their service actually has a free tier that lets you connect your own AWS credentials, spin up your own EC2 instances, and they take care of firing off your tests for you and handling all the behind-the-scenes setup of your EC2 instances to start and run PhantomJS.

You just pay for your EC2 usage.

They can even load super-cheap Spot Instances and let you re-use them for a whole hour. This makes doing tests of hundreds/thousands of users cost pennies per test. […] And best of all, all the instances fire off almost simultaneously. Webdriver can do almost anything. And their PhantomJS reports back tons of useful metrics that Redline13 saves for free (see below). Redline13 has some paid plans that involve support and extra features (like test replay and cloning).”

Some trial and error was required to get RedLine13 working for evolross’s app. Here are some tips he mentions that all of our users may find useful:

You need to run a powerful enough server to run your PhantomJS instances or risk running into problems and anomalies with your testing machine not having enough CPU to run the test instances. There’s a metric on the Redline13 results under Agent Metrics stats that shows “Load Agent CPU Usage”. Make sure this never pegs at 100%. If it does, not all your instances will run. I recommend an M4.16XLarge (or several of them if you’re testing in the thousands of users). One of these boxes can handle hundreds of PhantomJS instances.
I also recommend having your use-case “do something” like adding a document. That way you can count how many documents were added and verify that the total number of documents matches your instance count to verify all your instances ran as expected. Otherwise it can be hard to tell if they just, for example, hit the URL of your app.
Tests can take a few minutes to spin up. Be patient. Check for errors at the bottom of the results page. You can run more tests using the same EC2 instances you started. They’re available for about an hour. And use Spot Instances, it’s way cheaper.
I could only get testing in PhantomJS to work. Firefox and Chrome are also offered, but both of them returned errors. I emailed Redline13 about this and they said they’re working on fixing it. As I mention above, PhantomJS is more efficient anyway. PhantomJS also occasionally fails to run every now and then, never figured out why.

*We have since updated RedLine13 to work with Firefox and Chrome.
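The first two tips lend themselves to a quick sanity check after each run. The Python sketch below is our own illustration under stated assumptions: the function names are hypothetical, the figure of 300 PhantomJS instances per agent is a placeholder (evolross only says "hundreds"), and the document count and CPU samples are values you read yourself from your database and from the "Load Agent CPU Usage" graph. It does not call any RedLine13 API.

```python
import math


def agents_needed(total_users, users_per_agent):
    """Number of load agents required if each one runs users_per_agent instances."""
    return math.ceil(total_users / users_per_agent)


def check_run(expected_instances, documents_added, cpu_samples, cpu_limit=100.0):
    """Return a list of problems found in a finished test run.

    documents_added: how many documents the test use-case inserted.
    cpu_samples: load agent CPU percentages noted from the results page.
    """
    problems = []
    if documents_added != expected_instances:
        problems.append(
            f"expected {expected_instances} documents, found {documents_added}"
        )
    if any(sample >= cpu_limit for sample in cpu_samples):
        problems.append("load agent CPU pegged; some instances may not have run")
    return problems


# Hypothetical numbers: 2000 users, assuming 300 instances per agent.
print(agents_needed(2000, 300))
print(check_run(500, 480, [72.0, 100.0]))
```

An empty list from check_run suggests every instance ran its use-case and the agents had CPU headroom. Anything else is a hint to use more or larger agents before trusting the results.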

In all, we’re flattered that evolross had such a productive experience with the service we passionately provide. Even more so, we’re humbled by his assertion that:

“…they [RedLine13] offer a TON of value for no charge.”

We have updated our service in response to the issues he experienced with Firefox and Chrome, and we are happy to maintain an open dialogue and an ongoing relationship with our users as their needs grow.
