This is the talk that I gave at JSConf.eu 2009, then modified slightly and gave again at the December Bayjax meetup (the parts on jQuery and HTML 5 in IE were added).
Unit Testing
✦ Break code into logical chunks for testing.
✦ Focus on one method at a time.
✦ Good for testing APIs.
✦ Popular Frameworks:
✦ QUnit
✦ JSUnit
✦ YUITest
✦ FireUnit
JSUnit
✦ One of the oldest JavaScript testing
frameworks.
✦ A port of JUnit to JavaScript, circa 2001.
✦ Code feels very 2001 (frames!)
✦ http://www.jsunit.net/
JSUnit
function coreSuite() {
    var result = new top.jsUnitTestSuite();
    result.addTestPage("tests/jsUnitAssertionTests.html");
    result.addTestPage("tests/jsUnitSetUpTearDownTests.html");
    result.addTestPage("tests/jsUnitRestoredHTMLDivTests.html");
    result.addTestPage("tests/jsUnitFrameworkUtilityTests.html");
    result.addTestPage("tests/jsUnitOnLoadTests.html");
    result.addTestPage("tests/jsUnitUtilityTests.html");
    return result;
}
function serverSuite() {
    var result = new top.jsUnitTestSuite();
    result.addTestPage("tests/server/jsUnitVersionCheckTests.html");
    result.addTestPage("tests/server/jsUnitServerAjaxTests.html");
    return result;
}
function librariesSuite() {
    var result = new top.jsUnitTestSuite();
    result.addTestPage("tests/jsUnitMockTimeoutTest.html");
    return result;
}
function suite() {
    var newsuite = new top.jsUnitTestSuite();
    newsuite.addTestSuite(coreSuite());
    newsuite.addTestSuite(serverSuite());
    newsuite.addTestSuite(librariesSuite());
    return newsuite;
}
JSUnit
function testAssertNotUndefined() {
    assertNotUndefined("1 should not be undefined", 1);
    assertNotUndefined(1);
}
function testAssertNaN() {
    assertNaN("a string should not be a number", "string");
    assertNaN("string");
}
function testAssertNotNaN() {
    assertNotNaN("1 should not be not a number", 1);
    assertNotNaN(1);
}
function testFail() {
    var excep = null;
    try {
        fail("Failure message");
    } catch (e) {
        excep = e;
    }
    assertJsUnitException("fail(string) should throw a JsUnitException", excep);
}
function testTooFewArguments() {
    var excep = null;
    try {
        assert();
    } catch (e1) {
        excep = e1;
    }
    assertNonJsUnitException(
        "Calling an assertion function with too few arguments should throw an exception",
        excep);
}
YUITest (2 & 3)
✦ Testing framework built and developed by
Yahoo (released Oct 2008).
✦ Completely overhauled to go with YUI v3.
✦ Features:
✦ Supports async tests.
✦ Has good event simulation.
✦ v2: http://developer.yahoo.com/yui/examples/yuitest/
✦ v3: http://developer.yahoo.com/yui/3/test/
YUITest 2
YAHOO.example.yuitest.ArrayTestCase = new YAHOO.tool.TestCase({
    name : "Array Tests",
    setUp : function () {
        this.data = [0,1,2,3,4];
    },
    tearDown : function () {
        delete this.data;
    },
    testPop : function () {
        var Assert = YAHOO.util.Assert;
        var value = this.data.pop();
        Assert.areEqual(4, this.data.length);
        Assert.areEqual(4, value);
    },
    testPush : function () {
        var Assert = YAHOO.util.Assert;
        this.data.push(5);
        Assert.areEqual(6, this.data.length);
        Assert.areEqual(5, this.data[5]);
    }
});
QUnit
✦ Unit Testing framework built for jQuery.
✦ Features:
✦ Supports asynchronous testing.
✦ Can break code into modules.
✦ Supports test timeouts.
✦ No dependencies.
✦ Painfully simple.
✦ http://docs.jquery.com/QUnit
QUnit Style
test("a basic test example", function() {
    ok( true, "this test is fine" );
    var value = "hello";
    equals( "hello", value, "We expect value to be hello" );
});
module("Module A");
test("first test within module", function() {
    ok( true, "all pass" );
});
test("second test within module", function() {
    ok( true, "all pass" );
});
module("Module B");
test("some other test", function() {
    expect(2);
    equals( true, false, "failing test" );
    equals( true, true, "passing test" );
});
FireUnit
✦ Unit testing extension for Firebug
✦ fireunit.ok( true, "..." );
✦ http://fireunit.org/
Standardization
✦ CommonJS: A unified cross-platform API
for JavaScript.
✦ (Including the server-side!)
✦ Working to standardize a simple testing
API.
✦ http://wiki.commonjs.org/wiki/CommonJS
Server-Side
✦ Ignore the browser! Simulate it on the
server-side.
✦ Almost always uses Java + Rhino to
construct a browser.
✦ Some frameworks:
✦ Crosscheck
✦ Env.js
✦ Blueridge
Server-Side
✦ Crosscheck
✦ Pure Java, even simulates browser bugs.
✦ http://www.thefrontside.net/crosscheck
✦ Env.js
✦ Pure JavaScript, focuses on standards
support.
✦ http://github.com/thatcher/env-js/tree/master
✦ Blueridge
✦ Env.js + Screw.Unit + Rhino
✦ http://github.com/relevance/blue-ridge/
Distributed
✦ Selenium Grid
✦ Push Selenium tests out to many
machines (that you manage),
simultaneously.
✦ Collect and store the results.
✦ http://selenium-grid.seleniumhq.org/
✦ TestSwarm
✦ Push tests to a distributed swarm of
clients.
✦ Results viewable on the server.
✦ http://testswarm.com/
The Scaling Problem
✦ The Problem:
✦ jQuery has 6 test suites
✦ Run in 15 browsers
✦ (Not even including multiple platforms
or mobile browsers!)
✦ All need to be run for every commit,
patch, and plugin.
✦ JavaScript testing doesn’t scale well.
Distributed Testing
✦ Hub server
✦ Clients connect and help run tests
✦ A simple JavaScript client that can be run
in all browsers
✦ Including mobile browsers!
✦ TestSwarm
[Diagram: TestSwarm in the center, with browser clients (FF 3.5, FF 3, IE 6, IE 7, Op 9) and test suites around it.]
TestSwarm.com
✦ Incentives for top testers (t-shirts, books)
✦ Will be opening for alpha testing very soon
✦ Help your favorite JavaScript library
become better tested!
✦ http://testswarm.com
Major Cases
✦ Same Code, Different Platforms
✦ Compare V8 vs. SpiderMonkey vs.
JavaScriptCore
✦ Different Code, Same Platform
✦ Compare CSS Selector Engines
✦ A/B testing a piece of code
Same Code, Different Platform
✦ A number of suites analyzing JS perf:
✦ SunSpider (from WebKit)
✦ V8 Benchmark (from V8/Chrome)
✦ Dromaeo (from Mozilla)
✦ Statistical accuracy and reproducibility are
paramount.
SunSpider
✦ All tests were highly balanced.
✦ Provide some level of statistical accuracy.
✦ +/- 5ms (for example)
✦ Tests are run by loading an iframe with the
test 5 times.
✦ getTime() is run before/after each test.
✦ Entire suite must be trashed in order to
upgrade/fix a test.
Error Rate?
✦ How do we get it? What does it mean?
✦ It’s how confident we are that we arrived
at the result we want in the number of
runs that we’ve done.
Normal Distribution
✦ First: Assume that the results are coming
back in a normal distribution.
✦ The “bell curve”
Confidence
✦ Next: We need a confidence level.
✦ T-Distribution works well here.
http://en.wikipedia.org/wiki/Student%27s_t-distribution
Error Rate
✦ 5 runs
✦ (each run is potentially 1000s of
individual test runs)
✦ 95% confidence (t-distribution value for 5 runs = 2.776)
✦ Error margin around the mean (err_mean) =
✦ (std_dev / sqrt(runs)) * 2.776
✦ Error % = (err_mean / mean) * 100
✦ This way you can get results like:
✦ 123ms +/- 5ms
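The calculation above can be sketched in JavaScript. This is a minimal sketch, not code from any of the suites: the name `errorRate` is hypothetical, the sample (n - 1) standard deviation is an assumption, and the hard-coded 2.776 is only valid for 5 runs at 95% confidence.

```javascript
// t-distribution value for 95% confidence with 5 runs (4 degrees of freedom).
var T_95_FOR_5_RUNS = 2.776;

// Hypothetical helper: returns the mean, the +/- margin, and the error percentage.
function errorRate(times) {
  if (times.length !== 5) {
    throw new Error("The 2.776 t-value only applies to exactly 5 runs");
  }
  var sum = 0, squares = 0, i;
  for (i = 0; i < times.length; i++) {
    sum += times[i];
  }
  var mean = sum / times.length;

  for (i = 0; i < times.length; i++) {
    squares += Math.pow(times[i] - mean, 2);
  }
  // Assumption: sample standard deviation (divide by n - 1).
  var stdDev = Math.sqrt(squares / (times.length - 1));

  var errMean = (stdDev / Math.sqrt(times.length)) * T_95_FOR_5_RUNS;
  return { mean: mean, margin: errMean, percent: (errMean / mean) * 100 };
}

var result = errorRate([120, 125, 121, 126, 123]);
console.log(result.mean + "ms +/- " + result.margin.toFixed(1) + "ms");
```

For a different number of runs the t-value has to be looked up again, since it depends on the degrees of freedom.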
V8 Benchmark
✦ Tests are run, potentially, 1000s of times.
✦ Also provides an error rate.
✦ (Use a geometric mean to arrive at a
result.)
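How the V8 Benchmark combines its scores is not spelled out here, so the following is only a generic sketch of a geometric mean. The function name `geometricMean` is hypothetical, and summing logarithms rather than multiplying directly is a design choice to avoid overflow with many large values.

```javascript
// Hypothetical helper: geometric mean of positive numbers.
// exp(mean(log(x))) is equivalent to the n-th root of the product.
function geometricMean(values) {
  var logSum = 0;
  for (var i = 0; i < values.length; i++) {
    logSum += Math.log(values[i]);
  }
  return Math.exp(logSum / values.length);
}

console.log(geometricMean([2, 8]).toFixed(2));
```

Unlike an arithmetic mean, a geometric mean is not dominated by a single test with a much larger score than the others.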
Small Time Accuracy
✦ Small time:
✦ 1ms, 1ms, 1ms, 1ms, 3ms
✦ huge error!
✦ Large time:
✦ 1234ms, 1234ms, 1234ms, 1234ms, 1238ms
✦ tiny error!
✦ Tests that run faster need to be run more
times.
✦ Running more times = less potential for
weird results.
http://ejohn.org/blog/javascript-benchmark-quality/
Runs/Second
✦ var start = (new Date).getTime(), time = 0, runs = 0;
while (time < 1000) {
    runTest();
    runs++;
    time = (new Date).getTime() - start;
}
✦ More test runs, more statistical accuracy.
✦ V8 & Dromaeo-style suites handle this.
✦ (Problem: getTime() is being run on every
loop - it should be run less frequently in
order to influence the numbers less.)
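One way to read the clock less often is to run the test in batches and only check the time between batches. This is a sketch under assumptions, not how V8 or Dromaeo actually do it: `runsPerSecond`, `duration`, and `batch` are hypothetical names.

```javascript
// Hypothetical helper: count how many times runTest() completes per second,
// reading the clock once per batch instead of once per run.
function runsPerSecond(runTest, duration, batch) {
  var runs = 0;
  var elapsed = 0;
  var start = (new Date).getTime();
  while (elapsed < duration) {
    for (var i = 0; i < batch; i++) {
      runTest();
    }
    runs += batch;
    elapsed = (new Date).getTime() - start;
  }
  // The last batch may overshoot `duration`, so divide by the real elapsed time.
  return runs / (elapsed / 1000);
}

var rate = runsPerSecond(function () {
  [3, 1, 2].sort();
}, 200, 100);
console.log(Math.round(rate) + " runs/s");
```

A larger batch means fewer clock reads, but also a bigger possible overshoot past the target duration, so the batch size is a trade-off.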
Runs/Second
✦ You are now measuring tests/second rather
than seconds per test.
✦ You run tests as many times in one second
as you can.
✦ Then you do that multiple times (5?)
✦ THEN you analyze the final numbers:
✦ 1234 runs/s, 1230 runs/s, 1240 runs/s, ...
Harmonic Mean
✦ A way to average rates
✦ Which is what we have! runs/second
✦ For example:
✦ 1234 runs/s, 1230 runs/s, 1240 runs/s,
1236 runs/s, 1232 runs/s
✦ 5 / ( (1/1234) + (1/1230) + (1/1240) + (1/1236)
+ (1/1232) ) =
✦ 1234.39 runs/s!
http://en.wikipedia.org/wiki/Harmonic_mean
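The harmonic mean from the example can be computed with a few lines of JavaScript. A minimal sketch; the function name `harmonicMean` is illustrative.

```javascript
// Harmonic mean: number of values divided by the sum of their reciprocals.
function harmonicMean(rates) {
  var reciprocalSum = 0;
  for (var i = 0; i < rates.length; i++) {
    reciprocalSum += 1 / rates[i];
  }
  return rates.length / reciprocalSum;
}

var rates = [1234, 1230, 1240, 1236, 1232]; // runs/s from the example
console.log(harmonicMean(rates).toFixed(2) + " runs/s"); // prints "1234.39 runs/s"
```

The harmonic mean is never larger than the arithmetic mean of the same values (1234.4 here), and the two only diverge noticeably when the rates vary a lot.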
Dromaeo
✦ All individual tests are versioned
✦ Makes it easy to update or fix a bug in a
test
✦ Can only run tests of specific versions
against each other
✦ Uses V8’s style of running tests.
✦ Also has DOM and framework tests.
✦ ...and hooks for doing Shark profiling.
Bug Fixes
✦ Tests will, inevitably, have bugs that need
to be fixed.
✦ Fixing a bug changes the result quality.
✦ Tests need to be versioned so that changes
can be made.
✦ You look at Test v1 vs. Test v1 results.
✦ Not Test v2 vs. Test v1.
✦ Tip: Just use the last revision control
commit # for the test file.
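The versioning rule can be sketched as a guard that refuses to compare results from different revisions of a test. The result shape (`test`, `version`, `runsPerSec`) and the function names are hypothetical, made up for illustration; this is not Dromaeo's actual data format.

```javascript
// Hypothetical result shape: { test: "dom-attr", version: "r142", runsPerSec: 1234 }

// Two results are only comparable if they come from the same test at the same revision.
function isComparable(a, b) {
  return a.test === b.test && a.version === b.version;
}

// Returns how many times faster `a` is than `b`, or null if they must not be compared.
function speedRatio(a, b) {
  if (!isComparable(a, b)) {
    return null;
  }
  return a.runsPerSec / b.runsPerSec;
}

var browserA = { test: "dom-attr", version: "r142", runsPerSec: 1200 };
var browserB = { test: "dom-attr", version: "r142", runsPerSec: 600 };
var stale    = { test: "dom-attr", version: "r101", runsPerSec: 900 };

console.log(speedRatio(browserA, browserB)); // 2
console.log(speedRatio(browserA, stale));    // null
```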
Different Code, Same Platform
✦ Most solutions here are very poor.
✦ Run the test very few times, use getTime().
✦ Highly inaccurate results, massive error.
Garbage Collection
✦ Browsers periodically run garbage
collectors to clean up old objects no longer
referenced.
✦ This can take a long time and spike your
test results.
✦ Example:
✦ 10ms, 13ms, 11ms, 12ms, 486ms, 12ms, ...
✦ When comparing engine to engine, this
doesn’t matter.
✦ Comparing code vs. code, it does.
Mean, Median, Mode?
✦ Mode!
✦ Run your tests a large number of times.
✦ What is the ‘mode’ (the result that
occurs most frequently)?
✦ Example:
✦ 10, 11, 11, 12, 12, 12, 13, 14
✦ Mode = 12ms.
✦ Less accurate than mean, but gives you a
more-consistent result.
✦ DON’T DISCARD “BAD” RESULTS!
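A sketch of taking the mode, using the example above and the garbage collection timings from the previous slide. The function names are illustrative, and breaking ties in favor of the value that reaches the top count first is an arbitrary assumption.

```javascript
// Returns the most frequent value. On a tie, the value that first
// reached the highest count wins (an arbitrary choice for this sketch).
function mode(values) {
  var counts = {}, best = null, bestCount = 0;
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    counts[v] = (counts[v] || 0) + 1;
    if (counts[v] > bestCount) {
      bestCount = counts[v];
      best = v;
    }
  }
  return best;
}

function mean(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum / values.length;
}

console.log(mode([10, 11, 11, 12, 12, 12, 13, 14])); // 12

// Timings with a garbage collection spike, kept in the data set.
var withSpike = [10, 13, 11, 12, 486, 12];
console.log("mean: " + mean(withSpike).toFixed(1) + "ms"); // mean: 90.7ms
console.log("mode: " + mode(withSpike) + "ms");            // mode: 12ms
```

The spike stays in the data, but it barely matters to the mode, whereas it drags the mean far away from the typical run.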
IE in Wine
✦ Running Internet Explorer in Wine (on
Linux) gives fine-grained timer results
✦ Down to the millisecond!
✦ You can also run IE, in Wine, on OS X:
✦ ies4osx
✦ Huge Caveat: It gives you fine-grained
time, but that doesn’t mean it’s accurate.
Different Code, Same Platform
✦ How can we get good numbers?
✦ We have to go straight to the source: Use
the tools the browsers provide.