Announcing TestDeck, an automated test management platform

With the growing shift-left movement, testing has become an integral part of the software development process. As a side effect, developer productivity is now closely tied to the stability of the test infrastructure.

Through our DX community, we have spoken in depth with several software engineering teams about their test infrastructure, its execution speed, and the reliability of their test systems.

Here’s what we’ve learned.

Flaky tests

Almost every large engineering team blames flaky tests for some degree of unreliability in their test systems. A flaky test is a test that produces inconsistent results, passing or failing at random without any change in the source code. Google even published a blog post about their challenges with flaky tests.
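To make the definition concrete, here is a minimal sketch of one way to spot flaky tests in CI history: a test that both passed and failed on the same commit changed its result without any change in the source code. The RunRecord type and find_flaky_tests function are hypothetical names used for illustration; they are not part of TestDeck or any CI provider's API.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class RunRecord(NamedTuple):
    """One execution of one test (hypothetical record shape)."""
    test_name: str
    commit_sha: str
    passed: bool


def find_flaky_tests(runs: Iterable[RunRecord]) -> set[str]:
    """Return names of tests that both passed and failed on the same commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for run in runs:
        outcomes[(run.test_name, run.commit_sha)].add(run.passed)
    # Two distinct outcomes for one (test, commit) pair means inconsistent results.
    return {name for (name, _sha), seen in outcomes.items() if len(seen) == 2}


runs = [
    RunRecord("test_login", "abc123", True),
    RunRecord("test_login", "abc123", False),    # same commit, different result
    RunRecord("test_checkout", "abc123", False),
    RunRecord("test_checkout", "def456", True),  # result changed along with the code
]
print(find_flaky_tests(runs))  # {'test_login'}
```

Note that test_checkout is not reported: its result changed between commits, which may be a real regression or fix rather than flakiness.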

The statistics from that post are staggering:

1.5% of all test runs reported a “flaky” result
Almost 16% of the tests have some level of flakiness associated with them
84% of the transitions observed from pass to fail involve a flaky test

Slow release cycles

Since flaky tests live on both feature and trunk branches, they slow down the entire development cycle. Builds break regularly, and someone may need to manually rerun the entire CI pipeline to get a green build before shipping a new release.

Triaging cost

Remember that someone has to triage these failures. Are these tests truly flaky, and if so, what is the root cause? Who should own each one, and how frequently does it fail?

After running that gauntlet, it’s decision time. How do we unblock the rest of the team? Should we comment it out and assign an owner or does it need to be fixed immediately?

Fixing cost

A team will likely spend 2-3 times longer trying to fix the test than they took to write it in the first place. One of the big challenges with flaky tests is understanding both the context and the environment to reproduce the issue.

High developer frustration

Nobody wants to work in an environment where it takes longer for a software engineer to ship a feature than to build it. This further results in poor morale and low job satisfaction.

Ignored true positives

With developers no longer trusting the test systems, most warning signs from a failing test will eventually be ignored. Now you not only have a failing-tests problem; the engineering team also no longer trusts the test results, which degrades the engineering culture.

Introducing TestDeck

After learning from so many teams about how they use an internal system to manage flaky tests, we built the TestDeck platform!

TestDeck is a test management platform that solves the problem of flaky tests by using automation to identify and remove them. The platform is designed to integrate seamlessly with your existing CI/CD pipelines and testing frameworks.

TestDeck provides real-time status updates for each test case, so you can quickly identify and act on unreliable tests. You can also set up custom quarantine rules for unreliable tests so that they don’t cause unnecessary delays in the testing process. Using TestDeck you can:

Track test run times and reliability, and identify flaky tests. Teams can monitor whether test stability on base branches is degrading or improving, which gives them insight into how to improve their overall testing process.

Set up automatic rules with TestDeck APIs and webhooks to quarantine or rerun an unreliable test, so that only reliable tests are included in the final test suite. Other examples include creating JIRA tickets automatically, receiving alerts at configured thresholds, and assigning owners based on code path.

Get a historical view of a particular test case, including how often the test has failed (flaky or not) and whether the test has become stable or unstable. This information can be viewed by feature branches or base branches, helping teams identify trends and patterns in test stability.

Configure a scheduled cadence (e.g. a nightly job) to rerun the test suite on a green build to proactively identify unreliable tests.
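As an illustration of what an automatic quarantine-or-rerun rule might look like, here is a minimal sketch. It does not use the TestDeck API: the StabilityStats type, the decide_action function, and the threshold values are all assumptions made for this example. The input counts could come from a scheduled rerun on a green build, as described above.

```python
from dataclasses import dataclass


@dataclass
class StabilityStats:
    """Aggregated history for one test (hypothetical shape)."""
    name: str
    runs: int
    flaky_failures: int  # failures seen without any code change


def decide_action(stats: StabilityStats, min_runs: int = 20,
                  quarantine_above: float = 0.05) -> str:
    """Return 'keep', 'rerun', or 'quarantine' for one test.

    The thresholds are illustrative defaults, not recommendations.
    """
    if stats.runs < max(min_runs, 1):
        return "keep"  # not enough data to judge the test
    rate = stats.flaky_failures / stats.runs
    if rate > quarantine_above:
        return "quarantine"  # too unreliable to gate the build
    if rate > 0:
        return "rerun"  # occasionally flaky: retry before failing the build
    return "keep"


print(decide_action(StabilityStats("test_login", runs=100, flaky_failures=10)))
print(decide_action(StabilityStats("test_search", runs=100, flaky_failures=2)))
```

With these assumed thresholds, the first call yields "quarantine" (10% flaky) and the second yields "rerun" (2% flaky). A test with fewer than min_runs recorded runs is always kept, so a single unlucky failure does not quarantine a new test.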

Comparison to existing test analytics tools

Most existing test analytics tools, such as Datadog’s CI Visibility or CircleCI test analytics, also provide a way to monitor test results and gain insight into test failures. TestDeck takes a different approach by offering a fully integrated workflow.

Using TestDeck APIs and webhooks, teams can create an end-to-end automated workflow to:

rerun specific tests automatically
split out fast, reliable tests from unreliable tests
get alerts about decreasing reliability
track regressions
and automatically create JIRA tickets with appropriate owners
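Two of the workflow steps above can be sketched in a few lines: splitting reliable tests from unreliable ones, and flagging a drop in reliability. This is only an illustration under stated assumptions; the function names, the pass-rate inputs, and the thresholds are hypothetical and do not reflect how TestDeck implements these steps.

```python
def split_by_reliability(pass_rates: dict[str, float],
                         threshold: float = 0.99) -> tuple[list[str], list[str]]:
    """Partition test names into (reliable, unreliable) by historical pass rate."""
    reliable = sorted(n for n, rate in pass_rates.items() if rate >= threshold)
    unreliable = sorted(n for n, rate in pass_rates.items() if rate < threshold)
    return reliable, unreliable


def reliability_dropped(previous: float, current: float,
                        max_drop: float = 0.02) -> bool:
    """Return True when the pass rate fell by more than max_drop between windows."""
    return (previous - current) > max_drop


pass_rates = {"test_login": 0.91, "test_checkout": 1.0, "test_search": 0.995}
reliable, unreliable = split_by_reliability(pass_rates)
print(reliable)    # ['test_checkout', 'test_search']
print(unreliable)  # ['test_login']
print(reliability_dropped(previous=0.99, current=0.90))  # True
```

The reliable group could run as a fast, blocking CI step, while the unreliable group runs separately with retries, so one flaky test does not block the whole build.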

Beta access

TestDeck currently supports Buildkite, CircleCI, and GitHub Actions. You can read more about it in our documentation, sign up for free beta access, or contact us at howto@aviator.co if you have any questions.


Ankit Jain
