Mastering Test Retries: The Art of Automation Craftsmanship
Retrying failed test cases emerged in test automation as a response to the challenges posed by complex software systems. When a test step fails, it can be retried a set number of times with a delay between attempts. The delay gives the application time to stabilize or recover from temporary issues. If the step succeeds within the allowed retries, it is considered a pass; if it fails on every attempt, it is marked as a failure. The retry mechanism can be applied selectively to specific steps or assertions for maximum benefit. Using it well is an opportunity for an automation engineer to demonstrate their skill.
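The retry-with-delay idea described above can be sketched in a few lines of plain TypeScript. This is only a minimal illustration: the names retryStep, maxAttempts, delayMs, and flakyStep are hypothetical and not part of any framework.

```typescript
// Hypothetical helper: re-runs `step` until it succeeds or the attempts run out.
async function retryStep<T>(
  step: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // A success within the allowed attempts counts as a pass.
      return await step();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Give the application time to stabilize before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed, so the step is reported as a failure.
  throw new Error(`Step failed after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Simulated flaky step (an assumption for illustration): fails twice, then succeeds.
let calls = 0;
async function flakyStep(): Promise<string> {
  calls++;
  if (calls < 3) throw new Error("not ready");
  return "ok";
}
```

With these definitions, retryStep(flakyStep, 3, 10) resolves on the third attempt, while a step that always throws is rejected with an error that includes the last failure message.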
In the early days of test automation, retrying test cases was often done manually. Test engineers then started implementing retry mechanisms by writing code to catch exceptions or errors and re-executing the failed steps.
Subsequently, frameworks like TestNG, JUnit, and NUnit added annotations or attributes for specifying retry settings at the test method or class level. However, excessive retries became a problem, so the time delay between retry attempts was gradually increased to allow transient issues to naturally resolve.
Furthermore, CI tools such as Jenkins, Travis CI, and CircleCI included retry support in their build and test automation workflows. As test suites grew larger and more complex, parallel test execution became the preferred approach. However, retry mechanisms presented new challenges in supporting parallelism.
With test cases executing in parallel, reporting became more complex, requiring information about retried test cases. Today, machine learning and AI can be applied to retry mechanisms to adapt strategies based on historical test results.
Cloud-based test automation platforms and services now offer robust retry capabilities.
Test retries are particularly useful when a test is flaky and fails intermittently. Flaky tests can be caused by various factors, such as timing issues, network conditions, race conditions, or the stability of the tested application itself.
Playwright has a built-in auto-waiting and retry mechanism for locators and matchers. This mechanism continuously runs the specified logic until the condition is met or the timeout limit is reached. This can help to reduce or eliminate flakiness in your tests.
For example, you can wrap a test step in expect(...).toPass() to retry it if a particular element is not visible within a certain amount of time. You can also call expect() on a locator with the toBeVisible() matcher to wait for an element to be visible before asserting anything else about it.
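The "run the logic until the condition is met or the timeout is reached" behavior can be pictured with a small polling sketch. This is not Playwright's implementation, only an illustration of the idea; pollUntil, timeoutMs, intervalMs, and the visible flag are all hypothetical names.

```typescript
// Hypothetical sketch: keep checking `condition` until it holds or the timeout expires.
// NOT Playwright's actual code; it only illustrates the general mechanism.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Condition met: stop waiting immediately.
    if (await condition()) return;
    // Timeout reached without the condition holding: report a failure.
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    // Wait a little before checking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for an element that becomes visible shortly after the page loads.
let visible = false;
setTimeout(() => { visible = true; }, 50);
```

Here pollUntil(() => visible, 1000, 10) resolves once the flag flips, whereas a condition that never becomes true is rejected after the timeout. A fixed wait would either waste time or be too short; polling against a deadline avoids both.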
Using test retries together with Playwright's auto-waiting and retry mechanism can help make your tests more reliable and less flaky. These are the lessons I have learned the hard way while using test retries:
- Only retry tests that are flaky. Retrying all tests can add a significant amount of time to your test suite, so it is important to only retry tests that are known to fail intermittently.
- Set appropriate timeouts. When retrying a step with expect(...).toPass(), be sure to set a timeout that is long enough for the test step to complete, but not so long that it adds too much time to your test suite.
- Use descriptive error messages. When your tests fail, be sure to include descriptive error messages so that you can easily identify the cause of the failure.
Test retries can be configured in several ways.
Using playwright.config.ts:
import { defineConfig } from '@playwright/test';
export default defineConfig({
  // Retry each failed test up to 2 times.
  retries: 2,
});
In the playwright.config.ts file, I have retries set to 2. Within a specific test file, I can override this value with test.describe.configure(), either at the top of the file (outside any test.describe() or test() block) to cover the whole file, or inside a test.describe() block to cover only that group. The code below sets all tests in the group to retry 2 times.
import { test } from '@playwright/test';

test.describe(() => {
  // Applies to every test in this describe block.
  test.describe.configure({ retries: 2 });

  test('test 1', async ({ page }) => {
    // ...
  });

  test('test 2', async ({ page }) => {
    // ...
  });
});
Retries can also depend on the environment, for example enabling them only on CI and overriding them from the command line interface (CLI) when needed.
import { defineConfig } from '@playwright/test';
export default defineConfig({
  // Retry twice on CI, never locally.
  retries: process.env.CI ? 2 : 0,
});
You can override this setting by using the --retries flag when running tests from the command line:
npx playwright test --retries=3
Code Retry Blocks:
To reveal a text box hidden behind a toggle heading, you click the heading. In some cases, however, the click fails to trigger the event, for example when it happens before the page has fully loaded. Redoing the click may then be necessary, and it is best to retry at the step level rather than re-run the whole test. I have found this approach effective.
await expect(async () => {
  // The click and the visibility check are retried together as one unit.
  await this.page.getByRole('button').click();
  await expect(this.page.getByRole('textbox')).toBeVisible();
}).toPass();
The async () => { ... } function is executed once to begin with. If it completes without throwing, toPass() finishes without any retries. If the click or any assertion inside the block fails, toPass() re-runs the entire function, including the click and all the assertions, until the block passes or a timeout is reached. By default toPass() has no timeout of its own, so retries are bounded only by the test timeout; you can pass the timeout and intervals options to control how long and how often it retries.
My recent publication compiles a collection of 100 similar TypeScript programs. Each program explains essential TypeScript concepts and the role of test automation, and gives practical guidance on implementing it with Playwright. It should be a useful resource if you want to look deeper into similar topics.