At Twilio SendGrid, we write end-to-end (E2E) tests toward the end of a new feature or page's development cycle to ensure that the frontend and backend are connected and working properly from an end user's perspective.
We have experimented with various E2E testing frameworks and libraries, such as our own custom in-house Ruby Selenium framework, WebdriverIO, and, primarily, Cypress, for over two years, as highlighted in part one and part two of the blog post series documenting our migration across these solutions. Regardless of the framework or library we used, we found ourselves asking the same questions about which features we could automate and write E2E tests for. After identifying the features we could test, we also noticed ourselves applying the same general strategy over and over again when writing and setting up the tests.
This blog post does not require prior knowledge of any particular E2E testing library or framework, but it helps if you've worked with web applications and wondered how best to automate interactions in the browser to test that pages work correctly. We aim to walk you through how to think about E2E tests so you can apply these questions and this general strategy to any framework you choose.
When it comes to writing E2E tests, we need to make sure the flows we are testing in our application meet certain criteria. Let's walk through some of the high-level questions we ask ourselves to determine whether an E2E test is possible to automate.
1. Is there a reliable way, such as the API, to reset the user's data back to a certain state before each test? If there is no way to reliably reset a user to the state you want, the test cannot be automated and expected to run as part of your blocking tests before deployment. Resetting a user's state through the UI is also an antipattern: it is slow, usually non-deterministic, and automating steps through the UI is already flaky enough. It is more reliable to make API calls to reset the user's state without ever having to open a page in the browser. Alternatively, if such a service exists, you can create new users with the proper data before each test. As long as we reset a persisted user or create a fresh user before each test, we can then focus on the parts of the page we are testing.
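As a sketch of what resetting through the API can look like, here is a minimal plain-Node helper (assuming Node 18+ for built-in `fetch`; the endpoint, auth scheme, and resource names are hypothetical, and the HTTP client is injectable so the helper can be exercised without a live server):

```javascript
// A sketch of an API-based reset helper. The /api/posts endpoint and
// bearer-token auth are illustrative assumptions, not a real SendGrid API.
// fetchImpl is injectable so the helper can be tested without a server.
async function resetUserPosts({ baseUrl, apiToken, fetchImpl = fetch }) {
  const res = await fetchImpl(`${baseUrl}/api/posts`, {
    method: 'DELETE',
    headers: { Authorization: `Bearer ${apiToken}` },
  });
  if (!res.ok) {
    throw new Error(`Reset failed with status ${res.status}`);
  }
  return res.status;
}
```

A test suite would call a helper like this in its setup hooks so every run starts from the same known state, no matter how the previous run ended.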
2. Do we have control over the feature, API, or system we intend to test? If you rely on a third-party service for billing or any other feature, is there a way to mock it out or make it behave deterministically with certain values? You want to gain as much control over the test as possible to reduce flakiness. You can create dedicated test users with isolated resources or data per test run so they cannot be affected by anything else.
3. Is the service or feature itself consistent enough to work within a reasonable timeout? Often you may have to poll or wait for certain data to be processed and make it to the database (such as slower asynchronous updates and triggered email events). If those operations reliably complete within a reasonable window of time, you can set proper polling timeouts as you wait for specific DOM elements to show up or data to be updated.
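When a flow depends on slow asynchronous processing, a generic polling helper keeps the waiting logic in one place. A minimal sketch in plain Node.js (Cypress has built-in retry-ability for its own commands; a helper like this is for polling an API or database outside of those, and the function and option names are illustrative):

```javascript
// Retry an async check until it returns a truthy value or the timeout
// elapses. Returns the truthy result; throws if the deadline passes.
async function pollUntil(check, { timeoutMs = 10000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    // Wait before the next attempt so we don't hammer the service.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

For example, you might poll a hypothetical `/api/email-events` endpoint until a triggered email event shows up, with the timeout set just above the window in which the service reliably completes.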
4. Can we select the elements we need to interact with on a page? Are you dealing with iframes or generated elements you do not have control over and cannot alter? To interact with elements on a page, add specific selectors such as `data-hook` or `data-testid` attributes rather than selecting on ids or class names. Ids and class names are more prone to change because they are commonly associated with styles; imagine trying to select hashed class names or ids generated by styled-components or CSS modules. For third-party generated elements or open source component libraries like react-select, you can wrap those elements in a parent element with a `data-hook` attribute and select the children underneath. For iframes, we created custom commands to extract the DOM elements we need to assert on and interact with, which we'll provide an example of later on.
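A tiny helper can keep those attribute selectors consistent across tests. This is our own convention sketched out, not a framework API (the `data-hook` attribute name and the wrapped-component example are illustrative):

```javascript
// Build a selector for a data-hook attribute so tests never key off
// ids or hashed class names.
const hook = (name) => `[data-hook="${name}"]`;

// For third-party components wrapped in a parent element that carries
// the data-hook, select the generated children underneath the wrapper.
const hookChild = (parentName, childSelector) =>
  `${hook(parentName)} ${childSelector}`;
```

In a Cypress test this reads as `cy.get(hook('save-button'))`, or `cy.get(hookChild('country-select', 'input'))` for an input rendered by a wrapped react-select component.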
There are more considerations to take into account, but it all boils down to one question: Can we repeat this E2E test in a consistent and timely manner and achieve the same results?
Once you've decided a feature can be tested, our general strategy for writing the tests goes as follows.

1. Figure out the high-value test cases we can automate. Some examples include happy path tests covering most of a feature flow: performing CRUD operations through the UI for a user's profile information, filtering down a table for matching results given some data, creating a post, or setting up an API key. Other edge cases and error handling, however, may be better covered with unit and integration tests. Run each candidate through the questions from the previous section to help shorten your list of test cases.
2. Think about how to repeat these tests by setting up or tearing down through the API as much as possible.
For the high-value, automatable test cases, start to note which things you should set up through the API. Some examples: seeding the user with enough filterable data for pagination, replenishing data that expires on a rolling 30-day window, or tearing down data left over from previous successful or incomplete runs before the current test starts again. Each test should be able to set itself up in the same repeatable state regardless of whether the last run succeeded or failed.
It’s important to think: how can I reset this user’s data back to the starting point so I can test only the part of the feature I want?
For example, if you want to test that a user can add a post and see it show up in the user's post list, any matching post must first be deleted through the API so the test can create it fresh.
3. Walk in your customer’s shoes and keep track of the UI steps needed to fully finish a feature flow. Record the steps for a customer to complete a full flow or action. Keep track of what the user should or should not see or interact with after each step. We’ll be making sanity checks and assertions along the way to ensure the users are encountering the proper sequences of events for their actions. We will then translate the sanity checks into automated commands and assertions.
4. Make flows maintainable and automatable by adding specific selectors and implementing page objects (or any other kind of wrapper). Review the steps you wrote down for maneuvering through a feature flow. Add specific selectors like `data-hook` attributes to the elements the user interacted with, such as buttons, modals, inputs, table rows, alerts, and cards. If you prefer, you can create page objects, wrappers, or helper files with references to those elements via the selectors you added. You can then implement reusable functions to interact with the page's actionable elements.
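One possible shape for such a page object is sketched below. The page, selector values, and method names are all illustrative; in a real Cypress suite the driver would simply be the global `cy`, injected here as a constructor argument so the class can be exercised in isolation:

```javascript
// A sketch of a page object for a hypothetical posts page. The driver
// exposes visit/get like Cypress's cy; injecting it keeps the class
// easy to test without a browser.
class PostsPage {
  constructor(driver) {
    this.driver = driver;
    this.selectors = {
      addButton: '[data-hook="add-post-button"]',
      titleInput: '[data-hook="post-title-input"]',
      saveButton: '[data-hook="save-post-button"]',
      postRow: '[data-hook="post-row"]',
    };
  }

  visit() {
    return this.driver.visit('/posts');
  }

  // One reusable function per user action on the page.
  addPost(title) {
    this.driver.get(this.selectors.addButton).click();
    this.driver.get(this.selectors.titleInput).type(title);
    this.driver.get(this.selectors.saveButton).click();
  }
}
```

Tests then read at the level of user actions (`postsPage.addPost('Hello')`) instead of raw selectors, and a selector change touches one file instead of every spec.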
5. Translate the user steps you recorded into Cypress tests with the helpers you created. In the tests, we commonly log in a user through the API and preserve the session cookie before each test case runs to stay logged in. We then set up or tear down the user's data through the API to have a consistent starting point. With everything in place, we visit the page under test directly and execute the steps for the flow, such as a create, update, or delete flow, asserting what should happen or be visible on the page along the way. To speed up tests and reduce flakiness, avoid resetting or building up state through the UI: bypass things like logging in through the login page or deleting things through the UI so you can focus on the parts you want to test. Always do this setup in `before` or `beforeEach` hooks rather than in `after` or `afterEach` hooks; if a test fails partway through, cleanup in the after hooks never runs, and the leftover state causes subsequent test runs to fail.
6. Hammer and stamp out test flakiness. After implementing the tests and seeing them pass a couple of times locally, it is tempting to put up a pull request, merge it right away, and have the tests run on a schedule with the rest of your test suite or trigger them in your deployment steps. Before you do that:
- First, try leaving the user in various states and see if your tests still pass to ensure you have the proper setup steps.
- Next, investigate running your tests in parallel when triggered during one of your deployment flows. This lets you see whether resources are being stomped on by the same users and whether any race conditions are happening.
- Then, observe how your tests run in headless mode in a Docker container to see if you need to bump up any timeouts or adjust any selectors.

The goal is to see how your tests behave across repeated runs under different conditions and to make them as stable and consistent as possible, so you spend less time going back to fix tests and more time catching actual bugs in your environments.

Here is a sample Cypress test boilerplate layout in which we created a global login support command called `cy.login(username, password)`. We set the session cookie explicitly and preserve it before each test case so we can stay logged in and go directly to the pages we are testing. We also carry out setup and teardown through the API, bypassing the login page every time, as shown below.
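A sketch of what that layout can look like follows. The `/api` endpoints, cookie name, response field, and `data-hook` values are illustrative assumptions, and `Cypress.Cookies.preserveOnce` applies to Cypress versions before 12; newer versions would use `cy.session` instead:

```javascript
// cypress/support/commands.js (sketch): log in through the API and set
// the session cookie, assuming a hypothetical /api/login endpoint that
// returns the token in its response body.
Cypress.Commands.add('login', (username, password) => {
  cy.request('POST', '/api/login', { username, password })
    .its('body.sessionToken')
    .then((token) => cy.setCookie('session_id', token));
});

// posts.spec.js (sketch)
describe('Posts page', () => {
  before(() => {
    // Log in once through the API, bypassing the login page entirely.
    cy.login(Cypress.env('USERNAME'), Cypress.env('PASSWORD'));
  });

  beforeEach(() => {
    // Preserve the session cookie so we stay logged in between tests.
    Cypress.Cookies.preserveOnce('session_id');
    // Tear down leftover data through the API for a consistent start.
    cy.request('DELETE', '/api/posts');
    // Go straight to the page under test.
    cy.visit('/posts');
  });

  it('adds a post and shows it in the list', () => {
    cy.get('[data-hook="add-post-button"]').click();
    cy.get('[data-hook="post-title-input"]').type('My first post');
    cy.get('[data-hook="save-post-button"]').click();

    // Assert the new post appears in the list.
    cy.get('[data-hook="post-row"]').should('contain', 'My first post');
  });
});
```

Note that all state resets happen in the hooks before each test, never in `after` hooks, so a failed run can never leave the next one in a broken state.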