Introduction

For the last six months, a Cypress happy-path e2e test has been implemented in the LiteFarm codebase, set up to run on a GitHub-hosted runner on pull requests to the integration branch. The statement coverage for this test is currently just under 60%, and it is the QA team’s goal to progressively increase this coverage metric.

Goals

Tests should be refactored to be more flexible (guidance from Mika Saryan). Some goals to aim for with our tests:

  1. The tests should be organized into smaller chunks with specific functionality: create task, create account, create farm, etc.

  2. The architecture has to be designed to allow flexibility in how the tests are run, for example:

    1. Run only a single language

    2. Allow running only "fast" tests. Tagging tests as fast or slow makes it possible to run a subset of the tests (e.g. when testing a commit vs. integration into master); see the tagging sketch after this list.

  3. Use "fixtures" (they may not be called this in Cypress). Fixtures are a common concept in automated testing: they perform certain prerequisites before running a test (e.g. creating an account). This allows you to run 1 test or 10 tests but create an account only once; see the setup sketch after this list.

  4. Eliminate any sleep or fixed time-based wait. These are usually the first cause of flaky and slow tests; see the sketch after this list.

  5. Add profiling. Knowing how long each test takes makes it possible to identify bottlenecks in the tests or in the backend; see the profiling sketch after this list.
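
As a rough sketch of the tagging in goal 2, the snippet below assumes the @cypress/grep plugin would be adopted; the registration code, spec file, tag names, and test bodies are illustrative assumptions rather than anything currently in the LiteFarm repo.

    // cypress/support/e2e.js — register the grep plugin (assumes @cypress/grep is installed)
    const registerCypressGrep = require('@cypress/grep');
    registerCypressGrep();

    // cypress/e2e/createFarm.spec.js — tag tests so subsets can be selected at run time
    describe('Create farm', () => {
      // Quick smoke check, suitable for every commit
      it('creates a farm with the minimum required fields', { tags: '@fast' }, () => {
        cy.visit('/');
        // ...steps to create a farm...
      });

      // Exhaustive check (e.g. every supported language), suitable for integration/master only
      it('creates a farm in every supported language', { tags: '@slow' }, () => {
        // ...steps repeated per language...
      });
    });

    // Run only the fast subset, e.g. on a commit:
    //   npx cypress run --env grepTags=@fast

The same --env switch could also carry a locale value (read via Cypress.env) so that a run is limited to a single language.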
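
For goal 3, a minimal sketch of the "create once, reuse" pattern using a before() hook plus cy.session(); the endpoint, payload, and data-cy selectors are hypothetical placeholders, not LiteFarm's actual API or markup.

    // cypress/e2e/tasks.spec.js
    describe('Tasks', () => {
      before(() => {
        // Create the test account once for the whole spec instead of once per test.
        // Endpoint and payload are hypothetical placeholders.
        cy.request('POST', '/api/e2e/user', { email: 'e2e@example.com', password: 'Password123!' });
      });

      beforeEach(() => {
        // cy.session caches the authenticated state, so the login flow runs only once
        // no matter how many tests reuse it.
        cy.session('e2e-user', () => {
          cy.visit('/');
          cy.get('[data-cy=email]').type('e2e@example.com');
          cy.get('[data-cy=password]').type('Password123!{enter}');
        });
      });

      it('creates a task', () => { /* ... */ });
      it('marks a task as complete', () => { /* ... */ });
    });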
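
For goal 4, a sketch of replacing a fixed wait with a network alias and retrying assertions; the route pattern, selector, and toast text are placeholders.

    // Before: brittle, always pays the full delay and still races the backend
    // cy.wait(5000);

    // After: wait for the actual request, then assert on visible state
    cy.intercept('POST', '**/task').as('createTask'); // route pattern is a placeholder
    cy.get('[data-cy=saveTask]').click();
    cy.wait('@createTask').its('response.statusCode').should('eq', 201);
    cy.contains('Task created').should('be.visible'); // Cypress retries until it passes or times out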
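
For goal 5, a sketch of lightweight per-spec profiling that records each test's duration and writes it to a results file; the file path and JSON shape are arbitrary choices for illustration.

    // cypress/support/e2e.js — global hooks apply to every spec
    const timings = [];
    let testStart;

    beforeEach(() => {
      testStart = Date.now();
    });

    afterEach(function () {
      // Needs a regular function (not an arrow) so `this.currentTest` is available
      timings.push({ test: this.currentTest.fullTitle(), ms: Date.now() - testStart });
    });

    after(() => {
      // One timings file per spec; slow tests and backend bottlenecks show up as outliers
      cy.writeFile(`cypress/results/timings-${Cypress.spec.name}.json`, timings);
    });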

Issues with the current implementation

It has been the experience of both the engineering and QA teams that Cypress tests that have been stabilized and run consistently on local environments are still flaky when run in the CI/CD pipeline on the GitHub-hosted runner. This has made progress on increasing test coverage quite challenging.

Some possible solutions to this issue are as follows:

Solution 1

Run Cypress tests on developers' local environments, and enforce this local testing by setting up Cypress tests to run on a Husky pre-push hook.

...

  1. Limited test coverage: Running E2E tests on a pre-push hook may not cover all possible scenarios and edge cases that can occur in a production environment. It is important to ensure that the tests cover a wide range of scenarios and edge cases so that the code is thoroughly tested (e.g. the MinIO server on local environments).

  2. Environment inconsistencies: Developers may encounter environment inconsistencies when running automated tests in their local environment, which can lead to false positives or false negatives.

Solution 2

Incorporate E2E tests into the deploy script. This assumes the current flakiness of E2E tests in CI is due to the test environment (the GitHub runner).

...

  1. Increased deployment time: Running E2E tests during deployment can increase the time it takes to deploy new code changes. This is because E2E tests typically take longer to run than unit tests or integration tests. This increased deployment time can cause delays in delivering new features or updates to users.

  2. Limited customization: GitHub Actions provide a limited set of customization options for E2E testing. Teams may have specific requirements or preferences for how their E2E tests are run that cannot be accommodated within the constraints of a GitHub Action workflow.

  3. Security: Running E2E tests in the production environment using a GitHub Action workflow may raise security concerns, as it requires giving access to the production environment to external systems.

Solution 3

Create a custom GitHub runner, hosted on a Docker hosting service, to more accurately mimic the beta environment.

...

  1. Maintenance: Running E2E tests on a dedicated CI runner requires additional maintenance and management overhead. Teams must manage and maintain the infrastructure, which can be time-consuming and costly.

  2. Resource utilization: Running E2E tests on a dedicated CI runner can consume significant resources, which can be expensive or require additional infrastructure scaling.

Decision

The team agreed that Solution 3 is the most viable. We should investigate different services that could spin up a separate environment when a PR is made: essentially maintain the current flow, but move off of GitHub runners to something more robust, such as CircleCI's free / non-profit tier.

Next steps

Next step | Owner | Due
Explore custom runner pricing and configuration options. | Mwaya |
Investigate whether CircleCI's free tier can build our respective Docker containers (custom runner?) | Mwaya |

Suggested next steps

After further consultation with one of the contributors (KK), the following amendments and additions to the current strategy were suggested:

  1. An upgrade to a larger GitHub-hosted runner (https://docs.github.com/en/actions/using-github-hosted-runners/using-larger-runners) to explore whether a lack of resources is the cause of the flakiness of Cypress tests in the CI/CD pipeline

  2. Run the app using Docker if the above does not yield any positive results

  3. Ensuring that engineers fix tests that break

  4. Standardize .env files across developers' local environments and test environments

  5. Breaking test suites down into smaller, focused test suites. Aim to keep each test suite focused on a specific area or feature of the application, and avoid creating overly large or complex test suites that test multiple unrelated features. Smaller, focused test suites are easier to manage, debug, and maintain; a sketch of this structure follows.
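
As an illustration of this last point (and goal 1 above), a sketch of how the single happy-path spec could be split into smaller, focused specs; the file names and test titles are illustrative only.

    // cypress/e2e/createAccount.spec.js — account creation only
    describe('Create account', () => {
      it('registers a new user', () => { /* ... */ });
    });

    // cypress/e2e/createFarm.spec.js — farm creation only; the account comes from shared setup
    describe('Create farm', () => {
      before(() => {
        // reuse the setup-once pattern from the Goals section instead of driving the UI
      });
      it('creates a farm', () => { /* ... */ });
    });

    // cypress/e2e/createTask.spec.js — task creation only
    describe('Create task', () => {
      it('creates and assigns a task', () => { /* ... */ });
    });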