Why Selenium and Cucumber Should Not Be Used Together

In this post, I will explain why I believe it is a bad idea to write automated UI tests with Selenium and Cucumber.

The title of the post mentions Selenium and Cucumber because they are the most popular browser automation and BDD tools respectively; however, the argument of this article applies to any UI automation tool in combination with any BDD tool.

Before I dig deeper, let’s review some background information.

What is Selenium?

Selenium is a browser automation tool capable of interacting with the HTML elements of a web application to simulate user activity.

With Selenium WebDriver, we can write scripts in a number of programming languages, and it can be a great asset for multi-OS and cross-browser testing.

What is Cucumber?

Cucumber was created to drive the Behaviour Driven Development (BDD) process, so that the customer can describe their requirements as a series of examples called scenarios, in plain text files using the Gherkin language in the Given-When-Then format.

In the Cucumber world, these files are called feature files, and they are reviewed by the Scrum team to get a clear understanding of the requirements before starting the actual development.

Once development is underway, the developers and/or QA write Step Definitions, which are essentially snippets of code that bind the scenarios from the feature files to the test code that executes actions against the application under test.
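To make the binding idea concrete, here is a minimal sketch in Python. It is not Cucumber's API: the `step` decorator, the `STEP_DEFINITIONS` registry and the `run_scenario` helper are all hypothetical names invented for illustration, and the step bodies only update a dictionary where a real suite would drive a browser.

```python
import re

# Hypothetical mini-registry: each entry pairs a step pattern with the
# function that implements it. Real BDD tools keep a similar mapping.
STEP_DEFINITIONS = []


def step(pattern):
    """Register a function as the implementation of steps matching `pattern`."""
    def decorator(func):
        STEP_DEFINITIONS.append((re.compile(pattern), func))
        return func
    return decorator


def run_step(text, context):
    """Strip the Given/When/Then keyword, find the matching definition, run it."""
    body = re.sub(r"^(Given|When|Then|And|But)\s+", "", text.strip())
    for pattern, func in STEP_DEFINITIONS:
        match = pattern.fullmatch(body)
        if match:
            return func(context, *match.groups())
    raise LookupError(f"No step definition matches: {text!r}")


@step(r"I am on (\S+) Login page")
def open_login_page(context, site):
    # Stand-in for asking a browser driver to open the login page.
    context["page"] = f"{site}/login"


@step(r"I enter valid credentials")
def enter_valid_credentials(context):
    # Stand-in for typing a username and password and submitting the form.
    context["logged_in"] = True
    context["page"] = context["page"].replace("/login", "/my-account")


@step(r"I am redirected to My Account page")
def check_my_account_page(context):
    assert context["page"].endswith("/my-account"), context["page"]


def run_scenario(lines):
    """Run each step line in order against a shared context dictionary."""
    context = {}
    for line in lines:
        run_step(line, context)
    return context


SCENARIO = [
    "Given I am on abc.com Login page",
    "When I enter valid credentials",
    "Then I am redirected to My Account page",
]

if __name__ == "__main__":
    print(run_scenario(SCENARIO))
```

The point to notice is that every scenario line needs a matching piece of code, so the plain-text file and the code behind it have to be maintained side by side.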

Selenium and Cucumber

Both Selenium and Cucumber are great tools for their own purposes but when used together, things don’t marry up nicely! Let’s see why.

Stories are generally written from a user’s perspective, for example:

Feature: Login functionality

As a user of website abc.com

I want customers to be able to log in to the site

So that they can view their account information.

In turn, Scenarios in the feature files are written in a way which describes the behaviour of the feature when a user interacts with the application. For example:

Scenario 1: Valid login

Given I am on abc.com Login page

When I enter valid credentials

Then I am redirected to My Account page

You can then add more scenarios to test different data combinations.

Because both the story and the feature file are written from a high level point of view, and because we want to automate the scenarios, it only seems natural to start writing step definitions in Cucumber which call Selenium to drive the application, do the actions and verify the outcome.

But this is where the problem occurs: when we start combining Selenium with Cucumber to write automated UI tests.

In all fairness, in simple cases like the Login scenario above, things fit nicely together and the approach seems plausible, and in fact most examples that you see on the internet, demonstrating the use of Selenium and Cucumber, seem to limit themselves to the famous Login example.

The readers of such blogs would assume that they can take the simple Login scenario and apply the same principle to a wider context of an application.

Don’t be fooled though, as things can get very sour with Selenium and Cucumber when applied to a real-world, large web-based application.

Let’s take an example of a search results page of a typical e-commerce application which sells products online. Normally the search results page is full of features, such as filters, sorts, list of products, ability to change search, ability to paginate or auto-load on scrolling, etc, as can be seen in the screenshot below:

[Screenshot: selenium-cucumber-example]

I’m going to assume that each feature on the search results page was added to the site on an incremental basis using agile development.

Applying the same principle as in our simple Login example, as each feature is developed we would have a respective feature file filled with lots of different scenarios. For example:

In iteration 1 of the development, “Filter by Price” is developed, so we would have a feature file for it with its own scenarios related to the price filter.

In iteration 2 of the development, “Filter by Star Rating” is developed, so we would have a feature file for it with its own scenarios related to the star rating filter, and so on for each new feature.

It is important to note that the scenarios in each feature file are only specific to their respective feature. In fact, this is why they are called feature files because the focus is on the individual features.

As mentioned earlier, when the application is simple, we can survive the challenge of automating the scenarios at the UI layer with Selenium and Cucumber. However, as the application grows and new features are added, complexity arises because there can be dependencies between different features.

For instance, I could first filter my search results by price then apply another filter for star rating. Ah…we now have a problem!

Which feature file should this scenario now go in? The “Filter by Star Rating” file or the “Filter by Price” file? How about if I now add a scenario to apply a sort to my filtered results to sort by highest votes?

If a stakeholder wishes to see what our test coverage is, which of the feature files should he look into? Will he get the full picture of scenario coverage by reading just one of the feature files or would he need to read all feature files?

At the time of development, when features are developed one by one in each iteration, each feature file is focused on its own feature. So at some point, when we have multiple features, we need to start thinking about testing them not only in isolation but also in creative scenarios that combine different features.

And in fact, this is what real users of the application will do. They will first enter their search criteria; then, once on the search results page, they might paginate, then filter, then sort, then go back, and so on, and they can do these actions in any order. There won’t be a prescribed order of events. This is a real user journey and a real test of the system!

The majority of bugs in an application are exposed either when a feature itself is buggy or when two features that work perfectly well in isolation don’t work together. This is what the Pairwise Testing Model is based upon.
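As a rough illustration of the pairwise idea, the sketch below picks a small set of test cases so that every pair of options across any two features appears together at least once. The feature names and options in `FEATURES` are made up for this example, and the greedy selection is just one simple way to build such a set, not an optimal algorithm.

```python
from itertools import combinations, product

# Hypothetical features of the search results page and their options.
FEATURES = {
    "price_filter": ["off", "under_50"],
    "star_filter": ["off", "4_plus"],
    "sort": ["default", "highest_votes"],
    "page": ["first", "second"],
}


def all_pairs(features):
    """Every pairing of one option each from two different features."""
    pairs = set()
    for (f1, opts1), (f2, opts2) in combinations(features.items(), 2):
        for o1, o2 in product(opts1, opts2):
            pairs.add(((f1, o1), (f2, o2)))
    return pairs


def pairs_in(case):
    """The option pairs that a single test case (feature -> option) exercises."""
    return set(combinations(case.items(), 2))


def pairwise_cases(features):
    """Greedily pick full combinations until every pair is covered once."""
    names = list(features)
    candidates = [dict(zip(names, combo)) for combo in product(*features.values())]
    uncovered = all_pairs(features)
    chosen = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda case: len(pairs_in(case) & uncovered))
        chosen.append(best)
        uncovered -= pairs_in(best)
    return chosen


if __name__ == "__main__":
    for case in pairwise_cases(FEATURES):
        print(case)
```

For these four two-option features, the selected set is smaller than the 16 exhaustive combinations while still exercising every two-feature interaction. Note that each selected case spans all four features at once, which is exactly the kind of scenario that has no natural home in a single-feature file.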

So, what’s the big deal with using Selenium and Cucumber together?

Where at all possible, we should not use the web GUI for functional verification. The functionality of a feature should be tested at the API layer by integration tests.
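A minimal sketch of what testing below the UI can look like, assuming the filtering and sorting logic is reachable without a browser. The `Product` type, the `search` function and the catalogue data are hypothetical stand-ins for whatever service or API the real application exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price: float
    stars: int
    votes: int


def search(products, max_price=None, min_stars=None, sort_by=None):
    """Hypothetical search service: apply the filters, then the sort."""
    results = [
        p for p in products
        if (max_price is None or p.price <= max_price)
        and (min_stars is None or p.stars >= min_stars)
    ]
    if sort_by == "highest_votes":
        results.sort(key=lambda p: p.votes, reverse=True)
    return results


# Made-up test data for illustration only.
CATALOGUE = [
    Product("kettle", 25.0, 4, 120),
    Product("toaster", 40.0, 5, 300),
    Product("blender", 80.0, 5, 90),
    Product("mug", 5.0, 3, 15),
]


def test_price_filter():
    names = [p.name for p in search(CATALOGUE, max_price=50)]
    assert names == ["kettle", "toaster", "mug"]


def test_price_and_star_filters_with_sort():
    # Two filters plus a sort in one check, with no browser involved.
    results = search(CATALOGUE, max_price=50, min_stars=4, sort_by="highest_votes")
    assert [p.name for p in results] == ["toaster", "kettle"]


if __name__ == "__main__":
    test_price_filter()
    test_price_and_star_filters_with_sort()
    print("all checks passed")
```

Checks like these run in milliseconds, so combinations of filters and sorts can be covered thoroughly here rather than through slow UI tests.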

The UI should be reserved for checking the user flows through the application (end-to-end tests) and for making sure the relevant expected modules or widgets are present on the page as the user navigates from one page to another.

A typical user journey would entail:

1 – Navigate to the homepage of abc.com website

2 – Search for a product from homepage

3 – Browse through the list of search results

4 – Apply filter and/or sort

5 – Read product details

6 – Add the product to basket

7 – Continue to checkout…

Selenium is excellent at automating these scenarios and checking for various elements on each page. As I mentioned above, that is what we should focus on when testing at the UI layer, along with testing the different state transitions.
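The journey above can be sketched as a single plain test function that checks something at each step. To keep the example runnable without a browser, `FakeShop` is a hypothetical in-memory stand-in; in a real suite its methods would be page objects wrapping WebDriver calls, and the product data is invented.

```python
class FakeShop:
    """In-memory stand-in for page objects that would wrap a browser driver."""

    def __init__(self):
        self.page = None
        self.results = []
        self.basket = []

    def open_homepage(self):
        self.page = "home"

    def search(self, term):
        # Made-up catalogue: product name -> price.
        catalogue = {"kettle": 25.0, "travel kettle": 18.0, "toaster": 40.0}
        self.results = [(n, p) for n, p in catalogue.items() if term in n]
        self.page = "search-results"

    def filter_by_max_price(self, limit):
        self.results = [(n, p) for n, p in self.results if p <= limit]

    def open_product(self, name):
        assert name in [n for n, _ in self.results], f"{name} not listed"
        self.page = f"product:{name}"

    def add_to_basket(self):
        self.basket.append(self.page.split(":", 1)[1])

    def go_to_checkout(self):
        self.page = "checkout"


def test_search_filter_and_buy_journey():
    shop = FakeShop()

    shop.open_homepage()                      # 1: homepage
    assert shop.page == "home"

    shop.search("kettle")                     # 2 and 3: search, browse results
    assert shop.page == "search-results"
    assert len(shop.results) == 2

    shop.filter_by_max_price(20)              # 4: apply a filter
    assert [n for n, _ in shop.results] == ["travel kettle"]

    shop.open_product("travel kettle")        # 5: product details
    assert shop.page == "product:travel kettle"

    shop.add_to_basket()                      # 6: add to basket
    assert shop.basket == ["travel kettle"]

    shop.go_to_checkout()                     # 7: checkout
    assert shop.page == "checkout"


if __name__ == "__main__":
    test_search_filter_and_buy_journey()
    print("journey passed")
```

Written this way, the test is a sequence of actions interleaved with checks, which reads naturally as code but does not map onto a single Given-When-Then block.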

As can be seen, each user journey through the application touches on many pages and potentially interacts with multiple features on each and every page, and we would be verifying various things at each step throughout the journey. Using a “feature file” to document these scenarios therefore makes absolutely no sense whatsoever, because we’re not testing a feature; we’re testing the integrated system.

Things really go pear-shaped when we attempt to write the end-to-end scenarios in a Given-When-Then format. How many Givens are we going to have? How many Thens are we going to have?

One could argue that for end-to-end tests we could just use Selenium on its own without Cucumber, and have separate automated tests for each feature using Selenium and Cucumber. Again, I don’t recommend this approach, as you will possibly have duplicate tests, and we know how slow and brittle UI tests are, so we should aim to have fewer of them, not more! Moreover, you will still have to deal with tests for feature dependencies.

To summarise:

Cucumber is a great tool for checking the behaviour of a feature at the API layer with integration tests where each feature can be thoroughly tested. This tool should be used for Story Testing.

Selenium is a great tool for automating user scenarios at the UI layer and checking the behaviour of the system as a whole. This tool should be used for User Journey Testing, encompassing many user stories.

When we get to System Integration Testing or UI Testing, it is best to use Selenium without the underlying Cucumber framework, as trying to write Cucumber feature files for user journeys can get very cumbersome and would not serve the purpose the tool is built for.

19 Replies to “Why Selenium and Cucumber Should Not Be Used Together”

  1. Filtering on Price and filtering on Facilities are not two features. Filtering as such is a feature and that feature can be tested in a scenario, with all sorts of different combinations (if you want to), by applying a table with the combinations you want to test. You may very well start out with a Price filtering feature but as soon as you implement the second filter argument, you should refactor into a general filter feature.

    Since all the steps are reusable for new feature files, you may reuse them in a sorting feature test so you can test sorting on all kinds of filtered results.

    User journeys are not features. User journeys consist of the use of features and each of these features can very well be tested separately or in combination as needed at the UI level. It does make good sense to test features at the API level provided you have an API level to test on, but that may not be the case.

    Finally, the test on the API level will not ensure that the features work on the GUI level. Tests on the GUI level will ensure that, and using Cucumber as the test description language is quite fine as this may be understood by the business domain expert.

    So in short, I really don’t get the point of this post.

  2. Hi,
    I don’t agree with this blog. Using Selenium + Cucumber is more powerful: the author can define filters as a feature and describe each filter type as a scenario. If you want to test through the API, you can use the respective library and call the respective methods in the step definitions.

    If you have any doubts about this implementation, please ask the community for help, but please don’t write this kind of blog blindly. It will mislead Cucumber users.

  3. Agreed that BDD is not really suitable for large projects with a very large variety of preconditions and system states. As for me, JUnit (1 method = 1 test) is the best of all the tools for testing UI that I have used, because of the maintainability and simplicity of a framework built on it.

  4. Agree with you completely. WebDriver is best when clubbed with any unit testing framework (like TestNG or JUnit) or with your own custom framework.

  5. Hi,
    I don’t really agree with the conclusion of the article.

    OK, using Selenium + Cucumber will give some situations where you may have issues if you just run your features one after the other, because the initial application state will not be the expected one.

    When you write your features, you expect to be in an “initial state”, so you need your test steps to restore this initial state at the end of the test. This is one of the main concerns in all test automation projects, whatever technology is used to implement the tests.

  6. Seriously, a very confusing article. The writer seems confused between the two tools and fails to explain the reason for not using them together. I totally wasted time and energy reading this stupid topic.

    1. I’m sorry that you failed to get the point of the post about using cucumber with webdriver. I have experience working with both tools and this post comes straight from experience. It is total insanity to write cucumber UI tests!

      1. “I’m sorry you failed” is not an apology – if a person hasn’t understood your point, you need to give them the benefit of the doubt and assume there’s a flaw in your article. Condescension is unbecoming in a testing context, and does little to promote a collaborative atmosphere, especially in an agile setting.

        The majority of comments here seem to disagree with your article, so if I’d written it my next question would be: did I fail to get the point across? Or am I incorrect?

        At the core of your article is a fundamental misunderstanding of what Cucumber does, and what the best practices for Cucumber are.

        1. Your cucumber feature tests should evolve over the course of the development. In an agile project, your requirements and your developed code evolves, why not your tests?
        2. Your example considers different search terms as different features (which is a bizarre granularity, by the way). A feature file would be developed for each. As the complexity and repetition of the features increases, the more that complexity should become enshrined in your glue language and less in the cucumber feature files. You get to a point where you rationalise your multiple “search” related feature files into one feature file for “searching”. Remember feature files want to be declarative, not procedural.
        3. I sincerely doubt that anyone wishing to analyse code coverage would content themselves with viewing one feature file out of several, ever. Besides, that’s what feature files are for – they enshrine a requirement. So the QA manager ensures that there’s a 1:1 mapping from requirement to feature file, and that each feature file when executed has passed or failed as expected (depending on the stage of the Iteration).
        4. You spoke about Selenium on top of a Cucumber framework – surely it should be the other way round?
        5. There ought to be functional tests which are achieved via API , this is true. But behaviours are necessarily human – humans don’t use APIs. It therefore makes sense to apply the behavioural model to the UI.

  7. I totally agree with what Carsten Jensen or Olivier said.

    But anyway thank you AMIR GHAHRAI, for your time trying to explain your point of view, so as you said: “why I believe it is a bad idea to write”.

  8. I think the most important factor is to use cucumber for what it’s meant to be used: describe the behaviours of your system.

    I think a common mistake is to use it as a tool to describe your tests rather than your system. It’s a small nuance but an important one.

    From there, you should choose the path of least resistance to implement your steps and the one that allows you to code your tests as early as possible.

    Granted every project is different and it often happens we have to deal with badly coded monolithic legacy systems. But in most cases, if you understand and agree with what I have written above, you should tend to avoid implementing Cucumber tests at the UI level. Or at least, it should never be your first choice.

    I do make UI tests with WebDriver. But they are system tests making sure that all parts of the system work well together once they are built and deployed. I usually find Cucumber is not the right tool to drive that type of tests. I rather pair WebDriver with any xUnit tool and use something like Allure as a reporting framework.

  9. Please avoid writing biased blogs. I don’t agree with this at all … this means you have not understood the use of Cucumber/Selenium fully. There is no such rule against using these together, nor any such drawback; it’s just a matter of how comfortable you are with these tools…
    The only problem with such blogs is that they misguide people ….

  10. Totally agree. No one understands that. Cucumber is good for unit and API tests, but not so good for e2e tests. Even though we can, and many people do, write e2e tests by mixing Cucumber and Selenium, there are many things to take care of while writing such tests, like the BDD pattern, etc.

  11. Your article proceeds from a misunderstanding about what it is that Cucumber does and best practices associated with Cucumber and BDD.

    1. I don’t know why anyone would, when assessing code coverage, only look at a single feature file. Code coverage assessment has to be comprehensive – you can’t extrapolate code coverage from a single test. With cucumber in a BDD context one requirement should map to one feature 1:1 . As such, a QA manager need only ensure that each requirement has a feature file, and in execution each of those feature files passes or fails as expected.
    2. The agile approach is to be open to, and indeed embrace, the potential for requirements to change, and for code to be changed. Why not tests? One’s Cucumber tests ought to adapt to changing needs for the project, as such;
    3. In your example, where search terms were deemed separate features (which is an unusual approach), then composing the search terms will indeed eventually lead to complexity and repetition. The Cucumber Way, at that point, would be to rationalise the individual feature files into a single “Searching” feature file, and consign the complexity instead to the glue-language. Remember, feature files want to be declarative, not procedural.
    4. You mentioned “Selenium without the underlying Cucumber framework” – surely it’s the other way round? Cucumber -> Glue Language -> Selenium (webdriver) .
    5. There is a need for functional tests via APIs, this is true. However, behaviours are necessarily human-centric. Humans don’t use APIs. Therefore Behaviours (i.e. User Journeys) ought to be tested via the UI.

    Hopefully this comment won’t be deleted like the last one.

  12. I disagree. The advantage of cucumber is a shared language that is human readable.

    I also disagree with a comment made above by sbouf10 – ‘I think a common mistake is to use it as a tool to describe your tests rather than your system. ‘

    Developers are using it to describe their tests, because it’s just easier to read for all parties and quicker than writing in code, and can be universally portable between technology stack preference.

    There is currently a limitation with ‘Gherkin’ and a common misunderstanding of ‘Background’. Using Background for multiple scenarios is a problem because it runs for every scenario, rather than once for all scenarios. Developers are using ‘Background’ for websites to open a URL, which is time-expensive, because it will run for every scenario, which you don’t want.

    I do agree that currently the implementation of cucumber is slowing down the test coverage, but not significantly, if managed well.

  13. Cucumber feature files can almost be looked at as a method in a developer’s mind. In the above Filter By scenarios, we’ll likely start by writing one for Filter by Price. Then, later realizing there are many similarities among all the Filter By features, we decide to refactor the initial method by passing in several parameters. The end result is that you have one method declared in one place that can be called any time you want.

    I understand you can do similar things in a feature file by parameterizing values and providing test data, except you would still need to do the real programming work behind the feature file in the Step Definition to implement all the logic. As a developer, if I can accomplish this task in one place with one method, why would I want to maintain two (Feature File and Step Definition)?

    There are other BDD tools, such as RSpec, and many unit test frameworks that can achieve the same goal of describing which story each test covers, and hence provide the ability to generate an RTM for management.

    I have done projects that used Cucumber with Selenium, but ultimately moved away from it for the two reasons below: 1) Product Owners didn’t want to maintain the feature files as they didn’t want to learn Gherkin (having the PO own the feature file was the original purpose of using Cucumber); 2) Automation developers didn’t want to take care of the feature file while maintaining the real code behind it at the same time, because it took longer to complete one story, hence lower ROI.

    The real focus of our project was to let the tests discover defects and have the developers fix them. We didn’t need fancy test results with bars and graphs, an area in which Cucumber is strong. Nor did our PO look into the feature files much at all.

    If it works for your project, go for it. It really depends on what each project dictates.
