Accessible Entry Sign on a brick wall

Accessibility-Testing in webmate

Rule-based Testing of Accessibility Guidelines in webmate

A significant share of internet users have a disability and rely on assistive technology to carry out the same tasks as everyone else. To support them, accessibility guidelines define how to design a website so that people with disabilities can use it. Unfortunately, it is difficult to follow those guidelines, or even to know where a website violates them. I wrote a master’s thesis on this topic a few years ago, and this article is an excerpt from it. If you are interested, you can find the complete thesis under this link. Please keep in mind that even back then, the purpose of the thesis was to provide a proof-of-concept implementation, so the implementation details are subject to change. For this article, we updated some parts that were outdated. Still, some parts may describe an older version of webmate.

The internet is arguably one of the most influential technologies of the 20th century, if not the most influential. Its influence on human society is growing fast. Not only has it changed the way we interact, talk, shop or access information, it also helps us overcome physical disabilities. When two people interact online, they do not know anything about the physical appearance of the other person, or whether they are poor, rich, disabled or healthy. One could talk for hours with a stranger online without ever noticing that they are blind. Thus, in theory, the internet could act as an equalizer by bridging social gaps and integrating people who could not join the dialog on equal terms before.

Unfortunately, providing access to the internet is not as easy as it appears. Its main representation is text-based, making it hard for visually impaired people to receive the information they want. But other forms of presentation can pose challenges to certain groups of people as well. People with red-green color blindness may not be able to distinguish between green and red buttons, deaf people cannot hear audio, mute people cannot use speech assistants and people who are paralyzed may not be able to type on a keyboard. According to the WHO, around 15% of the global population has some kind of disability, which corresponds to around 1.1 billion people.

However, there are techniques and tools that can help disabled people to overcome these burdens. Most operating systems provide a wide range of tools that can assist them in their interaction with the computer. There are also designated techniques and tools for the web. Nonetheless, most tools require the website to be designed in a specific way, for example by providing an alternative text for pictures.

When looking for accessibility guidelines, one quickly encounters the WCAG guidelines

In an effort to establish standards for accessibility on the internet, the W3C [1] started the Web Accessibility Initiative and published a set of guidelines, the most important for the web being the WCAG [2]. In their most recent form, these guidelines define a set of design rules that enable disabled people to interact with a website.

Extract from the WCAG Guidelines
Guideline 1.4.3 as seen in the official WCAG 2.0 Guidelines

Those guidelines are developed and maintained by the W3C, which is responsible for many standards on the web. They are divided into four categories, each containing several guidelines and sub-guidelines. Each guideline is assigned to one of three levels, A, AA or AAA, with A being the bare minimum to pass and AAA being the highest standard of accessibility. As the owner of a website, you can only claim conformance to a level if you fulfil all rules of that level and of the levels below it. For example, in order to get A conformance, the website must fulfil all A rules, and for AA conformance it must fulfil all A and AA rules.
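The conformance logic can be sketched in a few lines. This is only an illustration: the function name and the input format (a count of violations per level) are assumptions made for this example and are not part of webmate or the WCAG.

```python
from typing import Dict, Optional

# Ordered from the bare minimum to the strictest level.
LEVELS = ["A", "AA", "AAA"]


def conformance_level(violations_per_level: Dict[str, int]) -> Optional[str]:
    """Return the highest level that can be claimed, or None if even level A fails.

    A level only counts if it and every level below it have no violations.
    """
    achieved = None
    for level in LEVELS:
        if violations_per_level.get(level, 0) > 0:
            break
        achieved = level
    return achieved
```

A page with no A violations but two AA violations could therefore claim level A at most, no matter how it does on the AAA rules.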

Part of website failing a guideline
Part of website failing Guideline 1.4.3. Note that the image has been zoomed in to get better picture quality. The text marked with the red rectangles uses colors that have a low contrast ratio.

Despite an enormous effort from the W3C and from several governments that adopted and promoted the WCAG in one form or another, several studies show that disabled people still face major difficulties when trying to use the internet in a barrier-free way. One of the reasons might be that their needs are not taken into account during the design process. This does not necessarily mean a lack of commitment, but should rather be seen as a sign of unawareness. So it might help if there were guidelines for the development process.

However, there are some aspects that are nearly impossible to check with techniques that are currently available. To give an example, Guideline 1.4.9 requires that images containing text are only used for decoration or where there is no alternative to presenting the text as an image. While it may be easy to detect text inside an image, it is next to impossible for a computer program to know whether that text is really essential or not. Other aspects are the number of hosts and third-party content, which often interferes with the accessibility of webpages.

The webmate Platform to the rescue

Webmate is a testing platform for websites, web applications and mobile apps. It consists of several core services that can be used to develop different testing applications. Due to its microservice architecture, it can be easily extended and integrated into existing test frameworks like Jenkins or Imbus. Tests are carried out on real browsers running either on real devices or inside virtual machines. The browsers themselves are controlled via automation frameworks like Appium or Selenium [3]. Putting all of this together, webmate is capable of deploying and accessing browsers automatically, executing tests and reporting the results back to the user in a clean and understandable way.

One of the core elements that we used in this thesis is webmate’s state extraction feature, which is based on the work of Dallmeier et al. [4]. Its main purpose is to extract a DOM representation of the current state of a website [5]. Nowadays, the HTML of a web page does not necessarily match the actual content presented to the user in their browser. In fact, the HTML is heavily modified by CSS styling and/or JavaScript. Different browsers may interpret the same HTML file differently, especially if the HTML is invalid. All those differences are reflected in the DOM, which is the internal representation of a website in a browser. With webmate, we can analyse the DOM to find errors.

Graphic display of webmate framework
An overview of the webmate framework.

Note: In the following paragraphs, you will find implementation details about webmate that are outdated. In particular, the implementation details of how rules are executed and the part related to the keyword engine were in an alpha state at the time of writing the thesis and are not representative of how webmate currently operates. We left them in here so that you can get a better picture of the general idea what rule-based testing is and how it is supposed to work in webmate once we release it.

Rule-based Testing in webmate

The basic idea of rule-based testing is to define a set of rules that must be met by the application. A rule always has a set of conditions and a set of facts:

Text about Contrast Rule
An example of a rule written in natural language.

Conditions determine when the rule needs to be checked. For example, a rule might apply to every text element with a font size of at least 18pt. Complex conditions that combine multiple statements via AND or OR operators ensure that the rule is only activated when truly needed.

Facts, on the other hand, are properties that must hold whenever the condition applies. For instance, if the element is a text element with a font size of at least 18pt (this is the condition part), the text color and the background color must have a contrast ratio of at least 3:1. Facts are the actual testing part in this approach. Each one has one or more checkers that determine whether the fact is true or false. They can be compared to an assert statement in functional testing.
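To make the split between conditions and facts more concrete, here is a minimal sketch of how such a rule could be modelled. It does not reflect webmate's actual implementation: the `Rule` class, the `any_of` helper and the element fields (`text`, `font_size_pt`, `contrast_ratio`) are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# Hypothetical flat view of one DOM element; webmate's real data model may differ.
Element = Dict[str, Any]
Predicate = Callable[[Element], bool]


@dataclass
class Rule:
    name: str
    conditions: List[Predicate]  # all must hold for the rule to apply (AND)
    facts: List[Predicate]       # all must hold for the element to pass

    def evaluate(self, element: Element) -> Optional[bool]:
        """Return None if the rule does not apply, otherwise True (pass) or False (fail)."""
        if not all(condition(element) for condition in self.conditions):
            return None
        return all(fact(element) for fact in self.facts)


def any_of(*predicates: Predicate) -> Predicate:
    """Combine predicates with OR, for conditions that offer alternatives."""
    return lambda element: any(p(element) for p in predicates)


# Illustrative rule for large text; the field names are assumptions.
large_text_contrast = Rule(
    name="large-text-contrast",
    conditions=[
        lambda e: bool(e.get("text")),
        lambda e: float(e.get("font_size_pt", 0)) >= 18,
    ],
    facts=[lambda e: float(e.get("contrast_ratio", 0)) >= 3.0],
)
```

Returning `None` for elements that do not match the conditions keeps "not applicable" apart from "passed", which matters when counting how many rules a page violates.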

Those two parts operate on the data extracted by the webmate crawler, which may be generated as part of another test that produces DOM data. Using individual checkers to determine whether a fact is true or not allows us to be flexible in the actual implementation. This way, it is also possible to use existing third-party tools.

Now, how do we translate a guideline into an executable rule test? Let us revisit Guideline 1.4.3 above. This guideline has two conditions: the element in question is either text or an image of text. Since these two types of text require fundamentally different analysis approaches, we would split them into two rules. For now, we focus on the text analysis. The checker could work as follows: whenever text is encountered, the color of the text and the background color of the enclosing element are extracted from the DOM. Then, the contrast ratio is calculated using the official WCAG definition. If the ratio is below the threshold for the given text size, the checker reports a failure.
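The contrast calculation itself is compact. The sketch below follows the relative luminance and contrast ratio formulas from the WCAG 2.0 definition. It assumes that both colors have already been resolved to opaque sRGB triples, so transparency and background images are ignored, and the function names are chosen for this example only.

```python
from typing import Tuple

RGB = Tuple[int, int, int]  # 8-bit sRGB channels, 0..255


def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.0 formula)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Ratio between 1.0 (same color) and 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def required_ratio(font_size_pt: float, bold: bool = False) -> float:
    """Level AA threshold: 3:1 for large text, 4.5:1 for everything else."""
    is_large = font_size_pt >= 18 or (bold and font_size_pt >= 14)
    return 3.0 if is_large else 4.5


def passes_contrast_check(foreground: RGB, background: RGB,
                          font_size_pt: float, bold: bool = False) -> bool:
    return contrast_ratio(foreground, background) >= required_ratio(font_size_pt, bold)
```

For example, mid-gray text `(119, 119, 119)` on a white background comes out just below 4.5:1, so it fails at 12pt but passes at 24pt, where only 3:1 is required.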

Executing the Rules

Once the rules are implemented, we still need a way to effectively test them on a large set of pages. Luckily, the webmate framework provides a simple tool for our purpose, the Keyword Engine.

The Keyword Engine defines a set of keywords that can be executed one after another. A keyword is simply a piece of code that can be executed. It can take parameters and works much like a function pointer. Without going into too much detail, we can use the Keyword Engine to open a web page, let webmate extract the DOM and then execute our rules.

Executions of keywords are organized in sessions. They are isolated from each other and use separate resources. Thus, we can execute as many tests as we require, as long as we have the necessary resources for it. By doing this, we can easily parallelize and scale our evaluation up or down.
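The idea of keywords and isolated sessions can be sketched as follows. This is not webmate's Keyword Engine API: the keyword functions, the context dictionary and `run_sessions` are hypothetical, and the DOM extraction and rule execution are replaced by placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]                 # state that belongs to one session
Keyword = Callable[..., None]            # receives the context plus its parameters
Step = Tuple[Keyword, Dict[str, Any]]    # a keyword together with its parameters


def open_page(ctx: Context, url: str) -> None:
    ctx["url"] = url


def extract_dom(ctx: Context) -> None:
    # Placeholder: a real engine would ask the browser for the current DOM.
    ctx["dom"] = f"<dom of {ctx['url']}>"


def run_rules(ctx: Context) -> None:
    # Placeholder: a real engine would evaluate every rule against ctx["dom"].
    ctx["violations"] = []


def run_session(steps: List[Step]) -> Context:
    """Execute the keywords of one session in order, on a fresh context."""
    ctx: Context = {}
    for keyword, params in steps:
        keyword(ctx, **params)
    return ctx


def run_sessions(urls: List[str], max_workers: int = 4) -> List[Context]:
    """Run one isolated session per URL; results keep the order of the input."""
    plans = [
        [(open_page, {"url": url}), (extract_dom, {}), (run_rules, {})]
        for url in urls
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_session, plans))
```

Because every session gets its own context, scaling up or down is a matter of changing `max_workers` in this sketch, as long as enough resources are available.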

Rules involving Webdriver

Sometimes a guideline requires checking aspects of user interaction or other properties that may not be visible in the DOM. Consider Guideline 2.2.2, which states that whenever a website uses animations that start automatically, those animations should not last longer than five seconds, or the user should at least be able to pause them. The second part cannot really be checked automatically, since it would require clicking on every element of the website and determining whether that stopped the animation. The first part, on the other hand, can be checked quite easily: open the page and take screenshots after zero, two and a half, and five seconds. If the animation lasts longer than five seconds, all three screenshots should be different. If we relied on the DOM alone, we would not be able to detect every kind of movement, especially when it is embedded in a Flash animation.

Besides controlling browsers for DOM extraction, webmate can also drive a browser directly through a webdriver. It connects to the respective browser and uses it for testing. Since webmate has full control over the browser, it can do everything we could do with a local webdriver on our own machine.

Evaluation of Guideline 2.2.2
Evaluation of Guideline 2.2.2. We use a webdriver to take screenshots in order to determine if the website uses autoplay features.

This webdriver can now be used to check properties that cannot be checked by looking at the DOM alone. Returning to our example, we use the webdriver to take the screenshots mentioned above and compare them. If all of the screenshots differ from one another, we report a failure.
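A minimal sketch of this check is shown below. To keep it independent of any browser, the screenshot source is passed in as a function. With Selenium's Python bindings, `driver.get_screenshot_as_png` could play that role. The function names, the byte-for-byte comparison and the timing logic are assumptions made for this example and may differ from the checker used in the thesis.

```python
import time
from typing import Callable, List, Sequence


def capture_frames(take_screenshot: Callable[[], bytes],
                   offsets_s: Sequence[float] = (0.0, 2.5, 5.0),
                   sleep: Callable[[float], None] = time.sleep) -> List[bytes]:
    """Take one screenshot at each offset (seconds after the first capture).

    The offsets must be in ascending order. The time needed to take a
    screenshot is ignored, so the real offsets will be slightly larger.
    """
    frames: List[bytes] = []
    elapsed = 0.0
    for offset in offsets_s:
        sleep(offset - elapsed)
        elapsed = offset
        frames.append(take_screenshot())
    return frames


def violates_autoplay_rule(frames: Sequence[bytes]) -> bool:
    """Report a failure if all screenshots differ from one another."""
    return len(set(frames)) == len(frames)
```

Comparing raw bytes is the simplest possible choice and also the most fragile one: a blinking cursor or a clock on the page would count as movement. A more robust checker would likely compare the images with some tolerance.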

We keep this look into the implementation short so that the article does not become too long. As mentioned above, if you are interested, please feel free to have a look at the complete master’s thesis.


In general, the results we obtained from the evaluation paint a rather grim picture. On average, more than half of the rules we tested were violated. Even easy guidelines like setting the language attribute in the root of the HTML file or using an alt attribute for non-text content were violated consistently. As we mentioned before, a considerable number of violations were caused by non-accessible third-party content being loaded onto the webpage at runtime. This content effectively undermines the effort of developers trying to build accessible websites, as they do not have control over it.
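Both of these checks are simple enough to sketch with Python's standard library. Note that this example parses a static HTML string for brevity, whereas the approach described in this article works on the DOM extracted from a real browser. The class and function names are made up for the illustration.

```python
from html.parser import HTMLParser
from typing import List


class BasicAccessibilityChecker(HTMLParser):
    """Collects two kinds of violations: missing lang and missing alt attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.violations: List[str] = []
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.seen_html = True
            if not (attributes.get("lang") or "").strip():
                self.violations.append("html element has no lang attribute")
        elif tag == "img" and "alt" not in attributes:
            # An empty alt="" is fine: it marks the image as decorative.
            self.violations.append(f"img without alt attribute: {attributes.get('src')}")


def find_violations(html: str) -> List[str]:
    checker = BasicAccessibilityChecker()
    checker.feed(html)
    checker.close()
    if not checker.seen_html:
        checker.violations.append("document has no html element")
    return checker.violations
```

Calling `find_violations` on a page source returns a list of human-readable messages, one per violation, and an empty list if neither check fails.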

Additionally, we used the number and size of JavaScript, CSS and HTML files and the number of hosts to construct a simple set of metrics for the complexity of a website. We used these metrics to investigate whether the complexity of a website correlates with the violations or not. In most cases, we detected positive correlations, indicating that more complex websites are more likely to contain violations. This correlation might be explained by more complexity also adding more possible points of failure, thus increasing the overall chance of violating a rule. We also found that the number of hosts correlated positively with the number of violations. This supports our hypothesis that third-party content often interferes with the accessibility of a webpage.
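Such a correlation can be computed with a few lines of code. The article does not say which correlation coefficient was used, so Pearson's r in the sketch below is an assumption, and the function name is chosen for this example.

```python
import math
from typing import Sequence


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's r for two equally long series, in the range -1.0 to 1.0."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)
```

The inputs would be two lists with one entry per tested page, for example the number of hosts and the number of violated rules. A value clearly above zero indicates that the two tend to grow together, but it says nothing about cause and effect.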

The guidelines themselves are another obstacle to more accessible websites. While other W3C standards define clear and easy-to-understand specifications, the WCAG 2.0 document stays vague most of the time to allow developers to find their own way of meeting the guidelines. However, this also means that it is often unclear what satisfies the standard and what does not. Consider Guideline 3.1.1, which states: “The default human language of each Web page can be programmatically determined.” One could argue that this is met by default by every page that contains text, as there are techniques that can infer the language of a website from its text. In the end, however, this does not help people with disabilities: assistive technology would always need to analyze the website thoroughly in advance to get information that the developer could provide at almost no cost. This issue is especially important as these guidelines are the foundation for several pieces of legislation around the world.

Screenshot of a tweet
A tweet from a person making a point about how the number of links on a website can distress people with ADHD and OCD.

Furthermore, the guidelines defined by the WCAG 2.0 may not cover the needs of disabled people entirely. Twitter user Safia Abdalla asked disabled users to tweet about the problems they face while browsing the web. While many replies described problems that are covered by the WCAG but still seem to be very widespread, others raised issues that are not covered by the guidelines. For example, a person with ADHD says that a great number of links annoys them, as they have to open each link in a new tab and read it thoroughly. There is no guideline that ensures that links are kept to a necessary minimum. This becomes an even bigger problem as websites tend to use as many links as possible to become more relevant for the Google search engine.

While the current situation may not look as good as we would wish, we are confident that improvements can be made. However, this can only be done if all sides work together. To build a more accessible internet, we need developers to commit to meeting a set of clear, well-defined standards. Furthermore, we need tools that automatically check websites and provide clear, valuable feedback to the developer. With our approach, we have taken a step towards this goal and want to encourage others to join the effort to remove barriers and build a more inclusive internet.