Machine learning to identify correlations between KPIs and developer behaviours

Wayne Tofteroo, Head of Test Common Services at DVLA, explains how machine learning is used to identify positive and negative correlations between KPIs and developer behaviours

Monitoring the behaviour of software in a live production environment is essentially another form of testing. Faults, software behaviours and consumer interactions with your software can be observed, assessed and appropriate actions taken.

In extreme cases where service has been interrupted, monitoring will provide alerts to developers which will enable a fast assessment of a problem, a timely fix and restoration of service.

Good monitoring will provide alerts as to whether the software is meeting its non-functional KPIs and show where it can be improved; the behaviour of the software in terms of its interaction with its consumers can also be monitored.

It is important that the builders of your software understand how the end consumers are interacting with that software. For example, we may have many downloads of a mobile application but few users of the services it provides.

Good monitoring will provide insights into where the mobile application can be improved to increase take-up of your service. Part of the application may have seemed brilliant in the design sessions but not in actual usage with consumers. This can be identified by monitoring user journeys and corrected, and the take-up of your mobile application based service should improve.

The challenge with monitoring software is deciding which metrics to collect and how to interpret the data. Machine learning and AI provide the opportunity to dive deep into a vast pool of data points and pull out meaningful insights, filtering out the noise and identifying the data that is relevant to what we are looking for.

Once machine learning and AI have been deployed and set up, they can perform this task repeatedly and learn continuously as to which data is both significant and relevant to you.
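As a minimal sketch of the idea, a simple z-score filter (a deliberately basic stand-in for the anomaly-detection models a real ML pipeline would use, with hypothetical metric values) can separate significant data points from the noise:

```python
from statistics import mean, stdev

def significant_points(samples, threshold=2.5):
    """Flag metric samples that deviate strongly from the baseline.

    A z-score filter is a simple stand-in for the anomaly-detection
    models a production ML pipeline would apply to monitoring data.
    """
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []
    return [x for x in samples if abs(x - mu) / sigma > threshold]

# Hypothetical response times in ms; the 900 ms spike is the
# "signal" buried in otherwise steady readings.
latencies = [102, 98, 105, 99, 101, 900, 97, 103, 100, 98]
print(significant_points(latencies))  # → [900]
```

A real pipeline would run continuously over streamed metrics and retrain its baseline, but the principle of surfacing only the significant points is the same.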

DevOps and Monitoring

The benefit of the DevOps approach is a clear, fast, responsive pipeline delivering IT change that meets the business need when it is needed. A smooth-running DevOps operation can deliver immense value to an organisation.

However, creating an efficient DevOps setup is not going to happen overnight. Assembling developers and testers into squads (discrete, self-contained and self-managing development units of around 8 to 10 people) and arming them with a complex array of integrated tools, with the objective of delivering business value through the timely release of good software, will take time.

A settling-in period will be required while all the squad members learn how to work together and how to make the best use of the tools and processes. The dynamics and behaviours of the squad are important: for the squad to reach its full potential, all its members must work closely together. The squad needs to learn how to operate and to develop good behaviours, through good planning but also through trial and error.

However, DevOps means continuously deploying software releases to production. Good DevOps relies on monitoring being in place in production to provide alerts to both new errors and changes in software and consumer behaviour.

Machine learning and AI can provide insights from the monitoring of software performance and consumer interaction, highlighting issues that can be associated with behaviours in the squad, such as a lack of appropriate testing being carried out, or not identifying all the required acceptance criteria for a story before it is developed and deployed.
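As a minimal sketch of the correlation step (with entirely hypothetical per-sprint figures for test coverage and post-release defects standing in for real squad and production data), a Pearson coefficient shows how a KPI and a squad behaviour can be linked:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-sprint data: automated test coverage (%) versus
# defects found in production after that sprint's releases.
coverage = [62, 68, 71, 75, 80, 84]
defects = [14, 11, 9, 8, 5, 4]
print(round(pearson(coverage, defects), 2))  # → -0.99
```

The strongly negative coefficient links the behaviour (more automated testing) to the KPI (fewer production defects); an ML pipeline would run this kind of analysis across many metric pairs at once.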

Improvements to DevOps Squad Performance

The goal of all DevOps squads is to continuously improve both velocity and quality. There are many metrics that can be used to measure the performance of a squad; however, the metric most commonly used to deliver insight on the behaviour of the squad is velocity.

Velocity is what the squad can deliver in a sprint at an acceptable quality. Here the squad leader can see the limiting factors to achieving quality in the sprint: what new defect stories have been produced, how many user stories were found to lack good enough acceptance criteria, and how the squad members operated in the sprint. Issues with velocity can then be addressed during sprint retrospectives.
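A minimal sketch of velocity as a monitored metric (with hypothetical story-point figures and thresholds) could flag a sprint that falls well below the squad's recent baseline, so the drop can be raised in the retrospective:

```python
def flag_velocity_drop(velocities, window=3, tolerance=0.8):
    """Return indices of sprints whose velocity falls below
    `tolerance` times the average of the preceding `window` sprints.

    The window size and 80% tolerance are illustrative assumptions,
    not fixed rules.
    """
    flagged = []
    for i in range(window, len(velocities)):
        baseline = sum(velocities[i - window:i]) / window
        if velocities[i] < tolerance * baseline:
            flagged.append(i)
    return flagged

# Story points of acceptable quality delivered per sprint.
velocities = [21, 23, 22, 24, 14, 23]
print(flag_velocity_drop(velocities))  # → [4]
```

Sprint index 4 delivered well under the rolling baseline of the three sprints before it, which is exactly the kind of signal a squad leader would want surfaced automatically.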

The squad needs to learn what it can deliver in a sprint and then look to see how it can improve and deliver more in the next. However, DevOps squads can have very low cycle times, sometimes as low as a day or a few hours, which makes it difficult for a squad leader to manage the squad and ensure the best behaviours are followed. Monitoring and analysis therefore become critical in supplying DevOps leaders with the information they require to identify and rectify any issues in the internal processes and behaviours of the squads they manage.

Improvements to DevOps Testing

Measuring the quality of delivered and implemented software is difficult. Testing in the squad is designed to prevent defects from reaching live operation. The most common forms of development and testing in a DevOps squad are TDD (test driven development), BDD (behaviour driven development) and ATDD (acceptance test driven development).

TDD is most often used by developers to define what they should develop, while BDD and ATDD are used by the specialist testers to match the delivery to the acceptance criteria. All these techniques are designed to test that the software does exactly what is required at the required level of quality. However, testing of this nature essentially verifies the software under predicted conditions and with data selected for the test.
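As a minimal TDD sketch (the function and the story's acceptance criterion are hypothetical), the test is written first to define what should be developed, and then the simplest code that passes it is written:

```python
# Step 1 (red): the test is written first, from the story's
# acceptance criterion: "postcodes are stored upper-case, no spaces".
def test_normalise_postcode():
    assert normalise_postcode(" sw1a 1aa ") == "SW1A1AA"

# Step 2 (green): the simplest implementation that satisfies the test.
def normalise_postcode(raw: str) -> str:
    return raw.replace(" ", "").upper()

test_normalise_postcode()
print("test passed")
```

The test here encodes the predicted conditions and selected data; anything outside them, such as unexpected whitespace characters, is exactly the kind of edge case this style of testing can miss.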

The purpose of this type of testing is not to find defects, which means that not all edge cases may be tested. Good monitoring, with machine learning and AI to generate the analysis, will identify those issues when they occur and enable an appropriate assessment of their severity. New user stories can then be added to the backlog or a fix made immediately.

Monitoring coupled with machine learning and AI will enable important edge cases to be identified and included in automated regression test packs. This will improve the quality of testing and the quality of the delivered product.
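A minimal sketch of promoting a monitored edge case into a regression pack (the function, the pack format and the edge case are all hypothetical) might treat the pack as data, so new cases from production are covered on every subsequent build:

```python
def normalise_reg(raw: str) -> str:
    """Hypothetical normaliser for vehicle registration marks."""
    return raw.replace(" ", "").upper()

# Regression pack as data: (input, expected output) pairs.
regression_pack = [
    ("ab12 cde", "AB12CDE"),
    (" LM70 XYZ", "LM70XYZ"),
]

# An edge case surfaced by production monitoring (lower-case input
# with multiple internal spaces) is promoted into the automated pack
# so it stays covered on every build from now on.
regression_pack.append(("ab 12  cde", "AB12CDE"))

failures = [(i, o) for i, o in regression_pack if normalise_reg(i) != o]
print(failures)  # → []
```

Keeping the pack as data rather than hand-written test functions makes this promotion step cheap, which is what allows monitoring insights to flow straight back into the automated tests.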

Written by Wayne Tofteroo, Head of Test Common Services at DVLA