Over the last decade, DevOps has become the de facto model for software development. It fits well with the enterprise agenda of quickly delivering quality products and services and meeting customer expectations with frequent updates. With its use of automation, off-the-shelf tools and process orchestration, DevOps helps teams engineer quality from the ground up, so that quality parameters are checked off the first time. It’s sometimes called a fail-proof process.

This, however, is an overreach. Even the most sophisticated space missions and the most advanced digital companies are prone to failure. With increasingly complex application architectures and the involvement of third-party entities, it’s practically impossible to develop fail-proof systems. The number of potential failure points grows rapidly as stakeholders multiply across the application ecosystem.

This can tarnish the sheen of even the most mature DevOps practices. Even as enterprises deploy mechanisms to engineer quality – i.e., democratizing quality by shifting quality assurance (QA) left in the lifecycle – they can still fail to weed out all possible defects.

How do enterprises ensure defects don’t clog pipelines and are remediated without impacting delivery speed?

Enter Self-Healing QA

Self-healing QA can help eliminate bottlenecks in DevOps pipelines by addressing defects on the go with artificial intelligence (AI) and machine-learning algorithms, which shortens resolution timelines. Here’s how it works:

  • Auto-triaging of defects: With diverse stakeholders driving development or production activities, it can be time-consuming to trace a defect back to the correct team. Using a machine-learning algorithm, QA teams can analyze historical datasets and classify defects by root cause, which lets them identify which stakeholder should fix a defect as soon as it is found. As root-cause data accumulates, the algorithm can also help improve code quality by flagging likely problems while the code is being written.

    We worked with a U.S. telecommunications business that was struggling to triage defects in its multi-vendor ecosystem. Even with its in-house machine-learning algorithm, it was realizing just 11% accuracy.

    We deployed our AI-driven defect-triaging bot to reduce resolution time and increase accuracy. The bot analyzed and classified historical operations data to map past defects to their root cause. Using these insights, the bot could automatically triage new defects with 55% accuracy, significantly higher than the client’s in-house machine-learning based solution. It was also able to predict a defect earlier in the lifecycle by flagging issues in the code that resembled root causes of past defects. A dashboard kept the team updated with live reporting on resolution status. The bot also accelerated defect resolution speed by 45%, helping the client avoid $16 million in annual projected costs.

  • Self-healing of objects: Most defects identified during the QA phase are tied to a code or requirement change that wasn’t captured by the test assets used to validate it. For instance, if a developer changes the user-interface layer in a new release (say, by moving the “submit” button from the right side of the page to the center), the automated regression test suite will fail because the test scripts weren’t updated. This doesn’t mean the application isn’t working or the code is faulty.

    To address this, an intelligent QA bot can automatically heal the test assets by analyzing requirement or code changes after a test fails. If the system identifies a change, it can trigger automated, parallel generation of a new test script with natural language processing and run it again. This helps speed things along the pipeline since defects pertaining to objects (such as the “submit” button) are auto-healed without human intervention.

    We worked with a U.S. bank that was spending significant time and effort to identify and fix defects, leading to delayed releases. We deployed our machine learning-based abend analyzer QA bot to triage and auto-heal defects. The QA bot uses multiple-level classification with a custom algorithm to trace defects to their source, with the ability to address more than 15,000 incidents in 10 months. By analyzing incident patterns, the QA bot fixes defects seamlessly without human intervention. An integrated web-based dashboard for end-to-end monitoring enables real-time resolution tracking.

    With 88% accuracy, the QA bot reduced defect resolution time by 50%, enabling the client to minimize impact to the business and realize up to $100,000 in cost savings.

  • Auto-remediation of environments: Failures often stem from issues external to the test scripts, such as insufficient memory or disk space, or a missing file. QA teams can develop intelligent automation scripts that map how an application interacts with different elements to auto-diagnose an issue. If a failure occurs because of inadequate memory or storage, test scripts built with knowledge of the application structure can trigger a clean-up of disk space and then re-validate the code, helping releases run on schedule.
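The auto-triaging idea described above can be illustrated with a minimal sketch. It assumes a small history of defect descriptions already labeled with the team that fixed them, and it uses a plain naive Bayes text classifier as a stand-in; the article does not say which algorithm the bot uses. The class name, team names and sample defects are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a description and keep purely alphabetic words."""
    return [word for word in text.lower().split() if word.isalpha()]


class DefectTriager:
    """Naive Bayes over defect descriptions (illustrative stand-in only)."""

    def __init__(self):
        self.team_counts = Counter()             # defects seen per team
        self.word_counts = defaultdict(Counter)  # word frequencies per team
        self.vocab = set()

    def train(self, history):
        """history: iterable of (description, owning_team) pairs."""
        for description, team in history:
            self.team_counts[team] += 1
            for word in tokenize(description):
                self.word_counts[team][word] += 1
                self.vocab.add(word)

    def triage(self, description):
        """Return the team whose past defects best match the description."""
        total = sum(self.team_counts.values())
        words = tokenize(description)
        best_team, best_score = None, -math.inf
        for team, count in self.team_counts.items():
            score = math.log(count / total)  # prior: share of past defects
            denom = sum(self.word_counts[team].values()) + len(self.vocab)
            for word in words:
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log((self.word_counts[team][word] + 1) / denom)
            if score > best_score:
                best_team, best_score = team, score
        return best_team


# Hypothetical history: (defect description, team that resolved it).
HISTORY = [
    ("payment gateway timeout during checkout", "payments-vendor"),
    ("card authorization declined with gateway error", "payments-vendor"),
    ("login page button misaligned on mobile", "web-frontend"),
    ("profile page layout broken on mobile browser", "web-frontend"),
    ("nightly billing batch job abended with missing file", "batch-ops"),
    ("batch job failed after file transfer delay", "batch-ops"),
]

triager = DefectTriager()
triager.train(HISTORY)
suggested_team = triager.triage("checkout fails with gateway timeout")
```

In practice the history would come from a defect tracker and hold far more records; the point of the sketch is only that routing becomes a lookup against past root causes rather than a manual hand-off.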
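The object self-healing step can also be sketched in a few lines. The article describes regenerating test scripts with natural language processing; the example below swaps in a much simpler technique, attribute matching, purely to show the shape of the idea. It assumes the test keeps a "fingerprint" of the element it expects (tag, text, type) alongside its locator, and that a page is represented as a list of dictionaries. All names are hypothetical.

```python
FINGERPRINT_KEYS = ("tag", "text", "name", "type")


def find_by_id(page, element_id):
    """Return the element with the given id, or None if it is gone."""
    for element in page:
        if element.get("id") == element_id:
            return element
    return None


def similarity(fingerprint, candidate):
    """Count fingerprint attributes that the candidate element shares."""
    return sum(
        1
        for key in FINGERPRINT_KEYS
        if fingerprint.get(key) is not None
        and fingerprint.get(key) == candidate.get(key)
    )


def locate_with_healing(page, locator, min_score=2):
    """Find an element; if its id changed, heal the locator in place.

    Returns (element, healed), where healed is True when a fallback match
    was used and the stored locator was updated.
    """
    element = find_by_id(page, locator["id"])
    if element is not None:
        return element, False

    fingerprint = locator["fingerprint"]
    best = max(page, key=lambda c: similarity(fingerprint, c), default=None)
    if best is None or similarity(fingerprint, best) < min_score:
        # Nothing resembles the old element: treat this as a real failure.
        raise LookupError(f"no element resembles {locator['id']!r}")

    locator["id"] = best.get("id")  # heal: remember the new id for next run
    return best, True


# Hypothetical new release: the submit button moved and was given a new id.
NEW_PAGE = [
    {"id": "btn-cancel", "tag": "button", "text": "Cancel", "type": "button"},
    {"id": "btn-submit-center", "tag": "button", "text": "Submit", "type": "submit"},
    {"id": "email", "tag": "input", "name": "email", "type": "text"},
]

submit_locator = {
    "id": "btn-submit-right",  # id recorded against the previous release
    "fingerprint": {"tag": "button", "text": "Submit", "type": "submit"},
}

element, healed = locate_with_healing(NEW_PAGE, submit_locator)
```

The `min_score` threshold is the design choice that matters here: set too low, the test heals onto the wrong element and hides a genuine defect; set too high, every cosmetic change still breaks the suite.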
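For environment auto-remediation, a minimal sketch might wrap each test run so that an out-of-space error triggers a clean-up and a single retry. It assumes the test uses a dedicated scratch directory that is safe to empty; the function names and the stand-in test are hypothetical, and any other kind of failure is re-raised untouched so that genuine defects still surface.

```python
import errno
import shutil
import tempfile
from pathlib import Path


def clean_scratch(scratch_dir):
    """Delete everything inside the scratch directory; return items removed."""
    removed = 0
    for entry in Path(scratch_dir).iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


def run_with_remediation(test, scratch_dir, max_attempts=2):
    """Run a test; on an out-of-space error, clean up and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return test()
        except OSError as exc:
            out_of_space = exc.errno == errno.ENOSPC
            if not out_of_space or attempt == max_attempts:
                raise  # not an environment issue we know how to fix
            clean_scratch(scratch_dir)


with tempfile.TemporaryDirectory() as scratch:
    Path(scratch, "stale.log").write_text("leftover from an earlier run")

    def regression_test():
        # Stand-in for a real test: it fails while leftovers fill the
        # scratch area, simulating a full disk.
        if any(Path(scratch).iterdir()):
            raise OSError(errno.ENOSPC, "No space left on device")
        return "passed"

    result = run_with_remediation(regression_test, scratch)
```

Keeping the remediation tied to one specific error code is deliberate: a blanket retry on every failure would mask real defects instead of clearing environmental noise.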

The Way Forward: Testing the Unknown

Today’s customers expect products and services that anticipate their needs. Businesses that fail to meet these expectations and allow defects to undermine quality put themselves at risk.

The next step for businesses is to shift their focus from product/service requirements to user journeys so they can stay continuously aware of customer behaviors and preferences. A click-by-click account of how a user interacts with an application in real time can give enterprises a view into user behavior and help them glean insights into the customer experience.

QA teams can use these insights to unearth scenarios that weren’t visualized in the requirements. By simulating these interactions in a test environment, QA teams can continuously validate the application from a user standpoint and actively look for defects rather than addressing them reactively as they occur.
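As a rough illustration of mining user journeys for untested scenarios, the sketch below groups click events by session, orders them by time, and reports the journeys that no existing test scenario covers, most frequent first. The event format, action names and covered set are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_journeys(events):
    """Group (session_id, timestamp, action) events into ordered journeys."""
    sessions = defaultdict(list)
    for session_id, timestamp, action in events:
        sessions[session_id].append((timestamp, action))
    # Sort each session by timestamp, then keep only the action names.
    return [
        tuple(action for _, action in sorted(steps))
        for steps in sessions.values()
    ]


def uncovered_journeys(events, covered):
    """Return (journey, session_count) pairs with no matching test scenario."""
    counts = Counter(build_journeys(events))
    return [
        (journey, count)
        for journey, count in counts.most_common()
        if journey not in covered
    ]


# Hypothetical click log; events may arrive out of order.
EVENTS = [
    ("s1", 1, "search"), ("s1", 2, "add_to_cart"), ("s1", 3, "checkout"),
    ("s2", 1, "search"), ("s2", 2, "add_to_cart"),
    ("s2", 3, "edit_cart"), ("s2", 4, "checkout"),
    ("s3", 2, "add_to_cart"), ("s3", 1, "search"),
    ("s3", 3, "edit_cart"), ("s3", 4, "checkout"),
]

# Journeys the current regression suite already exercises.
COVERED = {("search", "add_to_cart", "checkout")}

gaps = uncovered_journeys(EVENTS, COVERED)
```

Here two of the three sessions edit the cart before checking out, a path the requirements-based suite never exercises, so that journey is the first candidate for a new simulated test.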

With an AI-driven approach to maintaining continuous quality, organizations can keep their DevOps cycles on track and minimize impact to the customer experience and, ultimately, business performance.

Vikul Gupta

Vikul Gupta is the Market Leader for Digital Assurance within Cognizant’s Quality Engineering & Assurance Practice. He has 20 years of experience...