The evolving role of testing in a complex, AI-driven ecosystem

NextWave and Diffblue’s Testing Leadership Forum brought together senior figures from financial institutions and consultancies to discuss the shifting landscape of software testing. Hosted over breakfast at The Ivy, the session focused on AI-powered innovation, strategic transformation, and practical approaches to improving quality and delivery in large-scale, complex environments.

“Testing today is not just a function — it's a strategic control point in delivery,” Chris Dutta, Testing Practice Lead at NextWave, commented.

Chris set the stage by highlighting:

The increasing pace of change across technology delivery in financial services (FS)
The rising complexity of testing in an interconnected DevOps ecosystem
The Six Pillars of Successful Testing, which guide NextWave’s strategy:
1. Quality Engineering
2. Strategy
3. Transformation & Advisory
4. Data & AI Assurance
5. Non-Functional Testing (NFT)
6. Testing Execution

Several industry themes and trends arose in the conversation, which was followed by a captivating demo from Diffblue. Here are some of the highlights.

Testing within a complex ecosystem

Participants emphasised that testing is no longer standalone. Tooling, environments, DevOps integration, and shifting delivery models all impact how and when testing should be executed.

“It’s not just testing tools. It’s where they fit within your broader strategy and ecosystem.”

Strategic gaps & culture

Outdated or static testing strategies remain a common problem.

There is still:

Inconsistent application of Shift Left
Lack of executive-level testing strategy
Misalignment between business expectations and test planning

“Many of our challenges haven’t changed in 15 years — limited time, incomplete environments, and testing always squeezed at the end.”

Security & NFT are priorities

Recent InfoSec incidents have put Non-Functional Testing (NFT) — especially performance and security — at the top of the agenda. Traditionally deprioritised, NFT is now essential for resilience, compliance, and customer trust.

AI in testing – productivity vs. precision

The discussion turned to the practical impact of AI in test automation:

Diffblue Cover vs. Copilot: key differences

LLM-based assistants like Copilot generate tests with high failure rates:
- 30–40% do not compile or do not pass
- Require heavy human inspection and rework

Diffblue Cover is built with reinforcement learning and code execution:
- Guarantees compiling, passing, and correct unit tests
- No reliance on external training data
- Fully offline-capable, meeting enterprise security needs
- Delivers a 26x productivity gain compared to developer-written LLM-assisted tests

“Large Language Models are like smart teenagers — brilliant but unpredictable. Our agent is like a toaster — precise, reliable, and autonomous.”

Test quality, coverage, and metrics

Diffblue’s approach uses mutation test scores as a benchmark for test quality. In side-by-side comparisons with Copilot:

Test quality is equivalent or better
Coverage is deeper and faster
Enables CI integration and real-time regression testing on new PRs
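
For readers unfamiliar with the metric, a mutation score is conventionally the share of deliberately injected faults (mutants) that the test suite detects. The sketch below shows that arithmetic only. The class, enum, and method names are hypothetical, and it is not how Diffblue Cover or any particular mutation testing tool computes or reports its score.

```java
import java.util.List;

public class MutationScoreSketch {

    /** Hypothetical model: the result of running the test suite against one mutant. */
    enum Outcome { KILLED, SURVIVED }

    /** Returns the percentage of mutants that at least one test detected (killed). */
    static double mutationScore(List<Outcome> outcomes) {
        if (outcomes.isEmpty()) {
            throw new IllegalArgumentException("no mutants were generated");
        }
        long killed = outcomes.stream()
                .filter(o -> o == Outcome.KILLED)
                .count();
        return 100.0 * killed / outcomes.size();
    }

    public static void main(String[] args) {
        // Illustrative data only: three of four mutants were caught by the tests.
        List<Outcome> outcomes = List.of(
                Outcome.KILLED, Outcome.KILLED, Outcome.SURVIVED, Outcome.KILLED);
        System.out.println(mutationScore(outcomes)); // prints 75.0
    }
}
```

A surviving mutant means the code changed without any test failing, which is why a higher score is read as a sign that the tests assert on behaviour rather than merely executing lines.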

“You can run Diffblue across a million-line legacy codebase and return to it later with fully generated tests, ready to detect regressions.”

Requirements-driven testing

Early-phase testing based on requirements validation is gaining traction. Participants cited examples of:

Using AI to flag ambiguous requirements
Writing tests at the requirements level to ensure alignment
Identifying scope issues early to avoid CR-driven overruns

“80% of defects came from poor requirements. We’ve now got that down to 20% using AI-driven requirement reviews.”

Practical AI demo: Diffblue Cover in action

The team demonstrated:

Automated unit test creation on open-source Java projects
Dynamic branch coverage visualisation
Self-configuring test generation tools (CLI & IntelliJ plugin)
Updating failing tests based on code changes
CI/CD integration with PR-level analysis and test patching

“You define your test scope. Diffblue Cover does the rest. From static analysis to branch coverage to full suite generation — autonomously.”

Configurability & transparency

Participants raised questions around test data realism and control:

Diffblue allows custom inputs, mocks, and data formats
Tests are fully inspectable
Planned enhancements include assertion updating and context-driven transparency

“We build around what’s in the code, not guesses about requirements — unless you tell us otherwise.”

Culture & metrics

Several organisations are shifting toward metrics-led quality cultures, but progress is uneven.

Challenges include:

Too many KPIs without real accountability
Difficulty relating test metrics to business value
Resistance to change and over-reliance on "heroic" testing effort at the end of delivery

“We’ve got 50 KPIs — and no one outside the test team knows what to do with them.”

How NextWave can elevate your testing strategy

At NextWave, we specialise in delivering innovative and comprehensive quality engineering, testing, and assurance services tailored to the unique needs of financial services organisations. Our approach is designed to address common testing challenges and drive efficiency, accuracy, and confidence across your technology initiatives.

Key benefits of partnering with NextWave & Diffblue

Accelerated delivery: By leveraging automation and AI, we increase test coverage and reduce maintenance efforts, enabling faster and more reliable software releases.
Cost optimisation: We focus on rationalising testing spend, avoiding overspend through timely and efficient delivery, and tracking the total cost of testing to ensure a clear return on investment.
Strategic alignment: Our experts help establish clear, risk-based test strategies aligned with your organisation's objectives, ensuring that quality metrics support business outcomes.
Advanced tooling and environments: We assist in selecting the right testing tools from a vast array of smart options and utilise TestOps to manage key operational aspects, including DevOps and cloud integration, facilitating a shift to continuous testing and delivery.
Scalable resourcing: Our network of skilled test professionals brings both business and technical domain knowledge, allowing you to flex resources up or down as needed.
Sustainable testing practices: We identify appropriate green performance indicators to ensure that your testing processes are environmentally sustainable.

With a team of seasoned consultants and a suite of test assets, accelerators, and industry-leading tools, collaborating with NextWave and Diffblue will enhance your testing capabilities, reduce defect leakage, and ensure robust, performant, and secure systems.

For more details on how we can help, visit our testing services page.

Conclusions

Agentic AI testing delivers major productivity and reliability benefits over LLMs
Requirements-based testing and non-functional testing are gaining critical importance
A strong, continuously updated testing strategy is vital to keeping pace with change
Testing must evolve from a tactical QA function to a strategic enabler of delivery
Cultural change, AI literacy, and tooling alignment are the next frontiers

Next Steps

To continue the conversation or explore adoption:

Request a test strategy review with NextWave
Trial Diffblue Cover in your own legacy or cloud migration context

Reach out to chris.dutta@nxwave.co.uk for more information or to schedule a free, no-obligation call.

You can also go to our secure hub to download the slides.

Post by Maya Kokerov
May 21, 2025

Maya is NextWave's Digital Marketing Lead. She is a published journalist with two first-class degrees from Warwick and LSE. She has experience in copywriting, website design, PR and marketing across industries including fintech, agritech, nanotechnology and sustainability.