Why banks are struggling to deliver ROI on AI investments

Major banks have poured billions into generative AI over the past two years. JPMorgan Chase rolled out LLM Suite to 60,000 employees, while Bank of America has deployed hundreds of AI models. Wells Fargo, U.S. Bank and Fifth Third have each launched AI assistants to streamline internal operations.

Yet despite this aggressive pursuit, a recent MIT report found that 95% of organizations are seeing no measurable return on their GenAI investments.

For an industry defined by precision and margins, the gap between investment and impact is troubling. Boards want results, CFOs are asking hard questions and the efficiency gains promised by AI have remained largely theoretical.

The problem isn't the technology itself. Banks are realizing task-level efficiency gains across their operations, but these successes remain isolated. The challenge is translating that localized productivity into true enterprise-wide transformation.

That’s where agentic automation comes in. AI systems that autonomously execute complex workflows such as loan processing, compliance reporting, and fraud detection represent the next frontier. According to KPMG, this shift to autonomous workflow execution could unlock as much as $3 trillion in value creation.

But current automation frameworks weren't built for the complexity of financial services, where workflows span multiple systems, compliance rules vary by jurisdiction and even people in the same roles work in different ways. That complexity overwhelms existing solutions, forcing institutions to choose between off-the-shelf tools that solve narrow problems and custom-built frameworks that place heavy automation burdens on already-stretched teams.

Neither approach can deliver what banks actually need: the ability to discover where automation creates the most value (the sequencing problem) and then deploy it at scale across the enterprise (the scaling problem).

Understanding these two barriers is key to understanding why so many AI investments stall and how a new approach can turn isolated gains into enterprise-scale impact.

The sequencing problem

Existing automation frameworks, whether off-the-shelf or custom-built, share a fundamental flaw: they require organizations to predict which workflows need automation before they have proof of where friction or opportunity actually exists.

Without mechanisms for observing how work actually happens, institutions must guess. They survey employees, compile research and map theoretical workflows. But these methods capture perceptions and documented processes, which describe what should occur rather than what actually does.

The result is that administrators select or design solutions based on incomplete information, often before they understand what behaviors actually drive inefficiency or value.

Discovery should precede design, not the other way around.

The scaling problem

Even if institutions could accurately predict the right workflows to automate, existing frameworks pose another major challenge: they force a choice between broad deployment and meaningful customization.

Off-the-shelf solutions address narrow use cases, forcing enterprises to deploy multiple point solutions. This leads to agent sprawl: a growing portfolio of disconnected tools that demand constant management and oversight. Further, these tools are often rigid, creating change management overhead as employees must adapt their workflows to fit the technology rather than the other way around.

Custom frameworks promise flexibility but demand deep technical expertise, ongoing maintenance and constant adaptation. Automating workflows for one role or department rarely translates cleanly to another, turning bespoke solutions into technical debt faster than they create value.

Each approach solves one side of the equation but fails on the other, leaving banks without a scalable path forward.

How behavioral agent automation solves both problems

A new class of platform is emerging to address both the sequencing and scaling challenges simultaneously. Behavioral Agent Automation Platforms (BAAPs) observe how employees actually work with AI tools, identify patterns of friction across the organization and automatically assemble and deploy the agentic capabilities needed to resolve them.

This approach solves sequencing by replacing prediction with observation. Instead of guessing which workflows to automate, the platform continuously monitors how work happens in practice: how loan officers search for information, where compliance teams encounter roadblocks and which tasks consume disproportionate time. Discovery precedes design.
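To make "observation instead of prediction" concrete, here is a minimal sketch of how observed task events might be ranked by friction. Everything in it is an illustrative assumption, not a description of any actual platform: the `TaskEvent` fields, the `rank_friction` helper, the 30-second retry penalty and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskEvent:
    """One observed unit of work (hypothetical schema)."""
    role: str       # e.g. "loan_officer"
    task: str       # e.g. "find_income_docs"
    seconds: float  # observed time spent on the task
    retries: int    # repeated attempts, used as a rough friction signal


def rank_friction(events, retry_penalty_seconds=30.0):
    """Return (role, task, score) tuples, highest friction first.

    The score is total observed time plus an assumed fixed penalty
    per retry. A real system would likely use richer signals.
    """
    totals = defaultdict(float)
    for e in events:
        totals[(e.role, e.task)] += e.seconds + e.retries * retry_penalty_seconds
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(role, task, score) for (role, task), score in ranked]


# Made-up sample observations, for illustration only.
events = [
    TaskEvent("loan_officer", "find_income_docs", 420.0, 3),
    TaskEvent("loan_officer", "find_income_docs", 380.0, 2),
    TaskEvent("compliance", "assemble_report", 600.0, 0),
    TaskEvent("loan_officer", "update_crm", 90.0, 0),
]
top = rank_friction(events)
print(top[0])
```

The point of the sketch is the ordering of steps: automation candidates come out of the ranked observations, not out of a workflow map drawn up in advance.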

It solves scaling by automating the automation itself. Rather than requiring teams to build and maintain custom workflows, the platform autonomously assembles capabilities based on what it observes, then deploys them following institutional approval processes.

The architectural components that make this possible include:

Secure, model-agnostic AI access that connects to multiple foundation models while maintaining data governance
Enterprise-wide data connectivity with core banking systems, CRMs, communication tools and document repositories
Behavioral observability that captures how work actually happens within the platform
An insights engine that interprets behavioral data, identifies friction, and surfaces automation candidates proactively
Governance infrastructure that ensures human-in-the-loop approvals and complete audit trails
Autonomous workflow assembly and deployment following institutional approval processes
Real-time telemetry to measure performance and detect emerging friction points
Continuous adaptation so automations improve as behaviors evolve

Together, these capabilities move institutions beyond static, predefined workflows toward dynamic systems that learn from real behavior and scale themselves safely within governance boundaries.
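The governance items in the list above (human-in-the-loop approvals and complete audit trails) can be sketched as a simple gate that refuses to deploy anything a person has not approved and records every decision. This is a simplified illustration under stated assumptions: `AutomationCandidate`, `GovernanceGate` and the reviewer name are hypothetical, and a production system would persist the log and authenticate reviewers.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class AutomationCandidate:
    """A proposed automation awaiting review (hypothetical type)."""
    name: str
    approved_by: Optional[str] = None
    deployed: bool = False


@dataclass
class GovernanceGate:
    """Blocks deployment until a human approves; logs every action."""
    audit_log: List[dict] = field(default_factory=list)

    def _record(self, action: str, candidate: AutomationCandidate, actor: str) -> None:
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "candidate": candidate.name,
            "actor": actor,
        })

    def approve(self, candidate: AutomationCandidate, reviewer: str) -> None:
        candidate.approved_by = reviewer
        self._record("approved", candidate, reviewer)

    def deploy(self, candidate: AutomationCandidate) -> None:
        if candidate.approved_by is None:
            # Blocked attempts are logged too, so the trail stays complete.
            self._record("deploy_blocked", candidate, "system")
            raise PermissionError(f"{candidate.name} has no human approval")
        candidate.deployed = True
        self._record("deployed", candidate, "system")


gate = GovernanceGate()
candidate = AutomationCandidate("prefill_income_docs")
try:
    gate.deploy(candidate)  # rejected: nobody has approved it yet
except PermissionError:
    pass
gate.approve(candidate, "risk_officer_1")
gate.deploy(candidate)
actions = [entry["action"] for entry in gate.audit_log]
print(actions)
```

Logging the blocked attempt alongside the approval and the deployment is a deliberate choice here: an audit trail that records only successes would not show that the control was ever exercised.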

What this means for evaluation

When observation replaces prediction as the foundation for automation, the evaluation criteria shift. The question is no longer whether a platform can execute predefined workflows but whether it can discover the workflows that matter most.

Can it observe how work actually happens across systems, identify friction from behavioral signals and deploy automations automatically while maintaining governance and reducing the burden on technology teams?

Moving beyond pilots

Institutions seeing measurable ROI have stopped relying on frameworks that require prediction and started using platforms that observe reality. They build automation strategy on behavioral evidence, deploying systems that learn from how employees actually work rather than requiring them to become automation architects.
