McKinsey says ML pipelines are the future of banking

In their advice to help banks accelerate AI adoption, McKinsey recommends using AI-first pipelines as part of their core technology and data layer

Abishek Ganesan

In their article on how banks can extract value from a new generation of AI technology, strategy and management consulting firm McKinsey identified AI-enabled data pipelines as an essential part of the ‘Core Technology and Data Layer’. They describe this infrastructure as necessary for AI transformation: an intermediary step that banks and financial institutions will have to take before they see tangible results from their investments in AI.

The technology stack for the AI-powered banking of the future depends heavily on managing enterprise data better. McKinsey’s Financial Services Practice forecasts that with these tools, banks will have the capacity to harness AI and “… become more intelligent, efficient, and better able to achieve stronger financial performance.”

What McKinsey says

The promise of AI in banking

The authors point to increased adoption of AI across industries and organizations, but note that the depth of adoption remains low and experimental. They set out their vision of an AI-first bank, which:

  1. Reimagines the customer experience through personalization and streamlined, frictionless use across devices, for bank-owned platforms and partner ecosystems
  2. Leverages AI for decision-making, by building the architecture to generate real-time insights and translating them into output which addresses precise customer needs. (They could be talking about Reef)
  3. Modernizes core technology with automation and streamlined architecture to enable continuous, secure data exchange (and now, Cruz)

They recommend that banks and financial service enterprises set a bold vision for AI-powered transformation, and root the transformation in business value.

AI stack powered by multiagent systems

The authors claim that realizing the true potential of AI will require the banks of the future to go beyond AI models alone. Their goal is to embed AI into four capability layers, and they identify ‘data and core tech’ as one of those four critical components. They have augmented an earlier AI capability stack, specifically adding data preprocessing, vector databases, and data post-processing to create an ‘enterprise data’ part of the ‘core technology and data layer’. They indicate that this layer would build a data-driven foundation for multiple AI agents to deliver customer engagement and enable AI-powered decision-making across various facets of a bank’s functioning.

Our perspective

Data quality is the single greatest predictor of LLM effectiveness today, and the current generation of AI tools is fundamentally wired to convert large volumes of data into patterns, insights, and intelligence. We believe the true value of enterprise AI lies in depth, where Agentic AI modules can interact with each other while automating repetitive tasks and completing specific, niche workstreams and workflows. This is only possible when the AI modules have access to purposeful, meaningful, and contextual data to rely on.

We are already working with multiple banks and financial services institutions to enable data processing (pre and post), and our Cruz and Reef products are deployed in many financial institutions to become the backbone of their transformation into AI-first organizations.

Are you curious to see how you can come closer to building the data infrastructure of the future? Set up a call with our experts to see what’s possible when data is managed with intelligence.

Abishek Ganesan
Marketing Manager

At DataBahn.ai, Abishek leads the content marketing charter and helps technology, security, and data leaders worldwide understand how to unlock value from data through DataBahn's pioneering data fabric solution, which transforms enterprise data management. His diverse experience, spanning freelancing, agency work, an early-stage startup, and running a small business, sets him apart. This breadth has honed his ability to develop marketing strategies that balance immediate growth and long-term brand equity.


SIEM migration is a high-stakes project. Whether you are moving from a legacy on-prem SIEM to a cloud-native platform, or changing vendors for better performance, flexibility, or cost efficiency, more security leaders are finding themselves at this inflection point. The benefits look clear on paper. In practice, however, the path to get there is rarely straightforward.

SIEM migrations often drag on for months. They break critical detections, strain engineering teams with duplicate pipelines, and blow past their budgets. The work is not just about switching platforms. It is about preserving threat coverage, maintaining compliance, and keeping the SOC running without gaps. Add the challenge of testing multiple SIEMs before making the switch, and what should be a forward-looking upgrade can quickly turn into a drawn-out struggle.

In this blog, we’ll explore how security teams can approach SIEM migration in a way that reduces risk, shortens timelines, and avoids costly surprises.

What Makes a SIEM Migration Difficult and How to Prepare

Even with a clear end goal, SIEM migration is rarely straightforward. It’s a project that touches every part of the SOC, from ingestion pipelines to detection logic, and small oversights early on can turn into major setbacks later. These are some of the most common challenges security teams face when making the switch.

Data format and ingestion mismatches
Every SIEM has its own log formats, field mappings, and parsing rules. Moving sources over often means reworking normalization, parsers, and enrichment processes, all while keeping the old system running.

Detection logic that doesn’t transfer cleanly
Rules built for one SIEM often fail in another due to differences in correlation methods, query languages, or built-in content. This can cause missed alerts or floods of false positives during migration.

The operational weight of a dual run
Running the old and new SIEM in parallel is almost always required, but it doubles the workload. Teams must maintain two sets of pipelines and dashboards while monitoring for gaps or inconsistencies.

Rushed or incomplete evaluation before migration
Many teams struggle to properly test multiple SIEMs with realistic data, either because of engineering effort or data sensitivity. When evaluation is rushed or skipped, ingest cost issues, coverage gaps, or integration problems often surface mid-migration. A thorough evaluation with representative data helps avoid these surprises.  

In our upcoming SIEM Migration Evaluation Checklist, we’ll share the key criteria to test before you commit to a migration, from log schema compatibility and detection performance to ingestion costs and integration fit.

How DataBahn Reinvents SIEM Migration with a Security Data Fabric

Many of the challenges that slow or derail SIEM migration come down to one thing: a lack of control over the data layer. DataBahn’s Security Data Fabric addresses this by separating data collection and routing from the SIEM itself, giving teams the flexibility to move, test, and optimize data without being tied to a single platform.

Ingest once, deliver anywhere
Connect your sources to a single, neutral pipeline that streams data simultaneously to both your old and new SIEMs. With our new Smart Agent, you can capture data using the most effective method for each source — deploying a lightweight, programmable agent where endpoint visibility or low latency is critical or a hybrid model where agentless collection suffices. This flexibility lets you onboard sources quickly without rebuilding agents or parsers for each SIEM.

Native format delivery
Route logs in the exact schema each SIEM expects, whether that’s Splunk CIM, Elastic ECS, OCSF, or a proprietary model, without custom scripting. Automated transformation ensures each destination gets the data it can parse and enrich without errors or loss of fidelity.
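
To make the idea of per-destination delivery concrete, here is a minimal sketch of mapping one canonical event into two destination shapes. It is not DataBahn's implementation. The canonical field names, the FIELD_MAPS table, and the helper functions are assumptions made for illustration, and the CIM and OCSF field names are a small, simplified subset that should be checked against each schema's own documentation.

```python
from typing import Any

# Hypothetical, simplified field maps: canonical name -> destination field path.
# Real CIM and OCSF models have many more fields and per-event-class rules.
FIELD_MAPS: dict[str, dict[str, str]] = {
    "splunk_cim": {"src_ip": "src", "dst_ip": "dest", "username": "user"},
    "ocsf": {
        "src_ip": "src_endpoint.ip",
        "dst_ip": "dst_endpoint.ip",
        "username": "actor.user.name",
    },
}


def set_path(target: dict[str, Any], dotted: str, value: Any) -> None:
    """Write value into a nested dict, creating intermediate dicts as needed."""
    *parents, leaf = dotted.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def to_destination(event: dict[str, Any], schema: str) -> dict[str, Any]:
    """Render one canonical event in the shape a given destination expects."""
    out: dict[str, Any] = {}
    for canonical, dest_field in FIELD_MAPS[schema].items():
        if canonical in event:
            set_path(out, dest_field, event[canonical])
    return out


# One collected event, rendered twice without touching the source.
event = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "username": "alice"}
cim_event = to_destination(event, "splunk_cim")
ocsf_event = to_destination(event, "ocsf")
```

Keeping the mapping in data rather than code means that adding a destination is a new table entry, not a new parser.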

Dual-run without the overhead
Stream identical data to both environments in real time while continuously monitoring pipeline health. Adjust routing or transformations on the fly so both SIEMs stay in sync through the cutover, without doubling engineering work.
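
As a rough illustration of a dual run, the sketch below fans each event out to two destinations and keeps delivery counters so drift between them is visible. The sink names, the in-memory lists standing in for SIEM endpoints, and the in_sync check are all hypothetical. A production pipeline would also need buffering, retries, and backpressure.

```python
from collections import Counter
from typing import Callable, Iterable

Event = dict
Sink = Callable[[Event], None]


def fan_out(events: Iterable[Event], sinks: dict[str, Sink]) -> Counter:
    """Deliver every event to every sink and count successful deliveries."""
    delivered: Counter = Counter()
    for event in events:
        for name, send in sinks.items():
            try:
                send(event)
                delivered[name] += 1
            except Exception:
                # A failing destination must not block the other one.
                delivered[f"{name}_errors"] += 1
    return delivered


def in_sync(delivered: Counter, names: list[str]) -> bool:
    """True when all named sinks received the same number of events."""
    return len({delivered[n] for n in names}) == 1


# In-memory lists stand in for the legacy and new SIEM ingestion endpoints.
legacy_siem: list[Event] = []
new_siem: list[Event] = []
sinks = {"legacy": legacy_siem.append, "new": new_siem.append}

stats = fan_out([{"id": 1}, {"id": 2}, {"id": 3}], sinks)
```

Comparing the counters per destination is the simplest possible health signal. Checksums or sampled content comparisons would catch subtler divergence.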

AI-powered data relevance filtering
Automatically identify and forward only security-relevant events to your SIEM, while routing non-critical logs into cold storage for compliance. This reduces ingest costs and alert fatigue while keeping a complete forensic archive available when needed.
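
The routing decision described above can be sketched in a few lines. Note that this example swaps the AI model for a static rule, purely to show the shape of the split: the event types, the severity threshold, and the choice to archive every event (so the archive stays complete) are assumptions, not a description of the product.

```python
from typing import Iterable

# Hypothetical rules: event types treated as security-relevant, plus a severity floor.
RELEVANT_TYPES = {"auth_failure", "privilege_escalation", "malware_detected"}
MIN_SEVERITY = 7


def is_security_relevant(event: dict) -> bool:
    """Static stand-in for a learned relevance model."""
    return (
        event.get("type") in RELEVANT_TYPES
        or event.get("severity", 0) >= MIN_SEVERITY
    )


def route(events: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Archive every event; forward only the relevant ones to the SIEM."""
    siem: list[dict] = []
    archive: list[dict] = []
    for event in events:
        archive.append(event)  # complete copy kept for compliance and forensics
        if is_security_relevant(event):
            siem.append(event)
    return siem, archive


events = [
    {"type": "auth_failure", "severity": 3},
    {"type": "heartbeat", "severity": 1},
    {"type": "config_change", "severity": 8},
]
siem_events, archived_events = route(events)
```

Here the heartbeat never reaches the SIEM, so it does not count toward ingest, but it is still available in the archive if an investigation needs it.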

Safe, representative evaluation
Send real or synthetic log streams to candidate SIEMs for side-by-side testing without risking sensitive data. This lets you validate performance, rule compatibility, and integration fit before committing to a migration.

Unified Migration Workflow with DataBahn

When you own the data layer, migration becomes a sequence of controlled steps instead of a risky, ad hoc event. DataBahn’s workflow keeps both old and new SIEMs fully operational during the transition, letting you validate detection parity, performance, and cost efficiency before the final switch.  

Because each step can be validated and reversed, you keep your SOC fully operational while gaining the freedom to test and adapt at every stage.

For a deeper look at this process, explore our SIEM Migration use case overview, which covers the problems it solves, how it works, and its key capabilities and outcomes.

Key Success Metrics for a SIEM Migration

Successful SIEM migrations aren’t judged only by whether the cutover happens on time. The real measure is whether your SOC emerges more efficient, more accurate in detection, and more resilient to change. Those gains are often lost when migrations are rushed or handled ad hoc, but by putting control of the data pipeline at the center of your migration strategy, they become the natural outcome.

  • Lower migration costs by eliminating duplicate ingestion setups, reducing vendor-specific engineering, and avoiding expensive reprocessing when formats don’t align.
  • Faster timelines because sources are onboarded once, and transformations are handled automatically in the pipeline, not rebuilt for each SIEM.
  • Detection parity from day one in the new SIEM, with side-by-side validation ensuring that existing detections still trigger as expected.
  • Regulatory compliance by keeping a complete, audit-ready archive of all security telemetry, even as you change platforms.
  • Future flexibility to evaluate, run in parallel, or even switch SIEMs again without having to rebuild your ingestion layer from scratch.

These outcomes are not just migration wins; they set up your SOC for long-term agility in a fast-changing security technology landscape.
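
Of the metrics above, detection parity is the easiest to quantify. A minimal sketch, assuming both SIEMs are fed the same replayed events and that each alert can be identified by a hypothetical (rule_id, event_id) pair:

```python
Alert = tuple[str, str]  # (rule_id, event_id); hypothetical alert identity


def parity_report(legacy: set[Alert], new: set[Alert]) -> dict:
    """Compare alerts fired by both SIEMs on the same replayed events."""
    matched = legacy & new
    return {
        # Share of legacy alerts that the new SIEM also raised.
        "parity": len(matched) / len(legacy) if legacy else 1.0,
        "missing_in_new": sorted(legacy - new),
        "extra_in_new": sorted(new - legacy),
    }


legacy_alerts = {("brute_force", "e1"), ("brute_force", "e2"), ("dns_tunnel", "e7")}
new_alerts = {("brute_force", "e1"), ("dns_tunnel", "e7"), ("dns_tunnel", "e9")}
report = parity_report(legacy_alerts, new_alerts)
```

The missing_in_new list points at rules that need porting or tuning, while extra_in_new flags either new coverage or new false positives. Both deserve a look before cutover.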

Making SIEM Migration Predictable

SIEM migration will always be a high-stakes project for any security team, but it doesn’t have to be disruptive or risky. When you control your data pipeline from end to end, you maintain visibility, detection accuracy, and operational resilience even as you transition systems.

Your migration risk goes up when the pre-migration evaluation relies on small or unrepresentative datasets, or when evaluation criteria are unclear. According to industry experts, many organizations launch SIEM pilots without predefined benchmarks or comprehensive testing, leading to gaps in coverage, compatibility, or cost that surface only midway through migration.

To help avoid that level of disruption, we’ll be sharing a SIEM Migration Evaluation Checklist for modern enterprises: a practical guide to running a complete and realistic evaluation before you commit to a migration.

Whether you’re moving to the cloud, consolidating tools, or preparing for your first migration in years, pairing a controlled data pipeline with a disciplined evaluation process positions you to lead the migration smoothly, securely, and confidently.

Download our SIEM Migration one-pager for a concise, shareable summary of the workflow, benefits, and key considerations.

Black Hat 2025: Where Community Meets Innovation

The air outside is a wall of heat. Inside, the Mandalay Bay convention floor hums with thousands of conversations: security researchers debating zero-days, vendors unveiling new tools, and old friends spotting each other across the crowd. It’s loud, chaotic, and absolutely electric!

For us at DataBahn, Black Hat isn’t just a showcase of cutting-edge research and product launches. It’s where the heartbeat of the cybersecurity community comes alive in the handshakes, the hallway chats, and the unexpected reunions that remind us why we’re here in the first place.

From security engineers and researchers to marketers, event organizers, and community builders, every role plays a part in making this event what it is. Like RSAC, Black Hat feels less like a trade show and more like a reunion, one where we share new ideas, catch up with longtime peers, and recognize the often-unsung contributors who quietly keep the cybersecurity world moving.

AI and Telemetry Take Center Stage

While the people are the soul of Black Hat, the conversations this year reflected a major shift in technology priorities: the role of telemetry in the AI era.

Why Telemetry Matters More Than Ever

AI, autonomous agents, and APIs are transforming security operations faster than ever before. But their effectiveness hinges on the quality of the data they consume. Modern detection, response, analytics, and compliance workflows all depend on telemetry that is:

  • Selective: capturing only what matters
  • Low-latency: delivering it in near-real time
  • Structured: making it usable for SIEMs, data lakes, analytics, and AI models

Introducing the Smart Agent for the Modern Enterprise

We took this challenge head-on at Black Hat with the launch of our Smart Agent: a lightweight, programmable collection layer designed to bring policy, precision, and platform awareness to endpoint telemetry.

  • Reduce Agent Sprawl: Minimize deployment overhead and avoid tool bloat
  • Lower Hidden Costs: Prevent over-collection and unnecessary storage expenses
  • Adapt to Any Environment: Tailor data collection to asset type, latency requirements, and downstream use cases

Think of it as a precision instrument for your security data that turns telemetry from a bottleneck into a force multiplier.

Breaking the Agent vs. Agentless Binary

For years, the industry has debated: agent or agentless? At DataBahn.ai, we see this as a false binary. Real-world environments require both approaches, deployed intelligently based on:

  • Asset type
  • Risk profile
  • Latency sensitivity
  • Compliance requirements

The Smart Agent gives security teams that flexibility without forcing trade-offs. Learn more about our approach here.
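
As a thought experiment, the four criteria above can be written down as an explicit policy. The sketch below is a hypothetical decision function, not the Smart Agent's actual logic. The Asset fields, the asset kinds, and the rule ordering are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    kind: str                # e.g. "endpoint", "server", "saas", "network_device"
    high_risk: bool          # risk profile
    needs_low_latency: bool  # latency sensitivity
    agent_allowed: bool      # compliance or vendor rules may forbid installed software


def collection_mode(asset: Asset) -> str:
    """Pick a collection method per asset instead of one method for the whole fleet."""
    if not asset.agent_allowed:
        return "agentless"
    if asset.kind in {"saas", "network_device"}:
        return "agentless"  # nothing to install an agent on
    if asset.high_risk or asset.needs_low_latency:
        return "agent"
    return "agentless"


fleet = [
    Asset("endpoint", high_risk=True, needs_low_latency=False, agent_allowed=True),
    Asset("saas", high_risk=True, needs_low_latency=True, agent_allowed=True),
    Asset("server", high_risk=False, needs_low_latency=False, agent_allowed=True),
    Asset("endpoint", high_risk=True, needs_low_latency=True, agent_allowed=False),
]
modes = [collection_mode(asset) for asset in fleet]
```

Writing the policy as code makes the trade-offs reviewable: anyone can see why a given asset ended up with an agent and change the rule in one place.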

Empowering Teams Through Smarter Technology  

As we push the envelope in telemetry, our goal remains the same: build the platform that enables people to do their best work. Because in cybersecurity, the human element isn’t just important; it’s irreplaceable. If you missed us at Black Hat, let’s talk about how the Smart Agent can help your team cut data waste, improve precision, and stay ahead of evolving threats.

This blog is based on a CXO Insight Series conversation between Preston Wood and Aditya Sundararam on LinkedIn Live. Watch the full episode here.

In today’s cybersecurity landscape, it’s no longer enough to ingest more logs. CISOs face deeper, more systemic challenges as the foundational architecture of the modern enterprise SOC relies on antiquated SIEMs, siloed data landscapes, and brittle data pipelines which have reached their limit.

The main takeaway? If CISOs don’t rethink their approach to telemetry and pipelines, they’ll continue to fall behind. Not because they lack the tools, but because data strategies don’t work on a broken data foundation.

The Real CISO problem: Data Sprawl without Context

Preston opened the session by recounting a familiar story for many security leaders: SOCs using SIEMs that are drowning in irrelevant data, and security analysts overwhelmed by a noisy tsunami of alerts and struggling to investigate and manage their security posture effectively.

“You’ve got a 24/7 SOC and a dozen tools throwing off logs, but your team is still asking the same questions: what’s actually going on here?”
Preston

The issue isn’t visibility, it’s clarity. As Preston noted, enterprise SOCs don’t just suffer from managing volume; they suffer from a lack of trust in their data. When logs are duplicated, out of order, stripped of context, and delivered in formats that were invented long after the tools meant to make sense of them, analysts spend more time normalizing and querying than detecting and responding.

SIEMs aren’t the Answer - they’re the Bottleneck

The session detailed serious limitations of the SIEM-centric model:

  • Too rigid:
    Legacy SIEMs demand proprietary formats and expensive tuning to onboard new sources
  • Too noisy:
    SIEMs want to collect all your data for pattern analysis, but in practice this raises costs and leaves it to the SOC to figure out what matters
  • Too slow:
    Detection happens after the fact, once data has been shipped, indexed, and queried
  • Too expensive:
    License and compute costs scale linearly with ingestion, which has grown 1000x in the last 10 years, while SIEM effectiveness has not increased
“When your security ROI is gated by how many terabytes you can afford to ingest, you’re already behind.”
Preston

Preston argued that SIEMs still have a role - but not as the data movement engine. That role now belongs to something else: the security data pipeline.

Security Data Pipelines: Why they Matter

For security leaders wrestling with log sprawl, cloud complexity, and regulatory risk, the answer isn’t going to come from continuing to overload the SIEM; that approach has led to overwhelming telemetry sprawl, increasingly fragmented environments, and mounting pressure on the SOC. To elevate enterprise SOC operations, security leaders should treat the data pipeline as the part of their security architecture that can unlock the future for them.

The shift means rethinking how data flows through the stack. Instead of sending everything to the SIEM and dealing with the noise later, security teams should be routing and filtering telemetry at the edge, well before it reaches their analytics tools. Enrichment can also happen upstream, left of the SIEM, adding contextual signals from the enterprise environment and reducing dependency on external threat feeds. Normalization can occur just once, as the data is collected and aggregated, ideally using open schemas like OCSF, so that data can be reused for different use cases: detection, investigation, and compliance. By placing intelligent data pipelines before the SIEM, teams can significantly cut egress, compute, and SIEM licensing costs while improving the signal-to-noise ratio.
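
To picture what "left of the SIEM" looks like in practice, here is a minimal sketch of three upstream stages chained together: filter, normalize once, then enrich. Everything in it is an assumption made for illustration, including the stage names, the toy shared schema (which is not OCSF itself), and the in-memory asset inventory standing in for enterprise context.

```python
from typing import Iterable, Iterator

# Hypothetical asset inventory used as the source of enterprise context.
ASSET_OWNERS = {"10.0.0.5": "payments", "10.0.0.9": "hr"}

# Per-source key aliases mapped onto one shared (toy) schema.
ALIASES = {"src": "src_ip", "source_address": "src_ip", "msg": "message"}


def drop_noise(events: Iterable[dict]) -> Iterator[dict]:
    """Filter at the edge: discard debug-level chatter before it costs anything."""
    for event in events:
        if event.get("level") != "debug":
            yield event


def normalize(events: Iterable[dict]) -> Iterator[dict]:
    """Normalize once, so every downstream consumer sees the same field names."""
    for event in events:
        yield {ALIASES.get(key, key): value for key, value in event.items()}


def enrich(events: Iterable[dict]) -> Iterator[dict]:
    """Attach internal context, here the team that owns the source asset."""
    for event in events:
        owner = ASSET_OWNERS.get(event.get("src_ip"), "unknown")
        yield {**event, "asset_owner": owner}


def run_pipeline(raw: Iterable[dict]) -> list[dict]:
    return list(enrich(normalize(drop_noise(raw))))


raw = [
    {"src": "10.0.0.5", "msg": "login failed", "level": "warn"},
    {"source_address": "10.0.0.9", "msg": "heartbeat", "level": "debug"},
    {"source_address": "192.0.2.1", "msg": "port scan", "level": "alert"},
]
clean = run_pipeline(raw)
```

Normalizing before enriching is a design choice: the enrichment lookup then only needs to know one field name, regardless of which source the event came from.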

Ultimately, this isn’t about adding a new tool into an already complex toolchain. It’s about building an intelligent and foundational data fabric layer which understands the environment, aligns with business and risk priorities, and prepares the organization for an AI-driven future. This is essential for SOCs looking to lean into AI use cases, because without AI-ready data, security tools leveraging AI are just window dressing.

AI-ready Security starts with Agentic Pipelines

Preston warned his fellow CISOs that most security vendors are racing to bolt AI onto their dashboards to capitalize on the current hype cycle. True AI-driven security begins at the pipeline layer, which delivers structured, enriched, clean telemetry that is collected and governed in real time. This is the input that LLMs or reasoning engines can build on for future SOC use cases.

DataBahn’s platform was purpose-built for this future, using Agentic AI to automate parsing, schema detection, enrichment, and routing. With products like Cruz and Reef operating as intelligent assistants embedded in the data plane, using Agentic AI to learn, adapt, and evolve alongside security teams, security decision makers can free their teams from the manual drudgery of managing data movement and focus them on strategic goals. Agentic AI pipelines also create a foundational data layer to ensure that your AI-powered security tools are equipped with the data, context, policy, and understanding required to deliver value.

"Agentic AI doesn't start in the UI. It starts with the data fabric. It starts with being able to reason over telemetry that actually makes sense."
Preston

What CISOs should do now

The session closed with a call to action for CISOs navigating their next big data decision: whether that’s SIEM migration, XDR adoption, or cloud expansion.

  1. Rethink your architecture:
    Stop treating your SIEM as the center; start with the pipeline.
  2. Control your data before it controls you:
    Invest in a governance-first pipeline layer that helps you decide what gets seen, stored, or suppressed.
  3. Choose future-proof platforms:
    Look for vendor-agnostic, AI-native solutions that decouple ingestion from analytics and leverage agentic AI to let you evolve without replatforming.

The future belongs to organizations that control their telemetry, set up a streamlined data fabric, and prepare their stack for AI – not just in theory, and not for tomorrow, but put into practice today.
