Intelligent Video Analytics (IVA) has gone from niche to mainstream across security, retail, manufacturing, transport, and a growing list of other sectors. The premise is straightforward: run computer vision on the video stream, extract structured information, drive automated decisions. The interesting question, the one that actually shapes whether a deployment works, is where the processing happens.

Edge or cloud. The choice has direct consequences for latency, bandwidth, security, total cost of ownership, and how the system scales. This post breaks down what each option looks like in practice and gives you a framework for matching the architecture to the workload.

Edge processing, explained

Edge processing runs the analytics on the device or local network where the video originates. The AI model lives on a smart camera, an on-site server, or a dedicated edge box. Either video never leaves the local network, or only the relevant clips and metadata do. Latency drops because there's no round-trip to a remote data centre.

Where edge wins

  • Low latency. Sub-second detection-to-alert is realistic when the inference is local. For security, intrusion, and anomaly detection, that difference matters.
  • Bandwidth savings. You're not pushing raw HD video off-site. Only the events, clips, and metadata that matter travel onward.
  • Privacy and data residency. Footage of people stays on the local network. That's much easier to defend against GDPR or sectoral regulators than "we ship it all to us-east-1".
  • Offline resilience. When the WAN drops, the system keeps running. Alerts queue and sync when connectivity returns.
  • Scale by distribution. Each site brings its own compute. You don't need a centralised farm sized for total camera count.
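
The offline-resilience point above can be sketched as a small store-and-forward buffer. This is a minimal illustration, not a production design: the `AlertQueue` class, its `max_pending` cap, and the `sent` list standing in for the real uplink are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AlertQueue:
    """Hypothetical edge-side buffer: send when the link is up, queue when it is not."""
    max_pending: int = 1000                       # assumed cap so memory stays bounded
    pending: deque = field(default_factory=deque)
    sent: list = field(default_factory=list)      # stand-in for the real uplink

    def publish(self, alert: dict, link_up: bool) -> None:
        if link_up:
            self.flush()                          # drain the backlog first to keep ordering
            self.sent.append(alert)
        else:
            if len(self.pending) >= self.max_pending:
                self.pending.popleft()            # buffer full: drop the oldest alert
            self.pending.append(alert)

    def flush(self) -> None:
        while self.pending:
            self.sent.append(self.pending.popleft())


queue = AlertQueue()
queue.publish({"id": 1}, link_up=False)   # WAN is down: queued locally
queue.publish({"id": 2}, link_up=False)
queue.publish({"id": 3}, link_up=True)    # WAN is back: backlog syncs, then the new alert
print([a["id"] for a in queue.sent])      # [1, 2, 3]
```

Dropping the oldest alert when the buffer fills is one possible policy. A real deployment might persist the queue to disk instead.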

Where edge gets hard

The benefits aren't free. Things to plan for:

  • Less horsepower per device. An edge box can't match a cloud GPU cluster. Larger or more complex models may need quantisation or pruning to fit.
  • Upfront capital cost. Hardware is paid for on day one rather than amortised across a monthly bill.
  • Fleet management. Hundreds of distributed devices need a real management story: OTA updates, monitoring, secure boot, drift detection.
  • Physical security. The device is on-site. Someone could unplug it, steal it, or tamper with it.
  • Local storage limits. Edge boxes have finite disk. Retention strategy needs to account for what stays local and what gets archived.
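
To make the storage point concrete, here is a back-of-envelope retention estimate. The function name `retention_days` and every input value (disk size, camera count, bitrate, recorded fraction) are illustrative assumptions, not figures from a real deployment.

```python
def retention_days(disk_tb: float, cameras: int, bitrate_mbps: float,
                   recorded_fraction: float = 1.0) -> float:
    """Days of footage a local disk can hold.

    recorded_fraction is the share of time actually written to disk:
    1.0 for continuous recording, lower for event-only clips.
    """
    bytes_per_second = cameras * bitrate_mbps * 1e6 / 8
    bytes_per_day = bytes_per_second * 86_400 * recorded_fraction
    return disk_tb * 1e12 / bytes_per_day


# Assumed example: 4 TB disk, 8 cameras at 4 Mbps each.
print(f"{retention_days(4, 8, 4):.1f} days continuous")
print(f"{retention_days(4, 8, 4, recorded_fraction=0.1):.1f} days event-only")
```

The gap between the two numbers is why the retention strategy has to decide what stays local and what gets archived.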

Cloud processing, explained

Cloud processing ships the video stream to a remote data centre where the inference runs on shared GPU capacity. The appeal is elastic compute, central management, and the ability to run larger models than would fit on an edge device.

Where cloud wins

  • Elastic scale. Add cameras without buying hardware. Spin up GPU capacity for a backfill job, then turn it off.
  • Lower upfront cost. No site visits, no capex. Pay-as-you-go aligns spend with usage.
  • Central management. One pane of glass for model deployment, monitoring, and updates across the entire estate.
  • Heavier models. Multi-camera reasoning, long-context vision-language models, big ensembles. The cloud can run things an edge device can't.
  • Accessibility. Dashboards and search work from anywhere with a browser.

Where cloud gets hard

The trade-offs that don't make it into the sales deck:

  • Latency. Network round-trip plus queue time plus inference. Hard to hit sub-second for real-time alerting.
  • Bandwidth bill. Continuous HD upload from dozens of cameras is expensive, both in the WAN capacity you have to provision and in the outbound traffic charges at customer sites.
  • Privacy footprint. Sensitive footage now lives off-site. That changes your DPIA, your compliance posture, and your incident response plan.
  • Internet dependency. When the link goes down, the analytics go blind, even though the cameras keep running. For safety-critical workloads, that's not acceptable.
  • Vendor lock-in. Provider-specific APIs and data formats make migration expensive.
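
The bandwidth point is easy to quantify with rough arithmetic. The sketch below assumes a hypothetical site with 24 cameras streaming at 4 Mbps each. Both numbers are placeholders, so plug in your own.

```python
def upload_load(cameras: int, bitrate_mbps: float) -> tuple[float, float]:
    """Return (sustained uplink in Mbps, data volume in TB per 30-day month)."""
    uplink_mbps = cameras * bitrate_mbps
    tb_per_month = uplink_mbps * 1e6 / 8 * 86_400 * 30 / 1e12
    return uplink_mbps, tb_per_month


uplink, volume = upload_load(cameras=24, bitrate_mbps=4)
print(f"{uplink} Mbps sustained, about {volume:.1f} TB per month")
```

That is sustained load, around the clock, before any headroom for the rest of the business.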

Edge vs cloud, head to head

Six dimensions worth comparing side by side:

  • Latency. Edge wins. Local inference skips the upload, the network round-trip, and cloud-side queueing, so end-to-end latency can be an order of magnitude lower, depending on the link.
  • Bandwidth. Edge wins. You're moving metadata, not raw video.
  • Security and privacy. Edge is easier to defend on data residency grounds. Cloud is fine if you have a credible encryption and access control story.
  • Cost. Edge front-loads spend (hardware), cloud spreads it (compute and egress). Over a 3-year horizon, edge usually wins on TCO for steady workloads. Cloud wins for bursty or short-duration workloads.
  • Scalability. Cloud wins on elasticity. Edge scales by adding boxes, which is fine but slower.
  • Operational complexity. Cloud is easier to operate centrally. Edge needs a real fleet management story.
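
The cost comparison comes down to when front-loaded edge spend catches up with a recurring cloud bill. Here is a minimal sketch of that break-even calculation. The function names and all the figures are hypothetical, and the model assumes flat monthly costs with no upfront cloud spend.

```python
import math
from typing import Optional


def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after a number of months, assuming a flat monthly cost."""
    return upfront + monthly * months


def break_even_month(edge_upfront: float, edge_monthly: float,
                     cloud_monthly: float) -> Optional[int]:
    """First whole month where cumulative edge cost is at or below cloud cost.

    Returns None if cloud is never more expensive per month than edge.
    """
    if cloud_monthly <= edge_monthly:
        return None
    return math.ceil(edge_upfront / (cloud_monthly - edge_monthly))


# Assumed figures: 30k hardware plus 500/month to run it, versus 2k/month in the cloud.
print(break_even_month(30_000, 500, 2_000))   # 20
print(cumulative_cost(30_000, 500, 36))       # edge over 3 years
print(cumulative_cost(0, 2_000, 36))          # cloud over 3 years
```

With these made-up numbers edge pulls ahead in month 20. A workload that only runs for a year would favour cloud.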

The hybrid pattern, which is what most real deployments look like

In practice, the answer is usually both. Run the time-critical and bandwidth-heavy parts at the edge (detection, alerting, clip extraction), and push the rest to the cloud (aggregation, longer-term storage, heavy-duty analytics, dashboards). You get the latency and privacy benefits of edge and the scale and central management of cloud.
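
One way to picture the split: the edge makes the alerting decision on the spot and forwards only compact metadata to the cloud. The sketch below is illustrative. The `Detection` fields, the `ALERT_THRESHOLD` value, and the `route` function are assumptions, not part of any particular product.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Detection:
    camera_id: str
    label: str
    confidence: float
    timestamp: float    # seconds since epoch
    clip_path: str      # clip stays on local disk; uploaded separately if needed


ALERT_THRESHOLD = 0.8   # assumed value; tune per deployment


def route(det: Detection) -> dict:
    """Decide at the edge what happens locally and what goes to the cloud."""
    return {
        "alert_locally": det.confidence >= ALERT_THRESHOLD,   # time-critical path
        "cloud_payload": json.dumps(asdict(det)),             # metadata only, no raw video
    }


result = route(Detection("cam-1", "person", 0.93, 1_700_000_000.0, "/clips/a.mp4"))
print(result["alert_locally"])        # True
print(len(result["cloud_payload"]), "bytes of metadata for the cloud")
```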

What hybrid looks like in different settings

  • Real-time alerting. Edge detects the event and fires the alert locally, well within a second. Cloud receives the clip and metadata for forensics, review, and trend analysis.
  • Traffic management. Edge counts vehicles and adjusts signals in real time. Cloud aggregates city-wide patterns and feeds longer-horizon planning.
  • Retail analytics. Edge tracks dwell time and queue lengths. Cloud joins that to sales data, builds the cross-store dashboards, and runs the merchandising models.

[Figure: hybrid IVA architecture diagram]

How to pick

Six questions to answer before settling on an architecture:

  1. What does the application actually need? Latency budget, bandwidth available, security posture. Real-time safety alerting has very different needs from monthly footfall analytics.
  2. What's the network like at each site? Symmetric fibre is a different world from a single 4G uplink shared with the rest of the business.
  3. How sensitive is the footage? A warehouse looks different from a hospital ward. Map the regulatory exposure honestly.
  4. What's the budget shape? Capex-friendly or opex-friendly. Both work, they just lead to different architectures.
  5. Where is this going to be in 3 years? Plan for the scale you'll actually need, not the one you're starting from.
  6. Who's going to run it? Edge fleets need ops capability. If you don't have it, you'll either build it or pay someone else to.
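
As a toy illustration, a few of those questions can be turned into a first-pass heuristic. The thresholds below are assumptions picked for the example, and the function only covers latency, network, and sensitivity. Budget shape, growth, and ops capability still need a human conversation.

```python
# All thresholds are illustrative assumptions, not recommendations.
REALTIME_BUDGET_MS = 1000     # below this, treat the workload as real-time alerting
HD_STREAM_MBPS = 4            # assumed uplink needed per camera to stream HD
METADATA_ONLY_MBPS = 0.1      # below this, the link only carries metadata reliably


def suggest_architecture(latency_budget_ms: float,
                         uplink_mbps_per_camera: float,
                         sensitive_footage: bool) -> str:
    """Return a first-pass suggestion: 'edge', 'hybrid', or 'cloud'."""
    if uplink_mbps_per_camera < METADATA_ONLY_MBPS:
        return "edge"         # the link cannot support cloud-side processing
    if (latency_budget_ms < REALTIME_BUDGET_MS
            or uplink_mbps_per_camera < HD_STREAM_MBPS
            or sensitive_footage):
        return "hybrid"       # keep inference local, use cloud for aggregation
    return "cloud"


print(suggest_architecture(200, 50, False))      # hybrid
print(suggest_architecture(60_000, 50, False))   # cloud
```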

Things that hold true regardless of where you process

Edge, cloud, hybrid: the basics don't change.

  • Secure the device and the data. Treat every camera, every box, every model endpoint as a potential attack surface.
  • Optimise the models for the hardware. Quantisation, pruning, distillation. A model that runs at 5 FPS on the target device is useful; one that doesn't fit isn't.
  • Monitor the system, not just the cameras. Drift, false positive rates, alert volumes, model latency. Watch them like any production service.
  • Pick hardware that matches the workload. Generic boxes are tempting but rarely cost-optimal for AI inference at scale.
  • Update regularly. Models, OS, firmware. The longer you defer, the harder the eventual update gets.

[Figure: diagram of a secure edge device]
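
For the monitoring point in the list above, a health check might look something like this. It is a minimal sketch: the `health_report` name and both thresholds are assumptions, and the share of alerts that reviewers marked as false is used as a rough proxy for the false positive rate.

```python
import statistics


def health_report(latencies_ms: list, alerts_total: int, alerts_false: int,
                  max_p95_ms: float = 500.0, max_false_share: float = 0.2) -> dict:
    """Summarise model latency and alert quality against assumed thresholds."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(latencies_ms, n=20)[-1]
    false_share = alerts_false / alerts_total if alerts_total else 0.0
    return {
        "p95_ms": p95,
        "false_alert_share": false_share,
        "latency_ok": p95 <= max_p95_ms,
        "alerts_ok": false_share <= max_false_share,
    }


report = health_report(list(range(1, 101)), alerts_total=50, alerts_false=5)
print(report)
```

Feed it from whatever metrics pipeline you already run, and alert on the two boolean flags like you would for any other production service.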

What's coming next

A few shifts worth tracking:

  • Edge silicon keeps getting better. Each generation of AI accelerators shifts more of the workload that used to need cloud GPUs onto a 50W box.
  • Federated learning. Models improve from distributed data without the raw video ever leaving the site. The privacy story is genuinely better.
  • 5G and private wireless. Lower-latency, higher-bandwidth links make viable some hybrid patterns that were previously impractical.
  • Serverless inference. For bursty cloud workloads, serverless GPU is changing the cost profile of cloud-side processing.
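
To show why federated learning keeps raw video on site, here is a sketch of the aggregation step used in federated averaging: each site trains locally and sends only its model weights, which the server combines in proportion to how much data each site saw. Weights are plain lists of floats to keep the example dependency-free. Real systems add secure aggregation, many training rounds, and a proper tensor library.

```python
def federated_average(site_weights: list, site_samples: list) -> list:
    """Sample-weighted average of per-site model weights (aggregation step only)."""
    total = sum(site_samples)
    n_params = len(site_weights[0])
    return [
        sum(weights[i] * n for weights, n in zip(site_weights, site_samples)) / total
        for i in range(n_params)
    ]


# Two hypothetical sites: the second saw three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)   # [2.5, 3.5]
```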

Picking the right architecture for an IVA system is one of the load-bearing decisions in the deployment. Get it right and the system scales, the bills are predictable, and the latency budget is met. Get it wrong and you're either rebuilding it within a year or living with a system that's slow, expensive, or both. The framework above is the one we use when scoping a deployment. Match the architecture to the latency budget, the bandwidth available, the regulatory posture, and the operational maturity of the team that's going to run it. Edge, cloud, or hybrid: the right answer is the one that fits the workload.

The field is moving fast. New edge silicon, new model architectures, new networking options. Keep checking that the architecture you picked 18 months ago is still the right one for the next 18.
