When Should You Use Serverless Architecture?

Use serverless for event-driven, variable-traffic workloads where pay-per-use pricing works in your favor. Avoid it for long-running, latency-critical, or stateful ones. That single rule covers most decisions; the rest of this article is about recognizing which side of the line your workload sits on, because most real systems have some of both.

"Serverless" doesn't mean no servers; it means you don't manage them. You deploy functions (or managed services), the platform runs them on demand, scales them automatically, and bills you per request and per millisecond of execution. When nothing is running, you pay nothing — scale to zero.

Where serverless shines

APIs with unpredictable traffic. If your load is spiky — quiet at 3am, slammed during a product launch — serverless scales up and down automatically and you only pay for what you use. No idle servers, no scramble to provision.

Event processing. File uploaded to storage, message on a queue, webhook received — these are naturally event-driven and map perfectly to functions that fire, do their work, and disappear.

Background and scheduled jobs. Nightly reports, cleanup tasks, cron-style work. Spinning up a function on a schedule is cheaper and simpler than keeping a server alive to run something once a day.

MVPs and side projects. Scale-to-zero means an idle app costs essentially nothing, and there's no infrastructure to maintain. You get to production faster.

Microservices. Individual, independently deployable functions are a natural fit for a service-oriented split — each service scales and bills on its own.

Here's a concrete shape for the good-fit case — an image-processing function that fires when a file lands in storage:


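```js
// Triggered by a storage upload event: runs, does its job, disappears.
// `storage` and `resize` are placeholders for your platform's storage SDK
// and an image library of your choice.
export async function handler(event) {
  const { bucket, key } = event.record;

  // Thumbnails go back into the same bucket, so skip them here; otherwise
  // each write could trigger this function again.
  if (key.startsWith("thumbnails/")) return { statusCode: 200 };

  const image = await storage.get(bucket, key);
  const thumb = await resize(image, { width: 400 });
  await storage.put(bucket, `thumbnails/${key}`, thumb);
  return { statusCode: 200 };
}
```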

Nothing runs when no images are uploaded, the platform scales the function out to absorb a burst of simultaneous uploads (within your account's concurrency limits) with no capacity planning on your side, and you pay only for the milliseconds it actually executes. That's the serverless sweet spot in one function.

Where serverless hurts

Long-running tasks. Serverless functions have hard execution time limits, typically measured in minutes (AWS Lambda, for example, caps a single invocation at 15 minutes). A video transcode, a big data export, or a long batch job will hit the ceiling. Use a container or a dedicated worker instead.

Latency-sensitive workloads. Cold starts — the delay when a function spins up from idle — add latency to the first request after a quiet period. For a real-time system where every millisecond counts (the kind of low-latency work I did on AMG), an always-warm server is more predictable.

Heavy in-memory state. Functions are stateless and ephemeral. If your workload keeps a large cache or long-lived connections in memory, serverless fights you — you'd push that state to an external store on every invocation, which is slow and costly.
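
To make the "stateless and ephemeral" point concrete, here is a small simulation rather than real platform code. It assumes each call to the hypothetical `createInstance()` stands in for one function instance with its own private memory, and `expensiveLookup` is a made-up placeholder for a slow database or API call.

```javascript
// Each call to createInstance() stands in for one function instance
// (a separate container) with its own private memory.
function createInstance() {
  const cache = new Map(); // lives only as long as this instance does

  return function handler(event) {
    if (cache.has(event.key)) {
      return { value: cache.get(event.key), source: "memory" };
    }
    const value = expensiveLookup(event.key);
    cache.set(event.key, value);
    return { value, source: "computed" };
  };
}

// Hypothetical stand-in for a slow database or API call.
function expensiveLookup(key) {
  return key.toUpperCase();
}

const instanceA = createInstance();
const instanceB = createInstance(); // e.g. started by the platform during a burst

console.log(instanceA({ key: "user-1" }).source); // "computed"
console.log(instanceA({ key: "user-1" }).source); // "memory" (same warm instance)
console.log(instanceB({ key: "user-1" }).source); // "computed" (different instance, empty cache)
```

The cache helps only when a request happens to land on an instance that already holds the value. Anything that must be visible to every invocation has to live in an external store.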

Stable, high-volume traffic. This is the counterintuitive one: if your traffic is predictably high and constant, per-request pricing becomes more expensive than a reserved server running flat-out. Serverless economics win on variable load, not sheer volume. Past a certain steady request rate, a right-sized always-on server is simply cheaper — and you can reserve it for a further discount.
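
The break-even point can be estimated with simple arithmetic. The sketch below is illustrative only: every number in `HYPOTHETICAL_PRICES` is a made-up placeholder, not a real provider rate, and it assumes a single always-on server can actually carry the load. Substitute your own provider's pricing before drawing conclusions.

```javascript
// Placeholder prices for illustration only. These are NOT real provider rates.
const HYPOTHETICAL_PRICES = {
  perMillionRequests: 0.25, // request fee, USD per 1M invocations
  perGbSecond: 0.00002,     // compute fee, USD per GB-second
  serverPerMonth: 50,       // one right-sized always-on server, USD per month
};

// Cost of a single invocation: flat request fee plus duration x memory.
function costPerRequest(avgDurationMs, memoryGb, prices) {
  const requestFee = prices.perMillionRequests / 1e6;
  const computeFee = (avgDurationMs / 1000) * memoryGb * prices.perGbSecond;
  return requestFee + computeFee;
}

// Monthly serverless bill for a given request volume.
function serverlessMonthlyCost(requestsPerMonth, avgDurationMs, memoryGb, prices) {
  return requestsPerMonth * costPerRequest(avgDurationMs, memoryGb, prices);
}

// Request volume at which serverless costs the same as the always-on server.
function breakEvenRequestsPerMonth(avgDurationMs, memoryGb, prices) {
  return prices.serverPerMonth / costPerRequest(avgDurationMs, memoryGb, prices);
}

const breakEven = breakEvenRequestsPerMonth(100, 0.5, HYPOTHETICAL_PRICES);
console.log(`Break-even: ~${Math.round(breakEven / 1e6)}M requests/month`);
```

With these placeholder numbers, a 100 ms function at 0.5 GB breaks even at roughly 40 million requests per month, or about 15 requests per second sustained. Below that rate serverless is cheaper; above it the always-on server wins.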

Vendor lock-in concerns. Serverless functions tend to lean on platform-specific triggers, event formats, and adjacent managed services. That's fine — often worth it — but it means moving off that provider later is real work. If avoiding lock-in is a hard requirement, containers running the same code anywhere give you more portability.

Serverless vs traditional servers

| Factor | Serverless | Traditional servers |
| --- | --- | --- |
| Scaling | Automatic, per-request | Manual or autoscaling groups you configure |
| Cost model | Pay per execution; scale to zero | Pay for provisioned capacity, idle or not |
| Best traffic pattern | Spiky, unpredictable, low-average | Steady, high, predictable |
| Cold starts | Yes, on idle functions | None (always warm) |
| Execution time | Capped (minutes) | Unlimited |
| State | Stateless / ephemeral | Persistent in-process state OK |
| Ops burden | Minimal; no infra to manage | You own patching, scaling, uptime |
| Long-running jobs | Poor fit | Good fit |

The hybrid reality

In practice you rarely pick one for the whole system. The mature answer is hybrid: run your steady, latency-sensitive core on servers or containers, and offload the spiky, event-driven, or scheduled edges to serverless functions.

A concrete example. An e-commerce platform might run its main API on always-warm containers for predictable checkout latency, while pushing these to serverless:

Image processing when a product photo is uploaded (event-driven).
Sending order-confirmation emails (spiky, fire-and-forget).
Nightly sales reports (scheduled).
Handling third-party webhooks (unpredictable, bursty).

Each piece runs where its workload profile fits best. That's the whole game — match the workload to the model rather than forcing everything into one.

Quick decision guide

Ask three questions:

1. Is the traffic variable or event-driven? If yes, serverless is a strong candidate.
2. Does any single task run long, need consistent low latency, or hold in-memory state? If yes, keep that part on a server.
3. Is the traffic steady and high-volume? If yes, a reserved server is usually cheaper.

Most systems answer "some of each" — which is exactly why hybrid is the norm.
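
The three questions can be written down as a small function, which is handy for walking through each part of a system in turn. This is a rough sketch, not a definitive rule engine: the field names are hypothetical, and the precedence (hard constraints first, then cost, then fit) is an assumption of this example rather than something the questions themselves dictate.

```javascript
// Encodes the three questions above. Field names and the order in which
// the checks run are assumptions of this sketch.
function recommend(workload) {
  // Question 2: hard constraints that serverless handles poorly.
  if (workload.longRunning || workload.needsLowLatency || workload.holdsInMemoryState) {
    return { target: "server", reason: "long-running, latency-critical, or stateful" };
  }
  // Question 3: steady, high-volume traffic favors reserved capacity.
  if (workload.steadyHighVolume) {
    return { target: "server", reason: "steady high volume is usually cheaper on a reserved server" };
  }
  // Question 1: variable or event-driven traffic favors serverless.
  if (workload.variableOrEventDriven) {
    return { target: "serverless", reason: "variable or event-driven traffic" };
  }
  return { target: "either", reason: "no strong signal either way" };
}

// The e-commerce pieces from the hybrid example, plus one long-running job.
const workloads = [
  { name: "checkout API", needsLowLatency: true, steadyHighVolume: true },
  { name: "product image processing", variableOrEventDriven: true },
  { name: "order-confirmation emails", variableOrEventDriven: true },
  { name: "nightly sales report", variableOrEventDriven: true }, // a schedule tick is the event
  { name: "video transcode", variableOrEventDriven: true, longRunning: true },
];

for (const w of workloads) {
  const { target, reason } = recommend(w);
  console.log(`${w.name}: ${target} (${reason})`);
}
```

Running it per component, not per system, is the point: the output is a mix of "server" and "serverless", which is the hybrid shape described earlier.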

A note on cold starts

Cold starts get talked about more than they usually deserve, so it's worth being precise. A cold start is the extra latency the first time a function runs after being idle — the platform has to allocate resources and load your code before it can respond. Subsequent requests hit a "warm" instance and are fast.

Whether this matters depends entirely on your workload. For a background job or an event processor, nobody is waiting, so a few hundred milliseconds of cold-start latency is irrelevant. For a user-facing API endpoint with sporadic traffic, it can mean an occasional slow first request. The mitigations — provisioned concurrency to keep instances warm, smaller deployment packages, lighter runtimes — help, but note that provisioned concurrency partly undoes the scale-to-zero cost benefit. If consistent low latency is a hard requirement, that's your signal the workload belongs on an always-warm server, not a signal to fight the platform.
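
Before reaching for mitigations, it helps to know how often cold starts actually happen. One common approach, sketched here under the assumption that your platform reuses a loaded module across warm invocations, is to set a module-level flag that is true only for the first request an instance serves. `doWork` is a hypothetical placeholder for the real handler logic.

```javascript
// Module scope runs once per instance, so this flag is true only for the
// first invocation that a given instance serves.
let isColdStart = true;
const instanceStartedAt = Date.now();

// Export this the way your platform expects.
async function handler(event) {
  const coldStart = isColdStart;
  isColdStart = false;

  const result = await doWork(event);

  // Emit a structured log line so cold starts can be counted later.
  console.log(JSON.stringify({
    coldStart,
    instanceAgeMs: Date.now() - instanceStartedAt,
  }));

  return { statusCode: 200, coldStart, body: result };
}

// Hypothetical placeholder for the real work.
async function doWork(event) {
  return `processed ${event.id}`;
}
```

Counting the log lines where `coldStart` is true against total invocations gives a rough cold-start rate for the endpoint. If that rate is tiny, or the function is a background job, the mitigations above are probably not worth their cost.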

Where this fits

Serverless is often how individual microservices get deployed, so it pairs naturally with the architecture decision covered in "Microservices vs Monolith: Which Should You Choose?" For the full picture of deploying, scaling, and architecting a production app, start at the hub: "Deploying, Scaling & Architecting Full-Stack Apps".

Amine
