5+ years software engineer
Use serverless for event-driven, variable-traffic, pay-per-use workloads. Avoid it for long-running, latency-critical, or stateful ones. That single rule covers most decisions — the rest of this article is about recognizing which side of the line your workload sits on, because most real systems have some of both.
"Serverless" doesn't mean no servers; it means you don't manage them. You deploy functions (or managed services), the platform runs them on demand, scales them automatically, and bills you per request and per millisecond of execution. When nothing is running, you pay nothing — scale to zero.
APIs with unpredictable traffic. If your load is spiky — quiet at 3am, slammed during a product launch — serverless scales up and down automatically and you only pay for what you use. No idle servers, no scramble to provision.
Event processing. File uploaded to storage, message on a queue, webhook received — these are naturally event-driven and map perfectly to functions that fire, do their work, and disappear.
Background and scheduled jobs. Nightly reports, cleanup tasks, cron-style work. Spinning up a function on a schedule is cheaper and simpler than keeping a server alive to run something once a day.
MVPs and side projects. Scale-to-zero means an idle app costs essentially nothing, and there's no infrastructure to maintain. You get to production faster.
Microservices. Individual, independently deployable functions are a natural fit for a service-oriented split — each service scales and bills on its own.
Here's a concrete shape for the good-fit case — an image-processing function that fires when a file lands in storage:
```js
// Triggered by a storage upload event: runs, does its job, disappears.
// `storage` and `resize` are placeholders for your platform's storage SDK and an image library.
export async function handler(event) {
  const { bucket, key } = event.record;
  const image = await storage.get(bucket, key);
  const thumb = await resize(image, { width: 400 });
  await storage.put(bucket, `thumbnails/${key}`, thumb);
  return { statusCode: 200 };
}
```
Nothing runs when no images are uploaded. The function scales to handle a thousand simultaneous uploads with no scaling configuration on your side (within the platform's concurrency limits), and you pay only for the milliseconds it actually executes. That's the serverless sweet spot in one function.
Long-running tasks. Serverless functions have execution time limits (commonly up to ~15 minutes on major platforms). A video transcode, a big data export, or a long batch job will hit the ceiling. Use a container or a dedicated worker instead.
Latency-sensitive workloads. Cold starts — the delay when a function spins up from idle — add latency to the first request after a quiet period. For a real-time system where every millisecond counts (the kind of low-latency work I did on AMG), an always-warm server is more predictable.
Heavy in-memory state. Functions are stateless and ephemeral. If your workload keeps a large cache or long-lived connections in memory, serverless fights you — you'd push that state to an external store on every invocation, which is slow and costly.
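To see why externalized state adds up, here is a minimal sketch of a stateless visit counter. The store is a hypothetical in-memory stand-in (a `Map` playing the role of Redis or a database), and `createFakeStore` and `countVisit` are names invented for illustration. The point is only that every invocation pays for network round trips that an in-process variable would not.

```javascript
// Hypothetical external store. A Map stands in for a real network-backed
// store so the sketch runs anywhere; roundTrips counts simulated network calls.
function createFakeStore() {
  const data = new Map();
  let roundTrips = 0;
  return {
    async get(key) { roundTrips += 1; return data.get(key); },
    async set(key, value) { roundTrips += 1; data.set(key, value); },
    get roundTrips() { return roundTrips; },
  };
}

// A stateless handler body: nothing is remembered between invocations,
// so the counter is read from and written back to the store each time.
async function countVisit(store, userId) {
  const key = `visits:${userId}`;
  const current = (await store.get(key)) ?? 0;
  const next = current + 1;
  await store.set(key, next);
  return next;
}
```

Three invocations cost six round trips here. This read-then-write is also not atomic, so a real store would need its own atomic increment to stay correct under concurrent invocations.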
Stable, high-volume traffic. This is the counterintuitive one: if your traffic is predictably high and constant, per-request pricing becomes more expensive than a reserved server running flat-out. Serverless economics win on variable load, not sheer volume. Past a certain steady request rate, a right-sized always-on server is simply cheaper — and you can reserve it for a further discount.
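The break-even point is simple arithmetic, sketched below. Every price is a made-up placeholder rather than any provider's real rate, the function names are hypothetical, and the sketch assumes a single always-on server can absorb the whole load. Substitute your own provider's numbers before drawing conclusions.

```javascript
// Illustrative placeholder prices only: NOT real provider rates.
const ASSUMED_PRICES = {
  pricePerMillionRequests: 0.25, // USD per million invocations (hypothetical)
  pricePerGbSecond: 0.00002,     // USD per GB-second of execution (hypothetical)
  serverMonthlyCost: 60,         // USD for one always-on server (hypothetical)
};

// Per-request fee plus compute time (duration x memory) for a month of traffic.
function serverlessMonthlyCost(requestsPerMonth, avgDurationMs, memoryGb, p = ASSUMED_PRICES) {
  const requestCost = (requestsPerMonth / 1e6) * p.pricePerMillionRequests;
  const gbSeconds = requestsPerMonth * (avgDurationMs / 1000) * memoryGb;
  return requestCost + gbSeconds * p.pricePerGbSecond;
}

// Monthly request count at which serverless costs the same as the server.
function breakEvenRequestsPerMonth(avgDurationMs, memoryGb, p = ASSUMED_PRICES) {
  const costPerRequest =
    p.pricePerMillionRequests / 1e6 +
    (avgDurationMs / 1000) * memoryGb * p.pricePerGbSecond;
  return p.serverMonthlyCost / costPerRequest;
}
```

With these placeholder prices, a 100 ms function at 0.5 GB breaks even at roughly 48 million requests a month. Below that the function is cheaper; above it the server wins, before any reserved-capacity discount.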
Vendor lock-in concerns. Serverless functions tend to lean on platform-specific triggers, event formats, and adjacent managed services. That's fine — often worth it — but it means moving off that provider later is real work. If avoiding lock-in is a hard requirement, containers running the same code anywhere give you more portability.
| Factor | Serverless | Traditional servers |
|---|---|---|
| Scaling | Automatic, per-request | Manual or autoscaling groups you configure |
| Cost model | Pay per execution; scale to zero | Pay for provisioned capacity, idle or not |
| Best traffic pattern | Spiky, unpredictable, low-average | Steady, high, predictable |
| Cold starts | Yes, on idle functions | None (always warm) |
| Execution time | Capped (minutes) | Unlimited |
| State | Stateless / ephemeral | Persistent in-process state OK |
| Ops burden | Minimal — no infra to manage | You own patching, scaling, uptime |
| Long-running jobs | Poor fit | Good fit |
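
The table can be compressed into a rough rule of thumb, sketched here as a small function. The workload fields and the `suggestModel` name are hypothetical, and the default cap reuses the ~15-minute figure mentioned earlier. Treat it as a checklist in code form, not a definitive decision procedure.

```javascript
// Assumed execution cap, taken from the ~15 minute figure quoted above.
const ASSUMED_MAX_RUN_SECONDS = 15 * 60;

// Returns a suggested model plus the table rows that pushed away from serverless.
// `workload` is a hypothetical shape:
//   { expectedRunSeconds, latencyCritical, needsInProcessState, traffic }
function suggestModel(workload, maxRunSeconds = ASSUMED_MAX_RUN_SECONDS) {
  const reasons = [];
  if (workload.expectedRunSeconds > maxRunSeconds) {
    reasons.push("run time exceeds the execution cap");
  }
  if (workload.latencyCritical) {
    reasons.push("cold starts are unacceptable");
  }
  if (workload.needsInProcessState) {
    reasons.push("relies on persistent in-process state");
  }
  if (workload.traffic === "steady-high") {
    reasons.push("steady high volume favors provisioned capacity");
  }
  return { model: reasons.length === 0 ? "serverless" : "servers", reasons };
}
```

Running it per component rather than per system is what surfaces the mixed answers: a thumbnail job and a checkout API in the same product will usually land on different sides.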
In practice you rarely pick one for the whole system. The mature answer is hybrid: run your steady, latency-sensitive core on servers or containers, and offload the spiky, event-driven, or scheduled edges to serverless functions.
A concrete example. An e-commerce platform might run its main API on always-warm containers for predictable checkout latency, while pushing these to serverless:
Each piece runs where its workload profile fits best. That's the whole game — match the workload to the model rather than forcing everything into one.
Ask three questions:
Most systems answer "some of each" — which is exactly why hybrid is the norm.
Cold starts get talked about more than they usually deserve, so it's worth being precise. A cold start is the extra latency the first time a function runs after being idle — the platform has to allocate resources and load your code before it can respond. Subsequent requests hit a "warm" instance and are fast.
Whether this matters depends entirely on your workload. For a background job or an event processor, nobody is waiting, so a few hundred milliseconds of cold-start latency is irrelevant. For a user-facing API endpoint with sporadic traffic, it can mean an occasional slow first request. The mitigations — provisioned concurrency to keep instances warm, smaller deployment packages, lighter runtimes — help, but note that provisioned concurrency partly undoes the scale-to-zero cost benefit. If consistent low latency is a hard requirement, that's your signal the workload belongs on an always-warm server, not a signal to fight the platform.
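The cold/warm distinction can be seen in miniature in the sketch below. It assumes, as is typical on function platforms, that module-scope variables survive between invocations served by the same warm instance. `createExpensiveClient` is a hypothetical stand-in for whatever slow setup your function does, such as loading config or opening a connection.

```javascript
// Module scope: initialized once per instance, reused while the instance stays warm.
let client = null;
let initCount = 0;

// Hypothetical stand-in for slow setup work (config load, connection, SDK init).
function createExpensiveClient() {
  initCount += 1;
  return { query: (id) => `row-${id}` };
}

async function handler(event) {
  // Only the first invocation on a fresh instance pays the setup cost.
  const coldStart = client === null;
  if (coldStart) {
    client = createExpensiveClient();
  }
  return { coldStart, result: client.query(event.id) };
}
```

The first call on a fresh instance reports `coldStart: true`, and later calls on that instance reuse the client. This trims repeated setup on warm instances, but it does nothing about the platform's own time to allocate an instance and load your code.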
Serverless is often how individual microservices get deployed, so it pairs naturally with the architecture decision in Microservices vs Monolith: Which Should You Choose?. For the full picture of deploying, scaling, and architecting a production app, start at the hub: Deploying, Scaling & Architecting Full-Stack Apps.