Don't Fear the Vercel Bill: A Developer's Guide to Cost Optimization

By 10xdev team · August 02, 2025

There is a lot of fear around how expensive Vercel can be, fueled by stories of terrible bills that float around online. After auditing numerous codebases that caused these huge bills, a pattern of common mistakes emerges. This article explores the ways you can use Vercel incorrectly and, more importantly, how to use it right.

To showcase just how bad things can be, an app was built implementing all the common pitfalls that lead to massive Vercel bills. We will go through and fix them, so you can find them in your own codebase and prevent a runaway invoice.

1. The Hidden Cost of the 'public' Directory

It can seem like the most innocent thing in the world: you place a video in the public directory, add it to a <video> tag, and it just works. This is great, right? Totally safe?

Except that for services like Vercel, infrastructure is expensive, especially for bandwidth. The reason Vercel charges so much for bandwidth compared to providers like Hetzner is that everything in the public directory gets distributed via a Content Delivery Network (CDN). Good CDNs are expensive.

The benefit of a CDN is that small assets, like a favicon, are served from a location close to your users, which is highly beneficial for performance. Even Cloudflare acknowledged this when they built R2, which, despite being cheaper for file hosting, is much slower than a CDN.

Because of this, putting large files in the public folder is costly. If an asset is too large to reasonably serve in a single response, it shouldn't be in there. A general rule: if a file is more than a few kilobytes, do not put it in the public directory.

The easiest solution is to host large files on a dedicated storage service such as AWS S3 or Cloudflare R2. After uploading your file to one of these services, you simply get back a URL.
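
For illustration, here is a minimal upload sketch using the AWS SDK for JavaScript; the bucket, region, and file names are placeholders, not values from this article:

// upload-video.mjs — a sketch; 'your-bucket' and the paths are placeholders
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { readFile } from 'node:fs/promises';

const s3 = new S3Client({ region: 'us-east-1' });

// Upload the large file once; afterwards it is served from the bucket's URL
// instead of eating Vercel bandwidth.
await s3.send(new PutObjectCommand({
  Bucket: 'your-bucket',
  Key: 'videos/my-large-video.mp4',
  Body: await readFile('./my-large-video.mp4'),
  ContentType: 'video/mp4',
}));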

All you have to do is swap the source out in your code:

<!-- Before -->
<video src="/my-large-video.mp4"></video>

<!-- After -->
<video src="https://your-file-host.com/path/to/my-large-video.mp4"></video>

That's it. This simple change can potentially save you from a very expensive bill, turning thousands of dollars in potential bandwidth costs into zero, as many of these services have generous free tiers and do not charge for egress in the same way.

2. Mastering Next.js Image Optimization

On the topic of assets, there is another edge case that developers run into. Imagine a page that grabs a thousand random Pokémon sprites. This is doing something right: using the Next.js <Image> component. This is awesome because it can take a large image and compress it significantly, sometimes down to just a few kilobytes.

However, it's crucial to understand how Vercel bills for its Image Optimizer:

- Free Plan: 1,000 image optimizations.
- Pro Plan: 5,000 free optimizations, then $5 per 1,000.

That $5 per thousand optimizations is not cheap. A couple of mistakes were made in this Pokémon sprite implementation.

First, the files being referenced are already small and don't really need optimization. It's nice to have them on the Vercel CDN, but it's not necessary.
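If you want to keep the component's layout benefits for tiny images like these without paying for optimization, next/image accepts an unoptimized prop that serves the file as-is and skips the optimizer entirely. A sketch (the sprite URL and sizes here are illustrative):

import Image from 'next/image';

// Tiny sprites gain nothing from optimization; `unoptimized` serves the file
// directly, so it never counts against the optimization quota.
export function Sprite({ id }) {
  return (
    <Image
      src={`https://raw.githubusercontent.com/PokeAPI/sprites/master/sprites/pokemon/${id}.png`}
      alt={`Pokémon sprite #${id}`}
      width={96}
      height={96}
      unoptimized
    />
  );
}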

The much bigger mistake is how the allowed image sources are configured. In next.config.js, you might have something like this:

// next.config.js
module.exports = {
  images: {
    remotePatterns: [
      {
        protocol: 'https',
        hostname: 'raw.githubusercontent.com',
        port: '',
        pathname: '**',
      },
    ],
  },
};

This configuration allows any path from that hostname. If other people are hosting random images on GitHub, they could use your optimization endpoint to generate tens of thousands of additional image optimizations, running up your bill.

An "image optimization" occurs for each unique image URL. If you render an image at 1000x1000 pixels and another version at 200x200, Vercel only bills you based on the unique source URL, not the different sizes generated.

The quick fix is to make the pathname more restrictive:

// next.config.js
module.exports = {
  images: {
    remotePatterns: [
      {
        protocol: 'https',
        hostname: 'raw.githubusercontent.com',
        port: '',
        pathname: '/PokeAPI/sprites/master/**',
      },
    ],
  },
};

Now, the app will only optimize images that come from that specific repository path. As long as the repo isn't compromised, you're good. This same principle applies if you use a file hosting service. Ensure your configuration only allows URLs from your specific application or account, not the entire hosting service.
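
For example, if your files live in your own S3 bucket, a configuration along these lines (the bucket name is a placeholder) keeps the optimizer locked to your content:

// next.config.js — sketch: only optimize images from your own bucket
module.exports = {
  images: {
    remotePatterns: [
      {
        protocol: 'https',
        hostname: 'your-bucket.s3.amazonaws.com',
        pathname: '/uploads/**',
      },
    ],
  },
};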

3. Taming Expensive Serverless Functions

Bandwidth is just one part of the equation. Serverless functions can also get expensive if not handled correctly.

Let's say you've built a blog with a data model for posts, comments, and users. You might write an API endpoint to fetch all this data at once. A naive implementation could look something like this:

// An example of an inefficient API route
import { db } from './db';

export async function getBlogPostData(postId) {
  // 1. Get the post
  const post = await db.post.findFirst({ where: { id: postId } });

  // 2. Get the comments for the post
  const comments = await db.comment.findMany({ where: { postId: post.id } });

  // 3. Get the post's author
  const author = await db.user.findFirst({ where: { id: post.userId } });

  // 4. Get the users for all the comments
  const commentUserIds = comments.map(comment => comment.userId);
  const commentUsers = await db.user.findMany({ where: { id: { in: commentUserIds } } });

  // ... logic to combine data ...
  return { post, author, comments, commentUsers };
}

Hopefully, you can quickly see the problem. This code makes multiple sequential, blocking calls to the database. If each call takes just 50 milliseconds, the total minimum compute time is 200ms. This is hilariously bad.

The Quick Fix: Parallelize Queries

A dumb, quick fix is to run non-dependent queries in parallel.

// The post must be fetched first; the two queries below depend only on it,
// so they can run concurrently.
const post = await db.post.findFirst({ where: { id: postId } });

const commentsPromise = db.comment.findMany({ where: { postId: post.id } });
const authorPromise = db.user.findFirst({ where: { id: post.userId } });

const [comments, author] = await Promise.all([commentsPromise, authorPromise]);

This at least runs two queries at the same time. But we can do much better.

The Real Fix: Use SQL Relations

The ideal solution is to write a single query that gets all the data in one pass using relations, which modern ORMs like Prisma make easy.

// A single, efficient Prisma query
const post = await db.post.findFirst({
  where: { id: postId },
  include: {
    user: true, // The author
    comments: {
      include: {
        user: true, // The user for each comment
      },
    },
  },
});

This single query is vastly simpler and more performant, so the request resolves in a fraction of the time. In one of the massive Vercel bills that was audited, requests were taking over 20 seconds and making more than 15 blocking Prisma calls. A single Promise.all cut request times by 90%, and switching to relations cut them further still: about 30 minutes of work took the runtime from over 20 seconds to under 2 seconds.

You need to know how to use a database. Vercel's infrastructure scales so well that even absolute garbage code can function, but it will cost you.

4. Leveraging Concurrency and Background Jobs

Vercel recently added a feature to make long-running requests less painful: increased function concurrency. When a function is waiting on an external resource (like a database or API), other requests can be processed on the same Lambda instance. This can reduce your bill by half or more, especially if you have long requests, such as waiting 20+ seconds for an AI to generate something.

For very long-running tasks, an even better solution is queuing. Instead of having your server wait for data, you can throw the task into a queue. The service generating your data can then update the queue when it's done.

- Inngest is a popular service for creating durable functions that can handle these workflows.
- trigger.dev is an open-source solution for background jobs with no timeouts.

If you have requests that must take a long time, you should probably put them in a queue instead of letting your servers eat the cost of waiting.
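
Here is a rough sketch of what that looks like with Inngest; the event name, IDs, and helper functions are made up for illustration:

import { Inngest } from 'inngest';

const inngest = new Inngest({ id: 'my-app' });

// Hypothetical stand-ins for your real slow work and persistence layer.
async function callSlowAiService(prompt) { /* ... call your AI provider ... */ }
async function saveResult(userId, result) { /* ... write to your database ... */ }

// Instead of holding a request open for 20+ seconds, emit an event and let a
// durable function do the work in the background, with retries per step.
export const generateReport = inngest.createFunction(
  { id: 'generate-report' },
  { event: 'app/report.requested' },
  async ({ event, step }) => {
    const result = await step.run('call-slow-ai', () =>
      callSlowAiService(event.data.prompt)
    );
    await step.run('save-result', () => saveResult(event.data.userId, result));
  }
);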

5. The Power of Server-Side Caching

You don't always have to re-compute data. Sometimes, you can skip it entirely.

Let's say a query takes 10 seconds to run, but the data it resolves doesn't change often. You can use Next.js's unstable_cache function (a stable version is expected soon).

import { unstable_cache } from 'next/cache';
import { db } from './db';

const getCachedBlogPost = unstable_cache(
  async (postId) => {
    // This expensive query now gets cached
    return db.post.findFirst({
      where: { id: postId },
      include: { /* ... relations ... */ },
    });
  },
  ['blog-post'], // Cache key parts
  { tags: ['blog-post'] } // Cache tag for revalidation
);

With this special cached function, the first time it's called, it does the work. From that point forward, the data is cached, and you don't have to run the expensive query again. If the call took 10 seconds, it will now only take 10 seconds the very first time.

But what happens when the data changes? When a user leaves a comment, you need to invalidate the cache.

import { revalidateTag } from 'next/cache';

export async function leaveComment(data) {
  const comment = await db.comment.create({ data });

  // Invalidate the cache for our blog post
  revalidateTag('blog-post');

  return comment;
}

Now, you only have to run the heavy query once per comment, not once per page view. This is a huge win. A common mistake is fetching user data from the database on every single request. If you cache that user data, most requests become instantaneous instead of blocking on a database call.
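
The same pattern works for user data. A minimal sketch, assuming the same db client as in the earlier examples:

import { unstable_cache } from 'next/cache';
import { db } from './db';

// Cached per userId (function arguments are part of the cache key), so
// repeated requests skip the database until the 'users' tag is revalidated.
const getCachedUser = unstable_cache(
  async (userId) => db.user.findFirst({ where: { id: userId } }),
  ['user-by-id'], // Cache key parts
  { tags: ['users'] } // Call revalidateTag('users') after profile updates
);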

Note: This caching is happening entirely on the server. It has nothing to do with client-side caching libraries like React Query. The server makes a call to the database, and you are telling the server, "Hey, once this is done, you don't have to call the database anymore; just use the result from the cache." This is effectively a wrapper around a Key-Value store (like Vercel KV or Redis) in the cloud.

6. Static First: Avoid Unnecessary Computation

You shouldn't have to make an API call to load a static blog post. A common failure is forcing pages to be dynamic when they don't need to be. You might see this line in a page component:

export const dynamic = 'force-dynamic';

This forces the page to be generated on the server every single time a user visits it. If the page has no user-specific data (like a blog post, a terms of service page, or documentation), it should be static. Loading this page shouldn't require any compute at all.
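For a blog, that usually means prerendering posts at build time. A minimal sketch, assuming a Prisma-style db client and a slug field on posts (both assumptions, not from the original code):

// app/blog/[slug]/page.js — a sketch; the schema and import path are assumed
import { db } from '../../db';

// Prerender every post at build time...
export async function generateStaticParams() {
  const posts = await db.post.findMany({ select: { slug: true } });
  return posts.map(({ slug }) => ({ slug }));
}

// ...and re-generate each page at most once per hour (ISR),
// instead of computing it on every request.
export const revalidate = 3600;

export default async function BlogPostPage({ params }) {
  const { slug } = await params;
  const post = await db.post.findFirst({ where: { slug } });
  return <article>{post.content}</article>;
}
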

Thankfully, Vercel makes this easy to check. When you run a build, the output will show you which routes are static and which are dynamic:

- ● (filled circle): Static
- ƒ (lambda symbol): Dynamic

You want to make sure your heavy pages that are static by nature are actually being built as static HTML.

7. Choosing the Right Analytics

There's a tab in every Vercel dashboard for Analytics. It's important to note that these are web analytics, not product analytics.

- Web Analytics (like Google Analytics) tells you which pages people are going to and provides counts.
- Product Analytics (like Amplitude or Mixpanel) lets you track the specific journey of a user through your site.

Vercel's analytics can get expensive. The Pro plan is capped at 20 million events per month, priced at $14 per 100,000 events. It's generally worth evaluating other solutions, especially for product analytics, as they may be more cost-effective and offer different features.

8. The Ultimate Safety Net: Spend Management

If you are still concerned about a massive, multi-thousand-dollar bill appearing out of nowhere, Vercel has a solution: Spend Management.

In your project's Billing tab, you can set a hard spend limit: specify the maximum amount you want to spend and manage when you get notifications. If you are concerned that usage will spike, this provides a safety net. It does mean your service will go down if the limit is hit, but that usually only happens if you've implemented things very wrong or your service has gone viral.

With good architecture, the compute cost of each request can be hilariously low, making it difficult for even a DDoS attack to generate a significant bill. Most apps are probably fine, but if you are the nervous type, the switch is there to be flipped.

All of these tips apply to other platforms too. You can use these same principles to be more successful on Netlify, Cloudflare, or any other serverless platform. Build your apps in a way that you understand, and try your best to keep the complexity down.
