Optimizing NestJS Performance with HTTP Keep-Alive

HTTP Keep-Alive, also known as HTTP persistent connection, is the practice of reusing a single TCP connection to send multiple HTTP requests and responses, instead of opening a new connection for every request (HTTP persistent connection – Wikipedia). In modern HTTP/1.1, connections are persistent by default, meaning the server won’t close the TCP socket immediately after responding (unless explicitly told to do so). Enabling and tuning Keep-Alive can have a big impact on performance for high-throughput APIs and microservices.

This post will explain what Keep-Alive is, why it matters for performance, and how to use it effectively in a NestJS application – both on the server side and when your NestJS app makes HTTP requests to others. We’ll include code examples and discuss best practices (like timeouts, max sockets, and connection reuse) as well as when Keep-Alive is beneficial and when it might be counterproductive.

What is HTTP Keep-Alive and Why Does It Matter?

HTTP Keep-Alive (persistent connections) means reusing the same TCP connection for multiple HTTP requests/responses instead of closing it after each response (HTTP persistent connection – Wikipedia). Without persistent connections (the HTTP/1.0 model, or HTTP/1.1 with Connection: close), every exchange goes through the full cycle: open TCP connection -> send request -> get response -> close connection. With Keep-Alive, the “close connection” step is deferred, allowing subsequent requests to skip the TCP handshake overhead and use the existing channel.

This reuse of connections has several performance benefits:

Reduced Latency: Subsequent requests have no TCP handshake or TLS negotiation delay, avoiding extra round-trips (HTTP persistent connection – Wikipedia). For example, a TLS handshake alone can add ~60 ms (3 network RTTs) of latency (Enable HTTPS keepAlive in Node.js – DEV Community). With Keep-Alive, after the first request, later requests can be 70% faster since they skip handshake steps (Enable HTTPS keepAlive in Node.js – DEV Community).
Lower CPU and Network Overhead: Establishing a connection (especially HTTPS) is CPU-intensive. Reusing connections means fewer new connections to set up, which reduces CPU usage on both client and server (HTTP persistent connection – Wikipedia). It also means fewer packets overall (no repeated SYN/SYN-ACK), reducing network congestion (HTTP persistent connection – Wikipedia).
Higher Throughput: For high-throughput scenarios (like microservices calling each other hundreds or thousands of times per second), using a new connection for every request can exhaust the operating system’s resources. For instance, the OS may keep closed sockets in a TIME_WAIT state for ~60 seconds, during which their local ports cannot be reused for the same destination (When should I use (or not use) the keepAlive option of the Node.js http Agent? – Stack Overflow). If your Node.js process makes a huge number of short-lived requests without Keep-Alive, it can run out of ephemeral ports because so many sockets are sitting in TIME_WAIT. Keep-Alive avoids this by reusing sockets, preventing port exhaustion (When should I use (or not use) the keepAlive option of the Node.js http Agent? – Stack Overflow).
Fewer Open/Close Cycles: Opening and closing connections repeatedly is a waste of resources. Servers need to allocate memory and file descriptors for each connection. By keeping connections open, you reduce the churn of constantly allocating and freeing these resources. This is especially important in microservice architectures where services may communicate very frequently.
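
To put a rough number on the port exhaustion risk described above, here is a back-of-the-envelope sketch. The function name is made up for illustration, and the inputs are assumptions: the common Linux default ephemeral port range (32768–60999) and the ~60 second TIME_WAIT mentioned above, with the lingering sockets assumed to sit on the client side. Your system may be configured differently.

```typescript
// Rough ceiling on how many NEW connections per second one client can open to
// a single destination (IP:port) before every ephemeral port is stuck in
// TIME_WAIT. This is an estimate, not a measurement.
function maxNewConnectionsPerSecond(
  firstPort: number,
  lastPort: number,
  timeWaitSeconds: number,
): number {
  const usablePorts = lastPort - firstPort + 1;
  return Math.floor(usablePorts / timeWaitSeconds);
}

// Assumed inputs: Linux default port range and a 60 s TIME_WAIT.
const ceiling = maxNewConnectionsPerSecond(32768, 60999, 60);
console.log(`about ${ceiling} new connections per second`); // about 470
```

Under those assumptions, a service making more than a few hundred un-pooled requests per second to one upstream is already close to the limit, which is why connection reuse matters well before traffic looks extreme.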

In summary, HTTP Keep-Alive helps eliminate the overhead of repeated connections, which is why it’s on by default in HTTP/1.1 (and HTTP/2 takes it further with multiplexing). However, effectively using Keep-Alive in Node.js/NestJS requires some configuration to get the best performance.
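
If you want to see connection reuse for yourself before touching NestJS, the following sketch uses only Node’s built-in http module. It starts a throwaway local server, sends a few sequential requests through an agent, and counts how many TCP connections the server had to accept. The helper name countConnections is hypothetical and the setup is deliberately minimal.

```typescript
import * as http from 'http';
import { AddressInfo } from 'net';

// Sends `requests` sequential GETs to a temporary local server and resolves
// with the number of TCP connections that server accepted.
async function countConnections(keepAlive: boolean, requests = 3): Promise<number> {
  let connections = 0;
  const server = http.createServer((_req, res) => res.end('ok'));
  server.on('connection', () => {
    connections += 1;
  });
  await new Promise<void>((resolve) => server.listen(0, '127.0.0.1', () => resolve()));
  const { port } = server.address() as AddressInfo;

  const agent = new http.Agent({ keepAlive });
  try {
    for (let i = 0; i < requests; i += 1) {
      await new Promise<void>((resolve, reject) => {
        http
          .get({ host: '127.0.0.1', port, path: '/', agent }, (res) => {
            res.resume(); // drain the body so the socket can return to the pool
            // Wait one extra turn so the agent has put the socket back.
            res.on('end', () => setImmediate(() => resolve()));
          })
          .on('error', reject);
      });
    }
  } finally {
    agent.destroy(); // close pooled sockets so the server can shut down
    await new Promise<void>((resolve) => server.close(() => resolve()));
  }
  return connections;
}
```

Calling countConnections(true) should report a single connection for all requests, while countConnections(false) should report one connection per request.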

Enabling HTTP Keep-Alive on a NestJS Server

NestJS applications (when using the default Express HTTP adapter) run on Node’s built-in HTTP server. Out of the box, Node’s HTTP server will allow persistent connections, but it has default timeout settings that may not be optimal for high-throughput use cases. In Node.js, the server’s default keep-alive timeout is only 5 seconds (Tuning HTTP Keep-Alive in Node.js). This means if a client doesn’t send a new request within 5 seconds after the last response, the server will close the connection. For many APIs, 5 seconds may be too short – especially if clients make bursts of requests with pauses slightly longer than 5 seconds, causing connections to drop and re-establish frequently.

Tuning the server’s Keep-Alive settings: We can increase the keep-alive timeout on the NestJS server to keep connections open longer. NestJS gives us access to the underlying Node http.Server instance, which has properties to configure timeouts. For example, in your main.ts (bootstrap file) you can do the following:

import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
import * as http from 'http';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  const server: http.Server = app.getHttpServer();

  // Set Keep-Alive and header timeouts (in milliseconds)
  server.keepAliveTimeout = 30000;   // 30 seconds keep-alive timeout
  server.headersTimeout = 31000;     // 31 seconds headers timeout (should be a bit more than keepAliveTimeout)

  await app.listen(3000);
}
bootstrap();

In the above example, we set the server to keep idle HTTP connections open for 30 seconds. We also set headersTimeout slightly higher (31 seconds): the cited guide recommends that the headers timeout (the maximum time the server waits to receive a request’s complete headers) be slightly greater than the keep-alive timeout (Tuning HTTP Keep-Alive in Node.js). This prevents the server from terminating a socket exactly as a client is sending a new request. The code is effectively the same as you would use in a plain Node.js HTTP server (NestJS tip: how to change HTTP server timeouts – DEV Community), but accessed through Nest’s app.getHttpServer().

Why increase keep-alive timeout? If your NestJS service is behind a load balancer or receives traffic in bursts, a longer keep-alive can reduce the need to frequently reconnect. For example, AWS Elastic Load Balancers have a default idle timeout of 60 seconds. If Node’s keep-alive timeout remained 5 seconds, the Node server would close connections much sooner than the ELB, possibly leading to “502 Bad Gateway” errors when the ELB tries to reuse a connection the server has already closed (Tuning HTTP Keep-Alive in Node.js). By configuring Node’s server.keepAliveTimeout to slightly exceed the LB’s timeout (e.g. ~61 seconds for a 60s ELB timeout) (Tuning HTTP Keep-Alive in Node.js), you ensure the server doesn’t drop the connection first. In general, make sure the server’s keep-alive timeout is longer than the idle timeout of any client or proxy in front of it to avoid unexpected connection resets (Tuning HTTP Keep-Alive in Node.js).
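
The rule above (server keep-alive slightly longer than the proxy’s idle timeout, headers timeout slightly longer again) can be captured in a small helper. This is a sketch: serverTimeoutsFor and applyTimeouts are hypothetical names, and the one-second margin is an assumption chosen to reproduce the ~61 second example.

```typescript
import * as http from 'http';

interface ServerTimeouts {
  keepAliveTimeout: number; // milliseconds
  headersTimeout: number; // milliseconds
}

// Derive server timeouts from the idle timeout of the proxy or load balancer
// sitting in front of the service.
function serverTimeoutsFor(proxyIdleSeconds: number, marginMs = 1000): ServerTimeouts {
  const keepAliveTimeout = proxyIdleSeconds * 1000 + marginMs;
  return { keepAliveTimeout, headersTimeout: keepAliveTimeout + marginMs };
}

function applyTimeouts(server: http.Server, timeouts: ServerTimeouts): void {
  server.keepAliveTimeout = timeouts.keepAliveTimeout;
  server.headersTimeout = timeouts.headersTimeout;
}

// Example with a 60-second load balancer idle timeout.
const exampleServer = http.createServer();
applyTimeouts(exampleServer, serverTimeoutsFor(60));
console.log(exampleServer.keepAliveTimeout, exampleServer.headersTimeout); // 61000 62000
```

In a NestJS bootstrap function the same call would take the server returned by app.getHttpServer(), for example applyTimeouts(app.getHttpServer(), serverTimeoutsFor(60)).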

Note: Keep in mind that a very long keep-alive timeout means the server will hold many connections open, which could use more memory. If you have tens of thousands of clients, you might not want each to hold an idle connection for 2 minutes. Choose a timeout that balances reuse benefits against resource usage. For many high-throughput APIs, something on the order of 30–60 seconds is reasonable in production, but your needs may vary.

If you’re using Fastify (the alternative HTTP adapter for NestJS), the same concept applies: Node’s HTTP server is still under the hood, so you can access it and set keepAliveTimeout in the same way. Fastify also accepts keepAliveTimeout as a server option, and its documented default is 72 seconds rather than Node’s 5, so check which value is actually in effect before tuning.

Using HTTP Keep-Alive in HTTP Clients (Axios, http/https) for NestJS

In addition to configuring your NestJS server, it’s equally important to make sure Keep-Alive is used for outgoing HTTP requests made by your NestJS app. This is relevant if your NestJS service calls external APIs or microservices (e.g., Service A calls Service B’s REST API). Whether this happens by default depends on your Node.js version. Before Node.js 19, the default global agent did not keep connections alive and opened a new TCP connection for each request unless instructed otherwise (Reusing Connections with Keep-Alive in Node.js – AWS SDK for JavaScript) (HTTP | Node.js v23.11.0 Documentation). Since Node.js 19, the global agent enables keepAlive by default, but an agent you create yourself with new http.Agent() still defaults to keepAlive: false. Configuring Keep-Alive explicitly in your HTTP client libraries makes the behavior predictable across versions and lets you tune it.

Using Axios (or NestJS HttpService) with Keep-Alive

NestJS offers an HttpModule (built on Axios, in the @nestjs/axios package) to make HTTP requests. Axios, when run in Node.js, uses Node’s http/https modules under the hood, so connection reuse follows whichever agent is in effect. The Axios docs note that you can define a custom httpAgent and httpsAgent to enable options like keepAlive that are not enabled by default (axios – NestJS). Passing your own agents is the reliable way to get connection reuse regardless of the Node.js version.

To use Keep-Alive with Axios in a NestJS service, you have two main options:

Use Nest’s HttpModule with a custom Axios config: When importing the HttpModule, you can pass a configuration object that Nest will use to create the Axios instance. For example:

import { Module } from '@nestjs/common';
import { HttpModule } from '@nestjs/axios';
import * as http from 'http';
import * as https from 'https';

@Module({
  imports: [HttpModule.register({
    timeout: 5000,  // request timeout in ms (example)
    maxRedirects: 5,
    // Enable persistent connections:
    httpAgent: new http.Agent({ keepAlive: true }),
    httpsAgent: new https.Agent({ keepAlive: true }),
  })],
  // ...providers, controllers
})
export class AppModule {}

This configures Axios to use a keep-alive HTTP agent for both HTTP and HTTPS. Once this is set up, you can inject HttpService into your services and controllers, and any outbound HTTP call will reuse connections. For instance:

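import { Injectable } from '@nestjs/common';
import { HttpService } from '@nestjs/axios';
import { firstValueFrom } from 'rxjs';

@Injectable()
export class ApiService {
  constructor(private readonly http: HttpService) {}

  async getData() {
    // HttpService returns an Observable; firstValueFrom replaces the deprecated toPromise()
    const resp = await firstValueFrom(this.http.get('https://api.example.com/data'));
    return resp.data;
  }
}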

Under the hood, the first request to api.example.com will establish a TCP connection. Subsequent requests to the same host will reuse that connection (until it’s closed due to inactivity or other constraints). This dramatically reduces latency and CPU overhead for each call after the first.

Create and use a custom Axios instance: If you prefer not to use Nest’s HttpModule, you can directly use Axios by creating an instance with keep-alive agents. For example:

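import axios from 'axios';
import * as http from 'http';
import * as https from 'https';

// Create the agents once and share them so sockets can be pooled
const httpAgent = new http.Agent({ keepAlive: true });
const httpsAgent = new https.Agent({ keepAlive: true });

const apiClient = axios.create({
  baseURL: 'https://api.example.com',
  httpAgent,
  httpsAgent,
});

// Example usage (inside an async function, since top-level await
// is not available in CommonJS builds):
async function fetchBoth() {
  const response1 = await apiClient.get('/data');
  const response2 = await apiClient.get('/other'); // can reuse response1's socket
  return [response1.data, response2.data];
}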

Here, apiClient maintains a pool of TCP sockets through its agents, keyed by host and port, and reuses them. The second request (response2) goes to the same host, so it will reuse the connection from the first request if it’s still open. This is exactly what we want for performance. (With an agent created without keepAlive: true, or with the default agent on Node.js versions before 19, the TCP socket would be closed after the first response and response2 would open a new connection.)
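
To check that the pool is actually being used, one option is to look at the agent’s own bookkeeping. The sketch below reads the sockets, freeSockets, and requests maps that Node’s http.Agent exposes; the poolStats helper and its field names are made up for illustration.

```typescript
import * as http from 'http';

type SocketMap = Readonly<Record<string, readonly unknown[] | undefined>>;

// Total number of entries across all "host:port" keys of an agent map.
function countEntries(map: SocketMap): number {
  return Object.keys(map).reduce((sum, key) => {
    const list = map[key];
    return sum + (list ? list.length : 0);
  }, 0);
}

interface PoolStats {
  active: number; // sockets currently carrying a request
  idle: number; // keep-alive sockets waiting to be reused
  queued: number; // requests waiting for a socket (only when maxSockets is reached)
}

function poolStats(agent: http.Agent): PoolStats {
  return {
    active: countEntries(agent.sockets),
    idle: countEntries(agent.freeSockets),
    queued: countEntries(agent.requests),
  };
}

// A freshly created agent has nothing in its pool yet.
const freshAgent = new http.Agent({ keepAlive: true });
console.log(poolStats(freshAgent)); // { active: 0, idle: 0, queued: 0 }
```

Logging poolStats(httpAgent) after a few calls should show idle sockets waiting for reuse. If idle stays at zero between requests, connections are probably not being kept alive.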

When to Use Keep-Alive and When to Avoid It

While Keep-Alive is generally beneficial for performance, it’s not a silver bullet for every scenario. Here are some guidelines on when you should use it and when you might disable or limit it:

Use Keep-Alive when:

High-Throughput or Frequent Requests: If your application makes many requests to the same host (e.g., a microservice calling another service repeatedly, or an API receiving many requests from the same clients), Keep-Alive will almost always improve performance. The connection handshake cost is amortized over many requests, yielding lower latency per request and higher overall throughput.
Microservices Communication: In a microservices architecture with RESTful calls, persistent connections between services reduce overhead. This is especially true for HTTPS calls – avoiding repeated TLS handshakes can significantly cut down response times (Enable HTTPS keepAlive in Node.js – DEV Community).
Latency-Sensitive Interactions: If every millisecond counts (for example, low-latency trading systems or real-time systems), eliminating the extra 50–100ms of a handshake can be critical.
When experiencing port exhaustion or socket errors: If you’ve hit issues like ECONNRESET or “socket hang up” errors under load, it might be due to many connections opening/closing. Using Keep-Alive can mitigate issues like running out of ephemeral ports or exceeding connection limits (When should I use (or not use) the keepAlive option of the Node.js http Agent? – Stack Overflow).
Clients behind load balancers: If clients (like a load balancer or an API gateway) support Keep-Alive, enabling it on the server allows those clients to reuse connections. This can dramatically reduce load on your server by not having to perform a TCP handshake for each request. (Just be sure to coordinate timeouts as mentioned.)

Avoid or be cautious with Keep-Alive when:

Very Low Traffic or One-off Requests: If your application rarely makes more than one request to the same host in a short period, a persistent connection might not provide much benefit. For example, a cron job that pings a service once a day doesn’t gain anything from keep-alive, because the connection will time out long before the next use. In such cases, opening a new connection each time is fine, and keeping a socket open would just consume resources needlessly.
Large Number of Idle Connections: Each open connection ties up some resources on both the client and server. If you enable keep-alive indiscriminately, you could end up with lots of idle sockets. For instance, a server that has 10,000 clients with keep-alive and a long timeout might have up to 10,000 open sockets sitting idle. This can consume file descriptors and memory on the server (HTTP persistent connection – Wikipedia). If your environment can’t handle that, you might choose a shorter timeout or not use keep-alive for certain clients. Always consider the trade-off between connection reuse and resource usage.
Load Balancing Considerations: In some scenarios, reusing a connection might skew load balancing. For example, if a client holds one keep-alive connection to a load balancer, and the LB routes that connection to a specific server, all requests will hit the same server. If you expected to distribute requests across multiple servers, a single persistent connection won’t do that. (Some load balancers do per request balancing for HTTP/1.1, but others stick to per connection.) In practice this is usually handled by having many clients or multiple connections, so it’s not a huge issue, but be aware if you see uneven load distribution – you may need multiple connections to achieve parallelism.
Short-lived scripts or serverless functions: If your code runs in an environment that doesn’t persist between invocations (e.g., AWS Lambda cold starts or short CLI scripts), you might not benefit from keep-alive. The process may terminate before a second request is ever made. However, note that in serverless warm invocations, reusing connections can help if you make multiple requests within one invocation or if the runtime instance is reused for consecutive events.
Incompatibility or Bugs: On rare occasions, you might encounter a buggy intermediary or client that doesn’t handle persistent connections properly. Historically, some proxies or older HTTP/1.0 clients required Connection: close. This is uncommon today, but if you have to interact with such a system, you might disable keep-alive for that specific interaction.

In general, the advice is: enable Keep-Alive in all the places we discussed, unless you have a specific reason not to. The benefits (better latency, throughput, and efficiency) typically outweigh the downsides. Just monitor your application’s resource usage – if you see an excessive number of open idle connections, you can adjust timeouts or agent settings accordingly.

Best Practices for Keep-Alive (Timeouts, Socket Pools, and Agent Reuse)

To use HTTP Keep-Alive effectively in NestJS, keep these best practices in mind:

Match Timeouts Appropriately: Ensure your client’s idle timeout is slightly shorter than the server’s keep-alive timeout. This prevents clients from sending requests on connections that the server has already closed. For example, if your NestJS server has keepAliveTimeout = 30s, configure your HTTP client agent’s socket timeout to, say, 25s. This way, the client will proactively discard an idle connection before the server does, avoiding the ECONNRESET scenario (Tuning HTTP Keep-Alive in Node.js). If you control both ends (service to service), you can instead make the server’s timeout longer; the key is that the server should not drop the connection before the client does. In one reported case the server timed out after 5s while the client agent kept sockets for 15s, which caused errors (Tuning HTTP Keep-Alive in Node.js). Align these values to be safe.
Tune keepAliveTimeout and headersTimeout on servers: As shown earlier, increase server.keepAliveTimeout from the Node default of 5s to a more reasonable value for your use case (30s, 60s, etc.) (Tuning HTTP Keep-Alive in Node.js). Also set server.headersTimeout to a bit more than that (e.g. +5s) (Tuning HTTP Keep-Alive in Node.js). This ensures that persistent connections remain open long enough to be useful, without being so long that they hog resources indefinitely.
Reuse Agent instances: When making HTTP requests from your NestJS app, create a single http.Agent (or Axios instance) and reuse it for all requests to a given host. Do not create a new Agent for every request. Reusing an Agent is what allows the sockets to be pooled. For example, create one axiosInstance per base URL (or a few for groups of services) and use it throughout your service. If you use Nest’s HttpModule as a singleton, it already reuses the Axios HttpService for you.
Limit Max Sockets if necessary: Node’s default maxSockets for an agent is Infinity (unlimited) (HTTP | Node.js v23.11.0 Documentation), which means it will happily open as many parallel connections as you have requests. In a controlled environment (like a microservice calling another), this is usually fine and you want as many as needed. But if you fear oversubscription (maybe a bug causes a thundering herd of requests), you can set a reasonable maxSockets on your keep-alive agent. For example, if you normally expect at most 100 concurrent outgoing requests, set maxSockets: 100 to prevent using excessive connections. Similarly, you might tune maxFreeSockets (default 256), which is how many idle sockets per host the agent will keep open for reuse (HTTP | Node.js v23.11.0 Documentation). If memory is an issue, you could lower this, though typically 256 idle sockets per host is plenty.
Use keepAliveMsecs if needed: The keepAliveMsecs option on an agent (default 1000ms) is the initial delay for TCP keep-alive probes at the OS level (HTTP | Node.js v23.11.0 Documentation). Usually you don’t need to touch this, because it’s not the same as the HTTP keep-alive timeout. It’s more of a TCP-level heartbeat. In most cases, leave it at the default. If you want to aggressively detect dead peers, you might shorten it, but that can increase network chatter.
Set Request Timeouts: Keep-Alive keeps connections open, but you still should set request timeouts for your HTTP calls. For instance, if using Axios, set a timeout: 5000 (5 seconds, or whatever is appropriate) so that if the remote service doesn’t respond, your call doesn’t hang indefinitely. Keep-Alive on its own doesn’t enforce an overall request timeout. NestJS HttpModule allows a global timeout option, or you can set per-request timeouts in Axios/fetch, etc. This is just good practice to prevent stuck requests from hogging a connection forever.
Graceful Shutdown: When shutting down a NestJS application, you may want to close any persistent connections. NestJS will by default try to wait for ongoing requests to complete, but if you have keep-alive sockets open with no active requests, they could keep the process running. Node’s server.close() stops accepting new connections, but on Node.js versions before 19 idle keep-alive connections remain open until they time out (from Node.js 19 onward, server.close() also closes idle connections). To be safe, you can call server.closeIdleConnections() (available since Node.js 18.2) or set a shorter timeout just before shutdown. Similarly, for outgoing connections, you can call agent.destroy() to immediately close all sockets if the process is ending (HTTP | Node.js v23.11.0 Documentation). This ensures a fast, clean shutdown.
Monitoring and Debugging: It’s a good idea to monitor the number of open connections. Tools like netstat or Node’s process._getActiveHandles() (for debugging) can show how many sockets are open. If you see far more open sockets than expected, you might need to adjust your keep-alive strategy. NestJS doesn’t provide this directly, but you can instrument your app or use OS-level metrics.
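
Several of the practices above (client timeout below the server’s, a cap on parallel sockets, one shared agent) can be combined when the agent is created. The following is a minimal sketch, assuming a downstream service with a 30 second keep-alive timeout and at most 100 concurrent calls. The outboundAgentOptions helper and the 5 second safety margin are illustrative choices, not recommendations from the Node.js documentation.

```typescript
import * as http from 'http';

// Build agent options from what is known about the downstream service.
function outboundAgentOptions(
  serverKeepAliveMs: number, // keep-alive timeout of the service being called
  maxConcurrent: number, // most parallel requests expected to that service
  safetyMarginMs = 5000, // how much earlier than the server the client gives up
): http.AgentOptions {
  return {
    keepAlive: true,
    // Socket inactivity timeout: idle pooled sockets are dropped before the
    // server's own keep-alive timeout fires.
    timeout: Math.max(1000, serverKeepAliveMs - safetyMarginMs),
    maxSockets: maxConcurrent,
    maxFreeSockets: Math.min(maxConcurrent, 256),
  };
}

// One shared agent per downstream service, created once at startup.
const sharedAgent = new http.Agent(outboundAgentOptions(30000, 100));
console.log(sharedAgent.maxSockets, sharedAgent.maxFreeSockets); // 100 100
```

Because timeout is a socket inactivity timeout, it is worth keeping it longer than the slowest gap you expect while waiting for a response. The same options can be passed to https.Agent for HTTPS services, and calling sharedAgent.destroy() during shutdown releases any sockets still held in the pool.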

In practice, enabling HTTP Keep-Alive in NestJS (both on the server and client side) is a straightforward but powerful optimization. It reduces waste and latency, especially under high load or with chatty microservices. Just remember that with great power comes responsibility – manage your timeouts and resources wisely. By following the guidelines above, you can avoid common pitfalls (like sockets lingering too long or unexpected resets) and ensure your application reaps the performance gains of persistent connections.