SQL Query Optimization in 2026: 7 Techniques for Speed

Contents

1. Replace IN with a JOIN on a Virtual Table
Problem (1)
Solution (1)
Why It Can Be Faster (1)
2. Use ANY(ARRAY[]) Instead of IN in PostgreSQL
Problem (2)
Solution (2)
Why It Can Be Faster (2)
3. Use JOIN Instead of a Correlated Subquery
Problem (3)
Solution (3)
Why It Can Be Faster (3)
4. Use BETWEEN Instead of Date Functions
Problem (4)
Solution (4)
Why It Can Be Faster (4)
5. Use EXISTS Instead of JOIN When Checking for Existence
Problem (5)
Solution (5)
Why It Can Be Faster (5)
6. Apply Small Query Hygiene Improvements
Select Only the Columns You Need
Add LIMIT When You Only Need a Sample
Avoid Functions in WHERE Conditions
Optimize Conditional Aggregations
Use Logical Expressions for Simple Boolean Checks
7. Use ROW_NUMBER() as an Alternative to DISTINCT
Goal
Solution (7)
Why It Can Be Faster (7)
My Thoughts

Hey there. In this article, I will share a few simple ways to speed up SQL queries and make them more efficient. The examples focus on PostgreSQL, but most of the ideas apply to other relational databases too.

The goal is not to memorize tricks. The goal is to understand why certain query shapes make it easier for the database engine to use indexes, avoid repeated scans, and return results faster.

SQL optimization: because your queries deserve the fast lane.

1. Replace IN with a JOIN on a Virtual Table

Problem (1)

Having a large list of values in an IN clause can lead to many checks for each row. With a small list this may not matter, but with long lists it can make the planner’s job harder and add unnecessary CPU work.

Solution (1)

Use a virtual table with VALUES, then join it to the main table. This can give the database a query shape that is easier to optimize.

-- BEFORE:
EXPLAIN
SELECT order_id, city
  FROM orders
 WHERE city IN ('Berlin', 'Paris', 'Rome');

-- AFTER:
EXPLAIN
SELECT o.order_id, o.city
  FROM orders AS o
  JOIN (
        VALUES ('Berlin'), ('Paris'), ('Rome')
       ) AS v(city_name)
    ON o.city = v.city_name;

Why It Can Be Faster (1)

By treating the list of values as a table, PostgreSQL can sometimes create a more efficient execution plan than it would with a long IN clause.

This is especially useful when the value list is large, repeated, or generated dynamically by the application.

2. Use ANY(ARRAY[]) Instead of IN in PostgreSQL

Problem (2)

A large IN list can slow down a query because each possible match must be checked.

Solution (2)

In PostgreSQL, try the = ANY(ARRAY[...]) syntax. It can be clearer when the application already has values as an array, and PostgreSQL can stop checking once a match is found.

-- BEFORE:
EXPLAIN
SELECT product_id, quantity
  FROM order_items
 WHERE product_id IN (101, 202, 303, 404);

-- AFTER:
EXPLAIN
SELECT product_id, quantity
  FROM order_items
 WHERE product_id = ANY(ARRAY[101, 202, 303, 404]);

Why It Can Be Faster (2)

ANY can reduce unnecessary comparisons in some cases, especially when combined with a clean query plan and indexed columns.

This technique is PostgreSQL-specific, so check behavior before applying it across different database engines.

3. Use JOIN Instead of a Correlated Subquery

Problem (3)

Correlated subqueries run in relation to each row from the outer query. That can cause repeated scans of the same table and become expensive as the dataset grows.

Solution (3)

Replace the correlated subquery with a regular JOIN or a non-correlated subquery when possible.

-- BEFORE:
EXPLAIN
SELECT c.customer_id, c.name
  FROM customers AS c
 WHERE EXISTS (
       SELECT 1
         FROM orders AS o
        WHERE o.customer_id = c.customer_id
          AND o.amount > 1000
       );

-- AFTER:
EXPLAIN
SELECT DISTINCT c.customer_id, c.name
  FROM customers AS c
  JOIN orders AS o
    ON c.customer_id = o.customer_id
 WHERE o.amount > 1000;

Why It Can Be Faster (3)

A JOIN gives PostgreSQL more room to use indexes and optimize the relationship between the two tables. It also avoids executing the same lookup repeatedly for every row in the outer query.

The DISTINCT is needed here because one customer can have several qualifying orders.

4. Use BETWEEN Instead of Date Functions

Problem (4)

Expressions like this are common:

EXTRACT(YEAR FROM order_date) = 2026

The problem is that applying a function to a column can prevent the database from using a normal index on that column efficiently. PostgreSQL has to evaluate the function for many rows instead of scanning a clean date range.

Solution (4)

Use explicit date ranges with BETWEEN, or even better, an inclusive lower bound and exclusive upper bound for timestamp columns.

-- BEFORE:
EXPLAIN
SELECT *
  FROM orders
 WHERE EXTRACT(YEAR FROM order_date) = 2026
   AND EXTRACT(MONTH FROM order_date) = 5;

-- AFTER:
EXPLAIN
SELECT *
  FROM orders
 WHERE order_date BETWEEN '2026-05-01'::DATE
                     AND '2026-05-31'::DATE;

For timestamp columns, this shape is often safer:

SELECT *
  FROM orders
 WHERE order_date >= '2026-05-01'::DATE
   AND order_date <  '2026-06-01'::DATE;

Why It Can Be Faster (4)

Filtering directly on the indexed date column enables an index range scan. The database can jump to the relevant part of the index instead of applying date functions row by row.

5. Use EXISTS Instead of JOIN When Checking for Existence

Problem (5)

If you only need to know whether at least one related row exists, a JOIN can fetch more rows than necessary.

Solution (5)

Use EXISTS. It can stop as soon as it finds the first matching row.

-- BEFORE:
EXPLAIN
SELECT COUNT(DISTINCT o.order_id)
  FROM orders AS o
  JOIN order_items AS i
    ON o.order_id = i.order_id;

-- AFTER:
EXPLAIN
SELECT COUNT(DISTINCT o.order_id)
  FROM orders AS o
 WHERE EXISTS (
       SELECT 1
         FROM order_items AS i
        WHERE i.order_id = o.order_id
       );

Why It Can Be Faster (5)

The query does not need to retrieve all related rows. Once PostgreSQL finds a matching record, the condition is satisfied and the engine can move on.

This is a clean pattern for permission checks, relationship checks, and filters where the actual joined data is not needed.

6. Apply Small Query Hygiene Improvements

Some optimizations are not dramatic on their own, but they add up quickly in production systems.

Select Only the Columns You Need

Avoid SELECT * when you only need a few fields. Pulling unnecessary columns increases IO, memory use, network transfer, and sometimes prevents index-only scans.

-- BEFORE:
SELECT *
  FROM customers;

-- AFTER:
SELECT customer_id, name, email
  FROM customers;

Add LIMIT When You Only Need a Sample

If the UI or script only needs a small subset of rows, use LIMIT.

SELECT order_id, created_at, status
  FROM orders
 ORDER BY created_at DESC
 LIMIT 50;

Avoid Functions in WHERE Conditions

Prefer index-friendly predicates:

-- BEFORE:
WHERE SUBSTRING(code, 1, 3) = 'ABC'

-- AFTER:
WHERE code LIKE 'ABC%'

The second form gives the database a better chance to use an index on code.

Optimize Conditional Aggregations

In PostgreSQL, FILTER can make conditional counts clearer than repeated subqueries or verbose CASE expressions.

-- BEFORE:
EXPLAIN
SELECT SUM(CASE WHEN status = 'NEW' THEN 1 END) AS new_orders,
       SUM(CASE WHEN status = 'CLOSED' THEN 1 END) AS closed_orders
  FROM orders;

-- AFTER:
EXPLAIN
SELECT COUNT(*) FILTER (WHERE status = 'NEW')    AS new_orders,
       COUNT(*) FILTER (WHERE status = 'CLOSED') AS closed_orders
  FROM orders;

Use Logical Expressions for Simple Boolean Checks

For simple boolean logic, this:

table1.is_deleted OR table2.is_deleted

is usually clearer than a large CASE expression that returns a boolean.

7. Use ROW_NUMBER() as an Alternative to DISTINCT

Goal

Extract unique values efficiently, especially in large datasets.

Solution (7)

Sometimes ROW_NUMBER() with partitioning can be faster than DISTINCT, particularly when the partitioning columns are indexed and the query needs more control over which row is kept.

-- BEFORE:
EXPLAIN
SELECT COUNT(DISTINCT user_id)
  FROM logins
 WHERE login_date BETWEEN '2026-01-01'::DATE
                     AND '2026-01-31'::DATE;

-- AFTER:
EXPLAIN
SELECT COUNT(user_id)
  FROM (
       SELECT user_id,
              ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY user_id) AS rn
         FROM logins
        WHERE login_date BETWEEN '2026-01-01'::DATE
                            AND '2026-01-31'::DATE
       ) AS tmp
 WHERE rn = 1;

Why It Can Be Faster (7)

ROW_NUMBER() can avoid a heavy distinct-sorting operation by assigning row numbers within each group. It also gives you more flexibility if you need to choose the newest, oldest, or highest-priority row per group.

That said, always verify with EXPLAIN ANALYZE. DISTINCT can still be the better choice for simple cases.

My Thoughts

SQL query optimization is a core step in any database-backed project. The best improvements often come from small query-shape changes:

Replace long IN lists with virtual tables or PostgreSQL arrays when appropriate.
Use range predicates instead of applying functions to indexed columns.
Choose EXISTS when you only need to check whether a related row exists.
Avoid unnecessary columns, rows, repeated scans, and heavy sorting.
Keep queries readable enough that the next engineer can maintain them.

The most important habit is to measure. Use EXPLAIN, EXPLAIN ANALYZE, real data volumes, and realistic filters. A query that looks elegant in isolation may behave differently under production data distribution.

Good SQL performance is not magic. It is usually the result of making the database do less unnecessary work.

LATEST FROM MY BLOG

SQL Query Optimization in 2026: 7 Simple Techniques for Faster Database Performance

1. Replace IN with a JOIN on a Virtual Table

Problem (1)

Solution (1)

Why It Can Be Faster (1)

2. Use ANY(ARRAY[]) Instead of IN in PostgreSQL

Problem (2)

Solution (2)

Why It Can Be Faster (2)

3. Use JOIN Instead of a Correlated Subquery

Problem (3)

Solution (3)

Why It Can Be Faster (3)

4. Use BETWEEN Instead of Date Functions

Problem (4)

Solution (4)

Why It Can Be Faster (4)

5. Use EXISTS Instead of JOIN When Checking for Existence

Problem (5)

Solution (5)

Why It Can Be Faster (5)

6. Apply Small Query Hygiene Improvements

Select Only the Columns You Need

Add LIMIT When You Only Need a Sample

Avoid Functions in WHERE Conditions

Optimize Conditional Aggregations

Use Logical Expressions for Simple Boolean Checks

7. Use ROW_NUMBER() as an Alternative to DISTINCT

Goal

Solution (7)

Why It Can Be Faster (7)

My Thoughts

LATEST FROM MY BLOG

1. Replace IN with a JOIN on a Virtual Table

Problem (1)

Solution (1)

Why It Can Be Faster (1)

2. Use ANY(ARRAY[]) Instead of IN in PostgreSQL

Problem (2)

Solution (2)

Why It Can Be Faster (2)

3. Use JOIN Instead of a Correlated Subquery

Problem (3)

Solution (3)

Why It Can Be Faster (3)

4. Use BETWEEN Instead of Date Functions

Problem (4)

Solution (4)

Why It Can Be Faster (4)

5. Use EXISTS Instead of JOIN When Checking for Existence

Problem (5)

Solution (5)

Why It Can Be Faster (5)

6. Apply Small Query Hygiene Improvements

Select Only the Columns You Need

Add LIMIT When You Only Need a Sample

Avoid Functions in WHERE Conditions

Optimize Conditional Aggregations

Use Logical Expressions for Simple Boolean Checks

7. Use ROW_NUMBER() as an Alternative to DISTINCT

Goal

Solution (7)

Why It Can Be Faster (7)

My Thoughts

Related articles