Clean SQL Queries: Indentation, Joins & CTEs | Web Formatter Blog

Introduction to SQL Query Formatting
SQL (Structured Query Language) is the standard language for interacting with relational databases. While SQL is powerful and flexible, complex queries can quickly become difficult to read and maintain without proper formatting. Well-formatted SQL not only improves readability but also helps in debugging, maintenance, and collaboration.
This guide covers comprehensive best practices for formatting SQL queries, with a focus on indentation strategies, join syntax, and common table expressions (CTEs). Whether you're writing simple SELECT statements or complex analytical queries, these formatting guidelines will help make your SQL more readable and maintainable.
Why SQL Formatting Matters
- Improved readability: Well-formatted SQL is easier to scan and understand
- Easier debugging: Logical structure helps identify issues more quickly
- Better collaboration: Consistent formatting helps team members understand each other's code
- Reduced errors: Clear structure makes it easier to spot logical and syntax errors
- Easier maintenance: Well-formatted queries are easier to modify and extend
Even if you're the only person who will ever see your queries, future you will thank present you for writing clean, well-formatted SQL.
Basic Principles of SQL Formatting
Before diving into specific formatting patterns, let's establish some fundamental principles:
- Consistency: Whatever style you choose, apply it consistently
- Keywords capitalization: Either capitalize all SQL keywords (SELECT, FROM, WHERE) or use lowercase consistently
- Line breaks: Use line breaks to separate logical components of the query
- Indentation: Use consistent indentation to show the hierarchical structure
- Alignment: Align related elements to improve readability
- Comments: Add comments to explain complex logic or business rules
-- Good formatting
SELECT
customer_id,
first_name,
last_name,
email
FROM
customers
WHERE
signup_date >= '2023-01-01'
AND status = 'active'
ORDER BY
last_name,
first_name;
-- Poor formatting
SELECT customer_id, first_name, last_name, email FROM customers WHERE signup_date >= '2023-01-01' AND status = 'active' ORDER BY last_name, first_name;
SQL Indentation Best Practices
Proper indentation is the foundation of readable SQL. It visually communicates the structure and hierarchy of your query.
Common Indentation Styles
The aligned style is generally the most popular and readable for most SQL queries.
Clause Alignment
SELECT
c.customer_id,
c.first_name,
c.last_name,
o.order_id,
o.order_date,
o.total_amount
FROM
customers c
LEFT JOIN
orders o ON c.customer_id = o.customer_id
WHERE
c.status = 'active'
AND c.region = 'North America'
AND o.status != 'cancelled'
ORDER BY
o.order_date DESC;
Notice how the conditions in the WHERE clause are aligned to make the comparisons easy to scan.
Subqueries
SELECT
department_name,
(
SELECT COUNT(*)
FROM employees e
WHERE e.department_id = d.department_id
) AS employee_count
FROM
departments d
WHERE
department_status = 'active';
SELECT
dept_summary.department_name,
dept_summary.employee_count,
dept_summary.avg_salary
FROM
(
SELECT
d.department_name,
COUNT(*) AS employee_count,
AVG(e.salary) AS avg_salary
FROM
departments d
JOIN
employees e ON d.department_id = e.department_id
GROUP BY
d.department_name
) AS dept_summary
WHERE
dept_summary.employee_count > 10
ORDER BY
dept_summary.avg_salary DESC;
Formatting SQL Joins
Joins connect tables together and are a fundamental part of relational database queries. Proper formatting of joins is essential for understanding the relationships between tables.
Join Formatting Styles
Multiple Joins
SELECT
c.customer_name,
r.region_name,
o.order_id,
o.order_date,
p.product_name,
oi.quantity,
oi.unit_price
FROM
customers c
JOIN
regions r ON c.region_id = r.region_id
LEFT JOIN
orders o ON c.customer_id = o.customer_id
LEFT JOIN
order_items oi ON o.order_id = oi.order_id
LEFT JOIN
products p ON oi.product_id = p.product_id
WHERE
r.region_name IN ('North', 'South')
AND o.order_date BETWEEN '2023-01-01' AND '2023-12-31'
ORDER BY
o.order_date DESC,
c.customer_name;
Join Conditions
SELECT
e.employee_name,
d.department_name
FROM
employees e
JOIN
departments d
ON e.department_id = d.department_id
AND e.location_id = d.location_id
AND (
e.status = 'active'
OR e.status = 'on leave'
)
WHERE
e.hire_date >= '2022-01-01';
Working with Common Table Expressions (CTEs)
Common Table Expressions (CTEs) are a powerful SQL feature that allows you to define temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. They can greatly improve the readability of complex queries.
CTE Formatting Basics
WITH active_customers AS (
SELECT
customer_id,
customer_name,
email
FROM
customers
WHERE
status = 'active'
AND last_order_date >= DATEADD(month, -6, CURRENT_DATE)
)
SELECT
ac.customer_name,
o.order_id,
o.order_date,
o.total_amount
FROM
active_customers ac
JOIN
orders o ON ac.customer_id = o.customer_id
WHERE
o.order_date >= '2023-01-01'
ORDER BY
o.order_date DESC;
Nested CTEs
WITH active_customers AS (
SELECT
customer_id,
customer_name,
email,
region_id
FROM
customers
WHERE
status = 'active'
),
regional_sales AS (
SELECT
r.region_id,
r.region_name,
SUM(o.total_amount) AS total_sales
FROM
regions r
JOIN
active_customers ac ON r.region_id = ac.region_id
JOIN
orders o ON ac.customer_id = o.customer_id
WHERE
o.order_date >= '2023-01-01'
GROUP BY
r.region_id,
r.region_name
)
SELECT
rs.region_name,
rs.total_sales,
COUNT(DISTINCT ac.customer_id) AS customer_count
FROM
regional_sales rs
JOIN
active_customers ac ON rs.region_id = ac.region_id
GROUP BY
rs.region_name,
rs.total_sales
ORDER BY
rs.total_sales DESC;
Recursive CTEs
WITH RECURSIVE employee_hierarchy AS (
-- Base case: top-level employees (no manager)
SELECT
employee_id,
employee_name,
manager_id,
0 AS level,
employee_name AS path
FROM
employees
WHERE
manager_id IS NULL
UNION ALL
-- Recursive case: employees with managers
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
eh.level + 1,
eh.path || ' > ' || e.employee_name
FROM
employees e
JOIN
employee_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT
employee_id,
employee_name,
level,
path
FROM
employee_hierarchy
ORDER BY
path;
Formatting CASE Statements
Simple CASE Statements
SELECT
product_id,
product_name,
price,
CASE
WHEN price < 10 THEN 'Budget'
WHEN price BETWEEN 10 AND 50 THEN 'Mid-range'
WHEN price > 50 THEN 'Premium'
ELSE 'Unknown'
END AS price_category
FROM
products
ORDER BY
price;
Complex CASE Statements
SELECT
customer_id,
customer_name,
CASE
WHEN last_order_date IS NULL THEN
'Never ordered'
WHEN last_order_date < DATEADD(month, -12, CURRENT_DATE) THEN
'Inactive (> 12 months)'
WHEN last_order_date BETWEEN DATEADD(month, -12, CURRENT_DATE) AND DATEADD(month, -6, CURRENT_DATE) THEN
'At risk (6-12 months)'
WHEN last_order_date BETWEEN DATEADD(month, -6, CURRENT_DATE) AND DATEADD(month, -3, CURRENT_DATE) THEN
'Recent (3-6 months)'
WHEN last_order_date >= DATEADD(month, -3, CURRENT_DATE) THEN
'Active (< 3 months)'
ELSE
'Unknown'
END AS customer_status,
last_order_date
FROM
customers
ORDER BY
last_order_date DESC NULLS LAST;
SELECT
order_id,
customer_id,
order_date,
CASE
WHEN order_status = 'Cancelled' THEN
'Cancelled'
WHEN order_status = 'Completed' THEN
CASE
WHEN return_date IS NOT NULL THEN
'Returned'
WHEN review_rating IS NOT NULL THEN
CASE
WHEN review_rating >= 4 THEN 'Satisfied'
WHEN review_rating < 4 THEN 'Unsatisfied'
ELSE 'No rating'
END
ELSE
'Completed without review'
END
ELSE
order_status
END AS order_outcome
FROM
orders
WHERE
order_date >= '2023-01-01';
Window Functions Formatting
Window functions perform calculations across a set of table rows. Proper formatting makes these powerful functions easier to understand:
SELECT
department_id,
employee_id,
employee_name,
salary,
AVG(salary) OVER (
PARTITION BY department_id
) AS dept_avg_salary,
salary - AVG(salary) OVER (
PARTITION BY department_id
) AS salary_diff_from_avg,
RANK() OVER (
PARTITION BY department_id
ORDER BY salary DESC
) AS salary_rank_in_dept
FROM
employees
WHERE
status = 'active'
ORDER BY
department_id,
salary_rank_in_dept;
SELECT
customer_id,
order_date,
order_amount,
SUM(order_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
LAG(order_amount, 1, 0) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS previous_order_amount,
order_amount - LAG(order_amount, 1, 0) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS order_amount_change
FROM
orders
WHERE
order_date >= '2023-01-01'
ORDER BY
customer_id,
order_date;
SQL Comments and Documentation
Comments are essential for explaining complex queries, especially those involving business logic or unusual techniques:
-- This query identifies high-value customers who haven't ordered recently
-- for targeted re-engagement campaigns
WITH customer_metrics AS (
SELECT
c.customer_id,
c.customer_name,
c.email,
MAX(o.order_date) AS last_order_date,
SUM(o.total_amount) AS lifetime_value,
COUNT(o.order_id) AS order_count
FROM
customers c
LEFT JOIN
orders o ON c.customer_id = o.customer_id
WHERE
-- Only consider orders from the past 3 years
o.order_date >= DATEADD(year, -3, CURRENT_DATE)
AND o.status != 'cancelled'
GROUP BY
c.customer_id,
c.customer_name,
c.email
)
SELECT
cm.customer_id,
cm.customer_name,
cm.email,
cm.last_order_date,
cm.lifetime_value,
cm.order_count,
-- Calculate days since last order
DATEDIFF(day, cm.last_order_date, CURRENT_DATE) AS days_since_last_order
FROM
customer_metrics cm
WHERE
-- High value: over $1000 lifetime value
cm.lifetime_value > 1000
-- At risk: haven't ordered in over 180 days
AND cm.last_order_date < DATEADD(day, -180, CURRENT_DATE)
ORDER BY
cm.lifetime_value DESC;
- The purpose of the query
- Complex business logic or calculations
- Why certain filters or joins are being used
- Assumptions made in the query
- Performance considerations
SQL Formatting Tools
While manual formatting is important to understand, tools can help automate and standardize the process.
Popular SQL Formatters
- SQL Formatter: Our online tool for quick SQL formatting
- pgFormatter: A PostgreSQL-specific formatter with many configuration options
- Poor Man's T-SQL Formatter: A formatter specifically for Microsoft SQL Server
- SQLFormat: A simple online SQL formatter
- SQLFluff: A SQL linter that can also format code
IDE Integration
- DataGrip: JetBrains' database IDE with excellent formatting options
- VS Code: Extensions like "SQL Formatter" provide formatting capabilities
- SQL Server Management Studio: Includes basic formatting options
- DBeaver: Open-source database tool with SQL formatting
Try Our Formatting Tools
Format your code with our free online tools:
Need more formatting tools?
Explore our complete collection of free online code formatting tools.
Browse All Formatters