Skip to main content

SQL Index Performance

·1129 words·6 mins
Author
Alessandro Ferrini

This article is part of the 2025 DBMS course lessons and it focuses on SQLite. It builds on Understanding DBMS architecture and the Relational Data Modeling workflow so you see how schema choices influence query speed.

Relational data models and the DBMS role
#

Relational models treat data as tables while the DBMS enforces constraints, manages transactions, and wires up indexes. Concepts such as tuples, keys, and normalization appear in Introduction to Databases, and the algebraic operations these indexes accelerate are summarized in the relational algebra operations guide.

Why SQL indexes matter
#

Indexes avoid full-table scans by offering a sorted roadmap to rows. SQLite and many other relational systems store indexes as B-trees, so every node stays balanced and traversal depth stays predictable. Each index keeps copies of the indexed column values (and optionally other covering columns) paired with the rowid, so the DBMS updates those copies whenever an INSERT, UPDATE, or DELETE touches the base table.

Index anatomy (example)
#

Imagine a minimal table with rowid, column_a, column_b, and column_c. An index on column_a sorts the column values and pairs them with rowid:

rowidcolumn_acolumn_bcolumn_c
122hello1
218world2

The index itself becomes:

column_arowid
182
221

Searching for column_a = 18 means traversing the index tree, hitting the matching key, and reading just the pointed row instead of scanning every row sequentially.

Creating indexes
#

Use CREATE INDEX when you know which columns are queried most often. The basic syntax is:

CREATE [UNIQUE] INDEX index_name ON table_name(column_list [ASC/DESC]);

UNIQUE guarantees that no two keys share the same value, which is useful for email addresses or tax identifiers. Specifying ASC or DESC makes your intent explicit, and because indexes slow down writes, add them only to columns that appear in WHERE, JOIN, or ORDER BY clauses.

If you want to make the candidate table searchable by skill, define this index:

CREATE INDEX candidate_idx ON candidate (skill);

From that point on, filters like WHERE skill = 'Python' allow the planner to consult candidate_idx instead of scanning the entire table.

Reading EXPLAIN QUERY PLAN
#

Prefixing a query with EXPLAIN QUERY PLAN reveals whether the planner uses your indexes:

EXPLAIN QUERY PLAN SELECT * FROM candidate WHERE skill = 'Python';

SQLite replies with something like SEARCH candidate USING INDEX candidate_idx (skill=?), showing the index in use. If it falls back to a table scan, add constraints or indexes to guide the planner. When the query requests only indexed columns, you might see USING COVERING INDEX, meaning the planner answers without touching the base table.

EXPLAIN QUERY PLAN SELECT skill FROM candidate WHERE skill = 'Python';

Composite indexes and the left-to-right rule
#

Composite indexes contain tuples of multiple columns, sorted first on the leading column, then on the next column, and so on. The planner adheres to the left-to-right rule: queries can use the index only if they constrain the leading column, and additional columns in the composite index benefit only when they appear later in the order. When a predicate applies a range operator such as >, <, BETWEEN, or LIKE, the index can be used up to that column but not beyond it.

Keep the most selective column first, avoid skipping columns in the middle, and rely on the planner to reorder WHERE clauses to match the index order when possible.

Left to right
#

SQLite always honors the leftmost column of a multi-column index first. When your query includes that leading column, the optimizer narrows the range before it touches any other columns. For example, CREATE INDEX idx_users ON users(first_name, last_name, birthday); lets SQLite scan first_name before it evaluates last_name and birthday.

Structure your WHERE clauses with that directionality in mind. A predicate such as:

SELECT *
FROM users
WHERE first_name = 'Ryann'
  AND last_name = 'Koch'
  AND birthday = '1986-03-11';

can fully exploit idx_users, because each column appears in the same order as the index definition. SQLite locks in the first column and only refines further when the query keeps pace with the index.

No skipping
#

You cannot skip a column in the middle of an index and continue to rely on the later columns. Once SQLite sees a missing filter for an earlier column, it abandons any attempt to use the remaining prefix, even if those later columns have tight equality predicates. With idx_users(first_name, last_name, birthday), a query that omits a last_name filter stops after first_name, so the index provides almost no benefit for the other predicates. Keep predicates aligned with the index order or split the index to match the actual query.

Stop at the first range
#

Once SQLite meets a range condition (>, <, BETWEEN, LIKE) on a column, it stops using the index for any columns that follow, because the range broadens the search spectrum.

For example, if you write:

SELECT *
FROM users
WHERE first_name = 'Ryann'
  AND last_name > 'Koch'
  AND birthday = '1986-03-11';

SQLite leverages the index up to last_name, but after the > range it no longer consults birthday. The same applies if the range predicate touches an earlier column in a different index definition: everything to the right of that range is ignored, regardless of how the SQL is written.

Index economics
#

Indexes are powerful, but they cost writes and storage. Before creating one, check whether the column combination has high cardinality (many distinct values) and high selectivity (few rows match the filter). These queries help quantify those qualities:

SELECT count(DISTINCT column_name)
FROM table_name;

SELECT (count(DISTINCT column_name) * 1.0) / count(*)
FROM table_name;

Columns that return a greater diversity of values and whose predicates push the selectivity ratio toward zero make better index candidates. Boolean columns and other low-cardinality fields only make sense when combined with other filters that further narrow the dataset.

Index hygiene
#

A few extra guidelines keep your indexes manageable:

  • Prioritize the columns that power your JOIN, WHERE, and ORDER BY clauses, because they shape your production workload.
  • Avoid adding an index for every column; each one slows writes and consumes space, so focus on combinations you actually query.
  • Low-cardinality fields can still justify an index when they regularly pair with another predicate (for example, an is_pro flag used alongside another filter).
  • Revisit index definitions whenever requirements shift so they reflect the queries you actually run.

Dropping indexes
#

When an index stops pulling its weight because a workload has shifted or the indexed columns lost selectivity, remove it with DROP INDEX index_name;. Dropping unused indexes reduces write amplification and lets SQLite optimize plans for the current queries.

Next steps
#

Pair these habits with Relational Normalization to keep your schema clean, and then revisit SQL Origins & Standards so you understand how the planner translates this syntax into execution plans.