Subqueries and CTEs
This page covers nested queries with IN (SELECT ...), Common Table Expressions (WITH), CTE join types, deep join paths, and mixing CTEs with custom joins.
Subqueries in Filters
Pass a query object to @in to create a server-side IN (SELECT ...) predicate:
# Build the subquery
subquery = M.Status.objects
subquery.filter("status" => "Engine")
subquery.values("statusId")
# Use it in the main query
query = M.Result.objects
query.filter("statusId__@in" => subquery)
query.values("resultId", "statusId", "statusId__status", "grid", "driverId")
df = query |> DataFrameThis generates:
SELECT ... FROM "result" as "Tb" ...
WHERE "Tb"."statusid" IN (SELECT "Tb"."statusid" FROM "status" as "Tb" WHERE "Tb"."status" = $1)[!TIP] Subqueries run entirely on the server — PormG does not materialize the subquery in Julia. This is much more efficient for large datasets.
Subqueries with Additional Filters
Combine subquery @in with other filter conditions:
subquery = M.Status.objects
subquery.filter("status" => "Engine")
subquery.values("statusId")
query = M.Result.objects
query.filter("statusId__@in" => subquery, "driverId__@lte" => 7)
query.values(
"resultId",
"statusId",
"statusId__status",
"grid",
"driverId",
"raceId__date__@quarter"
)
query.order_by("raceId__date__quarter")
df = query |> DataFrameCommon Table Expressions (CTEs)
CTEs (SQL WITH clauses) are useful when a query becomes easier to reason about in stages. PormG supports CTEs through the .with() method.
Why Use CTEs?
- Readability — Break complex queries into named stages.
- Aggregation — Pre-compute aggregates and join them back to the main query.
- Reuse — Reference the same subquery multiple times without duplication.
Basic CTE with JOIN
Define a subquery, give it a name, and join it to the main query via join_field:
using PormG: Count
# Define the CTE: count results per driver (where status = 1)
driver_stats = M.Result.objects
driver_stats.filter("statusId" => 1)
driver_stats.values("driverId", "total_results" => Count("resultId"))
# Main query: join the CTE to Result via driverId
main_query = M.Result.objects
main_query.with("stats" => driver_stats, join_field="driverId" => "driverId")
main_query.filter("resultId__@lte" => 100)
main_query.values("resultId", "driverId", "stats__total_results")
df = main_query |> DataFrameThe .with() method:
- Emits the subquery as a
WITH stats AS (SELECT ...)clause. - Creates a
LEFT JOIN stats ON result.driverid = stats.driverid. - Makes
stats__total_resultsavailable for selection viavalues().
CTE with Multiple Aggregated Fields
using PormG: Count, Sum
# CTE with multiple aggregates
stats = M.Result.objects
stats.filter("raceId__@lte" => 100)
stats.values(
"driverId",
"total_results" => Count("resultId"),
"total_grid_positions" => Sum("grid")
)
# Join CTE to Driver model
query = M.Driver.objects
query.with("driver_stats" => stats, join_field="driverId" => "driverId")
query.filter("driverId__@lte" => 50)
query.values(
"driverId",
"forename",
"surname",
"driver_stats__total_results",
"driver_stats__total_grid_positions"
)
df = query |> DataFrameMultiple CTEs
Attach multiple CTEs to the same query:
# CTE 1: Recent races
recent_races = M.Race.objects
recent_races.filter("year__@gte" => 2020)
recent_races.values("raceId", "name", "year")
# CTE 2: Top drivers
top_drivers = M.Driver.objects
top_drivers.filter("driverId__@lte" => 100)
top_drivers.values("driverId", "forename", "surname")
# Main query with both CTEs
query = M.Result.objects
query.with("recent" => recent_races, join_field="raceId" => "raceId")
query.with("top_d" => top_drivers, join_field="driverId" => "driverId")
query.values("resultId", "recent__name", "top_d__forename", "points")
query.filter("recent__name__@isnull" => false, "top_d__forename__@isnull" => false)
df = query |> DataFrameEach .with() call adds another WITH clause and JOIN to the final SQL.
Choosing Join Types for CTEs
By default, CTEs use LEFT JOIN. Use join_type="INNER" to filter out non-matching rows:
using PormG: Sum
high_scorers = M.Result.objects
high_scorers.filter("points__@gte" => 10)
high_scorers.values("driverId", "max_points" => Sum("points"))
query = M.Driver.objects
query.with(
"high_scorers" => high_scorers,
join_field="driverId" => "driverId",
join_type="INNER" # Only include drivers who have high scores
)
query.values("driverId", "forename", "max_points" => "high_scorers__max_points")
query.filter("driverId__@lte" => 100)
df = query |> DataFrameAvailable CTE Join Types
| Join Type | Behavior |
|---|---|
"LEFT" | Default. All main-table rows are kept; unmatched CTE columns are missing. |
"INNER" | Only rows that match the CTE are returned. |
The SQL builder can also render "RIGHT" and "FULL" join keywords, but these are not the primary documented workflow.
Deep Join Paths in CTE Links
The join_field can contain a multi-level path with __. PormG builds the intermediate joins needed to connect the main query to the CTE:
using PormG: Count
# CTE: count drivers per nationality
nat_stats = M.Driver.objects
nat_stats.values("nationality", "driver_count" => Count("driverId"))
# Main query: join CTE via a deep path (Result → Driver → nationality)
query = M.Result.objects
query.with("stats" => nat_stats, join_field="driverId__nationality" => "nationality")
query.filter("raceId__year" => 2023)
query.values("raceId__name", "driverId__surname", "stats__driver_count")
df = query |> DataFramePormG automatically:
- Joins
Result → Driver(via thedriverIdForeignKey). - Links
Driver.nationalitytostats.nationality(the CTE join column).
CTE Without a JOIN
If you call .with("name" => subq) without providing join_field, PormG still emits the CTE in the WITH clause but does not join it to the main table:
subq = M.Status.objects.filter("status" => "Engine").values("statusId")
query = M.Result.objects
query.with("engine_statuses" => subq) # Declared but not joinedThis is valid SQL, but the CTE data won't be accessible through values(). The main use case is combining a CTE declaration with a subquery filter:
query = M.Result.objects
query.with("sub" => subq)
query.filter("statusId__@in" => subq) # Reuse the subquery in a filter[!TIP] Always provide
join_fieldwhen you want CTE data accessible via.values(). Without it, the CTE is emitted but produces no additional projectable columns.
Mixing CTEs and Custom Joins
PormG allows CTEs and custom joins (cjoin) in the same query. Parameter ordering is deterministic across these combinations:
using PormG: cjoin
# 1. Define the CTE
top_const = M.Constructor.objects
.filter("constructorId__@lte" => 5)
.values("constructorId", "name")
query = M.Result.objects
# 2. Attach the CTE
query.with("tc" => top_const, join_field="constructorId" => "constructorId")
# 3. Add a custom join
query.cjoin("driverId" => "Driver", filters=["nationality" => "German"])
# 4. Filter and select across physical tables, custom joins, and CTEs
query.filter("positionOrder" => 1)
query.values("resultId", "tc__name", "driverId__surname")
df = query |> DataFrameChaining Multiple Custom Joins
query = M.Result.objects
query.cjoin("driverId" => "Driver", filters=["nationality" => "Brazilian", "forename" => "Ayrton"])
query.cjoin("raceId" => "Race", filters=["year" => 1991])
query.filter("positionOrder" => 1)
query.values("resultId", "driverId__surname", "raceId__name")
df = query |> DataFrameSee Custom Joins for the full cjoin() and on() documentation.
Summary
| Feature | Syntax | Use Case |
|---|---|---|
Subquery IN | "field__@in" => subquery | Filter by a set computed on the server. |
| CTE with JOIN | .with("name" => subq, join_field=...) | Pre-aggregate data and join it. |
| CTE without JOIN | .with("name" => subq) | Declare for use in filters only. |
| CTE INNER JOIN | join_type="INNER" | Only keep matching rows. |
| Deep join path | join_field="a__b" => "field" | Link CTE via multi-level relationships. |
| CTE + cjoin | .with(...) + .cjoin(...) | Combine all join strategies. |
Next Steps
- Custom Joins — Full
cjoin()andon()documentation. - Field Expressions — Use
F()for arithmetic and computed columns. - Q Objects — Complex boolean logic in filters.