How to properly format your SQL code for readability.
Throughout the years of working with SQL i came across many legacy sripts that are badly formatted or even not formatted at all. When you start reading them you couldn't even find where to start. No comments, no formatting leads to hours spent just understanding what s going on in the code. After spending those hours and learning from other people i made a set of certain rules for myself how to write code that s easy to read and understand.
The code
So lets start with the code sample straight away and below the code i ll describe the rules i use when formatting. This code is not complicated and is not the longest. But if there s no formatting applied it would take a while to figure out where to start and what s the logic behind the code.
/*
Logic:
* create CTE that will be later used for aggregation
* later we need to make a separate query for each step
* finally we join all steps with UNION
*/
with _main as (
select
coalesce(d3._date, d4._date) _date
,coalesce(d3._source, d4._source) _source
,coalesce(d3._enterprise, d4._enterprise) _enterprise
,coalesce(d3._group, d4._group) _group
,coalesce(d3._entGroup, d4._entGroup) _entGroup
,[_load_plan_month]
,[_load_fact_month]
,[_load_delta_month]
,[_load_plan_NET]
,[_loadRW_fact_month]
,[_loadRW_delta_month]
,[_loadRW_plan_NET]
,[_loadHybrid_fact_month]
,[_loadHybrid_delta_month]
,[_loadHybrid_plan_NET]
from [PAC].[dbo].[_mea_3_table] d3
full join [PAC].[dbo].[_mea_4_table] d4
on d3._date = d4._date
and d3._group = d4._group
and d3._entGroup = d4._entGroup)
-- ################################### Total ###################################
select
_date
,_entGroup _enterprise
,N'Total' _cat
,sum(coalesce([_load_fact_month],0)
+ coalesce([_loadRW_fact_month],0)
+ coalesce([_loadHybrid_fact_month],0)) _fact
,sum(coalesce([_load_plan_NET],0)
+ coalesce([_loadRW_plan_NET],0)
+ coalesce([_loadHybrid_plan_NET],0)) _plan
from _main
where 1=1
and _source = N'office'
group by
_date
,_entGroup
,_source
union all
-- ################################### Subcon ###################################
select
_date
,_entGroup _enterprise
,N'subcon' _cat
,sum(coalesce([_load_fact_month],0)
+ coalesce([_loadRW_fact_month],0)
+ coalesce([_loadHybrid_fact_month],0)) _fact
,sum(coalesce([_load_plan_NET],0)
+ coalesce([_loadRW_plan_NET],0)
+ coalesce([_loadHybrid_plan_NET],0)) _plan
from _main
where 1=1
and _source = N'subcontractor'
group by
_date
,_entGroup
,_source
union all
-- ################################### Own ###################################
select
_date
,_enterprise
,N'Own' _cat
,_fact_TOTAL - _fact_SUB _fact
,_plan_TOTAL - _plan_SUB _plan
/*
below we transpose rows
*/
from (
select
_date
,_entGroup _enterprise
,sum(
case
when _source = N'office'
then coalesce([_load_plan_NET],0)
+ coalesce([_loadRW_plan_NET],0)
+ coalesce([_loadHybrid_plan_NET],0)
else 0 end) _plan_TOTAL
,sum(
case
when _source = N'office'
then coalesce([_load_fact_month],0)
+ coalesce([_loadRW_fact_month],0)
+ coalesce([_loadHybrid_fact_month],0)
else 0 end) _fact_TOTAL
,sum(
case
when _source = N'subcontractor'
then coalesce([_load_plan_NET],0)
+ coalesce([_loadRW_plan_NET],0)
+ coalesce([_loadHybrid_plan_NET],0)
else 0 end) _plan_SUB
,sum(
case
when _source = N'subcontractor'
then coalesce([_load_fact_month],0)
+ coalesce([_loadRW_fact_month],0)
+ coalesce([_loadHybrid_fact_month],0)
else 0 end) _fact_SUB
from _main
group by
_date
,_entGroup) t
union all
-- ################################### Trucks ###################################
select
_date
,_enterprise
,N'Trucks' _cat
,_fact_TOTAL - _fact_SUB _fact
,_plan_TOTAL - _plan_SUB _plan
/*
below we transpose rows
*/
from (
select
_date
,_entGroup _enterprise
,sum(
case
when _source = N'office'
then [_load_fact_month]
else 0 end) _fact_TOTAL
,sum(
case
when _source = N'office'
then [_load_plan_NET]
else 0 end) _plan_TOTAL
,sum(
case
when _source = N'subcontractor'
then [_load_fact_month]
else 0 end) _fact_SUB
,sum(
case
when _source = N'subcontractor'
then [_load_plan_NET]
else 0 end) _plan_SUB
from _main
group by
_date
,_entGroup) t
union all
-- ################################### Other ###################################
select
_date
,_enterprise
,N'Inc. other' _cat
,_fact_TOTAL - _fact_SUB _fact
,_plan_TOTAL - _plan_SUB _plan
/*
below we transpose rows
*/
from (
select
_date
,_entGroup _enterprise
,sum(
case
when _source = N'office'
then coalesce([_loadRW_fact_month],0)
+ coalesce([_loadHybrid_fact_month],0)
else 0 end) _fact_TOTAL
,sum(
case
when _source = N'office'
then coalesce([_loadRW_plan_NET],0)
+ coalesce([_loadHybrid_plan_NET],0)
else 0 end) _plan_TOTAL
,sum(
case
when _source = N'subcontractor'
then coalesce([_loadRW_fact_month],0)
+ coalesce([_loadHybrid_fact_month],0)
else 0 end) _fact_SUB
,sum(
case
when _source = N'subcontractor'
then coalesce([_loadRW_plan_NET],0)
+ coalesce([_loadHybrid_plan_NET],0)
else 0 end) _plan_SUB
from _main
group by
_date
,_entGroup
) t
The rules
This sample above shows all of the rules i follow, namely:
- Comments: start commenting your code!!! No one knows what s happening in your head, what s the logic & transformations you apply. Dont make other people who probably will inherit the code feel lost in the mess you made. Use one-line comments for short notes (like: "--") or multi-line comments (like: "/* ... */") to describe general logic.
- Indentation: it s absolutely neccessary for understanding the blocks of the code. Blocks allow you to see the map of your solution (eg modules and their connections).
- Commas: so many times i lost them (i bet you too) which lead to execution errors and then to scanning the code trying to find the place where you lost one. Here s the rule: put comma infront of the column selection. You will never lose it. And you wont need to jump one line up-down to see if you lost it.
- Add "1": adding to the point above you can put "1" as your first column so when commenting out other column wont break the code.
- New lines: use new lines to separate functions like SUM, CASE WHEN etc. Yes this will make you code more sparse but it will add to readability as well. Also new lines allow for quick "commenting out" the columns you want to exclude.
- Alias: dont mix everything. What i mean: make two general vertical areas where you can see the original columns and aliases (see my sample above). On the left side i list all the columns i need in the SELECT statement and on the right side i leave all the aliases.
- Naming: i prefer (and do recommend) to use your custom names starting with "_" (underscore). In some cases this maybe too much but mostly it s a good practice to separate original names from the ones you create for the script (aliases for example).
- Comment breaks: break your code with comments that visualy allow for chunks separation. Use this "--######### NEW BLOCK #########" for example. Someone who will read the code will easily see those chunks without guessing.
- Where clause: when using WHERE clause add one dull condition: 1=1. It will make adding and deleting conditions in the WHERE cluase easier (commenting out extra conditions wont break the code).
While these rules might be agruable i use them in the code i write and didnt hear any negative reactions from people who use my code later on. Also when i have to return and amend or debug the code i act as a new user (since i forget the code as well) and these rules make my life much easier.
Useful links
Last but not the least i want to highlight a few links that would help you with formatting sheeety code (my top 3):
- My favorite: https://poorsql.com
- Number two: https://www.draxlr.com/tools/sql-formatter/
- Number three: https://www.red-gate.com/website/sql-formatter
That s it folks.