[SQL] Practice Makes Perfect – On Getting Started with SQL
Technical Debt
As a programmer who graduated with a software engineering degree, databases were a required course in university.
I still remember the night before the exam: I stayed up all night cramming all the exam points from my textbook. The next day, relying on my short-term memory, I actually passed. After that, I tossed all database knowledge aside and never touched it again.
After starting work, I somehow ended up as a frontend developer. Database knowledge became even more irrelevant.
Recently, by chance, I picked up databases again and started relearning from SQL Basics by Mick.
I basically flipped through the book while typing SQL code on the computer, and I finished it quickly. After finishing, I had two feelings: there are so many SQL dialects, and I need hands-on practice.
SQL Has So Many Dialects
Frontend developers certainly know what "many dialects" means – you write code that seems fine, but it just won't run on other devices. To solve compatibility issues, you have to complicate and bloat your code. The code is no longer elegant. Even if it runs smoothly on all test devices during testing, you still have to pray that no compatibility issues appear after deployment.
Compatibility issues are the eternal pain of frontend developers, because we cannot control users' devices.
The pain from SQL's many dialects seems to be the opposite extreme – the server's devices, database, and versions are all controlled by developers. But potential migration difficulties, maintenance costs, learning costs, etc., still exist. This also leads to the flourishing ecosystem of ORM libraries.
Hands-On Practice
What you get from books is only shallow. I searched for "SQL practice problems" on a search engine and picked the first result: a website's popular SQL exercises.
I did the first three, submitted, and got them all wrong.
Luckily, they weren't horribly wrong – just a few adjustments needed.
Now I'll document these three problems and where I went wrong.
Find the employee who joined most recently
SELECT *
FROM employees
ORDER BY hire_date DESC
LIMIT 1;
My first answer didn't have LIMIT 1.
Find all information of the employee with the third most recent hire date
SELECT *
FROM employees
ORDER BY hire_date DESC
LIMIT 2, 1;
If you think the SQL above is correct, you've been tricked. Because there might be multiple employees hired on the same day.
Correct answer:
SELECT * FROM employees
WHERE hire_date = (
SELECT DISTINCT hire_date
FROM employees
ORDER BY hire_date DESC
LIMIT 2,1
)
That is, first find the third most recent hire date via a subquery, then retrieve all employee information for that hire date via the main query.
Find current salary details and department number dept_no
There is a salary table salaries for all employees, as follows:
emp_no | salary | from_date | to_date |
|---|---|---|---|
10001 | 88958 | 2002-06-22 | 9999-01-01 |
10002 | 72527 | 2001-08-02 | 9999-01-01 |
10003 | 43311 | 2001-12-01 | 9999-01-01 |
There is a department manager table dept_manager as follows:
dept_no | emp_no | to_date |
|---|---|---|
d001 | 10001 | 9999-01-01 |
d002 | 10003 | 9999-01-01 |
Please find the salary details and corresponding department number dept_no for current managers of each department. Output the results sorted in ascending order by salaries.emp_no, and note that the dept_no column should be the last column. The output for the above example is as follows:
emp_no | salary | from_date | to_date | dept_no |
|---|---|---|---|---|
10001 | 88958 | 2002-06-22 | 9999-01-01 | d001 |
10003 | 43311 | 2001-12-01 | 9999-01-01 | d002 |
When I saw this problem, I thought, "This is my old friend – table joins!"
select sal.emp_no,
sal.salary,
sal.from_date,
sal.to_date,
dept.dept_no
from salaries as sal right join dept_manager as dept
on sal.emp_no = dept.emp_no
order by emp_no
This is the correct answer, but the first time I submitted I wrote left join instead of right join.
Because when I learned it, the teacher said that the left table in a left join is the main table. I thought "main table" meant the table with the most columns in common.
Actually, the main table is the table whose all records are retained, even if those records have no matches in the other table. In a left join, all records from the left table appear in the result set, and unmatched records from the right table are filled with NULL.
But here, obviously the right table is the main table, because the desired result is the department managers' information.
Conclusion
After these three failures, I understand that I still need a lot more practice!