Back to topics

[SQL] Practice Makes Perfect – On Getting Started with SQL

Technical Debt

As a programmer who graduated with a software engineering degree, databases were a required course in university.

I still remember the night before the exam: I stayed up all night cramming all the exam points from my textbook. The next day, relying on my short-term memory, I actually passed. After that, I tossed all database knowledge aside and never touched it again.

After starting work, I somehow ended up as a frontend developer. Database knowledge became even more irrelevant.

Recently, by chance, I picked up databases again and started relearning from SQL Basics by Mick.

I basically flipped through the book while typing SQL code on the computer, and I finished it quickly. After finishing, I had two feelings: there are so many SQL dialects, and I need hands-on practice.

SQL Has So Many Dialects

Frontend developers certainly know what "many dialects" means – you write code that seems fine, but it just won't run on other devices. To solve compatibility issues, you have to complicate and bloat your code. The code is no longer elegant. Even if it runs smoothly on all test devices during testing, you still have to pray that no compatibility issues appear after deployment.

Compatibility issues are the eternal pain of frontend developers, because we cannot control users' devices.

The pain from SQL's many dialects seems to be the opposite extreme – the server's devices, database, and versions are all controlled by developers. But potential migration difficulties, maintenance costs, learning costs, etc., still exist. This also leads to the flourishing ecosystem of ORM libraries.

Hands-On Practice

What you get from books is only shallow. I searched for "SQL practice problems" on a search engine and picked the first result: a website's popular SQL exercises.

I did the first three, submitted, and got them all wrong.

Luckily, they weren't horribly wrong – just a few adjustments needed.

Now I'll document these three problems and where I went wrong.

Find the employee who joined most recently

SELECT *
FROM employees
ORDER BY hire_date DESC
LIMIT 1;

My first answer didn't have LIMIT 1.

Find all information of the employee with the third most recent hire date

SELECT *
FROM employees
ORDER BY hire_date DESC
LIMIT 2, 1;

If you think the SQL above is correct, you've been tricked. Because there might be multiple employees hired on the same day.
Correct answer:

SELECT * FROM employees
WHERE hire_date = (
    SELECT DISTINCT hire_date
    FROM employees
    ORDER BY hire_date DESC
    LIMIT 2,1
)

That is, first find the third most recent hire date via a subquery, then retrieve all employee information for that hire date via the main query.

Find current salary details and department number dept_no

There is a salary table salaries for all employees, as follows:

emp_no

salary

from_date

to_date

10001

88958

2002-06-22

9999-01-01

10002

72527

2001-08-02

9999-01-01

10003

43311

2001-12-01

9999-01-01

There is a department manager table dept_manager as follows:

dept_no

emp_no

to_date

d001

10001

9999-01-01

d002

10003

9999-01-01

Please find the salary details and corresponding department number dept_no for current managers of each department. Output the results sorted in ascending order by salaries.emp_no, and note that the dept_no column should be the last column. The output for the above example is as follows:

emp_no

salary

from_date

to_date

dept_no

10001

88958

2002-06-22

9999-01-01

d001

10003

43311

2001-12-01

9999-01-01

d002

When I saw this problem, I thought, "This is my old friend – table joins!"

select  sal.emp_no,
        sal.salary,
        sal.from_date,
        sal.to_date,
        dept.dept_no
from salaries as sal right join dept_manager as dept
on sal.emp_no = dept.emp_no
order by emp_no

This is the correct answer, but the first time I submitted I wrote left join instead of right join.

Because when I learned it, the teacher said that the left table in a left join is the main table. I thought "main table" meant the table with the most columns in common.

Actually, the main table is the table whose all records are retained, even if those records have no matches in the other table. In a left join, all records from the left table appear in the result set, and unmatched records from the right table are filled with NULL.

But here, obviously the right table is the main table, because the desired result is the department managers' information.

Conclusion

After these three failures, I understand that I still need a lot more practice!