Hi guys, I hope it’s been a great week for everyone! This past week I was at DataConnect Conference with some POWERHOUSE women in data 🤩 Let me know if you recognize any of them!
Full recap of my talk about how we evolve as data professionals in the age of AI coming next week!
Master these 10 SQL questions for your next data interview
Last week we covered 9 SQL tricks you may not have known about but what about the SQL you need to learn for your data interviews? I’ve got you covered with the 10 most commonly asked SQL interview questions and how I would answer them!
What are primary keys and foreign keys in a database?
A primary key is a unique identifier for a record in a table, while a foreign key is a field in a table that refers to the primary key of another table.
What are the different types of SQL joins?
Inner joins return records that have matching values in both tables.
Left (outer) joins returns all records from the left table, and the matched records from the right table. If no match is found, NULLs are returned for columns from the right table.
Right (outer) joins return all records from the right table, and the matched records from the left table. If no match is found, NULLs are returned for columns from the left table.
Full (outer) joins return all records when there is a match in either left or right table. Records that don't match in either table will have NULLs for columns from the table that doesn't have the match.
When would you use a subquery vs a CTE?
Subqueries are nested queries within another query. They are typically employed when the result of the query is dependent on the result of another query.
CTEs (common table expressions) are typically used when you want to create a complex temporary result set for use in a single query. They’re particularly useful when working with recursive queries, or when you need to use the result set multiple times within the same query.
You would use a subquery when the result of one query depends on the result of another query, and you would use a CTE when you need to create a complex temporary result set for use in a single query.
What is the purpose of the GROUP BY clause?
The GROUP BY clause is used to group rows in a table based on one or more columns. It is often used with aggregate functions like SUM, COUNT, AVG, etc., to perform calculations on grouped data. For example:
SELECT department, COUNT(*) AS employee_count FROM employees GROUP BY department
What is the difference between WHERE and HAVING?
The
WHERE
clause is used to filter rows before any groupings are made.HAVING
is used to filter groups after theGROUP BY
clause has been applied.
What is the difference between UNION, INTERSECT, and EXCEPT?
UNION is used to combine the result sets of two or more SELECT statements, removing duplicate rows.
INTERSECT is used to return only the common rows between two SELECT statements.
EXCEPT is used to return the rows from the first SELECT statement that are not present in the second SELECT statement.
What is indexing in SQL?
Indexing is a technique used to improve the performance of database queries by creating data structures that allow for faster data retrieval.
Indexes are created on columns that are frequently searched or used for sorting.
What is normalization in SQL?
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves breaking down a table into smaller tables and establishing relationships between them.
How do you find the second highest salary in a table?
One way to find the second highest salary is using a subquery with the
LIMIT
clause:
SELECT MAX(salary) AS SecondHighestSalary FROM employees WHERE salary < (SELECT MAX(salary) FROM employees)
How would you write a SQL query to find all employees who have not been assigned a department?
Assuming the
department_id
is the column that should be non-null for assigned employees, you can use the following:
SELECT * FROM employees WHERE department_id IS NULL
These are just a few examples of SQL questions you’ll come across in your data analyst, data scientist and data engineering interviews.
The actual questions you may encounter can vary depending on the specific job and company, so make sure you’re practicing a variety of questions on sites like HackerRank, Leetcode and DataLemur!
That’s it for this week’s newsletter! See you for the next episode of The Data Diaries soon 👋
- Megan
And in case you don’t know who I am, I’m Megan Lieu, Data Scientist-turned-Developer Advocate who has helped thousands of job seekers through my content on LinkedIn and Instagram, as well as my courses on LinkedIn Learning. I’ve learned a lot from the ups and downs of my data career, and sharing the lessons has helped me build a community of 180k+ tech and data professionals.
In question number 9 you said you are using subquery in LIMIT clause but in example code you are using subquery in WHERE clause. Please give example with LIMIT clause.