100 SQL Questions and Answers for Data Analytics

Whether you’re preparing for a data analyst interview or want to boost your SQL skills, this comprehensive guide of 100 SQL questions and answers will help you understand and master the fundamentals and advanced topics of SQL for data analytics.


Basic SQL

1. What is SQL?
Structured Query Language (SQL) is used to manage and manipulate relational databases.

2. What are the different types of SQL commands?
DML, DDL, DCL, TCL (Data Manipulation, Definition, Control, Transaction Control Languages).

3. What is a primary key?
A column (or combination) that uniquely identifies each row in a table.

4. What is a foreign key?
A key used to link two tables; it references a primary key in another table.

5. What is the difference between WHERE and HAVING?
WHERE filters rows before grouping, HAVING filters groups.

6. What is a JOIN?
JOINs combine rows from two or more tables based on a related column.

7. Name different types of JOINs.
INNER, LEFT, RIGHT, FULL OUTER, CROSS.

8. What is the difference between UNION and UNION ALL?
UNION removes duplicates; UNION ALL includes all rows.

9. What is normalization?
A process of organizing data to reduce redundancy.

10. What is denormalization?
The process of combining tables to improve read performance.

11. What is the use of the SELECT statement?
To retrieve data from a database.

12. How do you sort data in SQL?
Using the ORDER BY clause.

13. What is the difference between DELETE and TRUNCATE?
DELETE logs each row delete and can be rolled back; TRUNCATE is faster and cannot be rolled back in many DBs.

14. What is the difference between COUNT(*), COUNT(1), and COUNT(column_name)?
All count differently depending on NULLs and indexing.

15. How do you select unique values?
Using the DISTINCT keyword.

16. What is a NULL value?
A missing or undefined value.

17. How do you handle NULLs in SQL?
With functions like IS NULL, COALESCE(), or IFNULL().

18. What is the BETWEEN operator?
Filters values within a range.

19. What does the IN clause do?
Matches values in a list.

20. What is the LIKE operator?
Used for pattern matching using % and _.


Intermediate SQL

21. How to find the second highest salary?
Use subqueries or DENSE_RANK().

22. What are window functions?
Functions like ROW_NUMBER(), RANK(), LEAD() over partitions of data.

23. What is a CTE (Common Table Expression)?
A temporary result set used within a WITH clause.

24. How is a CTE different from a subquery?
CTEs are more readable and reusable.

25. What is the use of GROUP BY?
To aggregate rows sharing a common value.

26. Can we use GROUP BY and ORDER BY together?
Yes.

27. What is a correlated subquery?
A subquery that references a column from the outer query.

28. What is the difference between RANK() and DENSE_RANK()?
RANK() leaves gaps, DENSE_RANK() does not.

29. What is a self-join?
Joining a table with itself.

30. How do you retrieve even and odd rows in SQL?
Use modulo: MOD(ROWNUM, 2).

31. How do you get the top N records?
Use LIMIT, TOP, or FETCH FIRST depending on the SQL dialect.

32. What is a subquery?
A query nested inside another query.

33. What is EXISTS in SQL?
Checks for the existence of rows returned by a subquery.

34. What is CASE in SQL?
Conditional logic in a query.

35. How do you remove duplicates in SQL?
Use DISTINCT or window functions with ROW_NUMBER() and filtering.

36. What is an alias in SQL?
A temporary name for a column or table.

37. What is the difference between CHAR and VARCHAR?
CHAR is fixed-length, VARCHAR is variable-length.

38. What is a composite key?
A primary key made up of multiple columns.

39. What are scalar functions?
Functions that return a single value like LEN(), GETDATE(), ROUND().

40. How do you concatenate strings in SQL?
Use ||, +, or CONCAT() depending on the DB.

41. What is the COALESCE function?
Returns the first non-null value in a list.

42. What is the purpose of the DEFAULT constraint?
Assigns a default value if none is provided.

43. What is a trigger?
A set of actions executed in response to a table event.

44. What is a stored procedure?
A saved collection of SQL statements that can be reused.

45. What is a view?
A virtual table based on the result of a query.

46. What is data type conversion?
Converting from one type to another using CAST() or CONVERT().

47. What is a transaction?
A unit of work that is performed as a single logical operation.

48. What are ACID properties?
Atomicity, Consistency, Isolation, Durability.

49. What is rollback?
Undoing changes of a transaction.

50. What is commit?
Saving all changes made by a transaction.

51. How to find duplicate records?
Use GROUP BY with HAVING COUNT(*) > 1.

52. How do you delete duplicates but keep one?
Use ROW_NUMBER() in a CTE and delete where the row number > 1.

53. What are system tables?
Tables maintained by the DBMS that store metadata.

54. What is the difference between IS NULL and = NULL?
Use IS NULL to check nulls; = NULL does not work.

55. What is a surrogate key?
A system-generated unique identifier.

56. What is the difference between NOW() and GETDATE()?
Same purpose; syntax depends on the DBMS.

57. What is indexing and why is it used?
Indexes speed up data retrieval.

58. Can a table have multiple foreign keys?
Yes.

59. Can a table have multiple unique constraints?
Yes.

60. What is a schema in SQL?
A logical container for database objects.


Advanced SQL & Analytics

61. How do you calculate moving averages?
Use window functions with ROWS BETWEEN.

62. How do you find running totals?
Use SUM() over a window frame.

63. What is a pivot in SQL?
Rotating rows into columns.

64. How do you unpivot data?
With UNPIVOT or multiple UNION ALL.

65. What is indexing and why is it important?
It speeds up data retrieval.

66. What is the difference between clustered and non-clustered indexes?
Clustered defines physical order, non-clustered does not.

67. How can you optimize a slow query?
Check indexes, avoid SELECT *, limit subqueries, analyze execution plans.

68. What is a materialized view?
A saved result of a query, physically stored.

69. What is an execution plan?
Shows how the SQL engine executes a query.

70. What is a case statement?
Used for conditional logic in SQL.

71. What is a temporary table?
A table created for temporary use during a session.

72. What is the difference between temp table and table variable?
Temp tables exist in tempdb; table variables are in memory and scoped to batch.

73. What is a derived table?
A subquery in the FROM clause.

74. What is the difference between EXISTS and IN?
EXISTS stops at the first match; IN checks all.

75. What is JSON in SQL?
Many databases support querying and storing JSON.

76. How do you use JSON data in SQL Server or PostgreSQL?
Using functions like JSON_VALUE(), ->, or ->>.

77. What are analytical functions?
Functions like NTILE(), LAG(), LEAD(), used over partitions.

78. What is recursive CTE?
A CTE that calls itself to return hierarchical or tree-structured data.

79. What is the use of CROSS APPLY?
Joins each row of the left table to the result of a function or subquery on the right.

80. What is the difference between APPLY and JOIN?
APPLY can use row-dependent logic.

81. What is a calendar table and why is it useful?
A date dimension table useful for time-series analysis.

82. How do you calculate year-over-year or month-over-month in SQL?
Use LAG() or join date offsets.

83. What is a surrogate key vs natural key?
Surrogate: artificial ID; Natural: business key like email, SSN.

84. How do you find gaps in sequences?
Using LEAD() and checking for non-sequential values.

85. What is database sharding?
Partitioning data across multiple machines.

86. How do you monitor query performance?
Using EXPLAIN, ANALYZE, SQL Profiler, or similar tools.

87. What is query plan caching?
Reusing query execution plans to improve performance.

88. What are the types of database constraints?
NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, DEFAULT.

89. How do you enforce data integrity?
Using constraints, triggers, and application-level logic.

90. What is a dimensional model?
A star or snowflake schema used in data warehouses.

91. What is ETL in SQL?
Extract, Transform, Load — the process of data integration.

92. What is OLAP vs OLTP?
OLAP: analytical, read-heavy; OLTP: transactional, write-heavy.

93. How do you schedule SQL jobs?
Using tools like SQL Server Agent, cron jobs, or task schedulers.

94. What is data lake vs data warehouse?
Data lake stores raw data, warehouse stores structured, cleaned data.

95. What is a snapshot table?
A table that stores historical states of data.

96. What are slowly changing dimensions (SCD)?
Techniques for managing changes in dimension data.

97. What is partitioning in SQL?
Dividing large tables into smaller parts for performance.

98. What is an index scan vs seek?
Scan reads all rows, seek jumps directly to rows.

99. What is a fact table?
Central table in a star schema holding quantitative data.

100. What is a dimension table?
A descriptive table that joins to the fact table.

Leave a Reply

Your email address will not be published. Required fields are marked *