Mastering Complex SQL Challenges: Solving LinkedIn’s Budget Risk Assessment Problem

Adekolu Peter
4 min readAug 7, 2023

--

Image Source: LinkedinLearn

Hello everyone,

It’s been a while, but I’ll like to share an interesting problem statement with you. This particular challenge stems from a Linkedin SQL interview question as presented by Stratascratch.

Problem Statement:

Now, imagine your boss assigns you a task: identify projects that are at risk of exceeding their budgets. In essence, a project is considered at risk if the total expense incurred for the project surpasses the allocated budget. To calculate this, we consider the expense to be the total salaries disbursed to employees relative to the project’s duration.

For instance, if a project has a budget of $10,000 and takes six months to complete, the combined salaries of all employees on the project over that six-month period should not exceed $10,000. If it does, the project is flagged as being at risk of going over budget.

Implementation:

To tackle this challenge, we have a set of tables : “Linkedin_projects,” “linkedin_emp_projects,” and “linkedin_employee.” These tables hold information such as project IDs, project names (titles), budgets, start and end dates, employee IDs, and salaries.

Given these tables and their relationships, we need to join them to create a merged table that provides details about each project’s total expense, budget, and duration. To achieve this, we perform a series of table joins.

We start by joining the “linkedin_projects” and “linkedin_emp_projects” tables based on the “project_id” foreign key. Next, we join the “linkedin_emp_projects” table with the “linkedin_employee” table using the “emp_id” foreign key.

Here’s the initial SQL code for these joins:

SELECT
lp.id AS project_id,
lp.title AS project_name,
SUM(le.salary) AS total_salary,
lp.budget AS budget,
DATEDIFF("day", lp.start_date, lp.end_date) AS project_duration
FROM
linkedin_projects lp
JOIN
linkedin_emp_projects lep ON lp.id = lep.project_id
JOIN
linkedin_employees le ON lep.emp_id = le.id
GROUP BY
lp.id, lp.title, lp.budget, lp.start_date, lp.end_date;

In this code, we use the SUM() function to aggregate employee salaries for each project. However, we don't aggregate the project budget, as it is fixed.

Executing the above code yields a result like this:

Sample Output

At this point, the “total_salary” column in the result represents the combined yearly salaries of all employees on each project. To determine the project’s expense, which is the total employee salary relative to the project’s duration, we introduce a subquery. For instance, the expense for “project1” is calculated as (total_salary * project_duration) / 365.

Here’s the code for the subquery:

select project_name, budget, 
ROUND((CAST(total_salary as float) *
CAST(project_duration as float))/365, 0) as prorated_employee_expense
FROM (
-- Previous code for joining tables and calculating total_salary and project_duration
) AS ProjectDetails;

Within the subquery, we cast the total salary and project duration as float data types to ensure precision. We also use the ROUND() function to round the prorated employee expense to the nearest dollar as it was required in the problem statement.

Executing the subquery provides a result similar to the previous output, now including the prorated employee expense alongside the project budget:

Sample Output

With this progress, we’re 90% closer to solving the problem. To identify projects where the prorated employee expense exceeds the budget, we encapsulate the subquery within a Common Table Expression (CTE) and filter the results using a WHERE clause which led to this whole query and output:

with risky_projects_cte  as (
select project_name, budget,
ROUND((CAST(total_salary as float) *
CAST(project_duration as float))/365, 0) as prorated_employee_expense
from
(
select lp.id as project_id,
lp.title as project_name,
sum(le.salary) as total_salary,
lp.budget as budget,
datediff("day", lp.start_date, lp.end_date) as project_duration

from linkedin_projects lp
join linkedin_emp_projects lep
on lp.id = lep.project_id
join linkedin_employees le
on lep.emp_id = le.id
group by lp.id, lp.title, lp.budget, lp.start_date, lp.end_date) project_details

)

select * from risky_projects_cte
where prorated_employee_expense > budget
order by project_name

Sample Output:

And voilà — the problem is solved! Kudos to Stratascratch platform for providing intriguing SQL challenges particularly from top companies. They also offer Python problems, which I’m eager to explore soon. Feel free to reach out to me on LinkedIn if you have questions or suggestions for solving this problem or general questions. You can also reach out to me via email. I hope you’ve gained a grasp of the entire approach. You can explore my other solutions to SQL problems here. Stay tuned for more exciting problems we’ll explore.

--

--

Adekolu Peter
Adekolu Peter

No responses yet