Skip to content

Latest commit

 

History

History
307 lines (260 loc) · 11.4 KB

12 - Relational Division.md

File metadata and controls

307 lines (260 loc) · 11.4 KB

Relational Division

      Relational division is a Relational Algebra operation that represents the division of one relation into another relation based on certain conditions. In SQL, relational division can be achieved by using multiple join operations and conditional statements to divide the data in one table into multiple groups based on certain criteria. This can be useful for solving problems such as finding all employees who have had a shift in every department, or employees that have the same licenses or skillsets.

      Relational division is not a built-in operator in SQL, so it must be simulated using a combination of other SQL operations, such as joins, subqueries, and conditional statements. The exact implementation of relational division may vary depending on the specific requirements of the problem and the database management system being used.

      Relational division is used in SQL to select rows that conform to a number of different criteria.

  • I want to find all pilots who can fly 100% of the airplanes in the hanger (the most common example of relational division).
  • I want to find all employees who match on their issued licenses.
  • I want to find all managers who have worked in every department.

Let’s look at a few examples.


Planes In The Hanger

The most common example you will find on the internet is the airplanes in the hanger example.

Pilot Skills

Pilot Name Plane Name
Johnson Piper Cub
Williams B-52 Bomber
Williams F-14 Fighter
Williams Piper Cub
Roberts B-52 Bomber
Roberts F-14 Fighter
Jones B-1 Bomber
Jones B-52 Bomber
Jones F-14 Fighter
Brown B-1 Bomber
Brown B-52 Bomber
Brown F-14 Fighter
Brown F-17 Fighter

Hanger

Plane Name
B-1 Bomber
B-52 Bomber
F-14 Fighter

The query is rather simple once you understand how to use the HAVING clause.

CREATE TABLE #PilotSkills 
(
PilotName VARCHAR(255),
PlaneName VARCHAR(255)
);

INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Johnson', 'Piper Cub');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Williams', 'B-52 Bomber');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Williams', 'F-14 Fighter');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Williams', 'Piper Cub');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Roberts', 'B-52 Bomber');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Roberts', 'F-14 Fighter');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Jones', 'B-1 Bomber');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Jones', 'B-52 Bomber');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Jones', 'F-14 Fighter');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Brown', 'B-1 Bomber');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Brown', 'B-52 Bomber');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Brown', 'F-14 Fighter');
INSERT INTO #PilotSkills (PilotName, PlaneName) VALUES ('Brown', 'F-17 Fighter');

CREATE TABLE #Hangar
(
PlaneName VARCHAR(255)
);

INSERT INTO #Hangar (PlaneName) VALUES ('B-1 Bomber');
INSERT INTO #Hangar (PlaneName) VALUES ('B-52 Bomber');
INSERT INTO #Hangar (PlaneName) VALUES ('F-14 Fighter');

SELECT  ps.PilotName
FROM    #PilotSkills ps,
        #Hangar h
WHERE   ps.PlaneName = h.PlaneName
GROUP BY ps.PilotName 
HAVING  COUNT(ps.PlaneName) = (SELECT COUNT(PlaneName) FROM #Hangar);
Pilot Name
Brown
Jones

Employees With Matching Licenses.

Another example is finding all employees who have the same licenses.

EmployeeID License
1001 Class A
1001 Class B
1001 Class C
2002 Class A
2002 Class B
2002 Class C
3003 Class A
3003 Class D
4004 Class A
4004 Class B
4004 Class D
5005 Class A
5005 Class B
5005 Class D
CREATE TABLE #Employees
(
EmployeeID  INTEGER,
License     VARCHAR(100),
PRIMARY KEY (EmployeeID, License)
);

INSERT INTO #Employees (EmployeeID, License) VALUES (1001,'Class A');
INSERT INTO #Employees (EmployeeID, License) VALUES (1001,'Class B');
INSERT INTO #Employees (EmployeeID, License) VALUES(1001,'Class C');
INSERT INTO #Employees (EmployeeID, License) VALUES (2002,'Class A');
INSERT INTO #Employees (EmployeeID, License) VALUES (2002,'Class B');
INSERT INTO #Employees (EmployeeID, License) VALUES(2002,'Class C');
INSERT INTO #Employees (EmployeeID, License) VALUES (3003,'Class A');
INSERT INTO #Employees (EmployeeID, License) VALUES(3003,'Class D');
INSERT INTO #Employees (EmployeeID, License) VALUES(4004,'Class A');
INSERT INTO #Employees (EmployeeID, License) VALUES(4004,'Class B');
INSERT INTO #Employees (EmployeeID, License) VALUES(4004,'Class D');
INSERT INTO #Employees (EmployeeID, License) VALUES(5005,'Class A');
INSERT INTO #Employees (EmployeeID, License) VALUES(5005,'Class B');
INSERT INTO #Employees (EmployeeID, License) VALUES(5005,'Class D');

WITH cte_Count AS
(
SELECT  EmployeeID,
        COUNT(*) AS LicenseCount
FROM    #Employees
GROUP BY EmployeeID
),
cte_CountWindow AS
(
SELECT  a.EmployeeID AS EmployeeID_A,
        b.EmployeeID AS EmployeeID_B,
        COUNT(*) OVER (PARTITION BY a.EmployeeID, b.EmployeeID) AS CountWindow
FROM    #Employees a CROSS JOIN
        #Employees b
WHERE   a.EmployeeID <> b.EmployeeID AND a.License = b.License
)
SELECT  DISTINCT
        a.EmployeeID_A,
        a.EmployeeID_B,
        a.CountWindow AS LicenseCount
FROM    cte_CountWindow a INNER JOIN
        cte_Count b ON a.CountWindow = b.LicenseCount AND a.EmployeeID_A = b.EmployeeID INNER JOIN
        cte_Count c ON a.CountWindow = c.LicenseCount AND a.EmployeeID_B = c.EmployeeID;
EmployeeID_A EmployeeID_B LicenseCount
1001 2002 3
2002 1001 3
4004 5005 3
5005 4004 3

All Departments

The last example shows all employees who have worked in all departments.

Department History

Name Department IsActive
Chris Wardrobe 0
Chris Lighting 1
Chris Music 0
Nancy Wardrobe 1
Jim Music 1
Jim Wardrobe 0
CREATE TABLE #DepartmentHistory
(
Name       VARCHAR(255),
Department VARCHAR(255),
IsActive   INTEGER
);

INSERT INTO #DepartmentHistory (Name, Department, IsActive) VALUES ('Chris', 'Wardrobe', 0);
INSERT INTO #DepartmentHistory (Name, Department, IsActive) VALUES ('Chris', 'Lighting', 1);
INSERT INTO #DepartmentHistory (Name, Department, IsActive) VALUES ('Chris', 'Music', 0);
INSERT INTO #DepartmentHistory (Name, Department, IsActive) VALUES ('Nancy', 'Wardrobe', 1);
INSERT INTO #DepartmentHistory (Name, Department, IsActive) VALUES ('Jim', 'Music', 1);
INSERT INTO #DepartmentHistory (Name, Department, IsActive) VALUES ('Jim', 'Wardrobe', 0);
GO

;WITH cte_DistinctEmployeeDepartment AS
(
SELECT  DISTINCT
        Name,
        Department
FROM    #DepartmentHistory
),
cte_EmployeeDepartmentCount AS
(
SELECT  Name,
        COUNT(Department) AS DepartmentCount
FROM    cte_DistinctEmployeeDepartment
GROUP BY Name
),
cte_DistinctDeparments AS
(
SELECT  COUNT(DISTINCT Department) AS DepartmentCount
FROM    #DepartmentHistory
)
SELECT  *
FROM    cte_EmployeeDepartmentCount
WHERE   DepartmentCount IN (SELECT DepartmentCount FROM cte_DistinctDepartments);
Name DepartmentCount
Chris 3

Examples Of What Is Not Relational Division

Sometimes when learning, it is best to understand what does not fit the criteria.

This SQL statement finds the median of a number. The median is defined as the middle value in a set of data when the values are sorted in ascending or descending order.

The query works by dividing the data into two halves (top 50% and bottom 50%) and then selecting the top 1 value from each of these halves. These two values are then added together and divided by 2 to obtain the median.

While the query does involve dividing the data into two halves, it does not perform a true relational division in the sense that relational division is typically understood in the context of relational databases.

Relational division refers to a set of operations performed on two or more relations (tables) to divide one relation into multiple groups based on certain conditions.

CREATE TABLE #SampleData
(
IntegerValue  INTEGER NOT NULL
);
GO

INSERT INTO #SampleData VALUES
(5),(6),(10),(10),(13),(14),(17),(20),(81),(90),(76);
GO

--Median
SELECT
        ((SELECT TOP 1 IntegerValue
        FROM    (
                SELECT  TOP 50 PERCENT IntegerValue
                FROM    #SampleData
                ORDER BY IntegerValue
                ) a
        ORDER BY IntegerValue DESC) +  --Add the Two Together
        (SELECT TOP 1 IntegerValue
        FROM (
            SELECT  TOP 50 PERCENT IntegerValue
            FROM    #SampleData
            ORDER BY IntegerValue DESC
            ) a
        ORDER BY IntegerValue ASC)
        ) * 1.0 /2 AS Median;

Ranking is also not relational division. Here is an example finding a percentile on test scores and determining the top 10%. Relational division refers to a set operation that returns all the tuples in a table that do not have a corresponding tuple in another table, based on certain conditions. For brevity on this statement, I won't include the test data.

Below is not an example of relational division, but rather a way to divide the scores into percentiles and then select the top 10% of scores based on their value.

WITH cte_ScoreData AS 
(
SELECT  Name,
        Score,
        NTILE(100) OVER (ORDER BY Score DESC) AS Percentile
FROM    Scores
)
SELECT  Name,
        Score
FROM   ScoreData
WHERE  Percentile <= 10;

  1. Introduction
  2. SQL Processing Order
  3. Table Types
  4. Equi, Theta, and Natural Joins
  5. Inner Joins
  6. Outer Joins
  7. Full Outer Joins
  8. Cross Joins
  9. Semi and Anti Joins
  10. Any, All, and Some
  11. Self Joins
  12. Relational Division
  13. Set Operations
  14. Join Algorithms
  15. Exists
  16. Complex Joins

https://advancedsqlpuzzles.com