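In SQL, with xxxx as () gives a subquery an alias (a common table expression), and the database may materialize that subquery as a temporary result. with recursive defines a recursive query clause: the rows it produces are fed back into the same clause and queried again, until no new rows come out.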
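As a minimal sketch of how the recursion unfolds, the snippet below generates the numbers 1 to 5. It assumes SQLite (through Python's built-in sqlite3 module) rather than MySQL, purely so the example can run without a server; the with recursive syntax used here is the same in both.

```python
import sqlite3

# In-memory SQLite database (an assumption for this sketch; the notes target MySQL).
conn = sqlite3.connect(":memory:")

# "select 1" is the starting row. The second select is applied to each newly
# produced row, and the recursion ends once "n < 5" filters every row out.
sequence_sql = """
with recursive t(n) as (
    select 1
    union all
    select n + 1 from t where n < 5
)
select n from t order by n
"""
numbers = [row[0] for row in conn.execute(sequence_sql)]
print(numbers)  # [1, 2, 3, 4, 5]
```

The where condition in the recursive part is what stops the recursion, so every recursive clause in the problems below carries an upper bound of this kind.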
#1613. Find the Missing IDs
https://leetcode-cn.com/problems/find-the-missing-ids/
Table: Customers
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| customer_id | int |
| customer_name | varchar |
+---------------+---------+
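customer_id is the primary key for this table.
Each row of this table contains the name and the id of a customer.
Write an SQL query to find all the missing customer IDs. A missing ID is one that is not in the Customers table but lies between 1 and the maximum customer_id present in the table.
Note: the maximum customer_id will not exceed 100.
Return the result ordered by ids in ascending order.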
Create table If Not Exists Customers (customer_id int, customer_name varchar(20))
Truncate table Customers
insert into Customers (customer_id, customer_name) values ('1', 'Alice')
insert into Customers (customer_id, customer_name) values ('4', 'Bob')
insert into Customers (customer_id, customer_name) values ('5', 'Charlie')
The query result format is in the following example.
Customers table:
+-------------+---------------+
| customer_id | customer_name |
+-------------+---------------+
| 1 | Alice |
| 4 | Bob |
| 5 | Charlie |
+-------------+---------------+
Result table:
+-----+
| ids |
+-----+
| 2 |
| 3 |
+-----+
The maximum customer_id in the table is 5, so within the range [1,5], IDs 2 and 3 are missing from the table.
Solution
with recursive t(n) as(
select 1 as n
union all
select n+1 from t where n<100
)
select n as ids from t
where n<=(select max(customer_id) from customers)
and n not in(select customer_id from customers)
order by ids
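To check the query against the sample data, here is a small sketch that loads the three example customers and runs the same logic. It assumes SQLite via Python's sqlite3 module instead of MySQL; the table and column names are the ones from the problem statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
create table Customers (customer_id int, customer_name varchar(20));
insert into Customers values (1, 'Alice'), (4, 'Bob'), (5, 'Charlie');
""")

# Generate 1..100, then keep the numbers that are at most max(customer_id)
# and do not appear in Customers.
missing_sql = """
with recursive t(n) as (
    select 1 as n
    union all
    select n + 1 from t where n < 100
)
select n as ids from t
where n <= (select max(customer_id) from Customers)
  and n not in (select customer_id from Customers)
order by n
"""
missing_ids = [row[0] for row in conn.execute(missing_sql)]
print(missing_ids)  # [2, 3]
```

The bound of 100 comes from the note in the problem statement; the filter on max(customer_id) then trims the sequence to the range that actually matters.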
#1635. Hopper Company Queries I
https://leetcode-cn.com/problems/hopper-company-queries-i/
Table: Drivers
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| driver_id | int |
| join_date | date |
+-------------+---------+
driver_id is the primary key for this table.
Each row of this table contains the driver's ID and the date they joined the Hopper company.
Table: Rides
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| ride_id | int |
| user_id | int |
| requested_at | date |
+--------------+---------+
ride_id is the primary key for this table.
Each row of this table contains the ID of a ride, the user's ID that requested it, and the day they requested it.
There may be some ride requests in this table that were not accepted.
Table: AcceptedRides
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| ride_id | int |
| driver_id | int |
| ride_distance | int |
| ride_duration | int |
+---------------+---------+
ride_id is the primary key for this table.
Each row of this table contains some information about an accepted ride.
It is guaranteed that each accepted ride exists in the Rides table.
Write an SQL query to report the following statistics for each month of 2020:
The number of drivers currently with the Hopper company by the end of the month (active_drivers).
The number of accepted rides in that month (accepted_rides).
Return the result table ordered by month in ascending order, where month is the month's number (January is 1, February is 2, etc.).
Create table If Not Exists Drivers (driver_id int, join_date date)
Create table If Not Exists Rides (ride_id int, user_id int, requested_at date)
Create table If Not Exists AcceptedRides (ride_id int, driver_id int, ride_distance int, ride_duration int)
Truncate table Drivers
insert into Drivers (driver_id, join_date) values ('10', '2019-12-10')
insert into Drivers (driver_id, join_date) values ('8', '2020-1-13')
insert into Drivers (driver_id, join_date) values ('5', '2020-2-16')
insert into Drivers (driver_id, join_date) values ('7', '2020-3-8')
insert into Drivers (driver_id, join_date) values ('4', '2020-5-17')
insert into Drivers (driver_id, join_date) values ('1', '2020-10-24')
insert into Drivers (driver_id, join_date) values ('6', '2021-1-5')
Truncate table Rides
insert into Rides (ride_id, user_id, requested_at) values ('6', '75', '2019-12-9')
insert into Rides (ride_id, user_id, requested_at) values ('1', '54', '2020-2-9')
insert into Rides (ride_id, user_id, requested_at) values ('10', '63', '2020-3-4')
insert into Rides (ride_id, user_id, requested_at) values ('19', '39', '2020-4-6')
insert into Rides (ride_id, user_id, requested_at) values ('3', '41', '2020-6-3')
insert into Rides (ride_id, user_id, requested_at) values ('13', '52', '2020-6-22')
insert into Rides (ride_id, user_id, requested_at) values ('7', '69', '2020-7-16')
insert into Rides (ride_id, user_id, requested_at) values ('17', '70', '2020-8-25')
insert into Rides (ride_id, user_id, requested_at) values ('20', '81', '2020-11-2')
insert into Rides (ride_id, user_id, requested_at) values ('5', '57', '2020-11-9')
insert into Rides (ride_id, user_id, requested_at) values ('2', '42', '2020-12-9')
insert into Rides (ride_id, user_id, requested_at) values ('11', '68', '2021-1-11')
insert into Rides (ride_id, user_id, requested_at) values ('15', '32', '2021-1-17')
insert into Rides (ride_id, user_id, requested_at) values ('12', '11', '2021-1-19')
insert into Rides (ride_id, user_id, requested_at) values ('14', '18', '2021-1-27')
Truncate table AcceptedRides
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('10', '10', '63', '38')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('13', '10', '73', '96')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('7', '8', '100', '28')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('17', '7', '119', '68')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('20', '1', '121', '92')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('5', '7', '42', '101')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('2', '4', '6', '38')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('11', '8', '37', '43')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('15', '8', '108', '82')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('12', '8', '38', '34')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('14', '1', '90', '74')
The query result format is in the following example.
Drivers table:
+-----------+------------+
| driver_id | join_date |
+-----------+------------+
| 10 | 2019-12-10 |
| 8 | 2020-1-13 |
| 5 | 2020-2-16 |
| 7 | 2020-3-8 |
| 4 | 2020-5-17 |
| 1 | 2020-10-24 |
| 6 | 2021-1-5 |
+-----------+------------+
Rides table:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6 | 75 | 2019-12-9 |
| 1 | 54 | 2020-2-9 |
| 10 | 63 | 2020-3-4 |
| 19 | 39 | 2020-4-6 |
| 3 | 41 | 2020-6-3 |
| 13 | 52 | 2020-6-22 |
| 7 | 69 | 2020-7-16 |
| 17 | 70 | 2020-8-25 |
| 20 | 81 | 2020-11-2 |
| 5 | 57 | 2020-11-9 |
| 2 | 42 | 2020-12-9 |
| 11 | 68 | 2021-1-11 |
| 15 | 32 | 2021-1-17 |
| 12 | 11 | 2021-1-19 |
| 14 | 18 | 2021-1-27 |
+---------+---------+--------------+
AcceptedRides table:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10 | 10 | 63 | 38 |
| 13 | 10 | 73 | 96 |
| 7 | 8 | 100 | 28 |
| 17 | 7 | 119 | 68 |
| 20 | 1 | 121 | 92 |
| 5 | 7 | 42 | 101 |
| 2 | 4 | 6 | 38 |
| 11 | 8 | 37 | 43 |
| 15 | 8 | 108 | 82 |
| 12 | 8 | 38 | 34 |
| 14 | 1 | 90 | 74 |
+---------+-----------+---------------+---------------+
Result table:
+-------+----------------+----------------+
| month | active_drivers | accepted_rides |
+-------+----------------+----------------+
| 1 | 2 | 0 |
| 2 | 3 | 0 |
| 3 | 4 | 1 |
| 4 | 4 | 0 |
| 5 | 5 | 0 |
| 6 | 5 | 1 |
| 7 | 5 | 1 |
| 8 | 5 | 1 |
| 9 | 5 | 0 |
| 10 | 6 | 0 |
| 11 | 6 | 2 |
| 12 | 6 | 1 |
+-------+----------------+----------------+
By the end of January --> two active drivers (10, 8) and no accepted rides.
By the end of February --> three active drivers (10, 8, 5) and no accepted rides.
By the end of March --> four active drivers (10, 8, 5, 7) and one accepted ride (10).
By the end of April --> four active drivers (10, 8, 5, 7) and no accepted rides.
By the end of May --> five active drivers (10, 8, 5, 7, 4) and no accepted rides.
By the end of June --> five active drivers (10, 8, 5, 7, 4) and one accepted ride (13).
By the end of July --> five active drivers (10, 8, 5, 7, 4) and one accepted ride (7).
By the end of August --> five active drivers (10, 8, 5, 7, 4) and one accepted ride (17).
By the end of September --> five active drivers (10, 8, 5, 7, 4) and no accepted rides.
By the end of October --> six active drivers (10, 8, 5, 7, 4, 1) and no accepted rides.
By the end of November --> six active drivers (10, 8, 5, 7, 4, 1) and two accepted rides (20, 5).
By the end of December --> six active drivers (10, 8, 5, 7, 4, 1) and one accepted ride (2).
Solution
with recursive t1 as(
select 1 as month
union all
select month+1 from t1 where month<=11
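) # use recursive to generate the consecutive sequence of months 1-12
select t1.month as month,ifnull(sum(t2.counts)over(order by t1.month),0) as active_drivers, # for a running total, first compute the per-month counts, then sum() over them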
ifnull(t3.counts,0) as accepted_rides
from t1 left join
(select if(year(join_date)<2020,1,month(join_date)) as month,
count(driver_id)as counts
from drivers
where year(join_date)<=2020
group by if(year(join_date)<2020,1,month(join_date))) t2
on t1.month=t2.month # t2 holds the number of drivers who joined in each month (non-cumulative); drivers who joined before 2020 are counted under January 2020
left join
(select month(requested_at) as month,count(*) as counts
from rides r join AcceptedRides a on r.ride_id=a.ride_id
where year(requested_at)=2020
group by month(requested_at)) t3 # t3 is the number of accepted rides in each month (non-cumulative)
on t1.month=t3.month
order by t1.month
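The running-total step (per-month counts first, then sum() over the ordered months) can be sketched in isolation as follows. This assumes SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or newer), and joined_per_month is a hypothetical pre-aggregated table standing in for subquery t2, filled by hand from the sample Drivers data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
-- Hypothetical stand-in for t2: drivers who joined in each month of 2020,
-- with the driver who joined in 2019 counted under January.
create table joined_per_month (month int, counts int);
insert into joined_per_month values (1, 2), (2, 1), (3, 1), (5, 1), (10, 1);
""")

# Months without new drivers join as NULL; sum() skips NULLs, so the running
# total simply carries the previous value forward.
running_sql = """
with recursive t1(month) as (
    select 1
    union all
    select month + 1 from t1 where month < 12
)
select t1.month,
       ifnull(sum(t2.counts) over (order by t1.month), 0) as active_drivers
from t1 left join joined_per_month t2 on t1.month = t2.month
order by t1.month
"""
active_drivers = [row[1] for row in conn.execute(running_sql)]
print(active_drivers)  # [2, 3, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6]
```

The printed list matches the active_drivers column of the Result table above. The outer ifnull only matters when no driver has joined yet, because the window sum over NULLs alone is NULL.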
#1645. Hopper Company Queries II
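Write an SQL query to report the percentage of working drivers (working_percentage) for each month of 2020 where:
working_percentage = (number of drivers that accepted at least one ride during the month) / (number of available drivers during the month) * 100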
Note that if the number of available drivers during a month is zero, we consider the working_percentage to be 0.
Return the result table ordered by month in ascending order, where month is the month's number (January is 1, February is 2, etc.). Round working_percentage to the nearest 2 decimal places.
The query result format is in the following example.
Result table:
+-------+--------------------+
| month | working_percentage |
+-------+--------------------+
| 1 | 0.00 |
| 2 | 0.00 |
| 3 | 25.00 |
| 4 | 0.00 |
| 5 | 0.00 |
| 6 | 20.00 |
| 7 | 20.00 |
| 8 | 20.00 |
| 9 | 0.00 |
| 10 | 0.00 |
| 11 | 33.33 |
| 12 | 16.67 |
+-------+--------------------+
Solution
with recursive t1 as(
select 1 as month
union all
select month+1 from t1 where month<12
)
#round(ifnull(ifnull(t3.counts,0)/sum(t2.counts)over(order by month),0)*100,2)as working_percentage
select month,
if(driver_num=0,0,round(active_num/driver_num*100,2)) as working_percentage
from
(select t1.month,
ifnull(sum(t2.counts)over(order by t1.month),0) as driver_num,
ifnull(t3.counts,0) as active_num
from t1 left join
(select if(year(join_date)<2020,1,month(join_date)) as month,count(driver_id) as counts
from drivers
where year(join_date)<=2020
group by if(year(join_date)<2020,1,month(join_date)))t2
on t1.month=t2.month
left join
(select month(requested_at) as month,count(distinct a.driver_id) as counts
from rides r join AcceptedRides a on r.ride_id=a.ride_id
where year(requested_at)=2020
group by month(requested_at)
## the difference from the previous problem is that t3 uses count(distinct a.driver_id)
) t3 on t1.month=t3.month) aaa
order by month
#1651. Hopper Company Queries III
Write an SQL query to compute the average_ride_distance and average_ride_duration of every 3-month window starting from January - March 2020 to October - December 2020. Round average_ride_distance and average_ride_duration to the nearest two decimal places.
The average_ride_distance is calculated by summing up the total ride_distance values from the three months and dividing it by 3. The average_ride_duration is calculated in a similar way.
Return the result table ordered by month in ascending order, where month is the starting month's number (January is 1, February is 2, etc.).
Result table:
+-------+-----------------------+-----------------------+
| month | average_ride_distance | average_ride_duration |
+-------+-----------------------+-----------------------+
| 1 | 21.00 | 12.67 |
| 2 | 21.00 | 12.67 |
| 3 | 21.00 | 12.67 |
| 4 | 24.33 | 32.00 |
| 5 | 57.67 | 41.33 |
| 6 | 97.33 | 64.00 |
| 7 | 73.00 | 32.00 |
| 8 | 39.67 | 22.67 |
| 9 | 54.33 | 64.33 |
| 10 | 56.33 | 77.00 |
+-------+-----------------------+-----------------------+
Solution
with recursive t1 as(
select 1 as month
union all
select month+1 from t1 where month<=11
) # only the first 10 months are returned, but all 12 are needed here because each window includes the following two months
select t1.month,
round(avg(ifnull(aa.ride_distance,0))over(order by t1.month rows between current row and 2 following),2) as average_ride_distance,
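## note the position of ifnull(): it goes inside avg(), because avg() skips NULLs and months without rides must count as 0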
round(avg(ifnull(aa.ride_duration,0))over(order by t1.month rows between current row and 2 following),2) as average_ride_duration
from t1 left join
(select month(requested_at) as month,sum(ride_distance)as ride_distance,sum(ride_duration)as ride_duration
from rides r join AcceptedRides a on r.ride_id=a.ride_id
where year(requested_at)=2020
group by month(requested_at)) aa
on t1.month=aa.month
order by t1.month
limit 10
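The 3-month window can be sketched on its own as below. This assumes SQLite via Python's sqlite3 module (3.25 or newer for window functions), and distance_per_month is a hypothetical pre-aggregated table standing in for subquery aa, with the monthly ride_distance totals added up by hand from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
-- Hypothetical stand-in for aa: total accepted ride_distance per month of 2020.
create table distance_per_month (month int, ride_distance int);
insert into distance_per_month values
    (3, 63), (6, 73), (7, 100), (8, 119), (11, 163), (12, 6);
""")

# Each window covers the current month and the next two. ifnull() sits inside
# avg() so that empty months contribute 0 instead of being skipped.
rolling_sql = """
with recursive t1(month) as (
    select 1
    union all
    select month + 1 from t1 where month < 12
)
select t1.month,
       round(avg(ifnull(aa.ride_distance, 0))
             over (order by t1.month rows between current row and 2 following), 2)
from t1 left join distance_per_month aa on t1.month = aa.month
order by t1.month
limit 10
"""
averages = [f"{row[1]:.2f}" for row in conn.execute(rolling_sql)]
print(averages)
# ['21.00', '21.00', '21.00', '24.33', '57.67', '97.33', '73.00', '39.67', '54.33', '56.33']
```

If ifnull() wrapped avg() instead, the January window would average only March's 63 and return 63.00 rather than 21.00, which is why its position matters.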
#1336. Number of Transactions per Visit
https://leetcode-cn.com/problems/number-of-transactions-per-visit/
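Write an SQL query to find how many users visited the bank and did no transactions, how many visited the bank and did one transaction, and so on.
The result contains two columns:
transactions_count: the number of transactions done in one visit
visits_count: the number of visits in which exactly transactions_count transactions were done
transactions_count takes every value from 0 to the max(transactions_count) done by any user in one visit
Order the result by transactions_count
Table: Visits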
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| visit_date | date |
+---------------+---------+
(user_id, visit_date) is the primary key for this table.
Each row of this table indicates that user_id visited the bank on visit_date.
Table: Transactions
+------------------+---------+
| Column Name | Type |
+------------------+---------+
| user_id | int |
| transaction_date | date |
| amount | int |
+------------------+---------+
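There is no primary key for this table, so it may contain duplicate rows.
Each row of this table indicates that user_id completed a transaction of amount on transaction_date.
It is guaranteed that the user visited the bank on transaction_date (that is, the Visits table contains a (user_id, transaction_date) row).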
Create table If Not Exists Visits (user_id int, visit_date date)
Create table If Not Exists Transactions (user_id int, transaction_date date, amount int)
Truncate table Visits
insert into Visits (user_id, visit_date) values ('1', '2020-01-01')
insert into Visits (user_id, visit_date) values ('2', '2020-01-02')
insert into Visits (user_id, visit_date) values ('12', '2020-01-01')
insert into Visits (user_id, visit_date) values ('19', '2020-01-03')
insert into Visits (user_id, visit_date) values ('1', '2020-01-02')
insert into Visits (user_id, visit_date) values ('2', '2020-01-03')
insert into Visits (user_id, visit_date) values ('1', '2020-01-04')
insert into Visits (user_id, visit_date) values ('7', '2020-01-11')
insert into Visits (user_id, visit_date) values ('9', '2020-01-25')
insert into Visits (user_id, visit_date) values ('8', '2020-01-28')
Truncate table Transactions
insert into Transactions (user_id, transaction_date, amount) values ('1', '2020-01-02', '120')
insert into Transactions (user_id, transaction_date, amount) values ('2', '2020-01-03', '22')
insert into Transactions (user_id, transaction_date, amount) values ('7', '2020-01-11', '232')
insert into Transactions (user_id, transaction_date, amount) values ('1', '2020-01-04', '7')
insert into Transactions (user_id, transaction_date, amount) values ('9', '2020-01-25', '33')
insert into Transactions (user_id, transaction_date, amount) values ('9', '2020-01-25', '66')
insert into Transactions (user_id, transaction_date, amount) values ('8', '2020-01-28', '1')
insert into Transactions (user_id, transaction_date, amount) values ('9', '2020-01-25', '99')
The query result format is in the following example:
Visits table:
+---------+------------+
| user_id | visit_date |
+---------+------------+
| 1 | 2020-01-01 |
| 2 | 2020-01-02 |
| 12 | 2020-01-01 |
| 19 | 2020-01-03 |
| 1 | 2020-01-02 |
| 2 | 2020-01-03 |
| 1 | 2020-01-04 |
| 7 | 2020-01-11 |
| 9 | 2020-01-25 |
| 8 | 2020-01-28 |
+---------+------------+
Transactions table:
+---------+------------------+--------+
| user_id | transaction_date | amount |
+---------+------------------+--------+
| 1 | 2020-01-02 | 120 |
| 2 | 2020-01-03 | 22 |
| 7 | 2020-01-11 | 232 |
| 1 | 2020-01-04 | 7 |
| 9 | 2020-01-25 | 33 |
| 9 | 2020-01-25 | 66 |
| 8 | 2020-01-28 | 1 |
| 9 | 2020-01-25 | 99 |
+---------+------------------+--------+
Result table:
+--------------------+--------------+
| transactions_count | visits_count |
+--------------------+--------------+
| 0 | 4 |
| 1 | 5 |
| 2 | 0 |
| 3 | 1 |
+--------------------+--------------+
- For transactions_count = 0, the visits (1, "2020-01-01"), (2, "2020-01-02"), (12, "2020-01-01") and (19, "2020-01-03") had no transactions, so visits_count = 4.
- For transactions_count = 1, the visits (2, "2020-01-03"), (7, "2020-01-11"), (8, "2020-01-28"), (1, "2020-01-02") and (1, "2020-01-04") had one transaction each, so visits_count = 5.
- For transactions_count = 2, no user visited the bank and did two transactions, so visits_count = 0.
- For transactions_count = 3, the visit (9, "2020-01-25") had three transactions, so visits_count = 1.
- For transactions_count >= 4, no user visited the bank and did more than 3 transactions, so we stop at transactions_count = 3.
Solution
with recursive t1 as(
select 0 as n
union all
select n+1 from t1 where
n<(select max(counts) from(select count(1) as counts from Transactions group by user_id,transaction_date)a)), # generate consecutive numbers starting from 0, capped at the largest number of transactions in a single visit
r as(
select user_id,date,sum(view) view,sum(trans) trans from
(select user_id,visit_date date,count(*) view,0 trans
from visits
group by user_id,visit_date
union all
select user_id,transaction_date date,0 view,count(*) trans
from Transactions
group by user_id,transaction_date) a1
group by user_id,date) # number of visits and number of transactions per user per day
select n transactions_count,ifnull(sum(r.view),0) visits_count
from t1 left join r on t1.n=r.trans
group by n
order by n
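As a check against the sample data, the sketch below runs the same approach end to end. It assumes SQLite via Python's sqlite3 module instead of MySQL, and renames the aliases date and view to day and visits (both renames are choices made for this sketch, to stay clear of keywords).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
create table Visits (user_id int, visit_date date);
create table Transactions (user_id int, transaction_date date, amount int);
insert into Visits values
    (1, '2020-01-01'), (2, '2020-01-02'), (12, '2020-01-01'), (19, '2020-01-03'),
    (1, '2020-01-02'), (2, '2020-01-03'), (1, '2020-01-04'), (7, '2020-01-11'),
    (9, '2020-01-25'), (8, '2020-01-28');
insert into Transactions values
    (1, '2020-01-02', 120), (2, '2020-01-03', 22), (7, '2020-01-11', 232),
    (1, '2020-01-04', 7), (9, '2020-01-25', 33), (9, '2020-01-25', 66),
    (8, '2020-01-28', 1), (9, '2020-01-25', 99);
""")

# t1: 0..max transactions in one visit. r: visits and transactions per user per day.
per_visit_sql = """
with recursive t1(n) as (
    select 0
    union all
    select n + 1 from t1
    where n < (select max(counts) from
               (select count(*) as counts from Transactions
                group by user_id, transaction_date))
),
r as (
    select user_id, day, sum(visits) as visits, sum(trans) as trans from (
        select user_id, visit_date as day, count(*) as visits, 0 as trans
        from Visits group by user_id, visit_date
        union all
        select user_id, transaction_date as day, 0 as visits, count(*) as trans
        from Transactions group by user_id, transaction_date
    ) group by user_id, day
)
select n as transactions_count, ifnull(sum(r.visits), 0) as visits_count
from t1 left join r on t1.n = r.trans
group by n
order by n
"""
result = conn.execute(per_visit_sql).fetchall()
print(result)  # [(0, 4), (1, 5), (2, 0), (3, 1)]
```

The row (2, 0) only appears because t1 supplies every count from 0 to 3; grouping r alone would silently drop counts that no visit reached.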