数据集
我们这一篇文章采用PostgreSQL的SQL语法。重点我们关注select...from...where
这种读操作,分析query (analytical query)。
数据集在 https://hyper-db.de/interface.html 可以直接使用。另外在这个网页不允许进行写操作:insert
, update
, delete
之类的transactional query。当然create table
和drop table
也不被允许。
架构 Schema:
下载:
https://db.in.tum.de/teaching/ws1920/grundlagen/uni_mysql.sql?lang=de
Schma和大部分SQL语句来自Prof. Alfons Kemper, Ph.D.的课件和书。
课件:
- https://db.in.tum.de/teaching/bookDBMSeinf/folien/?lang=de
- https://db.in.tum.de/teaching/bookDBMSeinf/folien/pdf/Kapitel4.pdf?lang=de
书: https://db.in.tum.de/teaching/bookDBMSeinf/?lang=de
中级SQL
[图片上传失败...(image-3019be-1579523422389)]
假设我们的schema变成上图(SQL不能直接运行 数据集不对应上图)。但是我们可以加入暂时的view
进入:
with professorenF as (
select *,
(case when p.persnr in (2125, 2126, 2133, 2137) then 'Philosophie'
when p.persnr in (2127, 2136) then 'Physik'
else 'Theologie'
end) as fakname
from professoren p
)
select *
from professorenF
with studentenGF as (
select *,
(case when s.matrnr in (24002, 26830, 27550, 29120) then 'M'
else 'W'
end) as geschlecht,
(case when s.matrnr in (24002, 26120, 26830, 27550) then 'Philosophie'
when s.matrnr in (28106, 29120) then 'Physik'
else 'Theologie'
end) as fakname
from studenten s
)
select *
from studentenGF
我们下面每一个query都需要加入上面两个view
, 即如下模式:
with professorenF as (
select *,
(case when p.persnr in (2125, 2126, 2133, 2137) then 'Philosophie'
when p.persnr in (2127, 2136) then 'Physik'
else 'Theologie'
end) as fakname
from professoren p
), studentenGF as (
select *,
(case when s.matrnr in (24002, 26830, 27550, 29120) then 'M'
else 'W'
end) as geschlecht,
(case when s.matrnr in (24002, 26120, 26830, 27550) then 'Philosophie'
when s.matrnr in (28106, 29120) then 'Physik'
else 'Theologie'
end) as fakname
from studenten s
)
-- 我们的query
select ...
from ...
where ...
- 求每一个FakName对应的女性占比:
with anz(Fakname,AnzStudenten) as (
select s.FakName, count(*)
from StudentenGF s
group by s.FakNAme),
anzw(Fakname,AnzWeiblich) as (
select sw.FakName,count(*) as AnzWeiblich
from StudentenGF sw
where sw.Geschlecht ='W'
group by sw.FakName)
select anz.FakName, anz.AnzStudenten, anzw.AnzWeiblich, (cast(anzw.AnzWeiblich as decimal(5,2))/anz.AnzStudenten * 100) as ProzentWeiblich
from anz, anzw
where anz.FakName = anzw.FakName
- 求每一个FakName对应的男性占比:
with anz(Fakname, AnzStudenten) as (
select s.FakName, count(*)
from StudentenGF s
group by s.FakNAme),
anzm(Fakname, AnzMaenner) as (
select sw.FakName, count(*) as AnzWeiblich
from StudentenGF sw
where sw.Geschlecht = 'M'
group by sw.FakName)
select anz.FakName,
anz.AnzStudenten,
anzm.AnzMaenner,
(case when anzm.AnzMaenner is null then 0 else anzm.AnzMaenner end) / anz.AnzStudenten * 100.00 as ProzentMaenner
from anz left outer join anzm
on anz.FakName = anzm.FakName
这里并不是女性版直接更改成男性。一个重点是:存在系没有任何男性。
case
也可以被替换为: COALESCE(anzm.AnzMaenner, 0) / anz.AnzStudenten * 100.00 as ProzentMaenner
或者再换一种:
select fakname,
(sum(case when geschlecht = 'M' then 1.00 else 0.00 end)) / count(*)
from studentenGF
group by fakname
- 搜索所有学生把自己系教授提供的课都听完了:
select s.*
from studentenGF s
where not exists(
select *
from vorlesungen v, professorenF p
where v.gelesenvon = p.persnr and p.fakname = s.fakname and not exists(
select *
from hoeren h
where h.vorlnr = v.vorlnr and h.matrnr = s.matrnr
)
)
用中文就是:对这个学生,不存在一门他系里教授的课,这个学生没有听过。
或者
select s.*
from studentenGF s
where (
select count(*)
from vorlesungen v, professorenF p
where v.gelesenvon = p.persnr and p.fakname = s.fakname
)
=
(
select count(*)
from hoeren h, vorlesungen v, professorenF p
where h.matrnr = s.matrnr and h.vorlnr = v.vorlnr and p.persnr = v.gelesenvon and p.fakname= s.fakname
)
该文章遵循创作共用版权协议 CC BY-NC 3.0,要求署名、非商业 、保持一致。在满足创作共用版权协议 CC BY-NC 3.0 的基础上可以转载,但请以超链接形式注明出处。文章仅代表作者的知识和看法,如有不同观点,可以回复并讨论。