题目:
在同一家公司由于在不同岗位工作,简历表里会存在多条同公司的记录。需要输出候选人公司履历数据。包含,简历ID、姓名、公司名、开始时间、结束时间;
要求:如果时间连续则做合并;如果时间不连续则不合并;即上表中第一条和第二条合并为同一个公司,开始时间取最小,结束时间取最大;第五条虽然也是在华为工作,但时间不连续,不合并;
答案如下:
思路1:
假设表名为table,把“至今”换成当前日期
先用左连接关联出是否存在同人同公司两两连续(间隔1天)的时间点
再用关联到的entry_time去覆盖已有的entry_time,并将leave_time大的记录放在前面
然后对同人同公司的记录按照entry_time进行去重,保留第一个即可。
该思路的缺点是只能解决给定的题目的情形,如果有三条记录连续的话就会失效。
SELECT * FROM (
SELECT
c.resume_id,c.name,c.company_name,
CASE
WHEN c.bId is not null THEN
c.bEntryTime
ELSE
c.entry_time
END AS entry_time,
c.leave_time
FROM
(
SELECT
*, DATE_FORMAT(a.entry_time, '%Y-%m-%d') AS aStartDate,
DATE_FORMAT(a.leave_time, '%Y-%m-%d') AS aEndDate
FROM
table a
LEFT JOIN (
SELECT
resume_id AS bId,
`name` AS bName,
entry_time AS bEntryTime,
leave_time AS bLeaveTime,
company_name AS bCompany,
DATE_FORMAT(entry_time, '%Y-%m-%d') AS bStartDate,
DATE_FORMAT(leave_time, '%Y-%m-%d') AS bEndDate
FROM
table
) b ON a.resume_id = b.bId
AND a.company_name = b.bCompany
AND a.entry_time > b.bEntryTime
and DATEDIFF(DATE_FORMAT(a.entry_time, '%Y-%m-%d'),DATE_FORMAT(b.bLeaveTime, '%Y-%m-%d')) = 1
ORDER BY
a.entry_time
) c
)d
GROUP BY d.resume_id,d.company_name,d.entry_time
ORDER BY d.entry_time;
思路2:
为了解决上述问题,比如三个连续的情形,就需要按照上述查询后的结果进一步用entry_time和leave_time去筛选即可。
超过三个的情形只需要迭代排本sql即可
先用with as 临时存储(和上面的查询语句一致)
with table_double as
(
SELECT *
FROM (
SELECT resume_id, name, company_name, job_name, CASE WHEN resume_id_b is not null THEN entry_time_b ELSE entry_time END AS entry_time, leave_time
FROM (
SELECT a.*, b.resume_id as resume_id_b, b.entry_time as entry_time_b, b.leave_time as leave_time_b
FROM table a
LEFT JOIN table b
ON a.resume_id = b.resume_id
AND a.company_name = b.company_name
AND a.entry_time > b.entry_time
and DATEDIFF(a.entry_time, b.leave_time) = 1
ORDER BY a.entry_time
) c
) d
GROUP BY resume_id, company_name, entry_time
ORDER BY resume_id, entry_time
)
再对临时表table_double进一步将entry_time和leave_time有时间重叠的部分合并即可
SELECT *
FROM (
SELECT resume_id, name, company_name, job_name, CASE WHEN resume_id_b is not null THEN entry_time_b ELSE entry_time END AS entry_time, leave_time
FROM (
SELECT a.*, b.resume_id as resume_id_b, b.entry_time as entry_time_b, b.leave_time as leave_time_b
FROM table_double a
LEFT JOIN table_double b
ON a.resume_id = b.resume_id
AND a.company_name = b.company_name
AND a.leave_time > b.leave_time
and a.entry_time < b.leave_time
ORDER BY a.entry_time
)c
) d
GROUP BY resume_id, company_name, entry_time
ORDER BY resume_id, entry_time
下一篇:找好多动漫