[MySQL]LV 2 문제 모음

2023-03-03

Intro

SQL 문제들은 많이 없기 때문에 LV 1 ~ LV 5 까지 문제들을 레벨별로 모아서 한번에 정리하겠다.

문제 1 - datediff, group by, having + 서브쿼리

자동차 평균 대여 기간 구하기

코드

# 그냥 바로 풀이
SELECT car_id, round(avg(DATEDIFF(end_date,start_date)+1),1) as average_duration 
from car_rental_company_rental_history 
group by car_id 
having round(avg(DATEDIFF(end_date,start_date)+1),1) >= 7 
order by average_duration desc, car_id desc;

# 서브쿼리 풀이
select a.car_id, round(avg(a.duration),1) as average_duration
from (
    select car_id, datediff(end_date, start_date)+1 as duration
    from car_rental_company_rental_history 
) a
group by a.car_id having round(avg(a.duration),1) >= 7
order by average_duration desc, a.car_id desc

풀이

우선 round(avg(DATEDIFF(end_date,start_date)+1),1) as average_duration 를 이해하고 뒤에 그룹으로 나누고, 정렬한 것을 이해해보자.

DATEDIFF(end_date,start_date)+1) 는 두 날짜의 차이를 구해주는 메소드이다.
- +1 은 1일 ~ 10일 일때 9(=10-1)일이 아닌 10일을 빌린것이기 때문
round(avg(DATEDIFF....),1) as average_duration 는 평균, 반올림 그리고 컬럼명 지정이다.
- 소수점 둘째자리에서 반올림=소수점 첫재짜리까지 반올림 을 의미

group by car_id having average_duration >= 7 order by average_duration desc, car_id desc;

group by 문법 이후에 having 문법으로 해당 그룹에 조건을 줄 수 있다!
- where 문법에는 방금 생성한 필드명인 average_duration 을 바로 사용불가
- where 은 그룹화 전을 필터링, having 은 그룹화 후를 필터링의 차이!!
order by 로 정렬할 때 나열을 해서 여러개 정렬이 가능하다. (정렬 순번은 나열 순서대로 적용)

문제 2 - join

조건에 맞는 도서와 저자 리스트 출력하기

코드

# 첫번째 방법
SELECT b.book_id, a.author_name, date_format(b.published_date,'%Y-%m-%d') as published_date from book b, author a where b.author_id = a.author_id and b.category = '경제' order by b.published_date;

# 두번째 방법(join)
SELECT b.book_id, a.author_name, date_format(b.published_date,'%Y-%m-%d') as published_date from book b join author a on (b.author_id = a.author_id) where b.category = '경제' order by b.published_date;

풀이

date_format 사용법과 A join B on (...) 사용을 익혀두자!

참고로 두 테이블을 조인할 땐 b.author_id = a.author_id 처럼 연결하는 개념이 필수

inner join : SELECT * from book b join author a on (b.author_id = a.author_id) where b.category = '경제';
- id 컬럼이 중복해서 존재 (즉, id 컬럼 2개 출력 됨) -> 근데 원하는 컬럼명만 선정해서 출력하면 상관없긴 함.
natural join : SELECT * from book b natural join author a where b.category = '경제';
- id 컬럼의 중복을 제거 (즉, id 컬럼 1개 출력 됨)

따라서 행이 중복되어 나오는건 전혀 join 종류랑은 상관 없는 이야기. 그저 1:N, N:N 이런식의 관계에 join을 하게되면 중복 행이 발생할 가능성이 있는 것이다.

distinct를 사용해서 제거 필요.
물론, distinct를 사용하게끔 설계된 것은 잘 못 설계했을 가능성 높다.

문제 3 - having절 중복 활용(count(*)>1)

재구매가 일어난 상품과 회원 리스트 구하기

코드

SELECT user_id, product_id from online_sale group by user_id, product_id having count(*)>1 order by user_id asc, product_id desc;

풀이

group by 를 user_id, product_id 두개를 하게되면 두개를 조합해서 분류한다.

이때, 그룹화 하면서 얻은 동일 데이터들은 having 절에서 처리할 수 있다. -> count(*)>1 로 재구매 필터링

문제 4 - 언제 join 사용? bit 연산 심화? + self join

부모의 형질을 모두 가지는 대장균 찾기

코드

# 계층형 조인이 필요해 보임. -> self join
select a.id, a.genotype, b.genotype as parent_genotype
from ecoli_data a join ecoli_data b on(a.parent_id=b.id)
where a.genotype & b.genotype = b.genotype
order by a.id

풀이

하나의 테이블에 id와 parent_id 와 조인해서 해결해야 하기에 self join 형태이다.

계층형 질의는 자식과 부모를 구분하는게 먼저다. 당연히 parent_id가 부모다.
그럼 parent_id = id 로 조인하면 NL조인 생각하면 parent_id 기준으로 id가 조인되는것을 상상해보자.
- 참고: NL조인은 2중 for문과 흐름이 동일하다. 조인문은 select * from ecoli_data a join ecoli_data b on(a.parent_id=b.id) 이다.
- 아래 사진은 ECOLI_DATA 테이블과 조인 후 모습이다. (2번째 사진: a범위, 3번째 사진: b범위)
결론: a범위를 자식, b범위를 부모로 볼 수 있다. 그래서 코드에서 부모 형질을 b를 사용한 것!

bit 연산의 경우 a.genotype & b.genotype = b.genotype 이고, 이미 LV1에서 공부했으므로 설명 생략한다.

문제 5 - 서브쿼리는 메인쿼리에 종속, SQL 실행 순서 + 스칼라, 인라인 뷰

연도별 대장균 크기의 편차 구하기

코드

# 스칼라 풀이
select year(differentiation_date) as year, 
(select max(size_of_colony) from ecoli_data where year(differentiation_date)=year)-size_of_colony as year_dev, id
from ecoli_data order by year, year_dev;

# 인라인 뷰 풀이
select a.year, a.year_max-b.size_of_colony as year_dev, b.id
from (
    select year(DIFFERENTIATION_DATE) as year, max(size_of_colony) as year_max
    from ecoli_data
    group by year(DIFFERENTIATION_DATE)
) a, (
    select id, year(DIFFERENTIATION_DATE) as year, size_of_colony
    from ecoli_data
) b
where a.year = b.year
order by a.year, year_dev

풀이

(1)서브쿼리는 메인쿼리에 종속 관계이므로 메인쿼리의 year 별칭을 가져다 사용 가능

(2)인라인 뷰는 max크기 구하는 테이블을 서브쿼리로 만들어서 기본 테이블에 year로 조인해서 풀었다.

SQL 실제 실행 순서는 from -> where -> group by -> select -> order by 순 이므로 order by 에서 select 절의 결과를 가져다 사용 가능

(TMI)여기서 사용자 편의를 위해 MySQL에서는 GROUP BY, ORDER BY, HAVING절에서는 SELECT 절에 정의된 별칭을 사용할수 있다.
(TMI)서브쿼리는 select 절, from 절, where 절에 사용 가능하다. 각각 스칼라 서브쿼리, 인라인 뷰, 서브쿼리라 부른다.

문제 6 - case when 심화, select 별칭 사용과 사용X

분기별 분화된 대장균의 개체 수 구하기

코드

# MySQL 이라 별칭 group by 사용 가능 풀이.
select 
(case 
 when month(differentiation_date) >=1 and month(differentiation_date) <=3 then '1Q'
 when month(differentiation_date) >=4 and month(differentiation_date) <=6 then '2Q'
 when month(differentiation_date) >=7 and month(differentiation_date) <=9 then '3Q'
 else '4Q'
 end) as quarter, count(*) as ecoli_count
 from ecoli_data group by quarter order by quarter;
 
 # 그외에서는 별칭 group by 사용 불가 풀이.
 # 분기?: 1,2,3,4분기
select concat(a.quarter, 'Q') as quarter, count(*) as ecoli_count
from (
    select id, (case 
         when month(differentiation_date) >=1 and month(differentiation_date) <=3 then 1
         when month(differentiation_date) >=4 and month(differentiation_date) <=6 then 2
         when month(differentiation_date) >=7 and month(differentiation_date) <=9 then 3
         else 4
     end) as quarter
    from ecoli_data
) a
group by a.quarter
order by a.quarter

풀이

조건문을 여러개 작성할 때 when ... then 을 반복한다.

별칭 사용 풀이와 아닌 풀이이다. 정렬 때 ‘1Q’ 같이 숫자와 문자 조합은 어려우니 이부분 주의

문제 7 - date_format, having 심화 + convert(or cast)

월별 잡은 물고기 수 구하기

코드

select count(*) as fish_count, month(time) as month from fish_info
group by month order by month;

select count(*) as fish_count, convert(date_format(time,'%c'),UNSIGNED) as month from fish_info
group by month order by month;

-- date_format(time,'%c') 실패 -> 문제에서 숫자 타입이라 해서 -> convert 로 해결 가능
-- having count(month) > 0 생략 가능 -> 어차피 group by에서 없는 값은 출력 안하므로
select count(*) as fish_count, date_format(time,'%c') as month from fish_info
group by month having count(month) > 0 order by month;

풀이

date_format 를 원래 ‘Y-%m-%d’ 로 보통 쓰고 있었다.

%c : 월 을 한자리 수로 -> 단, 문자 타입
%e : 일 수를 한자리 수로
%y : 년 도를 두자리 수로 -> ex : 2024 라면 24

having count(month) > 0 은 값이 없는걸 출력 안하게 할 수 있다.

group by에서 이미 없는 값은 출력 안하므로 있으나 마나이긴 하다.

convert 와 cast 는 오라클의 to_number, to_char 처럼 타입 변환을 지원한다.

정수는 SIGNED(USIGNED), 문자는 CHAR, 날짜는 DATE 를 기입한다.
사용법이 차이: convert(컬럼, char) <-> cast(컬럼 as char)

문제 8 - 숫자 정렬이면 문자 꼭 제외

노선별 평균 역 사이 거리 조회하기

코드

-- hint: 내림차순 정렬은 'km'를 제외한 숫자만 정렬할 것

select route, concat(round(sum(d_between_dist),1),'km') as total_distance, concat(round(avg(d_between_dist),2),'km') as average_distance from subway_distance group by route order by round(sum(d_between_dist),1) desc;

풀이

문제에서 총 누계 거리를 기준으로 내림차순을 하라고 했는데 숫자이므로 숫자로만 정렬하자.

즉, ‘km’를 붙인 별칭 total_distance 로 정렬하면 실패!

문제 9 - distinct -> 중복 제거!! + group by

조건에 맞는 개발자 찾기

코드

# 코드를 실행해보면 developers 쪽 컬럼은 중복이 발생함을 알 수 있다.
# skillcodes랑 같이 출력하면 사실 중복은 아니지만, 출력 컬럼은 developers 쪽 컬럼만 출력한다.
# 따라서 distinct 로 중복을 제거한 것이다.
select distinct id, email, first_name, last_name from developers d, skillcodes s where d.skill_code & s.code = s.code and s.name in("Python", "C#") order by d.id;

# distinct 대신 group by도 중복제거 사용할 수 있다.
select id, email, first_name, last_name from developers d, skillcodes s where d.skill_code & s.code = s.code and s.name in("Python", "C#") group by id, email, first_name, last_name order by d.id;

풀이

d.skill_code & s.code = s.code 은 where 절에서 d스킬 코드에 대해 s코드 전체를 순회하면서 &연산을 한다.

이때, Python또는 C#이면 조인 성공이며 둘 다 가진 개발자는 중복 생성 문제가 발생.

skillcodes랑 같이 출력하면 사실 행이 중복은 아니지만, 출력 컬럼은 developers 쪽 컬럼만 출력한다. 중복!!
중복 제거를 위해 distinct 사용 or order by 사용
만약 스킬 둘 다 가진 개발자가 없다면 중복 문제 걱정 안해도 된다. 1:1 로 조인 잘 되었을 것이다.

문제 10 - 2연속 join + 스칼라, 중첩 서브쿼리

업그레이드 된 아이템 구하기

코드

-- from -> where -> group by -> select -> order by
-- 스칼라 서브쿼리 여러번 사용 방식 (효율X)
select b.item_id, 
(select item_name from item_info where b.item_id = item_id) as item_name, 
(select rarity from item_info where b.item_id = item_id) as rarity from item_info a join item_tree b on (a.item_id=b.parent_item_id) where a.rarity='RARE' order by b.item_id desc;

-- 조인 2연속 방식 (효율적)
select c.item_id, c.item_name, c.rarity  
from item_info a 
join item_tree b on (a.item_id=b.parent_item_id)  
join item_info c on (b.item_id=c.item_id)
where a.rarity='RARE' order by b.item_id desc;

-- in 서브쿼리 활용
select item_id, item_name, rarity
from item_info
where item_id in (
select b.item_id from 
    (select item_id from item_info where rarity='RARE') a join item_tree b on (a.item_id=b.parent_item_id)
)
order by item_id desc

풀이

처음에는 select 절에 서브쿼리를 2개나 작성하는 코드를 구현했는데,

다른 사람들의 힌트를 보니 join으로만 해결한 사람이 있길래 조인을 2연속 하는 방식으로 풀이했다.

item_info 의 item_id와 item_tree 의 parent_item_id 를 먼저 join 한 후
조인된 id값의 item_info 정보를 구하기 위해 한번 더 item_info 와 조인한다.

(3)이번엔 다시 풀어보니 in서브쿼리로 해결이 되어서 이렇게 풀어보았다.

(select item_id from item_info where rarity='RARE') a 는 ‘RARE’인 아이템을 구하는 서브쿼리며 이 테이블을 item_tree b 와 조인을 (a.item_id=b.parent_item_id) 이렇게 하는게 핵심이다.
POINT: 이때 a.item_id 는 업그레이드 가능한 아이템을 의미하고, b.item_id 는 업그레이드 된 아이템을 의미한다.
POINT: a 테이블이 부모, b 테이블을 자식으로 볼 수 있겠다.

문제 11 - order by 주의점

진료과별 총 예약 횟수 출력하기

코드

SELECT mcdp_cd as '진료과코드', count(*) as '5월예약건수' from appointment where year(apnt_ymd)=2022 and month(apnt_ymd)=5 group by mcdp_cd order by 5월예약건수, 진료과코드;

풀이

ORDER BY 를 사용하는 경우에 MYSQL에서는 따옴표로 감싸면 하나의 컬럼으로 처리하기에 따옴표를 지워서 열의 별칭이나 실제 이름으로 작성하면 됩니다.

즉, order by '5월예약건수', '진료과코드' 이렇게 작성시 채점 실패

문제 12 - floor (버림) + ceil (올림) + truncate (자름)

가격대 별 상품 개수 구하기

코드

select floor(price/10000)*10000 as price_group, count(*) as products from product 
group by floor(price/10000) order by price_group asc;

풀이

0~10000, 10000~20000 등 각 부분들을 0원, 만원, 2만원 이런식으로 그룹핑 해달라고 했는데,

버림을 통해서 0,1,2,3… 으로 그룹핑 했고, 출력은 10000을 곱해서 만원 단위로 나오게 했다.

반올림: ROUND(value, decimals)
버림: FLOOR(value)
올림: CEIL(value) 또는 CEILING(value)
자르기: TRUNCATE(value, decimals)

문제 13 - substring (문자 자르기)

카테고리 별 상품 개수 구하기

코드

SELECT substring(product_code,1,2) as category, count(*) as products from product group by category order by category;

풀이

부분 문자열 추출: SUBSTRING(string, position, length) 또는 SUBSTRING(string FROM position FOR length)

문제 14 - 서브쿼리에 group by절 사용, distinct

중복 제거하기

코드

-- from절에 서브쿼리로 group by 사용하여 count(name)>1 then 1 적용하여 나타냄
-- 메인쿼리엔 전체 개수 세는걸로 마무리
SELECT count(*) as count from 
(select (case when count(name)>1 then 1 else count(name) end) from animal_ins group by name having name is not null) a;

-- 더 쉬운 풀이 -> group by 특징인 중복 제거 효과 사용 + count(컬럼) 특징인 null 무시 효과 사용
select count(a.name) as count
from (
    select name
    from animal_ins
    group by name
) a

-- 더더 쉬운 풀이 -> distinct(컬럼) 지정으로 중복만 제거 사용 + count(컬럼) 특징인 null 무시 사용
select count(distinct(name)) as count
from animal_ins

풀이

(1)count(name)>1 then 1 부분이 중복 제거!

(2)group by 로 중복 제거 + count(컬럼명) 으로 null 제외 로 해결

(3)COUNT(DISTINCT NAME) 처럼 distinct를 이렇게도 사용이 가능! -> 중복 만 제거하므로 count(컬럼) 필수

Twitter Facebook LinkedIn

Intro

문제 1 - datediff, group by, having + 서브쿼리

코드

풀이

문제 2 - join

코드

풀이

문제 3 - having절 중복 활용(count(*)>1)

코드

풀이

문제 4 - 언제 join 사용? bit 연산 심화? + self join

코드

풀이

문제 5 - 서브쿼리는 메인쿼리에 종속, SQL 실행 순서 + 스칼라, 인라인 뷰

코드

풀이

문제 6 - case when 심화, select 별칭 사용과 사용X

코드

풀이

문제 7 - date_format, having 심화 + convert(or cast)

코드

풀이

문제 8 - 숫자 정렬이면 문자 꼭 제외

코드

풀이

문제 9 - distinct -> 중복 제거!! + group by

코드

풀이

문제 10 - 2연속 join + 스칼라, 중첩 서브쿼리

코드

풀이

문제 11 - order by 주의점

코드

풀이

문제 12 - floor (버림) + ceil (올림) + truncate (자름)

코드

풀이

문제 13 - substring (문자 자르기)

코드

풀이

문제 14 - 서브쿼리에 group by절 사용, distinct

코드

풀이

공유하기

댓글남기기

참고

2023-03-02 [MySQL]LV 1 문제 모음 ProgrammersTest

2023-03-03 [MySQL]LV 2 문제 모음 ProgrammersTest

2023-03-04 [MySQL]LV 3 문제 모음 ProgrammersTest

2023-03-02

[MySQL]LV 1 문제 모음

ProgrammersTest

2023-03-03

[MySQL]LV 2 문제 모음

ProgrammersTest

2023-03-04

[MySQL]LV 3 문제 모음

ProgrammersTest