[Paper] Deep Double Descent: Where Bigger Models and More Data Hurt


seokhyun2 2019. 12. 19. 09:13

https://arxiv.org/abs/1912.02292


We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size, performance first gets worse and then gets better. Moreover, we show that double descent occurs not just as a function of model siz


This paper was written by researchers at Harvard University and OpenAI.

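Until now, the dominant view has been that as a model gets larger or trains for more epochs, overfitting can make performance on the test data get worse even while performance on the training data keeps improving.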

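This paper, however, argues that although there is a range where test performance gets worse, making the model even larger or training for even more epochs makes it improve again. One figure excerpted from the paper seems to sum it all up.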
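To get a hands-on feel for the model-size version of this curve, here is a small toy sketch. It is not the paper's experimental setup: it assumes a fixed random ReLU feature layer with a minimum-norm least-squares fit on top, a noisy linear teacher, and 40 training points, all of which are my own illustrative choices (as are the names `random_relu_features`, `run_trial`, `sweep`, and `WIDTHS`). In this kind of setup, test error tends to spike when the number of features is close to the number of training points and then come back down as the width keeps growing.

```python
import numpy as np


def random_relu_features(x, weights):
    """Project inputs through a fixed random layer followed by ReLU."""
    return np.maximum(x @ weights, 0.0)


def run_trial(rng, widths, n_train=40, n_test=500, dim=10, noise_std=0.5):
    """Return (train_mse, test_mse) arrays with one entry per width."""
    # Hypothetical toy task: a noisy linear teacher (an assumption, not from the paper).
    teacher = rng.normal(size=dim) / np.sqrt(dim)
    x_train = rng.normal(size=(n_train, dim))
    x_test = rng.normal(size=(n_test, dim))
    y_train = x_train @ teacher + noise_std * rng.normal(size=n_train)
    y_test = x_test @ teacher  # noiseless targets for evaluation

    train_mse, test_mse = [], []
    for width in widths:
        weights = rng.normal(size=(dim, width))
        phi_train = random_relu_features(x_train, weights)
        phi_test = random_relu_features(x_test, weights)
        # lstsq returns the minimum-norm solution when width > n_train,
        # so wide models interpolate the training data.
        coef, *_ = np.linalg.lstsq(phi_train, y_train, rcond=None)
        train_mse.append(np.mean((phi_train @ coef - y_train) ** 2))
        test_mse.append(np.mean((phi_test @ coef - y_test) ** 2))
    return np.array(train_mse), np.array(test_mse)


def sweep(widths, n_trials=20, seed=0):
    """Average train/test MSE over several independent trials."""
    rng = np.random.default_rng(seed)
    results = [run_trial(rng, widths) for _ in range(n_trials)]
    train = np.mean([r[0] for r in results], axis=0)
    test = np.mean([r[1] for r in results], axis=0)
    return train, test


# Widths chosen so that 40 matches the number of training points.
WIDTHS = [5, 10, 20, 40, 80, 400, 2000]

if __name__ == "__main__":
    train, test = sweep(WIDTHS)
    for w, tr, te in zip(WIDTHS, train, test):
        print(f"width={w:5d}  train_mse={tr:.4f}  test_mse={te:.4f}")
```

Reading the printed table from top to bottom, the width of 40 is the one to watch, since that is where the model has just enough capacity to fit the training set exactly. The exact numbers depend on the seed and the toy settings, so treat them as a rough illustration only.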

I'm curious whether this will turn into a theory that can break the existing paradigm about overfitting!
