
Artificial Intelligence/Paper (8)

Improving Language Understanding by Generative Pre-Training 💬 Thoughts on the paper and on this post are welcome, as are typo corrections. Feel free to leave a comment! 💬 ◾ marks content from the original paper, and ◽ marks the author's personal thoughts. Original paper: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf Abstract ◾ In natural language understanding (NLU), unlabeled text data is abundant, but labeled data for specific tasks (textual entailment, QA, semantic similarity assessment, etc.) is scarce ◾ This scarcity of labeled data keeps trained models from performing well ◾ On diverse corpora of unlabeled text.. 2023. 1. 15.
RoBERTa: A Robustly Optimized BERT Pretraining Approach ◽ marks the author's personal thoughts, and ◾ marks content from the original paper. Original paper: https://arxiv.org/pdf/1907.11692.pdf Abstract ◾ Carries out a replication study of BERT to measure how data size and key hyperparameters affect the results ◾ Finds that BERT was undertrained and that, trained properly, it can match or exceed the performance of the models published after it ◾ Highlights the importance of previously overlooked design choices ◽ RoBERTa does not propose a new model; it is the BERT model trained in the best way the authors found ◽ 'undertra.. 2023. 1. 5.
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks Original paper: https://aclanthology.org/D19-1670.pdf 1 Introduction ▪️ Machine learning and deep learning have achieved high accuracy on NLP tasks ranging from sentiment analysis to topic classification, but high performance often depends on the amount and quality of the training data ▪️ Automatic data augmentation is widely used in computer vision and speech, but because generalized rules for transforming language are hard to devise, universal data augmentation techniques for NLP have not been thoroughly explored ▪️ Through this paper, a simple NLP data augmentation technique called EDA (Easy Data Augmentation).. 2022. 11. 7.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding It has been a very long time since I last read a paper. I tried to use a BERT-based pre-trained model, but I did not know any of the related concepts, so there was too much I could not follow, such as what goes into the model input and how the data should be shaped. So I went straight to the paper itself. Original paper: https://arxiv.org/pdf/1810.04805.pdf ■ : parts I do not fully understand yet Introduction 1. There are two ways to apply pre-trained language representations to downstream tasks 1) Feature-based - Uses task-specific architectures that include the pre-trained representations as additional features - e.g., ELMo 2) .. 2022. 9. 21.
Sequence to Sequence Learning with Neural Networks There are a great many papers and concepts to go through in order to understand the Transformer properly. I plan to work through them step by step and then revisit the Transformer. Original paper: https://arxiv.org/pdf/1409.3215.pdf Abstract - DNNs achieve excellent results on difficult learning tasks such as speech recognition, but because they work with fixed dimensionality, they were ill-suited to problems involving sequences (sentences) whose input and output lengths differ. - This paper proposes using multilayer LSTMs as an encoder-decoder to output a variable-length sequence that corresponds to the meaning of the input sequence. - When the word order of the input sequence is reversed(.. 2022. 3. 21.
Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks When I studied AI security in the lab, I seem to have studied only attacks, so I became curious about defense techniques. This one is the pick for this week's paper 👊 Original paper: https://arxiv.org/pdf/1704.01155.pdf Abstract Previous work on defending against adversarial examples focused on improving the DNN (Deep Neural Network) model, which requires modifying the model itself, but it has had limited success and carries a high computational cost → The paper presents Feature Squeezing, an approach that hardens DNN models by detecting adversarial examples Introduction - A classifier advers.. 2022. 3. 3.