All Posts (202)

2023/January Issue

The monthly recaps are starting to pile up again! I really need to write February's on time. January didn't have any particularly big events. From mid-January(?) (or maybe late January) I was preparing for a competition, and by now everything has been submitted. Lab work wasn't especially busy either, though I suspect it will gradually pick up. January flew by as usual, but nothing in particular stands out in my memory, so I'm wrapping this up in a hurry..🙂

Improving Language Understanding by Generative Pre-Training

๐Ÿ’ฌ ๋…ผ๋ฌธ ๋‚ด์šฉ๊ณผ ์ด ๊ธ€์— ๋Œ€ํ•œ ์˜๊ฒฌ ๊ณต์œ , ์˜คํƒˆ์ž ์ง€์  ํ™˜์˜ํ•ฉ๋‹ˆ๋‹ค. ํŽธํ•˜๊ฒŒ ๋Œ“๊ธ€ ๋‚จ๊ฒจ์ฃผ์„ธ์š” ! ๐Ÿ’ฌ โ—พ ๊ธฐํ˜ธ๋Š” ์›๋ฌธ ๋‚ด์šฉ์ด๋ฉฐ, โ—ฝ ๊ธฐํ˜ธ๋Š” ๊ธ€ ์ž‘์„ฑ์ž์˜ ๊ฐœ์ธ์ ์ธ ์ƒ๊ฐ์ž…๋‹ˆ๋‹ค. ์›๋ฌธ: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf Abstract โ—พ ์ž์—ฐ์–ด ์ƒ์„ฑ(NLG) ๋ถ„์•ผ์—์„œ ๋ ˆ์ด๋ธ”์ด ์—†๋Š” ํ…์ŠคํŠธ ๋ฐ์ดํ„ฐ๋Š” ์ถฉ๋ถ„ํ•˜์ง€๋งŒ ํŠน์ • ํƒœ์Šคํฌ(textual entailment, QA, semantic similarity assessment ๋“ฑ)๋ฅผ ์œ„ํ•ด ๋ ˆ์ด๋ธ” ๋œ ๋ฐ์ดํ„ฐ๋Š” ๋ถ€์กฑํ•จ โ—พ ๋ ˆ์ด๋ธ” ๋œ ๋ฐ์ดํ„ฐ๊ฐ€ ๋ถ€์กฑํ•œ ์ƒํ™ฉ์€ ํ•™์Šต๋œ ๋ชจ๋ธ์ด ์ œ๋Œ€๋กœ ์„ฑ๋Šฅ์„ ๋ฐœํœ˜ํ•˜์ง€ ๋ชปํ•˜๊ฒŒ ํ•จ โ—พ ๋ ˆ์ด๋ธ”์ด ์—†๋Š” ๋‹ค์–‘ํ•œ ํ…์ŠคํŠธ ์ฝ”ํผ์Šค์—..

RoBERTa: A Robustly Optimized BERT Pretraining Approach

๐Ÿ’ฌ ๋…ผ๋ฌธ ๋‚ด์šฉ๊ณผ ์ด ๊ธ€์— ๋Œ€ํ•œ ์˜๊ฒฌ ๊ณต์œ , ์˜คํƒˆ์ž ์ง€์  ํ™˜์˜ํ•ฉ๋‹ˆ๋‹ค. ํŽธํ•˜๊ฒŒ ๋Œ“๊ธ€ ๋‚จ๊ฒจ์ฃผ์„ธ์š” ! ๐Ÿ’ฌ โ—ฝ ๊ธฐํ˜ธ๋Š” ๊ธ€ ์ž‘์„ฑ์ž์˜ ๊ฐœ์ธ์ ์ธ ์ƒ๊ฐ์ด๋ฉฐ, โ—พ ๊ธฐํ˜ธ๋Š” ์›๋ฌธ ๋‚ด์šฉ์ž…๋‹ˆ๋‹ค. ์›๋ฌธ: https://arxiv.org/pdf/1907.11692.pdf Abstract โ—พ BERT ๋ชจ๋ธ์— ๋Œ€ํ•ด ์žฌํ˜„ ์—ฐ๊ตฌ(replication study)๋ฅผ ์ˆ˜ํ–‰ํ•˜๋ฉด์„œ ๋ฐ์ดํ„ฐ ํฌ๊ธฐ, ์ฃผ์š” ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๊ฐ€ ๊ฒฐ๊ณผ์— ์–ด๋–ค ์˜ํ–ฅ์„ ์ฃผ๋Š”์ง€ ํ™•์ธ โ—พ BERT ๋ชจ๋ธ์ด undertrained๋˜์—ˆ์œผ๋ฉฐ BERT ๋ชจ๋ธ ๋ฐœํ‘œ ์ดํ›„ ๋‚˜์˜จ ๋ชจ๋ธ๋“ค์˜ ์„ฑ๋Šฅ์„ ๋Šฅ๊ฐ€ํ•œ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ๊ฒŒ ๋จ โ—พ ์ด์ „์— ๊ฐ„๊ณผ๋˜๋˜ ๋ชจ๋ธ ์„ค๊ณ„ ๋ฐฉ๋ฒ•์˜ ์ค‘์š”์„ฑ์— ๋Œ€ํ•ด ๊ฐ•์กฐ โ—ฝ RoBERTa๋ผ๋Š” ์ƒˆ๋กœ์šด ๋ชจ๋ธ์„ ์ œ์•ˆํ•œ ๊ฒƒ์ด ์•„๋‹ˆ๋ผ BERT ๋ชจ๋ธ์„ ๊ฐ€์žฅ ์ข‹์€ ๋ฐฉ๋ฒ•์œผ๋กœ ํ•™์Šต์‹œํ‚จ ๊ฒƒ โ—ฝ 'undertra..

2022 Retrospective | Keywords of the Year

์ž‘๋…„์— ํ•ด ๋ฐ”๋€Œ๊ณ  ํšŒ๊ณ ๋ก์„ ์ž‘์„ฑํ–ˆ๋˜ ๊ฒŒ ์•„์‰ฌ์›Œ์„œ ์žŠ์ง€ ์•Š๊ณ  ์žˆ๋‹ค๊ฐ€ 12์›” 31์ผ์— ์ž‘์„ฑํ•˜๊ธฐ ์„ฑ๊ณต! ๐Ÿ”› ์‹œ์ž‘ ๋ณธ๊ฒฉ์ ์œผ๋กœ ์„์‚ฌ๊ณผ์ •์„ ์‹œ์ž‘ํ•œ ํ•ด์˜€๋‹ค. ์ด์ œ๋Š” ๊ฑฐ์˜ ๋‹ค ๋Œ€๋ฉด์ˆ˜์—…์œผ๋กœ ์ „ํ™˜๋๊ธฐ ๋•Œ๋ฌธ์— ์—ฐ๊ตฌ์‹ค ์ผํ•˜๋‹ค๊ฐ€ ์ˆ˜์—… ์‹œ๊ฐ„์ด ๋˜๋ฉด ๋“ฃ๊ณ  ์˜ค๋Š” ์‹์œผ๋กœ ์ฒซ ํ•™๊ธฐ๋ฅผ ๋งˆ๋ฌด๋ฆฌํ–ˆ๋‹ค. ๋งˆ์ง€๋ง‰ ํ•™๋ถ€ ์ƒํ™œ์ด ์™„์ „ํ•œ ๋น„๋Œ€๋ฉด ํ•™๊ธฐ์˜€๊ธฐ ๋•Œ๋ฌธ์— ๋Œ€๋ฉด ์ˆ˜์—…์ด ๋„ˆ๋ฌด ์˜ค๋žœ๋งŒ์ด์—ˆ๋‹ค. ๋ถ„๋ช… ๊ณผ๊ฑฐ์˜ ๋‚˜๋Š” ๋Œ€ํ•™๊ต ์กธ์—…ํ•˜๋ฉด ์ˆ˜์—… ๋“ค์„ ์ผ์ด ์—†์„ ๊ฒƒ์ด๋ผ๊ณ  ์ƒ๊ฐํ–ˆ์„ํ…๋ฐ ์ด๋ ‡๊ฒŒ ์ƒํ™œํ•˜๊ณ  ์žˆ๋Š” ๊ฑธ ๋ณด๋ฉด ์ธ์ƒ์€ ์ •๋ง ์•Œ ์ˆ˜ ์—†๋Š” ๊ฑฐ๊ตฌ๋‚˜ ์‹ถ๋‹ค. ์ด๋ฒˆ ํ•™๊ธฐ์— ๋Œ€ํ•ด ํ‰๊ฐ€ํ•˜๋ฉด ์–ธ์ œ๋‚˜ ๊ทธ๋žฌ๋“ฏ์ด ํ›„ํšŒ๊ฐ€ ๋งŽ์ด ๋˜๊ณ  ๋ฐ˜์„ฑํ•  ๊ฒƒ ํˆฌ์„ฑ์ด๋‹ค. ๋‚˜๋งŒ ์—ฐ๊ตฌ+์—ฐ๊ตฌ์‹ค ์—…๋ฌด์™€ ์ˆ˜์—…์„ ๋ณ‘ํ–‰ํ•˜๋Š” ๊ฒŒ ์•„๋‹Œ๋ฐ ํ•™๊ธฐ ์ค‘์—๋Š” ์Šค์Šค๋กœ์—๊ฒŒ ๋„ˆ๋ฌด๋‚˜ ๊ด€๋Œ€ํ–ˆ๋‹ค. ํ‰์ผ์— ๋ฐ”๋นด์œผ๋‹ˆ๊นŒ, ์ด๋ฒˆ์ฃผ๋Š” ํ”ผ๊ณคํ–ˆ์œผ๋‹ˆ๊นŒ ํ•˜..

2022/November Issue

The year is almost over and I'm only now writing the November recap. Since I'm writing it this late I can't remember much, so I'll skip the usual categories and keep it very short. I started a coding-test/AI study group, and it's still running. We mostly use Programmers and Dacon, and the Dacon competitions are full of incredibly skilled people! I'd love to soak up their knowledge like a sponge, but I get stuck as early as deciding on a modeling direction, so for now I'm just full of questions about how they all studied 😵 I keep telling myself I should spend time studying the winners' code, but my follow-through really is the problem~

[Code Review] Implementing a Sentiment Classification Model for Elderly Conversations (3): Transformer ①

๊ฐ์„ฑ ๋ถ„๋ฅ˜ ๋ชจ๋ธ ๊ตฌํ˜„ ์‹œ๋ฆฌ์ฆˆ (1) | CNN ๊ฐ์„ฑ ๋ถ„๋ฅ˜ ๋ชจ๋ธ ๊ตฌํ˜„ ์‹œ๋ฆฌ์ฆˆ (2) | RNN Transformer ๋ถ„๋ฅ˜ ๋ชจ๋ธ์€ ๋‹จ์ผ ํŒŒ์ผ์ด ์•„๋‹ˆ๋ผ์„œ ํ•˜๋‚˜์”ฉ ๋ถ„์„ํ•˜๋ฉด ๊ธ€์ด 3๊ฐœ๋‚˜ 4๊ฐœ ์ •๋„ ๋‚˜์˜ฌ ๊ฒƒ ๊ฐ™๋‹ค. ๐Ÿ‘ฉ‍๐Ÿซ ๋ชจ๋ธ ํด๋ž˜์Šค import torch import torch.nn as nn import torch.nn.functional as F from copy import deepcopy from .encoder import Encoder, EncoderLayer from .sublayers import * attn = MultiHeadAttention(8, 152) ff = PositionwiseFeedForward(152, 1024, 0.5) pe = PositionalEncoding(152, 0.5)..

[Code Review] Implementing a Sentiment Classification Model for Elderly Conversations (2): RNN

๊ฐ์„ฑ ๋ถ„๋ฅ˜ ๋ชจ๋ธ ๊ตฌํ˜„ ์‹œ๋ฆฌ์ฆˆ (1) | CNN ๐Ÿ‘ฉ‍๐Ÿซ ๋ชจ๋ธ ํด๋ž˜์Šค class RNN(nn.Module): def __init__(self, vocab_size, embed_dim, hidden_dim, n_layers, dropout, num_class, device): super(RNN, self).__init__() self.device = device self.n_layers = n_layers self.hidden_dim = hidden_dim self.embed = nn.Embedding(vocab_size, embed_dim) self.dropout = nn.Dropout(p=dropout) self.gru = nn.GRU(embed_dim, self.hidden_dim, self.n_laye..

[Translation] Micro, Macro & Weighted Averages of F1 Score, Clearly Explained

๐Ÿ’ฌ ์ตœ๋Œ€ํ•œ ๋งค๋„๋Ÿฝ๊ฒŒ ํ•ด์„ํ•˜๊ณ ์ž ๋…ธ๋ ฅํ–ˆ์ง€๋งŒ ์–ด์ƒ‰ํ•œ ๋ฌธ์žฅ์ด ์žˆ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ํ”ผ๋“œ๋ฐฑ์€ ์–ธ์ œ๋‚˜ ํ™˜์˜์ž…๋‹ˆ๋‹ค ๐Ÿ™‚ ์›๋ณธ ๊ธ€ ์ฃผ์†Œ: https://towardsdatascience.com/micro-macro-weighted-averages-of-f1-score-clearly-explained-b603420b292f Micro, Macro & Weighted Averages of F1 Score, Clearly Explained Understanding the concepts behind the micro average, macro average and weighted average of F1 score in multi-class classification towardsdatascience.com F1 Score(..

Archive 2022.12.20

2022 5th Big Data Analysis Engineer Exam (Passed Written and Practical in the Same Round)

โœ๏ธ ์„ฑ์ธ ๋˜๊ณ  ์ฒ˜์Œ ๋”ด ์ž๊ฒฉ์ฆ ์ „๊ณต์ž / ํŒ๋‹ค์Šค ๊ฒฝํ—˜ ๅคš / ์ฃผ ์‚ฌ์šฉ ์–ธ์–ด ํŒŒ์ด์ฌ์ธ ์‚ฌ๋žŒ์ด ์ž‘์„ฑํ•˜๋Š” ์ œ5ํšŒ ๋น…๋ฐ์ดํ„ฐ๋ถ„์„๊ธฐ์‚ฌ ํ•ฉ๊ฒฉ ๋ฐ ์‹œํ—˜ ํ›„๊ธฐ ๊ธ€์ด๋‹ค. ํ•„๊ธฐ/์‹ค๊ธฐ ๋‘˜๋‹ค ์ผ์ฃผ์ผ ๊ณต๋ถ€ํ•˜๊ณ  ์ณค๋”๋‹ˆ ๋‘ ์‹œํ—˜ ๋‹ค ๋ถ™์„ ๊ฑฐ๋ž€ ํ™•์‹ ์ด ์—†์—ˆ๋‹ค. ์ƒ๊ฐ๋ณด๋‹ค ์•ˆ์ •์ ์ธ ์ ์ˆ˜๋กœ ํ•ฉ๊ฒฉํ–ˆ๋‹ค๋Š” ๊ฒŒ ๋‚˜๋„ ๋†€๋ผ์šธ ๋ฟ.. ํ•„๊ธฐ ๊ณผ๋ฝ์ด ์žˆ๊ณ  ์‚ฌ๋žŒ๋งˆ๋‹ค ์–ด๋ ค์šด ํŒŒํŠธ๋Š” ๋‹ค๋ฅด๊ฒ ์ง€๋งŒ ๋‚ด ๊ธฐ์ค€ '๋น…๋ฐ์ดํ„ฐ ํƒ์ƒ‰' ๋ถ€๋ถ„์ด ์•”๊ธฐํ•  ๊ฒŒ ์ œ์ผ ๋งŽ์•˜๋˜ ๊ฒƒ ๊ฐ™๋‹ค. ๋‹ค๋ฅธ ํŒŒํŠธ๋ณด๋‹ค ์ž์‹ ์ด ์—†๊ธฐ๋„ ํ–ˆ๋Š”๋ฐ ๊นŒ๋ณด๋‹ˆ๊นŒ ์—ญ์‹œ ๋„ค ํŒŒํŠธ ์ค‘์— ์ ์ˆ˜๊ฐ€ ์ œ์ผ ๋‚ฎ๋”๋ผ ใ…Žใ…Ž ํ•„๊ธฐ๋Š” 10์›”์ด์—ˆ๋‚˜ ๊ทธ ๋•Œ ์ณ์„œ ์–ด๋–ค ํŒŒํŠธ์— ๋ญ๊ฐ€ ๋‚˜์™”๊ณ  ์ด๋Ÿฐ ์ •ํ™•ํ•œ ๋‚ด์šฉ์ด ๊ธฐ์–ต์ด ์•ˆ ๋‚˜๋Š”๋ฐ, ๊ณ„์‚ฐํ•˜๋Š” ๋ฌธ์ œ๊ฐ€ ๊ฝค ๋‚˜์™”๋˜ ๊ฑธ๋กœ ๊ธฐ์–ตํ•œ๋‹ค. ๋‚˜์˜ฌ ๋ฒ•ํ•œ ๊ณ„์‚ฐ ๋ฌธ์ œ๋Š” ๋‹ค ๋ณด๊ณ  ๊ฐ”๋‹ค๊ณ  ์ƒ๊ฐํ–ˆ๋Š”๋ฐ ์™„๋ฒฝํ•˜๊ฒŒ ํ’€์ง„ ๋ชปํ–ˆ๊ณ  ..

Epilogue 2022.12.19

[Code Review] Implementing a Sentiment Classification Model for Elderly Conversations (1): CNN

Now that this semester's team project has wrapped up, I'm writing this series to review the code we used. There are three models in total, which I plan to cover in the order CNN, RNN, Transformer. 👩‍🏫 Model Class class CNN(nn.Module): def __init__(self, vocab_size, embed_dim, n_filters, filter_size, dropout, num_class): super(CNN, self).__init__() self.embedding = nn.Embedding(vocab_size, embed_dim) self.conv1d_layers = nn.ModuleList([nn.Conv1d(in_channels=embed_dim, out_channels=n_filters[i], ke..
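(The class definition in the excerpt is truncated at the Conv1d arguments. Below is a minimal runnable sketch in the same shape, Embedding → parallel Conv1d branches → max-pool → concat → Linear; the filter sizes, the forward pass, and the plural filter_sizes parameter are assumptions, not the post's exact code.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    """Sketch completing the truncated excerpt: parallel Conv1d
    branches over the embedded sequence, max-pooled and concatenated.
    Filter sizes and forward logic are assumptions."""

    def __init__(self, vocab_size, embed_dim, n_filters,
                 filter_sizes, dropout, num_class):
        super(CNN, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.conv1d_layers = nn.ModuleList([
            nn.Conv1d(in_channels=embed_dim,
                      out_channels=n_filters[i],
                      kernel_size=filter_sizes[i])
            for i in range(len(filter_sizes))
        ])
        self.dropout = nn.Dropout(p=dropout)
        self.fc = nn.Linear(sum(n_filters), num_class)

    def forward(self, x):                       # x: (batch, seq) token ids
        e = self.embedding(x).permute(0, 2, 1)   # (batch, embed, seq)
        pooled = [F.relu(conv(e)).max(dim=2).values   # global max-pool per branch
                  for conv in self.conv1d_layers]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))

# Usage sketch (all sizes made up):
model = CNN(vocab_size=5000, embed_dim=128, n_filters=[64, 64, 64],
            filter_sizes=[3, 4, 5], dropout=0.5, num_class=7)
logits = model(torch.randint(0, 5000, (4, 30)))  # -> (4, 7)
```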