All posts 202

[BOJ] No. 1789: Sum of Numbers

Problem: The sum of N distinct natural numbers is S. Given S, what is the maximum value of N? Input/output examples. Solution: The input and output are very simple and the solved.ac tier isn't high, but I found the problem interesting, so I'm writing it up. The problem statement and examples were short, but they didn't take long to understand. Here is what came to mind as soon as I read the problem. Step 1) How should S be composed? To maximize N, start from the small numbers. That is, if you want to reach 200 by adding several numbers, a combination like 5, 10, 25, ..., 90 beats the combination 99, 101! Step 2) How do we start from the small numbers? This is where a bit of a DP flavor comes in: first, keep adding numbers in order, starting from 1..
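The excerpt is cut off before the full solution, so the following is only a minimal sketch of the greedy idea it describes (keep adding 1, 2, 3, ... while the running sum still fits in S). The function name `max_distinct_terms` is hypothetical, and this is not necessarily the code the post ends up with. It works because any leftover S minus the running sum can be added to the largest term without breaking distinctness.

```python
def max_distinct_terms(s: int) -> int:
    """Return the largest N such that 1 + 2 + ... + N <= s."""
    total = 0  # running sum 1 + 2 + ... + n
    n = 0      # how many distinct numbers have been used so far
    # Take the next smallest unused number as long as it still fits.
    while total + (n + 1) <= s:
        n += 1
        total += n
    return n


# For S = 200: 1 + ... + 19 = 190 fits, but adding 20 would give 210.
print(max_distinct_terms(200))  # 19
```

To use it as a submission, one would read S with `int(input())` and print the result.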

[Hugging Face] Resuming training when using Trainer.train()

trainer.train(resume_from_checkpoint=True) One problem with the code above, though, is that the first time you run it, the error below is raised. For now I've wrapped the call in a try-except block, but that feels like a stopgap, so I'm not entirely comfortable with it. ValueError: No valid checkpoint found in output directory (the output directory name specified in TrainingArguments)
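One possible alternative to the try-except workaround is to check for a checkpoint first and pass its path (or None) to `resume_from_checkpoint`. The sketch below assumes the Trainer's default layout of `checkpoint-<step>` folders inside the output directory; `find_last_checkpoint` is a hypothetical standard-library helper written for illustration, and `trainer` / `training_args` in the comments stand for objects created elsewhere. The transformers library also provides `get_last_checkpoint` in `transformers.trainer_utils`, which serves the same purpose.

```python
import os
import re
from typing import Optional

# Assumed folder naming: "checkpoint-<global_step>".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint folder with the highest step, or None if there is none."""
    if not os.path.isdir(output_dir):
        return None
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path):
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, path
    return best_path


# Hypothetical usage with an existing `trainer` and `training_args`:
#   last = find_last_checkpoint(training_args.output_dir)
#   trainer.train(resume_from_checkpoint=last)  # None means start from scratch
```

With this approach the very first run simply starts fresh instead of raising the ValueError.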

[BOJ] No. 1475: Room Number

Problem: Dasom has just moved in next door to Eunjin. Dasom wants to put her room number on the door using pretty plastic digits. The shop next door sells plastic digits in sets. Each set contains one of each digit from 0 to 9. Given Dasom's room number, print the minimum number of sets needed. (A 6 can be flipped and used as a 9, and a 9 can be flipped and used as a 6.) Input/output examples. Solution: When I first saw the problem and the examples I thought I would solve it in no time, but I had overestimated myself yet again. I was only able to submit after getting hints from the question board and from googling. Once I see the hint it's "ah, of course", so why is it so hard to come up with that reasoning on my own 🥲 Anyway, the approaches I thought of are as follows. Idea 1) Simple string check ① f..
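The author's own ideas are cut off in the excerpt, so the following is just one common way to approach the problem, not necessarily the approach the post settles on: count each digit, pool 6 and 9 together (two of them fit in one set), and take the largest requirement. The function name `min_sets` is hypothetical.

```python
from collections import Counter


def min_sets(room_number: str) -> int:
    """Minimum number of 0-9 digit sets needed to spell room_number."""
    counts = Counter(room_number)
    # 6 and 9 are interchangeable, so each set supplies two of them.
    flippable = counts.pop("6", 0) + counts.pop("9", 0)
    needed_for_6_and_9 = (flippable + 1) // 2  # ceiling of flippable / 2
    # Every other digit needs one set per occurrence.
    return max([needed_for_6_and_9, *counts.values()])


print(min_sets("9999"))  # 2
print(min_sets("122"))   # 2
```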

[BOJ] No. 1018: Repainting the Chessboard

Problem: In his mansion, Jimin found an M×N board divided into MN unit squares. Some squares are painted black and the rest are painted white. Jimin wants to cut this board into an 8×8 chessboard. A chessboard must be painted in alternating black and white. Specifically, each cell is painted either black or white, and two squares that share an edge must be painted different colors. By this definition there are only two ways to color a chessboard: one where the top-left cell is white, and one where it is black. Since there is no guarantee that the board is painted like a chessboard, Jimin decided to repaint some squares after cutting out an 8×8 chessboard. Naturally, the 8*8 ..
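The excerpt stops inside the problem statement, so the sketch below is not the author's solution. It is a plain brute-force reading of the statement: try every 8×8 window, count the cells that disagree with a white-top-left pattern, and note that the black-top-left pattern needs exactly 64 minus that count. The function name `min_repaints` and the encoding of the board as strings of 'W' and 'B' are assumptions.

```python
from typing import List


def min_repaints(board: List[str]) -> int:
    """Fewest cells to repaint so that some 8x8 window becomes a chessboard."""
    rows, cols = len(board), len(board[0])
    best = 64  # upper bound: an 8x8 window has 64 cells
    for top in range(rows - 7):
        for left in range(cols - 7):
            # Count mismatches against the pattern whose top-left cell is white.
            mismatches = 0
            for r in range(8):
                for c in range(8):
                    expected = "W" if (r + c) % 2 == 0 else "B"
                    if board[top + r][left + c] != expected:
                        mismatches += 1
            # The black-top-left pattern is the exact complement.
            best = min(best, mismatches, 64 - mismatches)
    return best
```

Calling it with the board rows as a list of strings returns the minimum repaint count over all windows.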

[Translation] The History of Open-Source LLMs: Part Ⅱ. Better Base Models

The History of Open-Source LLMs: Part Ⅰ. Early days / The History of Open-Source LLMs: Part Ⅱ. Better Base Models / The History of Open-Source LLMs: Part Ⅲ. Imitations and alignment ⭐ All images in this post come from the original article. Early open-source LLMs - Their drawback is that they perform far worse than non-public pre-trained models. 1) LLM training pipeline ① Pre-train the model on a large amount of raw data ② Perform alignment using techniques such as SFT and RLHF ③ Fine-tune or use in-context learning to apply the LLM to a specific task..

Archive 2023.10.25

[Translation] The History of Open-Source LLMs: Part Ⅰ. Early days

I read an article on how open-source LLMs have evolved, shared by a senior labmate, and here I organize my translation along with the extra things I studied. The History of Open-Source LLMs: Part Ⅰ. Early days / The History of Open-Source LLMs: Part Ⅱ. Better Base Models / The History of Open-Source LLMs: Part Ⅲ. Imitations and alignment ⭐ All images in this post come from the original article. Background of LLMs - Language models themselves have a long history, but by combining self-supervised pre-training with in-context learning to show impressive few-shot learning performance across many tasks, GP..

Archive 2023.10.23

[Hugging Face] ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive n..

I was following the QLoRA+Polyglot-Ko-12.8B training example posted by Junbum Lee when I hit an error that the original code did not produce. None of the fixes I found by googling (passing padding=True or 'max_length', and truncation=True or 'max_length', as tokenizer arguments) worked at all, which was really frustrating, but I was able to solve it as shown below. The key is remove_columns! ✅ Solution dataset = dataset.map(lambda samples: tokenizer(samples["text"], padding=True, truncation=True, max_length=128), batched=True, remove_columns=['input..

[PyTorch] RuntimeError - dtype, grad_fn

RuntimeError: Expected floating point type for target with class probabilities, got Long. I used loss = loss_fn(pred, label) during training, and the error message says it expected pred and label to be float but received long. I don't think I ever specified a data type at the point where the loss is computed before; I probably set it in the dataset class or the collate_fn function? This time I specified the data type explicitly at that point. ✅ Solution loss = loss_fn(torch.tensor(pred, dtype=torch.float16), torch.tensor(label, dtype=torch.float16)) R..

[Causal Inference] 01. Introduction to causality

I'm writing this post to organize the material we decided to study together in a weekly study group. A study group member recommended a great resource, so it looks like I'll be doing some solid theory study for the first time in a long while. Source: https://github.com/CausalInferenceLab/Causal-Inference-with-Python Machine learning and causality ◾ Machine learning can do amazing things within strict boundaries, but it may not work properly when the given data differs slightly from what the model was trained on ◾ Because machine learning relies on correlation, it does not handle causation-type problems well. Example) "In the hotel industry, prices are low in the off-season and high in the peak season, when demand is highest and hotels are full. This data..