👩‍💻

[ART] adversarial_training_mnist.ipynb code analysis

geum 2022. 1. 12. 17:34

✅ Code:

https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/main/notebooks/adversarial_training_mnist.ipynb

 


 

Load prereqs and data

Importing the various modules and loading the MNIST dataset with load_dataset() works the same way as in the MNIST examples found on many other sites.

 

 

Train and evaluate a baseline classifier

 

path = get_file('mnist_cnn_original.h5', extract=False, path=config.ART_DATA_PATH,
                url='https://www.dropbox.com/s/p2nyzne9chcerid/mnist_cnn_original.h5?dl=1')
classifier_model = load_model(path)
classifier = KerasClassifier(clip_values=(min_, max_), model=classifier_model, use_logits=False)

1. classifier_model = load_model(path)

This loads the model stored at path, i.e. the 'mnist_cnn_original.h5' model. I wanted to check that model's architecture, but reading the h5 file turned out to be difficult, so I gave up.. heh
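The architecture of a saved Keras model can usually be printed without opening the h5 file by hand. A minimal sketch, assuming TensorFlow's Keras is installed and the file loads with load_model; describe_saved_model and its loader argument are hypothetical names used only for illustration, not part of ART or the notebook:

```python
def describe_saved_model(path, loader=None):
    """Return the layer-by-layer summary of a saved Keras model as text.

    `loader` is a hypothetical hook so the function can be tried without
    TensorFlow; by default it falls back to Keras' own load_model.
    """
    if loader is None:
        # Imported lazily so this file can be read without TensorFlow installed.
        from tensorflow.keras.models import load_model
        loader = load_model

    model = loader(path)
    lines = []
    # summary() prints by default; print_fn lets us capture the text instead.
    # Extra keyword arguments are accepted because some Keras versions pass them.
    model.summary(print_fn=lambda line, **kwargs: lines.append(line))
    return "\n".join(lines)
```

Calling print(describe_saved_model(path)) with the path returned by get_file above should list each layer together with its output shape and parameter count.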

 

Instead, going by the keyword "original" and the fact that tensorflow is imported, I looked up the TensorFlow MNIST example. Up to the flatten layer, the output shape of every layer matches the example on the official TensorFlow site, so the model seems to have been based on that architecture.

 

✅ Code (TensorFlow MNIST example): https://www.tensorflow.org/tutorials/images/cnn?hl=ko

 


 

# Generate adversarial examples
attacker = FastGradientMethod(classifier, eps=0.5)
x_test_adv = attacker.generate(x_test[:100])

1. x_test_adv = attacker.generate(x_test[:100])

This generates 100 adversarial examples with FGSM. The generate method of the FastGradientMethod class returns an array containing the adversarial examples.
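To make the idea concrete, here is a minimal pure-Python sketch of the FGSM update, x_adv = clip(x + eps * sign(gradient of the loss w.r.t. x)), on a toy logistic-regression model. The weights, the input, and all helper names are made up for illustration; this is not ART's implementation, which takes the gradients from the wrapped Keras model.

```python
import math


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def predict_proba(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(x, y, w, b):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict_proba(x, w, b)
    return [(p - y) * wi for wi in w]


def fgsm(x, y, w, b, eps, clip_min=0.0, clip_max=1.0):
    """One signed-gradient step of size eps, clipped to the valid range."""
    grad = input_gradient(x, y, w, b)
    return [min(clip_max, max(clip_min, xi + eps * sign(gi)))
            for xi, gi in zip(x, grad)]


# Toy model and input (assumed values, not taken from the notebook).
w, b = [2.0, -1.0, 0.5], -0.2
x, y = [0.6, 0.2, 0.9], 1

x_adv = fgsm(x, y, w, b, eps=0.5)
print(predict_proba(x, w, b), predict_proba(x_adv, w, b))
```

Every feature moves by at most eps, in the direction that increases the loss, which is why a single step can already flip the prediction of this toy model.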

 

🧐 Added some simple code to check x_test_adv

Since these are adversarial examples I expected the predictions to be wrong, but they came out right, which puzzles me. I am not sure this is the expected result.

The 100 adversarial examples are classified correctly far more often than not, so has a perturbation actually been added to the images..?
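One way to settle this kind of doubt is to measure two things directly: the accuracy on the adversarial inputs and the largest pixel change between clean and adversarial images. The sketch below does both on toy lists in pure Python; the data and helper names are invented for illustration. In the notebook the same idea would presumably be applied to classifier.predict(x_test_adv), y_test[:100], x_test[:100] and x_test_adv.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(probabilities, one_hot_labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    correct = sum(argmax(p) == argmax(t)
                  for p, t in zip(probabilities, one_hot_labels))
    return correct / len(one_hot_labels)


def max_perturbation(x_clean, x_adv):
    """Largest absolute per-pixel difference between two sets of images."""
    return max(abs(a - c)
               for clean_img, adv_img in zip(x_clean, x_adv)
               for c, a in zip(clean_img, adv_img))


# Toy stand-ins for model outputs and flattened images (assumed values).
labels = [[1, 0], [0, 1]]
preds_clean = [[0.9, 0.1], [0.2, 0.8]]
preds_adv = [[0.4, 0.6], [0.3, 0.7]]
x_clean = [[0.0, 1.0], [0.5, 0.5]]
x_adv = [[0.5, 0.5], [1.0, 0.0]]

print("clean accuracy:", accuracy(preds_clean, labels))
print("adversarial accuracy:", accuracy(preds_adv, labels))
print("max perturbation:", max_perturbation(x_clean, x_adv))
```

If the maximum perturbation is 0, the images were never modified. For FGSM with eps=0.5 the largest per-pixel change should be at most 0.5, and predictions have to be compared against the labels of the same 100 test images.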

 

 

Adversarially train a robust classifier

As I understand it so far, a robust classifier is a classifier that still classifies correctly even when given adversarial examples.

(My grasp of AI security concepts is not complete yet, so this may be a misunderstanding. If anything here is wrong, please leave a comment!)

 

# Same architecture as the model used above, except for the number of units in the first dense layer
path = get_file('mnist_cnn_robust.h5', extract=False, path=config.ART_DATA_PATH,
                url='https://www.dropbox.com/s/yutsncaniiy5uy8/mnist_cnn_robust.h5?dl=1')
robust_classifier_model = load_model(path)
robust_classifier = KerasClassifier(clip_values=(min_, max_), model=robust_classifier_model, use_logits=False)

 

(Left) baseline classifier (Right) robust classifier

 

attacks = BasicIterativeMethod(robust_classifier, eps=0.3, eps_step=0.01, max_iter=40)

# We had performed this before, starting with a randomly initialized model.
# Adversarial training takes about 80 minutes on an NVIDIA V100.
# The resulting model is the one loaded from mnist_cnn_robust.h5 above.

# Here is the command we had used for the Adversarial Training

trainer = AdversarialTrainer(robust_classifier, attacks, ratio=1.0)
trainer.fit(x_train, y_train, nb_epochs=83, batch_size=50)

1. attacks = BasicIterativeMethod(robust_classifier, eps=0.3, eps_step=0.01, max_iter=40)

BasicIterativeMethod implements the iterative version of FGM and FGSM. The Basic Iterative Method is described in the paper "Adversarial Examples in the Physical World", which I plan to read sometime this week.
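As a rough illustration of what "iterative version" means, the following pure-Python sketch repeats a small signed-gradient step of size eps_step and, after every step, projects the result back into the eps-ball around the original input. It uses a toy logistic-regression model with invented weights and helper names; it is a sketch of the general technique, not ART's BasicIterativeMethod code.

```python
import math


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def input_gradient(x, y, w, b):
    """Gradient of the logistic cross-entropy loss with respect to x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def basic_iterative_method(x, y, w, b, eps, eps_step, max_iter,
                           clip_min=0.0, clip_max=1.0):
    """Repeat small FGSM steps, keeping the result within eps of x."""
    x_adv = list(x)
    for _ in range(max_iter):
        grad = input_gradient(x_adv, y, w, b)
        stepped = [xi + eps_step * sign(gi) for xi, gi in zip(x_adv, grad)]
        # Project onto the eps-ball around the original input,
        # then clip to the valid pixel range.
        x_adv = [min(clip_max, max(clip_min, min(x0 + eps, max(x0 - eps, s))))
                 for x0, s in zip(x, stepped)]
    return x_adv


# Toy model and input (assumed values); attack parameters as in the post.
w, b = [2.0, -1.0, 0.5], -0.2
x, y = [0.6, 0.2, 0.9], 1

x_adv = basic_iterative_method(x, y, w, b, eps=0.3, eps_step=0.01, max_iter=40)
print(x_adv)
```

With eps_step=0.01 and max_iter=40 the steps could add up to 0.4, so the projection is what keeps the total perturbation within eps=0.3.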

 

2. trainer = AdversarialTrainer(robust_classifier, attacks, ratio=1.0)

art/defences/trainer → adversarial_trainer.py

 

The AdversarialTrainer class is described as "a class that performs adversarial training based on a model architecture and one or more attack methods". That sentence is just the docstring put through Papago as-is, and I do not really understand it yet 🙂 For now I will only take a brief look at the parameters.

 

◽ classifier: the model to be adversarially trained

◽ attacks: the attack(s) used for data augmentation during adversarial training

◽ ratio: the proportion of samples in each batch to replace with their adversarial counterparts. A value of 1 means training only on adversarial samples
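The effect of ratio can be sketched in a few lines of pure Python: pick a share of the batch and swap those samples for their adversarial versions before the training step. The function name, the stand-in attack, and the rounding rule (ceil) are assumptions made for illustration and may differ from what ART does internally.

```python
import math
import random


def mix_adversarial(x_batch, make_adversarial, ratio, rng):
    """Replace a `ratio` share of the batch with adversarial counterparts."""
    n = len(x_batch)
    # Assumed rounding rule: round the number of replaced samples up.
    n_adv = min(n, math.ceil(ratio * n))
    adv_ids = set(rng.sample(range(n), n_adv))
    return [make_adversarial(x) if i in adv_ids else x
            for i, x in enumerate(x_batch)]


def perturb(x):
    """Stand-in for a real attack: shifts the value so it is easy to spot."""
    return x + 100.0


batch = [0.1, 0.2, 0.3, 0.4]
rng = random.Random(0)

half = mix_adversarial(batch, perturb, ratio=0.5, rng=rng)
full = mix_adversarial(batch, perturb, ratio=1.0, rng=rng)
print(half)
print(full)
```

With ratio=1.0, as in the post, every sample in the batch is replaced, so the model sees only adversarial inputs during training.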

 

 

✅ In the original code, the two training-related lines are commented out. Training reportedly took 80 minutes on an NVIDIA V100, but on our lab machine (NVIDIA GeForce RTX 3090) it looks like it will take a bit over 6 hours. (It took about 2 hours to reach 25 of the 83 epochs, and it is still running.)
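The estimate follows from simple proportional arithmetic, assuming every epoch takes roughly the same time. The function name below is made up for illustration:

```python
def estimate_total_hours(epochs_done, hours_elapsed, total_epochs):
    """Extrapolate total training time from the progress so far."""
    hours_per_epoch = hours_elapsed / epochs_done
    return hours_per_epoch * total_epochs


# Numbers from the post: 25 of 83 epochs finished after about 2 hours.
total = estimate_total_hours(epochs_done=25, hours_elapsed=2.0, total_epochs=83)
print(f"estimated total: {total:.2f} hours")
print(f"remaining: {total - 2.0:.2f} hours")
```

That works out to roughly 6.6 hours in total, which is consistent with the "a bit over 6 hours" guess.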

 

 

Evaluate the robust classifier

There is no major difference from the "Evaluate baseline classifier" part.