[ART] attack_defence_imagenet.ipynb 코드 실습

👩‍💻

[ART] attack_defence_imagenet.ipynb 코드 실습

geum 2022. 1. 18. 13:37

원본 코드를 돌려보고 끝내는 것이 아니라 attack/defence에 대한 코드를 조금씩 바꿔보면서 실습을 진행했다.

✅ 코드 원본 :

https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/main/notebooks/attack_defence_imagenet.ipynb

GitHub - Trusted-AI/adversarial-robustness-toolbox: Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning S

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams - GitHub - Trusted-AI/adversarial-robustness-too...

github.com

Load Images

image_list = []

# imagenet_stubs.get_image_paths()에서 인덱스와 이미지 경로 가져오기
for i, image_path in enumerate(imagenet_stubs.get_image_paths()):
    img = image.load_img(image_path, target_size=(224, 224))
    img = image.img_to_array(img)
    
    image_list.append(img)
    
    if 'koala.jpg' in image_path:
        koala_idx = i

imgs = np.array(image_list)

처음에 if 'koala.jpg' in image_path: 부분에서 ImageNet에 있겠지 싶은 사진 이름 아무거나 넣었다. 'car.jpg' 넣었는데 car_idx = i 부분에서 에러가 나길래 imagenet_stubs.get_image_paths()를 찾아보았다.

아무 사진이나 넣는 게 아니고 https://github.com/nottombrown/imagenet-stubs/tree/master/imagenet_stubs/images 이 폴더에 있는 이미지를 넣어줘야 했다.

# koala_idx 출력 결과 = 5
idx = koala_idx

이미지가 잘 나오는 것을 확인할 수 있다.

Load ResNet50 Classifier

model = ResNet50(weights='imagenet')

ResNet50 구조를 사용하기 위해 keras에 내장되어 있는 ResNet50을 로드한다. weights 파라미터는 None(랜덤 초기화), imagenet(ImageNet에서 pre-training), 가중치 파일 경로 세 가지 중에서 원하는 것으로 지정해주면 된다.

# ResNet50에 적용하기 위해 이미지 차원 추가 및 전처리
x = np.expand_dims(imgs[idx].copy(), axis=0)
x = preprocess_input(x)

pred = model.predict(x)
label = np.argmax(pred, axis=1)[0]
confidence = pred[:, label][0]

차원 개념이 아직도 너무 어렵다 😭

모델이 자기가 예측한 값에 대해 정답이라고 확신하고 있고 예측 결과도 정확한 것 확인!

Create Adversarial Sample

# 공격자 생성
adv = ProjectedGradientDescent(classifier, targeted=False, max_iter=10, eps_step=1, eps=5)

# adversarial sample 생성
x_art_adv = adv.generate(x_art)

PGD를 이용해서 공격자와 adversarial example을 생성한다. targeted=False로 지정하고 untargeted attack에 대해서 먼저 확인해보기로 한다.

1. Untargeted Attack

plt.figure(figsize=(8, 8)); plt.imshow(x_art_adv[0]/255); plt.axis('off'); plt.show()

pred_adv = classifier.predict(x_art_adv)
label_adv = np.argmax(pred_adv, axis=1)[0]
confidence_adv = pred_adv[:, label_adv][0]

print(f'Prediction: {label_to_name(label_adv)} - confidence: {confidence_adv}')

눈으로 봤을 때는 그냥 코알라 사진이지만 모델은 낮지 않은 confidence를 가지고 족제비로 분류했다.

2. Targeted Attack

Targeted attack은 분류기가 특정 레이블로 오분류하도록 원하는 레이블을 지정해준다. 레이블이 모두 1000개이기 때문에 그 중 하나를 선택해 공격을 진행한다.

# target label = label 84 peacock
target_label = 84

adv.set_params(targeted=True)

# adversarial sample 생성
x_art_adv = adv.generate(x_art, y=to_categorical([target_label]))

plt.figure(figsize=(8, 8)); plt.imshow(x_art_adv[0]/255); plt.axis('off'); plt.show()

분류기가 adversarial sample을 분류한 결과를 확인해보면 target label로 예측하는 것을 확인할 수 있다.

Apply Defences

adversarial example이 들어와도 제대로 분류할 수 있도록 해당 공격에 대한 방어를 적용해보기로 한다. 원본 코드에서 사용한 방어 방식은 'Spatial Smoothing'인데, 아직 방어에 대한 공부는 따로 안했기 때문에 spatial smoothing 방식을 그대로 사용했다.

✅ Spatial Smoothing

◽ 이미지 프로세싱에서 잡음(noise)를 줄이기 위해 사용하는 방법

◽ blur라고 부르기도 함

◽ 각 픽셀을 smooth하게 만드는 데에 사용되는 영역 범위에 따라 local smoothing, non-local smoothing으로 나뉨

# 각 픽셀 위로 sliding-window를 진행하기 때문에 window_size 지정 가능
ss = SpatialSmoothing(window_size=3)

# 원본 입력과 adversarial sample에 defences 적용
x_art_def, _ = ss(x_art)
x_art_adv_def, _ = ss(x_art_adv)

# Compute the classifier predictions on the preprocessed inputs:
pred_def = classifier.predict(x_art_def)
label_def = np.argmax(pred_def, axis=1)[0]
confidence_def = pred_def[:, label_def][0]

ss(x_art), ss(x_art_adv)를 할당할 때 둘 다 변수명이 아니라 _ 표시가 들어가는데 저 부분을 지워봤더니 배열이었나 뭐 크기가 안 맞다고 에러가 발생했다. 개인적인 생각인데 뒤쪽 _ 부분은 딱히 역할이 있는 것 같지 않다.

pred_adv_def = classifier.predict(x_art_adv_def)
label_adv_def = np.argmax(pred_adv_def, axis=1)[0]
confidence_adv_def = pred_adv_def[:, label_adv_def][0]

# Print the predictions:
print('Prediction of original sample:', label_to_name(label_def), '- confidence {0:.2f}'.format(confidence_def))
print('Prediction of adversarial sample:', label_to_name(label_adv_def), 
      '- confidence {0:.2f}'.format(confidence_adv_def))

# Show the preprocessed adversarial sample:
plt.figure(figsize=(8,8)); plt.imshow(x_art_adv_def[0] / 255); plt.axis('off'); plt.show()

출력된 이미지는 원본 코알라 사진과 비교하면 확실히 블러 효과가 들어갔고 adversarial sample에 대한 예측 결과는 original sample과 동일하다. perturbation이 추가된 이미지에 spatial smoothing를 적용해서 perturbation을 상쇄(라는 표현이 맞는지 모르겠지만)해주는 방식으로 이해했다.

Adaptive Whitebox Attack to Defeat Defences

세상에 완벽한 방어는 없다고 했다. 방어법이 있으면 다시 그걸 깨는 공격이 있는 법!

# defences를 포함하는 분류기 생성
classifier_def = KerasClassifier(preprocessing=preprocessor, preprocessing_defences=[ss], clip_values=(0, 255), 
                                 model=model)

위쪽에서 사용한 classifier와 이 부분에서 사용하는 classifier의 차이점은 preprocessing_defences 파라미터의 포함 여부이다. 분류기에 적용될 preprocessing defence(s)를 지정해주면 된다.

# Create the attacker.
# Note: here we use a larger number of iterations to achieve the same level of confidence in the misclassification
adv_def = ProjectedGradientDescent(classifier_def, targeted=True, max_iter=40, eps_step=1, eps=5)

# Generate the adversarial sample:
x_art_adv_def = adv_def.generate(x_art, y=to_categorical([target_label]))

# Plot the adversarial sample (note: we swap color channels back to RGB order):
plt.figure(figsize=(8,8)); plt.imshow(x_art_adv_def[0] / 255); plt.axis('off'); plt.show()

# And apply the classifier to it:
pred_adv = classifier_def.predict(x_art_adv_def)
label_adv = np.argmax(pred_adv, axis=1)[0]
confidence_adv = pred_adv[:, label_adv][0]
print('Prediction:', label_to_name(label_adv), '- confidence {0:.2f}'.format(confidence_adv))

공격자와 adversarial sample 생성 및 예측 코드는 이전에 사용한 코드와 크게 다르지 않다.

예측 결과가 다시 peacock로 나온다.

저작자표시 비영리 변경금지

'👩‍💻' 카테고리의 다른 글

[코드 리뷰] 노년층 대화 감성 분류 모델 구현 (1) : CNN (0)	2022.12.13
[ART] attack_adversarial_patch_TensorFlowV2.ipynb 코드 분석 (0)	2022.01.19
[ART] adversarial_training_mnist.ipynb 코드 분석 (0)	2022.01.12
[ART] ART for TensorFlow v2 - Callable 코드 분석 (0)	2022.01.03
[ART] ART for TensorFlow v2 - Keras API 코드 분석 (0)	2021.12.31

현재글[ART] attack_defence_imagenet.ipynb 코드 실습

nsbg 🌞

my life is nsbg

Today :
Yesterday :

nsbg 🌞