[Adversarial Example 논문리뷰] MagNet

🌌 Deep Learning/논문 리뷰 [KOR]

[Adversarial Example 논문리뷰] MagNet

복만 2020. 2. 8. 21:45

Meng, Dongyu, and Hao Chen. "Magnet: a two-pronged defense against adversarial examples." Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017.

이미지 출처 : https://www.youtube.com/watch?v=wZ-wIdAcWQE

2017년에 제안된, autoencoder을 이용한 adversarial defense 방법.

개요

- Adversarial attack에 대한 defense 방법.

- Autoencoder을 이용하여 "sanitize" 한 input을 classifier에 넣는다.

장점

- target classifier의 구조를 수정하지 않음

- 특정 attack에 specific하지 않고, 어떤 attack에도 적용 가능

Hypothesis

1) Classifier은 input들이 있는 manifold를 학습하여, 해당 manifold 위에 있는 input들은 제대로 분류한다.

2) 일부 adversarial example들은 manifold에서 아예 멀리 벗어나 있다. ->분류 안됨

3) 일부 adversarial example들은 manifold에서 조금 벗어나 있어 classifier가 다르게 분류한다.

Idea

- 2)의 경우는 detector을 통해 reject 한다.

- 3)의 경우는 reformer을 통해 sanitize 한다.

- 정리하자면 다음 그림과 같다.

Design

- Detector과 Reformer로는 autoencoder을 이용한다.

- Autoencoder는 normal example에 대해 학습시킨다.

- Detector은 는 1) input의 reconstruction error을 조사하거나, 2) probability divergence를 조사하는 두 가지 방법이 있다.

1) Detector based on reconstruction error

- input X를 미리 학습한 autoencoder을 이용하여 재구성한다(X').

- 둘의 reconstruction error을 조사하여, threshold보다 작을 시 통과한다.

- reconstruction error : encoder로 encode된 image가 decoder로 재구성될 때의 비용. 이 값이 크면 adversarial image로 판단함.

2) Detector based on probability divergence

- 1)의 방법은 Reconstruction error가 작은 image의 경우, detection이 잘 되지 않을 수 있음.

- Autoencoder의 성능에 의존하는 1)의 방법과 다르게, target classifier의 성능을 이용하는 방법.

- 우선, input X를 미리 학습한 autoencoder을 이용하여 재구성한다(X').

- X와 X'을 classifier에 넣고, 그 결과의 divergence가 threshold보다 작을 시 통과한다.

- Reformer은 autoencoder로 input을 재구성한다.

- Reformer에서 나온 output을 classifier에 넣는다.

Implementation

Performance

- MNIST, CIFAR 두 가지 dataset에 대하여 공격과 방어를 수행하였고, 모든 경우에 성능이 올라간 것을 볼 수 있다.

'🌌 Deep Learning > 논문 리뷰 [KOR]' 카테고리의 다른 글

[딥러닝 논문리뷰] Deep Double Descent: Where Bigger Models and More Data Hurt (ICLR 2020) (0)	2020.12.22
[딥러닝 논문리뷰] Class-Balanced Loss Based on Effective Number of Samples (CVPR 2019) (1)	2020.12.17
[딥러닝 논문리뷰] Mixed Precision Training (ICLR 2018) (0)	2020.12.10
[Adversarial Example 논문리뷰] MagNet and "Efficient Defenses Against Adversarial Attacks (0)	2020.03.23
[Adversarial Example 논문리뷰] DAPAS : Denoising Autoencoder to Prevent Adversarial attac (0)	2020.03.20

현재글[Adversarial Example 논문리뷰] MagNet

🐬

Today :
Yesterday :

일	월	화	수	목	금	토
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

IBOK