# Model performance optimization with SageMaker Neo
<a name="neo"></a>

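Neo is a capability of Amazon SageMaker AI that lets you train a machine learning model once and run it anywhere in the cloud and at the edge.

If you are new to SageMaker Neo, we recommend starting with the [Getting Started with Edge Devices](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-getting-started-edge.html) section, which gives step-by-step instructions for compiling a model and deploying it to an edge device.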

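## What is SageMaker Neo?
<a name="neo-what-it-is"></a>

Generally, optimizing machine learning models for inference on multiple platforms is difficult because you have to hand-tune each model for the specific hardware and software configuration of each platform. To get optimal performance for a given workload, you need to know the hardware architecture, instruction set, memory access patterns, and input data shapes, among other factors. In traditional software development, tools such as compilers and profilers simplify this process. In machine learning, most tools are specific to a framework or to a piece of hardware. This forces you into a manual trial-and-error process that is unreliable and unproductive.

Neo automatically optimizes Gluon, Keras, MXNet, PyTorch, TensorFlow, TensorFlow-Lite, and ONNX models for inference on Android, Linux, and Windows machines based on processors from Ambarella, ARM, Intel, Nvidia, NXP, Qualcomm, Texas Instruments, and Xilinx. Neo is tested with the computer vision models available in the model zoos of these frameworks. SageMaker Neo supports compilation and deployment for two main platforms: cloud instances (including Inferentia) and edge devices.

For more information about the supported frameworks and the cloud instance types you can deploy to, see [Supported Instance Types and Frameworks](neo-supported-cloud.md) for cloud instances.

For more information about the supported frameworks, edge devices, operating systems, chip architectures, and common machine learning models that SageMaker Neo has tested for edge devices, see [Supported Frameworks, Devices, Systems, and Architectures](neo-supported-devices-edge.md) for edge devices.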

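## How it works
<a name="neo-how-it-works"></a>

Neo consists of a compiler and a runtime. First, the Neo compilation API reads models exported from various frameworks and converts the framework-specific functions and operations into a framework-agnostic intermediate representation. Next, it performs a series of optimizations. It then generates binary code for the optimized operations, writes that code to a shared object library, and saves the model definition and parameters in separate files. Neo also provides a runtime for each target platform that loads and executes the compiled model.

![\[How Neo works in SageMaker AI.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/neo_how_it_works.png)


You can create a Neo compilation job from the SageMaker AI console, the AWS Command Line Interface (AWS CLI), a Python notebook, or the SageMaker AI SDK. For information on how to compile a model, see [Model Compilation with Neo](neo-job-compilation.md). With a few CLI commands, an API invocation, or a few clicks, you can convert a model for your chosen platform. You can then quickly deploy the model to a SageMaker AI endpoint or to an AWS IoT Greengrass device.

Neo can optimize models whose parameters are in FP32 or are quantized to INT8 or FP16 bit-width.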

**Topics**
+ [What is SageMaker Neo?](#neo-what-it-is)
+ [How it works](#neo-how-it-works)
+ [Model Compilation with Neo](neo-job-compilation.md)
+ [Cloud Instances](neo-cloud-instances.md)
+ [Edge Devices](neo-edge-devices.md)
+ [Troubleshoot Errors](neo-troubleshooting.md)

# Model Compilation with Neo
<a name="neo-job-compilation"></a>

This section shows how to create, describe, stop, and list compilation jobs. Amazon SageMaker Neo offers the following options for managing compilation jobs for machine learning models: the AWS Command Line Interface, the Amazon SageMaker AI console, or the Amazon SageMaker SDK.

**Topics**
+ [Prepare Model for Compilation](neo-compilation-preparing-model.md)
+ [Compile a Model (AWS Command Line Interface)](neo-job-compilation-cli.md)
+ [Compile a Model (Amazon SageMaker AI Console)](neo-job-compilation-console.md)
+ [Compile a Model (Amazon SageMaker AI SDK)](neo-job-compilation-sagemaker-sdk.md)

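# Prepare Model for Compilation
<a name="neo-compilation-preparing-model"></a>

SageMaker Neo requires machine learning models to satisfy specific input data shapes. The input shape required for compilation depends on the deep learning framework you use. Once your model input shape is correctly formatted, save your model according to the requirements below. After you have saved the model, compress the model artifacts.

**Topics**
+ [What input data shapes does SageMaker Neo expect?](#neo-job-compilation-expected-inputs)
+ [Saving Models for SageMaker Neo](#neo-job-compilation-how-to-save-model)

## What input data shapes does SageMaker Neo expect?
<a name="neo-job-compilation-expected-inputs"></a>

Before you compile your model, make sure it is formatted correctly. Neo expects the name and shape of the data inputs for your trained model, given in JSON format or list format. The expected inputs are framework specific.

SageMaker Neo expects the following input shapes: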

### Keras
<a name="collapsible-section-1"></a>

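Specify the name and shape (NCHW format) of the expected data inputs using a dictionary format for your trained model. Note that while Keras model artifacts should be uploaded in NHWC (channel-last) format, DataInputConfig should be specified in NCHW (channel-first) format. The required dictionary formats are as follows:
+ For one input: `{'input_1':[1,3,224,224]}`
+ For two inputs: `{'input_1': [1,3,224,224], 'input_2':[1,3,224,224]}`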
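
Because a Keras model is defined channel-last while DataInputConfig is channel-first, the shape has to be reordered before it is passed to Neo. The following is a minimal sketch of that reordering; the helper name `nhwc_to_nchw` and the variable names are illustrative assumptions, not part of any SageMaker API.

```python
def nhwc_to_nchw(shape):
    """Reorder a 4-D [N, H, W, C] shape into [N, C, H, W]."""
    if len(shape) != 4:
        raise ValueError("expected a 4-D shape, got %r" % (shape,))
    batch, height, width, channels = shape
    return [batch, channels, height, width]


# Shape as the Keras model defines it (channel-last).
keras_input_shape = [1, 224, 224, 3]

# Shape as it would be written in DataInputConfig (channel-first).
data_input_config = {"input_1": nhwc_to_nchw(keras_input_shape)}
print(data_input_config)
```

The input name (`input_1` here) must match the name of the input layer in your own model.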

### MXNet/ONNX
<a name="collapsible-section-2"></a>

Specify the name and shape (NCHW format) of the expected data inputs using a dictionary format for your trained model. The required dictionary formats are as follows:
+ For one input: `{'data':[1,3,1024,1024]}`
+ For two inputs: `{'var1': [1,1,28,28], 'var2':[1,1,28,28]}`

### PyTorch
<a name="collapsible-section-3"></a>

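For a PyTorch model, you don't need to provide the name and shape of the expected data inputs if you meet both of the following conditions:
+ You created your model definition file by using PyTorch 2.0 or later. For more information about how to create the definition file, see the [PyTorch](#how-to-save-pytorch) section under *Saving Models for SageMaker Neo*.
+ You are compiling your model for a cloud instance. For more information about the instance types that SageMaker Neo supports, see [Supported Instance Types and Frameworks](neo-supported-cloud.md).

If you meet these conditions, SageMaker Neo gets the input configuration from the model definition file (.pt or .pth) that you create with PyTorch.

Otherwise, you must do the following:

Specify the name and shape (NCHW format) of the expected data inputs using a dictionary format for your trained model. Alternatively, you can specify the shape only, using a list format. The required formats are as follows:
+ For one input in dictionary format: `{'input0':[1,3,224,224]}`
+ For one input in list format: `[[1,3,224,224]]`
+ For two inputs in dictionary format: `{'input0':[1,3,224,224], 'input1':[1,3,224,224]}`
+ For two inputs in list format: `[[1,3,224,224], [1,3,224,224]]`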
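
The dictionary and list forms above can be produced from the same list of shapes. The following sketch uses only the standard `json` module; the function `make_data_input_config` is a hypothetical helper written for this illustration, and it emits standard JSON (double-quoted keys) rather than the single-quoted notation shown in the bullets.

```python
import json


def make_data_input_config(shapes, names=None):
    """Serialize input shapes as a DataInputConfig-style string.

    With names, the result is the dictionary form; without names,
    it is the list form that carries shapes only.
    """
    shapes = [list(shape) for shape in shapes]
    if names is None:
        return json.dumps(shapes)
    if len(names) != len(shapes):
        raise ValueError("need exactly one name per input shape")
    return json.dumps(dict(zip(names, shapes)))


shapes = [[1, 3, 224, 224], [1, 3, 224, 224]]
dict_form = make_data_input_config(shapes, names=["input0", "input1"])
list_form = make_data_input_config(shapes)
print(dict_form)
print(list_form)
```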

### TensorFlow
<a name="collapsible-section-4"></a>

Specify the name and shape (NHWC format) of the expected data inputs using a dictionary format for your trained model. The required dictionary formats are as follows:
+ For one input: `{'input':[1,1024,1024,3]}`
+ For two inputs: `{'data1': [1,28,28,1], 'data2':[1,28,28,1]}`

### TFLite
<a name="collapsible-section-5"></a>

Specify the name and shape (NHWC format) of the expected data inputs using a dictionary format for your trained model. The required dictionary format is as follows:
+ For one input: `{'input':[1,224,224,3]}`

**Note**  
SageMaker Neo supports TensorFlow Lite only for edge device targets. For the list of supported SageMaker Neo edge device targets, see the SageMaker Neo [Devices](neo-supported-devices-edge-devices.md#neo-supported-edge-devices) page. For the list of supported SageMaker Neo cloud instance targets, see the SageMaker Neo [Supported Instance Types and Frameworks](neo-supported-cloud.md) page.

### XGBoost
<a name="collapsible-section-6"></a>

An input data name and shape are not needed.

## Saving Models for SageMaker Neo
<a name="neo-job-compilation-how-to-save-model"></a>

The following code examples show how to save your model so that it is compatible with Neo. Models must be packaged as compressed tar files (`*.tar.gz`).
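
Every framework below ends with the same packaging step, so it can be factored into one helper. The following is a minimal sketch that uses only the Python standard library; `package_model` is a hypothetical helper, and storing each file at the archive root is a choice made for this sketch rather than a documented Neo requirement. The demonstration writes a placeholder file in a temporary directory instead of a real model.

```python
import os
import tarfile
import tempfile


def package_model(artifact_paths, archive_path):
    """Write the given files into a gzip-compressed tar archive.

    Each file is stored at the archive root (directory prefixes are dropped).
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in artifact_paths:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path


# Demonstration with a placeholder file standing in for a real model.
with tempfile.TemporaryDirectory() as workdir:
    model_file = os.path.join(workdir, "model.pth")
    with open(model_file, "wb") as handle:
        handle.write(b"placeholder bytes, not a real model")

    archive = package_model([model_file], os.path.join(workdir, "model.tar.gz"))

    # Read the archive back to confirm what it contains.
    with tarfile.open(archive, "r:gz") as tar:
        members = tar.getnames()

print(members)
```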

### Keras
<a name="how-to-save-tf-keras"></a>

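Keras models require one model definition file (`.h5`).

There are two options for saving your Keras model so that it is compatible with SageMaker Neo:

1. Export to `.h5` format with `model.save("<model-name>", save_format="h5")`.

1. Freeze the `SavedModel` after exporting.

The following is an example of how to export a `tf.keras` model as a frozen graph (option two):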

```
import tensorflow as tf

tf.keras.backend.set_learning_phase(0)
model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3), pooling='avg')
model.summary()

# Save as a SavedModel
export_dir = 'saved_model/'
model.save(export_dir, save_format='tf')

# Freeze saved model
input_node_names = [inp.name.split(":")[0] for inp in model.inputs]
output_node_names = [output.name.split(":")[0] for output in model.outputs]
print("Input names: ", input_node_names)
with tf.Session() as sess:
    loaded = tf.saved_model.load(sess, export_dir=export_dir, tags=["serve"]) 
    frozen_graph = tf.graph_util.convert_variables_to_constants(sess,
                                                                sess.graph.as_graph_def(),
                                                                output_node_names)
    tf.io.write_graph(graph_or_graph_def=frozen_graph, logdir=".", name="frozen_graph.pb", as_text=False)

import tarfile
with tarfile.open("frozen_graph.tar.gz", "w:gz") as tar:
    tar.add("frozen_graph.pb")
```

**Warning**  
Do not export your model with the `SavedModel` class using `model.save(<path>, save_format='tf')`. This format is suitable for training, but it is not suitable for inference.

### MXNet
<a name="how-to-save-mxnet"></a>

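MXNet models must be saved as a single symbol file (`*-symbol.json`) and a single parameter file (`*.params`).

------
#### [ Gluon Models ]

Define the neural network using the `HybridSequential` class. This runs the code in the style of symbolic programming (as opposed to imperative programming).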

```
from mxnet import nd, sym
from mxnet.gluon import nn

def get_net():
    net = nn.HybridSequential()  # Here we use the class HybridSequential.
    net.add(nn.Dense(256, activation='relu'),
            nn.Dense(128, activation='relu'),
            nn.Dense(2))
    net.initialize()
    return net

# Define an input to compute a forward calculation. 
x = nd.random.normal(shape=(1, 512))
net = get_net()

# During the forward calculation, the neural network will automatically infer
# the shape of the weight parameters of all the layers based on the shape of
# the input.
net(x)
                        
# hybridize model
net.hybridize()
net(x)

# export model
net.export('<model_name>') # this will create <model_name>-symbol.json and <model_name>-0000.params files

import tarfile
tar = tarfile.open("<model_name>.tar.gz", "w:gz")
for name in ["<model_name>-0000.params", "<model_name>-symbol.json"]:
    tar.add(name)
tar.close()
```

For more information about hybridizing models, see the [MXNet hybridize documentation](https://mxnet.apache.org/versions/1.7.0/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html).

------
#### [ Gluon Model Zoo (GluonCV) ]

GluonCV model zoo models come pre-hybridized, so you can just export them.

```
import numpy as np
import mxnet as mx
import gluoncv as gcv
from gluoncv.utils import export_block
import tarfile

net = gcv.model_zoo.get_model('<model_name>', pretrained=True) # For example, choose <model_name> as resnet18_v1
export_block('<model_name>', net, preprocess=True, layout='HWC')

tar = tarfile.open("<model_name>.tar.gz", "w:gz")

for name in ["<model_name>-0000.params", "<model_name>-symbol.json"]:
    tar.add(name)
tar.close()
```

------
#### [ Non Gluon Models ]

All non-Gluon models use `*-symbol` and `*.params` files when saved to disk. They are therefore already in the correct format for Neo.

```
# Pass the following 3 parameters: sym, args, aux
mx.model.save_checkpoint('<model_name>',0,sym,args,aux) # this will create <model_name>-symbol.json and <model_name>-0000.params files

import tarfile
tar = tarfile.open("<model_name>.tar.gz", "w:gz")

for name in ["<model_name>-0000.params", "<model_name>-symbol.json"]:
    tar.add(name)
tar.close()
```

------

### PyTorch
<a name="how-to-save-pytorch"></a>

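PyTorch models must be saved as a definition file (`.pt` or `.pth`) with an input datatype of `float32`.

To save your model, use the `torch.jit.trace` method followed by the `torch.save` method. This process saves an object to a disk file and by default uses python pickle (`pickle_module=pickle`) to save the objects and some metadata. (The example below saves the traced module by calling its own `save` method.) Next, convert the saved model to a compressed tar file.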

```
import torchvision
import torch

model = torchvision.models.resnet18(pretrained=True)
model.eval()
inp = torch.rand(1, 3, 224, 224)
model_trace = torch.jit.trace(model, inp)

# Save your model. The following code saves it with the .pth file extension
model_trace.save('model.pth')

# Save as a compressed tar file (the with block closes the archive)
import tarfile
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('model.pth')
```

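If you save your model by using PyTorch 2.0 or later, SageMaker Neo derives the input configuration for the model (the name and shape of its input) from the definition file. In that case, you don't need to specify the data input configuration to SageMaker AI when you compile the model.

If you want to prevent SageMaker Neo from deriving the input configuration, you can set the `_store_inputs` parameter of `torch.jit.trace` to `False`. If you do this, you must specify the data input configuration to SageMaker AI when you compile the model.

For more information about the `torch.jit.trace` method, see [TORCH.JIT.TRACE](https://pytorch.org/docs/stable/generated/torch.jit.trace.html#torch.jit.trace) in the PyTorch documentation.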

### TensorFlow
<a name="how-to-save-tf"></a>

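TensorFlow requires one `.pb` or one `.pbtxt` file and a variables directory that contains variables. For frozen models, only one `.pb` or `.pbtxt` file is required.

The following code example shows how to use the tar Linux command to compress your model. Run the following in your terminal or in a Jupyter notebook (if you use a Jupyter notebook, insert the `!` magic command at the beginning of the statement):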

```
# Download SSD_Mobilenet trained model
!wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz

# Extract the downloaded tar file
!tar xvf ssd_mobilenet_v2_coco_2018_03_29.tar.gz

# Package the frozen graph into a compressed tar file named 'model.tar.gz'
!tar czvf model.tar.gz ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb
```

The command flags used in this example accomplish the following:
+ `c`: Create an archive
+ `z`: Compress the archive with gzip
+ `v`: Display archive progress
+ `f`: Specify the filename of the archive

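### Built-In Estimators
<a name="how-to-save-built-in"></a>

Built-in estimators are made either by framework-specific containers or by algorithm-specific containers. Estimator objects for both built-in algorithms and framework-specific estimators save the model in the correct format for you when you train the model using the built-in `.fit` method.

For example, you can use `sagemaker.tensorflow.TensorFlow` to define a TensorFlow estimator: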

```
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(entry_point='mnist.py',
                        role=role,  #param role can be arn of a sagemaker execution role
                        framework_version='1.15.3',
                        py_version='py3',
                        training_steps=1000, 
                        evaluation_steps=100,
                        instance_count=2,
                        instance_type='ml.c4.xlarge')
```

Then train the model with the built-in `.fit` method:

```
estimator.fit(inputs)
```

Finally, compile the model with the built-in `compile_model` method:

```
# Specify output path of the compiled model
output_path = '/'.join(estimator.output_path.split('/')[:-1])

# Compile model
optimized_estimator = estimator.compile_model(target_instance_family='ml_c5', 
                              input_shape={'data':[1, 784]},  # Batch size 1, 784 features (a flattened 28x28 image).
                              output_path=output_path,
                              framework='tensorflow', framework_version='1.15.3')
```

You can also use the `sagemaker.estimator.Estimator` class to initialize an estimator object for training and compiling a built-in algorithm with the `compile_model` method from the SageMaker Python SDK:

```
import sagemaker
from sagemaker.image_uris import retrieve
sagemaker_session = sagemaker.Session()
aws_region = sagemaker_session.boto_region_name

# Specify built-in algorithm training image
training_image = retrieve(framework='image-classification', 
                          region=aws_region, image_scope='training')


# Create estimator object for training
estimator = sagemaker.estimator.Estimator(image_uri=training_image,
                                          role=role,  #param role can be arn of a sagemaker execution role
                                          instance_count=1,
                                          instance_type='ml.p3.8xlarge',
                                          volume_size = 50,
                                          max_run = 360000,
                                          input_mode= 'File',
                                          output_path=s3_training_output_location,
                                          base_job_name='image-classification-training'
                                          )
                                          
# Setup the input data_channels to be used later for training.                                          
train_data = sagemaker.inputs.TrainingInput(s3_training_data_location,
                                            content_type='application/x-recordio',
                                            s3_data_type='S3Prefix')
validation_data = sagemaker.inputs.TrainingInput(s3_validation_data_location,
                                                content_type='application/x-recordio',
                                                s3_data_type='S3Prefix')
data_channels = {'train': train_data, 'validation': validation_data}


# Train model
estimator.fit(inputs=data_channels, logs=True)

# Compile model with Neo                                                                                  
optimized_estimator = estimator.compile_model(target_instance_family='ml_c5',
                                          input_shape={'data':[1, 3, 224, 224], 'softmax_label':[1]},
                                          output_path=s3_compilation_output_location,
                                          framework='mxnet',
                                          framework_version='1.7')
```

For more information about compiling models with the SageMaker Python SDK, see [Compile a Model (Amazon SageMaker AI SDK)](neo-job-compilation-sagemaker-sdk.md).

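# Compile a Model (AWS Command Line Interface)
<a name="neo-job-compilation-cli"></a>

This section shows how to manage Amazon SageMaker Neo compilation jobs for machine learning models using the AWS Command Line Interface (CLI). You can create, describe, stop, and list compilation jobs.

1. Create a Compilation Job

   With the [CreateCompilationJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateCompilationJob.html) API operation, you can specify the data input format, the S3 bucket in which your model is stored, the S3 bucket to which the compiled model is written, and the target hardware device or platform.

   The following examples show how to configure the `CreateCompilationJob` API depending on whether your target is a device or a platform.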

------
#### [ Device Example ]

   ```
   {
       "CompilationJobName": "neo-compilation-job-demo",
       "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
       "InputConfig": {
           "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
           "DataInputConfig":  "{'data': [1,3,1024,1024]}",
           "Framework": "MXNET"
       },
       "OutputConfig": {
           "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
           # A target device specification example for a ml_c5 instance family
           "TargetDevice": "ml_c5"
       },
       "StoppingCondition": {
           "MaxRuntimeInSeconds": 300
       }
   }
   ```

   If you trained your model with the PyTorch framework and your target device is an `ml_*` target, you can optionally specify the framework version you used in the [`FrameworkVersion`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_InputConfig.html#sagemaker-Type-InputConfig-FrameworkVersion) field.

   ```
   {
       "CompilationJobName": "neo-compilation-job-demo",
       "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
       "InputConfig": {
           "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
           "DataInputConfig":  "{'data': [1,3,1024,1024]}",
           "Framework": "PYTORCH",
           "FrameworkVersion": "1.6"
       },
       "OutputConfig": {
           "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
           # A target device specification example for a ml_c5 instance family
           "TargetDevice": "ml_c5",
           # When compiling for ml_* instances using PyTorch framework, use the "CompilerOptions" field in 
           # OutputConfig to provide the correct data type ("dtype") of the model’s input. Default assumed is "float32"
           "CompilerOptions": "{'dtype': 'long'}"
       },
       "StoppingCondition": {
           "MaxRuntimeInSeconds": 300
       }
   }
   ```

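**Notes:**  
If you saved your model by using PyTorch version 2.0 or later, the `DataInputConfig` field is optional. SageMaker Neo gets the input configuration from the model definition file that you create with PyTorch. For more information about how to create the definition file, see the [PyTorch](neo-compilation-preparing-model.md#how-to-save-pytorch) section under *Saving Models for SageMaker Neo*.  
This API field is only supported for PyTorch.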

------
#### [ Platform Example ]

   ```
   {
       "CompilationJobName": "neo-test-compilation-job",
       "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
       "InputConfig": {
           "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
           "DataInputConfig":  "{'data': [1,3,1024,1024]}",
           "Framework": "MXNET"
       },
       "OutputConfig": {
           "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
           # A target platform configuration example for a p3.2xlarge instance
           "TargetPlatform": {
               "Os": "LINUX",
               "Arch": "X86_64",
               "Accelerator": "NVIDIA"
           },
           "CompilerOptions": "{'cuda-ver': '10.0', 'trt-ver': '6.0.1', 'gpu-code': 'sm_70'}"
       },
       "StoppingCondition": {
           "MaxRuntimeInSeconds": 300
       }
   }
   ```

------
**Note**  
In `OutputConfig`, the `TargetDevice` and `TargetPlatform` fields are mutually exclusive. You must specify exactly one of them.

   To find JSON string examples of `DataInputConfig` for each framework, see [What input data shapes Neo expects](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-compilation.html#neo-troubleshooting-errors-preventing).

   For more information about the configuration settings, see the [InputConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_InputConfig.html), [OutputConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OutputConfig.html), and [TargetPlatform](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_TargetPlatform.html) data types in the SageMaker API reference.

1. After you have configured the JSON file, run the following command to create the compilation job:

   ```
   aws sagemaker create-compilation-job \
   --cli-input-json file://job.json \
   --region us-west-2 
   
   # You should get CompilationJobArn
   ```

1. Describe the compilation job by running the following command:

   ```
   aws sagemaker describe-compilation-job \
   --compilation-job-name $JOB_NM \
   --region us-west-2
   ```

1. Stop the compilation job by running the following command:

   ```
   aws sagemaker stop-compilation-job \
   --compilation-job-name $JOB_NM \
   --region us-west-2
   
   # The stop-compilation-job operation returns no output
   ```

1. List the compilation jobs by running the following command:

   ```
   aws sagemaker list-compilation-jobs \
   --region us-west-2
   ```
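
The request file passed with `--cli-input-json` must be plain JSON, which does not allow `#` comments like the ones used for annotation in the examples above. One way to avoid hand-editing is to generate the file from Python. The following is a minimal sketch under that assumption: `build_compilation_request` is a hypothetical helper, the field names are the ones shown in the examples above, and the account, role, and bucket values are placeholders.

```python
import json


def build_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                              framework, data_input_config=None,
                              target_device=None, target_platform=None,
                              compiler_options=None, max_runtime_seconds=300):
    """Assemble a CreateCompilationJob request as a plain dictionary."""
    # TargetDevice and TargetPlatform are mutually exclusive.
    if (target_device is None) == (target_platform is None):
        raise ValueError("set exactly one of target_device or target_platform")

    input_config = {"S3Uri": model_s3_uri, "Framework": framework}
    if data_input_config is not None:
        # DataInputConfig is passed as a JSON string, not a nested object.
        input_config["DataInputConfig"] = json.dumps(data_input_config)

    output_config = {"S3OutputLocation": output_s3_uri}
    if target_device is not None:
        output_config["TargetDevice"] = target_device
    else:
        output_config["TargetPlatform"] = target_platform
    if compiler_options is not None:
        output_config["CompilerOptions"] = json.dumps(compiler_options)

    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": input_config,
        "OutputConfig": output_config,
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_seconds},
    }


request = build_compilation_request(
    job_name="neo-compilation-job-demo",
    role_arn="arn:aws:iam::<your-account>:role/<your-role>",  # placeholder
    model_s3_uri="s3://<your-bucket>/<path-to-model>",        # placeholder
    output_s3_uri="s3://<your-bucket>/<compile-output>",      # placeholder
    framework="MXNET",
    data_input_config={"data": [1, 3, 1024, 1024]},
    target_device="ml_c5",
)
print(json.dumps(request, indent=4))
```

Saving the printed output as `job.json` gives a file in the shape that the `create-compilation-job` command above reads.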

# Compile a Model (Amazon SageMaker AI Console)
<a name="neo-job-compilation-console"></a>

You can create an Amazon SageMaker Neo compilation job in the Amazon SageMaker AI console.

1. In the **Amazon SageMaker AI** console, choose **Compilation jobs**, and then choose **Create compilation job**.  
![\[Create a compilation job.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/8-create-compilation-job.png)

1. On the **Create compilation job** page, enter a name under **Job name**. Then select an **IAM role**.  
![\[Create compilation job page.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/9-create-compilation-job-config.png)

1. If you don't have an IAM role yet, choose **Create a new role**.  
![\[Create IAM role location.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/10a-create-iam-role.png)

1. On the **Create an IAM role** page, choose **Any S3 bucket**, and then choose **Create role**.  
![\[Create IAM role page.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/10-create-iam-role.png)

1. 

------
#### [ Non PyTorch Frameworks ]

   In the **Input configuration** section, enter the full Amazon S3 URI of your model artifacts in the **Location of model artifacts** input field. Your model artifacts must be in a compressed tarball file format (`.tar.gz`).

   In the **Data input configuration** field, enter the JSON string that specifies the shape of the input data.

   For **Machine learning framework**, choose the framework you used.

![\[Input configuration page.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/neo-create-compilation-job-input-config.png)


   To find JSON string examples of input data shapes for each framework, see [What input data shapes Neo expects](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting.html#neo-troubleshooting-errors-preventing).

------
#### [ PyTorch Framework ]

   Similar instructions apply to compiling PyTorch models. However, if you trained with PyTorch and are compiling the model for an `ml_*` target (except `ml_inf`), you can optionally specify the version of PyTorch you used.

![\[Example Input configuration section showing where to choose the framework version.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/compile_console_pytorch.png)


   To find JSON string examples of input data shapes for each framework, see [What input data shapes Neo expects](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting.html#neo-troubleshooting-errors-preventing).

**Notes**  
If you saved your model by using PyTorch version 2.0 or later, the **Data input configuration** field is optional. SageMaker Neo gets the input configuration from the model definition file that you create with PyTorch. For more information about how to create the definition file, see the [PyTorch](neo-compilation-preparing-model.md#how-to-save-pytorch) section under *Saving Models for SageMaker Neo*.  
When compiling for `ml_*` instances using the PyTorch framework, use the **Compiler options** field in **Output configuration** to provide the correct data type (`dtype`) of the model's input. The default is `"float32"`.

![\[Example Output configuration section.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/neo_compilation_console_pytorch_compiler_options.png)


**Warning**  
 If you specify an Amazon S3 URI that points to a `.pth` file, you receive the following error after starting compilation: `ClientError: InputConfiguration: Unable to untar input model.Please confirm the model is a tar.gz file` 

------

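1.  Go to the **Output configuration** section. Choose where you want to deploy your model. You can deploy your model to a **Target device** or a **Target platform**. Target devices include cloud and edge devices. Target platforms refer to the specific OS, architecture, and accelerators you want your model to run on.

    For **S3 Output location**, enter the path to the S3 bucket where you want to store the model. You can optionally add compiler options in JSON format under the **Compiler options** section.  
![\[Output configuration page.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/neo-console-output-config.png)

1. Check the status of the compilation job when it starts. The job status appears at the top of the **Compilation jobs** page, as shown in the following screenshot. You can also check it in the **Status** column.  
![\[Compilation job status.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/12-run-model-compilation.png)

1. Check the status of the compilation job when it completes. You can check the status in the **Status** column, as shown in the following screenshot.  
![\[Compilation job status.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo/12a-completed-model-compilation.png)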

# Compile a Model (Amazon SageMaker AI SDK)
<a name="neo-job-compilation-sagemaker-sdk"></a>

You can use the [`compile_model`](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html?#sagemaker.estimator.Estimator.compile_model) API in the [Amazon SageMaker AI SDK for Python](https://sagemaker.readthedocs.io/en/stable/) to compile a trained model and optimize it for specific target hardware. Invoke the API on the estimator object that was used during model training.

**Note**  
When compiling a model with MXNet or PyTorch, you must set the `MMS_DEFAULT_RESPONSE_TIMEOUT` environment variable to `500`. The environment variable is not needed for TensorFlow.

The following is an example of how to compile a model using the `trained_model_estimator` object:

```
# Replace the value of expected_trained_model_input below and
# specify the name & shape of the expected inputs for your trained model
# in json dictionary form
expected_trained_model_input = {'data':[1, 784]}

# Replace the example target_instance_family below to your preferred target_instance_family
compiled_model = trained_model_estimator.compile_model(target_instance_family='ml_c5',
        input_shape=expected_trained_model_input,
        output_path='insert s3 output path',
        env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'})
```

The code compiles the model, saves the optimized model at `output_path`, and creates a SageMaker AI model that can be deployed to an endpoint.

# Cloud Instances
<a name="neo-cloud-instances"></a>

Amazon SageMaker Neo provides compilation support for popular machine learning frameworks such as TensorFlow, PyTorch, and MXNet. You can deploy your compiled model to cloud instances and AWS Inferentia instances. For a full list of supported frameworks, see [Supported Instance Types and Frameworks](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-cloud.html).

You can compile your model in one of three ways: through the AWS CLI, the SageMaker AI console, or the SageMaker AI SDK for Python. For more information, see [Use Neo to Compile a Model](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html). The compiled model artifacts are stored in the Amazon S3 bucket URI you specified for the compilation job. You can deploy your compiled model to cloud instances and AWS Inferentia instances using the SageMaker AI SDK for Python, the AWS SDK for Python (Boto3), the AWS CLI, or the AWS console.

If you deploy your model using the AWS CLI, the console, or Boto3, you must select a Docker image Amazon ECR URI for your primary container. For a list of Amazon ECR URIs, see [Neo Inference Container Images](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html).

**Topics**
+ [Supported Instance Types and Frameworks](neo-supported-cloud.md)
+ [Deploy a Model](neo-deployment-hosting-services.md)
+ [Inference Requests With a Deployed Service](neo-requests.md)
+ [Inference Container Images](neo-deployment-hosting-services-container-images.md)

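# Supported Instance Types and Frameworks
<a name="neo-supported-cloud"></a>

Amazon SageMaker Neo supports popular deep learning frameworks for both compilation and deployment. You can deploy your model to cloud instances or AWS Inferentia instance types.

The following describes the frameworks SageMaker Neo supports and the target cloud instances you can compile for and deploy to. For information on how to deploy your compiled model to a cloud or Inferentia instance, see [Deploy a Model with Cloud Instances](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services.html).

## Cloud Instances
<a name="neo-supported-cloud-instances"></a>

SageMaker Neo supports the following deep learning frameworks for CPU and GPU cloud instances:


| Framework | Framework Version | Model Version | Models | Model Formats (packaged in \*.tar.gz) | Toolkits | 
| --- | --- | --- | --- | --- | --- | 
| MXNet | 1.8.0 | Supports 1.8.0 or earlier | Image Classification, Object Detection, Semantic Segmentation, Pose Estimation, Activity Recognition | One symbol file (.json) and one parameter file (.params) | GluonCV v0.8.0 | 
| ONNX | 1.7.0 | Supports 1.7.0 or earlier | Image Classification, SVM | One model file (.onnx) |  | 
| Keras | 2.2.4 | Supports 2.2.4 or earlier | Image Classification | One model definition file (.h5) |  | 
| PyTorch | 1.4, 1.5, 1.6, 1.7, 1.8, 1.12, 1.13, 2.0 | Supports 1.4, 1.5, 1.6, 1.7, 1.8, 1.12, 1.13, and 2.0 | Image Classification. Versions 1.13 and 2.0 also support Object Detection, Vision Transformer, and HuggingFace. | One model definition file (.pt or .pth) with input dtype of float32 |  | 
| TensorFlow | 1.15.3 or 2.9 | Supports 1.15.3 and 2.9 | Image Classification | For saved models, one .pb or one .pbtxt file and a variables directory that contains variables. For frozen models, only one .pb or .pbtxt file. |  | 
| XGBoost | 1.3.3 | Supports 1.3.3 or earlier | Decision Trees | One XGBoost model file (.model) where the number of nodes in a tree is less than 2^31 |  | 

**Note**  
"Model Version" is the version of the framework used to train and export the model.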
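
The file extensions in the Model Formats column can be checked against the contents of an archive before a compilation job is submitted. The following sketch encodes only the extensions listed in the table above; `REQUIRED_EXTENSIONS` and `missing_artifacts` are hypothetical names introduced for this illustration, and the check does not validate the TensorFlow variables directory or the file contents.

```python
import os

# One entry per required file; a set with several members means "any of these".
REQUIRED_EXTENSIONS = {
    "mxnet": [{".json"}, {".params"}],
    "onnx": [{".onnx"}],
    "keras": [{".h5"}],
    "pytorch": [{".pt", ".pth"}],
    "tensorflow": [{".pb", ".pbtxt"}],
    "xgboost": [{".model"}],
}


def missing_artifacts(member_names, framework):
    """Return the extension groups that no archive member satisfies."""
    present = {os.path.splitext(name)[1].lower() for name in member_names}
    return [sorted(group)
            for group in REQUIRED_EXTENSIONS[framework]
            if not (group & present)]


# Member names as tarfile.open(path).getnames() would return them.
mxnet_members = ["resnet18_v1-symbol.json", "resnet18_v1-0000.params"]
print(missing_artifacts(mxnet_members, "mxnet"))
print(missing_artifacts(["model.onnx"], "pytorch"))
```

An empty list means every expected file type is present in the archive.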

## Instance Types
<a name="neo-supported-cloud-instances-types"></a>

You can deploy your SageMaker AI compiled model to one of the cloud instances listed below:


| Instance | Compute Type | 
| --- | --- | 
| `ml_c4` | Standard | 
| `ml_c5` | Standard | 
| `ml_m4` | Standard | 
| `ml_m5` | Standard | 
| `ml_p2` | Accelerated computing | 
| `ml_p3` | Accelerated computing | 
| `ml_g4dn` | Accelerated computing | 

For information on the available vCPU, memory, and price per hour for each instance type, see [Amazon SageMaker Pricing](https://aws.amazon.com/sagemaker/pricing/).

**Note**  
When compiling for `ml_*` instances using the PyTorch framework, use the **Compiler options** field in **Output configuration** to provide the correct data type (`dtype`) of the model's input.  
The default is `"float32"`.

## AWS Inferentia
<a name="neo-supported-inferentia"></a>

SageMaker Neo supports the following deep learning frameworks for Inf1:


| Framework | Framework Version | Model Version | Models | Model Formats (packaged in \*.tar.gz) | Toolkits | 
| --- | --- | --- | --- | --- | --- | 
| MXNet | 1.5 or 1.8 | Supports 1.8, 1.5, and earlier | Image Classification, Object Detection, Semantic Segmentation, Pose Estimation, Activity Recognition | One symbol file (.json) and one parameter file (.params) | GluonCV v0.8.0 | 
| PyTorch | 1.7, 1.8, or 1.9 | Supports 1.9 and earlier | Image Classification | One model definition file (.pt or .pth) with input dtype of float32 |  | 
| TensorFlow | 1.15 or 2.5 | Supports 2.5, 1.15, and earlier | Image Classification | For saved models, one .pb or one .pbtxt file and a variables directory that contains variables. For frozen models, only one .pb or .pbtxt file. |  | 

**Note**  
"Model Version" is the version of the framework used to train and export the model.

You can deploy your SageMaker Neo-compiled model to AWS Inferentia-based Amazon EC2 Inf1 instances. AWS Inferentia is Amazon's first custom silicon chip designed to accelerate deep learning. Currently, you can use the `ml_inf1` instance to deploy your compiled models.

### AWS Inferentia2 and AWS Trainium
<a name="neo-supported-inferentia-trainium"></a>

Currently, you can deploy your SageMaker Neo-compiled model to AWS Inferentia2-based Amazon EC2 Inf2 instances (in the US East (Ohio) Region) and to AWS Trainium-based Amazon EC2 Trn1 instances (in the US East (N. Virginia) Region). For more information about the models supported on these instances, see [Model Architecture Fit Guidelines](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/arch/model-architecture-fit.html) in the AWS Neuron documentation and the examples in the [Neuron Github repository](https://github.com/aws-neuron/aws-neuron-sagemaker-samples).

# Deploy a Model
<a name="neo-deployment-hosting-services"></a>

To deploy an Amazon SageMaker Neo-compiled model to an HTTPS endpoint, you must configure and create the endpoint for the model using Amazon SageMaker AI hosting services. Currently, developers can use Amazon SageMaker APIs to deploy models to ml.c5, ml.c4, ml.m5, ml.m4, ml.p3, ml.p2, and ml.inf1 instances.

For [Inferentia](https://aws.amazon.com/machine-learning/inferentia/) and [Trainium](https://aws.amazon.com/machine-learning/trainium/) instances, models need to be compiled specifically for those instances. Models compiled for other instance types are not guaranteed to work with Inferentia or Trainium instances.

When you deploy a compiled model, you need to use the same instance for the target that you used for compilation. This creates a SageMaker AI endpoint that you can use to perform inferences. You can deploy a Neo-compiled model using any of the following: the [Amazon SageMaker AI SDK for Python](https://sagemaker.readthedocs.io/en/stable/), the [SDK for Python (Boto3)](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html), the [AWS Command Line Interface](https://docs.aws.amazon.com/cli/latest/reference/), or the [SageMaker console](https://console.aws.amazon.com/sagemaker).
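
Compilation targets are written with underscores (for example `ml_c5`), while endpoint instance types use dots (for example `ml.c5.xlarge`). The following sketch checks that a deployment instance type belongs to the family a model was compiled for. It assumes the target name is simply `ml_` followed by the instance family, which is inferred from the names listed in this guide; `target_family` and `matches_compiled_target` are hypothetical helpers.

```python
# Target families named in this guide (cloud instances and Inferentia).
SUPPORTED_TARGETS = {
    "ml_c4", "ml_c5", "ml_m4", "ml_m5", "ml_p2", "ml_p3", "ml_g4dn", "ml_inf1",
}


def target_family(instance_type):
    """Map an instance type such as 'ml.c5.xlarge' to a target such as 'ml_c5'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError("unexpected instance type: %r" % instance_type)
    family = "ml_" + parts[1]
    if family not in SUPPORTED_TARGETS:
        raise ValueError("no listed compilation target for %r" % instance_type)
    return family


def matches_compiled_target(compiled_for, instance_type):
    """Return True if the instance type belongs to the compiled target family."""
    return target_family(instance_type) == compiled_for


print(target_family("ml.c5.xlarge"))
print(matches_compiled_target("ml_c5", "ml.p3.2xlarge"))
```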

**Note**  
To deploy a model using the AWS CLI, the console, or Boto3, see [Neo Inference Container Images](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html) to select the inference image URI for your primary container.

**Topics**
+ [Prerequisites](neo-deployment-hosting-services-prerequisites.md)
+ [Deploy a Compiled Model Using SageMaker SDK](neo-deployment-hosting-services-sdk.md)
+ [Deploy a Compiled Model Using Boto3](neo-deployment-hosting-services-boto3.md)
+ [Deploy a Compiled Model Using the AWS CLI](neo-deployment-hosting-services-cli.md)
+ [Deploy a Compiled Model Using the Console](neo-deployment-hosting-services-console.md)

# 사전 조건
<a name="neo-deployment-hosting-services-prerequisites"></a>

**Note**  
Follow the instructions in this section if you compiled your model using the AWS SDK for Python (Boto3), the AWS CLI, or the SageMaker AI console.

To create a SageMaker Neo-compiled model, you need the following:

1. A Docker image Amazon ECR URI. You can select one that meets your needs from [this list](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html).

1. An entry point script file:

   1. **For PyTorch and MXNet models:**

      *If you trained your model using SageMaker AI*, the training script must implement the functions described below. The training script serves as the entry point script during inference. In the example detailed in [MNIST Training, Compilation and Deployment with MXNet Module and SageMaker Neo](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_neo_compilation_jobs/mxnet_mnist/mxnet_mnist_neo.html), the training script (`mnist.py`) implements the required functions.

      *If you did not train your model using SageMaker AI*, you must provide an entry point script (`inference.py`) that can be used at inference time. Depending on the framework (MXNet or PyTorch), the location of the inference script must conform to the SageMaker Python SDK [Model Directory Structure for MXNet](https://sagemaker.readthedocs.io/en/stable/frameworks/mxnet/using_mxnet.html#model-directory-structure) or [Model Directory Structure for PyTorch](https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#model-directory-structure).

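      When you use Neo inference-optimized container images with **PyTorch** and **MXNet** on CPU and GPU instance types, the inference script must implement the following functions:
      + `model_fn`: Loads the model. (Optional)
      + `input_fn`: Converts the incoming request payload into a numpy array.
      + `predict_fn`: Performs the prediction.
      + `output_fn`: Converts the prediction output into the response payload.
      + Alternatively, you can define `transform_fn` to combine `input_fn`, `predict_fn`, and `output_fn`.

      The following are examples of an `inference.py` script, placed in a directory named `code` (`code/inference.py`), for **PyTorch and MXNet (Gluon and Module)**. Each example first loads the model and then serves it on image data on a GPU.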

------
#### [ MXNet Module ]

      ```
      import numpy as np
      import json
      import mxnet as mx
      import neomx  # noqa: F401
      from collections import namedtuple
      
      Batch = namedtuple('Batch', ['data'])
      
      # Change the context to mx.cpu() if deploying to a CPU endpoint
      ctx = mx.gpu()
      
      def model_fn(model_dir):
          # The compiled model artifacts are saved with the prefix 'compiled'
          sym, arg_params, aux_params = mx.model.load_checkpoint('compiled', 0)
          mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
          exe = mod.bind(for_training=False,
                         data_shapes=[('data', (1,3,224,224))],
                         label_shapes=mod._label_shapes)
          mod.set_params(arg_params, aux_params, allow_missing=True)
          
          # Run warm-up inference on empty data during model load (required for GPU)
          data = mx.nd.empty((1,3,224,224), ctx=ctx)
          mod.forward(Batch([data]))
          return mod
      
      
      def transform_fn(mod, image, input_content_type, output_content_type):
          # pre-processing
          decoded = mx.image.imdecode(image)
          resized = mx.image.resize_short(decoded, 224)
          cropped, crop_info = mx.image.center_crop(resized, (224, 224))
          normalized = mx.image.color_normalize(cropped.astype(np.float32) / 255,
                                        mean=mx.nd.array([0.485, 0.456, 0.406]),
                                        std=mx.nd.array([0.229, 0.224, 0.225]))
          transposed = normalized.transpose((2, 0, 1))
          batchified = transposed.expand_dims(axis=0)
          casted = batchified.astype(dtype='float32')
          processed_input = casted.as_in_context(ctx)
      
          # prediction/inference
          mod.forward(Batch([processed_input]))
      
          # post-processing
          prob = mod.get_outputs()[0].asnumpy().tolist()
          prob_json = json.dumps(prob)
          return prob_json, output_content_type
      ```

------
#### [ MXNet Gluon ]

      ```
      import numpy as np
      import json
      import mxnet as mx
      import neomx  # noqa: F401
      
      # Change the context to mx.cpu() if deploying to a CPU endpoint
      ctx = mx.gpu()
      
      def model_fn(model_dir):
          # The compiled model artifacts are saved with the prefix 'compiled'
          block = mx.gluon.nn.SymbolBlock.imports('compiled-symbol.json',['data'],'compiled-0000.params', ctx=ctx)
          
          # Hybridize the model & pass required options for Neo: static_alloc=True & static_shape=True
          block.hybridize(static_alloc=True, static_shape=True)
          
          # Run warm-up inference on empty data during model load (required for GPU)
          data = mx.nd.empty((1,3,224,224), ctx=ctx)
          warm_up = block(data)
          return block
      
      
      def input_fn(image, input_content_type):
          # pre-processing
          decoded = mx.image.imdecode(image)
          resized = mx.image.resize_short(decoded, 224)
          cropped, crop_info = mx.image.center_crop(resized, (224, 224))
          normalized = mx.image.color_normalize(cropped.astype(np.float32) / 255,
                                        mean=mx.nd.array([0.485, 0.456, 0.406]),
                                        std=mx.nd.array([0.229, 0.224, 0.225]))
          transposed = normalized.transpose((2, 0, 1))
          batchified = transposed.expand_dims(axis=0)
          casted = batchified.astype(dtype='float32')
          processed_input = casted.as_in_context(ctx)
          return processed_input
      
      
      def predict_fn(processed_input_data, block):
          # prediction/inference
          prediction = block(processed_input_data)
          return prediction
      
      def output_fn(prediction, output_content_type):
          # post-processing
          prob = prediction.asnumpy().tolist()
          prob_json = json.dumps(prob)
          return prob_json, output_content_type
      ```

------
#### [ PyTorch 1.4 and Older ]

      ```
      import os
      import torch
      import torch.nn.parallel
      import torch.optim
      import torch.utils.data
      import torch.utils.data.distributed
      import torchvision.transforms as transforms
      from PIL import Image
      import io
      import json
      import pickle
      
      
      def model_fn(model_dir):
          """Load the model and return it.
          Providing this function is optional.
          There is a default model_fn available which will load the model
          compiled using SageMaker Neo. You can override it here.
      
          Keyword arguments:
          model_dir -- the directory path where the model artifacts are present
          """
      
          # The compiled model is saved as "compiled.pt"
          model_path = os.path.join(model_dir, 'compiled.pt')
          with torch.neo.config(model_dir=model_dir, neo_runtime=True):
              model = torch.jit.load(model_path)
              device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
              model = model.to(device)
      
          # We recommend that you run warm-up inference during model load
          sample_input_path = os.path.join(model_dir, 'sample_input.pkl')
          with open(sample_input_path, 'rb') as input_file:
              model_input = pickle.load(input_file)
          if torch.is_tensor(model_input):
              model_input = model_input.to(device)
              model(model_input)
          elif isinstance(model_input, tuple):
              model_input = (inp.to(device) for inp in model_input if torch.is_tensor(inp))
              model(*model_input)
          else:
              print("Only supports a torch tensor or a tuple of torch tensors")
          return model
      
      
      def transform_fn(model, request_body, request_content_type,
                       response_content_type):
          """Run prediction and return the output.
          The function
          1. Pre-processes the input request
          2. Runs prediction
          3. Post-processes the prediction output.
          """
          # preprocess
          decoded = Image.open(io.BytesIO(request_body))
          preprocess = transforms.Compose([
              transforms.Resize(256),
              transforms.CenterCrop(224),
              transforms.ToTensor(),
              transforms.Normalize(
                  mean=[
                      0.485, 0.456, 0.406], std=[
                      0.229, 0.224, 0.225]),
          ])
          normalized = preprocess(decoded)
          batchified = normalized.unsqueeze(0)
          # predict
          device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
          batchified = batchified.to(device)
          output = model.forward(batchified)
      
          return json.dumps(output.cpu().numpy().tolist()), response_content_type
      ```

------
#### [ PyTorch 1.5 and Newer ]

      ```
      import os
      import torch
      import torch.nn.parallel
      import torch.optim
      import torch.utils.data
      import torch.utils.data.distributed
      import torchvision.transforms as transforms
      from PIL import Image
      import io
      import json
      import pickle
      
      
      def model_fn(model_dir):
          """Load the model and return it.
          Providing this function is optional.
          There is a default_model_fn available, which will load the model
          compiled using SageMaker Neo. You can override the default here.
          The model_fn only needs to be defined if your model needs extra
          steps to load, and can otherwise be left undefined.
      
          Keyword arguments:
          model_dir -- the directory path where the model artifacts are present
          """
      
          # The compiled model is saved as "model.pt"
          model_path = os.path.join(model_dir, 'model.pt')
          device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
          model = torch.jit.load(model_path, map_location=device)
          model = model.to(device)
      
          return model
      
      
      def transform_fn(model, request_body, request_content_type,
                          response_content_type):
          """Run prediction and return the output.
          The function
          1. Pre-processes the input request
          2. Runs prediction
          3. Post-processes the prediction output.
          """
          # preprocess
          decoded = Image.open(io.BytesIO(request_body))
          preprocess = transforms.Compose([
                                      transforms.Resize(256),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor(),
                                      transforms.Normalize(
                                          mean=[
                                              0.485, 0.456, 0.406], std=[
                                              0.229, 0.224, 0.225]),
                                          ])
          normalized = preprocess(decoded)
          batchified = normalized.unsqueeze(0)
          
          # predict
          device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
          batchified = batchified.to(device)
          output = model.forward(batchified)
          return json.dumps(output.cpu().numpy().tolist()), response_content_type
      ```

------

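   1.  **For inf1 instances or onnx, xgboost, keras container images** 

      For all other Neo inference-optimized container images, or for Inferentia instance types, the entry point script must implement the following functions for the Neo Deep Learning Runtime:
      + `neo_preprocess`: Converts the incoming request payload into a numpy array.
      + `neo_postprocess`: Converts the prediction output from the Neo Deep Learning Runtime into the response body.
**Note**  
These two functions do not use any functionality of MXNet, PyTorch, or TensorFlow.

      For examples of how to use these functions, see the [Neo Model Compilation Sample Notebooks](https://docs.aws.amazon.com//sagemaker/latest/dg/neo.html#neo-sample-notebooks).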

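   1. **For TensorFlow models**

      If your model requires custom pre- and post-processing logic before data is sent to the model, you must specify an entry point script (`inference.py`) that can be used at inference time. The script must implement either a pair of `input_handler` and `output_handler` functions or a single handler function.
**Note**  
If the single handler function is implemented, `input_handler` and `output_handler` are ignored.

      The following is a code example of an `inference.py` script that you can package with the compiled model to perform custom pre- and post-processing for an image classification model. The SageMaker AI client sends the image file to the `input_handler` function with the `application/x-image` content type, and the function converts it to JSON. The converted image is then sent to the [TensorFlow Model Server (TFX)](https://www.tensorflow.org/tfx/serving/api_rest) using the REST API.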

      ```
      import io
      import json

      import numpy as np
      from PIL import Image
      
      def input_handler(data, context):
          """ Pre-process request input before it is sent to TensorFlow Serving REST API
          
          Args:
          data (obj): the request data, in format of dict or string
          context (Context): an object containing request and configuration details
          
          Returns:
          (dict): a JSON-serializable dict that contains request body and headers
          """
          f = data.read()
          f = io.BytesIO(f)
          image = Image.open(f).convert('RGB')
          batch_size = 1
          image = np.asarray(image.resize((512, 512)))
          image = np.concatenate([image[np.newaxis, :, :]] * batch_size)
          body = json.dumps({"signature_name": "serving_default", "instances": image.tolist()})
          return body
      
      def output_handler(data, context):
          """Post-process TensorFlow Serving output before it is returned to the client.
          
          Args:
          data (obj): the TensorFlow serving response
          context (Context): an object containing request and configuration details
          
          Returns:
          (bytes, string): data to return to client, response content type
          """
          if data.status_code != 200:
              raise ValueError(data.content.decode('utf-8'))
      
          response_content_type = context.accept_header
          prediction = data.content
          return prediction, response_content_type
      ```

      If there is no custom pre- or post-processing, the SageMaker AI client converts the image file to JSON in a similar way before sending it to the SageMaker AI endpoint.

      For more information, see [Deploying to TensorFlow Serving Endpoints in the SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html#providing-python-scripts-for-pre-pos-processing).

1. The Amazon S3 bucket URI that contains the compiled model artifacts.
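
The relationship between the individual handlers and `transform_fn`, described for PyTorch and MXNet earlier in this section, can be sketched without any framework. This is a minimal illustration, not the serving container's implementation: the stand-in model (a callable that doubles its inputs), the JSON payload format, and the `/opt/ml/model` path are assumptions made for the example. Only the handler names and their argument order are taken from the examples above.

```python
import json


def model_fn(model_dir):
    # Stand-in for loading a compiled model (assumption): doubles each input value.
    return lambda values: [2 * v for v in values]


def input_fn(request_body, request_content_type):
    # Convert the request payload into model input (a plain list in this sketch).
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    return json.loads(request_body)


def predict_fn(input_data, model):
    # Run the prediction.
    return model(input_data)


def output_fn(prediction, response_content_type):
    # Convert the prediction into the response payload.
    return json.dumps(prediction), response_content_type


def transform_fn(model, request_body, request_content_type, response_content_type):
    # Single-handler alternative: chains input_fn, predict_fn, and output_fn.
    input_data = input_fn(request_body, request_content_type)
    prediction = predict_fn(input_data, model)
    return output_fn(prediction, response_content_type)


model = model_fn("/opt/ml/model")  # hypothetical model directory
body, content_type = transform_fn(
    model, "[1, 2, 3]", "application/json", "application/json"
)
print(body, content_type)
```

Defining only `transform_fn` keeps pre-processing, prediction, and post-processing in one place, while the three separate handlers make each stage easier to test on its own.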

# Deploy a Compiled Model Using SageMaker SDK
<a name="neo-deployment-hosting-services-sdk"></a>

If the model was compiled using the AWS SDK for Python (Boto3), the AWS CLI, or the Amazon SageMaker AI console, you must satisfy the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites) section. Follow the use case below that matches how you compiled your model to deploy a model compiled with SageMaker Neo.

**Topics**
+ [If you compiled your model using the SageMaker SDK](#neo-deployment-hosting-services-sdk-deploy-sm-sdk)
+ [If you compiled your model using MXNet or PyTorch](#neo-deployment-hosting-services-sdk-deploy-sm-boto3)
+ [If you compiled your model using Boto3, SageMaker console, or the CLI for TensorFlow](#neo-deployment-hosting-services-sdk-deploy-sm-boto3-tensorflow)

## If you compiled your model using the SageMaker SDK
<a name="neo-deployment-hosting-services-sdk-deploy-sm-sdk"></a>

The [sagemaker.Model](https://sagemaker.readthedocs.io/en/stable/api/inference/model.html?highlight=sagemaker.Model) object handle for the compiled model provides the [deploy()](https://sagemaker.readthedocs.io/en/stable/api/inference/model.html?highlight=sagemaker.Model#sagemaker.model.Model.deploy) function, which creates an endpoint to serve inference requests. The function lets you set the number and type of instances used for the endpoint. You must choose an instance type for which the model was compiled. For example, for the job compiled in the [Compile a Model (Amazon SageMaker SDK)](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation-sagemaker-sdk.html) section, this is `ml_c5`.

```
predictor = compiled_model.deploy(initial_instance_count = 1, instance_type = 'ml.c5.4xlarge')

# Print the name of newly created endpoint
print(predictor.endpoint_name)
```
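
Because the endpoint instance type has to match the compilation target, a small pre-deployment check can catch a mismatch before an endpoint is created. The following is a sketch only: the helper names are hypothetical, and the mapping from an instance type such as `ml.c5.4xlarge` to a target such as `ml_c5` is an assumption inferred from the example above. Other targets may not follow this naming pattern.

```python
def compilation_target_for(instance_type):
    """Map an instance type such as 'ml.c5.4xlarge' to a target such as 'ml_c5'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError(f"Unexpected instance type format: {instance_type!r}")
    # Keep the 'ml' prefix and the instance family; drop the size suffix.
    return f"{parts[0]}_{parts[1]}"


def check_instance_matches_target(instance_type, compiled_target):
    """Raise ValueError if the instance family differs from the compilation target."""
    expected = compilation_target_for(instance_type)
    if expected != compiled_target:
        raise ValueError(
            f"Model was compiled for {compiled_target!r}, "
            f"but {instance_type!r} corresponds to {expected!r}"
        )
    return True


# Passes silently for the pairing used in the example above.
check_instance_matches_target("ml.c5.4xlarge", "ml_c5")
```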

## If you compiled your model using MXNet or PyTorch
<a name="neo-deployment-hosting-services-sdk-deploy-sm-boto3"></a>

Create the SageMaker AI model and deploy it using the deploy() API of the framework-specific model class. For MXNet, this is [MXNetModel](https://sagemaker.readthedocs.io/en/stable/frameworks/mxnet/sagemaker.mxnet.html?highlight=MXNetModel#mxnet-model), and for PyTorch, it is [PyTorchModel](https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/sagemaker.pytorch.html?highlight=PyTorchModel#sagemaker.pytorch.model.PyTorchModel). When you create and deploy the SageMaker AI model, you must set the `MMS_DEFAULT_RESPONSE_TIMEOUT` environment variable to `500`, specify the `entry_point` parameter as the inference script (`inference.py`), and specify the `source_dir` parameter as the directory location (`code`) of the inference script. To prepare the inference script (`inference.py`), follow the Prerequisites step.

The following examples show how to use these functions to deploy a compiled model using the SageMaker AI SDK for Python.

------
#### [ MXNet ]

```
from sagemaker.mxnet import MXNetModel

# Create SageMaker model and deploy an endpoint
sm_mxnet_compiled_model = MXNetModel(
    model_data='insert S3 path of compiled MXNet model archive',
    role='AmazonSageMaker-ExecutionRole',
    entry_point='inference.py',
    source_dir='code',
    framework_version='1.8.0',
    py_version='py3',
    image_uri='insert appropriate ECR Image URI for MXNet',
    env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'},
)

# Replace the example instance_type below to your preferred instance_type
predictor = sm_mxnet_compiled_model.deploy(initial_instance_count = 1, instance_type = 'ml.p3.2xlarge')

# Print the name of newly created endpoint
print(predictor.endpoint_name)
```

------
#### [ PyTorch 1.4 and Older ]

```
from sagemaker.pytorch import PyTorchModel

# Create SageMaker model and deploy an endpoint
sm_pytorch_compiled_model = PyTorchModel(
    model_data='insert S3 path of compiled PyTorch model archive',
    role='AmazonSageMaker-ExecutionRole',
    entry_point='inference.py',
    source_dir='code',
    framework_version='1.4.0',
    py_version='py3',
    image_uri='insert appropriate ECR Image URI for PyTorch',
    env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'},
)

# Replace the example instance_type below to your preferred instance_type
predictor = sm_pytorch_compiled_model.deploy(initial_instance_count = 1, instance_type = 'ml.p3.2xlarge')

# Print the name of newly created endpoint
print(predictor.endpoint_name)
```

------
#### [ PyTorch 1.5 and Newer ]

```
from sagemaker.pytorch import PyTorchModel

# Create SageMaker model and deploy an endpoint
sm_pytorch_compiled_model = PyTorchModel(
    model_data='insert S3 path of compiled PyTorch model archive',
    role='AmazonSageMaker-ExecutionRole',
    entry_point='inference.py',
    source_dir='code',
    framework_version='1.5',
    py_version='py3',
    image_uri='insert appropriate ECR Image URI for PyTorch',
)

# Replace the example instance_type below to your preferred instance_type
predictor = sm_pytorch_compiled_model.deploy(initial_instance_count = 1, instance_type = 'ml.p3.2xlarge')

# Print the name of newly created endpoint
print(predictor.endpoint_name)
```

------

**Note**  
The `AmazonSageMakerFullAccess` and `AmazonS3ReadOnlyAccess` policies must be attached to the `AmazonSageMaker-ExecutionRole` IAM role.

## If you compiled your model using Boto3, SageMaker console, or the CLI for TensorFlow
<a name="neo-deployment-hosting-services-sdk-deploy-sm-boto3-tensorflow"></a>

Construct a `TensorFlowModel` object, and then call deploy:

```
from sagemaker.tensorflow import TensorFlowModel

role='AmazonSageMaker-ExecutionRole'
model_path='S3 path for model file'
framework_image='inference container arn'
tf_model = TensorFlowModel(model_data=model_path,
                framework_version='1.15.3',
                role=role, 
                image_uri=framework_image)
instance_type='ml.c5.xlarge'
predictor = tf_model.deploy(instance_type=instance_type,
                    initial_instance_count=1)
```

For more information, see [Deploying directly from model artifacts](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html#deploying-directly-from-model-artifacts).

You can select a Docker image Amazon ECR URI that meets your requirements from [this list](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html).

For more information about how to construct a `TensorFlowModel` object, see the [SageMaker SDK](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/sagemaker.tensorflow.html#tensorflow-serving-model).

**Note**  
If you deploy your model to a GPU, the first inference request can have high latency, because optimized compute kernels are built on the first inference request. We recommend that you create a warm-up file of inference requests and store it alongside your model file before sending it to TFX. This is known as “warming up” the model.

The following code snippet demonstrates how to produce the warm-up file for the image classification example in the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites) section.

```
import tensorflow as tf
from tensorflow_serving.apis import classification_pb2
from tensorflow_serving.apis import inference_pb2
from tensorflow_serving.apis import model_pb2
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_log_pb2
from tensorflow_serving.apis import regression_pb2
import numpy as np

with tf.io.TFRecordWriter("tf_serving_warmup_requests") as writer:
    img = np.random.uniform(0, 1, size=[224, 224, 3]).astype(np.float32)
    img = np.expand_dims(img, axis=0)
    test_data = np.repeat(img, 1, axis=0)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'compiled_models'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['Placeholder:0'].CopyFrom(tf.compat.v1.make_tensor_proto(test_data, shape=test_data.shape, dtype=tf.float32))
    log = prediction_log_pb2.PredictionLog(
        predict_log=prediction_log_pb2.PredictLog(request=request))
    writer.write(log.SerializeToString())
```
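
TensorFlow Serving reads warm-up data from a file named `tf_serving_warmup_requests` in the `assets.extra` subdirectory of the SavedModel version directory. The sketch below copies the generated file into place using only the standard library. The function name and the example paths are hypothetical, and how your compiled model archive is laid out is an assumption you should verify before repackaging it.

```python
import shutil
from pathlib import Path


def install_warmup_file(warmup_file, saved_model_dir):
    """Copy the warm-up record file into <saved_model_dir>/assets.extra/."""
    target_dir = Path(saved_model_dir) / "assets.extra"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "tf_serving_warmup_requests"
    shutil.copyfile(warmup_file, target)
    return target


# Example call (hypothetical paths):
# install_warmup_file("tf_serving_warmup_requests", "export/compiled_models/1")
```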

For more information about how to “warm up” your model, see the [TensorFlow TFX page](https://www.tensorflow.org/tfx/serving/saved_model_warmup).

# Deploy a Compiled Model Using Boto3
<a name="neo-deployment-hosting-services-boto3"></a>

If the model was compiled using the AWS SDK for Python (Boto3), the AWS CLI, or the Amazon SageMaker AI console, you must satisfy the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites) section. Follow the steps below to create and deploy a SageMaker Neo-compiled model using the [Amazon Web Services SDK for Python (Boto3)](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html).

**Topics**
+ [Deploy the Model](#neo-deployment-hosting-services-boto3-steps)

## Deploy the Model
<a name="neo-deployment-hosting-services-boto3-steps"></a>

After you have satisfied the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites), use the `create_model`, `create_endpoint_config`, and `create_endpoint` APIs.

The following example shows how to use these APIs to deploy a model compiled with Neo.

```
import boto3

client = boto3.client('sagemaker')

# Create the SageMaker model
create_model_api_response = client.create_model(
    ModelName='my-sagemaker-model',
    PrimaryContainer={
        'Image': '<insert the ECR Image URI>',
        'ModelDataUrl': 's3://path/to/model/artifact/model.tar.gz',
        'Environment': {}
    },
    ExecutionRoleArn='ARN for AmazonSageMaker-ExecutionRole'
)

print("create_model API response", create_model_api_response)

# Create the SageMaker endpoint configuration
create_endpoint_config_api_response = client.create_endpoint_config(
    EndpointConfigName='sagemaker-neomxnet-endpoint-configuration',
    ProductionVariants=[
        {
            'VariantName': '<provide your variant name>',
            'ModelName': 'my-sagemaker-model',
            'InitialInstanceCount': 1,
            'InstanceType': '<provide your instance type here>'
        },
    ]
)

print("create_endpoint_config API response", create_endpoint_config_api_response)

# Create the SageMaker endpoint; the config name must match the one created above
create_endpoint_api_response = client.create_endpoint(
    EndpointName='provide your endpoint name',
    EndpointConfigName='sagemaker-neomxnet-endpoint-configuration',
)

print("create_endpoint API response", create_endpoint_api_response)
```
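
`create_endpoint` returns before the endpoint is ready to serve traffic, so it can be useful to block until the endpoint reaches the `InService` status. The sketch below uses the Boto3 `endpoint_in_service` waiter. The helper name and the default polling values are assumptions chosen for illustration, and the client is passed in as a parameter so the helper can be tried without AWS credentials.

```python
def wait_until_in_service(client, endpoint_name, delay_seconds=30, max_attempts=60):
    """Block until the endpoint is InService, then return its reported status."""
    waiter = client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    # Confirm the final status with a describe call.
    response = client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


# Example call (requires AWS credentials and an existing endpoint):
# import boto3
# status = wait_until_in_service(boto3.client("sagemaker"), "provide your endpoint name")
```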

**Note**  
The `AmazonSageMakerFullAccess` and `AmazonS3ReadOnlyAccess` policies must be attached to the `AmazonSageMaker-ExecutionRole` IAM role.

For the full syntax of the `create_model`, `create_endpoint_config`, and `create_endpoint` APIs, see [https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model), [https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config), and [https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint), respectively.

If you did not train your model using SageMaker AI, specify the following environment variables:

------
#### [ MXNet and PyTorch ]

```
"Environment": {
    "SAGEMAKER_PROGRAM": "inference.py",
    "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
    "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
    "SAGEMAKER_REGION": "insert your region",
    "MMS_DEFAULT_RESPONSE_TIMEOUT": "500"
}
```

------
#### [ TensorFlow ]

```
"Environment": {
    "SAGEMAKER_PROGRAM": "inference.py",
    "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
    "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
    "SAGEMAKER_REGION": "insert your region"
}
```

------

If you trained your model using SageMaker AI, specify the `SAGEMAKER_SUBMIT_DIRECTORY` environment variable as the full Amazon S3 bucket URI that contains the training script.
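
The rules above for choosing environment variables can be collected into one helper that produces the `Environment` value for `create_model`. This is a sketch under stated assumptions: the function name and its parameters are hypothetical, while the variable names and values are the ones listed in this section.

```python
def build_environment(framework, region, trained_with_sagemaker=False, submit_directory=None):
    """Return the container Environment dict for a Neo-compiled model."""
    if trained_with_sagemaker:
        # Models trained with SageMaker AI point at the S3 URI of the training script.
        if not submit_directory:
            raise ValueError("submit_directory (an S3 URI) is required for SageMaker-trained models")
        return {"SAGEMAKER_SUBMIT_DIRECTORY": submit_directory}

    env = {
        "SAGEMAKER_PROGRAM": "inference.py",
        "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
        "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
        "SAGEMAKER_REGION": region,
    }
    # MXNet and PyTorch additionally set the model server response timeout.
    if framework.lower() in ("mxnet", "pytorch"):
        env["MMS_DEFAULT_RESPONSE_TIMEOUT"] = "500"
    return env


# Example: pass the result as PrimaryContainer['Environment'] in create_model.
environment = build_environment("pytorch", "us-east-1")
```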

# Deploy a Compiled Model Using the AWS CLI
<a name="neo-deployment-hosting-services-cli"></a>

If the model was compiled using the AWS SDK for Python (Boto3), the AWS CLI, or the Amazon SageMaker AI console, you must satisfy the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites) section. Follow the steps below to create and deploy a SageMaker Neo-compiled model using the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/).

**Topics**
+ [Deploy the Model](#neo-deploy-cli)

## Deploy the Model
<a name="neo-deploy-cli"></a>

After you have satisfied the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites), use the `create-model`, `create-endpoint-config`, and `create-endpoint` AWS CLI commands. The following examples show how to use these commands to deploy a model compiled with Neo.


### Create a Model
<a name="neo-deployment-hosting-services-cli-create-model"></a>

Select the inference image URI from [Neo Inference Container Images](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html), and then use the `create-model` API to create a SageMaker AI model. You can do this in two steps:

1. Create a `create_model.json` file. In the file, specify the model name, the image URI, the path to the `model.tar.gz` file in your Amazon S3 bucket, and your SageMaker AI execution role.

   ```
   {
       "ModelName": "insert model name",
       "PrimaryContainer": {
           "Image": "insert the ECR Image URI",
           "ModelDataUrl": "insert S3 archive URL",
           "Environment": {"See details below"}
       },
       "ExecutionRoleArn": "ARN for AmazonSageMaker-ExecutionRole"
   }
   ```

   If you trained your model using SageMaker AI, specify the following environment variable:

   ```
   "Environment": {
       "SAGEMAKER_SUBMIT_DIRECTORY" : "[Full S3 path for *.tar.gz file containing the training script]"
   }
   ```

   If you did not train your model using SageMaker AI, specify the following environment variables:

------
#### [ MXNet and PyTorch ]

   ```
   "Environment": {
       "SAGEMAKER_PROGRAM": "inference.py",
       "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
       "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
       "SAGEMAKER_REGION": "insert your region",
       "MMS_DEFAULT_RESPONSE_TIMEOUT": "500"
   }
   ```

------
#### [ TensorFlow ]

   ```
   "Environment": {
       "SAGEMAKER_PROGRAM": "inference.py",
       "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
       "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
       "SAGEMAKER_REGION": "insert your region"
   }
   ```

------
**Note**  
The `AmazonSageMakerFullAccess` and `AmazonS3ReadOnlyAccess` policies must be attached to the `AmazonSageMaker-ExecutionRole` IAM role.

1. Run the following command:

   ```
   aws sagemaker create-model --cli-input-json file://create_model.json
   ```

   For the full syntax of the `create-model` API, see [https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-model.html](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-model.html).
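
If you prefer to generate the `create_model.json` file from step 1 programmatically, the standard `json` module is enough. The following is a minimal sketch: the function name and its parameters are hypothetical, and the field names mirror the template shown in step 1.

```python
import json


def write_create_model_json(path, model_name, image_uri, model_data_url, role_arn, environment=None):
    """Write the create-model request to a JSON file and return the request dict."""
    request = {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,
            "ModelDataUrl": model_data_url,
            # Use an empty mapping when no environment variables are needed.
            "Environment": environment or {},
        },
        "ExecutionRoleArn": role_arn,
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(request, f, indent=4)
    return request


# Example call (placeholder values, as in the template):
# write_create_model_json(
#     "create_model.json",
#     "insert model name",
#     "insert the ECR Image URI",
#     "insert S3 archive URL",
#     "ARN for AmazonSageMaker-ExecutionRole",
# )
```

The resulting file can then be passed to `aws sagemaker create-model --cli-input-json file://create_model.json` as shown in step 2.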

### Create an Endpoint Configuration
<a name="neo-deployment-hosting-services-cli-create-endpoint-config"></a>

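After creating a SageMaker AI model, create the endpoint configuration using the `create-endpoint-config` API. To do this, create a JSON file that contains your endpoint configuration specifications. For example, you can use the following template and save it as `create_config.json`: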

```
{
    "EndpointConfigName": "<provide your endpoint config name>",
    "ProductionVariants": [
        {
            "VariantName": "<provide your variant name>",
            "ModelName": "my-sagemaker-model",
            "InitialInstanceCount": 1,
            "InstanceType": "<provide your instance type here>",
            "InitialVariantWeight": 1.0
        }
    ]
}
```

Now run the following AWS CLI command to create your endpoint configuration:

```
aws sagemaker create-endpoint-config --cli-input-json file://create_config.json
```

For the full syntax of the `create-endpoint-config` API, see [https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-endpoint-config.html](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-endpoint-config.html).

### Create an Endpoint
<a name="neo-deployment-hosting-services-cli-create-endpoint"></a>

After you have created your endpoint configuration, create an endpoint using the `create-endpoint` API:

```
aws sagemaker create-endpoint --endpoint-name '<provide your endpoint name>' --endpoint-config-name '<insert your endpoint config name>'
```

For the full syntax of the `create-endpoint` API, see [https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-endpoint.html](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-endpoint.html).

# Deploy a Compiled Model Using the Console
<a name="neo-deployment-hosting-services-console"></a>

If the model was compiled using the AWS SDK for Python (Boto3), the AWS CLI, or the Amazon SageMaker AI console, you must satisfy the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites) section. Follow the steps below to create and deploy a SageMaker AI Neo-compiled model using the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

**Topics**
+ [Deploy the Model](#deploy-the-model-console-steps)

## Deploy the Model
<a name="deploy-the-model-console-steps"></a>

After you have satisfied the [prerequisites](https://docs.aws.amazon.com//sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites), use the following steps to deploy a model compiled with Neo:

1. Choose **Models**, and then choose **Create models** from the **Inference** group. On the **Create model** page, complete the **Model name** and **IAM role** fields and, if needed, the optional **VPC** field.  
![\[Create Neo model for inference\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/create-pipeline-model.png)

1. To add information about the container used to deploy your model, choose **Add container**, and then choose **Next**. Complete the **Container input options**, **Location of inference code image**, and **Location of model artifacts** fields, and optionally the **Container host name** and **Environmental variables** fields.  
![\[Create Neo model for inference\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo-deploy-console-container-definition.png)

1. To deploy Neo-compiled models, choose the following:
   + **Container input options**: Choose **Provide model artifacts and inference image**.
   + **Location of inference code image**: Choose the inference image URI from [Neo Inference Container Images](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html), depending on the AWS Region and the kind of application.
   + **Location of model artifact**: Enter the Amazon S3 bucket URI of the compiled model artifact generated by the Neo compilation API.
   + **Environment variables**:
     + Leave this field blank for **SageMaker XGBoost**.
     + If you trained your model using SageMaker AI, specify the `SAGEMAKER_SUBMIT_DIRECTORY` environment variable as the Amazon S3 bucket URI that contains the training script.
     + If you did not train your model using SageMaker AI, specify the following environment variables:    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/neo-deployment-hosting-services-console.html)

1. Confirm that the information for the containers is accurate, and then choose **Create model**. On the **Create model landing page**, choose **Create endpoint**.  
![\[Create model landing page\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo-deploy-console-create-model-land-page.png)

1. On the **Create and configure endpoint** page, specify the **Endpoint name**. For **Attach endpoint configuration**, choose **Create a new endpoint configuration**.  
![\[Neo console create and configure endpoint UI.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo-deploy-console-config-endpoint.png)

1. On the **New endpoint configuration** page, specify the **Endpoint configuration name**.  
![\[Neo console new endpoint configuration UI.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo-deploy-console-new-endpoint-config.png)

1. Choose **Edit** next to the name of the model, and specify the correct **Instance type** on the **Edit Production Variant** page. The **Instance type** value must match the one specified in your compilation job.  
![\[Neo console new endpoint configuration UI.\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/images/neo-deploy-console-edit-production-variant.png)

1. Choose **Save**.

1. On the **New endpoint configuration** page, choose **Create endpoint configuration**, and then choose **Create endpoint**.
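
The console steps above correspond roughly to three SageMaker API calls: `create_model`, `create_endpoint_config`, and `create_endpoint`. The following is a minimal sketch of that mapping with Boto3, not a procedure taken from this guide. The helper names, the `-config` and `-endpoint` naming scheme, and the `AllTraffic` variant name are assumptions made for illustration.

```python
def build_deploy_requests(model_name, role_arn, image_uri, model_data_url,
                          instance_type, environment=None):
    """Build keyword arguments for the three deployment calls.

    The "-config" / "-endpoint" suffixes are a hypothetical naming scheme.
    """
    config_name = f"{model_name}-config"
    endpoint_name = f"{model_name}-endpoint"
    model = {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,              # Neo inference image URI
            "ModelDataUrl": model_data_url,  # S3 URI of the compiled artifact
            "Environment": environment or {},
        },
    }
    endpoint_config = {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            # Must match the instance type targeted by the compilation job.
            "InstanceType": instance_type,
        }],
    }
    endpoint = {"EndpointName": endpoint_name, "EndpointConfigName": config_name}
    return model, endpoint_config, endpoint


def deploy(requests):
    """Send the three requests; needs AWS credentials and Boto3 installed."""
    import boto3  # imported lazily so the builder above has no dependencies
    sagemaker = boto3.client("sagemaker")
    model, endpoint_config, endpoint = requests
    sagemaker.create_model(**model)
    sagemaker.create_endpoint_config(**endpoint_config)
    sagemaker.create_endpoint(**endpoint)
```

Keeping the request construction separate from the calls makes it possible to review the payloads before anything is created in your account.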

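# Request Inferences from a Deployed Service
<a name="neo-requests"></a>

If you followed the instructions in [Deploy a Model](neo-deployment-hosting-services.md), you should have a SageMaker AI endpoint set up and running. Regardless of how you deployed your Neo-compiled model, there are three ways you can submit inference requests:

**Topics**
+ [Request Inferences from a Deployed Service (Amazon SageMaker SDK)](neo-requests-sdk.md)
+ [Request Inferences from a Deployed Service (Boto3)](neo-requests-boto3.md)
+ [Request Inferences from a Deployed Service (AWS CLI)](neo-requests-cli.md)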

# Request Inferences from a Deployed Service (Amazon SageMaker SDK)
<a name="neo-requests-sdk"></a>

Use the following code examples to request inferences from your deployed service, based on the framework you used to train your model. The code examples for the different frameworks are similar. The main difference is that TensorFlow requires `application/json` as the content type.


## PyTorch and MXNet
<a name="neo-requests-sdk-py-mxnet"></a>

If you are using **PyTorch v1.4 or later** or **MXNet 1.7.0 or later** and you have an Amazon SageMaker AI endpoint `InService`, you can make inference requests using the `predictor` package of the SageMaker AI SDK for Python.

**Note**  
The API varies based on the SageMaker AI SDK for Python version:  
For version 1.x, use the [`RealTimePredictor`](https://sagemaker.readthedocs.io/en/v1.72.0/api/inference/predictors.html#sagemaker.predictor.RealTimePredictor) and [`predict`](https://sagemaker.readthedocs.io/en/v1.72.0/api/inference/predictors.html#sagemaker.predictor.RealTimePredictor.predict) APIs.  
For version 2.x, use the [`Predictor`](https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.Predictor) and [`predict`](https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.Predictor.predict) APIs.

The following code examples show how to use these APIs to send an image for inference:

------
#### [ SageMaker Python SDK v1.x ]

```
from sagemaker.predictor import RealTimePredictor

endpoint = 'insert name of your endpoint here'

# Read image into memory
with open("image.jpg", 'rb') as f:
    payload = f.read()

predictor = RealTimePredictor(endpoint=endpoint, content_type='application/x-image')
inference_response = predictor.predict(data=payload)
print(inference_response)
```

------
#### [ SageMaker Python SDK v2.x ]

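```
from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer

endpoint = 'insert name of your endpoint here'

# Read image into memory
with open("image.jpg", 'rb') as f:
    payload = f.read()

# Send the raw bytes with the same content type as the v1.x example
predictor = Predictor(endpoint, serializer=IdentitySerializer(content_type='application/x-image'))
inference_response = predictor.predict(data=payload)
print(inference_response)
```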

------

## TensorFlow
<a name="neo-requests-sdk-py-tf"></a>

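The following code example shows how to use the SageMaker Python SDK API to send an image for inference:

```
from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer
from PIL import Image
import numpy as np
import json

endpoint = 'insert the name of your endpoint here'
input_file = 'path/to/image'

# Read the image, resize it to 224x224, and scale pixel values to roughly [-1, 1)
image = Image.open(input_file)
batch_size = 1
image = np.asarray(image.resize((224, 224)))
image = image / 128 - 1
# Add a leading batch dimension
image = np.concatenate([image[np.newaxis, :, :]] * batch_size)
body = json.dumps({"instances": image.tolist()})

# The body is already a JSON string, so pass it through unchanged as application/json
predictor = Predictor(endpoint, serializer=IdentitySerializer(content_type='application/json'))
inference_response = predictor.predict(data=body)
print(inference_response)
```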

# Request Inferences from a Deployed Service (Boto3)
<a name="neo-requests-boto3"></a>

Once you have a SageMaker AI endpoint `InService`, you can submit inference requests using the AWS SDK for Python (Boto3) client and the [`invoke_endpoint()`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker-runtime.html#SageMakerRuntime.Client.invoke_endpoint) API. The following code examples show how to send an image for inference:

------
#### [ PyTorch and MXNet ]

```
import json

import boto3

endpoint = 'insert name of your endpoint here'
image = 'path/to/image.jpg'

runtime = boto3.Session().client('sagemaker-runtime')

# Read image into memory
with open(image, 'rb') as f:
    payload = f.read()

# Send image via InvokeEndpoint API
response = runtime.invoke_endpoint(EndpointName=endpoint, ContentType='application/x-image', Body=payload)

# Unpack response
result = json.loads(response['Body'].read().decode())
```

------
#### [ TensorFlow ]

For TensorFlow, submit the input with `application/json` as the content type.

```
from PIL import Image
import numpy as np
import json
import boto3

client = boto3.client('sagemaker-runtime') 
input_file = 'path/to/image'
image = Image.open(input_file)
batch_size = 1
image = np.asarray(image.resize((224, 224)))
image = image / 128 - 1
image = np.concatenate([image[np.newaxis, :, :]] * batch_size)
body = json.dumps({"instances": image.tolist()})
ioc_predictor_endpoint_name = 'insert name of your endpoint here'
content_type = 'application/json'
ioc_response = client.invoke_endpoint(
    EndpointName=ioc_predictor_endpoint_name,
    Body=body,
    ContentType=content_type
)
```

------
#### [ XGBoost ]

For an XGBoost application, submit CSV text instead:

```
import boto3
import json
 
endpoint = 'insert your endpoint name here'
 
runtime = boto3.Session().client('sagemaker-runtime')
 
csv_text = '1,-1.0,1.0,1.5,2.6'
# Send CSV text via InvokeEndpoint API
response = runtime.invoke_endpoint(EndpointName=endpoint, ContentType='text/csv', Body=csv_text)
# Unpack response
result = json.loads(response['Body'].read().decode())
```

------

Note that BYOM (bring your own model) allows for a custom content type. For more information, see [`InvokeEndpoint`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html).

# Request Inferences from a Deployed Service (AWS CLI)
<a name="neo-requests-cli"></a>

Once you have an Amazon SageMaker AI endpoint `InService`, you can make inference requests from the AWS Command Line Interface (AWS CLI) with [`sagemaker-runtime invoke-endpoint`](https://docs.aws.amazon.com/cli/latest/reference/sagemaker-runtime/invoke-endpoint.html). The following example shows how to send an image for inference:

```
aws sagemaker-runtime invoke-endpoint --endpoint-name 'insert name of your endpoint here' --body fileb://image.jpg --content-type=application/x-image output_file.txt
```

If the inference succeeds, an `output_file.txt` file containing information about your inference request is created.

For TensorFlow, submit the input with `application/json` as the content type.

```
aws sagemaker-runtime invoke-endpoint --endpoint-name 'insert name of your endpoint here' --body fileb://input.json --content-type=application/json output_file.txt
```
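
The TensorFlow command above reads its request body from `input.json`. As a rough sketch of how such a file could be produced, the helper below writes a body in the `{"instances": [...]}` shape used by the Python TensorFlow examples in this guide. The function names and the constant-valued placeholder image are assumptions; a real request would carry resized, normalized pixel data instead.

```python
import json


def build_tf_payload(batch_size=1, height=224, width=224, channels=3, fill=0.0):
    """Return a JSON-serializable body shaped (batch, height, width, channels)."""
    pixel = [fill] * channels
    image = [[list(pixel) for _ in range(width)] for _ in range(height)]
    return {"instances": [image for _ in range(batch_size)]}


def write_payload(path="input.json", **kwargs):
    """Write the request body to a file that can be passed to --body."""
    with open(path, "w") as f:
        json.dump(build_tf_payload(**kwargs), f)
    return path
```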

# Inference Container Images
<a name="neo-deployment-hosting-services-container-images"></a>

SageMaker Neo now provides inference image URI information for `ml_*` targets. For more information, see [DescribeCompilationJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeCompilationJob.html#sagemaker-DescribeCompilationJob-response-InferenceImage).
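
As an illustration of reading that value programmatically, the sketch below calls `describe_compilation_job` on a Boto3 SageMaker client and returns the `InferenceImage` field of the response. The helper name is hypothetical, and the function returns `None` when the field is absent from the response.

```python
def get_inference_image(sagemaker_client, compilation_job_name):
    """Return the inference image URI reported for a compilation job, if any."""
    response = sagemaker_client.describe_compilation_job(
        CompilationJobName=compilation_job_name
    )
    return response.get("InferenceImage")


# Example usage (requires AWS credentials; the job name is a placeholder):
#   import boto3
#   uri = get_inference_image(boto3.client("sagemaker"), "my-neo-compilation-job")
```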

Based on your use case, replace the highlighted portions of the inference image URI templates below with appropriate values.

## Amazon SageMaker AI XGBoost
<a name="inference-container-collapse-xgboost"></a>

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/xgboost-neo:latest
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

## Keras
<a name="inference-container-collapse-keras"></a>

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-neo-keras:fx_version-instance_type-py3
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `2.2.4`.

Replace *instance_type* with either `cpu` or `gpu`.

## MXNet
<a name="inference-container-collapse-mxnet"></a>

------
#### [ CPU or GPU instance types ]

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-inference-mxnet:fx_version-instance_type-py3
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `1.8.0`.

Replace *instance_type* with either `cpu` or `gpu`.

------
#### [ Inferentia1 ]

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-neo-mxnet:fx_version-instance_type-py3
```

Replace *aws_region* with either `us-east-1` or `us-west-2`.

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `1.5.1`.

Replace *instance_type* with `inf`.

------

## ONNX
<a name="inference-container-collapse-onnx"></a>

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-neo-onnx:fx_version-instance_type-py3
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `1.5.0`.

Replace *instance_type* with either `cpu` or `gpu`.

## PyTorch
<a name="inference-container-collapse-pytorch"></a>

------
#### [ CPU or GPU instance types ]

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-inference-pytorch:fx_version-instance_type-py3
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `1.4`, `1.5`, `1.6`, `1.7`, `1.8`, `1.12`, `1.13`, or `2.0`.

Replace *instance_type* with either `cpu` or `gpu`.

------
#### [ Inferentia1 ]

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-neo-pytorch:fx_version-instance_type-py3
```

Replace *aws_region* with either `us-east-1` or `us-west-2`.

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `1.5.1`.

Replace *instance_type* with `inf`.

------
#### [ Inferentia2 and Trainium1 ]

```
763104351884.dkr.ecr.aws_region.amazonaws.com/pytorch-inference-neuronx:1.13.1-neuronx-py38-sdk2.10.0-ubuntu20.04
```

Replace *aws_region* with `us-east-2` for Inferentia2, or with `us-east-1` for Trainium1.

------

## TensorFlow
<a name="inference-container-collapse-tf"></a>

------
#### [ CPU or GPU instance types ]

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-inference-tensorflow:fx_version-instance_type-py3
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use.

Replace *fx_version* with `1.15.3` or `2.9`.

Replace *instance_type* with either `cpu` or `gpu`.

------
#### [ Inferentia1 ]

```
aws_account_id.dkr.ecr.aws_region.amazonaws.com/sagemaker-neo-tensorflow:fx_version-instance_type-py3
```

Replace *aws_account_id* with the value from the table at the end of this page, based on the *aws_region* you use. Note that for the `inf` instance type, only `us-east-1` and `us-west-2` are supported.

Replace *fx_version* with `1.15.0`.

Replace *instance_type* with `inf`.

------
#### [ Inferentia2 and Trainium1 ]

```
763104351884.dkr.ecr.aws_region.amazonaws.com/tensorflow-inference-neuronx:2.10.1-neuronx-py38-sdk2.10.0-ubuntu20.04
```

Replace *aws_region* with `us-east-2` for Inferentia2, or with `us-east-1` for Trainium1.

------

The following table maps *aws_account_id* to *aws_region*. Use this table to find the correct inference image URI for your application.


| aws_account_id | aws_region |
| --- | --- | 
| 785573368785 | us-east-1 | 
| 007439368137 | us-east-2 | 
| 710691900526 | us-west-1 | 
| 301217895009 | us-west-2 | 
| 802834080501 | eu-west-1 | 
| 205493899709 | eu-west-2 | 
| 254080097072 | eu-west-3 | 
| 601324751636 | eu-north-1 | 
| 966458181534 | eu-south-1 | 
| 746233611703 | eu-central-1 | 
| 110948597952 | ap-east-1 | 
| 763008648453 | ap-south-1 | 
| 941853720454 | ap-northeast-1 | 
| 151534178276 | ap-northeast-2 | 
| 925152966179 | ap-northeast-3 | 
| 324986816169 | ap-southeast-1 | 
| 355873309152 | ap-southeast-2 | 
| 474822919863 | cn-northwest-1 | 
| 472730292857 | cn-north-1 | 
| 756306329178 | sa-east-1 | 
| 464438896020 | ca-central-1 | 
| 836785723513 | me-south-1 | 
| 774647643957 | af-south-1 | 
| 275950707576 | il-central-1 | 
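
To make the template substitution concrete, the sketch below fills in the `repository:fx_version-instance_type-py3` pattern shown in the CPU/GPU and Inferentia1 templates on this page. The function name is hypothetical, and the dictionary holds only a small excerpt of the table above. The sketch does not cover the fixed `neuronx` URIs, the `xgboost-neo:latest` tag, or any Region whose registry domain differs from `amazonaws.com`.

```python
# Excerpt of the account ID table above; extend it with the Regions you need.
NEO_ACCOUNT_IDS = {
    "us-east-1": "785573368785",
    "us-east-2": "007439368137",
    "us-west-2": "301217895009",
    "eu-west-1": "802834080501",
}


def neo_image_uri(repository, fx_version, instance_type, region):
    """Fill in the inference image URI template for one framework."""
    account_id = NEO_ACCOUNT_IDS[region]  # raises KeyError for unlisted Regions
    return (
        f"{account_id}.dkr.ecr.{region}.amazonaws.com/"
        f"{repository}:{fx_version}-{instance_type}-py3"
    )


print(neo_image_uri("sagemaker-inference-pytorch", "1.8", "cpu", "us-west-2"))
# 301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-inference-pytorch:1.8-cpu-py3
```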

# Edge Devices
<a name="neo-edge-devices"></a>

Amazon SageMaker Neo provides compilation support for popular machine learning frameworks. You can deploy your Neo-compiled models to edge devices such as the Raspberry Pi 3, Texas Instruments' Sitara, Jetson TX1, and more. For a full list of supported frameworks and edge devices, see [Supported Frameworks, Devices, Systems, and Architectures](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-devices-edge.html).

You must configure your edge device so that it can use AWS services. One way to do this is to install DLR and Boto3 on your device, which requires setting up authentication credentials. For more information, see [Boto3 AWS Configuration](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration). Once your model is compiled and your edge device is configured, you can download the model from Amazon S3 to your edge device. From there, you can use the [Deep Learning Runtime (DLR)](https://neo-ai-dlr.readthedocs.io/en/latest/index.html) to read the compiled model and make inferences.
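
The download-then-infer flow described above might look like the following on the device. This is a sketch under several assumptions: the compiled artifact is a gzip-compressed tar archive, the device runs the model on CPU, and the bucket, key, and function names are placeholders. The `DLRModel(model_dir, "cpu")` constructor and `run` method come from the DLR Python package, so check the DLR documentation for the input format your model expects.

```python
import os
import tarfile


def extract_compiled_model(archive_path, model_dir):
    """Unpack a compiled model archive into a directory that DLR can load."""
    os.makedirs(model_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(model_dir)
    return model_dir


def download_and_run(bucket, key, model_dir, input_data):
    """Download the compiled model from Amazon S3 and run one inference.

    Requires Boto3, DLR, and configured AWS credentials on the device.
    """
    import boto3               # imported lazily: only needed on the device
    from dlr import DLRModel

    archive_path = "compiled-model.tar.gz"  # hypothetical local file name
    boto3.client("s3").download_file(bucket, key, archive_path)
    extract_compiled_model(archive_path, model_dir)

    model = DLRModel(model_dir, "cpu")
    return model.run(input_data)
```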

For first-time users, we recommend the [Getting Started](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-getting-started-edge.html) guide. It walks you through setting up your credentials, compiling a model, deploying the model to a Raspberry Pi 3, and making inferences on images.

**Topics**
+ [Supported Frameworks, Devices, Systems, and Architectures](neo-supported-devices-edge.md)
+ [Deploy Models](neo-deployment-edge.md)
+ [Set Up Neo on Edge Devices](neo-getting-started-edge.md)

# Supported Frameworks, Devices, Systems, and Architectures
<a name="neo-supported-devices-edge"></a>

Amazon SageMaker Neo supports common machine learning frameworks, edge devices, operating systems, and chip architectures. Select one of the topics below to find out whether Neo supports your framework, edge device, OS, and chip architecture.

You can find a list of models that have been tested by the Amazon SageMaker Neo team in the [Tested Models](neo-supported-edge-tested-models.md) section.

**Note**  
Ambarella devices require additional files to be included in the compressed TAR file before it is sent for compilation. For more information, see [Troubleshoot Ambarella Errors](neo-troubleshooting-target-devices-ambarella.md).
TIM-VX (libtim-vx.so) is required for i.MX 8M Plus. For information on how to build TIM-VX, see the [TIM-VX GitHub repository](https://github.com/VeriSilicon/TIM-VX).

**Topics**
+ [Supported Frameworks](neo-supported-devices-edge-frameworks.md)
+ [Supported Devices, Chip Architectures, and Systems](neo-supported-devices-edge-devices.md)
+ [Tested Models](neo-supported-edge-tested-models.md)

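# Supported Frameworks
<a name="neo-supported-devices-edge-frameworks"></a>

Amazon SageMaker Neo supports the following frameworks.


| Framework | Framework Version | Model Version | Models | Model Formats (packaged in \*.tar.gz) | Toolkits |
| --- | --- | --- | --- | --- | --- |
| MXNet | 1.8 | Supports 1.8 or earlier | Image Classification, Object Detection, Semantic Segmentation, Pose Estimation, Activity Recognition | One symbol file (.json) and one parameter file (.params) | GluonCV v0.8.0 |
| ONNX | 1.7 | Supports 1.7 or earlier | Image Classification, SVM | One model file (.onnx) |  |
| Keras | 2.2 | Supports 2.2 or earlier | Image Classification | One model definition file (.h5) |  |
| PyTorch | 1.7, 1.8 | Supports 1.7, 1.8 or earlier | Image Classification, Object Detection | One model definition file (.pth) |  |
| TensorFlow | 1.15, 2.4, 2.5 (only for ml.inf1.\* instances) | Supports 1.15, 2.4, 2.5 (only for ml.inf1.\* instances) or earlier | Image Classification, Object Detection | For saved models, one .pb or .pbtxt file and a variables directory that contains the variables. For frozen models, only one .pb or .pbtxt file. |  |
| TensorFlow-Lite | 1.15 | Supports 1.15 or earlier | Image Classification, Object Detection | One model definition flat buffer file (.tflite) |  |
| XGBoost | 1.3 | Supports 1.3 or earlier | Decision Trees | One XGBoost model file (.model) in which the number of nodes in a tree is less than 2^31 |  |
| DARKNET |  |  | Image Classification, Object Detection (Yolo model is not supported) | One config (.cfg) file and one weights (.weight) file |  |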
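
The "Model Formats" column lists the files that go into the `*.tar.gz` archive submitted for compilation. As a small sketch of that packaging step, the helper below archives a list of files at the top level of a gzip-compressed tarball. The function name and the MXNet file names in the usage comment are hypothetical.

```python
import os
import tarfile


def package_model(file_paths, archive_path="model.tar.gz"):
    """Bundle model files into a .tar.gz, stored without directory prefixes."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in file_paths:
            archive.add(path, arcname=os.path.basename(path))
    return archive_path


# Example for an MXNet model (one symbol file and one parameter file):
#   package_model(["model-symbol.json", "model-0000.params"])
```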

# Supported Devices, Chip Architectures, and Systems
<a name="neo-supported-devices-edge-devices"></a>

Amazon SageMaker Neo supports the following devices, chip architectures, and operating systems.

## Devices
<a name="neo-supported-edge-devices"></a>

You can select a device using the dropdown list in the [Amazon SageMaker AI console](https://console.aws.amazon.com/sagemaker) or by specifying `TargetDevice` in the output configuration of the [`CreateCompilationJob`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateCompilationJob.html) API.

You can choose from the following edge devices:


| Device List | System on a Chip (SoC) | Operating System | Architecture | Accelerator | Compiler Options Example |
| --- | --- | --- | --- | --- | --- |
| aisage | N/A | Linux | ARM64 | Mali | N/A |
| amba_cv2 | CV2 | Arch Linux | ARM64 | cvflow | N/A |
| amba_cv22 | CV22 | Arch Linux | ARM64 | cvflow | N/A |
| amba_cv25 | CV25 | Arch Linux | ARM64 | cvflow | N/A |
| coreml | N/A | iOS, macOS | N/A | N/A | `{"class_labels": "imagenet_labels_1000.txt"}` |
| imx8qm | NXP imx8 | Linux | ARM64 | N/A | N/A |
| imx8mplus | i.MX 8M Plus | Linux | ARM64 | NPU | N/A |
| jacinto_tda4vm | TDA4VM | Linux | ARM | TDA4VM | N/A |
| jetson_nano | N/A | Linux | ARM64 | NVIDIA | `{'gpu-code': 'sm_53', 'trt-ver': '5.0.6', 'cuda-ver': '10.0'}`; for `TensorFlow2`, `{'JETPACK_VERSION': '4.6', 'gpu_code': 'sm_72'}` |
| jetson_tx1 | N/A | Linux | ARM64 | NVIDIA | `{'gpu-code': 'sm_53', 'trt-ver': '6.0.1', 'cuda-ver': '10.0'}` |
| jetson_tx2 | N/A | Linux | ARM64 | NVIDIA | `{'gpu-code': 'sm_62', 'trt-ver': '6.0.1', 'cuda-ver': '10.0'}` |
| jetson_xavier | N/A | Linux | ARM64 | NVIDIA | `{'gpu-code': 'sm_72', 'trt-ver': '5.1.6', 'cuda-ver': '10.0'}` |
| qcs605 | N/A | Android | ARM64 | Mali | `{'ANDROID_PLATFORM': 27}` |
| qcs603 | N/A | Android | ARM64 | Mali | `{'ANDROID_PLATFORM': 27}` |
| rasp3b | ARM A56 | Linux | ARM_EABIHF | N/A | `{'mattr': ['+neon']}` |
| rasp4b | ARM A72 | N/A | N/A | N/A | N/A |
| rk3288 | N/A | Linux | ARM_EABIHF | Mali | N/A |
| rk3399 | N/A | Linux | ARM64 | Mali | N/A |
| sbe_c | N/A | Linux | x86_64 | N/A | `{'mcpu': 'core-avx2'}` |
| sitara_am57x | AM57X | Linux | ARM64 | EVE and/or C66x DSP | N/A |
| x86_win32 | N/A | Windows 10 | X86_32 | N/A | N/A |
| x86_win64 | N/A | Windows 10 | X86_32 | N/A | N/A |

For more information about the JSON key-value compiler options for each target device, see the `CompilerOptions` field of the [`OutputConfig` API](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OutputConfig.html) data type.
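
As a sketch of how a target device and its compiler options could be combined into the output configuration of a compilation job, the helper below builds an `OutputConfig` dictionary. It assumes that `CompilerOptions` is passed as a JSON-encoded string; the function name and the S3 location are placeholders.

```python
import json


def build_output_config(s3_output_location, target_device, compiler_options=None):
    """Build the OutputConfig argument for a compilation job request."""
    config = {
        "S3OutputLocation": s3_output_location,
        "TargetDevice": target_device,
    }
    if compiler_options:
        # Assumption: the API expects the options serialized as a JSON string.
        config["CompilerOptions"] = json.dumps(compiler_options)
    return config


# Options for jetson_tx2 as listed in the device table.
config = build_output_config(
    "s3://amzn-s3-demo-bucket/neo-output/",  # placeholder location
    "jetson_tx2",
    {"gpu-code": "sm_62", "trt-ver": "6.0.1", "cuda-ver": "10.0"},
)
```

The resulting dictionary can be passed as the `OutputConfig` argument of `create_compilation_job` on a Boto3 SageMaker client, alongside the other required job parameters.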

## Systems and Chip Architectures
<a name="neo-supported-edge-granular"></a>

The following lookup tables show the operating systems and architectures available for Neo model compilation jobs.

------
#### [ Linux ]


| Accelerator | X86_64 | X86 | ARM64 | ARM_EABIHF | ARM_EABI |
| --- | --- | --- | --- | --- | --- |
| No accelerator (CPU) | Yes | No | Yes | Yes | Yes |
| Nvidia GPU | Yes | No | Yes | No | No |
| Intel_Graphics | Yes | No | No | No | No |
| ARM Mali | No | No | Yes | Yes | Yes |

------
#### [ Android ]


| Accelerator | X86_64 | X86 | ARM64 | ARM_EABIHF | ARM_EABI |
| --- | --- | --- | --- | --- | --- |
| No accelerator (CPU) | Yes | Yes | Yes | No | Yes |
| Nvidia GPU | No | No | No | No | No |
| Intel_Graphics | Yes | Yes | No | No | No |
| ARM Mali | No | No | Yes | No | Yes |

------
#### [ Windows ]


| Accelerator | X86_64 | X86 | ARM64 | ARM_EABIHF | ARM_EABI |
| --- | --- | --- | --- | --- | --- |
| No accelerator (CPU) | Yes | Yes | No | No | No |

------
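
The Linux and Android lookup tables above can be encoded as a small support matrix and checked before a job is submitted with a `TargetPlatform` instead of a `TargetDevice`. The sketch below is an illustration only: the upper-case strings are assumed to match the `Os`, `Arch`, and `Accelerator` values accepted by `TargetPlatform`, so verify them against the API reference. Windows is left out because this sketch does not assume a corresponding `Os` value.

```python
# Transcribed from the Linux and Android tables above.
# The None key stands for "no accelerator (CPU)".
SUPPORTED = {
    "LINUX": {
        None: {"X86_64", "ARM64", "ARM_EABIHF", "ARM_EABI"},
        "NVIDIA": {"X86_64", "ARM64"},
        "INTEL_GRAPHICS": {"X86_64"},
        "MALI": {"ARM64", "ARM_EABIHF", "ARM_EABI"},
    },
    "ANDROID": {
        None: {"X86_64", "X86", "ARM64", "ARM_EABI"},
        "NVIDIA": set(),
        "INTEL_GRAPHICS": {"X86_64", "X86"},
        "MALI": {"ARM64", "ARM_EABI"},
    },
}


def build_target_platform(os_name, arch, accelerator=None):
    """Return a TargetPlatform dict, or raise if the tables mark it unsupported."""
    if arch not in SUPPORTED.get(os_name, {}).get(accelerator, set()):
        raise ValueError(f"Unsupported combination: {os_name}/{arch}/{accelerator}")
    platform = {"Os": os_name, "Arch": arch}
    if accelerator:
        platform["Accelerator"] = accelerator
    return platform
```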

# Tested Models
<a name="neo-supported-edge-tested-models"></a>

The following collapsible sections provide information about machine learning models that were tested by the Amazon SageMaker Neo team. Expand the section for your framework to check whether a model was tested.

**Note**  
This is not a comprehensive list of the models that can be compiled with Neo.

To find out whether you can compile your model with SageMaker Neo, see [Supported Frameworks](neo-supported-devices-edge-frameworks.md) and [SageMaker Neo Supported Operators](https://aws.amazon.com/releasenotes/sagemaker-neo-supported-frameworks-and-operators/).

## DarkNet
<a name="collapsible-section-01"></a>


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Alexnet |  |  |  |  |  |  |  |  |  |
| Resnet50 | X | X |  | X | X | X |  | X | X |
| YOLOv2 |  |  |  | X | X | X |  | X | X |
| YOLOv2_tiny | X | X |  | X | X | X |  | X | X |
| YOLOv3_416 |  |  |  | X | X | X |  | X | X |
| YOLOv3_tiny | X | X |  | X | X | X |  | X | X |

## MXNet
<a name="collapsible-section-02"></a>


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Alexnet |  |  | X |  |  |  |  |  |  |
| Densenet121 |  |  | X |  |  |  |  |  |  |
| DenseNet201 | X | X | X | X | X | X |  | X | X |
| GoogLeNet | X | X |  | X | X | X |  | X | X |
| InceptionV3 |  |  |  | X | X | X |  | X | X |
| MobileNet0.75 | X | X |  | X | X | X |  |  | X |
| MobileNet1.0 | X | X | X | X | X | X |  |  | X |
| MobileNetV2_0.5 | X | X |  | X | X | X |  |  | X |
| MobileNetV2_1.0 | X | X | X | X | X | X | X | X | X |
| MobileNetV3_Large | X | X | X | X | X | X | X | X | X |
| MobileNetV3_Small | X | X | X | X | X | X | X | X | X |
| ResNeSt50 |  |  |  | X | X |  |  | X | X |
| ResNet18_v1 | X | X | X | X | X | X |  |  | X |
| ResNet18_v2 | X | X |  | X | X | X |  |  | X |
| ResNet50_v1 | X | X | X | X | X | X |  | X | X |
| ResNet50_v2 | X | X | X | X | X | X |  | X | X |
| ResNext101_32x4d |  |  |  |  |  |  |  |  |  |
| ResNext50_32x4d | X |  | X | X | X |  |  | X | X |
| SENet_154 |  |  |  | X | X | X |  | X | X |
| SE_ResNext50_32x4d | X | X |  | X | X | X |  | X | X |
| SqueezeNet1.0 | X | X | X | X | X | X |  |  | X |
| SqueezeNet1.1 | X | X | X | X | X | X |  | X | X |
| VGG11 | X | X | X | X | X |  |  | X | X |
| Xception | X | X | X | X | X | X |  | X | X |
| darknet53 | X | X |  | X | X | X |  | X | X |
| resnet18_v1b_0.89 | X | X |  | X | X | X |  |  | X |
| resnet50_v1d_0.11 | X | X |  | X | X | X |  |  | X |
| resnet50_v1d_0.86 | X | X | X | X | X | X |  | X | X |
| ssd_512_mobilenet1.0_coco | X |  | X | X | X | X |  | X | X |
| ssd_512_mobilenet1.0_voc | X |  | X | X | X | X |  | X | X |
| ssd_resnet50_v1 | X |  | X | X | X |  |  | X | X |
| yolo3_darknet53_coco | X |  |  | X | X |  |  | X | X |
| yolo3_mobilenet1.0_coco | X | X |  | X | X | X |  | X | X |
| deeplab_resnet50 |  |  | X |  |  |  |  |  |  |

## Keras
<a name="collapsible-section-03"></a>


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| densenet121 | X | X | X | X | X | X |  | X | X |
| densenet201 | X | X | X | X | X | X |  |  | X |
| inception_v3 | X | X |  | X | X | X |  | X | X |
| mobilenet_v1 | X | X | X | X | X | X |  | X | X |
| mobilenet_v2 | X | X | X | X | X | X |  | X | X |
| resnet152_v1 |  |  |  | X | X |  |  |  | X |
| resnet152_v2 |  |  |  | X | X |  |  |  | X |
| resnet50_v1 | X | X | X | X | X |  |  | X | X |
| resnet50_v2 | X | X | X | X | X | X |  | X | X |
| vgg16 |  |  | X | X | X |  |  | X | X |

## ONNX
<a name="collapsible-section-04"></a>


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| alexnet |  |  | X |  |  |  |  |  |  | 
| mobilenetv2-1.0 | X | X | X | X | X | X |  | X | X | 
| resnet18v1 | X |  |  | X | X |  |  |  | X | 
| resnet18v2 | X |  |  | X | X |  |  |  | X | 
| resnet50v1 | X |  | X | X | X |  |  | X | X | 
| resnet50v2 | X |  | X | X | X |  |  | X | X | 
| resnet152v1 |  |  |  | X | X | X |  |  | X | 
| resnet152v2 |  |  |  | X | X | X |  |  | X | 
| squeezenet1.1 | X |  | X | X | X | X |  | X | X | 
| vgg19 |  |  | X |  |  |  |  |  | X | 

## PyTorch (FP32)
<a name="collapsible-section-05"></a>


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Ambarella CV25 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| densenet121 | X | X | X | X | X | X | X |  | X | X | 
| inception_v3 |  | X |  |  | X | X | X |  | X | X |
| resnet152 |  |  |  |  | X | X | X |  |  | X | 
| resnet18 | X | X |  |  | X | X | X |  |  | X | 
| resnet50 | X | X | X | X | X | X |  |  | X | X | 
| squeezenet1.0 | X | X |  |  | X | X | X |  |  | X | 
| squeezenet1.1 | X | X | X | X | X | X | X |  | X | X | 
| yolov4 |  |  |  |  | X | X |  |  |  |  | 
| yolov5 |  |  |  | X | X | X |  |  |  |  | 
| fasterrcnn_resnet50_fpn |  |  |  |  | X | X |  |  |  |  |
| maskrcnn_resnet50_fpn |  |  |  |  | X | X |  |  |  |  |

## TensorFlow
<a name="collapsible-section-06"></a>

------
#### [ TensorFlow ]


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Ambarella CV25 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| densenet201 | X | X | X | X | X | X | X |  | X | X |
| inception_v3 | X | X | X |  | X | X | X |  | X | X |
| mobilenet100_v1 | X | X | X |  | X | X | X |  |  | X |
| mobilenet100_v2.0 | X | X | X |  | X | X | X |  | X | X |
| mobilenet130_v2 | X | X |  |  | X | X | X |  |  | X |
| mobilenet140_v2 | X | X | X |  | X | X | X |  | X | X |
| resnet50_v1.5 | X | X |  |  | X | X | X |  | X | X |
| resnet50_v2 | X | X | X | X | X | X | X |  | X | X |
| squeezenet | X | X | X | X | X | X | X |  | X | X |
| mask_rcnn_inception_resnet_v2 |  |  |  |  | X |  |  |  |  |  |
| ssd_mobilenet_v2 |  |  |  |  | X | X |  |  |  |  |
| faster_rcnn_resnet50_lowproposals |  |  |  |  | X |  |  |  |  |  |
| rfcn_resnet101 |  |  |  |  | X |  |  |  |  |  |

------
#### [ TensorFlow.Keras ]


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| DenseNet121  | X | X |  | X | X | X |  | X | X | 
| DenseNet201 | X | X |  | X | X | X |  |  | X | 
| InceptionV3 | X | X |  | X | X | X |  | X | X | 
| MobileNet | X | X |  | X | X | X |  | X | X | 
| MobileNetv2 | X | X |  | X | X | X |  | X | X | 
| NASNetLarge |  |  |  | X | X |  |  | X | X | 
| NASNetMobile | X | X |  | X | X | X |  | X | X | 
| ResNet101 |  |  |  | X | X | X |  |  | X | 
| ResNet101V2 |  |  |  | X | X | X |  |  | X | 
| ResNet152 |  |  |  | X | X |  |  |  | X | 
| ResNet152v2 |  |  |  | X | X |  |  |  | X | 
| ResNet50 | X | X |  | X | X |  |  | X | X | 
| ResNet50V2 | X | X |  | X | X | X |  | X | X | 
| VGG16 |  |  |  | X | X |  |  | X | X | 
| Xception | X | X |  | X | X | X |  | X | X | 

------

## TensorFlow-Lite
<a name="collapsible-section-07"></a>

------
#### [ TensorFlow-Lite (FP32) ]


| Models | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows | i.MX 8M Plus |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| densenet_2018_04_27 | X |  |  | X | X | X |  |  | X |  |
| inception_resnet_v2_2018_04_27 |  |  |  | X | X | X |  |  | X |  |
| inception_v3_2018_04_27 |  |  |  | X | X | X |  |  | X | X |
| inception_v4_2018_04_27 |  |  |  | X | X | X |  |  | X | X |
| mnasnet_0.5_224_09_07_2018 | X |  |  | X | X | X |  |  | X |  |
| mnasnet_1.0_224_09_07_2018 | X |  |  | X | X | X |  |  | X |  |
| mnasnet_1.3_224_09_07_2018 | X |  |  | X | X | X |  |  | X |  |
| mobilenet_v1_0.25_128 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_0.25_224 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_0.5_128 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_0.5_224 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_0.75_128 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_0.75_224 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_1.0_128 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v1_1.0_192 | X |  |  | X | X | X |  |  | X | X |
| mobilenet_v2_1.0_224 | X |  |  | X | X | X |  |  | X | X |
| resnet_v2_101 |  |  |  | X | X | X |  |  | X |  |
| squeezenet_2018_04_27 | X |  |  | X | X | X |  |  | X |  |

------
#### [ TensorFlow-Lite (INT8) ]


|  Model | ARM V8 | ARM Mali | Ambarella CV22 | Nvidia | Panorama | TI TDA4VM | Qualcomm QCS603 | X86_Linux | X86_Windows | i.MX 8M Plus | 
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| inception_v1 |  |  |  |  |  |  | X |  |  | X | 
| inception_v2 |  |  |  |  |  |  | X |  |  | X | 
| inception_v3 | X |  |  |  |  | X | X |  | X | X | 
| inception_v4_299 | X |  |  |  |  | X | X |  | X | X | 
| mobilenet_v1_0.25_128 | X |  |  |  |  | X |  |  | X | X | 
| mobilenet_v1_0.25_224 | X |  |  |  |  | X |  |  | X | X | 
| mobilenet_v1_0.5_128 | X |  |  |  |  | X |  |  | X | X | 
| mobilenet_v1_0.5_224 | X |  |  |  |  | X |  |  | X | X | 
| mobilenet_v1_0.75_128 | X |  |  |  |  | X |  |  | X | X | 
| mobilenet_v1_0.75_224 | X |  |  |  |  | X | X |  | X | X | 
| mobilenet_v1_1.0_128 | X |  |  |  |  | X |  |  | X | X | 
| mobilenet_v1_1.0_224 | X |  |  |  |  | X | X |  | X | X | 
| mobilenet_v2_1.0_224 | X |  |  |  |  | X | X |  | X | X | 
| deeplab-v3_513 |  |  |  |  |  |  | X |  |  |  | 

------

# Deploy Models
<a name="neo-deployment-edge"></a>

You can deploy the compute module to resource-constrained edge devices by downloading the compiled model from Amazon S3 to your device and using [DLR](https://github.com/neo-ai/neo-ai-dlr), or you can use [AWS IoT Greengrass](https://docs.aws.amazon.com/greengrass/latest/developerguide/what-is-gg.html).

Before moving on, make sure your edge device is supported by SageMaker Neo. See [Supported Frameworks, Devices, Systems, and Architectures](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-devices-edge.html) to find out which edge devices are supported. Also make sure that you specified your target edge device when you submitted the compilation job. See [Use Neo to Compile a Model](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html).

## Deploy a Compiled Model (DLR)
<a name="neo-deployment-dlr"></a>

[DLR](https://github.com/neo-ai/neo-ai-dlr) is a compact, common runtime for deep learning models and decision tree models. DLR uses the [TVM](https://github.com/neo-ai/tvm) runtime, the [Treelite](https://treelite.readthedocs.io/en/latest/install.html) runtime, and NVIDIA TensorRT™, and it can include other hardware-specific runtimes. DLR provides unified Python/C++ APIs for loading and running compiled models on various devices.

You can install the latest release of the DLR package using the following pip command:

```
pip install dlr
```

To install DLR on GPU targets or non-x86 edge devices, see [Releases](https://github.com/neo-ai/neo-ai-dlr/releases) for prebuilt binaries, or [Installing DLR](https://neo-ai-dlr.readthedocs.io/en/latest/install.html) to build DLR from source. For example, to install DLR for Raspberry Pi 3, you can use:

```
pip install https://neo-ai-dlr-release.s3-us-west-2.amazonaws.com/v1.3.0/pi-armv7l-raspbian4.14.71-glibc2_24-libstdcpp3_4/dlr-1.3.0-py3-none-any.whl
```
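
After installing, it can help to confirm which DLR version (if any) is present in the current Python environment before loading a model. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and is not part of DLR.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


dlr_version = installed_version("dlr")
if dlr_version is None:
    print("DLR is not installed in this environment")
else:
    print(f"DLR {dlr_version} is installed")
```

If the helper reports that DLR is missing on a non-x86 device, use one of the prebuilt wheels or the source build described above.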

## Deploy a Model (AWS IoT Greengrass)
<a name="neo-deployment-greengrass"></a>

[AWS IoT Greengrass](https://docs.aws.amazon.com/greengrass/latest/developerguide/what-is-gg.html) extends cloud capabilities to local devices. It enables devices to collect and analyze data closer to the source of information, react autonomously to local events, and communicate securely with each other on local networks. With AWS IoT Greengrass, you can perform machine learning inference at the edge on locally generated data using cloud-trained models. Currently, you can deploy models to all AWS IoT Greengrass devices based on ARM Cortex-A, Intel Atom, and Nvidia Jetson series processors. For more information on deploying a Lambda inference application to perform machine learning inference with AWS IoT Greengrass, see [How to configure optimized machine learning inference using the AWS Management Console](https://docs.aws.amazon.com/greengrass/latest/developerguide/ml-dlc-console.html).

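# Set Up Neo on Edge Devices
<a name="neo-getting-started-edge"></a>

This guide to getting started with Amazon SageMaker Neo shows you how to compile a model, set up your device, and make inferences on your device. Most of the code examples use Boto3. The guide also provides AWS CLI commands where applicable, as well as instructions on how to satisfy the prerequisites for Neo.

**Note**  
You can run the following code snippets on your local machine, within a SageMaker notebook, within Amazon SageMaker Studio, or (depending on your edge device) on the edge device itself. The setup is similar in each case, but there are two main exceptions if you run this guide within a SageMaker notebook instance or a SageMaker Studio session:  
+ You do not need to install Boto3.
+ You do not need to add the `AmazonSageMakerFullAccess` IAM policy.

 This guide assumes you are running the following instructions on your edge device.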

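# Prerequisites
<a name="neo-getting-started-edge-step0"></a>

SageMaker Neo is a capability that lets you train machine learning models once and run them anywhere in the cloud and at the edge. Before you can compile and optimize your models with Neo, there are a few prerequisites to set up. You must install the necessary Python libraries, configure your AWS credentials, create an IAM role with the required permissions, and set up an S3 bucket for storing model artifacts. You must also have a trained machine learning model ready. The following steps guide you through the setup:

1. **Install Boto3**

   If you are running these commands on your edge device, you must install the AWS SDK for Python (Boto3). Within a Python environment (preferably a virtual environment), run the following locally in your edge device's terminal or within a Jupyter notebook instance: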

------
#### [ Terminal ]

   ```
   pip install boto3
   ```

------
#### [ Jupyter Notebook ]

   ```
   !pip install boto3
   ```

------

1. **Set up AWS credentials**

   You need to set up Amazon Web Services credentials on your device in order to run the SDK for Python (Boto3). By default, the AWS credentials should be stored in the file `~/.aws/credentials` on your edge device. Within the credentials file, you should see two entries: `aws_access_key_id` and `aws_secret_access_key`.

   In your terminal, run:

   ```
   $ more ~/.aws/credentials
   
   [default]
   aws_access_key_id = YOUR_ACCESS_KEY
   aws_secret_access_key = YOUR_SECRET_KEY
   ```

   The [AWS General Reference Guide](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) has instructions on how to get the necessary `aws_access_key_id` and `aws_secret_access_key`. For more information on how to set up credentials on your device, see the [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration) documentation.

1. **Set up an IAM role and attach policies**

   Neo needs access to your S3 bucket URI. Create an IAM role that can run SageMaker AI and has permission to access the S3 URI. You can create an IAM role using the SDK for Python (Boto3), the console, or the AWS CLI. The following example shows how to create an IAM role using the SDK for Python (Boto3):

   ```
   import boto3
   
   AWS_REGION = 'aws-region'
   
   # Create an IAM client to interact with IAM
   iam_client = boto3.client('iam', region_name=AWS_REGION)
   role_name = 'role-name'
   ```

   For more information on how to create an IAM role with the console, the AWS CLI, or the AWS API, see [Creating an IAM user in your AWS account](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_api).

   Create a dictionary describing the IAM policy you are attaching. This policy is used to create the new IAM role.

   ```
   policy = {
       'Statement': [
           {
               'Action': 'sts:AssumeRole',
               'Effect': 'Allow',
               'Principal': {'Service': 'sagemaker.amazonaws.com'},
           }],  
       'Version': '2012-10-17'
   }
   ```

   Create a new IAM role using the policy you defined above:

   ```
   import json 
   
   new_role = iam_client.create_role(
       AssumeRolePolicyDocument=json.dumps(policy),
       Path='/',
       RoleName=role_name
   )
   ```

   You need the role's Amazon Resource Name (ARN) when you create a compilation job in a later step, so store it in a variable as well.

   ```
   role_arn = new_role['Role']['Arn']
   ```

   Now that you have created a new role, attach the permissions it needs to interact with Amazon SageMaker AI and Amazon S3:

   ```
   iam_client.attach_role_policy(
       RoleName=role_name,
       PolicyArn='arn:aws:iam::aws:policy/AmazonSageMakerFullAccess'
   )
   
   iam_client.attach_role_policy(
       RoleName=role_name,
       PolicyArn='arn:aws:iam::aws:policy/AmazonS3FullAccess'
   )
   ```

1. **Create an Amazon S3 bucket to store your model artifacts**

   SageMaker Neo accesses your model artifacts from Amazon S3.

------
#### [ Boto3 ]

   ```
   # Create an S3 client
   s3_client = boto3.client('s3', region_name=AWS_REGION)
   
   # Name buckets
   bucket='name-of-your-bucket'
   
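   # Create the bucket only if it does not already exist in this account
   s3_resource = boto3.resource('s3')
   if s3_resource.Bucket(bucket) not in s3_resource.buckets.all():
       if AWS_REGION == 'us-east-1':
           # us-east-1 is the default Region and is not accepted as a LocationConstraint
           s3_client.create_bucket(Bucket=bucket)
       else:
           s3_client.create_bucket(
               Bucket=bucket,
               CreateBucketConfiguration={
                   'LocationConstraint': AWS_REGION
               }
           )
   else:
       print(f'Bucket {bucket} already exists. No action needed.')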
   ```

------
#### [ CLI ]

   ```
   aws s3 mb s3://'name-of-your-bucket' --region specify-your-region 
   
   # Check your bucket exists
   aws s3 ls s3://'name-of-your-bucket'/
   ```

------

1. **Train a machine learning model**

   For more information on how to train a machine learning model using Amazon SageMaker AI, see [Train a Model with Amazon SageMaker AI](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-training.html). You can optionally upload your locally trained model directly into an Amazon S3 bucket.
**Note**  
 Make sure the model is correctly formatted for the framework you used. See [What input data shapes does SageMaker Neo expect?](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html#neo-job-compilation-expected-inputs)

   If you do not have a model yet, use the `curl` command to get a local copy of the `coco_ssd_mobilenet` model from TensorFlow's website. This model is an object detection model trained on the [COCO dataset](https://cocodataset.org/#home). Type the following into your Jupyter notebook:

   ```
   model_zip_filename = './coco_ssd_mobilenet_v1_1.0.zip'
   !curl http://storage.googleapis.com/download.tensorflow.org/models/tflite/coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.zip \
       --output {model_zip_filename}
   ```

   Note that this particular example is packaged in a .zip file. Unzip this file and repackage it as a compressed tarfile (`.tar.gz`) before using it in later steps. Type the following into your Jupyter notebook:

   ```
   # Extract model from zip file
   !unzip -u {model_zip_filename}
   
   model_filename = 'detect.tflite'
   model_name = model_filename.split('.')[0]
   
   # Compress model into .tar.gz so SageMaker Neo can use it
   model_tar = model_name + '.tar.gz'
   !tar -czf {model_tar} {model_filename}
   ```

1. **Upload the trained model to an S3 bucket**

   Once you have trained your machine learning model, store it in an S3 bucket.

------
#### [ Boto3 ]

   ```
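   # Upload the packaged model (.tar.gz) and record its S3 URI for the compilation job
   s3_client.upload_file(Filename=model_tar, Bucket=bucket, Key=model_tar)
   s3_input_location = f's3://{bucket}/{model_tar}'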
   ```

------
#### [ CLI ]

   Replace `your-model-filename` and `amzn-s3-demo-bucket` with the name of your model file and the name of your Amazon S3 bucket.

   ```
   aws s3 cp your-model-filename s3://amzn-s3-demo-bucket
   ```

------
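
The repackaging step above relies on `!unzip` and `!tar`, which only work inside a Jupyter notebook. As a rough alternative for plain Python scripts, the sketch below does the same with the standard library. The function name `repackage_model` is hypothetical; the file names in the usage note are the ones used earlier in this guide.

```python
import os
import tarfile
import zipfile


def repackage_model(zip_path, member_name, tar_path):
    """Extract one file from a .zip archive and repackage it as a .tar.gz."""
    extract_dir = os.path.dirname(os.path.abspath(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        # extract() returns the path of the file it wrote
        extracted = archive.extract(member_name, path=extract_dir)
    with tarfile.open(tar_path, "w:gz") as tar:
        # arcname keeps the model file at the top level of the tarball
        tar.add(extracted, arcname=os.path.basename(member_name))
    return tar_path
```

For the example model, a call might look like `repackage_model('./coco_ssd_mobilenet_v1_1.0.zip', 'detect.tflite', 'detect.tar.gz')`.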

# Compile the Model
<a name="neo-getting-started-edge-step1"></a>

Once you have satisfied the [Prerequisites](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-getting-started-edge.html#neo-getting-started-edge-step0), you can compile your model with Amazon SageMaker AI Neo. You can compile your model using the AWS CLI, the console, or the [Amazon Web Services SDK for Python (Boto3)](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html). See [Use Neo to Compile a Model](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html). In this example, you compile your model with Boto3.

To compile a model, SageMaker Neo requires the following information:

1. **The URI of the Amazon S3 bucket where you stored the trained model**

   If you followed the prerequisites, the name of your bucket is stored in a variable named `bucket`. The following code snippet shows how to list all of your buckets using the AWS CLI:

   ```
   aws s3 ls
   ```

   For example:

   ```
   $ aws s3 ls
   2020-11-02 17:08:50 bucket
   ```

1. **The URI of the Amazon S3 bucket where you want to save the compiled model**

   The code snippet below concatenates your Amazon S3 bucket URI with the name of an output directory called `output`:

   ```
   s3_output_location = f's3://{bucket}/output'
   ```

1. **The machine learning framework you used to train your model**

   Define the framework you used to train your model.

   ```
   framework = 'framework-name'
   ```

   For example, to compile a model that was trained using TensorFlow, you can use either `tflite` or `tensorflow`. Use `tflite` if you want a lighter version of TensorFlow that uses less storage memory.

   ```
   framework = 'tflite'
   ```

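   For a full list of Neo-supported frameworks, see [Supported Frameworks, Devices, Systems, and Architectures](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-devices-edge.html).

1. **The shape of your model's input**

   Neo requires the name and shape of your input tensor. The name and shape are passed in as key-value pairs. `value` is a list of the integer dimensions of an input tensor, and `key` is the exact name of an input tensor in the model.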

   ```
   data_shape = '{"name": [tensor-shape]}'
   ```

   For example:

   ```
   data_shape = '{"normalized_input_image_tensor":[1, 300, 300, 3]}'
   ```
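**Note**  
Make sure the model is correctly formatted for the framework you used. See [What input data shapes does SageMaker Neo expect?](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html#neo-job-compilation-expected-inputs) The key in this dictionary must be changed to the name of your model's input tensor.

1. **Either the name of the target device to compile for or the general details of the hardware platform**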

   ```
   target_device = 'target-device-name'
   ```

   For example, to deploy to a Raspberry Pi 3, use:

   ```
   target_device = 'rasp3b'
   ```

   You can find the full list of supported edge devices in [Supported Frameworks, Devices, Systems, and Architectures](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-devices-edge.html).
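
The input shape described in the list above is passed to Neo as a JSON string. Writing that string by hand is error-prone, so one option is to build it from a Python dictionary. The following is a minimal sketch; the helper name `make_data_input_config` is hypothetical, and the tensor name is the one from the example model in this guide.

```python
import json


def make_data_input_config(input_shapes):
    """Serialize {tensor_name: [dims]} into the JSON string used as DataInputConfig."""
    for name, shape in input_shapes.items():
        if not all(isinstance(dim, int) and dim > 0 for dim in shape):
            raise ValueError(f"Shape for {name!r} must be a list of positive integers")
    return json.dumps(input_shapes)


data_shape = make_data_input_config({"normalized_input_image_tensor": [1, 300, 300, 3]})
print(data_shape)
```

The resulting `data_shape` string can be used wherever the guide passes `data_shape` to the compilation job.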

Now that you have completed all of the previous steps, you can submit a compilation job to Neo.

```
# Create a SageMaker client so you can submit a compilation job
sagemaker_client = boto3.client('sagemaker', region_name=AWS_REGION)

# Give your compilation job a name
compilation_job_name = 'getting-started-demo'
print(f'Compilation job for {compilation_job_name} started')

response = sagemaker_client.create_compilation_job(
    CompilationJobName=compilation_job_name,
    RoleArn=role_arn,
    InputConfig={
        'S3Uri': s3_input_location,
        'DataInputConfig': data_shape,
        'Framework': framework.upper()
    },
    OutputConfig={
        'S3OutputLocation': s3_output_location,
        'TargetDevice': target_device 
    },
    StoppingCondition={
        'MaxRuntimeInSeconds': 900
    }
)

# Optional - Poll every 30 sec to check completion status
import time

while True:
    response = sagemaker_client.describe_compilation_job(CompilationJobName=compilation_job_name)
    if response['CompilationJobStatus'] == 'COMPLETED':
        break
    elif response['CompilationJobStatus'] == 'FAILED':
        raise RuntimeError('Compilation failed')
    print('Compiling ...')
    time.sleep(30)
print('Done!')
```
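
The polling loop above runs until the job completes or fails. If you also want an upper bound on the wait and a failure message in the exception, a variation along the following lines may help. This is a sketch under stated assumptions: the function name `wait_for_compilation` and its parameters are hypothetical, `client` is expected to be a Boto3 SageMaker client, and the `FailureReason` field is read only if the response contains it.

```python
import time


def wait_for_compilation(client, job_name, poll_seconds=30,
                         max_wait_seconds=900, sleep=time.sleep):
    """Poll describe_compilation_job until the job reaches a terminal status."""
    waited = 0
    while True:
        response = client.describe_compilation_job(CompilationJobName=job_name)
        status = response['CompilationJobStatus']
        if status == 'COMPLETED':
            return response
        if status in ('FAILED', 'STOPPED'):
            reason = response.get('FailureReason', 'no reason reported')
            raise RuntimeError(f'Compilation job {job_name} ended with status {status}: {reason}')
        if waited >= max_wait_seconds:
            raise TimeoutError(f'Compilation job {job_name} still {status} after {waited} seconds')
        sleep(poll_seconds)
        waited += poll_seconds
```

With the variables from this guide, usage would be `response = wait_for_compilation(sagemaker_client, compilation_job_name)`.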

For additional debugging information, include the following print statement:

```
print(response)
```

If the compilation job succeeds, your compiled model is stored in the output Amazon S3 location you specified earlier (`s3_output_location`). Download your compiled model locally:

```
object_path = f'output/{model_name}-{target_device}.tar.gz'
neo_compiled_model = f'compiled-{model_name}.tar.gz'
s3_client.download_file(bucket, object_path, neo_compiled_model)
```
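
Instead of reconstructing the object key from the model name and target device, you can read the artifact location from the `describe_compilation_job` response, which reports it under `ModelArtifacts` → `S3ModelArtifacts`. The sketch below splits that URI into a bucket and key suitable for `download_file`; the helper names are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError(f'Not an S3 URI: {s3_uri!r}')
    return parsed.netloc, parsed.path.lstrip('/')


def compiled_artifact_location(describe_response):
    """Return (bucket, key) of the compiled model from a describe_compilation_job response."""
    return split_s3_uri(describe_response['ModelArtifacts']['S3ModelArtifacts'])
```

For example, `artifact_bucket, artifact_key = compiled_artifact_location(response)` followed by `s3_client.download_file(artifact_bucket, artifact_key, neo_compiled_model)`.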

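# Set Up Your Device
<a name="neo-getting-started-edge-step2"></a>

You need to install packages on your edge device so that the device can make inferences. You also need to install either [AWS IoT Greengrass](https://docs.aws.amazon.com/greengrass/latest/developerguide/what-is-gg.html) core or the [Deep Learning Runtime (DLR)](https://github.com/neo-ai/neo-ai-dlr). In this example, you install the packages required to make inferences with the `coco_ssd_mobilenet` object detection algorithm, and you use DLR.

1. **Install additional packages**

   In addition to Boto3, you must install certain libraries on your edge device. Which libraries you install depends on your use case.

   For example, for the `coco_ssd_mobilenet` object detection algorithm you downloaded earlier, you need to install [NumPy](https://numpy.org/) for data manipulation and statistics, [PIL](https://pillow.readthedocs.io/en/stable/) to load images, and [Matplotlib](https://matplotlib.org/) to generate plots. You also need a copy of TensorFlow if you want to measure the impact of compiling with Neo against a baseline.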

   ```
   !pip3 install numpy pillow tensorflow matplotlib 
   ```

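1. **Install the inference engine on your device**

   To run your Neo-compiled model, install the [Deep Learning Runtime (DLR)](https://github.com/neo-ai/neo-ai-dlr) on your device. DLR is a compact, common runtime for deep learning models and decision tree models. On x86_64 CPU targets running Linux, you can install the latest release of the DLR package using the following `pip` command: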

   ```
   !pip install dlr
   ```

   To install DLR on GPU targets or non-x86 edge devices, see [Releases](https://github.com/neo-ai/neo-ai-dlr/releases) for prebuilt binaries, or [Installing DLR](https://neo-ai-dlr.readthedocs.io/en/latest/install.html) to build DLR from source. For example, to install DLR for Raspberry Pi 3, you can use:

   ```
   !pip install https://neo-ai-dlr-release.s3-us-west-2.amazonaws.com/v1.3.0/pi-armv7l-raspbian4.14.71-glibc2_24-libstdcpp3_4/dlr-1.3.0-py3-none-any.whl
   ```

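# Make Inferences on Your Device
<a name="neo-getting-started-edge-step3"></a>

In this example, you use Boto3 to download the output of your compilation job onto your edge device. You then import DLR, download an example image from the dataset, resize the image to match the model's original input, and make a prediction.

1. **Download your compiled model from Amazon S3 to your device and extract it from the compressed tarfile**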

   ```
   # Download compiled model locally to edge device
   object_path = f'output/{model_name}-{target_device}.tar.gz'
   neo_compiled_model = f'compiled-{model_name}.tar.gz'
   s3_client.download_file(bucket, object_path, neo_compiled_model)
   
   # Extract model from .tar.gz so DLR can use it
   !mkdir ./dlr_model # make a directory to store your model (optional)
   !tar -xzvf ./compiled-detect.tar.gz --directory ./dlr_model
   ```

1. **Import DLR and initialize a `DLRModel` object**

   ```
   import dlr
   
   device = 'cpu'
   model = dlr.DLRModel('./dlr_model', device)
   ```

1. **Download an image for inference and format it based on how your model was trained**

   For the `coco_ssd_mobilenet` example, you can download an image from the [COCO dataset](https://cocodataset.org/#home) and then resize it to `300x300`:

   ```
   import numpy as np
   from PIL import Image
   
   # Download an image for model to make a prediction
   input_image_filename = './input_image.jpg'
   !curl https://farm9.staticflickr.com/8325/8077197378_79efb4805e_z.jpg --output {input_image_filename}
   
   # Load the image and resize it to the model's 300x300 input
   image = Image.open(input_image_filename)
   resized_image = image.resize((300, 300))
   
   # Model is quantized, so convert the image to uint8
   x = np.array(resized_image).astype('uint8')
   ```

1. **Use DLR to make inferences**

   Finally, you can use DLR to make a prediction on the image you just downloaded:

   ```
   out = model.run(x)
   ```
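
The result of `model.run(x)` still has to be interpreted. The sketch below filters detections by confidence score. It assumes, without confirmation from this guide, that the example SSD model returns boxes, class IDs, and scores as the first three outputs, each with a leading batch dimension; check the shapes of `out` on your device before relying on that layout. The helper name `filter_detections` is hypothetical, and the sample values are synthetic.

```python
def filter_detections(boxes, classes, scores, min_score=0.5):
    """Keep detections whose score is at least min_score."""
    kept = []
    for box, class_id, score in zip(boxes, classes, scores):
        if score >= min_score:
            kept.append({
                'box': [float(v) for v in box],
                'class_id': int(class_id),
                'score': float(score),
            })
    return kept


# Synthetic values standing in for out[0][0], out[1][0], out[2][0] (assumed layout)
sample_boxes = [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 0.3, 0.3]]
sample_classes = [17, 3]
sample_scores = [0.91, 0.12]
for detection in filter_detections(sample_boxes, sample_classes, sample_scores):
    print(detection)
```

Under the assumed layout, a call on real output would look like `filter_detections(out[0][0], out[1][0], out[2][0])`.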

For more examples of using DLR to make inferences from a Neo-compiled model on an edge device, see the [neo-ai-dlr GitHub repository](https://github.com/neo-ai/neo-ai-dlr).

# Troubleshoot Errors
<a name="neo-troubleshooting"></a>

This section contains information about how to understand and prevent common errors, the error messages they generate, and guidance on how to resolve them. Before moving on, ask yourself the following questions:

 **Did you encounter an error before you deployed your model?** If yes, see [Troubleshoot Neo Compilation Errors](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-compilation.html).

 **Did you encounter an error after you compiled your model?** If yes, see [Troubleshoot Neo Inference Errors](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-inference.html).

**Did you encounter an error while compiling your model for Ambarella devices?** If yes, see [Troubleshoot Ambarella Errors](neo-troubleshooting-target-devices-ambarella.md).

## Error Classification Types
<a name="neo-error-messages"></a>

This list classifies the *user errors* you can receive from Neo. These include access and permission errors and load errors for each of the supported frameworks. All other errors are *system errors*.

### Client permission error
<a name="neo-error-client-permission"></a>

 Neo passes these errors straight through from the dependent service.
+ *Access Denied* when calling sts:AssumeRole
+ *Any 400* error when calling Amazon S3 to download or upload a client model
+ *PassRole* error

### Load error
<a name="collapsible-section-2"></a>

Assuming that the Neo compiler successfully loaded the .tar.gz from Amazon S3, check whether the tarball contains the files necessary for compilation. The checking criteria are framework-specific:
+ **TensorFlow**: Expects only a protobuf file (`*.pb` or `*.pbtxt`). For saved models, expects one variables folder.
+ **PyTorch**: Expects only one PyTorch file (`*.pth`).
+ **MXNet**: Expects only one symbol file (`*.json`) and one parameter file (`*.params`).
+ **XGBoost**: Expects only one XGBoost model file (`*.model`). The input model has a size limitation.
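
The per-framework criteria above can be checked locally before you upload a tarball. The following sketch mirrors the listed file patterns only; it is not Neo's actual validation logic, it does not cover the TensorFlow saved-model case or the XGBoost size limit, and the names `REQUIRED_FILES` and `check_model_files` are hypothetical.

```python
import fnmatch

# One tuple per required file; a file satisfies the requirement if it matches any pattern in the tuple
REQUIRED_FILES = {
    'tensorflow': [('*.pb', '*.pbtxt')],
    'pytorch': [('*.pth',)],
    'mxnet': [('*.json',), ('*.params',)],
    'xgboost': [('*.model',)],
}


def check_model_files(framework, file_names):
    """Return a list of problems; an empty list means the file set matches the criteria."""
    problems = []
    for patterns in REQUIRED_FILES[framework.lower()]:
        matches = [name for name in file_names
                   if any(fnmatch.fnmatch(name, pattern) for pattern in patterns)]
        if len(matches) != 1:
            problems.append(f'expected exactly one file matching {patterns}, found {len(matches)}')
    return problems
```

To check an existing archive, you could pass the member names from `tarfile.open('model.tar.gz').getnames()` as `file_names`.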

### Compilation error
<a name="neo-error-compilation"></a>

Assuming that the Neo compiler successfully loaded the .tar.gz from Amazon S3 and that the tarball contains the files necessary for compilation, the checking criteria are:
+ **OperatorNotImplemented**: An operator has not been implemented.
+ **OperatorAttributeNotImplemented**: The attribute in the specified operator has not been implemented.
+ **OperatorAttributeRequired**: An attribute is required for an internal symbol graph, but it is not listed in the user input model graph.
+ **OperatorAttributeValueNotValid**: The value of the attribute in the specific operator is not valid.

**Topics**
+ [Error Classification Types](#neo-error-messages)
+ [Troubleshoot Neo Compilation Errors](neo-troubleshooting-compilation.md)
+ [Troubleshoot Neo Inference Errors](neo-troubleshooting-inference.md)
+ [Troubleshoot Ambarella Errors](neo-troubleshooting-target-devices-ambarella.md)

# Troubleshoot Neo Compilation Errors
<a name="neo-troubleshooting-compilation"></a>

This section contains information about how to understand and prevent common compilation errors, the error messages they generate, and guidance on how to resolve them.

**Topics**
+ [How to Use This Page](#neo-troubleshooting-compilation-how-to-use)
+ [Framework-Related Errors](#neo-troubleshooting-compilation-framework-related-errors)
+ [Infrastructure-Related Errors](#neo-troubleshooting-compilation-infrastructure-errors)
+ [Check your compilation log](#neo-troubleshooting-compilation-logs)

## How to Use This Page
<a name="neo-troubleshooting-compilation-how-to-use"></a>

Attempt to resolve your error by going through these sections in the following order:

1. Check that the input of your compilation job satisfies the input requirements. See [What input data shapes does SageMaker Neo expect?](neo-compilation-preparing-model.md#neo-job-compilation-expected-inputs)

1. Check common [framework-specific errors](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-compilation.html#neo-troubleshooting-compilation-framework-related-errors).

1. Check if your error is an [infrastructure error](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-compilation.html#neo-troubleshooting-compilation-infrastructure-errors).

1. Check your [compilation log](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-compilation.html#neo-troubleshooting-compilation-logs).

## Framework-Related Errors
<a name="neo-troubleshooting-compilation-framework-related-errors"></a>

### Keras
<a name="neo-troubleshooting-compilation-framework-related-errors-keras"></a>


| Error | Solution | 
| --- | --- | 
|   `InputConfiguration: No h5 file provided in <model path>`   |   Check that your h5 file is in the Amazon S3 URI you specified. *Or* check that the [h5 file is correctly formatted](https://www.tensorflow.org/guide/keras/save_and_serialize#keras_h5_format).  | 
|   `InputConfiguration: Multiple h5 files provided, <model path>, when only one is allowed`   |  Check that you are providing only one `h5` file.  | 
|   `ClientError: InputConfiguration: Unable to load provided Keras model. Error: 'sample_weight_mode'`   |  Check that the Keras version you specified is supported. See the supported frameworks for [cloud instances](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-cloud.html) and [edge devices](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-devices-edge.html).  | 
|   `ClientError: InputConfiguration: Input input has wrong shape in Input Shape dictionary. Input shapes should be provided in NCHW format. `   |   Check that your model input follows NCHW format. See [What input data shapes does SageMaker Neo expect?](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html#neo-job-compilation-expected-inputs)  | 

### MXNet
<a name="neo-troubleshooting-compilation-framework-related-errors-mxnet"></a>


| Error | Solution | 
| --- | --- | 
|   `ClientError: InputConfiguration: Only one parameter file is allowed for MXNet model. Please make sure the framework you select is correct.`   |   SageMaker Neo selects the first parameter file given for compilation.  | 

### TensorFlow
<a name="neo-troubleshooting-compilation-framework-related-errors-tensorflow"></a>


| Error | Solution | 
| --- | --- | 
|   `InputConfiguration: Exactly one .pb file is allowed for TensorFlow models.`   |  Make sure you provide only one .pb or .pbtxt file.  | 
|  `InputConfiguration: Exactly one .pb or .pbtxt file is allowed for TensorFlow models.`  |  Make sure you provide only one .pb or .pbtxt file.  | 
|   ` ClientError: InputConfiguration: TVM cannot convert <model zoo> model. Please make sure the framework you selected is correct. The following operators are not implemented: {<operator name>} `   |   Check that the operator you chose is supported. See [SageMaker Neo Supported Frameworks and Operators](https://aws.amazon.com/releasenotes/sagemaker-neo-supported-frameworks-and-operators/).  | 

### PyTorch
<a name="neo-troubleshooting-compilation-framework-related-errors-pytorch"></a>


| Error | Solution | 
| --- | --- | 
|   `InputConfiguration: We are unable to extract DataInputConfig from the model due to input_config_derivation_error. Please override by providing a DataInputConfig during compilation job creation.`  |  Do one of the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/neo-troubleshooting-compilation.html)  | 

## Infrastructure-Related Errors
<a name="neo-troubleshooting-compilation-infrastructure-errors"></a>


| Error | Solution | 
| --- | --- | 
|   `ClientError: InputConfiguration: S3 object does not exist. Bucket: <bucket>, Key: <bucket key>`   |  Check the Amazon S3 URI you provided.  | 
|   ` ClientError: InputConfiguration: Bucket <bucket name> is in region <region name> which is different from AWS Sagemaker service region <service region> `   |   Create an Amazon S3 bucket in the same Region as the service.  | 
|   ` ClientError: InputConfiguration: Unable to untar input model. Please confirm the model is a tar.gz file `   |   Check that your model in Amazon S3 is compressed into a `tar.gz` file.  | 

## Check your compilation log
<a name="neo-troubleshooting-compilation-logs"></a>

1. Navigate to Amazon CloudWatch at [https://console.aws.amazon.com/cloudwatch/](https://console.aws.amazon.com/cloudwatch/).

1. Select the Region in which you created the compilation job from the **Region** dropdown list in the top right.

1. In the navigation pane of Amazon CloudWatch, choose **Logs**. Select **Log groups**.

1. Search for the log group called `/aws/sagemaker/CompilationJobs`. Select the log group.

1. Search for the log stream named after the compilation job. Select the log stream.
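
The console steps above can also be approximated from a script with the CloudWatch Logs API. The sketch below collects messages from log streams whose names start with the compilation job name, following the naming described in the last step. The function name `compilation_log_messages` is hypothetical, and `logs_client` is expected to be a Boto3 CloudWatch Logs client.

```python
def compilation_log_messages(logs_client, job_name,
                             log_group='/aws/sagemaker/CompilationJobs'):
    """Collect log messages from streams whose names start with the job name."""
    messages = []
    kwargs = {'logGroupName': log_group, 'logStreamNamePrefix': job_name}
    while True:
        page = logs_client.filter_log_events(**kwargs)
        messages.extend(event['message'] for event in page.get('events', []))
        token = page.get('nextToken')
        if not token:
            return messages
        # Request the next page of results
        kwargs['nextToken'] = token
```

A possible call, using names from earlier in this guide, is `compilation_log_messages(boto3.client('logs', region_name=AWS_REGION), compilation_job_name)`.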

# Troubleshoot Neo Inference Errors
<a name="neo-troubleshooting-inference"></a>

This section contains information about how to prevent and resolve some of the common errors you might encounter when deploying or invoking an endpoint. This section applies to **PyTorch 1.4.0 or later** and **MXNet v1.7.0 or later**.
+ If you defined a `model_fn` in your inference script, make sure the first inference (warm-up inference) on valid input data is done in `model_fn()`. Otherwise, you might see the following error message on the terminal when [`Predictor.predict`](https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.Predictor.predict) is called:

  ```
  An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from <users-sagemaker-endpoint> with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."                
  ```
+ Make sure that the environment variables in the following table are set. If they are not set, you might see the following error messages:

  **On the terminal:**

  ```
  An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (503) from <users-sagemaker-endpoint> with message "{ "code": 503, "type": "InternalServerException", "message": "Prediction failed" } ".
  ```

  **In CloudWatch:**

  ```
  W-9001-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - AttributeError: 'NoneType' object has no attribute 'transform'
  ```    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/ko_kr/sagemaker/latest/dg/neo-troubleshooting-inference.html)
+ Make sure that the `MMS_DEFAULT_RESPONSE_TIMEOUT` environment variable is set to 500 or a higher value when you create the Amazon SageMaker AI model. Otherwise, you might see the following error message on the terminal:

  ```
  An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from <users-sagemaker-endpoint> with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."
  ```
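
One way to apply the `MMS_DEFAULT_RESPONSE_TIMEOUT` setting is through the `Environment` map of the container definition passed to `create_model`. The following sketch only builds that dictionary; the helper name `neo_container_definition` is hypothetical, the image URI and model data URL are values you must supply, and any other environment variables your container needs are not covered here.

```python
def neo_container_definition(image_uri, model_data_url, response_timeout_seconds=500):
    """Build a container definition that sets MMS_DEFAULT_RESPONSE_TIMEOUT."""
    if response_timeout_seconds < 500:
        raise ValueError('MMS_DEFAULT_RESPONSE_TIMEOUT should be 500 or higher')
    return {
        'Image': image_uri,
        'ModelDataUrl': model_data_url,
        # Environment values must be strings
        'Environment': {'MMS_DEFAULT_RESPONSE_TIMEOUT': str(response_timeout_seconds)},
    }
```

The returned dictionary can be passed as `PrimaryContainer` in a call such as `sagemaker_client.create_model(ModelName=..., ExecutionRoleArn=..., PrimaryContainer=...)`.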

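# Troubleshoot Ambarella Errors
<a name="neo-troubleshooting-target-devices-ambarella"></a>

SageMaker Neo requires models to be packaged in a compressed TAR file (`*.tar.gz`). Ambarella devices require additional files to be included within the compressed TAR file before it is sent for compilation. To compile a model for Ambarella targets with SageMaker Neo, include the following files within your compressed TAR file:
+ A trained model that uses a framework supported by SageMaker Neo 
+ A JSON configuration file
+ Calibration images

For example, the contents of your compressed TAR file should look similar to the following: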

```
├──amba_config.json
├──calib_data
|    ├── data1
|    ├── data2
|    ├── .
|    ├── .
|    ├── .
|    └── data500
└──mobilenet_v1_1.0_0224_frozen.pb
```

The directory is organized as follows:
+ `amba_config.json`: Configuration file
+ `calib_data`: Folder containing calibration images
+ `mobilenet_v1_1.0_0224_frozen.pb`: TensorFlow model saved as a frozen graph

For information about frameworks supported by SageMaker Neo, see [Supported Frameworks](neo-supported-devices-edge-frameworks.md).

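## Setting Up the Configuration File
<a name="neo-troubleshooting-target-devices-ambarella-config"></a>

The configuration file provides information that the Ambarella toolchain requires to compile the model. The configuration file must be saved as a JSON file, and the name of the file must end with `*config.json`. The following table shows the contents of the configuration file.


| Key | Description | Example | 
| --- | --- | --- | 
| inputs | Dictionary mapping input layers to attributes. | <pre>{inputs:{"data":{...},"data1":{...}}}</pre> | 
| "data" | Input layer name. Note: "data" is an example of a name you can use to label the input layer. | "data" | 
| shape | Describes the shape of the input to the model. This follows the same conventions that SageMaker Neo uses. | "shape": "1,3,224,224" | 
| filepath | Relative path to the directory containing calibration images. These can be binary files or image files such as JPG or PNG. | "filepath": "calib_data/" | 
| colorformat | Color format that the model expects. This is used when converting images to binary. Supported values: [RGB, BGR]. The default is RGB. | "colorformat": "RGB" | 
| mean | Mean value to be subtracted from the input. Can be a single value or a list of values. When the mean is given as a list, the number of entries must match the channel dimension of the input. | "mean": 128.0 | 
| scale | Scale value to be used for normalizing the input. Can be a single value or a list of values. When the scale is given as a list, the number of entries must match the channel dimension of the input. | "scale": 255.0 | 

The following is a sample configuration file: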

```
{
    "inputs": {
        "data": {
                "shape": "1, 3, 224, 224",
                "filepath": "calib_data/",
                "colorformat": "RGB",
                "mean":[128,128,128],
                "scale":[128.0,128.0,128.0]
        }
    }
}
```
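
A configuration like the sample above can also be generated from a script, which makes it easier to enforce the rule that list-valued `mean` and `scale` entries match the channel dimension. The following is a minimal sketch, not an Ambarella tool: the helper name `build_amba_config` is hypothetical, and it assumes the channel dimension is at index 1 of the shape (NCHW), as in the sample.

```python
import json


def build_amba_config(layer_name, shape, filepath, colorformat='RGB',
                      mean=None, scale=None, channel_axis=1):
    """Build the configuration dictionary for a single input layer."""
    if colorformat not in ('RGB', 'BGR'):
        raise ValueError('colorformat must be RGB or BGR')
    channels = shape[channel_axis]  # assumes NCHW unless channel_axis is overridden
    entry = {
        'shape': ', '.join(str(dim) for dim in shape),
        'filepath': filepath,
        'colorformat': colorformat,
    }
    for key, value in (('mean', mean), ('scale', scale)):
        if value is None:
            continue
        if isinstance(value, (list, tuple)) and len(value) != channels:
            raise ValueError(f'{key} has {len(value)} entries but the input has {channels} channels')
        entry[key] = value
    return {'inputs': {layer_name: entry}}


config = build_amba_config('data', [1, 3, 224, 224], 'calib_data/',
                           mean=[128, 128, 128], scale=[128.0, 128.0, 128.0])
print(json.dumps(config, indent=4))
```

To produce the file itself, write the dictionary with `json.dump` to a file whose name ends in `config.json`, for example `amba_config.json`.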

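## Calibration Images
<a name="neo-troubleshooting-target-devices-ambarella-calibration-images"></a>

Quantize your trained model by providing calibration images. Quantizing your model improves the performance of the CVFlow engine on an Ambarella System on a Chip (SoC). The Ambarella toolchain uses the calibration images to determine how each layer in the model should be quantized to achieve optimal performance and accuracy. Each layer is quantized independently to INT8 or INT16 format. The final model has a mix of INT8 and INT16 layers after quantization.

**How many images should you use?**

We recommend that you include between 100 and 200 images that are representative of the types of scenes the model is expected to handle. The model compilation time increases linearly with the number of calibration images in the input file.

**What are the recommended image formats?**

Calibration images can be in a raw binary format or in image formats such as JPG and PNG.

Your calibration folder can contain a mixture of images and binary files. If the calibration folder contains both images and binary files, the toolchain first converts the images to binary files. Once the conversion is complete, it uses the newly generated binary files along with the binary files that were originally in the folder.

**Can I convert the images into binary format first?**

Yes. You can convert the images to a binary format using open-source packages such as [OpenCV](https://opencv.org/) or [PIL](https://python-pillow.org/). Crop and resize the images so that they match the input layer of your trained model.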


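## Mean and Scale
<a name="neo-troubleshooting-target-devices-ambarella-mean-scale"></a>

You can specify mean and scale pre-processing options to the Ambarella toolchain. These operations are embedded into the network and are applied during inference on each input. If you specify the mean or scale, do not provide processed data. More specifically, do not provide data from which you have already subtracted the mean or to which you have already applied scaling.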

## Check your compilation log
<a name="neo-troubleshooting-target-devices-ambarella-compilation"></a>

For information on checking compilation logs for Ambarella devices, see [Check your compilation log](neo-troubleshooting-compilation.md#neo-troubleshooting-compilation-logs).