

# Run your local code as a SageMaker training job

You can run your local machine learning (ML) Python code as a large single-node Amazon SageMaker training job or as multiple parallel jobs. You can do this by annotating your code with an @remote decorator, as shown in the following code example. [Distributed training](https://docs.aws.amazon.com/sagemaker/latest/dg/distributed-training.html) (across multiple instances) is not supported with remote functions.

```
@remote(**settings)
def divide(x, y):
    return x / y
```

The SageMaker Python SDK will automatically translate your existing workspace environment and any associated data processing code and datasets into a SageMaker training job that runs on the SageMaker training platform. You can also activate a persistent cache feature, which will further reduce job start latency by caching previously downloaded dependency packages. This reduction in job latency is greater than the reduction in latency from using SageMaker AI managed warm pools alone. For more information, see [Using persistent cache](train-warm-pools.md#train-warm-pools-persistent-cache).

The following sections show how to annotate your local ML code with an @remote decorator and tailor your experience for your use case. This includes customizing your environment and integrating with SageMaker Experiments.

**Topics**
+ [Set up your environment](#train-remote-decorator-env)
+ [Invoke a remote function](train-remote-decorator-invocation.md)
+ [Configuration file](train-remote-decorator-config.md)
+ [Customize your runtime environment](train-remote-decorator-customize.md)
+ [Container image compatibility](train-remote-decorator-container.md)
+ [Logging parameters and metrics with Amazon SageMaker Experiments](train-remote-decorator-experiments.md)
+ [Using modular code with the @remote decorator](train-remote-decorator-modular.md)
+ [Private repository for runtime dependencies](train-remote-decorator-private.md)
+ [Example notebooks](train-remote-decorator-examples.md)

## Set up your environment


Choose one of the following three options to set up your environment.

### Run your code from Amazon SageMaker Studio Classic


You can annotate and run your local ML code from SageMaker Studio Classic by creating a SageMaker Notebook and attaching any image available in SageMaker Studio Classic. The following instructions help you create a SageMaker Notebook, install the SageMaker Python SDK, and annotate your code with the decorator.

1. Create a SageMaker Notebook and attach an image in SageMaker Studio Classic as follows:

   1. Follow the instructions in [Launch Amazon SageMaker Studio Classic](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-launch.html) in the *Amazon SageMaker AI Developer Guide*.

   1. Select **Studio** from the left navigation pane. This opens a new window.

   1. In the **Get Started** dialog box, select a user profile from the down arrow. This opens a new window.

   1. Select **Open Studio Classic**.

   1. Select **Open Launcher** from the main working area. This opens a new page.

   1. Select **Create notebook** from the main working area.

   1. Select **Base Python 3.0** from the down arrow next to **Image** in the **Change environment** dialog box. 

      The @remote decorator automatically detects the image attached to the SageMaker Studio Classic notebook and uses it to run the SageMaker training job. If `image_uri` is specified either as an argument in the decorator or in the configuration file, then the value specified in `image_uri` will be used instead of the detected image.

      For more information about how to create a notebook in SageMaker Studio Classic, see the **Create a Notebook from the File Menu** section in [Create or Open an Amazon SageMaker Studio Classic Notebook](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-create-open.html#notebooks-create-file-menu).

      For a list of available images, see [Supported Docker images](https://docs.aws.amazon.com/sagemaker/latest/dg/train-remote-decorator-container.html).

1. Install the SageMaker Python SDK.

   To annotate your code with the @remote decorator inside a SageMaker Studio Classic notebook, you must have the SageMaker Python SDK installed. Install the SageMaker Python SDK, as shown in the following code example.

   ```
   !pip install sagemaker
   ```

1. Use the @remote decorator to run functions in a SageMaker training job.

   To run your local ML code, first create a dependencies file to instruct SageMaker AI where to locate your local code. To do so, follow these steps:

   1. From the SageMaker Studio Classic Launcher main working area, in **Utilities and files**, choose **Text file**. This opens a new tab with a text file called `untitled.txt`.

      For more information about the SageMaker Studio Classic user interface (UI), see [Amazon SageMaker Studio Classic UI Overview](https://docs.aws.amazon.com//sagemaker/latest/dg/studio-ui.html).

   1. Rename `untitled.txt` to `requirements.txt`.

   1. Add all the dependencies required for the code along with the SageMaker AI library to `requirements.txt`. 

      A minimal `requirements.txt` for the example `divide` function follows.

      ```
      sagemaker
      ```

   1. Run your code with the remote decorator by passing the dependencies file, as follows.

      ```
      from sagemaker.remote_function import remote
      
      @remote(instance_type="ml.m5.xlarge", dependencies='./requirements.txt')
      def divide(x, y):
          return x / y
      
      divide(2, 3.0)
      ```

      For additional code examples, see the sample notebook [quick_start.ipynb](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-remote-function/quick_start/quick_start.ipynb).

      If you’re already running a SageMaker Studio Classic notebook and you installed the Python SDK as instructed in **2. Install the SageMaker Python SDK**, you must restart your kernel. For more information, see [Use the SageMaker Studio Classic Notebook Toolbar](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-menu.html) in the *Amazon SageMaker AI Developer Guide*.

### Run your code from an Amazon SageMaker notebook


You can annotate your local ML code from a SageMaker notebook instance. The following instructions show how to create a notebook instance with a custom kernel, install the SageMaker Python SDK, and annotate your code with the decorator.

1. Create a notebook instance with a custom `conda` kernel.

   You can annotate your local ML code with an @remote decorator to use inside of a SageMaker training job. First you must create and customize a SageMaker notebook instance to use a kernel with Python version 3.7 or higher, up to 3.10.x. To do so, follow these steps:

   1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

   1. In the left navigation panel, choose **Notebook** to expand its options.

   1. Choose **Notebook Instances** from the expanded options.

   1. Choose the **Create Notebook Instance** button. This opens a new page.

   1. For **Notebook instance name**, enter a name with a maximum of 63 characters and no spaces. Valid characters: **A-Z**, **a-z**, **0-9**, and **. : + = @ _ % -** (hyphen).

   1. In the **Notebook instance settings** dialog box, expand the right arrow next to **Additional Configuration**.

   1. Under **Lifecycle configuration - optional**, expand the down arrow and select **Create a new lifecycle configuration**. This opens a new dialog box.

   1. Under **Name**, enter a name for your configuration setting.

   1. In the **Scripts** dialog box, in the **Start notebook** tab, replace the existing contents of the text box with the following script.

      ```
      #!/bin/bash
      
      set -e
      
      sudo -u ec2-user -i <<'EOF'
      unset SUDO_UID
      WORKING_DIR=/home/ec2-user/SageMaker/custom-miniconda/
      source "$WORKING_DIR/miniconda/bin/activate"
      for env in $WORKING_DIR/miniconda/envs/*; do
          BASENAME=$(basename "$env")
          source activate "$BASENAME"
          python -m ipykernel install --user --name "$BASENAME" --display-name "Custom ($BASENAME)"
      done
      EOF
      
      echo "Restarting the Jupyter server.."
      # restart command is dependent on current running Amazon Linux and JupyterLab
      CURR_VERSION_AL=$(cat /etc/system-release)
      CURR_VERSION_JS=$(jupyter --version)
      
      if [[ $CURR_VERSION_JS == *$"jupyter_core     : 4.9.1"* ]] && [[ $CURR_VERSION_AL == *$" release 2018"* ]]; then
       sudo initctl restart jupyter-server --no-wait
      else
       sudo systemctl --no-block restart jupyter-server.service
      fi
      ```

   1. In the **Scripts** dialog box, in the **Create notebook** tab, replace the existing contents of the text box with the following script.

      ```
      #!/bin/bash
      
      set -e
      
      sudo -u ec2-user -i <<'EOF'
      unset SUDO_UID
      # Install a separate conda installation via Miniconda
      WORKING_DIR=/home/ec2-user/SageMaker/custom-miniconda
      mkdir -p "$WORKING_DIR"
      wget https://repo.anaconda.com/miniconda/Miniconda3-4.6.14-Linux-x86_64.sh -O "$WORKING_DIR/miniconda.sh"
      bash "$WORKING_DIR/miniconda.sh" -b -u -p "$WORKING_DIR/miniconda" 
      rm -rf "$WORKING_DIR/miniconda.sh"
      # Create a custom conda environment
      source "$WORKING_DIR/miniconda/bin/activate"
      KERNEL_NAME="custom_python310"
      PYTHON="3.10"
      conda create --yes --name "$KERNEL_NAME" python="$PYTHON" pip
      conda activate "$KERNEL_NAME"
      pip install --quiet ipykernel
      # Customize these lines as necessary to install the required packages
      EOF
      ```

   1. Choose the **Create configuration** button on the bottom right of the window.

   1. Choose the **Create notebook instance** button on the bottom right of the window.

   1. Wait for the notebook instance **Status** to change from **Pending** to **InService**.

1. Create a Jupyter notebook in the notebook instance.

   The following instructions show how to create a Jupyter notebook using Python 3.10 in your newly created SageMaker instance.

   1. After the notebook instance **Status** from the previous step is **InService**, select **Open Jupyter** under **Actions** in the row containing your newly created notebook instance **Name**. This opens a new Jupyter server.

   1. In the Jupyter server, select **New** from the top right menu. 

   1. From the down arrow, select **conda_custom_python310**. This creates a new Jupyter notebook that uses a Python 3.10 kernel. This new Jupyter notebook can now be used similarly to a local Jupyter notebook. 

1. Install the SageMaker Python SDK.

   After your virtual environment is running, install the SageMaker Python SDK by using the following code example.

   ```
   !pip install sagemaker
   ```

1. Use an @remote decorator to run functions in a SageMaker training job.

   When you annotate your local ML code with an @remote decorator inside the SageMaker notebook, SageMaker training will automatically interpret the function of your code and run it as a SageMaker training job. Set up your notebook by doing the following:

   1. Select the kernel name in the notebook menu from the SageMaker notebook instance that you created in step 1, **Create a notebook instance with a custom `conda` kernel**.

      For more information, see [Change an Image or a Kernel](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-run-and-manage-change-image.html). 

   1. From the down arrow, choose a custom `conda` kernel that uses a version of Python that is 3.7 or higher. 

      As an example, selecting `conda_custom_python310` chooses the kernel for Python 3.10.

   1. Choose **Select**.

   1. Wait for the kernel’s status to show as idle, which indicates that the kernel has started.

   1. In the Jupyter Server Home, select **New** from the top right menu.

   1. Next to the down arrow, select **Text file**. This creates a new text file called `untitled.txt`.

   1. Rename `untitled.txt` to `requirements.txt` and add any dependencies required for the code along with `sagemaker`.

   1. Run your code with the remote decorator by passing the dependencies file as shown below.

      ```
      from sagemaker.remote_function import remote
      
      @remote(instance_type="ml.m5.xlarge", dependencies='./requirements.txt')
      def divide(x, y):
          return x / y
      
      divide(2, 3.0)
      ```

      See the sample notebook [quick_start.ipynb](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-remote-function/quick_start/quick_start.ipynb) for additional code examples.

### Run your code from within your local IDE


You can annotate your local ML code with an @remote decorator inside your preferred local IDE. The following steps show the necessary prerequisites, how to install the Python SDK, and how to annotate your code with the @remote decorator.

1. Install prerequisites by setting up the AWS Command Line Interface (AWS CLI) and creating a role, as follows:
   + Onboard to a SageMaker AI domain following the instructions in the **AWS CLI Prerequisites** section of [Set Up Amazon SageMaker AI Prerequisites](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-set-up.html#gs-cli-prereq).
   + Create an IAM role following the **Create execution role** section of [SageMaker AI Roles](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html).

1. Create a virtual environment by using either PyCharm or `conda` and using Python version 3.7 or higher, up to 3.10.x.
   + Set up a virtual environment using PyCharm as follows:

     1. Select **File** from the main menu.

     1. Choose **New Project**.

     1. Choose **Conda** from the down arrow under **New environment using**.

     1. In the field for **Python version** use the down arrow to select a version of Python that is 3.7 or above. You can go up to 3.10.x from the list.  
![](http://docs.aws.amazon.com/sagemaker/latest/dg/images/training-pycharm-ide.png)
   + If you have Anaconda installed, you can set up a virtual environment using `conda`, as follows:
     + Open an Anaconda prompt terminal interface.
     + Create and activate a new `conda` environment using a Python version of 3.7 or higher, up to 3.10.x. The following code example shows how to create a `conda` environment using Python version 3.10.

       ```
       conda create -n sagemaker_jobs_quick_start python=3.10 pip
       conda activate sagemaker_jobs_quick_start
       ```

1. Install the SageMaker Python SDK.

   To package your code from your preferred IDE, you must have a virtual environment set up using Python 3.7 or higher, up to 3.10.x. You also need a compatible container image. Install the SageMaker Python SDK using the following code example.

   ```
   pip install sagemaker
   ```

1. Wrap your code inside the @remote decorator. The SageMaker Python SDK will automatically interpret the function of your code and run it as a SageMaker training job. The following code examples show how to import the necessary libraries, set up a SageMaker session, and annotate a function with the @remote decorator.

   You can run your code by either providing the dependencies needed directly, or by using dependencies from the active `conda` environment.
   + To provide the dependencies directly, do the following:
     + Create a `requirements.txt` file in the working directory that the code resides in.
     + Add all of the dependencies required for the code along with the SageMaker library. The following section provides a minimal code example for `requirements.txt` for the example `divide` function.

       ```
       sagemaker
       ```
     + Run your code with the @remote decorator by passing the dependencies file. In the following code example, replace `The IAM role name` with an AWS Identity and Access Management (IAM) role ARN that you would like SageMaker to use to run your job.

       ```
       import boto3
       import sagemaker
       from sagemaker.remote_function import remote
       
       sm_session = sagemaker.Session(boto_session=boto3.session.Session(region_name="us-west-2"))
       settings = dict(
           sagemaker_session=sm_session,
           role=<The IAM role name>,
           instance_type="ml.m5.xlarge",
           dependencies='./requirements.txt'
       )
       
       @remote(**settings)
       def divide(x, y):
           return x / y
       
       
       if __name__ == "__main__":
           print(divide(2, 3.0))
       ```
   + To use dependencies from the active `conda` environment, use the value `auto_capture` for the `dependencies` parameter, as shown in the following.

     ```
     import boto3
     import sagemaker
     from sagemaker.remote_function import remote
     
     sm_session = sagemaker.Session(boto_session=boto3.session.Session(region_name="us-west-2"))
     settings = dict(
         sagemaker_session=sm_session,
         role=<The IAM role name>,
         instance_type="ml.m5.xlarge",
         dependencies="auto_capture"
     )
     
     @remote(**settings)
     def divide(x, y):
         return x / y
     
     
     if __name__ == "__main__":
         print(divide(2, 3.0))
     ```
**Note**  
You can also implement the previous code inside a Jupyter notebook. PyCharm Professional Edition supports Jupyter natively. For more guidance, see [Jupyter notebook support](https://www.jetbrains.com/help/pycharm/ipython-notebook-support.html) in PyCharm's documentation.

# Invoke a remote function


To invoke a function with the @remote decorator, use either of the following methods:
+ [Use an @remote decorator to invoke a function](#train-remote-decorator-invocation-decorator).
+ [Use the `RemoteExecutor` API to invoke a function](#train-remote-decorator-invocation-api).

If you use the @remote decorator method to invoke a function, the training job will wait for the function to complete before starting a new task. However, if you use the `RemoteExecutor` API, you can run more than one job in parallel. The following sections show both ways of invoking a function.
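
`RemoteExecutor` follows the same submit-and-future pattern as the standard library's `concurrent.futures` executors. As a conceptual sketch only (using the stdlib `ThreadPoolExecutor` locally, not SageMaker), submitting several calls returns futures immediately, and the work runs in parallel while you wait on the results:

```python
from concurrent.futures import ThreadPoolExecutor

def divide(x, y):
    return x / y

# submit several calls; each submit() returns a future immediately
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(divide, x, 2) for x in (2, 4, 6)]
    results = [f.result() for f in futures]  # block until each call completes

print(results)  # [1.0, 2.0, 3.0]
```

With `RemoteExecutor`, each `submit` call starts a training job instead of a local thread, but the future-based workflow is the same.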

## Use an @remote decorator to invoke a function


You can use the @remote decorator to annotate a function. SageMaker AI will transform the code inside the decorator into a SageMaker training job. The training job will then invoke the function inside the decorator and wait for the job to complete. The following code example shows how to import the required libraries, start a SageMaker AI instance, and annotate a matrix multiplication with the @remote decorator.

```
from sagemaker.remote_function import remote
import numpy as np

@remote(instance_type="ml.m5.large")
def matrix_multiply(a, b):
    return np.matmul(a, b)
    
a = np.array([[1, 0],
             [0, 1]])
b = np.array([1, 2])

assert (matrix_multiply(a, b) == np.array([1,2])).all()
```

The decorator is defined as follows.

```
def remote(
    _func=None,
    **kwargs):
    ...
```
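
Conceptually, `remote` acts as a decorator factory: called with keyword settings, it returns a decorator that wraps your function. The following is a minimal pure-Python sketch of that calling convention, not the SDK's actual implementation (here the wrapper simply calls the function locally, where the real decorator submits it as a training job):

```python
import functools

def remote(**settings):
    # called with settings, returns the actual decorator
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # a real implementation would run func as a remote job;
            # this sketch just invokes it locally
            return func(*args, **kwargs)
        return wrapper
    return decorator

@remote(instance_type="ml.m5.xlarge")
def divide(x, y):
    return x / y

print(divide(10, 5))  # 2.0
```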

When you invoke a decorated function, the SageMaker Python SDK loads the function's return value, or any exception it raises, into local memory. In the following code example, the first call to the divide function completes successfully and the result is loaded into local memory. In the second call to the divide function, the code raises an exception, and this exception is loaded into local memory.

```
from sagemaker.remote_function import remote
import pytest

@remote()
def divide(a, b):
    return a/b

# the underlying job is completed successfully 
# and the function return is loaded
assert divide(10, 5) == 2

# the underlying job fails with "AlgorithmError" 
# and the function exception is loaded into local memory 
with pytest.raises(ZeroDivisionError):
    divide(10, 0)
```

**Note**  
The decorated function is run as a remote job. If the thread is interrupted, the underlying job will not be stopped.

### How to change the value of a local variable


The decorated function is run on a remote machine. Changing a non-local variable or an input argument inside a decorated function does not change the local value.

In the following code example, a list is appended to and a dict is modified inside decorated functions. The local values do not change when the decorated functions are invoked.

```
a = []

@remote
def func():
    a.append(1)

# when func is invoked, a in the local memory is not modified
func()
func()

# a stays as []

@remote
def func(a):
    # append a new value to the input dictionary
    a["key-2"] = "value-2"

a = {"key": "value"}
func(a)

# a stays as {"key": "value"}
```

To change the value of a local variable from inside a decorated function, return the variable from the function. The following code example shows that the value of a local variable is changed when it is returned from the function.

```
a = {"key-1": "value-1"}

@remote
def func(a):
    a["key-2"] = "value-2"
    return a

a = func(a)

# a is now {"key-1": "value-1", "key-2": "value-2"}
```

### Data serialization and deserialization


When you invoke a remote function, SageMaker AI automatically serializes your function arguments during the input and output stages. Function arguments and returns are serialized using [cloudpickle](https://github.com/cloudpipe/cloudpickle). SageMaker AI supports serializing the following Python objects and functions. 
+ Built-in Python objects, including dicts, lists, floats, ints, strings, boolean values, and tuples
+ NumPy arrays
+ pandas DataFrames
+ Scikit-learn datasets and estimators
+ PyTorch models
+ TensorFlow models
+ The Booster class for XGBoost

The following can be used with some limitations.
+ Dask DataFrames
+ The XGBoost DMatrix class
+ TensorFlow datasets and subclasses
+ PyTorch models

The following sections contain best practices for using these limited-support Python classes in your remote function, information about where SageMaker AI stores your serialized data, and how to manage access to it.
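
Serialization with cloudpickle follows the same round-trip model as the standard `pickle` module: function arguments are serialized to bytes before the job starts and deserialized inside the job, and return values make the reverse trip. The following is a minimal stdlib sketch of that round trip on built-in objects (it uses `pickle` for illustration; the SDK itself uses cloudpickle):

```python
import pickle

# built-in objects like these are supported for remote function arguments
args = {"x": 2, "y": 3.0, "labels": ["a", "b"], "flag": True}

# serialize the arguments to bytes, as happens before the job starts
payload = pickle.dumps(args)

# deserialize inside the "job"; the round trip preserves the values
restored = pickle.loads(payload)
print(restored == args)  # True
```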

#### Best practices for Python classes with limited support for remote data serialization


You can use the Python classes listed in this section with limitations. The next sections discuss best practices for how to use the following Python classes.
+ [Dask](https://www.dask.org/) DataFrames
+ The XGBoost DMatrix class
+ TensorFlow datasets and subclasses
+ PyTorch models

##### Best practices for Dask


[Dask](https://www.dask.org/) is an open-source library used for parallel computing in Python. This section shows the following.
+ How to pass a Dask DataFrame into your remote function
+ How to convert summary statistics from a Dask DataFrame into a Pandas DataFrame

##### How to pass a Dask DataFrame into your remote function


[Dask DataFrames](https://docs.dask.org/en/latest/dataframe.html) are often used to process datasets that are larger than available memory, because a Dask DataFrame does not load your data into memory. If you pass a Dask DataFrame as a function argument to your remote function, Dask may pass a reference to the data on your local disk or in cloud storage, instead of the data itself. The following code shows an example of passing a Dask DataFrame to your remote function; the remote function would operate on an empty DataFrame.

```
# Do not pass a Dask DataFrame to your remote function as follows
def clean(df: dask.DataFrame):
    cleaned = df[...]
    ...
```

Dask loads the data from the Dask DataFrame into memory only when you use the DataFrame. If you want to use a Dask DataFrame inside a remote function, provide the path to the data instead. Then Dask reads the dataset directly from the data path that you specify when the code runs.

The following code example shows how to use a Dask DataFrame inside the remote function `clean`. In the code example, `raw_data_path` is passed to `clean` instead of the Dask DataFrame. When the code runs, the dataset is read directly from the Amazon S3 location specified in `raw_data_path`. The `persist` function then keeps the dataset in memory to facilitate the subsequent `random_split` operation, and the resulting splits are written back to the output data path in an S3 bucket using Dask DataFrame API functions.

```
import os
import dask.dataframe as dd

@remote(
   instance_type='ml.m5.24xlarge',
   volume_size=300,
   keep_alive_period_in_seconds=600)
# pass the data path to your remote function rather than the Dask DataFrame itself
def clean(raw_data_path: str, output_data_path: str, split_ratio: list[float]):
    df = dd.read_parquet(raw_data_path)  # read the DataFrame from the path
    cleaned = df[(df.column_a >= 1) & (df.column_a < 5)]\
        .drop(['column_b', 'column_c'], axis=1)\
        .persist()  # keep the data in memory to facilitate the following random_split operation

    train_df, test_df = cleaned.random_split(split_ratio, random_state=10)

    train_df.to_parquet(os.path.join(output_data_path, 'train'))
    test_df.to_parquet(os.path.join(output_data_path, 'test'))

clean("s3://amzn-s3-demo-bucket/raw/", "s3://amzn-s3-demo-bucket/cleaned/", split_ratio=[0.7, 0.3])
```

##### How to convert summary statistics from a Dask DataFrame into a Pandas DataFrame


Summary statistics from a Dask DataFrame can be converted into a pandas DataFrame by invoking the `compute` method, as shown in the following example code. In the example, the S3 bucket contains a large Dask DataFrame that cannot fit into memory or into a pandas DataFrame. The remote function scans the dataset, computes the summary statistics from `describe`, and returns them as a pandas DataFrame.

```
import dask.dataframe as dd
from sagemaker.remote_function import RemoteExecutor

executor = RemoteExecutor(
    instance_type='ml.m5.24xlarge',
    volume_size=300,
    keep_alive_period_in_seconds=600)

future = executor.submit(lambda: dd.read_parquet("s3://amzn-s3-demo-bucket/raw/").describe().compute())

future.result()
```

##### Best practices for the XGBoost DMatrix class


DMatrix is an internal data structure that XGBoost uses to load data. A DMatrix object can't be pickled, so it can't be moved between compute sessions. Directly passing DMatrix instances fails with a `SerializationError`.

##### How to pass a data object to your remote function and train with XGBoost


Instead, pass a pandas DataFrame directly to the remote function and convert it into a DMatrix instance for training inside the function, as shown in the following code example.

```
import xgboost as xgb

@remote
def train(df, params):
    # convert a pandas DataFrame into a DMatrix and use it for training
    dtrain = xgb.DMatrix(df)
    return xgb.train(params, dtrain)
```

##### Best practices for TensorFlow datasets and subclasses


TensorFlow datasets and their subclasses are internal objects that TensorFlow uses to load data during training. They can't be pickled, so they can't be moved between compute sessions. Directly passing TensorFlow datasets or their subclasses fails with a `SerializationError`. Use the TensorFlow I/O APIs to load data from storage, as shown in the following code example.

```
import tensorflow as tf
import tensorflow_io as tfio

@remote
def train(data_path: str, params):
    
    dataset = tf.data.TextLineDataset(tf.data.Dataset.list_files(f"{data_path}/*.txt"))
    ...
    
train("s3://amzn-s3-demo-bucket/data", {})
```

##### Best practices for PyTorch models


PyTorch models are serializable and can be passed between your local environment and the remote function. If your local environment and remote environment have different device types (such as GPUs and CPUs), you cannot return a trained model to your local environment. For example, if the following code is developed in a local environment without GPUs but run in an instance with GPUs, returning the trained model directly leads to a `DeserializationError`.

```
# Do not return a model trained on GPUs to a CPU-only environment as follows

@remote(instance_type='ml.g4dn.xlarge')
def train(...):
    if torch.cuda.is_available():
        device = torch.device("cuda")
    else:
        device = torch.device("cpu") # a device without GPU capabilities
    
    model = Net().to(device)
    
    # train the model
    ...
    
    return model
    
model = train(...) # raises a DeserializationError if the model was trained on a GPU
```

To return a model trained in a GPU environment to one that contains only CPU capabilities, use the PyTorch model I/O APIs directly as shown in the code example below.

```
import os
import s3fs
import torch

model_path = "s3://amzn-s3-demo-bucket/folder/"

@remote(instance_type='ml.g4dn.xlarge')
def train(...):
    if torch.cuda.is_available():
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")
    
    model = Net().to(device)
    
    # train the model
    ...
    
    fs = s3fs.S3FileSystem()
    with fs.open(os.path.join(model_path, 'model.pt'), 'wb') as file:
        torch.save(model.state_dict(), file) # this writes the model state in a device-agnostic way (CPU vs GPU)

train(...) # train the model on either CPUs or GPUs

# load the saved state into a fresh model on the local, CPU-only machine
model = Net()
fs = s3fs.S3FileSystem()
with fs.open(os.path.join(model_path, 'model.pt'), 'rb') as file:
    model.load_state_dict(torch.load(file, map_location=torch.device('cpu')))
```

#### Where SageMaker AI stores your serialized data


When you invoke a remote function, SageMaker AI automatically serializes your function arguments and return values during the input and output stages. This serialized data is stored under a root directory in your S3 bucket. You specify the root directory, `<s3_root_uri>`, in a configuration file. The parameter `job_name` is automatically generated for you. 

Under the root directory, SageMaker AI creates a `<job_name>` folder, which holds your current work directory, serialized function, the arguments for your serialized function, results and any exceptions that arose from invoking the serialized function.

Under `<job_name>`, the directory `workdir` contains a zipped archive of your current working directory. The zipped archive includes any Python files in your working directory and the `requirements.txt` file, which specifies any dependencies needed to run your remote function.

The following is an example of the folder structure under an S3 bucket that you specify in your configuration file. 

```
<s3_root_uri>/ # specified by s3_root_uri or S3RootUri
    <job_name>/ #automatically generated for you
        workdir/workspace.zip # archive of the current working directory (workdir)
        function/ # serialized function
        arguments/ # serialized function arguments
        results/ # returned output from the serialized function including the model
        exception/ # any exceptions from invoking the serialized function
```

The root directory that you specify in your S3 bucket is not meant for long-term storage. The serialized data are tightly tied to the Python version and machine learning (ML) framework version that were used during serialization. If you upgrade the Python version or ML framework, you may not be able to use your serialized data. Instead, do the following.
+ Store your model and model artifacts in a format that is agnostic to your Python version and ML framework.
+ If you upgrade your Python or ML framework, access your model results from your long-term storage.

**Important**  
To delete your serialized data after a specified amount of time, set a [lifetime configuration](https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html) on your S3 bucket.

**Note**  
Files that are serialized with the Python [pickle](https://docs.python.org/3/library/pickle.html) module can be less portable than other data formats including CSV, Parquet and JSON. Be wary of loading pickled files from unknown sources.

For more information about what to include in a configuration file for a remote function, see [Configuration File](https://docs.aws.amazon.com/sagemaker/latest/dg/train-remote-decorator-config.html).

#### Access to your serialized data


Administrators can provide settings for your serialized data, including its location and any encryption settings, in a configuration file. By default, the serialized data are encrypted with an AWS Key Management Service (AWS KMS) key. Administrators can also restrict access to the root directory that you specify in your configuration file with a [bucket policy](https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies.html). The configuration file can be shared and used across projects and jobs. For more information, see [Configuration File](https://docs.aws.amazon.com/sagemaker/latest/dg/train-remote-decorator-config.html).

## Use the `RemoteExecutor` API to invoke a function


You can use the `RemoteExecutor` API to invoke a function. The SageMaker Python SDK transforms the code inside the `RemoteExecutor` call into a SageMaker training job. The training job then invokes the function as an asynchronous operation and returns a future. If you use the `RemoteExecutor` API, you can run more than one training job in parallel. For more information about futures in Python, see [Futures](https://docs.python.org/3/library/asyncio-future.html).

The following code example shows how to import the required libraries, define a function, and use the `RemoteExecutor` API to submit it as a SageMaker training job, allowing up to `2` jobs to run in parallel.

```
import numpy as np

from sagemaker.remote_function import RemoteExecutor

def matrix_multiply(a, b):
    return np.matmul(a, b)


a = np.array([[1, 0],
             [0, 1]])
b = np.array([1, 2])

with RemoteExecutor(max_parallel_job=2, instance_type="ml.m5.large") as e:
    future = e.submit(matrix_multiply, a, b)

assert (future.result() == np.array([1,2])).all()
```

The `RemoteExecutor` class is an implementation of the [concurrent.futures.Executor](https://docs.python.org/3/library/concurrent.futures.html) abstract class.

The following code example shows how to define a function and call it using the `RemoteExecutor` API. In this example, the `RemoteExecutor` submits `4` jobs in total, but only `2` in parallel. The last two jobs reuse the clusters with minimal overhead.

```
from sagemaker.remote_function.client import RemoteExecutor

def divide(a, b):
    return a/b 

with RemoteExecutor(max_parallel_job=2, keep_alive_period_in_seconds=60) as e:
    futures = [e.submit(divide, a, 2) for a in [3, 5, 7, 9]]

for future in futures:
    print(future.result())
```

The `max_parallel_job` parameter serves only as a rate-limiting mechanism; it does not optimize compute resource allocation. In the previous code example, `RemoteExecutor` doesn’t reserve compute resources for the two parallel jobs before any jobs are submitted. For more information about `max_parallel_job` or other parameters for the @remote decorator, see [Remote function classes and methods specification](https://sagemaker.readthedocs.io/en/stable/remote_function/sagemaker.remote_function.html).

### Future class for the `RemoteExecutor` API


A future is a public class that represents the return value of a training job that was invoked asynchronously. The future class implements the [concurrent.futures.Future](https://docs.python.org/3/library/concurrent.futures.html) class. You can use it to perform operations on the underlying training job and to load data into memory.
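Because the future implements the standard `concurrent.futures.Future` interface, you can poll it with `done()` or block on `result()`. The following local sketch uses the stdlib `ThreadPoolExecutor` as a stand-in to illustrate that interface without launching a training job.

```
from concurrent.futures import ThreadPoolExecutor

def divide(x, y):
    return x / y

# Local stand-in: a RemoteExecutor future follows this same interface,
# except that result() blocks until the training job completes.
with ThreadPoolExecutor(max_workers=1) as e:
    future = e.submit(divide, 10, 2)

# All submitted work has finished once the context manager exits.
assert future.done()
assert future.result() == 5.0
```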

# Configuration file


The Amazon SageMaker Python SDK supports setting default values for AWS infrastructure primitives. After administrators configure these defaults, they are automatically passed when the SageMaker Python SDK calls supported APIs. The arguments for the decorator function can be put inside configuration files, so that you can separate infrastructure-related settings from the code base. For more information about parameters and arguments for the remote function and methods, see [Remote function classes and methods specification](https://sagemaker.readthedocs.io/en/stable/remote_function/sagemaker.remote_function.html).

You can set infrastructure settings such as the network configuration, IAM roles, the Amazon S3 folder for input and output data, and tags inside the configuration file. The configuration file can be used when invoking a function with either the @remote decorator or the `RemoteExecutor` API.

The following example configuration file defines dependencies, resources, and other arguments. It can be used to invoke a function initiated with either the @remote decorator or the `RemoteExecutor` API.

```
SchemaVersion: '1.0'
SageMaker:
  PythonSDK:
    Modules:
      RemoteFunction:
        Dependencies: 'path/to/requirements.txt'
        EnableInterContainerTrafficEncryption: true
        EnvironmentVariables: {'EnvVarKey': 'EnvVarValue'}
        ImageUri: '366666666666.dkr.ecr.us-west-2.amazonaws.com/my-image:latest'
        IncludeLocalWorkDir: true
        CustomFileFilter: 
          IgnoreNamePatterns:
          - "*.ipynb"
          - "data"
        InstanceType: 'ml.m5.large'
        JobCondaEnvironment: 'your_conda_env'
        PreExecutionCommands:
            - 'command_1'
            - 'command_2'
        PreExecutionScript: 'path/to/script.sh'
        RoleArn: 'arn:aws:iam::366666666666:role/MyRole'
        S3KmsKeyId: 'yourkmskeyid'
        S3RootUri: 's3://amzn-s3-demo-bucket/my-project'
        VpcConfig:
            SecurityGroupIds: 
            - 'sg123'
            Subnets: 
            - 'subnet-1234'
        Tags: [{'Key': 'yourTagKey', 'Value':'yourTagValue'}]
        VolumeKmsKeyId: 'yourkmskeyid'
```

The @remote decorator and `RemoteExecutor` will look for `Dependencies` in the following configuration files:
+ An admin-defined configuration file.
+ A user-defined configuration file.

The default locations for these configuration files depend on, and are relative to, your environment. The following code example returns the default location of your admin and user configuration files. These commands must be run in the same environment where you're using the SageMaker Python SDK.

```
import os
from platformdirs import site_config_dir, user_config_dir

#Prints the location of the admin config file
print(os.path.join(site_config_dir("sagemaker"), "config.yaml"))

#Prints the location of the user config file
print(os.path.join(user_config_dir("sagemaker"), "config.yaml"))
```

You can override the default locations of these files by setting the `SAGEMAKER_ADMIN_CONFIG_OVERRIDE` and `SAGEMAKER_USER_CONFIG_OVERRIDE` environment variables for the admin-defined and user-defined configuration file paths, respectively. 
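For example, you could point the SDK at explicit configuration files before it loads its defaults; the paths below are illustrative only.

```
import os

# Illustrative paths only; set these before the SageMaker Python SDK
# reads its configuration.
os.environ["SAGEMAKER_ADMIN_CONFIG_OVERRIDE"] = "/etc/sagemaker/config.yaml"
os.environ["SAGEMAKER_USER_CONFIG_OVERRIDE"] = "/home/me/sagemaker-config.yaml"
```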

If a key exists in both the admin-defined and user-defined configuration files, the value in the user-defined file will be used.
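This precedence can be pictured as a dictionary merge in which user-defined keys overwrite admin-defined keys; the keys and values below are illustrative only.

```
# Illustrative only: admin defaults merged with user overrides.
admin_config = {
    "InstanceType": "ml.m5.large",
    "RoleArn": "arn:aws:iam::111122223333:role/AdminRole",
}
user_config = {"InstanceType": "ml.m5.xlarge"}

# Later (user-defined) values win for keys present in both files.
effective = {**admin_config, **user_config}
```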

# Customize your runtime environment


You can customize your runtime environment and use your preferred local integrated development environment (IDE), SageMaker notebooks, or SageMaker Studio Classic notebooks to write your ML code. SageMaker AI helps package and submit your function and its dependencies as a SageMaker training job. This allows you to access the capacity of the SageMaker training platform to run your training jobs.

Both the @remote decorator and the `RemoteExecutor` methods of invoking a function allow you to define and customize your runtime environment. You can use either a `requirements.txt` file or a conda environment YAML file.

To customize a runtime environment using either a conda environment YAML file or a `requirements.txt` file, refer to the following code example.

```
import numpy as np

from sagemaker.remote_function import remote

# specify a conda environment inside a yaml file
@remote(instance_type="ml.m5.large",
        image_uri = "my_base_python:latest", 
        dependencies = "./environment.yml")
def matrix_multiply(a, b):
    return np.matmul(a, b)

# use a requirements.txt file to import dependencies
@remote(instance_type="ml.m5.large",
        image_uri = "my_base_python:latest", 
        dependencies = './requirements.txt')
def matrix_multiply(a, b):
    return np.matmul(a, b)
```

Alternatively, you can set `dependencies` to `auto_capture` to let the SageMaker Python SDK capture the installed dependencies in the active conda environment. The following are required for `auto_capture` to work reliably:
+ You must have an active conda environment. We recommend not using the `base` conda environment for remote jobs so that you can reduce potential dependency conflicts. Not using the `base` conda environment also allows for faster environment setup in the remote job.
+ You must not have any dependencies installed using pip with a value for the parameter `--extra-index-url`.
+ You must not have any dependency conflicts between packages installed with conda and packages installed with pip in the local development environment.
+ Your local development environment must not contain operating system-specific dependencies that are not compatible with Linux.

If `auto_capture` does not work, we recommend that you pass in your dependencies as a `requirements.txt` file or a conda `environment.yml` file, as described in the first code example in this section.

# Container image compatibility


The following table shows a list of SageMaker training images that are compatible with the @remote decorator.


| Name | Python Version | Image URI - CPU | Image URI - GPU | 
| --- | --- | --- | --- | 
|  Data Science  |  3.7(py37)  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  | 
|  Data Science 2.0  |  3.8(py38)  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  | 
|  Data Science 3.0  |  3.10(py310)  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  | 
|  Base Python 2.0  |  3.8(py38)  |  The Python SDK selects this image when it detects that the development environment is using the Python 3.8 runtime. Otherwise, the Python SDK automatically selects this image when used as a SageMaker Studio Classic Notebook kernel image.  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as SageMaker Studio Classic Notebook kernel image.  | 
|  Base Python 3.0  |  3.10(py310)  |  The Python SDK selects this image when it detects that the development environment is using the Python 3.10 runtime. Otherwise, the Python SDK automatically selects this image when used as a SageMaker Studio Classic Notebook kernel image.  |  For SageMaker Studio Classic Notebooks only. Python SDK automatically selects the image URI when used as Studio Classic Notebook kernel image.  | 
|  DLC-TensorFlow 2.12.0 for SageMaker Training  |  3.10(py310)  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.12.0-cpu-py310-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.12.0-gpu-py310-cu118-ubuntu20.04-sagemaker  | 
|  DLC-Tensorflow 2.11.0 for SageMaker training  |  3.9(py39)  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.11.0-cpu-py39-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.11.0-gpu-py39-cu112-ubuntu20.04-sagemaker  | 
|  DLC-TensorFlow 2.10.1 for SageMaker training  |  3.9(py39)  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.10.1-cpu-py39-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.10.1-gpu-py39-cu112-ubuntu20.04-sagemaker  | 
|  DLC-TensorFlow 2.9.2 for SageMaker training  |  3.9(py39)  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.9.2-cpu-py39-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.9.2-gpu-py39-cu112-ubuntu20.04-sagemaker  | 
|  DLC-TensorFlow 2.8.3 for SageMaker training  |  3.9(py39)  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.8.3-cpu-py39-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/tensorflow-training:2.8.3-gpu-py39-cu112-ubuntu20.04-sagemaker  | 
|  DLC-PyTorch 2.0.0 for SageMaker training  |  3.10(py310)  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:2.0.0-cpu-py310-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:2.0.0-gpu-py310-cu118-ubuntu20.04-sagemaker  | 
|  DLC-PyTorch 1.13.1 for SageMaker training  |  3.9(py39)  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.13.1-cpu-py39-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.13.1-gpu-py39-cu117-ubuntu20.04-sagemaker  | 
|  DLC-PyTorch 1.12.1 for SageMaker training  |  3.8(py38)  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-cpu-py38-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker  | 
|  DLC-PyTorch 1.11.0 for SageMaker training  |  3.8(py38)  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.11.0-cpu-py38-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker  | 
|  DLC-MXNet 1.9.0 for SageMaker training  |  3.8(py38)  |  763104351884.dkr.ecr.<region>.amazonaws.com/mxnet-training:1.9.0-cpu-py38-ubuntu20.04-sagemaker  |  763104351884.dkr.ecr.<region>.amazonaws.com/mxnet-training:1.9.0-gpu-py38-cu112-ubuntu20.04-sagemaker  | 

**Note**  
To run jobs locally using AWS Deep Learning Containers (DLC) images, use the image URIs found in the [DLC documentation](https://github.com/aws/deep-learning-containers/blob/master/available_images.md). The DLC images do not support the `auto_capture` value for dependencies.  
Jobs with [SageMaker AI Distribution in SageMaker Studio](https://github.com/aws/sagemaker-distribution#amazon-sagemaker-studio) run in a container as a non-root user named `sagemaker-user`. This user needs full permission to access `/opt/ml` and `/tmp`. Grant this permission by adding `sudo chmod -R 777 /opt/ml /tmp` to the `pre_execution_commands` list, as shown in the following snippet:  

```
@remote(pre_execution_commands=["sudo chmod -R 777 /opt/ml /tmp"])
def func():
    pass
```

You can also run remote functions with your custom images. For compatibility with remote functions, custom images should be built with Python version 3.7.x-3.10.x. The following is a minimal Dockerfile example showing you how to use a Docker image with Python 3.10.

```
FROM python:3.10

#... Rest of the Dockerfile
```

To create a `conda` environment in your image and use it to run jobs, set the environment variable `SAGEMAKER_JOB_CONDA_ENV` to the `conda` environment name. If your image sets a `SAGEMAKER_JOB_CONDA_ENV` value, the remote function cannot create a new conda environment during the training job runtime. Refer to the following Dockerfile example that uses a `conda` environment with Python version 3.10.

```
FROM continuumio/miniconda3:4.12.0  

ENV SHELL=/bin/bash \
    CONDA_DIR=/opt/conda \
    SAGEMAKER_JOB_CONDA_ENV=sagemaker-job-env

RUN conda create -n $SAGEMAKER_JOB_CONDA_ENV \
    && conda install -n $SAGEMAKER_JOB_CONDA_ENV python=3.10 -y \
    && conda clean --all -f -y
```

For SageMaker AI to use [mamba](https://mamba.readthedocs.io/en/latest/user_guide/mamba.html) to manage your Python virtual environment in the container image, install the [mamba toolkit from miniforge](https://github.com/conda-forge/miniforge). To use mamba, add the following code example to your Dockerfile. Then, SageMaker AI will detect the `mamba` availability at runtime and use it instead of `conda`.

```
#Mamba Installation
RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh" \
    && bash Mambaforge-Linux-x86_64.sh -b -p "/opt/conda"  \
    && /opt/conda/bin/conda init bash
```

Using a custom conda channel in an Amazon S3 bucket is not compatible with mamba when using a remote function. If you choose to use mamba, make sure you are not using a custom conda channel on Amazon S3. For more information, see the prerequisites under **How to use a custom conda channel hosted on Amazon S3**.

The following is a complete Dockerfile example showing how to create a compatible Docker image.

```
FROM python:3.10

RUN apt-get update -y \
    # Needed for awscli to work
    # See: https://github.com/aws/aws-cli/issues/1957#issuecomment-687455928
    && apt-get install -y groff unzip curl \
    && pip install --upgrade \
        'boto3>1.0,<2' \
        'awscli>1.0,<2' \
        'ipykernel>6.0.0,<7.0.0' \
    # Use ipykernel with the --sys-prefix flag, so that the absolute path
    # to the python executable is recorded in
    # /usr/local/share/jupyter/kernels/python3/kernel.json
    && python -m ipykernel install --sys-prefix

#Install Mamba
RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh" \
    && bash Mambaforge-Linux-x86_64.sh -b -p "/opt/conda"  \
    && /opt/conda/bin/conda init bash

#cleanup
RUN apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
    && rm -rf ${HOME}/.cache/pip \
    && rm Mambaforge-Linux-x86_64.sh

ENV SHELL=/bin/bash \
    PATH=$PATH:/opt/conda/bin
```

 The resulting image from running the previous Dockerfile example can also be used as a [SageMaker Studio Classic kernel image](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-byoi.html).

# Logging parameters and metrics with Amazon SageMaker Experiments


This guide shows how to log parameters and metrics with Amazon SageMaker Experiments. A SageMaker AI experiment consists of runs, and each run consists of all the inputs, parameters, configurations, and results for a single iteration of model training.

You can log parameters and metrics from a remote function using either the @remote decorator or the `RemoteExecutor` API. 

To log parameters and metrics from a remote function, choose one of the following methods:
+ Instantiate a SageMaker AI experiment run inside a remote function using `Run` from the SageMaker Experiments library. For more information, see [Create an Amazon SageMaker AI Experiment](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments-create.html).
+ Use the `load_run` function inside a remote function from the SageMaker AI Experiments library. This will load a `Run` instance that is declared outside of the remote function.

The following sections show how to create and track lineage with SageMaker AI experiment runs by using the previously listed methods. The sections also describe cases that are not supported by SageMaker training.

## Use the @remote decorator to integrate with SageMaker Experiments


You can either instantiate a new experiment in SageMaker AI or load a current SageMaker AI experiment from inside a remote function. The following sections show you how to use either method.

### Create an experiment with SageMaker Experiments


You can create a run of an experiment in SageMaker AI. To do this, pass your experiment name, run name, and other parameters into your remote function.

The following code example passes the name of your experiment, the name of the run, and the parameters to log during each run. The parameters `param_1` and `param_2` are logged over time inside a training loop. Common parameters may include batch size or epochs. In this example, the metrics `metric_a` and `metric_b` are logged for a run over time inside a training loop. Other common metrics may include `accuracy` or `loss`. 

```
from sagemaker.remote_function import remote
from sagemaker.experiments.run import Run

# Define your remote function
@remote
def train(value_1, value_2, exp_name, run_name):
    ...
    ...
    #Creates the experiment
    with Run(
        experiment_name=exp_name,
        run_name=run_name,
    ) as run:
        ...
        #Define values for the parameters to log
        run.log_parameter("param_1", value_1)
        run.log_parameter("param_2", value_2) 
        ...
        #Define metrics to log
        run.log_metric("metric_a", 0.5)
        run.log_metric("metric_b", 0.1)


# Invoke your remote function        
train(1.0, 2.0, "my-exp-name", "my-run-name")
```

### Load current SageMaker Experiments with a job initiated by the @remote decorator


Use the `load_run()` function from the SageMaker Experiments library to load the current run object from the run context. You can also use the `load_run()` function within your remote function to load the run object that was initialized locally by the `with` statement, as shown in the following code example.

```
from sagemaker.remote_function import remote
from sagemaker.experiments.run import Run, load_run

# Define your remote function
@remote
def train(value_1, value_2):
    ...
    ...
    with load_run() as run:
        run.log_metric("metric_a", value_1)
        run.log_metric("metric_b", value_2)


# Invoke your remote function
with Run(
    experiment_name="my-exp-name",
    run_name="my-run-name",
) as run:
    train(0.5, 1.0)
```

## Load a current experiment run within a job initiated with the `RemoteExecutor` API


You can also load a current SageMaker AI experiment run if your jobs were initiated with the `RemoteExecutor` API. The following code example shows how to use the `RemoteExecutor` API with the SageMaker Experiments `load_run` function to load a current experiment run and capture metrics in the job submitted by `RemoteExecutor`.

```
from sagemaker.remote_function import RemoteExecutor
from sagemaker.experiments.run import Run, load_run

def square(x):
    with load_run() as run:
        result = x * x
        run.log_metric("result", result)
    return result


with RemoteExecutor(
    max_parallel_job=2,
    instance_type="ml.m5.large"
) as e:
    with Run(
        experiment_name="my-exp-name",
        run_name="my-run-name",
    ):
        future_1 = e.submit(square, 2)
```

## Unsupported uses for SageMaker Experiments while annotating your code with an @remote decorator


SageMaker AI does not support passing a `Run` type object to an @remote function or using global `Run` objects. The following examples show code that will throw a `SerializationError`.

The following code example attempts to pass a `Run` type object to an @remote decorator, and it generates an error.

```
@remote
def func(run: Run):
    run.log_metric("metric_a", 1.0)
    
with Run(...) as run:
    func(run) ---> SerializationError caused by NotImplementedError
```

The following code example attempts to use a global `run` object instantiated outside of the remote function. In the code example, the `train()` function is defined inside the `with Run` context, referencing a global run object from within. When `train()` is called, it generates an error.

```
with Run(...) as run:
    @remote
    def train(metric_1, value_1, metric_2, value_2):
        run.log_parameter(metric_1, value_1)
        run.log_parameter(metric_2, value_2)
    
    train("p1", 1.0, "p2", 0.5) ---> SerializationError caused by NotImplementedError
```

# Using modular code with the @remote decorator


You can organize your code into modules for ease of workspace management during development and still use the @remote decorator to invoke a function. You can also replicate the local modules from your development environment to the remote job environment. To do so, set the parameter `include_local_workdir` to `True`, as shown in the following code example.

```
@remote(
  include_local_workdir=True,
)
```

**Note**  
The @remote decorator and parameter must appear in the main file, rather than in any of the dependent files.

When `include_local_workdir` is set to `True`, SageMaker AI packages all of the Python scripts while maintaining the directory structure in the process' current directory. It also makes the dependencies available in the job's working directory.

For example, suppose your Python script that processes the MNIST dataset is divided into a `main.py` script and a dependent `pytorch_mnist.py` script, and `main.py` calls the dependent script. The `main.py` script also contains code to import the dependency, as shown in the following.

```
from mnist_impl.pytorch_mnist import ...
```

The `main.py` file must also contain the `@remote` decorator, and it must set the `include_local_workdir` parameter to `True`.

By default, the `include_local_workdir` parameter includes all of the Python scripts in the directory. You can customize which files are uploaded to the job by using this parameter in conjunction with the `custom_file_filter` parameter. You can pass either a function that filters the job dependencies to be uploaded to S3, or a `CustomFileFilter` object that specifies the local directories and files to ignore in the remote function. `custom_file_filter` takes effect only if `include_local_workdir` is set to `True`; otherwise, the parameter is ignored.

The following example uses `CustomFileFilter` to ignore all notebook files and any folder or file named `data` when uploading files to S3.

```
@remote(
   include_local_workdir=True,
   custom_file_filter=CustomFileFilter(
      ignore_name_patterns=[ # files or directories to ignore
        "*.ipynb", # all notebook files
        "data", # folder or file named data
      ]
   )
)
```
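The `ignore_name_patterns` entries appear to use shell-style glob syntax (as suggested by `*.ipynb` above). Assuming that interpretation, a quick local check with Python's `fnmatch` shows how such patterns match names:

```
from fnmatch import fnmatch

patterns = ["*.ipynb", "data"]

def is_ignored(name, patterns=patterns):
    """Return True if the name matches any shell-style ignore pattern."""
    return any(fnmatch(name, p) for p in patterns)

assert is_ignored("analysis.ipynb")   # matches "*.ipynb"
assert is_ignored("data")             # exact name match
assert not is_ignored("train.py")     # uploaded to the job
```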

The following example demonstrates how you can package an entire workspace.

```
@remote(
   include_local_workdir=True,
   custom_file_filter=CustomFileFilter(
      ignore_name_patterns=[] # package whole workspace
   )
)
```

The following example shows how you can use a function to filter files.

```
from typing import List

def my_filter(path: str, files: List[str]) -> List[str]:
    to_ignore = []
    for file in files:
        if file.endswith(".txt") or file.endswith(".ipynb"):
            to_ignore.append(file)
    return to_ignore

@remote(
   include_local_workdir=True,
   custom_file_filter=my_filter
)
```

## Best practices in structuring your working directory


The following best practices suggest how you can organize your directory structure while using the `@remote` decorator in your modular code.
+ Put the @remote decorator in a file that resides at the root level directory of the workspace.
+ Structure the local modules at the root level.

The following example shows the recommended directory structure. In this example structure, the `main.py` script is located at the root level directory.

```
.
├── config.yaml
├── data/
├── main.py <----------------- @remote used here 
├── mnist_impl
│   ├── __pycache__/
│   │   └── pytorch_mnist.cpython-310.pyc
│   └── pytorch_mnist.py <-------- dependency of main.py
├── requirements.txt
```

The following example shows a directory structure that will result in inconsistent behavior when it is used to annotate your code with an @remote decorator. 

In this example structure, the `main.py` script that contains the @remote decorator is **not** located at the root level directory. The following structure is **NOT** recommended.

```
.
├── config.yaml
├── entrypoint
│   ├── data
│   └── main.py <----------------- @remote used here
├── mnist_impl
│   ├── __pycache__
│   │   └── pytorch_mnist.cpython-310.pyc
│   └── pytorch_mnist.py <-------- dependency of main.py
├── requirements.txt
```

# Private repository for runtime dependencies


You can use pre-execution commands or a script to configure a dependency manager such as pip or conda in your job environment. To maintain network isolation, use either of these options to redirect your dependency managers to your private repositories and run remote functions within a VPC. The pre-execution commands or script run before your remote function runs. You can define them with the @remote decorator, with the `RemoteExecutor` API, or within a configuration file.

The following sections show you how to access a private Python Package Index (PyPI) repository managed with AWS CodeArtifact. The sections also show how to access a custom conda channel hosted on Amazon Simple Storage Service (Amazon S3).

## How to use a custom PyPI repository managed with AWS CodeArtifact


To use CodeArtifact to manage a custom PyPI repository, you must meet the following prerequisites:
+ Your private PyPI repository must already be created. You can use AWS CodeArtifact to create and manage your private package repositories. To learn more about CodeArtifact, see the [CodeArtifact User Guide](https://docs.aws.amazon.com/codeartifact/latest/ug/welcome.html).
+ Your VPC should have access to your CodeArtifact repository. To allow a connection from your VPC to your CodeArtifact repository, you must do the following:
  + [Create VPC endpoints for CodeArtifact](https://docs.aws.amazon.com/codeartifact/latest/ug/create-vpc-endpoints.html).
  + [Create an Amazon S3 gateway endpoint](https://docs.aws.amazon.com/codeartifact/latest/ug/create-s3-gateway-endpoint.html) for your VPC, which allows CodeArtifact to store package assets.

The following pre-execution command example shows how to configure pip in the SageMaker AI training job to point to your CodeArtifact repository. For more information, see [Configure and use pip with CodeArtifact](https://docs.aws.amazon.com/codeartifact/latest/ug/python-configure-pip.html).

```
# use a requirements.txt file to import dependencies
@remote(
    instance_type="ml.m5.large",
    image_uri = "my_base_python:latest", 
    dependencies = './requirements.txt',
    pre_execution_commands=[
        "aws codeartifact login --tool pip --domain my-org --domain-owner <000000000000> --repository my-codeartifact-python-repo --endpoint-url https://vpce-xxxxx.api.codeartifact.us-east-1.vpce.amazonaws.com"
    ]
)
def matrix_multiply(a, b):
    return np.matmul(a, b)
```

## How to use a custom conda channel hosted on Amazon S3


To use Amazon S3 to manage a custom conda repository, the following prerequisites are required:
+ Your private conda channel must already be set up in your Amazon S3 bucket, and all dependent packages must be indexed and uploaded to your Amazon S3 bucket. For instructions on how to index your conda packages, see [Creating custom channels](https://conda.io/projects/conda/en/latest/user-guide/tasks/create-custom-channels.html).
+ Your VPC should have access to the Amazon S3 bucket. For more information, see [Endpoints for Amazon S3](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html).
+ The base conda environment in your job image should have `boto3` installed. To check your environment, enter the following in your Anaconda prompt and verify that `boto3` appears in the resulting list.

  ```
  conda list -n base
  ```
+ Your job image should have conda installed, not [mamba](https://mamba.readthedocs.io/en/latest/installation.html). To check your environment, ensure that the output of the previous command does not include `mamba`.
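
As an alternative to inspecting the `conda list` output manually, a quick Python check (a sketch, run inside the base environment of your job image) can confirm that `boto3` is importable:

```python
import importlib.util

# conda needs boto3 to fetch packages from s3:// channel URLs,
# so verify that it is importable in the current environment.
has_boto3 = importlib.util.find_spec("boto3") is not None
print("boto3 installed:", has_boto3)
```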

The following pre-execution commands example shows how to configure conda in the SageMaker training job to point to your private channel on Amazon S3. The pre-execution commands remove the `defaults` channel and add your custom channels to the `.condarc` conda configuration file.

```
# specify your dependencies inside a conda yaml file
@remote(
    instance_type="ml.m5.large",
    image_uri="my_base_python:latest",
    dependencies="./environment.yml",
    pre_execution_commands=[
        "conda config --remove channels 'defaults'",
        "conda config --add channels 's3://my_bucket/my-conda-repository/conda-forge/'",
        "conda config --add channels 's3://my_bucket/my-conda-repository/main/'"
    ]
)
def matrix_multiply(a, b):
    return np.matmul(a, b)
```
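
The `dependencies='./environment.yml'` argument in the previous example refers to a standard conda environment file. A minimal illustrative sketch follows; the environment name, channel paths, and packages are assumptions for this example:

```
name: remote_job_env
channels:
  - s3://my_bucket/my-conda-repository/main/
  - s3://my_bucket/my-conda-repository/conda-forge/
dependencies:
  - python=3.10
  - numpy
```

Because the pre-execution commands already register the custom channels in `.condarc`, conda resolves these packages from your private Amazon S3 channel when the job environment is created.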

# Example notebooks


You can transform training code from an existing workspace environment, along with any associated data processing code and datasets, into a SageMaker training job. The following notebooks show you how to customize your environment and job settings for an image classification problem, the XGBoost algorithm, and Hugging Face.

The [quick start notebook](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-remote-function/quick_start/quick_start.ipynb) contains the following code examples:
+ How to customize your job settings with a configuration file.
+ How to invoke Python functions as jobs, asynchronously.
+ How to customize the job runtime environment by bringing in additional dependencies.
+ How to use local dependencies with the @remote decorator.

The following notebooks provide additional code examples for different ML problem types and implementations.
+ To see code examples that use the @remote decorator for an image classification problem, open the [pytorch\_mnist.ipynb](https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-remote-function/pytorch_mnist_sample_notebook) notebook. This classification problem recognizes handwritten digits using the Modified National Institute of Standards and Technology (MNIST) sample dataset.
+ To see code examples for using the @remote decorator for the previous image classification problem with a script, see the Pytorch MNIST sample script, [train.py](https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-remote-function/pytorch_mnist_sample_script).
+ To see how the XGBoost algorithm is implemented with an @remote decorator, open the [xgboost\_abalone.ipynb](https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-remote-function/xgboost_abalone) notebook.
+ To see how Hugging Face is integrated with an @remote decorator, open the [huggingface.ipynb](https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-remote-function/huggingface_text_classification) notebook.