

# Using DataBrew as an extension in JupyterLab
<a name="jupyter"></a>

**Warning**  
 Support for the AWS Glue DataBrew JupyterLab extension ends on December 31, 2024, when JupyterLab 3 reaches end of support. For more information, see [JupyterLab 3 end of maintenance](https://blog.jupyter.org/jupyterlab-3-end-of-maintenance-879778927db2). 

If you prefer to prepare data in a Jupyter Notebook environment, you can use all the capabilities of AWS Glue DataBrew in JupyterLab. 

JupyterLab is a web-based interactive development environment for Jupyter Notebook. In the local JupyterLab webpage, you can add sections for a terminal, a SQL session, Python, and more. After installing the AWS Glue DataBrew extension, you can add a section for the DataBrew console. The extension runs alongside any notebooks or other extensions that you already have, directly in the JupyterLab environment. 

**Topics**
+ [Prerequisites](jupyter-prereqs.md)
+ [Configuring JupyterLab to use the extension](jupyter-configuration.md)
+ [Enabling the DataBrew extension for JupyterLab](jupyter-enabling-databrew.md)

# Prerequisites
<a name="jupyter-prereqs"></a>

Before you begin, set up the following items:
+ An AWS account – If you don't have one yet, start with [Setting up a new AWS account](setting-up-aws.md). 
+ An AWS Identity and Access Management (IAM) user with access to the permissions needed for DataBrew – For more information, see [Adding users or groups with DataBrew permissions](setting-up-iam-users-and-groups-for-databrew.md). 
+ An IAM role to use in DataBrew operations – You can use the default, if `AwsGlueDataBrewDataAccessRole` is configured. To set up additional IAM roles, see [Adding an IAM role with data resource permissions](setting-up-iam-role-to-use-in-databrew.md).
+ A JupyterLab installation (version 2.2.6 or greater) – For more information, see the following topics in the [JupyterLab documentation](https://JupyterLab.readthedocs.io/en/stable/index.html):
  + [JupyterLab prerequisites](https://JupyterLab.readthedocs.io/en/stable/getting_started/installation.html#prerequisites)
  + [JupyterLab installation](https://JupyterLab.readthedocs.io/en/stable/getting_started/installation.html) – We recommend using `pip install jupyterlab`.
+ A Node.js installation (version 12.0 or greater).
+ An AWS Command Line Interface (AWS CLI) installation – For more information, see [Setting up the AWS CLI](setting-up-the-aws-cli.md).
+ An AWS Jupyter proxy installation (`pip install aws-jupyter-proxy`) – This extension is used with an AWS service endpoint to securely pass your AWS credentials. For more information, see [aws-jupyter-proxy](https://github.com/aws/aws-jupyter-proxy) on GitHub.

To verify that the prerequisites are installed, run commands similar to the following at the command line.

```
echo "
AWS CLI:"
which aws
aws --version 
aws configure list
aws sts get-caller-identity

echo "
Python (current environment):"
which python
python --version

echo "
Node.JS:"
which node
node --version 

echo "
Jupyter:"
which jupyter
jupyter --version
jupyter serverextension list
pip3 freeze | grep jupyter
```

The output should look something like the following. The directories vary by operating system and configuration.

```
AWS CLI:
/usr/local/bin/aws 
aws-cli/2.1.2 Python/3.7.4 Darwin/19.6.0 exe/x86_64
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key     ****************VXW4 shared-credentials-file
secret_key     ****************MRJN shared-credentials-file
    region                us-east-1      config-file    ~/.aws/config
{
    "UserId": "",
    "Account": "111122223333",
    "Arn": "arn:aws:iam::111122223333:user/user2"
}

Python (current environment):
/usr/local/opt/python/libexec/bin/python
Python 3.8.5

Node.JS:
/usr/local/bin/node
v15.0.1

Jupyter:
/usr/local/bin/jupyter
jupyter core     : 4.6.3
jupyter-notebook : 6.0.3
qtconsole        : 4.7.5
ipython          : 7.16.1
ipykernel        : 5.3.2
jupyter client   : 6.1.6
jupyter lab      : 2.2.9
nbconvert        : 5.6.1
ipywidgets       : 7.5.1
nbformat         : 5.0.7
traitlets        : 4.3.3        

config dir: /usr/local/etc/jupyter
    aws_jupyter_proxy  enabled
    - Validating...
      aws_jupyter_proxy  OK
    jupyterlab  enabled
    - Validating...
      jupyterlab 2.2.9 OK

aws-jupyter-proxy==0.1.0
jupyter-client==6.1.7
jupyter-core==4.7.0
jupyterlab==2.2.9
jupyterlab-pygments==0.1.2
jupyterlab-server==1.2.0
```
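
The minimum versions listed in the prerequisites (JupyterLab 2.2.6, Node.js 12.0) can also be checked with a script rather than by eye. The following is a minimal sketch, not part of the DataBrew tooling: `version_ge` and `check_min` are hypothetical helper names, and the comparison assumes that your `sort` supports the `-V` (version sort) option, as GNU coreutils does.

```shell
#!/usr/bin/env bash
# Sketch: compare installed versions against the minimums in the prerequisites.

# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME INSTALLED MINIMUM: print a one-line verdict (hypothetical helper).
check_min() {
  if version_ge "$2" "$3"; then
    echo "$1 $2: OK (minimum $3)"
  else
    echo "$1 $2: too old (minimum $3)"
  fi
}

# Query only the tools that are actually on the PATH.
if command -v node >/dev/null 2>&1; then
  # `node --version` prints a value such as v15.0.1; strip the leading "v".
  check_min "Node.js" "$(node --version | sed 's/^v//')" "12.0"
fi

if command -v pip3 >/dev/null 2>&1; then
  # Read the version from the `jupyterlab==X.Y.Z` line of `pip3 freeze`.
  lab_version="$(pip3 freeze 2>/dev/null | sed -n 's/^jupyterlab==//p')"
  if [ -n "$lab_version" ]; then
    check_min "JupyterLab" "$lab_version" "2.2.6"
  fi
fi
```

With the sample output shown earlier, `check_min "JupyterLab" "2.2.9" "2.2.6"` reports `JupyterLab 2.2.9: OK (minimum 2.2.6)`.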

# Configuring JupyterLab to use the extension
<a name="jupyter-configuration"></a>

After you install JupyterLab, you need to configure it to secure data access and to enable server extensions.

**To configure a password and encryption**

1. Set a password to protect the data that you plan to add in the extension. Jupyter provides a password utility. Run the following command and enter your preferred password at the prompt.

   ```
   jupyter notebook password
   ```

   The output looks something like the following.

   ```
   Enter password:
   Verify password:
   [NotebookPasswordApp] Wrote hashed password to /home/ubuntu/.jupyter/jupyter_notebook_config.json
   ```

1. Enable encryption on the Jupyter server. If you installed Jupyter on your local machine and no one can access it over the network, you can skip this step. 

   To set up encryption with Transport Layer Security (TLS), create a certificate customized for your environment. For more information, see [Using Let's Encrypt](https://jupyter-notebook.readthedocs.io/en/stable/public_server.html#using-let-s-encrypt) in [Securing a server](https://jupyter-notebook.readthedocs.io/en/stable/public_server.html#securing-a-notebook-server) in the Jupyter documentation.

1. To start JupyterLab, run the following command at the command prompt.

   ```
   jupyter lab
   ```

   For more information, see [Starting JupyterLab](https://JupyterLab.readthedocs.io/en/stable/getting_started/starting.html) in the JupyterLab documentation.

1. While JupyterLab is running, you can access it at a URL similar to the following: [http://localhost:8888/lab](http://localhost:8888/lab). If you set up encryption, use `https` instead of `http`. If you customized the port, substitute your port number for `8888`. 
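
How the URL changes with encryption and a custom port can be sketched as a small shell function. This is an illustration only: `lab_url` is a hypothetical helper, and the certificate and key file names in the comment are placeholders for files that you create yourself.

```shell
#!/usr/bin/env bash
# lab_url [PORT] [TLS]: print the local JupyterLab URL (hypothetical helper).
#   PORT defaults to 8888; pass "yes" as TLS if you set up encryption.
lab_url() {
  local port="${1:-8888}"
  local tls="${2:-no}"
  local scheme="http"
  if [ "$tls" = "yes" ]; then
    scheme="https"
  fi
  echo "${scheme}://localhost:${port}/lab"
}

lab_url            # prints http://localhost:8888/lab
lab_url 9999 yes   # prints https://localhost:9999/lab

# A server that matches the second URL could be started with a command like
# the following (mycert.pem and mykey.key are placeholder file names):
#   jupyter lab --port=9999 --certfile=mycert.pem --keyfile=mykey.key
```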

Use the following procedure to enable the third-party extensions.

**To enable third-party extensions in JupyterLab**

1. On the JupyterLab webpage, choose the **Extension Manager** icon in the menu at left. 

1. Read the warning about the risks of running third-party extensions. Only install extensions from developers that you trust.

1. To enable third-party extensions in JupyterLab, choose **Enable**.

1. Follow the prompts to rebuild and reload JupyterLab.

# Enabling the DataBrew extension for JupyterLab
<a name="jupyter-enabling-databrew"></a>

After you have a secure installation of JupyterLab with extensions enabled, install the DataBrew extension so you can run DataBrew in your notebook. 

**To install the extensions for DataBrew (console)**

1. To start JupyterLab, run the following command at the command prompt.

   ```
   jupyter lab
   ```

1. On the JupyterLab webpage, choose the **Extension Manager** icon in the menu at left. 

1. Search for the DataBrew extension by entering "**brew**" for **Search** at top left. 

1. Locate **aws\_glue\_databrew\_jupyter** in the list, but don't click it. If you click the highlighted name of the extension, a new browser window opens with the [aws\_glue\_databrew\_jupyter](https://github.com/aws/aws-glue-databrew-jupyter-extension#readme) page on GitHub. 

1. To install the DataBrew extension, choose one of the following:
   + At the command line, run `jupyter labextension install aws_glue_databrew_jupyter`.
   + Choose **Install** at the bottom of the extension card, underneath "**aws\_glue\_databrew\_jupyter**" in gray lettering. 

   The DataBrew extension is compatible with JupyterLab versions 1.2 and 2.x.

1. To verify that it installed, run `jupyter labextension list`. The output should look something like the following.

   ```
   JupyterLab v2.2.9
   Known labextensions:
      app dir: /usr/local/share/jupyter/lab  # varies by OS
           aws_glue_databrew_jupyter v1.0.1  enabled  OK
   ```

1. Rebuild JupyterLab by using one of the following:
   + At the command prompt, run `jupyter lab build`.
   + In the webpage, choose **Rebuild** at top left.

1. When the build is complete, do one of the following:
   + At the command prompt, run `jupyter lab`.
   + In the webpage, choose **Reload** on the **Build Complete** message.

1. In the JupyterLab webpage, close the **Extension Manager** by choosing its icon in the menu at left. 

   To open the extension, choose **Launch AWS Glue DataBrew** from the **Other** section on the **Launcher** tab. The extension uses your current AWS CLI configuration for access keys and AWS region settings. 

After you complete the setup, you can use the **AWS Glue DataBrew** tab to interact with DataBrew from within JupyterLab. 
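
The verification step in the procedure can also be scripted, for example in a setup check. The following is a minimal sketch, assuming the `jupyter labextension list` output resembles the sample shown in the procedure; `databrew_extension_ok` is a hypothetical helper name, not part of JupyterLab or DataBrew.

```shell
#!/usr/bin/env bash
# databrew_extension_ok: read `jupyter labextension list` output on stdin and
# succeed only if the DataBrew extension line reports both "enabled" and "OK".
databrew_extension_ok() {
  grep -E 'aws_glue_databrew_jupyter[[:space:]].*enabled.*OK' >/dev/null
}

# Example input, copied from the sample output in the procedure.
sample_output='JupyterLab v2.2.9
Known labextensions:
   app dir: /usr/local/share/jupyter/lab
        aws_glue_databrew_jupyter v1.0.1  enabled  OK'

if printf '%s\n' "$sample_output" | databrew_extension_ok; then
  echo "DataBrew extension is installed and enabled"
else
  echo "DataBrew extension not found or not enabled"
fi

# Against a real installation, merge stderr into stdout first, because some
# JupyterLab versions write the list to stderr:
#   jupyter labextension list 2>&1 | databrew_extension_ok
```

The function only inspects text, so it reports what `jupyter labextension list` printed. It doesn't confirm that your AWS CLI credentials or Region settings are valid.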