

# Using Sensitive Data Detection outside AWS Glue Studio
<a name="aws-glue-api-sensitive-data-example"></a>

 AWS Glue Studio lets you detect sensitive data, but you can also use the Sensitive Data Detection functionality outside of AWS Glue Studio. 

 For a full list of managed sensitive data types, see [Managed data types](https://docs.aws.amazon.com/glue/latest/dg/sensitive-data-managed-data-types.html). 

## Detecting Sensitive Data using AWS Managed PII types
<a name="sensitive-data-managed-pii-types"></a>

 AWS Glue provides two APIs in an AWS Glue ETL job. These are `detect()`, which has two overloads, and `classifyColumns()`: 

```
detect(frame: DynamicFrame, 
       entityTypesToDetect: Seq[String], 
       outputColumnName: String = "DetectedEntities",
       detectionSensitivity: String = "LOW"): DynamicFrame

detect(frame: DynamicFrame, 
       detectionParameters: JsonOptions,
       outputColumnName: String = "DetectedEntities",
       detectionSensitivity: String = "LOW"): DynamicFrame

classifyColumns(frame: DynamicFrame, 
                entityTypesToDetect: Seq[String], 
                sampleFraction: Double = 0.1, 
                thresholdFraction: Double = 0.1,
                detectionSensitivity: String = "LOW")
```

 You can use the `detect()` API to identify AWS Managed PII types and custom entity types. It automatically adds a new column that holds the detection result. The `classifyColumns()` API returns a map where keys are column names and values are lists of detected entity types. `sampleFraction` is the fraction of the data to sample when scanning for PII entities, whereas `thresholdFraction` is the fraction of the sampled data that must match an entity type for a column to be identified as PII data. 
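
 The threshold rule can be illustrated without AWS Glue. The following is a minimal sketch, assuming a column is flagged when the matching share of already-sampled cells reaches `thresholdFraction`. The object name, the helper names, and the simple email-like pattern are hypothetical stand-ins, not the detection logic that AWS Glue uses. 

```scala
object ThresholdSketch {
  // Hypothetical stand-in for a PII matcher; NOT the AWS Glue detection logic.
  private val emailLike = "[^@\\s]+@[^@\\s]+\\.[^@\\s]+".r

  /** Fraction of the given (already sampled) cells that fully match the stand-in pattern. */
  def matchFraction(cells: Seq[String]): Double =
    if (cells.isEmpty) 0.0
    else cells.count(c => emailLike.pattern.matcher(c).matches()).toDouble / cells.size

  /** Assumed rule: flag the column when the matched fraction reaches the threshold. */
  def isFlagged(cells: Seq[String], thresholdFraction: Double): Boolean =
    matchFraction(cells) >= thresholdFraction
}
```

 With two email-like values among ten sampled cells, the column is flagged at a threshold of 0.1 but not at 0.5. 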

### Row-level detection
<a name="w2aac67c11c24c19b9c11"></a>

 In the following example, the job performs these actions using the `detect()` API: 
+  reading data from an Amazon S3 bucket and turning it into a DynamicFrame 
+  detecting instances of the `EMAIL` and `CREDIT_CARD` entity types in the DynamicFrame 
+  returning a DynamicFrame with the original values plus one column that holds the detection result for each row 
+  writing the returned DynamicFrame to another Amazon S3 path 

```
  import com.amazonaws.services.glue.GlueContext
  import com.amazonaws.services.glue.MappingSpec
  import com.amazonaws.services.glue.errors.CallSite
  import com.amazonaws.services.glue.util.GlueArgParser
  import com.amazonaws.services.glue.util.Job
  import com.amazonaws.services.glue.util.JsonOptions
  import org.apache.spark.SparkContext
  import scala.collection.JavaConverters._
  import com.amazonaws.services.glue.ml.EntityDetector
  
  object GlueApp {
    def main(sysArgs: Array[String]) {
      val spark: SparkContext = new SparkContext()
      val glueContext: GlueContext = new GlueContext(spark)
      val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
      Job.init(args("JOB_NAME"), glueContext, args.asJava)
      val frame= glueContext.getSourceWithFormat(formatOptions=JsonOptions("""{"quoteChar": "\"", "withHeader": true, "separator": ","}"""), connectionType="s3", format="csv", options=JsonOptions("""{"paths": ["s3://pathToSource"], "recurse": true}"""), transformationContext="AmazonS3_node1650160158526").getDynamicFrame()
  
      val frameWithDetectedPII = EntityDetector.detect(frame, Seq("EMAIL", "CREDIT_CARD"))
  
      glueContext.getSinkWithFormat(connectionType="s3", options=JsonOptions("""{"path": "s3://pathToOutput/", "partitionKeys": []}"""), transformationContext="someCtx", format="json").writeDynamicFrame(frameWithDetectedPII)
  
      Job.commit()
    }
  }
```

### Row-level detection with fine-grained actions
<a name="w2aac67c11c24c19b9c15"></a>

 In the following example, the job performs these actions using the `detect()` API: 
+  reading data from an Amazon S3 bucket and turning it into a DynamicFrame 
+  detecting the sensitive data types `USA_DRIVING_LICENSE`, `BANK_ACCOUNT`, `USA_SSN`, `IP_ADDRESS`, and `PHONE_NUMBER` in the DynamicFrame 
+  returning a DynamicFrame with masked values plus one column that holds the detection result for each row 
+  writing the returned DynamicFrame to another Amazon S3 path 

 In contrast with the `detect()` call in the previous example, this example uses fine-grained actions for the entity types to detect. For more information, see [Detection parameters for using `detect()`](#sensitive-data-detect-parameters-fine-grained-actions). 

```
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.MappingSpec
import com.amazonaws.services.glue.errors.CallSite
import com.amazonaws.services.glue.util.GlueArgParser
import com.amazonaws.services.glue.util.Job
import com.amazonaws.services.glue.util.JsonOptions
import org.apache.spark.SparkContext
import scala.collection.JavaConverters._
import com.amazonaws.services.glue.ml.EntityDetector

object GlueApp {
  def main(sysArgs: Array[String]) {
    val spark: SparkContext = new SparkContext()
    val glueContext: GlueContext = new GlueContext(spark)
    val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
    Job.init(args("JOB_NAME"), glueContext, args.asJava)
    val frame = glueContext.getSourceWithFormat(formatOptions=JsonOptions("""{"quoteChar": "\"", "withHeader": true, "separator": ","}"""), connectionType="s3", format="csv", options=JsonOptions("""{"paths": ["s3://pathToSource"], "recurse": true}"""), transformationContext="AmazonS3_node_source").getDynamicFrame()

    val detectionParameters = JsonOptions(
      """
        {
          "USA_DRIVING_LICENSE": [{
            "action": "PARTIAL_REDACT",
            "sourceColumns": ["Driving License"],
            "actionOptions": {
              "matchPattern": "[0-9]",
              "redactChar": "*"
            }
          }],
          "BANK_ACCOUNT": [{
            "action": "DETECT",
            "sourceColumns": ["*"]
          }],
          "USA_SSN": [{
            "action": "SHA256_HASH",
            "sourceColumns": ["SSN"]
          }],
          "IP_ADDRESS": [{
            "action": "REDACT",
            "sourceColumns": ["IP Address"],
            "actionOptions": {"redactText": "*****"}
          }],
          "PHONE_NUMBER": [{
            "action": "PARTIAL_REDACT",
            "sourceColumns": ["Phone Number"],
            "actionOptions": {
              "numLeftCharsToExclude": 1,
              "numRightCharsToExclude": 0,
              "redactChar": "*"
            }
          }]
        }
      """
    )

    val frameWithDetectedPII = EntityDetector.detect(frame, detectionParameters, "DetectedEntities", "HIGH")

    glueContext.getSinkWithFormat(connectionType="s3", options=JsonOptions("""{"path": "s3://pathToOutput/", "partitionKeys": []}"""), transformationContext="AmazonS3_node_target", format="json").writeDynamicFrame(frameWithDetectedPII)

    Job.commit()
  }
}
```

### Column-level detection
<a name="w2aac67c11c24c19b9c19"></a>

 In the following example, the job performs these actions using the `classifyColumns()` API: 
+  reading data from an Amazon S3 bucket and turning it into a DynamicFrame 
+  detecting instances of the `CREDIT_CARD` and `PHONE_NUMBER` entity types in the DynamicFrame 
+  setting parameters to sample 100% of each column, mark an entity as detected if it appears in at least 10% of the sampled cells, and use "LOW" sensitivity 
+  receiving a map where keys are column names and values are lists of detected entity types 
+  converting the map to a DynamicFrame and writing it to another Amazon S3 path 

```
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.MappingSpec
import com.amazonaws.services.glue.errors.CallSite
import com.amazonaws.services.glue.util.GlueArgParser
import com.amazonaws.services.glue.util.Job
import com.amazonaws.services.glue.util.JsonOptions
import org.apache.spark.SparkContext
import scala.collection.JavaConverters._
import com.amazonaws.services.glue.DynamicFrame
import com.amazonaws.services.glue.ml.EntityDetector

object GlueApp {
  def main(sysArgs: Array[String]) {
    val spark: SparkContext = new SparkContext()
    val glueContext: GlueContext = new GlueContext(spark)
    val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
    Job.init(args("JOB_NAME"), glueContext, args.asJava)
    val frame = glueContext.getSourceWithFormat(formatOptions=JsonOptions("""{"quoteChar": "\"", "withHeader": true, "separator": ",", "optimizePerformance": false}"""), connectionType="s3", format="csv", options=JsonOptions("""{"paths": ["s3://pathToSource"], "recurse": true}"""), transformationContext="frame").getDynamicFrame()
    
    import glueContext.sparkSession.implicits._

    val detectedDataFrame = EntityDetector.classifyColumns(
        frame, 
        entityTypesToDetect = Seq("CREDIT_CARD", "PHONE_NUMBER"), 
        sampleFraction = 1.0, 
        thresholdFraction = 0.1,
        detectionSensitivity = "LOW"
    )
    val detectedDF = (detectedDataFrame).toSeq.toDF("columnName", "entityTypes")
    val DetectSensitiveData_node = DynamicFrame(detectedDF, glueContext)

    glueContext.getSinkWithFormat(connectionType="s3", options=JsonOptions("""{"path": "s3://pathToOutput", "partitionKeys": []}"""), transformationContext="someCtx", format="json").writeDynamicFrame(DetectSensitiveData_node)

    Job.commit()
  }
}
```
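
 After `classifyColumns()` returns, the result can be inspected with ordinary collection operations before it is written anywhere. The following is a minimal sketch, assuming the result behaves like a `Map[String, Seq[String]]`. The object name, the type alias, and both helper names are hypothetical, and the sketch works on a hard-coded map rather than calling AWS Glue. 

```scala
object ClassifiedColumnsSketch {
  // Hypothetical result shape: column name -> detected entity types.
  type Classification = Map[String, Seq[String]]

  /** Column names with at least one detected entity type, sorted for stable output. */
  def sensitiveColumns(result: Classification): Seq[String] =
    result.collect { case (column, types) if types.nonEmpty => column }.toSeq.sorted

  /** Column names in which a specific entity type was detected, sorted for stable output. */
  def columnsWith(result: Classification, entityType: String): Seq[String] =
    result.collect { case (column, types) if types.contains(entityType) => column }.toSeq.sorted
}
```

 A list like the one returned by `sensitiveColumns` could then drive a later masking or column-dropping step in the script. 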

## Detecting Sensitive Data using AWS CustomEntityType PII types
<a name="sensitive-data-custom-entity-PII-types"></a>

 You can define custom entities through AWS Glue Studio. However, to use this feature outside of AWS Glue Studio, you must first define the custom entity types and then add them to the list of `entityTypesToDetect`. 

 If you have specific sensitive data types in your data (such as 'Employee Id'), you can create custom entities by calling the `CreateCustomEntityType()` API. The following example defines the custom entity type 'EMPLOYEE_ID' by passing these request parameters to the `CreateCustomEntityType()` API: 

```
  { 
      "name": "EMPLOYEE_ID",
      "regexString": "\\d{4}-\\d{3}",
      "contextWords": ["employee"]
  }
```
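
 Before registering a pattern, it can help to try it on sample values locally. The following is a minimal sketch that applies the employee ID pattern from the request above with the Scala standard library. The object and method names are hypothetical, and the sketch checks only the regular expression. It does not model how AWS Glue applies `contextWords`. 

```scala
object EmployeeIdRegexSketch {
  // The employee ID pattern from the request above: four digits, a hyphen, three digits.
  private val employeeId = """\d{4}-\d{3}""".r

  /** Returns every substring of `text` that matches the pattern (unanchored search). */
  def findAll(text: String): Seq[String] = employeeId.findAllIn(text).toList
}
```

 Because this search is unanchored, it also finds `2345-678` inside the longer string `12345-6789`. If that kind of partial match is unwanted in your data, consider tightening the pattern before creating the custom entity type. 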

 Then, modify the job to use the new custom sensitive data type by adding the custom entity type (EMPLOYEE_ID) to the `EntityDetector.detect()` call: 

```
  import com.amazonaws.services.glue.GlueContext
  import com.amazonaws.services.glue.MappingSpec
  import com.amazonaws.services.glue.errors.CallSite
  import com.amazonaws.services.glue.util.GlueArgParser
  import com.amazonaws.services.glue.util.Job
  import com.amazonaws.services.glue.util.JsonOptions
  import org.apache.spark.SparkContext
  import scala.collection.JavaConverters._
  import com.amazonaws.services.glue.ml.EntityDetector
  
  object GlueApp {
    def main(sysArgs: Array[String]) {
      val spark: SparkContext = new SparkContext()
      val glueContext: GlueContext = new GlueContext(spark)
      val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
      Job.init(args("JOB_NAME"), glueContext, args.asJava)
      val frame= glueContext.getSourceWithFormat(formatOptions=JsonOptions("""{"quoteChar": "\"", "withHeader": true, "separator": ","}"""), connectionType="s3", format="csv", options=JsonOptions("""{"paths": ["s3://pathToSource"], "recurse": true}"""), transformationContext="AmazonS3_node1650160158526").getDynamicFrame()
  
      val frameWithDetectedPII = EntityDetector.detect(frame, Seq("EMAIL", "CREDIT_CARD", "EMPLOYEE_ID"))
  
      glueContext.getSinkWithFormat(connectionType="s3", options=JsonOptions("""{"path": "s3://pathToOutput/", "partitionKeys": []}"""), transformationContext="someCtx", format="json").writeDynamicFrame(frameWithDetectedPII)
  
      Job.commit()
    }
  }
```

**Note**  
 If a custom sensitive data type is defined with the same name as an existing managed entity type, then the custom sensitive data type takes precedence and overrides the managed entity type's logic. 

## Detection parameters for using `detect()`
<a name="sensitive-data-detect-parameters-fine-grained-actions"></a>

 This method detects entities in a DynamicFrame. It returns a new DynamicFrame with the original values and an additional column, `outputColumnName`, that holds the PII detection metadata. Custom masking can be done within the AWS Glue script after this DynamicFrame is returned, or you can use the `detect()` API with fine-grained actions instead. 

```
detect(frame: DynamicFrame, 
       entityTypesToDetect: Seq[String], 
       outputColumnName: String = "DetectedEntities",
       detectionSensitivity: String = "LOW"): DynamicFrame
```

 Parameters: 
+  **frame** – (type: `DynamicFrame`) The input DynamicFrame containing the data to be processed. 
+  **entityTypesToDetect** – (type: `Seq[String]`) List of entity types to detect. Can be either Managed Entity Types or Custom Entity Types. 
+  **outputColumnName** – (type: `String`, default: "DetectedEntities") The name of the column where detected entities will be stored. If not provided, the default column name is "DetectedEntities". 
+  **detectionSensitivity** – (type: `String`, options: "LOW" or "HIGH", default: "LOW") Specifies the sensitivity of the detection process. Valid options are "LOW" or "HIGH". If not provided, the default sensitivity is set to "LOW". 

 `outputColumnName` settings: 

 The name of the column where detected entities will be stored. If not provided, the default column name is "DetectedEntities". For each row, the output column contains a map from each source column name to a list of detected entity metadata with the following key-value pairs: 
+  **entityType** – The detected entity type. 
+  **start** – The starting position of the detected entity in the original data. 
+  **end** – The ending position of the detected entity in the original data. 
+  **actionUsed** – The action performed on the detected entity (for example, "DETECT", "REDACT", "PARTIAL_REDACT", or "SHA256_HASH"). 

 Example: 

```
{
   "DetectedEntities":{
      "SSN Col":[
         {
            "entityType":"USA_SSN",
            "actionUsed":"DETECT",
            "start":4,
            "end":15
         }
      ],
      "Random Data col":[
         {
            "entityType":"BANK_ACCOUNT",
            "actionUsed":"PARTIAL_REDACT",
            "start":4,
            "end":13
         },
         {
            "entityType":"IP_ADDRESS",
            "actionUsed":"REDACT",
            "start":4,
            "end":13
         }
      ]
   }
}
```
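
 The `start` and `end` positions can be used to recover the detected text from the original cell value. In the example above, `start` 4 and `end` 15 span 11 characters, which is the length of a formatted SSN such as `123-45-6789`, so the following sketch treats `start` as inclusive and `end` as exclusive. That interpretation is an assumption drawn from the example, and the case class and helper name are hypothetical. 

```scala
object DetectedSpanSketch {
  // Hypothetical case class mirroring the four metadata keys listed above.
  final case class DetectedEntity(entityType: String, actionUsed: String, start: Int, end: Int)

  /** Extracts the detected text, ASSUMING `start` is inclusive and `end` is exclusive. */
  def extract(cellValue: String, entity: DetectedEntity): String =
    cellValue.substring(entity.start, entity.end)
}
```

 For a cell value of `SSN 123-45-6789` and the `USA_SSN` entry shown above, `extract` returns `123-45-6789`. 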

 **Detection Parameters for `detect()` with fine-grained actions** 

 This method detects entities in a DynamicFrame using the specified parameters. It returns a new DynamicFrame in which the original sensitive values are replaced with masked values, plus an additional column, `outputColumnName`, that holds the PII detection metadata. 

```
detect(frame: DynamicFrame, 
       detectionParameters: JsonOptions,
       outputColumnName: String = "DetectedEntities",
       detectionSensitivity: String = "LOW"): DynamicFrame
```

 Parameters: 
+  **frame** – (type: `DynamicFrame`) The input DynamicFrame containing the data to be processed. 
+  **detectionParameters** – (type: `JsonOptions`) JSON options specifying parameters for the detection process. 
+  **outputColumnName** – (type: `String`, default: "DetectedEntities") The name of the column where detected entities will be stored. If not provided, the default column name is "DetectedEntities". 
+  **detectionSensitivity** – (type: `String`, options: "LOW" or "HIGH", default: "LOW") Specifies the sensitivity of the detection process. Valid options are "LOW" or "HIGH". If not provided, the default sensitivity is set to "LOW". 

<a name="detection-parameters-settings"></a> `detectionParameters` settings 

 If no settings are included, default values will be used. 
+  **action** – (type: `String`, options: "DETECT", "REDACT", "PARTIAL_REDACT", "SHA256_HASH") Specifies the action to be performed on the entity. Required. Note that actions that perform masking (all but "DETECT") can only perform one action per column. This is a preventative measure for masking coalesced entities. 
+  **sourceColumns** – (type: `List[String]`, default: ["*"]) List of source column names to perform detection on for the entity. Defaults to ["*"] if not present. Raises `IllegalArgumentException` if an invalid column name is used. 
+  **sourceColumnsToExclude** – (type: `List[String]`) List of source column names to exclude from detection for the entity. Use either `sourceColumns` or `sourceColumnsToExclude`. Raises `IllegalArgumentException` if an invalid column name is used. 
+  **actionOptions** – Additional options based on the specified action: 
  +  For "DETECT" and "SHA256_HASH", no options are allowed. 
  +  For "REDACT": 
    +  **redactText** – (type: `String`, default: "*****") Text to replace the detected entity. 
  +  For "PARTIAL_REDACT": 
    +  **redactChar** – (type: `String`, default: "*") Character to replace each detected character in the entity. 
    +  **matchPattern** – (type: `String`) Regex pattern for partial redaction. Cannot be combined with `numLeftCharsToExclude` or `numRightCharsToExclude`. 
    +  **numLeftCharsToExclude** – (type: `String` or integer) Number of left characters to exclude. Cannot be combined with `matchPattern`, but can be used with `numRightCharsToExclude`. 
    +  **numRightCharsToExclude** – (type: `String` or integer) Number of right characters to exclude. Cannot be combined with `matchPattern`, but can be used with `numLeftCharsToExclude`. 
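
 To make the "PARTIAL_REDACT" options more concrete, the following is a minimal sketch in plain Scala. It rests on two assumptions that the descriptions above suggest but do not spell out: the excluded left and right characters are left unmasked, and `matchPattern` masks each matched character. The object and function names are hypothetical, and the code does not call AWS Glue. 

```scala
object PartialRedactSketch {
  /** Assumed semantics: keep `left` leading and `right` trailing characters, mask the rest. */
  def maskMiddle(value: String, left: Int, right: Int, redactChar: Char = '*'): String = {
    // Clamp both counts so they never exceed the length of the value.
    val keepLeft = math.min(math.max(left, 0), value.length)
    val keepRight = math.min(math.max(right, 0), value.length - keepLeft)
    val maskedLen = value.length - keepLeft - keepRight
    value.take(keepLeft) + (redactChar.toString * maskedLen) + value.takeRight(keepRight)
  }

  /** Assumed semantics: replace every character matched by `matchPattern` with `redactChar`. */
  def maskPattern(value: String, matchPattern: String, redactChar: Char = '*'): String =
    matchPattern.r.replaceAllIn(
      value,
      m => java.util.regex.Matcher.quoteReplacement(redactChar.toString * m.matched.length)
    )
}
```

 Under these assumptions, `maskMiddle("5551234567", 1, 0)` keeps only the first digit, which mirrors the `PHONE_NUMBER` settings in the earlier example, and `maskPattern("D1234567", "[0-9]")` masks only the digits, which mirrors the `USA_DRIVING_LICENSE` settings. 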

 `outputColumnName` settings 

 [See outputColumnName settings](#sensitive-data-detect-parameters-fine-grained-actions) 

## Detection Parameters for `classifyColumns()`
<a name="detection-parameters-classifycolumns"></a>

 This method detects entities in a DynamicFrame. It returns a map where keys are column names and values are lists of detected entity types. Custom masking can be done within the AWS Glue script after the map is returned. 

```
classifyColumns(frame: DynamicFrame, 
                entityTypesToDetect: Seq[String], 
                sampleFraction: Double = 0.1, 
                thresholdFraction: Double = 0.1,
                detectionSensitivity: String = "LOW")
```

 Parameters: 
+  **frame** – (type: `DynamicFrame`) The input DynamicFrame containing the data to be processed. 
+  **entityTypesToDetect** – (type: `Seq[String]`) List of entity types to detect. Can be either Managed Entity Types or Custom Entity Types. 
+  **sampleFraction** – (type: `Double`, default: 0.1, that is, 10%) The fraction of the data to sample when scanning for PII entities. 
+  **thresholdFraction** – (type: `Double`, default: 0.1, that is, 10%) The fraction of the sampled data that must match an entity type for a column to be identified as PII data. 
+  **detectionSensitivity** – (type: `String`, options: "LOW" or "HIGH", default: "LOW") Specifies the sensitivity of the detection process. Valid options are "LOW" or "HIGH". If not provided, the default sensitivity is set to "LOW". 

# Managed Sensitive Data Types
<a name="sensitive-data-managed-data-types"></a>

 **Global entities** 


| Data Type | Category | Description | 
| --- | --- | --- | 
| PERSON_NAME | Universal | The name of the person. |
| EMAIL | Personal | The email address. |
| IP_ADDRESS | Computer | The IP address. |
| MAC_ADDRESS | Personal | The MAC address. |



 **US data types** 


| Data Type | Description | 
| --- | --- | 
| BANK_ACCOUNT | The bank account number. Not specific to a country or region; however, only US and Canadian account formats are detected. |
| CREDIT_CARD | The credit card number. |
| PHONE_NUMBER | The phone number. Not specific to a country or region; however, only US and Canadian phone numbers are detected at this time. |
| USA_ATIN | The US Adoption Taxpayer Identification Number issued by the Internal Revenue Service. |
| USA_CPT_CODE | The CPT Code (US specific). |
| USA_DEA_NUMBER | The DEA number (US specific). |
| USA_DRIVING_LICENSE | The driver license number (US specific). |
| USA_HCPCS_CODE | The HCPCS code (US specific). |
| USA_HEALTH_INSURANCE_CLAIM_NUMBER | Health Insurance Claim Number (US specific). |
| USA_ITIN | The ITIN (for US persons or entities). |
| USA_MEDICARE_BENEFICIARY_IDENTIFIER | Medicare Beneficiary Identifier (US specific). |
| USA_NATIONAL_DRUG_CODE | The NDC code (US specific). |
| USA_NATIONAL_PROVIDER_IDENTIFIER | The National Provider Identifier number (US specific). |
| USA_PASSPORT_NUMBER | The passport number (for US persons). |
| USA_PTIN | The US Preparer Tax Identification Number issued by the Internal Revenue Service. |
| USA_SSN | The social security number (for US persons). |



 **Argentina data types** 


| Data Type | Description | 
| --- | --- | 
| ARGENTINA_TAX_IDENTIFICATION_NUMBER | Argentina Tax Identification Number. Also known as CUIT or CUIL. |

 **Australian data types** 


| Data Type | Description | 
| --- | --- | 
| AUSTRALIA_BUSINESS_NUMBER | Australia Business Number (ABN). A unique identifier issued by the Australian Business Register (ABR) to identify businesses to the government and community. |
| AUSTRALIA_COMPANY_NUMBER | Australia Company Number (ACN). Unique identifier issued by the Australian Securities and Investments Commission. |
| AUSTRALIA_DRIVING_LICENSE | A driver’s license number for Australia. |
| AUSTRALIA_MEDICARE_NUMBER | Australian Medicare Number. Personal identifier issued by the Australian Health Insurance Commission. |
| AUSTRALIA_PASSPORT_NUMBER | Australian passport number. |
| AUSTRALIA_TAX_FILE_NUMBER | Australia Tax File Number (TFN). Issued by the Australian Taxation Office (ATO) to taxpayers (individual, company, etc.) for tax dealings. |

 **Austria data types** 


| Data Type | Description | 
| --- | --- | 
| AUSTRIA_DRIVING_LICENSE | The driver license number (Austria specific). |
| AUSTRIA_PASSPORT_NUMBER | The passport number (Austria specific). |
| AUSTRIA_SSN | The social security number (for Austria persons). |
| AUSTRIA_TAX_IDENTIFICATION_NUMBER | Tax identification number (Austria specific). |
| AUSTRIA_VALUE_ADDED_TAX | Value-Added Tax (Austria specific). |

 **Balkans data types** 


| Data Type | Description | 
| --- | --- | 
| BOSNIA_UNIQUE_MASTER_CITIZEN_NUMBER | Unique master citizen number (JMBG) for Bosnia-Herzegovina citizens. |
| KOSOVO_UNIQUE_MASTER_CITIZEN_NUMBER | Unique master citizen number (JMBG) for Kosovo. |
| MACEDONIA_UNIQUE_MASTER_CITIZEN_NUMBER | Unique master citizen number for Macedonia. |
| MONTENEGRO_UNIQUE_MASTER_CITIZEN_NUMBER | Unique master citizen number (JMBG) for Montenegro. |
| SERBIA_UNIQUE_MASTER_CITIZEN_NUMBER | Unique master citizen number (JMBG) for Serbia. |
| SERBIA_VALUE_ADDED_TAX | Value-Added Tax (Serbia specific). |
| VOJVODINA_UNIQUE_MASTER_CITIZEN_NUMBER | Unique master citizen number (JMBG) for Vojvodina. |

 **Belgium data types** 


| Data Type | Description | 
| --- | --- | 
| BELGIUM_DRIVING_LICENSE | The driver license number (Belgium specific). |
| BELGIUM_NATIONAL_IDENTIFICATION_NUMBER | The Belgian National Number (BNN). |
| BELGIUM_PASSPORT_NUMBER | The passport number (Belgium specific). |
| BELGIUM_TAX_IDENTIFICATION_NUMBER | Tax identification number (Belgium specific). |
| BELGIUM_VALUE_ADDED_TAX | Value-Added Tax (Belgium specific). |

 **Brazil data types** 


| Data Type | Description | 
| --- | --- | 
| BRAZIL_BANK_ACCOUNT | The bank account number (Brazil specific). |
| BRAZIL_NATIONAL_IDENTIFICATION_NUMBER | The national identifier (Brazil specific). |
| BRAZIL_NATIONAL_REGISTRY_OF_LEGAL_ENTITIES_NUMBER | The identification number issued to companies (Brazil specific), also known as the CNPJ. |
| BRAZIL_NATURAL_PERSON_REGISTRY_NUMBER | Natural Person Registry Number, also known as CPF. |

 **Bulgaria data types** 


| Data Type | Description | 
| --- | --- | 
| BULGARIA_DRIVING_LICENSE | The driver license number (Bulgaria specific). |
| BULGARIA_UNIFORM_CIVIL_NUMBER | Unified Civil Number (EGN) that serves as a national identification number. |
| BULGARIA_VALUE_ADDED_TAX | Value-Added Tax (Bulgaria specific). |

 **Canada data types** 


| Data Type | Description | 
| --- | --- | 
| CANADA_DRIVING_LICENSE | The driver license number (Canada specific). |
| CANADA_GOVERNMENT_IDENTIFICATION_CARD_NUMBER | The national identifier (Canada specific). |
| CANADA_PASSPORT_NUMBER | The passport number (Canada specific). |
| CANADA_PERMANENT_RESIDENCE_NUMBER | Permanent residence number (PR Card number). |
| CANADA_PERSONAL_HEALTH_NUMBER | The unique identifier for healthcare (PHN number). |
| CANADA_SOCIAL_INSURANCE_NUMBER | The social insurance number (SIN) in Canada. |

 **Chile data types** 


| Data Type | Description | 
| --- | --- | 
| CHILE_DRIVING_LICENSE | The driver license number (Chile specific). |
| CHILE_NATIONAL_IDENTIFICATION_NUMBER | The Chile national identifier, also known as RUT or RUN. |

 **China, Hong Kong, Macau, and Taiwan data types** 


| Data Type | Description | 
| --- | --- | 
| CHINA_IDENTIFICATION | The China identifier. |
| CHINA_LICENSE_PLATE_NUMBER | The license plate number (China specific). |
| CHINA_MAINLAND_TRAVEL_PERMIT_ID_HONG_KONG_MACAU | The Mainland Travel Permit for Hong Kong and Macao Residents. |
| CHINA_MAINLAND_TRAVEL_PERMIT_ID_TAIWAN | The Mainland Travel Permit for Taiwan Residents issued by Government of the People's Republic of China (PRC). |
| CHINA_PASSPORT_NUMBER | The passport number (China specific). |
| CHINA_PHONE_NUMBER | The phone number (China specific). |
| HONG_KONG_IDENTITY_CARD | The official identity document issued by the Immigration Department of Hong Kong. |
| MACAU_RESIDENT_IDENTITY_CARD | The Macau Resident Identity Card or BIR is an official identity card issued by the Identification Services Bureau of Macau. |
| TAIWAN_NATIONAL_IDENTIFICATION_NUMBER | The national identifier (Taiwan specific). |
| TAIWAN_PASSPORT_NUMBER | The passport number (Taiwan specific). |

 **Colombia data types** 


| Data Type | Description | 
| --- | --- | 
| COLOMBIA_PERSONAL_IDENTIFICATION_NUMBER | Unique identifier assigned to Colombians at birth. |
| COLOMBIA_TAX_IDENTIFICATION_NUMBER | Tax identification number (Colombia specific). |

 **Croatia data types** 


| Data Type | Description | 
| --- | --- | 
| CROATIA_DRIVING_LICENSE | The driver license number (Croatia specific). |
| CROATIA_IDENTITY_NUMBER | The national identifier (Croatia specific). |
| CROATIA_PASSPORT_NUMBER | The passport number (Croatia specific). |
| CROATIA_PERSONAL_IDENTIFICATION_NUMBER | The personal identifier number (OIB). |

 **Cyprus data types** 


| Data Type | Description | 
| --- | --- | 
| CYPRUS_DRIVING_LICENSE | The driver license number (Cyprus specific). |
| CYPRUS_NATIONAL_IDENTIFICATION_NUMBER | The Cypriot identity card. |
| CYPRUS_PASSPORT_NUMBER | The passport number (Cyprus specific). |
| CYPRUS_TAX_IDENTIFICATION_NUMBER | Tax identification number (Cyprus specific). |
| CYPRUS_VALUE_ADDED_TAX | Value-Added Tax (Cyprus specific). |

 **Czechia data types** 


| Data Type | Description | 
| --- | --- | 
| CZECHIA_DRIVING_LICENSE | The driver license number (Czechia specific). |
| CZECHIA_PERSONAL_IDENTIFICATION_NUMBER | The personal identifier number (Czechia specific). |
| CZECHIA_VALUE_ADDED_TAX | Value-Added Tax (Czechia specific). |

 **Denmark data types** 


| Data Type | Description | 
| --- | --- | 
| DENMARK_DRIVING_LICENSE | The driver license number (Denmark specific). |
| DENMARK_PERSONAL_IDENTIFICATION_NUMBER | The personal identifier number (Denmark specific). |
| DENMARK_TAX_IDENTIFICATION_NUMBER | Tax identification number (Denmark specific). |
| DENMARK_VALUE_ADDED_TAX | Value-Added Tax (Denmark specific). |

 **Estonia data types** 


| Data Type | Description | 
| --- | --- | 
| ESTONIA_DRIVING_LICENSE | The driver license number (Estonia specific). |
| ESTONIA_PASSPORT_NUMBER | The passport number (Estonia specific). |
| ESTONIA_PERSONAL_IDENTIFICATION_CODE | The personal identifier number (Estonia specific). |
| ESTONIA_VALUE_ADDED_TAX | Value-Added Tax (Estonia specific). |

 **Finland data types** 


| Data Type | Description | 
| --- | --- | 
| FINLAND_DRIVING_LICENSE | The driver license number (Finland specific). |
| FINLAND_HEALTH_INSURANCE_NUMBER | The health insurance number (Finland specific). |
| FINLAND_NATIONAL_IDENTIFICATION_NUMBER | The national identifier number (Finland specific). |
| FINLAND_PASSPORT_NUMBER | The passport number (Finland specific). |
| FINLAND_VALUE_ADDED_TAX | Value-Added Tax (Finland specific). |

 **France data types** 


| Data Type | Description | 
| --- | --- | 
| FRANCE_BANK_ACCOUNT | The bank account number (France specific). |
| FRANCE_DRIVING_LICENSE | The driver license number (France specific). |
| FRANCE_HEALTH_INSURANCE_NUMBER | France health insurance number. |
| FRANCE_INSEE_CODE | France social security, SSN, or NIR number. |
| FRANCE_NATIONAL_IDENTIFICATION_NUMBER | France national identifier number (CNI). |
| FRANCE_PASSPORT_NUMBER | The passport number (France specific). |
| FRANCE_TAX_IDENTIFICATION_NUMBER | Tax identification number (France specific). |
| FRANCE_VALUE_ADDED_TAX | Value-Added Tax (France specific). |

 **Germany data types** 


| Data Type | Description | 
| --- | --- | 
| GERMANY_BANK_ACCOUNT | The bank account number (Germany specific). |
| GERMANY_DRIVING_LICENSE | The driver license number (Germany specific). |
| GERMANY_PASSPORT_NUMBER | The passport number (Germany specific). |
| GERMANY_PERSONAL_IDENTIFICATION_NUMBER | The personal identification number (Germany specific). |
| GERMANY_TAX_IDENTIFICATION_NUMBER | Tax identification number (Germany specific). |
| GERMANY_VALUE_ADDED_TAX | Value-Added Tax (Germany specific). |

 **Greece data types** 


| Data Type | Description | 
| --- | --- | 
| GREECE_DRIVING_LICENSE | The driver license number (Greece specific). |
| GREECE_PASSPORT_NUMBER | The passport number (Greece specific). |
| GREECE_SSN | The social security number (for Greece persons). |
| GREECE_TAX_IDENTIFICATION_NUMBER | Tax identification number (Greece specific). |
| GREECE_VALUE_ADDED_TAX | Value-Added Tax (Greece specific). |

 **Hungary data types** 


| Data Type | Description | 
| --- | --- | 
| HUNGARY_DRIVING_LICENSE | The driver license number (Hungary specific). |
| HUNGARY_PASSPORT_NUMBER | The passport number (Hungary specific). |
| HUNGARY_SSN | The social security number (for Hungary persons). |
| HUNGARY_TAX_IDENTIFICATION_NUMBER | Tax identification number (Hungary specific). |
| HUNGARY_VALUE_ADDED_TAX | Value-Added Tax (Hungary specific). |

 **Iceland data types** 


| Data Type | Description | 
| --- | --- | 
| ICELAND_NATIONAL_IDENTIFICATION_NUMBER | The national identifier (Iceland specific). |
| ICELAND_PASSPORT_NUMBER | The passport number (Iceland specific). |
| ICELAND_VALUE_ADDED_TAX | Value-Added Tax (Iceland specific). |

 **India data types** 


| Data Type | Description | 
| --- | --- | 
| INDIA_AADHAAR_NUMBER |  Aadhaar identification number issued by the Unique Identification Authority of India.  | 
| INDIA_PERMANENT_ACCOUNT_NUMBER |  India Permanent Account Number (PAN).  | 

 **Indonesia data types** 


| Data Type | Description | 
| --- | --- | 
| INDONESIA_IDENTITY_CARD_NUMBER |  The national identifier (Indonesia specific).  | 

 **Ireland data types** 


| Data Type | Description | 
| --- | --- | 
| IRELAND_DRIVING_LICENSE |  The driver license number (Ireland specific).  | 
| IRELAND_PASSPORT_NUMBER |  The passport number (Ireland specific).  | 
| IRELAND_PERSONAL_PUBLIC_SERVICE_NUMBER |  Ireland personal public service number (PPS).  | 
| IRELAND_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Ireland specific).  | 
| IRELAND_VALUE_ADDED_TAX |  Value-Added Tax (Ireland specific).  | 

 **Israel data types** 


| Data Type | Description | 
| --- | --- | 
| ISRAEL_IDENTIFICATION_NUMBER |  The national identifier (Israel specific).  | 

 **Italy data types** 


| Data Type | Description | 
| --- | --- | 
| ITALY_BANK_ACCOUNT |  The bank account number (Italy specific).  | 
| ITALY_DRIVING_LICENSE |  The driver license number (Italy specific).  | 
| ITALY_FISCAL_CODE |  The identifier number, also known as the Italian Codice Fiscale.  | 
| ITALY_PASSPORT_NUMBER |  The passport number (Italy specific).  | 
| ITALY_VALUE_ADDED_TAX |  Value-Added Tax (Italy specific).  | 

 **Japan data types** 


| Data Type | Description | 
| --- | --- | 
| JAPAN_BANK_ACCOUNT |  Japan bank account.  | 
| JAPAN_DRIVING_LICENSE |  A driver's license number for Japan.  | 
| JAPAN_MY_NUMBER |  The unique identifier for Japan citizens or corporations used for tax administration, social security administration, and disaster response.  | 
| JAPAN_PASSPORT_NUMBER |  Japan passport number.  | 

 **Korea data types** 


| Data Type | Description | 
| --- | --- | 
| KOREA_PASSPORT_NUMBER |  The passport number (Korea specific).  | 
| KOREA_RESIDENCE_REGISTRATION_NUMBER_FOR_CITIZENS |  Korea residence registration number for citizens.  | 
| KOREA_RESIDENCE_REGISTRATION_NUMBER_FOR_FOREIGNERS |  Korea residence registration number for foreigners.  | 

 **Latvia data types** 


| Data Type | Description | 
| --- | --- | 
| LATVIA_DRIVING_LICENSE |  The driver license number (Latvia specific).  | 
| LATVIA_PASSPORT_NUMBER |  The passport number (Latvia specific).  | 
| LATVIA_PERSONAL_IDENTIFICATION_NUMBER |  The personal identifier number (Latvia specific).  | 
| LATVIA_VALUE_ADDED_TAX |  Value-Added Tax (Latvia specific).  | 

 **Liechtenstein data types** 


| Data Type | Description | 
| --- | --- | 
| LIECHTENSTEIN_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier (Liechtenstein specific).  | 
| LIECHTENSTEIN_PASSPORT_NUMBER |  The passport number (Liechtenstein specific).  | 
| LIECHTENSTEIN_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Liechtenstein specific).  | 

 **Lithuania data types** 


| Data Type | Description | 
| --- | --- | 
| LITHUANIA_DRIVING_LICENSE |  The driver license number (Lithuania specific).  | 
| LITHUANIA_PERSONAL_IDENTIFICATION_NUMBER |  The personal identifier number (Lithuania specific).  | 
| LITHUANIA_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Lithuania specific).  | 
| LITHUANIA_VALUE_ADDED_TAX |  Value-Added Tax (Lithuania specific).  | 

 **Luxembourg data types** 


| Data Type | Description | 
| --- | --- | 
| LUXEMBOURG_DRIVING_LICENSE |  The driver license number (Luxembourg specific).  | 
| LUXEMBOURG_NATIONAL_INDIVIDUAL_NUMBER |  The national identifier (Luxembourg specific).  | 
| LUXEMBOURG_PASSPORT_NUMBER |  The passport number (Luxembourg specific).  | 
| LUXEMBOURG_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Luxembourg specific).  | 
| LUXEMBOURG_VALUE_ADDED_TAX |  Value-Added Tax (Luxembourg specific).  | 

 **Malaysia data types** 


| Data Type | Description | 
| --- | --- | 
| MALAYSIA_MYKAD_NUMBER |  The national identifier (Malaysia specific).  | 
| MALAYSIA_PASSPORT_NUMBER |  The passport number (Malaysia specific).  | 

 **Malta data types** 


| Data Type | Description | 
| --- | --- | 
| MALTA_DRIVING_LICENSE |  The driver license number (Malta specific).  | 
| MALTA_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier (Malta specific).  | 
| MALTA_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Malta specific).  | 
| MALTA_VALUE_ADDED_TAX |  Value-Added Tax (Malta specific).  | 

 **Mexico data types** 


| Data Type | Description | 
| --- | --- | 
| MEXICO_CLABE_NUMBER |  Mexico CLABE (Clave Bancaria Estandarizada) bank number.  | 
| MEXICO_DRIVING_LICENSE |  The driver license number (Mexico specific).  | 
| MEXICO_PASSPORT_NUMBER |  The passport number (Mexico specific).  | 
| MEXICO_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Mexico specific).  | 
| MEXICO_UNIQUE_POPULATION_REGISTRY_CODE |  The Clave Única de Registro de Población (CURP) unique identity code for Mexico.  | 

 **Netherlands data types** 


| Data Type | Description | 
| --- | --- | 
| NETHERLANDS_CITIZEN_SERVICE_NUMBER |  Netherlands citizen number (BSN, burgerservicenummer).  | 
| NETHERLANDS_DRIVING_LICENSE |  The driver license number (Netherlands specific).  | 
| NETHERLANDS_PASSPORT_NUMBER |  The passport number (Netherlands specific).  | 
| NETHERLANDS_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Netherlands specific).  | 
| NETHERLANDS_VALUE_ADDED_TAX |  Value-Added Tax (Netherlands specific).  | 
| NETHERLANDS_BANK_ACCOUNT |  The bank account number (Netherlands specific).  | 

 **New Zealand data types** 


| Data Type | Description | 
| --- | --- | 
| NEW_ZEALAND_DRIVING_LICENSE |  The driver license number (New Zealand specific).  | 
| NEW_ZEALAND_NATIONAL_HEALTH_INDEX_NUMBER |  New Zealand national health index number.  | 
| NEW_ZEALAND_TAX_IDENTIFICATION_NUMBER |  Tax identification number, also known as inland revenue number (New Zealand specific).  | 

 **Norway data types** 


| Data Type | Description | 
| --- | --- | 
| NORWAY_BIRTH_NUMBER |  Norwegian national identity number.  | 
| NORWAY_DRIVING_LICENSE |  The driver license number (Norway specific).  | 
| NORWAY_HEALTH_INSURANCE_NUMBER |  Norway health insurance number.  | 
| NORWAY_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier number (Norway specific).  | 
| NORWAY_VALUE_ADDED_TAX |  Value-Added Tax (Norway specific).  | 

 **Philippines data types** 


| Data Type | Description | 
| --- | --- | 
| PHILIPPINES_DRIVING_LICENSE |  The driver license number (Philippines specific).  | 
| PHILIPPINES_PASSPORT_NUMBER |  The passport number (Philippines specific).  | 

 **Poland data types** 


| Data Type | Description | 
| --- | --- | 
| POLAND_DRIVING_LICENSE |  The driver license number (Poland specific).  | 
| POLAND_IDENTIFICATION_NUMBER |  The Poland identifier.  | 
| POLAND_PASSPORT_NUMBER |  The passport number (Poland specific).  | 
| POLAND_REGON_NUMBER |  The REGON identifier number, also known as the Statistical Identification Number.  | 
| POLAND_SSN |  The social security number (for Poland persons).  | 
| POLAND_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Poland specific).  | 
| POLAND_VALUE_ADDED_TAX |  Value-Added Tax (Poland specific).  | 

 **Portugal data types** 


| Data Type | Description | 
| --- | --- | 
| PORTUGAL_DRIVING_LICENSE |  The driver license number (Portugal specific).  | 
| PORTUGAL_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier number (Portugal specific).  | 
| PORTUGAL_PASSPORT_NUMBER |  The passport number (Portugal specific).  | 
| PORTUGAL_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Portugal specific).  | 
| PORTUGAL_VALUE_ADDED_TAX |  Value-Added Tax (Portugal specific).  | 

 **Romania data types** 


| Data Type | Description | 
| --- | --- | 
| ROMANIA_DRIVING_LICENSE |  The driver license number (Romania specific).  | 
| ROMANIA_NUMERICAL_PERSONAL_CODE |  The personal identifier number (Romania specific).  | 
| ROMANIA_PASSPORT_NUMBER |  The passport number (Romania specific).  | 
| ROMANIA_VALUE_ADDED_TAX |  Value-Added Tax (Romania specific).  | 

 **Singapore data types** 


| Data Type | Description | 
| --- | --- | 
| SINGAPORE_DRIVING_LICENSE |  The driver license number (Singapore specific).  | 
| SINGAPORE_NATIONAL_REGISTRY_IDENTIFICATION_NUMBER |  The national registration identity card for Singapore.  | 
| SINGAPORE_PASSPORT_NUMBER |  The passport number (Singapore specific).  | 
| SINGAPORE_UNIQUE_ENTITY_NUMBER |  The Unique Entity Number for Singapore.  | 

 **Slovakia data types** 


| Data Type | Description | 
| --- | --- | 
| SLOVAKIA_DRIVING_LICENSE |  The driver license number (Slovakia specific).  | 
| SLOVAKIA_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier number (Slovakia specific).  | 
| SLOVAKIA_PASSPORT_NUMBER |  The passport number (Slovakia specific).  | 
| SLOVAKIA_VALUE_ADDED_TAX |  Value-Added Tax (Slovakia specific).  | 

 **Slovenia data types** 


| Data Type | Description | 
| --- | --- | 
| SLOVENIA_DRIVING_LICENSE |  The driver license number (Slovenia specific).  | 
| SLOVENIA_PASSPORT_NUMBER |  The passport number (Slovenia specific).  | 
| SLOVENIA_TAX_IDENTIFICATION_NUMBER |  Tax identification number (Slovenia specific).  | 
| SLOVENIA_UNIQUE_MASTER_CITIZEN_NUMBER |  Unique master citizen number (JMBG) for Slovenia citizens.  | 
| SLOVENIA_VALUE_ADDED_TAX |  Value-Added Tax (Slovenia specific).  | 

 **South Africa data types** 


| Data Type | Description | 
| --- | --- | 
| SOUTH_AFRICA_PERSONAL_IDENTIFICATION_NUMBER |  The personal identifier number (South Africa specific).  | 

 **Spain data types** 


| Data Type | Description | 
| --- | --- | 
| SPAIN_BANK_ACCOUNT |  The bank account number (Spain specific).  | 
| SPAIN_DNI |  The national identity card (Documento Nacional de Identidad) of Spain.  | 
| SPAIN_DRIVING_LICENSE |  The driver license number (Spain specific).  | 
| SPAIN_NIE |  The foreigner identity number (Spain specific), also known as the NIE.  | 
| SPAIN_NIF |  Tax identification number (Spain specific), also known as the NIF.  | 
| SPAIN_PASSPORT_NUMBER |  The passport number (Spain specific).  | 
| SPAIN_SSN |  The social security number (for Spain persons).  | 
| SPAIN_VALUE_ADDED_TAX |  Value-Added Tax (Spain specific).  | 

 **Sri Lanka data types** 


| Data Type | Description | 
| --- | --- | 
| SRI_LANKA_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier (Sri Lanka specific).  | 

 **Sweden data types** 


| Data Type | Description | 
| --- | --- | 
| SWEDEN_DRIVING_LICENSE |  The driver license number (Sweden specific).  | 
| SWEDEN_PASSPORT_NUMBER |  The passport number (Sweden specific).  | 
| SWEDEN_PERSONAL_IDENTIFICATION_NUMBER |  The national identifier number (Sweden specific).  | 
| SWEDEN_TAX_IDENTIFICATION_NUMBER |  Sweden tax identification number (personnummer).  | 
| SWEDEN_VALUE_ADDED_TAX |  Value-Added Tax (Sweden specific).  | 

 **Switzerland data types** 


| Data Type | Description | 
| --- | --- | 
| SWITZERLAND_AHV |  The social security number for Swiss persons (AHV).  | 
| SWITZERLAND_HEALTH_INSURANCE_NUMBER |  Swiss health insurance number.  | 
| SWITZERLAND_PASSPORT_NUMBER |  The passport number (Switzerland specific).  | 
| SWITZERLAND_VALUE_ADDED_TAX |  Value-Added Tax (Switzerland specific).  | 

 **Thailand data types** 


| Data Type | Description | 
| --- | --- | 
| THAILAND_PASSPORT_NUMBER |  The passport number (Thailand specific).  | 
| THAILAND_PERSONAL_IDENTIFICATION_NUMBER |  The personal identifier number (Thailand specific).  | 

 **Turkey data types** 


| Data Type | Description | 
| --- | --- | 
| TURKEY_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier number (Turkey specific).  | 
| TURKEY_PASSPORT_NUMBER |  The passport number (Turkey specific).  | 
| TURKEY_VALUE_ADDED_TAX |  Value-Added Tax (Turkey specific).  | 

 **Ukraine data types** 


| Data Type | Description | 
| --- | --- | 
| UKRAINE_INDIVIDUAL_IDENTIFICATION_NUMBER |  The unique identifier (Ukraine specific).  | 
| UKRAINE_PASSPORT_NUMBER_DOMESTIC |  The domestic passport number (Ukraine specific).  | 
| UKRAINE_PASSPORT_NUMBER_INTERNATIONAL |  The international passport number (Ukraine specific).  | 

 **United Arab Emirates (UAE) data types** 


| Data Type | Description | 
| --- | --- | 
| UNITED_ARAB_EMIRATES_PERSONAL_NUMBER |  The personal identifier number (UAE specific).  | 

 **UK data types** 


| Data Type | Description | 
| --- | --- | 
| UK_BANK_ACCOUNT |  United Kingdom (UK) bank account.  | 
| UK_BANK_SORT_CODE |  United Kingdom (UK) bank sort code. Sort codes are bank codes used to route money transfers between banks within their respective countries via their respective clearance organizations.  | 
| UK_DRIVING_LICENSE |  The driver's license number for the United Kingdom of Great Britain and Northern Ireland (UK specific).  | 
| UK_ELECTORAL_ROLL_NUMBER |  The Electoral Roll Number (ERN) is the identification number issued to an individual for UK election registration. The format of this number is specified by the UK Government Standards of the UK Cabinet Office.  | 
| UK_NATIONAL_HEALTH_SERVICE_NUMBER |  The National Health Service (NHS) number is the unique number allocated to a registered user of public health services in the United Kingdom.  | 
| UK_NATIONAL_INSURANCE_NUMBER |  The National Insurance number (NINO) is a number used in the United Kingdom (UK) to identify an individual for the national insurance program or social security system. The number is sometimes referred to as NI No or NINO.  | 
| UK_PASSPORT_NUMBER |  United Kingdom (UK) passport number.  | 
| UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER |  The United Kingdom (UK) Unique Taxpayer Reference (UTR) number. An identifier used by the UK government to manage the taxation system.  | 
| UK_VALUE_ADDED_TAX |  VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing and distribution process. For the United Kingdom, the VAT number is issued by the VAT office for the region in which the business is established.  | 
| UK_PHONE_NUMBER |  United Kingdom (UK) phone number.  | 

 **Venezuela data types** 


| Data Type | Description | 
| --- | --- | 
| VENEZUELA_DRIVING_LICENSE |  The driver license number (Venezuela specific).  | 
| VENEZUELA_NATIONAL_IDENTIFICATION_NUMBER |  The national identifier number (Venezuela specific).  | 
| VENEZUELA_VALUE_ADDED_TAX |  Value-Added Tax (Venezuela specific).  | 

# Using fine-grained sensitive data detection
<a name="sensitive-data-fine-grained-actions"></a>

**Note**  
 Fine-grained actions are only available in AWS Glue 3.0 and 4.0, including the AWS Glue Studio experience. The persistent audit log changes are likewise not available in AWS Glue 2.0.   
 All AWS Glue Studio 3.0 and 4.0 visual jobs generate a script that automatically uses the fine-grained actions APIs. 

 The Detect Sensitive Data transform lets you detect, mask, or remove entities that you define or that are predefined by AWS Glue. Fine-grained actions additionally let you apply a specific action per entity. Additional benefits include: 
+  Improved performance, because actions are applied as soon as data is detected. 
+  The option to include or exclude specific columns. 
+  The ability to use partial masking, which masks only part of a detected sensitive data entity rather than the entire string. Both simple parameters with offsets and regular expressions are supported. 

 The following are code snippets of sensitive data detection APIs and fine-grained actions used in the sample jobs referenced in the next section. 

 **Detect API** – fine-grained actions use the new `detectionParameters` parameter: 

```
def detect(
    frame: DynamicFrame,
    detectionParameters: JsonOptions,
    outputColumnName: String = "DetectedEntities",
    detectionSensitivity: String = "LOW"
): DynamicFrame = {}
```

## Using Sensitive Data Detection APIs with fine-grained actions
<a name="sensitive-data-fine-grained-actions-glue-jobs"></a>

 The sensitive data detection **detect** API analyzes the given data, determines whether the rows or columns contain sensitive data entity types, and runs the actions that you specify for each entity type. 

### Using the detect API with fine-grained actions
<a name="sensitive-data-fine-grained-actions-glue-jobs-detect"></a>

 Use the **detect** API and specify the `outputColumnName` and `detectionParameters`. 

```
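    import com.amazonaws.services.glue.GlueContext
    import com.amazonaws.services.glue.ml.EntityDetector
    import com.amazonaws.services.glue.util.GlueArgParser
    import com.amazonaws.services.glue.util.Job
    import com.amazonaws.services.glue.util.JsonOptions
    import org.apache.spark.SparkContext
    import scala.collection.JavaConverters._

    object GlueApp {
      def main(sysArgs: Array[String]): Unit = {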
      
        val spark: SparkContext = new SparkContext()
        val glueContext: GlueContext = new GlueContext(spark)
        
        // @params: [JOB_NAME]
        val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
        Job.init(args("JOB_NAME"), glueContext, args.asJava)
        
        // Script generated for node S3 bucket. Creates a DynamicFrame from data stored in S3.
        val S3bucket_node1 = glueContext.getSourceWithFormat(formatOptions=JsonOptions("""{"quoteChar": "\"", "withHeader": true, "separator": ",", "optimizePerformance": false}"""), connectionType="s3", format="csv", options=JsonOptions("""{"paths": ["s3://189657479688-ddevansh-pii-test-bucket/tiny_pii.csv"], "recurse": true}"""), transformationContext="S3bucket_node1").getDynamicFrame()
     
        // Script generated for node Detect Sensitive Data. Runs the detect API on the DynamicFrame.
        // detectionParameters specifies which entity types to detect
        // and which actions to apply to them when they are detected.
        val DetectSensitiveData_node2 = EntityDetector.detect(
            frame = S3bucket_node1, 
            detectionParameters = JsonOptions(
             """
                {
                    "PHONE_NUMBER": [
                        {
                            "action": "PARTIAL_REDACT",
                            "actionOptions": {
                                "numLeftCharsToExclude": "3",
                                "numRightCharsToExclude": "4",
                                "redactChar": "#"
                            },
                            "sourceColumnsToExclude": [ "Passport No", "DL NO#" ]
                        }
                    ],
                    "USA_PASSPORT_NUMBER": [
                        {
                            "action": "SHA256_HASH",
                            "sourceColumns": [ "Passport No" ]
                        }
                    ],
                    "USA_DRIVING_LICENSE": [
                        {
                            "action": "REDACT",
                            "actionOptions": {
                                "redactText": "USA_DL"
                            },
                            "sourceColumns": [ "DL NO#" ]
                        }
                    ]
                    
                }
            """
            ),
            outputColumnName = "DetectedEntities"
        )
     
        // Script generated for node S3 bucket. Store Results of detect to S3 location
        val S3bucket_node3 = glueContext.getSinkWithFormat(connectionType="s3", options=JsonOptions("""{"path": "s3://amzn-s3-demo-bucket/test-output/", "partitionKeys": []}"""), transformationContext="S3bucket_node3", format="json").writeDynamicFrame(DetectSensitiveData_node2)
     
        Job.commit()
      }
    }
```

 The preceding script creates a DynamicFrame from a location in Amazon S3 and then runs the `detect` API. The `detect` API requires the `detectionParameters` field, which is a map from each entity name to a list of the action settings to use for that entity. The field is represented by AWS Glue’s `JsonOptions` object, which also allows the functionality of the API to be extended. 

 For each action specified per entity, enter a list of all column names to which the entity/action combination applies. This lets you customize the entities to detect for every column in your dataset and skip entities that you know are not in a specific column. It also makes your jobs more performant, because no unnecessary detection calls are made for those entities, and lets you perform actions unique to each column and entity combination. 

 Taking a closer look at the `detectionParameters`, there are three entity types in the sample job: `PHONE_NUMBER`, `USA_PASSPORT_NUMBER`, and `USA_DRIVING_LICENSE`. For each of these entity types, AWS Glue runs a different action: `PARTIAL_REDACT`, `SHA256_HASH`, and `REDACT`, respectively. A fourth action, `DETECT`, is also available. Each entity type also specifies the `sourceColumns` to apply the action to, the `sourceColumnsToExclude`, or both. 

**Note**  
 Only one edit-in-place action (`PARTIAL_REDACT`, `SHA256_HASH`, or `REDACT`) can be used per column but the `DETECT` action can be used with any of these actions. 
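
 The rule in the note above can be checked locally before a job is submitted. The following is a minimal sketch in plain Scala, not part of the AWS Glue API: the `ActionSpec` case class and the `conflictingColumns` helper are hypothetical names introduced for illustration, and the sketch assumes that the rule applies per column across all entity types. 

```scala
object ActionRuleCheck {
  // Hypothetical local model of one action entry in detectionParameters.
  final case class ActionSpec(entity: String, action: String, sourceColumns: Seq[String])

  // The edit-in-place actions named in the note; DETECT is deliberately excluded.
  private val editInPlace = Set("PARTIAL_REDACT", "SHA256_HASH", "REDACT")

  // Returns the columns that are targeted by more than one edit-in-place action.
  def conflictingColumns(specs: Seq[ActionSpec]): Set[String] =
    specs
      .filter(s => editInPlace.contains(s.action))
      .flatMap(s => s.sourceColumns.map(col => col -> s.action))
      .groupBy(_._1)
      .collect { case (col, hits) if hits.size > 1 => col }
      .toSet

  def main(args: Array[String]): Unit = {
    val specs = Seq(
      ActionSpec("USA_PASSPORT_NUMBER", "SHA256_HASH", Seq("Passport No")),
      ActionSpec("USA_PASSPORT_NUMBER", "DETECT", Seq("Passport No")),
      ActionSpec("USA_DRIVING_LICENSE", "REDACT", Seq("DL NO#")),
      ActionSpec("PHONE_NUMBER", "PARTIAL_REDACT", Seq("DL NO#"))
    )
    // "Passport No" is fine (one edit-in-place action plus DETECT);
    // "DL NO#" is flagged because two edit-in-place actions target it.
    println(conflictingColumns(specs))
  }
}
```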

 The `detectionParameters` field has the below layout: 

```
    ENTITY_NAME -> List[Actions]
    {
        "ENTITY_NAME": [{
            Action, // required
            ColumnSpecs,
            ActionOptionsMap
        }],
        "ENTITY_NAME2": [{
            ...
        }]
    }
```

 The types of `actions` and `actionOptions` are listed below: 

```
DETECT
{
    # Required
    "action": "DETECT",
    # Optional, depending on action chosen
    "actionOptions": {
        // There are no actionOptions for DETECT 
    },
    # At least one of the following is required; both can also be used
    "sourceColumns": [
        "COL_1", "COL_2", ..., "COL_N"
    ],
    "sourceColumnsToExclude": [
        "COL_5"
    ]
}

SHA256_HASH
{
    # Required
    "action": "SHA256_HASH",
    # Required or optional, depending on action chosen
    "actionOptions": {
        // There are no actionOptions for SHA256_HASH
    },
    
    # At least one of the following is required; both can also be used
    "sourceColumns": [
        "COL_1", "COL_2", ..., "COL_N"
    ],
    "sourceColumnsToExclude": [
        "COL_5"
    ]
}

REDACT
{
    # Required
    "action": "REDACT",
    # Required or optional, depending on action chosen
    "actionOptions": {
        // The text that replaces the detected entity
        "redactText": "USA_DL"
    },
    
    # At least one of the following is required; both can also be used
    "sourceColumns": [
        "COL_1", "COL_2", ..., "COL_N"
    ],
    "sourceColumnsToExclude": [
        "COL_5"
    ]
}

PARTIAL_REDACT
{
    # Required
    "action": "PARTIAL_REDACT",
    # Required or optional, depending on action chosen
    "actionOptions": {
        // number of characters to not redact from the left side 
        "numLeftCharsToExclude": "3",
        // number of characters to not redact from the right side
        "numRightCharsToExclude": "4",
        // character that replaces each redacted character
        "redactChar": "#",
        // regex pattern for partial redaction
        "matchPattern": "[0-9]"
    },
    
    # At least one of the following is required; both can also be used
    "sourceColumns": [
        "COL_1", "COL_2", ..., "COL_N"
    ],
    "sourceColumnsToExclude": [
        "COL_5"
    ]
}
```
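
 To make the offset options of `PARTIAL_REDACT` concrete, the following plain-Scala sketch mimics them on a single string. It is an illustration only, not AWS Glue's implementation: `partialRedact` is a hypothetical helper, and the way it applies `matchPattern` (replacing only the middle characters that match) is an assumption about how the pattern and the offsets combine. 

```scala
object PartialRedactSketch {
  // Hypothetical helper, not part of the AWS Glue API.
  // Keeps `numLeft` leading and `numRight` trailing characters unchanged and
  // replaces each character in between with `redactChar` if it matches `matchPattern`.
  def partialRedact(
      value: String,
      numLeft: Int,
      numRight: Int,
      redactChar: Char,
      matchPattern: String = "."
  ): String = {
    // Clamp the offsets so that short values do not cause index errors.
    val left = math.min(math.max(numLeft, 0), value.length)
    val right = math.min(math.max(numRight, 0), value.length - left)
    val middleEnd = value.length - right
    val redactedMiddle = value
      .substring(left, middleEnd)
      .map(c => if (c.toString.matches(matchPattern)) redactChar else c)
    value.substring(0, left) + redactedMiddle + value.substring(middleEnd)
  }

  def main(args: Array[String]): Unit = {
    // Offsets only: every character between the kept ends is replaced.
    println(partialRedact("1-234-567-1718", 3, 4, '#'))
    // Offsets plus pattern: only digits between the kept ends are replaced.
    println(partialRedact("1-234-567-1718", 3, 4, '#', "[0-9]"))
  }
}
```

 With the offsets `3` and `4` used in the sample job, the first call yields `1-2#######1718`, which has the same shape as the redacted phone number in the sample output. 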

 Once the script runs, results are output to the given Amazon S3 location. You can view your data in Amazon S3, with the selected entity types sanitized based on the selected action. In this case, the output contains rows that look like this: 

```
{
    "Name": "Colby Schuster",
    "Address": "39041 Antonietta Vista, South Rodgerside, Nebraska 24151",
    "Car Owned": "Fiat",
    "Email": "Kitty46@gmail.com",
    "Company": "O'Reilly Group",
    "Job Title": "Dynamic Functionality Facilitator",
    "ITIN": "991-22-2906",
    "Username": "Cassandre.Kub43",
    "SSN": "914-22-2906",
    "DOB": "2020-08-27",
    "Phone Number": "1-2#######1718",
    "Bank Account No": "69741187",
    "Credit Card Number": "6441-6289-6867-2162-2711",
    "Passport No": "94f311e93a623c72ccb6fc46cf5f5b0265ccb42c517498a0f27fd4c43b47111e",
    "DL NO#": "USA_DL"
}
```

 In the preceding output, the `Phone Number` was partially redacted with `#`. The `Passport No` was changed into a SHA256 hash. The `DL NO#` was detected as a USA driver license number and was redacted to “USA_DL”, as specified in the `detectionParameters`. 
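
 For orientation, a hex-encoded SHA-256 digest is always 64 characters long, which matches the length of the hashed `Passport No` value. The following plain-Scala sketch computes such a digest with the JDK's `MessageDigest`. It is not AWS Glue's implementation: `sha256Hex` is a hypothetical helper, and hashing the UTF-8 bytes of the value is an assumption. 

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object Sha256Sketch {
  // Hypothetical helper: hex-encoded SHA-256 digest of the UTF-8 bytes of `value`.
  def sha256Hex(value: String): String =
    MessageDigest
      .getInstance("SHA-256")
      .digest(value.getBytes(StandardCharsets.UTF_8))
      .map(b => "%02x".format(b))
      .mkString

  def main(args: Array[String]): Unit = {
    val digest = sha256Hex("abc")
    println(digest)
    println(digest.length) // 32 bytes rendered as 64 hex characters
  }
}
```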

**Note**  
 The `classifyColumns` API is not available for use with fine-grained actions. This API samples columns (using default values that you can adjust) to perform detection more quickly, whereas fine-grained actions require iterating over every value. 

### Persistent Audit Log
<a name="sensitive-data-fine-grained-actions-persistent-audit-log"></a>

 A new feature introduced with fine-grained actions (but also available when using the normal APIs) is the persistent audit log. Currently, running the detect API adds an additional column (`DetectedEntities` by default, customizable through the `outputColumnName` parameter) with PII detection metadata. This metadata now has an `actionUsed` key, whose value is one of `DETECT`, `PARTIAL_REDACT`, `SHA256_HASH`, or `REDACT`. 

```
"DetectedEntities": {
    "Credit Card Number": [
        {
            "entityType": "CREDIT_CARD",
            "actionUsed": "DETECT",
            "start": 0,
            "end": 19
        }
    ],
    "Phone Number": [
        {
            "entityType": "PHONE_NUMBER",
            "actionUsed": "REDACT",
            "start": 0,
            "end": 14
        }
    ]
}
```

 Even customers using APIs without fine-grained actions such as `detect(entityTypesToDetect, outputColumnName)` will see this persistent audit log in the resulting dataframe. 
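
 Once the audit log has been parsed, it can be summarized with ordinary collection operations. The following plain-Scala sketch models the `DetectedEntities` structure shown above and counts detections per `actionUsed` value. The `Detection` case class and the `actionsUsed` helper are hypothetical names, and parsing the column into this model is left out. 

```scala
object AuditLogSketch {
  // Hypothetical in-memory model of one entry in the DetectedEntities column.
  final case class Detection(entityType: String, actionUsed: String, start: Int, end: Int)

  // Counts how many detections used each action across all columns of one row.
  def actionsUsed(detected: Map[String, Seq[Detection]]): Map[String, Int] =
    detected.values.flatten
      .groupBy(_.actionUsed)
      .map { case (action, hits) => action -> hits.size }

  def main(args: Array[String]): Unit = {
    // Mirrors the DetectedEntities example above.
    val row = Map(
      "Credit Card Number" -> Seq(Detection("CREDIT_CARD", "DETECT", 0, 19)),
      "Phone Number" -> Seq(Detection("PHONE_NUMBER", "REDACT", 0, 14))
    )
    println(actionsUsed(row))
  }
}
```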

 Customers using APIs with fine-grained actions will see all of the actions, regardless of whether the values are redacted. Example: 

```
+---------------------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Credit Card Number  |  Phone Number  |                                                                                            DetectedEntities                                                                                             |
+---------------------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 622126741306XXXX    | +12#####7890   | {"Credit Card Number":[{"entityType":"CREDIT_CARD","actionUsed":"PARTIAL_REDACT","start":0,"end":16}],"Phone Number":[{"entityType":"PHONE_NUMBER","actionUsed":"PARTIAL_REDACT","start":0,"end":12}]}  |
| 6221 2674 1306 XXXX | +12#######7890 | {"Credit Card Number":[{"entityType":"CREDIT_CARD","actionUsed":"PARTIAL_REDACT","start":0,"end":19}],"Phone Number":[{"entityType":"PHONE_NUMBER","actionUsed":"PARTIAL_REDACT","start":0,"end":14}]}  |
| 6221-2674-1306-XXXX | 22#######7890  | {"Credit Card Number":[{"entityType":"CREDIT_CARD","actionUsed":"PARTIAL_REDACT","start":0,"end":19}],"Phone Number":[{"entityType":"PHONE_NUMBER","actionUsed":"PARTIAL_REDACT","start":0,"end":14}]}  |
+---------------------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```

 If you do not want to see the **DetectedEntities** column, you can simply drop the additional column in a custom script. 
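
 One way to drop the column is sketched below. It assumes the `dropFields` method of the AWS Glue Scala `DynamicFrame` class and must run inside an AWS Glue job. The `DropAuditColumn` object and the `dropAuditColumn` helper are hypothetical names introduced for illustration. 

```scala
import com.amazonaws.services.glue.DynamicFrame

object DropAuditColumn {
  // Hypothetical helper: returns a copy of `frame` without the audit column.
  // Pass a different name if outputColumnName was customized in the detect call.
  def dropAuditColumn(frame: DynamicFrame, auditColumn: String = "DetectedEntities"): DynamicFrame =
    frame.dropFields(Seq(auditColumn))
}
```

 In the sample job, this could be called as `DropAuditColumn.dropAuditColumn(DetectSensitiveData_node2)` before the result is written to Amazon S3. 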