PIIDetection
Specifies a transform that identifies, removes or masks PII data.
Contents
- EntityTypesToDetect
-
Indicates the types of entities the PIIDetection transform will identify as PII data.
PII type entities include: PERSON_NAME, DATE, USA_SSN, EMAIL, USA_ITIN, USA_PASSPORT_NUMBER, PHONE_NUMBER, BANK_ACCOUNT, IP_ADDRESS, MAC_ADDRESS, USA_CPT_CODE, USA_HCPCS_CODE, USA_NATIONAL_DRUG_CODE, USA_MEDICARE_BENEFICIARY_IDENTIFIER, USA_HEALTH_INSURANCE_CLAIM_NUMBER, CREDIT_CARD, USA_NATIONAL_PROVIDER_IDENTIFIER, USA_DEA_NUMBER, USA_DRIVING_LICENSE
Type: Array of strings
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: Yes
- Inputs
-
The node ID inputs to the transform.
Type: Array of strings
Array Members: Fixed number of 1 item.
Pattern:
[A-Za-z0-9_-]*
Required: Yes
- Name
-
The name of the transform node.
Type: String
Pattern:
([^\r\n])*
Required: Yes
- PiiType
-
Indicates the type of PIIDetection transform.
Type: String
Valid Values:
RowAudit | RowHashing | RowMasking | RowPartialMasking | ColumnAudit | ColumnHashing | ColumnMasking
Required: Yes
- DetectionParameters
-
Additional parameters for configuring PII detection behavior and sensitivity settings.
Type: String
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: No
- DetectionSensitivity
-
The sensitivity level for PII detection. Higher sensitivity levels detect more potential PII but may result in more false positives.
Type: String
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: No
- MaskValue
-
Indicates the value that will replace the detected entity.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 256.
Pattern:
[*A-Za-z0-9_-]*
Required: No
- MatchPattern
-
A regular expression pattern used to identify additional PII content beyond the standard detection algorithms.
Type: String
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: No
- NumLeftCharsToExclude
-
The number of characters to exclude from redaction on the left side of detected PII content. This allows preserving context around the sensitive data.
Type: Integer
Valid Range: Minimum value of 0.
Required: No
- NumRightCharsToExclude
-
The number of characters to exclude from redaction on the right side of detected PII content. This allows preserving context around the sensitive data.
Type: Integer
Valid Range: Minimum value of 0.
Required: No
- OutputColumnName
-
Indicates the output column name that will contain any entity type detected in that row.
Type: String
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: No
- RedactChar
-
The character used to replace detected PII content when redaction is enabled. The default redaction character is *.
Type: String
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: No
- RedactText
-
Specifies whether to redact the detected PII text. When set to true, PII content is replaced with redaction characters.
Type: String
Pattern:
([\u0009\u000B\u000C\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF])*
Required: No
- SampleFraction
-
Indicates the fraction of the data to sample when scanning for PII entities.
Type: Double
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
- ThresholdFraction
-
Indicates the minimum fraction of a column's data that must be detected as PII for the column to be identified as PII data.
Type: Double
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
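As an illustration of how the fields above fit together, the following Python sketch builds a PIIDetection node as a plain dictionary and checks it against the constraints listed in this section. The helper name `validate_pii_detection` and all sample values (node name, input node ID, mask value, fractions) are hypothetical. The checks cover only the constraints documented here and are not the service's own validation.

```python
import re

# Valid PiiType values as listed in this section.
PII_TYPES = {
    "RowAudit", "RowHashing", "RowMasking", "RowPartialMasking",
    "ColumnAudit", "ColumnHashing", "ColumnMasking",
}
# Fields marked "Required: Yes" in this section.
REQUIRED = ("EntityTypesToDetect", "Inputs", "Name", "PiiType")


def validate_pii_detection(node):
    """Hypothetical helper: return a list of constraint violations."""
    errors = [f"{key} is required" for key in REQUIRED if key not in node]

    inputs = node.get("Inputs", [])
    if "Inputs" in node and len(inputs) != 1:
        errors.append("Inputs must contain exactly 1 item")
    for node_id in inputs:
        if not re.fullmatch(r"[A-Za-z0-9_-]*", node_id):
            errors.append(f"invalid input node ID: {node_id!r}")

    if "Name" in node and not re.fullmatch(r"[^\r\n]*", node["Name"]):
        errors.append("Name must not contain line breaks")

    if "PiiType" in node and node["PiiType"] not in PII_TYPES:
        errors.append(f"unknown PiiType: {node['PiiType']!r}")

    mask = node.get("MaskValue")
    if mask is not None:
        if len(mask) > 256:
            errors.append("MaskValue exceeds 256 characters")
        if not re.fullmatch(r"[*A-Za-z0-9_-]*", mask):
            errors.append("MaskValue contains disallowed characters")

    for key in ("SampleFraction", "ThresholdFraction"):
        value = node.get(key)
        if value is not None and not 0 <= value <= 1:
            errors.append(f"{key} must be between 0 and 1")

    for key in ("NumLeftCharsToExclude", "NumRightCharsToExclude"):
        value = node.get(key)
        if value is not None and value < 0:
            errors.append(f"{key} must be at least 0")

    return errors


# Sample node; every value here is made up for illustration.
example_node = {
    "Name": "Detect PII",
    "Inputs": ["node-1"],
    "PiiType": "ColumnMasking",
    "EntityTypesToDetect": ["EMAIL", "PHONE_NUMBER"],
    "MaskValue": "*******",
    "SampleFraction": 1.0,
    "ThresholdFraction": 0.1,
}

print(validate_pii_detection(example_node))  # an empty list means no violations
```

Running the check before submitting a job definition can surface problems such as a second entry in Inputs or a fraction outside the 0 to 1 range early.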
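The interaction between RedactChar, NumLeftCharsToExclude, and NumRightCharsToExclude can be pictured with a small Python sketch. It assumes that the excluded characters are counted from each end of the detected text and left unchanged, while everything in between is replaced by the redaction character. That reading is inferred from the field descriptions above, and the function `partially_redact` is hypothetical, not part of any AWS SDK or of the transform itself.

```python
def partially_redact(text, num_left=0, num_right=0, redact_char="*"):
    """Replace the middle of `text`, keeping `num_left` leading and
    `num_right` trailing characters (assumed semantics, see lead-in)."""
    if num_left < 0 or num_right < 0:
        raise ValueError("exclusion counts must be at least 0")
    if num_left + num_right >= len(text):
        # Assumption: when the exclusions cover the whole value,
        # nothing remains to redact.
        return text
    middle = len(text) - num_left - num_right
    return text[:num_left] + redact_char * middle + text[len(text) - num_right:]


# Keep only the last four characters of a sample phone number.
print(partially_redact("555-867-5309", num_left=0, num_right=4))
# prints ********5309
```

Under this assumption, larger exclusion counts preserve more of the original value, so they trade privacy for readability.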