

# Troubleshooting AWS Managed Microsoft AD
<a name="ms_ad_troubleshooting"></a>

The following can help you troubleshoot some common problems you might encounter when creating or using your AWS Managed Microsoft AD Active Directory.

## Problems with your AWS Managed Microsoft AD
<a name="general_issues"></a>

Some troubleshooting tasks can only be completed by Support. Here are some of the tasks:
+ Restarting your Directory Service-provided domain controllers.
+ [Upgrading your AWS Managed Microsoft AD](ms_ad_upgrade_edition.md).

To create a support case, see [Creating support cases and case management](https://docs.aws.amazon.com/awssupport/latest/user/case-management.html).

## Problems with Netlogon and secure channel communications
<a name="ms_ad_tshoot_netlogon_issues"></a>

As a mitigation against [CVE-2020-1472](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-1472), Microsoft released patches that change how domain controllers process Netlogon secure channel communications. Since the introduction of these secure Netlogon changes, some Netlogon connections (servers, workstations, and trust validations) might not be accepted by your AWS Managed Microsoft AD.

To verify if your issue is related to Netlogon or secure channel communications, search your Amazon CloudWatch Logs for event IDs 5827 (for device authentication related issues) or 5828 (for AD trust validation related issues). For information about CloudWatch in AWS Managed Microsoft AD, see [Enabling Amazon CloudWatch Logs log forwarding for AWS Managed Microsoft AD](ms_ad_enable_log_forwarding.md).

For more information about the mitigation against CVE-2020-1472, see [How to manage the changes in Netlogon secure channel connections associated with CVE-2020-1472](https://support.microsoft.com/en-us/topic/how-to-manage-the-changes-in-netlogon-secure-channel-connections-associated-with-cve-2020-1472-f7e8cc17-0309-1d6a-304e-5ba73cd1a11e) on the Microsoft website.

## You receive a 'Response Status: 400 Bad Request' error when attempting to reset a user's password
<a name="ms_ad_tshoot_reset_password"></a>

You receive an error message similar to the following when attempting to reset a user's password:

`Response Status: 400 Bad Request`

You may experience this issue when there are duplicate objects in your AWS Managed Microsoft AD Organizational Unit (OU) with identical user logon names. User logon names must be unique. See [Troubleshooting Directory Data problems](https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/bb727059(v=technet.10)?redirectedfrom=MSDN) in Microsoft documentation for more information.

## Password recovery
<a name="ms_ad_tshoot_password_recovery"></a>

If a user forgets a password or is having trouble signing in to your AWS Managed Microsoft AD directory, you can reset their password using the AWS Management Console, PowerShell, or the AWS CLI.

For more information, see [Resetting an AWS Managed Microsoft AD user password](ms_ad_manage_users_groups_reset_password.md).

## Additional resources
<a name="troubleshoot_general_resources"></a>

The following resources can help you troubleshoot as you work with AWS.
+ **[AWS Knowledge Center](https://aws.amazon.com/premiumsupport/knowledge-center/)**–Find FAQs and links to other resources to help you troubleshoot issues.
+ **[AWS Support Center](https://console.aws.amazon.com/support/home#/)**–Get technical support.
+ **[AWS Premium Support Center](https://aws.amazon.com/premiumsupport/)**–Get premium technical support.

The following resources can help you troubleshoot common Active Directory issues.
+ [Active Directory Documentation](https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/active-directory-overview)
+ [AD DS Troubleshooting](https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/manage/ad-ds-troubleshooting)

**Topics**
+ [Problems with your AWS Managed Microsoft AD](#general_issues)
+ [Problems with Netlogon and secure channel communications](#ms_ad_tshoot_netlogon_issues)
+ [You receive a 'Response Status: 400 Bad Request' error when attempting to reset a user's password](#ms_ad_tshoot_reset_password)
+ [Password recovery](#ms_ad_tshoot_password_recovery)
+ [Additional resources](#troubleshoot_general_resources)
+ [Amazon EC2 Linux instance domain join errors](ms_ad_troubleshooting_join_linux.md)
+ [AWS Managed Microsoft AD low available storage space](ms_ad_troubleshooting_low_storage_space.md)
+ [Schema extension errors](ms_ad_troubleshooting_schema.md)
+ [Trust creation status reasons](ms_ad_troubleshooting_trusts.md)
+ [Troubleshooting AWS Managed Microsoft AD high CPU utilization](ms_ad_troubleshooting_high_cpu.md)

# Amazon EC2 Linux instance domain join errors
<a name="ms_ad_troubleshooting_join_linux"></a>

The following can help you troubleshoot some error messages you might encounter when joining an Amazon EC2 Linux instance to your AWS Managed Microsoft AD directory.

## Linux instances unable to join domain or authenticate
<a name="unable-to-join"></a>

Ubuntu 14.04, 16.04, and 18.04 instances *must* be reverse-resolvable in DNS before a realm can work with Microsoft Active Directory. Otherwise, you might encounter one of the following two scenarios:

### Scenario 1: Ubuntu instances that are not yet joined to a realm
<a name="ubuntu-not-yet-joined"></a>

For Ubuntu instances that are attempting to join a realm, the `sudo realm join` command might not provide the required permissions to join the domain and might display the following error:

! Couldn't authenticate to active directory: SASL(-1): generic failure: GSSAPI Error: An invalid name was supplied (Success)

adcli: couldn't connect to EXAMPLE.COM domain: Couldn't authenticate to active directory: SASL(-1): generic failure: GSSAPI Error: An invalid name was supplied (Success)

! Insufficient permissions to join the domain

realm: Couldn't join realm: Insufficient permissions to join the domain

### Scenario 2: Ubuntu instances that are joined to a realm
<a name="ubuntu-joined"></a>

For Ubuntu instances that are already joined to a Microsoft Active Directory domain, attempts to SSH into the instance using the domain credentials might fail with the following errors:

$ ssh admin@EXAMPLE.COM@198.51.100

no such identity: /Users/username/.ssh/id_ed25519: No such file or directory

admin@EXAMPLE.COM@198.51.100's password:

Permission denied, please try again.

admin@EXAMPLE.COM@198.51.100's password:

If you log in to the instance with a public key and check `/var/log/auth.log`, you might see the following errors about being unable to find the user:

May 12 01:02:12 ip-192-0-2-0 sshd[2251]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=203.0.113.0

May 12 01:02:12 ip-192-0-2-0 sshd[2251]: pam_sss(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=203.0.113.0 user=admin@EXAMPLE.COM

May 12 01:02:12 ip-192-0-2-0 sshd[2251]: pam_sss(sshd:auth): received for user admin@EXAMPLE.COM: 10 (User not known to the underlying authentication module)

May 12 01:02:14 ip-192-0-2-0 sshd[2251]: Failed password for invalid user admin@EXAMPLE.COM from 203.0.113.0 port 13344 ssh2

May 12 01:02:15 ip-192-0-2-0 sshd[2251]: Connection closed by 203.0.113.0 [preauth]

However, `kinit` for the user still works. See this example:

ubuntu@ip-192-0-2-0:~$ kinit admin@EXAMPLE.COM

Password for admin@EXAMPLE.COM:

ubuntu@ip-192-0-2-0:~$ klist

Ticket cache: FILE:/tmp/krb5cc_1000

Default principal: admin@EXAMPLE.COM

### Workaround
<a name="ubuntu-scenarios-workaround"></a>

The current recommended workaround for both of these scenarios is to disable reverse DNS in the `[libdefaults]` section of `/etc/krb5.conf`, as shown below:

```
[libdefaults]
default_realm = EXAMPLE.COM
rdns = false
```
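
To confirm that the setting is in place on an instance, you could parse the file contents and check the `[libdefaults]` section. The following is a minimal sketch in Python, not an official tool. The function name `rdns_disabled` is hypothetical, and the set of spellings it treats as false is an assumption about how boolean values might be written in `krb5.conf`.

```python
def rdns_disabled(krb5_conf_text):
    """Return True if the [libdefaults] section explicitly sets rdns to a false value."""
    false_values = {"false", "no", "off", "0"}  # assumed spellings of "false"
    section = None
    for raw_line in krb5_conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()  # entering a new section
            continue
        if section == "libdefaults" and "=" in line:
            key, _, value = line.partition("=")
            if key.strip() == "rdns":
                return value.strip().lower() in false_values
    return False


sample = """
[libdefaults]
default_realm = EXAMPLE.COM
rdns = false
"""
print(rdns_disabled(sample))  # prints True
```

A file that omits `rdns` returns `False` here, because the sketch only reports an explicit setting.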

## One-way trust authentication issue with seamless domain join
<a name="1-way-trust-auth-issues"></a>

If you have a one-way outgoing trust established between your AWS Managed Microsoft AD and your on-premises Active Directory, you might encounter an authentication issue when attempting to authenticate against the domain-joined Linux instance using your trusted Active Directory credentials with Winbind.

### Errors
<a name="1-way-trust-auth-issues-errors"></a>

Jul 31 00:00:00 EC2AMAZ-LSMWqT sshd[23832]: Failed password for user@corp.example.com from xxx.xxx.xxx.xxx port 18309 ssh2

Jul 31 00:05:00 EC2AMAZ-LSMWqT sshd[23832]: pam_winbind(sshd:auth): getting password (0x00000390)

Jul 31 00:05:00 EC2AMAZ-LSMWqT sshd[23832]: pam_winbind(sshd:auth): pam_get_item returned a password

Jul 31 00:05:00 EC2AMAZ-LSMWqT sshd[23832]: pam_winbind(sshd:auth): request wbcLogonUser failed: WBC_ERR_AUTH_ERROR, PAM error: PAM_SYSTEM_ERR (4), NTSTATUS: **NT_STATUS_OBJECT_NAME_NOT_FOUND**, Error message was: The object name is not found.

Jul 31 00:05:00 EC2AMAZ-LSMWqT sshd[23832]: pam_winbind(sshd:auth): internal module error (retval = PAM_SYSTEM_ERR(4), user = 'CORP\user')

### Workaround
<a name="1-way-trust-auth-issues-workaround"></a>

To resolve this issue, you will need to comment out or remove a directive from the PAM module configuration file (`/etc/security/pam_winbind.conf`) using the following steps.

1. Open the `/etc/security/pam_winbind.conf` file in a text editor.

   ```
   sudo vim /etc/security/pam_winbind.conf
   ```

1. Comment out or remove the following directive: **krb5_auth = yes**.

   ```
   [global]
   
   cached_login = yes
   krb5_ccache_type = FILE
   #krb5_auth = yes
   ```

1. Stop the Winbind service, and then start it again.

   ```
   # On systems without systemd, use "service winbind stop" and "service winbind start" instead.
   sudo systemctl stop winbind
   sudo net cache flush
   sudo systemctl start winbind
   ```

# AWS Managed Microsoft AD low available storage space
<a name="ms_ad_troubleshooting_low_storage_space"></a>

When your AWS Managed Microsoft AD is impaired due to Active Directory having low available storage space, immediate action is required to return the directory to an active state. The two most common causes of this impairment are covered in the sections below:

1. [SYSVOL folder is storing more than essential group policy objects](#sysvol-folder-gpo)

1. [Active Directory database has filled the volume](#ad-db-filled-volume)

For pricing information about AWS Managed Microsoft AD storage, see [Directory Service Pricing](https://aws.amazon.com/directoryservice/pricing/#Comparison_Table).

## SYSVOL folder is storing more than essential group policy objects
<a name="sysvol-folder-gpo"></a>

A common cause of this impairment is storing files in the SYSVOL folder that are not essential for Group Policy processing. These non-essential files could be EXEs, MSIs, or any other file that Group Policy does not need in order to process. The essential objects for Group Policy to process are Group Policy Objects, logon and logoff scripts, and the [Central Store for Group Policy objects](https://learn.microsoft.com/en-us/troubleshoot/windows-client/group-policy/create-and-manage-central-store). Store any non-essential files on a file server other than your AWS Managed Microsoft AD domain controllers.

If you need files for [Group Policy Software Installation](https://learn.microsoft.com/en-us/troubleshoot/windows-server/group-policy/use-group-policy-to-install-software), use a file server to store those installation files. If you prefer not to manage a file server yourself, AWS provides a managed file server option, [Amazon FSx](https://aws.amazon.com/fsx/).

To remove any unnecessary files, you can access the SYSVOL share through its Universal Naming Convention (UNC) path. For example, if your domain's fully qualified domain name (FQDN) is example.com, the UNC path for SYSVOL is `\\example.com\SYSVOL\example.com\`. After you locate and remove objects that are not essential for Group Policy to process, the directory should return to an Active state within 30 minutes. If the directory is not active after 30 minutes, contact AWS Support.

Storing only essential Group Policy files in your SYSVOL share helps prevent SYSVOL bloat from impairing your directory.

## Active Directory database has filled the volume
<a name="ad-db-filled-volume"></a>

A common cause of this impairment is the Active Directory database filling the volume. To verify whether this is the case, review the **total** count of objects in your directory. The word **total** is emphasized because **deleted** objects still count toward the total number of objects in a directory.

By default, AWS Managed Microsoft AD keeps items in the AD Recycling Bin for 180 days before they become Recycled-Objects. Once an object becomes a Recycled-Object (tombstoned), it is retained for another 180 days before it is finally purged from the directory. So a deleted object exists in the directory database for 360 days before it is purged. This is why the total number of objects needs to be evaluated.
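
The retention arithmetic above can be sketched as follows. This is only an illustration of the two 180-day stages described in the text; the function and constant names are hypothetical, and the deletion date is an arbitrary example.

```python
from datetime import date, timedelta

# Defaults described in the text above.
DELETED_OBJECT_LIFETIME_DAYS = 180   # time spent in the AD Recycling Bin
RECYCLED_OBJECT_LIFETIME_DAYS = 180  # time spent as a Recycled-Object


def purge_date(deleted_on,
               deleted_days=DELETED_OBJECT_LIFETIME_DAYS,
               recycled_days=RECYCLED_OBJECT_LIFETIME_DAYS):
    """Return the date on which a deleted object stops counting toward the total."""
    return deleted_on + timedelta(days=deleted_days + recycled_days)


# An object deleted on 1 January 2024 stays in the database for 360 days.
print(purge_date(date(2024, 1, 1)))  # prints 2024-12-26
```

Passing a smaller `deleted_days` value shows how shortening the first stage brings the purge date forward.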

For more details on AWS Managed Microsoft AD supported object counts, see [Directory Service Pricing](https://aws.amazon.com/directoryservice/pricing/#Comparison_Table).

To get the total number of objects in a directory, including deleted objects, you can run the following PowerShell command from a domain-joined Windows instance. For steps on how to set up a management instance, see [User and group management in AWS Managed Microsoft AD](ms_ad_manage_users_groups.md).

```
Get-ADObject -Filter * -IncludeDeletedObjects | Measure-Object -Property 'Count' | Select-Object -Property 'Count'
```

Below is an example output from the above command:

```
Count
10000
```

If the total count is above the supported object count for your directory size listed on the [Directory Service Pricing](https://aws.amazon.com/directoryservice/pricing/#Comparison_Table) page, you have exceeded the capacity of your directory.

Below are the options to resolve this impairment:

1. Clean up AD

   1. Delete any unwanted AD objects.

   1. Remove any objects that are not wanted from the AD Recycling Bin. Note this is destructive and the only way to recover those deleted objects will be to perform a restore of the directory. 

   1. The following command will remove all deleted objects from the AD Recycling Bin.

      **Important**  
      Use this command with extreme caution. It is destructive, and the only way to recover the deleted objects is to restore the directory.

      ```
      $DomainInfo = Get-ADDomain
      $BaseDn = $DomainInfo.DistinguishedName
      $NetBios = $DomainInfo.NetBIOSName
      $ObjectsToRemove = Get-ADObject -Filter { isDeleted -eq $true } -IncludeDeletedObjects -SearchBase "CN=Deleted Objects,$BaseDn" -Properties 'LastKnownParent','DistinguishedName','msDS-LastKnownRDN' | Where-Object { ($_.LastKnownParent -Like "*OU=$NetBios,$BaseDn") -or ($_.LastKnownParent -Like '*\0ADEL:*') }
      ForEach ($ObjectToRemove in $ObjectsToRemove) { Remove-ADObject -Identity $ObjectToRemove.DistinguishedName -IncludeDeletedObjects }
      ```

   1. Open a case with AWS Support to request that Directory Service reclaims the free space. 

1. If your directory type is Standard Edition, open a case with AWS Support to request that your directory be upgraded to Enterprise Edition. This will also increase the cost of your directory. For pricing information, see [Directory Service Pricing](https://aws.amazon.com/directoryservice/pricing/#Comparison_Table).

In AWS Managed Microsoft AD, members of the **AWS Delegated Deleted Object Lifetime Administrators** group have the ability to modify the `msDS-DeletedObjectLifetime` attribute which sets the amount of time in days that deleted objects are kept in the AD Recycling Bin before they become Recycled-Objects. 

**Note**  
This is an advanced topic. If configured inappropriately, it can result in data loss. We highly recommend that you first review [The AD Recycle Bin: Understanding, Implementing, Best Practices, and Troubleshooting](https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/the-ad-recycle-bin-understanding-implementing-best-practices-and/ba-p/396944) to get a better understanding of these processes.

Lowering the `msDS-DeletedObjectLifetime` attribute value can help ensure that your object count does not exceed supported levels. The lowest valid value for this attribute is 2 days. After the configured lifetime has elapsed for a deleted object, you can no longer recover it using the AD Recycling Bin. Recovering the object requires restoring your directory from a snapshot. For more information, see [Restoring your AWS Managed Microsoft AD with snapshots](ms_ad_snapshots.md). **Any restore from a snapshot can result in data loss because a snapshot captures a single point in time.**

To change the Deleted Object Lifetime of your directory, run the following command:

**Note**  
If you run the command as is, it sets the Deleted Object Lifetime attribute value to 30 days. To make it longer or shorter, replace "30" with the number you prefer. However, we recommend that you go no higher than the default of 180.

```
$DeletedObjectLifetime = 30
$DomainInfo = Get-ADDomain
$BaseDn = $DomainInfo.DistinguishedName
Set-ADObject -Identity "CN=Directory Service,CN=Windows NT,CN=Services,CN=Configuration,$BaseDn" -Partition "CN=Configuration,$BaseDn" -Replace:@{"msDS-DeletedObjectLifetime" = $DeletedObjectLifetime}
```

# Schema extension errors
<a name="ms_ad_troubleshooting_schema"></a>

The following can help you troubleshoot some error messages you might encounter when extending the schema for your AWS Managed Microsoft AD directory.

## Referral
<a name="referral"></a>

**Error**  
*Add error on entry starting on line 1: Referral The server side error is: 0x202b A referral was returned from the server. The extended server error is: 0000202B: RefErr: DSID-0310082F, data 0, 1 access points \tref 1: 'example.com' Number of Objects Modified: 0*

**Troubleshooting**  
Ensure that all of the distinguished name fields have the correct domain name. In the example above, `DC=example,dc=com` should be replaced with the `DistinguishedName` shown by the cmdlet `Get-ADDomain`.

## Unable to read import file
<a name="unabletoread"></a>

**Error**  
*Unable to read the import file. Number of Objects Modified: 0*

**Troubleshooting**  
The imported LDIF file is empty (0 bytes). Ensure the correct file was uploaded.

## Syntax error
<a name="syntaxerror"></a>

**Error**  
*There is a syntax error in the input file Failed on line 21. The last token starts with 'q'. Number of Objects Modified: 0*

**Troubleshooting**  
The text on line 21 is not formatted correctly. The first letter of the invalid text is `q`. Update line 21 with valid LDIF syntax. For more information about how to format the LDIF file, see [Step 1: Create your LDIF file](create.md).

## Attribute or value exists
<a name="attributeorvalue"></a>

**Error**  
*Add error on entry starting on line 1: Attribute Or Value Exists The server side error is: 0x2083 The specified value already exists. The extended server error is: 00002083: AtrErr: DSID-03151830, #1: \t0: 00002083: DSID-03151830, problem 1006 (ATT_OR_VALUE_EXISTS), data 0, Att 20019 (mayContain):len 4 Number of Objects Modified: 0*

**Troubleshooting**  
The schema change has already been applied.

## No such attribute
<a name="nosuchattribute"></a>

**Error**  
*Add error on entry starting on line 1: No Such Attribute The server side error is: 0x2085 The attribute value cannot be removed because it is not present on the object. The extended server error is: 00002085: AtrErr: DSID-03152367, #1: \t0: 00002085: DSID-03152367, problem 1001 (NO_ATTRIBUTE_OR_VAL), data 0, Att 20019 (mayContain):len 4 Number of Objects Modified: 0*

**Troubleshooting**  
The LDIF file is trying to remove an attribute from a class, but that attribute is currently not attached to the class. The schema change was probably already applied.

**Error**  
*Add error on entry starting on line 41: No Such Attribute 0x57 The parameter is incorrect. The extended server error is: 0x208d Directory object not found. The extended server error is: "00000057: LdapErr: DSID-0C090D8A, comment: Error in attribute conversion operation, data 0, v2580" Number of Objects Modified: 0*

**Troubleshooting**  
The attribute listed on line 41 is incorrect. Double-check the spelling.

## No such object
<a name="nosuchobject"></a>

**Error**  
*Add error on entry starting on line 1: No Such Object The server side error is: 0x208d Directory object not found. The extended server error is: 0000208D: NameErr: DSID-03100238, problem 2001 (NO_OBJECT), data 0, best match of: 'CN=Schema,CN=Configuration,DC=example,DC=com' Number of Objects Modified: 0*

**Troubleshooting**  
The object referenced by the distinguished name (DN) does not exist.

# Trust creation status reasons
<a name="ms_ad_troubleshooting_trusts"></a>

When trust creation fails for AWS Managed Microsoft AD, the status message contains additional information. The following can help you understand what those messages mean.

## Access is denied
<a name="access_denied"></a>

Access was denied when trying to create the trust. Either the trust password is incorrect or the remote domain's security settings do not allow a trust to be configured. For more information on trusts, see [Enhancing Trust Efficiency with Site Names and DCLocator](#enhancing-trust-site-names). To resolve this problem, try the following:
+ Verify that you are using the same trust password that you used when creating the corresponding trust on the remote domain.
+ Verify that your domain security settings allow for trust creation.
+ Verify that your local security policy is set correctly. Specifically check `Local Security Policy > Local Policies > Security Options > Network access: Named Pipes that can be accessed anonymously` and ensure that it contains at least the following three named pipes:
  + netlogon
  + samr
  + lsarpc
+ Verify that the above named pipes exist as values of the **NullSessionPipes** registry key, which is in the registry path **HKLM\SYSTEM\CurrentControlSet\services\LanmanServer\Parameters**. These values must be entered on separate rows.
**Note**  
By default, `Network access: Named Pipes that can be accessed anonymously` is not set and displays `Not Defined`. This is normal, because the domain controller's effective default values for `Network access: Named Pipes that can be accessed anonymously` are `netlogon`, `samr`, and `lsarpc`.
+ Verify the following Server Message Block (SMB) Signing Setting in the *Default Domain Controllers Policy*. These settings can be found under **Computer Configuration** > **Windows Settings** > **Security Settings** > **Local Policies/Security Options**. They should match the following settings: 
  + Microsoft network client: Digitally sign communications (always): Default: Enabled
  + Microsoft network server: Digitally sign communications (always): Enabled

### Enhancing Trust Efficiency with Site Names and DCLocator
<a name="enhancing-trust-site-names"></a>

A specific first site name, such as Default-First-Site-Name, is not required to establish trust relationships between domains. However, aligning site names between domains can significantly improve the efficiency of the Domain Controller Locator (DCLocator) process. This alignment makes the selection of domain controllers across forest trusts more predictable and easier to control.

The DCLocator process is crucial for finding domain controllers across different domains and forests. For more information on the DCLocator process, see [Microsoft documentation](https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/troubleshoot-domain-controller-location-issues). Efficient site configuration allows for quicker and more accurate domain controller location, which leads to better performance and reliability in cross-forest operations. 

For more information on how site names and the DCLocator process interact, see the following Microsoft articles:
+ [How Domain Controllers are Located Across Trusts](https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/how-domain-controllers-are-located-across-trusts/ba-p/256180)
+ [Domain Locator Across Forests](https://techcommunity.microsoft.com/blog/askds/domain-locator-across-a-forest-trust/395689)

## The specified domain name does not exist or could not be contacted
<a name="no_domain_name"></a>

To resolve this problem, ensure that the security group settings for your domain and the access control list (ACL) for your VPC are correct, and that you have accurately entered the information for your conditional forwarder. AWS configures the security group to open only the ports that are required for Active Directory communications. In the default configuration, the security group accepts traffic to these ports from any IP address. Outbound traffic is restricted to the security group. You must update the outbound rule on the security group to allow traffic to your on-premises network. For more information about security requirements, see [Step 2: Prepare your AWS Managed Microsoft AD](ms_ad_tutorial_setup_trust_prepare_mad.md).

![\[Edit security group\]](http://docs.aws.amazon.com/directoryservice/latest/admin-guide/images/edit_security_group.png)


If the DNS servers for the networks of the other directories use public (non-RFC 1918) IP addresses, you must add an IP route on the directory to the DNS servers from the Directory Service console. For more information, see [Create, verify, or delete a trust relationship](ms_ad_setup_trust.md#trust_steps) and [Prerequisites](ms_ad_setup_trust.md#trust_prereq).

The Internet Assigned Numbers Authority (IANA) has reserved the following three blocks of the IP address space for private internets:
+ 10.0.0.0 - 10.255.255.255 (10/8 prefix)
+ 172.16.0.0 - 172.31.255.255 (172.16/12 prefix)
+ 192.168.0.0 - 192.168.255.255 (192.168/16 prefix)

For more information, see [https://tools.ietf.org/html/rfc1918](https://tools.ietf.org/html/rfc1918).
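
To decide whether a given DNS server address falls outside these three blocks, and therefore needs an IP route, you could use a check along these lines. This is a sketch built on Python's standard `ipaddress` module. The function name `needs_ip_route` is hypothetical, and the sample addresses are documentation examples rather than real servers.

```python
import ipaddress

# The three private blocks listed above (RFC 1918).
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def needs_ip_route(dns_server_ip):
    """Return True if the address is outside all RFC 1918 blocks."""
    address = ipaddress.ip_address(dns_server_ip)
    return not any(address in network for network in RFC1918_NETWORKS)


for ip in ("10.1.2.3", "172.32.0.1", "203.0.113.10"):
    print(ip, needs_ip_route(ip))
```

Note that 172.32.0.1 is reported as needing a route, because the 172.16/12 prefix ends at 172.31.255.255.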

Verify that the **Default AD Site Name** for your AWS Managed Microsoft AD matches the **Default AD Site Name** in your on-premises infrastructure. A computer determines its site name from the domain of which the computer is a member, not from the user's domain. Renaming the site to match the closest on-premises site ensures that the DC locator uses a domain controller from the closest site. If this does not solve the issue, it is possible that information from a previously created conditional forwarder has been cached, preventing the creation of a new trust. Wait several minutes, and then try creating the trust and conditional forwarder again.

For more information about how this works, see [Domain Locator Across a Forest Trust](https://techcommunity.microsoft.com/t5/ask-the-directory-services-team/domain-locator-across-a-forest-trust/ba-p/395689) on the Microsoft website.

![\[Default first site name\]](http://docs.aws.amazon.com/directoryservice/latest/admin-guide/images/default_first_site_name.png)


## The operation could not be performed on this domain
<a name="operationfailedondomain"></a>

To resolve this, ensure that the two domains or directories do not have overlapping NetBIOS names. If they do, recreate one of them with a different NetBIOS name, and then try again.

## Trust creation is failing because of the error "Required and valid domain name"
<a name="trustcreationfailing"></a>

DNS names can contain only alphabetical characters (A-Z), numeric characters (0-9), the minus sign (-), and a period (.). Period characters are allowed only when they are used to delimit the components of domain style names. Also, consider the following:
+ AWS Managed Microsoft AD does not support trusts with Single label domains. For more information, see [Microsoft support for Single Label Domains](https://docs.microsoft.com/en-US/troubleshoot/windows-server/networking/single-label-domains-support-policy).
+ According to RFC 1123 ([https://tools.ietf.org/html/rfc1123](https://tools.ietf.org/html/rfc1123)), the only characters that can be used in DNS labels are "A" to "Z", "a" to "z", "0" to "9", and a hyphen ("-"). A period [.] is also used in DNS names, but only between DNS labels and at the end of an FQDN.
+ According to RFC 952 ([https://tools.ietf.org/html/rfc952](https://tools.ietf.org/html/rfc952)), a "name" (Net, Host, Gateway, or Domain name) is a text string up to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign (-), and period (.). Note that periods are only allowed when they serve to delimit components of "domain style names".

For more information, see [Complying with Name Restrictions for Hosts and Domains](https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/cc959336(v=technet.10)) on the Microsoft website.
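
The character rules above can be checked before you attempt to create the trust. The following is a rough sketch, not the validation that the service itself performs. The function name `check_trust_domain_name` is hypothetical, and the check covers only the rules stated in this section: allowed characters, periods as label separators, and no single-label names.

```python
import re

# A DNS label may contain only letters, digits, and hyphens.
_LABEL = re.compile(r"[A-Za-z0-9-]+")


def check_trust_domain_name(name):
    """Return a list of problems found in a domain name (empty if none)."""
    problems = []
    # A single trailing period is allowed at the end of an FQDN.
    trimmed = name[:-1] if name.endswith(".") else name
    labels = trimmed.split(".")
    if len(labels) < 2:
        problems.append("single-label domain names are not supported")
    for label in labels:
        if not _LABEL.fullmatch(label):
            problems.append(f"invalid label: {label!r}")
    return problems


for candidate in ("corp.example.com", "corp", "corp_ad.example.com"):
    print(candidate, check_trust_domain_name(candidate))
```

An empty list means the name passed these particular checks. It does not guarantee that trust creation will succeed.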

## General tools for testing trusts
<a name="trusttroubleshootingtools"></a>

The following are tools that can be used to troubleshoot various trust related issues.

**AWS Systems Manager Automation troubleshooting tool**

[Support Automation Workflows (SAW)](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-walk-support.html) leverage AWS Systems Manager Automation to provide you with a predefined runbook for Directory Service. The [AWSSupport-TroubleshootDirectoryTrust](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-awssupport-troubleshootdirectorytrust.html) runbook tool helps you diagnose common trust creation issues between AWS Managed Microsoft AD and an on-premises Microsoft Active Directory.

**DirectoryServicePortTest tool**

The [DirectoryServicePortTest](samples/DirectoryServicePortTest.zip) testing tool can be helpful when troubleshooting trust creation issues between AWS Managed Microsoft AD and on-premises Active Directory. For an example on how the tool can be used, see [Test your AD Connector](ad_connector_getting_started.md#connect_verification).

**NETDOM and NLTEST tool**

Administrators can use both the **Netdom** and **Nltest** command-line tools to find, display, create, remove, and manage trusts. These tools communicate directly with the LSA authority on a domain controller. For an example of how to use these tools, see [Netdom](https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/cc772217(v=ws.11)) and [NLTEST](https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/cc731935(v=ws.11)) on the Microsoft website.

**Packet capture tool**

You can use the built-in Windows packet capture utility to investigate and troubleshoot a potential network issue. For more information, see [Capture a Network Trace without installing anything](https://techcommunity.microsoft.com/t5/iis-support-blog/capture-a-network-trace-without-installing-anything-amp-capture/ba-p/376503).

# Troubleshooting AWS Managed Microsoft AD high CPU utilization
<a name="ms_ad_troubleshooting_high_cpu"></a>

The following can help you troubleshoot high CPU issues on AWS Managed Microsoft AD domain controllers.

## Finding the root cause
<a name="ms_ad_high_cpu_root_cause"></a>

The first step in troubleshooting high CPU utilization is to analyze CloudWatch metrics to identify patterns that may explain the increased resource consumption.

### Step 1: Review Directory Service CloudWatch metrics
<a name="ms_ad_high_cpu_step1"></a>

Monitor your AWS Managed Microsoft AD performance using CloudWatch metrics to identify traffic patterns that correlate with high CPU usage. For detailed information on viewing and interpreting Directory Service metrics, see [Using CloudWatch to monitor the performance of your AWS Managed Microsoft AD domain controllers](ms_ad_monitor_dc_performance.md).

Look for shifting patterns in the following key metrics that might explain the CPU increase:
+ **DNS queries per second** – Sudden spikes may indicate DNS resolution issues or misconfigured applications.
+ **Kerberos/NTLM authentications** – Higher authentication rates from user logons or service accounts.
+ **LDAP queries per second** – Increased LDAP traffic from applications or services.

Compare current metrics with historical baselines to identify when the high CPU utilization began and correlate it with specific traffic increases. If the metrics show no correlation, the root cause is unlikely to be an overwhelming increase in traffic and is more likely an inefficient LDAP query. In that case, skip to [Step 3: Capture detailed traffic analysis with Traffic Mirroring](#ms_ad_high_cpu_step3).
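
One way to make the comparison above less subjective is to compute a correlation coefficient between the CPU series and each traffic series over the same intervals. The following is a minimal sketch, not an AWS-provided tool. It assumes you have already exported the data points yourself. The metric names (`dns_qps`, `ldap_qps`) and all sample values are hypothetical placeholders.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series cannot explain variation in the other
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(cpu, traffic):
    """Rank traffic series by how closely they track the CPU series."""
    scores = {name: pearson(cpu, series) for name, series in traffic.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, hand-typed samples (one value per interval), not real data.
cpu = [20, 22, 21, 60, 85, 90]
traffic = {
    "dns_qps": [100, 110, 105, 100, 108, 103],
    "ldap_qps": [50, 55, 52, 300, 480, 510],
}
ranking = rank_by_correlation(cpu, traffic)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A coefficient close to 1 suggests that the traffic type rises and falls with CPU utilization. Values near 0 for every series are consistent with the no-correlation case described above.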

### Step 2: Identify source machines using VPC Flow Logs
<a name="ms_ad_high_cpu_step2"></a>

VPC Flow Logs provide an effective method to identify the source IP addresses of machines generating traffic to your domain controllers. For more information, see [Logging IP traffic using VPC Flow Logs](https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html). Use the destination port numbers to differentiate between services:
+ **Port 53** – DNS queries
+ **Port 88** – Kerberos authentication
+ **Port 123** – NTP clock synchronization
+ **Ports 135, 49152-65535** – RPC
+ **Ports 389, 636, 3268, 3269** – LDAP queries (389 or 3268 for standard LDAP, 636 or 3269 for LDAPS)
+ **Port 445** – SMB file sharing (Group Policies)
+ **Port 464** – Kerberos password change
+ **Port 9389** – Active Directory Web Service

To enable and analyze VPC Flow Logs:
+ Enable VPC Flow Logs for the subnets containing your domain controller ENIs.
+ Filter logs by destination ports to identify traffic patterns.
+ Sort the results by packet count, byte count, or both over the time period you are investigating.
+ Analyze source IP addresses to determine which machines are generating the most traffic.
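
The filtering and ranking steps above can be sketched in a few lines of Python. This example assumes flow log records in the default (version 2) format, already downloaded as text lines. The function name `top_talkers`, the IP addresses, the interface ID, and the account ID are made up for illustration. Dynamic RPC ports are omitted to keep the sketch short.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Destination ports from the list above, mapped to a short service label.
SERVICE_PORTS = {
    53: "DNS", 88: "Kerberos", 123: "NTP", 135: "RPC",
    389: "LDAP", 3268: "LDAP", 636: "LDAPS", 3269: "LDAPS",
    445: "SMB", 464: "Kerberos password change", 9389: "ADWS",
}


def top_talkers(lines, dc_ips, limit=10):
    """Return (source IP, service) pairs ranked by packets sent to the DCs."""
    packets = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # not a default-format record
        rec = dict(zip(FIELDS, parts))
        if rec["log_status"] != "OK" or rec["dstaddr"] not in dc_ips:
            continue  # skip NODATA/SKIPDATA records and non-DC destinations
        service = SERVICE_PORTS.get(int(rec["dstport"]))
        if service is None:
            continue  # destination port is not one of the listed services
        packets[(rec["srcaddr"], service)] += int(rec["packets"])
    return packets.most_common(limit)


# Made-up sample records; 10.0.0.10 stands in for a domain controller IP.
sample = [
    "2 123456789012 eni-0abc 10.0.1.25 10.0.0.10 50123 389 6 900 120000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.25 10.0.0.10 50124 389 6 300 40000 1700000060 1700000120 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.2.40 10.0.0.10 51000 53 17 150 9000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.0.10 10.0.1.25 389 50123 6 800 500000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc - - - - - - - 1700000000 1700000060 - NODATA",
]
result = top_talkers(sample, {"10.0.0.10"})
for (src, service), count in result:
    print(f"{src} {service} {count}")
```

To rank by bytes instead of packets, sum `rec["bytes"]` in place of `rec["packets"]`. For large log volumes, a query service is likely a better fit than a local script.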

### Step 3: Capture detailed traffic analysis with Traffic Mirroring
<a name="ms_ad_high_cpu_step3"></a>

VPC Flow Logs provide limited information about the actual content of requests. For more detailed analysis, consider Traffic Mirroring to capture full packet data. For more information, see [Get started using Traffic Mirroring to monitor network traffic](https://docs.aws.amazon.com/vpc/latest/mirroring/traffic-mirroring-getting-started.html). This is particularly useful when you need to analyze:
+ LDAP filter complexity and efficiency
+ Specific DNS query patterns
+ Authentication request details

Traffic Mirroring allows you to capture complete network packets sent to your domain controller instances, enabling deep analysis of the traffic causing high CPU utilization.

### Step 4: Investigate source applications and optimize traffic
<a name="ms_ad_high_cpu_step4"></a>

Once you've identified the source machines and traffic patterns, investigate the applications generating the traffic:
+ **Review application configurations** – Check if applications are making inefficient queries or excessive requests. Avoid hard coding the application to a single domain controller.
+ **Analyze LDAP queries** – Inefficient LDAP queries are the most common cause of high domain controller CPU. Look for complex filters that could benefit from attribute indexing.
+ **Examine DNS caching** – Verify that DNS client caching is enabled to reduce repetitive queries.
+ **Check authentication patterns** – Identify if service accounts are authenticating too frequently.

## Resolution strategies
<a name="ms_ad_high_cpu_resolution"></a>

Based on your investigation, implement appropriate optimization strategies:

### Optimize applications
<a name="ms_ad_high_cpu_optimize_apps"></a>
+ **Optimize LDAP queries** – Rewrite complex LDAP queries. Avoid setting the search base to the root of the domain. Instead, set it to the OU where the objects you are searching for reside. Avoid using a search scope that performs subtree searches. Instead, use a base or single-level scope. Include the object class in your filter, for example `(objectClass=user)` or `(objectClass=computer)`. Avoid using wildcards in the filter unless the attribute is indexed, and add an index if a wildcard scan is required. For more information, see [Extend your AWS Managed Microsoft AD schema](ms_ad_schema_extensions.md). Don't index every attribute, because the indexing process also increases CPU utilization.

  ```
  # Sample LDIF to index the mail attribute (searchFlags: 1 marks the attribute as indexed)
  dn: CN=mail,CN=Schema,CN=Configuration,DC=yourdomain,DC=com
  changetype: modify
  replace: searchFlags
  searchFlags: 1
  -
  ```
+ **Enable DNS client caching** – Configure clients to cache DNS responses locally to reduce server load.
+ **Implement connection pooling** – Configure applications to reuse LDAP connections rather than creating new ones for each query.
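
The LDAP query guidance above can be turned into a simple pre-flight check for application search settings. The following is a heuristic sketch only: `review_search` is a hypothetical helper, it inspects strings rather than contacting a directory, and it relies on a list of indexed attributes that you supply. The DNs and filters are placeholders.

```python
import re


def review_search(base_dn, scope, ldap_filter, domain_dn, indexed_attrs=()):
    """Return a list of warnings for one LDAP search definition."""
    warnings = []
    if base_dn.strip().lower() == domain_dn.strip().lower():
        warnings.append("search base is the domain root; target an OU instead")
    if scope.lower() == "subtree":
        warnings.append("subtree scope; prefer a base or single-level scope")
    if not re.search(r"\(\s*objectClass\s*=", ldap_filter, re.IGNORECASE):
        warnings.append("filter does not include an objectClass")
    indexed = {attr.lower() for attr in indexed_attrs}
    # Inspect simple (attribute=value) items inside the filter.
    for attr, value in re.findall(r"\(\s*([\w;-]+)\s*=([^()]*)\)", ldap_filter):
        # Presence tests such as (mail=*) are not treated as wildcards here.
        if "*" in value and value.strip() != "*" and attr.lower() not in indexed:
            warnings.append(f"wildcard on attribute '{attr}' that is not listed as indexed")
    return warnings


DOMAIN = "DC=corp,DC=example,DC=com"  # placeholder domain

bad = review_search(DOMAIN, "subtree", "(mail=*@example.com)", domain_dn=DOMAIN)
good = review_search(
    "OU=Staff," + DOMAIN,
    "onelevel",
    "(&(objectClass=user)(mail=jdoe*))",
    domain_dn=DOMAIN,
    indexed_attrs=["mail"],
)
for warning in bad:
    print("WARN:", warning)
```

A check like this flags only the patterns named in the guidance. It can't measure actual query cost, so treat the results as prompts for review, not as proof that a query is efficient.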

### Scale your directory infrastructure
<a name="ms_ad_high_cpu_scale"></a>

If traffic optimization doesn't resolve the high CPU utilization:
+ **Add more domain controllers** – Scale out by deploying additional domain controllers to distribute the load. For more information, see [Deploying additional domain controllers for your AWS Managed Microsoft AD](ms_ad_deploy_additional_dcs.md).
+ **Upgrade to Enterprise Edition** – If using Standard Edition, upgrade to Enterprise Edition for increased CPU capacity and performance. For more information, see [Upgrading your AWS Managed Microsoft AD](ms_ad_upgrade_edition.md). If already using Enterprise Edition, contact [AWS Support](https://docs.aws.amazon.com/awssupport/latest/user/case-management.html) for increased capacity.

For pricing information about AWS Managed Microsoft AD editions, see [Directory Service Pricing](https://aws.amazon.com/directoryservice/pricing/#Comparison_Table).