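SAP HANA scale-out - SAP HANA on AWS

SAP HANA scale-out

The following section is an example host setup for SAP HANA scale-out with a standby node on AWS, using FSx for ONTAP as the primary storage solution. You can use SAP HANA host auto-failover, an automated solution provided by SAP, to recover from a failure of an SAP HANA host. For more information, see SAP HANA - Host Auto-Failover.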

Linux kernel parameters

On all nodes, create a file named 91-NetApp-HANA.conf in the /etc/sysctl.d directory and add the following configuration.

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 131072 16777216
net.ipv4.tcp_wmem = 4096 16384 16777216
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_slow_start_after_idle=0
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1

Increase the maximum number of NFSv4 session slots to 180.

echo options nfs max_session_slots=180 > /etc/modprobe.d/nfsclient.conf

You must reboot the instance for the kernel parameters and NFS settings to take effect.
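
After the reboot, it can be useful to confirm that the running kernel picked up the values from 91-NetApp-HANA.conf. The following is a minimal sketch and not part of the original procedure. The function name check_sysctl_conf and the SYSCTL_BIN override are hypothetical names introduced here for illustration. The function only reads values with sysctl -n and reports any key whose running value differs from the file.

```shell
# Compare each "key = value" line of a sysctl.d file with the running kernel.
# SYSCTL_BIN is a hypothetical hook that lets you substitute another command
# for "sysctl" (for example, in a dry run).
check_sysctl_conf() {
    conf_file=$1
    rc=0
    while IFS='=' read -r key value; do
        # Strip whitespace from the key; skip blank lines and comments.
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        case $key in ''|'#'*) continue ;; esac
        # Normalize whitespace so "4096<TAB>131072" equals "4096 131072".
        want=$(printf '%s' "$value" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
        have=$("${SYSCTL_BIN:-sysctl}" -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
        if [ "$want" != "$have" ]; then
            printf 'MISMATCH %s: want "%s", have "%s"\n' "$key" "$want" "$have"
            rc=1
        fi
    done < "$conf_file"
    return $rc
}
```

A typical call would be check_sysctl_conf /etc/sysctl.d/91-NetApp-HANA.conf. The function returns a non-zero status if at least one value differs.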

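Network File System (NFS)

Important

For SAP HANA scale-out systems, FSx for ONTAP supports only NFS version 4.1.

Network File System (NFS) version 4 and higher requires user authentication. You can authenticate with a Lightweight Directory Access Protocol (LDAP) server or with local user accounts.

If you use local user accounts, the NFSv4 domain must be set to the same value on all Linux servers and SVMs. On the Linux hosts, set the domain parameter (Domain = <domain name>) in the /etc/idmapd.conf file.

To identify the domain setting of the SVM, run the following command:

nfs show -vserver hana-data -fields v4-id-domain

The following is example output:

vserver   v4-id-domain
--------- ------------
hana-data ec2.internal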
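
Because the domain must match on every Linux host and on the SVM, a small check on each host can catch a mismatch early. The following is a minimal sketch under stated assumptions: the function names idmapd_domain and nfs4_domain_matches are hypothetical, and the expected value (ec2.internal in the example output above) is passed in by hand from the SVM output.

```shell
# Print the value of the first uncommented "Domain = ..." line.
idmapd_domain() {
    conf_file=${1:-/etc/idmapd.conf}
    sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf_file" | head -n 1
}

# Succeed only if the local domain equals the value reported by the SVM.
nfs4_domain_matches() {
    expected=$1
    actual=$(idmapd_domain "${2:-/etc/idmapd.conf}")
    if [ "$actual" = "$expected" ]; then
        printf 'OK: NFSv4 domain is %s\n' "$actual"
    else
        printf 'MISMATCH: local "%s", SVM "%s"\n' "$actual" "$expected"
        return 1
    fi
}
```

For example, nfs4_domain_matches ec2.internal checks the default /etc/idmapd.conf on the current host.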

Create subdirectories

Mount the /hana/shared volume and create the shared and usr-sap subdirectories for each host. The following example commands apply to a 4+1 SAP HANA scale-out system.

mkdir /mnt/tmp
mount -t nfs -o sec=sys,vers=4.1 <svm-shared>:/HDB_shared /mnt/tmp
cd /mnt/tmp
mkdir shared
mkdir usr-sap-host1
mkdir usr-sap-host2
mkdir usr-sap-host3
mkdir usr-sap-host4
mkdir usr-sap-host5
cd
umount /mnt/tmp
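
For systems with a different number of hosts, the repeated mkdir commands can be expressed as a loop. This is a sketch only: make_shared_subdirs is a hypothetical helper name, and it assumes the shared volume is already mounted at the directory passed as the first argument.

```shell
# Create "shared" plus one usr-sap-hostN directory per host (workers + standby).
make_shared_subdirs() {
    mount_root=$1
    host_count=$2
    mkdir -p "$mount_root/shared" || return 1
    i=1
    while [ "$i" -le "$host_count" ]; do
        mkdir -p "$mount_root/usr-sap-host$i" || return 1
        i=$((i + 1))
    done
}
```

For the 4+1 example, make_shared_subdirs /mnt/tmp 5 creates the same six directories as the commands above.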

Create mount points

In a scale-out system, create the following mount points on all subordinate and standby nodes. The following example commands apply to a 4+1 SAP HANA scale-out system.

mkdir -p /hana/data/HDB/mnt00001
mkdir -p /hana/log/HDB/mnt00001
mkdir -p /hana/data/HDB/mnt00002
mkdir -p /hana/log/HDB/mnt00002
mkdir -p /hana/data/HDB/mnt00003
mkdir -p /hana/log/HDB/mnt00003
mkdir -p /hana/data/HDB/mnt00004
mkdir -p /hana/log/HDB/mnt00004
mkdir -p /hana/shared
mkdir -p /usr/sap/HDB
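
The mount points follow a regular pattern: one data and one log directory per worker, plus /hana/shared and /usr/sap/<SID>. The sketch below prints that list for a given SID and worker count. The name hana_mount_points is hypothetical, and the function only prints paths so that the result can be reviewed before anything is created.

```shell
# Print the mount points for a scale-out system with the given worker count.
hana_mount_points() {
    sid=$1
    workers=$2
    i=1
    while [ "$i" -le "$workers" ]; do
        printf '/hana/data/%s/mnt%05d\n' "$sid" "$i"
        printf '/hana/log/%s/mnt%05d\n' "$sid" "$i"
        i=$((i + 1))
    done
    printf '/hana/shared\n/usr/sap/%s\n' "$sid"
}
```

One way to use it is hana_mount_points HDB 4 | xargs mkdir -p, which creates the ten directories listed above.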

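Mount file systems

The created file systems must be mounted as NFS file systems on Amazon EC2. The following table shows recommended NFS options for the different SAP HANA file systems.

| File system     | Common mount options          | Version options             | Transfer size options      | Connection options |
| --------------- | ----------------------------- | --------------------------- | -------------------------- | ------------------ |
| SAP HANA data   | rw,bg,hard,timeo=600,noatime, | vers=4,minorversion=1,lock, | rsize=262144,wsize=262144, | nconnect=4         |
| SAP HANA log    | rw,bg,hard,timeo=600,noatime, | vers=4,minorversion=1,lock, | rsize=262144,wsize=262144, | nconnect=2         |
| SAP HANA shared | rw,bg,hard,timeo=600,noatime, | vers=4,minorversion=1,lock, | rsize=262144,wsize=262144, | nconnect=2         |
| SAP HANA binary | rw,bg,hard,timeo=600,noatime, | vers=4,minorversion=1,lock, | rsize=262144,wsize=262144, | nconnect=2         |

  • Changes to the nconnect parameter take effect only after the NFS file system is unmounted and mounted again.

  • Client systems must have unique host names when accessing FSx for ONTAP. If two systems have the same name, the second system may not be able to access FSx for ONTAP.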
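
To check the unique host name requirement before mounting, you can collect the host names of all clients and look for repeats. The following is a minimal sketch: duplicate_hostnames is a hypothetical helper that reads one host name per line from standard input, and treating names as case-insensitive is an assumption made here.

```shell
# Print every host name that occurs more than once (compared in lower case).
duplicate_hostnames() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

For example, printf 'hana\nhanaw01\nhanaw01\n' | duplicate_hostnames prints hanaw01. Empty output means that no duplicates were found.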

Example – mount shared volumes

Add the following lines to /etc/fstab on all hosts to preserve the mounted file systems across instance reboots. You can then run mount -a to mount the NFS file systems.

<svm-data_1>:/HDB_data_mnt00001 /hana/data/HDB/mnt00001 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=4
<svm-log_1>:/HDB_log_mnt00001 /hana/log/HDB/mnt00001 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2
<svm-data_2>:/HDB_data_mnt00002 /hana/data/HDB/mnt00002 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=4
<svm-log_2>:/HDB_log_mnt00002 /hana/log/HDB/mnt00002 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2
<svm-data_3>:/HDB_data_mnt00003 /hana/data/HDB/mnt00003 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=4
<svm-log_3>:/HDB_log_mnt00003 /hana/log/HDB/mnt00003 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2
<svm-data_4>:/HDB_data_mnt00004 /hana/data/HDB/mnt00004 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=4
<svm-log_4>:/HDB_log_mnt00004 /hana/log/HDB/mnt00004 nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2
<svm-shared>:/HDB_shared/shared /hana/shared nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2
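
The shared-volume entries differ only in the partition number and in the nconnect value (4 for data, 2 for log and shared). The sketch below generates the same lines for any worker count. It is an illustration, not an official tool: hana_fstab_shared and HANA_NFS_OPTS are hypothetical names, and the <svm-...> placeholders are printed literally, so they still need to be replaced with your SVM addresses.

```shell
# Mount options common to all SAP HANA file systems in this example.
HANA_NFS_OPTS='rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144'

# Print the fstab lines for the data, log, and shared volumes.
hana_fstab_shared() {
    sid=$1
    workers=$2
    i=1
    while [ "$i" -le "$workers" ]; do
        part=$(printf 'mnt%05d' "$i")
        printf '<svm-data_%d>:/%s_data_%s /hana/data/%s/%s nfs %s,nconnect=4\n' \
            "$i" "$sid" "$part" "$sid" "$part" "$HANA_NFS_OPTS"
        printf '<svm-log_%d>:/%s_log_%s /hana/log/%s/%s nfs %s,nconnect=2\n' \
            "$i" "$sid" "$part" "$sid" "$part" "$HANA_NFS_OPTS"
        i=$((i + 1))
    done
    printf '<svm-shared>:/%s_shared/shared /hana/shared nfs %s,nconnect=2\n' \
        "$sid" "$HANA_NFS_OPTS"
}
```

Calling hana_fstab_shared HDB 4 prints nine lines for the 4+1 example. Review the output and substitute the placeholders before adding it to /etc/fstab.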

Example – mount host-specific volumes

Add the host-specific line to /etc/fstab on each host to preserve the mounted file systems across instance reboots. You can then run mount -a to mount the NFS file systems.

| Host                  | Line |
| --------------------- | ---- |
| Host 1                | <svm-shared>:/HDB_shared/usr-sap-host1 /usr/sap/HDB nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2 |
| Host 2                | <svm-shared>:/HDB_shared/usr-sap-host2 /usr/sap/HDB nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2 |
| Host 3                | <svm-shared>:/HDB_shared/usr-sap-host3 /usr/sap/HDB nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2 |
| Host 4                | <svm-shared>:/HDB_shared/usr-sap-host4 /usr/sap/HDB nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2 |
| Host 5 (standby host) | <svm-shared>:/HDB_shared/usr-sap-host5 /usr/sap/HDB nfs rw,bg,hard,timeo=600,noatime,vers=4,minorversion=1,lock,rsize=262144,wsize=262144,nconnect=2 |
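
When /etc/fstab is edited by a script rather than by hand, running the script twice should not leave duplicate entries. The following sketch appends a line only if the identical line is not already present. The name append_fstab_line is hypothetical, and the target file is passed as an argument so that you can try it on a copy first.

```shell
# Append a line to an fstab-style file unless the exact line already exists.
append_fstab_line() {
    fstab_file=$1
    line=$2
    # -x matches whole lines, -F treats the line as a fixed string.
    if grep -qxF -- "$line" "$fstab_file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$fstab_file"
}
```

For example, on host 1 you might call append_fstab_line /etc/fstab with the Host 1 line shown above as the second argument, and then run mount -a.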

Set ownership for directories

Use the following commands to set hdbadm ownership on the SAP HANA data and log directories.

sudo chown hdbadm:sapsys /hana/data/HDB
sudo chown hdbadm:sapsys /hana/log/HDB

SAP HANA parameters

Install your SAP HANA system with the required configuration, and then set the following parameters. For more information on SAP HANA installation, see SAP HANA Server Installation and Update Guide.

Optimal performance

For optimal performance, set the following parameters in the global.ini file.

[fileio]
max_parallel_io_requests=128
async_read_submit=on
async_write_submit_active=on
async_write_submit_blocks=all

The following SQL commands can be used to set these parameters at the SYSTEM level.

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('fileio', 'max_parallel_io_requests') = '128' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('fileio', 'async_read_submit') = 'on' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('fileio', 'async_write_submit_active') = 'on' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('fileio', 'async_write_submit_blocks') = 'all' WITH RECONFIGURE;

NFS lock lease

Starting with SAP HANA 2.0 SPS4, SAP HANA provides parameters to control the failover behavior. We recommend using these parameters instead of setting the lease time at the SVM level. The following parameters are configured in the nameserver.ini file.

| Section              | Parameter            | Value |
| -------------------- | -------------------- | ----- |
| failover             | normal_retries       | 9     |
| distributed_watchdog | deactivation_retries | 11    |
| distributed_watchdog | takeover_retries     | 9     |

The following SQL commands can be used to set these parameters at the SYSTEM level.

ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM') SET ('failover', 'normal_retries') = '9' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM') SET ('distributed_watchdog', 'deactivation_retries') = '11' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM') SET ('distributed_watchdog', 'takeover_retries') = '9' WITH RECONFIGURE;
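
Each statement above follows the same template: file name, section, parameter, and value. The sketch below generates the statements from a small list, which can help keep the SQL in step with the parameter table. The function names hana_set_param_sql and nameserver_failover_sql are hypothetical, and the functions only print SQL text. How you run the SQL (for example, through hdbsql) is left to your environment.

```shell
# Print one ALTER SYSTEM statement for a SYSTEM-level parameter.
hana_set_param_sql() {
    printf "ALTER SYSTEM ALTER CONFIGURATION ('%s', 'SYSTEM') SET ('%s', '%s') = '%s' WITH RECONFIGURE;\n" \
        "$1" "$2" "$3" "$4"
}

# Print the statements for the three nameserver.ini failover parameters.
nameserver_failover_sql() {
    while read -r section key value; do
        hana_set_param_sql nameserver.ini "$section" "$key" "$value"
    done <<'EOF'
failover normal_retries 9
distributed_watchdog deactivation_retries 11
distributed_watchdog takeover_retries 9
EOF
}
```

The same helper covers other parameters. For example, hana_set_param_sql global.ini fileio async_read_submit on prints the corresponding fileio statement.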

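Data volume partitions

With SAP HANA 2.0 SPS4, additional data volume partitions allow you to configure two or more file system volumes for the data volume of an SAP HANA tenant database in a single-host or multi-host system. Data volume partitions enable SAP HANA to scale beyond the size and performance limits of a single volume. You can add additional data volume partitions at any time. For more information, see Host configuration.

Host preparation

Additional mount points and /etc/fstab entries must be created, and the new volumes must be mounted.

  • Create the additional mount points and assign the required permissions, group, and ownership.

    mkdir -p /hana/data2/HDB/mnt00001
    chmod -R 777 /hana/data2/HDB/mnt00001

  • Add the additional file systems to /etc/fstab.

    <data2>:/data2 /hana/data2/HDB/mnt00001 nfs <mount options>

  • Set the permissions to 777. This is required to enable SAP HANA to add a new data volume in a subsequent step. SAP HANA automatically sets more restrictive permissions during data volume creation.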

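Enable data volume partitioning

To enable data volume partitions, add the following entry to the global.ini file in the SYSTEMDB configuration.

[customizable_functionalities]
persistence_datavolume_partition_multipath = true

The following SQL command sets the same parameter at the SYSTEM level.

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('customizable_functionalities', 'PERSISTENCE_DATAVOLUME_PARTITION_MULTIPATH') = 'true' WITH RECONFIGURE;

Note

After updating the global.ini file, you must restart the database.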

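Add additional data volume partitions

Run the following SQL statement against the tenant database to add an additional data volume partition to it.

ALTER SYSTEM ALTER DATAVOLUME ADD PARTITION PATH '/hana/data2/HDB/';

Adding a data volume partition is quick. The new data volume partition is empty after creation. Data is distributed evenly across the data volumes over time.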

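Test host auto-failover

We recommend testing your SAP HANA host auto-failover scenarios. For more information, see SAP HANA - Host Auto-Failover.

Some words have been redacted and replaced by inclusive terms. These words may appear different in your product, system code, or table. For more details, see Inclusive Language at SAP.

The following table lists the expected results of the different test scenarios.

| Scenario | Expected result |
| -------- | --------------- |
| SAP HANA subordinate node failure using echo b > /proc/sysrq-trigger | The subordinate node fails over to the standby node |
| SAP HANA coordinator node failure using HDB kill | The SAP HANA service fails over to the standby node (the other candidate for the coordinator node) |
| SAP HANA coordinator node failure while other coordinator nodes act as subordinate nodes | The coordinator node fails over to the standby node while the other coordinator nodes act as subordinate nodes |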

SAP HANA subordinate node failure

Check the status of the landscape before testing.

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
| Host    | Host   | Host   | Failover | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active | Status | Status   | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |        |        |          |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | ------ | ------ | -------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | yes    | ok     |          |        | 1         | 1         | default  | default  | coordinator 1 | coordinator | worker      | coordinator | worker  | worker  | default | default |
| hanaw01 | yes    | ok     |          |        | 2         | 2         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw02 | yes    | ok     |          |        | 3         | 3         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes    | ok     |          |        | 4         | 4         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | yes    | ignore |          |        | 0         | 0         | default  | default  | coordinator 2 | subordinate | standby     | standby     | standby | standby | default | -       |

overall host status: ok

Run the following command as root on the subordinate node to simulate a node crash. In this example, the subordinate node is hanaw01.

echo b > /proc/sysrq-trigger

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
| Host    | Host   | Host   | Failover | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active | Status | Status   | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |        |        |          |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | ------ | ------ | -------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | yes    | ok     |          |        | 1         | 1         | default  | default  | coordinator 1 | coordinator | worker      | coordinator | worker  | worker  | default | default |
| hanaw01 | no     | info   |          |        | 2         | 0         | default  | default  | subordinate   | subordinate | worker      | standby     | worker  | standby | default | -       |
| hanaw02 | yes    | ok     |          |        | 3         | 3         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes    | ok     |          |        | 4         | 4         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | yes    | info   |          |        | 0         | 2         | default  | default  | coordinator 2 | subordinate | standby     | subordinate | standby | worker  | default | default |

overall host status: info
hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support>

SAP HANA coordinator node failure

Check the status of the landscape before crashing the node.

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
| Host    | Host   | Host   | Failover | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active | Status | Status   | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |        |        |          |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | ------ | ------ | -------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | yes    | ok     |          |        | 1         | 1         | default  | default  | coordinator 1 | coordinator | worker      | coordinator | worker  | worker  | default | default |
| hanaw01 | yes    | ok     |          |        | 2         | 2         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw02 | yes    | ok     |          |        | 3         | 3         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes    | ok     |          |        | 4         | 4         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | yes    | ignore |          |        | 0         | 0         | default  | default  | coordinator 2 | subordinate | standby     | standby     | standby | standby | default | -       |

overall host status: ok
hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support>

Use the following command to simulate a failure by interrupting the SAP HANA processes on the coordinator node. In this example, the coordinator node is hana.

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> HDB kill

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
nameserver hana:30001 not responding.
| Host    | Host   | Host   | Failover | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active | Status | Status   | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |        |        |          |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | ------ | ------ | -------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | no     | info   |          |        | 1         | 0         | default  | default  | coordinator 1 | subordinate | worker      | standby     | worker  | standby | default | -       |
| hanaw01 | yes    | ok     |          |        | 2         | 2         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw02 | yes    | ok     |          |        | 3         | 3         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes    | ok     |          |        | 4         | 4         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | yes    | info   |          |        | 0         | 1         | default  | default  | coordinator 2 | coordinator | standby     | coordinator | standby | worker  | default | default |

overall host status: info
hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support>

SAP HANA coordinator node failure while other coordinator nodes act as subordinate nodes

Check the status of the landscape before testing.

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
| Host    | Host   | Host   | Failover | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active | Status | Status   | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |        |        |          |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | ------ | ------ | -------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | yes    | ok     |          |        | 1         | 2         | default  | default  | coordinator 1 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw01 | yes    | info   |          |        | 2         | 0         | default  | default  | subordinate   | subordinate | worker      | standby     | worker  | standby | default | -       |
| hanaw02 | yes    | ok     |          |        | 3         | 4         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes    | ok     |          |        | 4         | 3         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | yes    | info   |          |        | 0         | 1         | default  | default  | coordinator 2 | coordinator | standby     | coordinator | standby | worker  | default | default |

overall host status: info
hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support>

Use the following command to simulate a failure by interrupting the SAP HANA processes on the coordinator node. In this example, the coordinator node is hanaw04.

hdbadm@hanaw04:/usr/sap/HDB/HDB00> HDB kill

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
| Host    | Host     | Host    | Failover         | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active   | Status  | Status           | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |          |         |                  |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | -------- | ------- | ---------------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | starting | warning |                  |        | 1         | 1         | default  | default  | coordinator 1 | coordinator | worker      | coordinator | worker  | worker  | default | default |
| hanaw01 | starting | warning |                  |        | 2         | 2         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw02 | yes      | ok      |                  |        | 3         | 3         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes      | ok      |                  |        | 4         | 4         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | no       | warning | failover to hana |        | 0         | 0         | default  | default  | coordinator 2 | subordinate | standby     | standby     | standby | standby | default | -       |

overall host status: warning

hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support> python landscapeHostConfiguration.py
| Host    | Host   | Host   | Failover | Remove | Storage   | Storage   | Failover | Failover | NameServer    | NameServer  | IndexServer | IndexServer | Host    | Host    | Worker  | Worker  |
|         | Active | Status | Status   | Status | Config    | Actual    | Config   | Actual   | Config        | Actual      | Config      | Actual      | Config  | Actual  | Config  | Actual  |
|         |        |        |          |        | Partition | Partition | Group    | Group    | Role          | Role        | Role        | Role        | Roles   | Roles   | Groups  | Groups  |
| ------- | ------ | ------ | -------- | ------ | --------- | --------- | -------- | -------- | ------------- | ----------- | ----------- | ----------- | ------- | ------- | ------- | ------- |
| hana    | yes    | ok     |          |        | 1         | 1         | default  | default  | coordinator 1 | coordinator | worker      | coordinator | worker  | worker  | default | default |
| hanaw01 | yes    | ok     |          |        | 2         | 2         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw02 | yes    | ok     |          |        | 3         | 3         | default  | default  | subordinate   | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw03 | yes    | ok     |          |        | 4         | 4         | default  | default  | coordinator 3 | subordinate | worker      | subordinate | worker  | worker  | default | default |
| hanaw04 | no     | ignore |          |        | 0         | 0         | default  | default  | coordinator 2 | subordinate | standby     | standby     | standby | standby | default | -       |

overall host status: ok
hdbadm@hana:/usr/sap/HDB/HDB00/exe/python_support>
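
When the failover tests are repeated, reading these wide tables by eye becomes tedious. The sketch below extracts two facts from the output of landscapeHostConfiguration.py: the overall host status and the hosts whose actual IndexServer role is standby. It is an illustration under assumptions. The function names are hypothetical, and the parsing assumes the 17-column layout shown in the examples above, with the actual IndexServer role in the 13th column.

```shell
# Print the value of the last "overall host status:" line read from stdin.
overall_host_status() {
    sed -n 's/^.*overall host status:[[:space:]]*\([a-z]*\).*$/\1/p' | tail -n 1
}

# Print the hosts whose actual IndexServer role (13th column) is "standby".
standby_hosts() {
    awk -F'|' 'NF >= 18 {
        host = $2; role = $14
        gsub(/^[ \t]+|[ \t]+$/, "", host)
        gsub(/^[ \t]+|[ \t]+$/, "", role)
        if (role == "standby") print host
    }'
}
```

For example, python landscapeHostConfiguration.py | standby_hosts would print hanaw04 for the initial landscape, and hanaw01 after the subordinate node failover shown earlier.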