Tivoli System Automation (TSA) is high-availability cluster management software. The DB2 TSA and HADR high-availability solution automatically detects failures and switches roles between the DB2 HADR primary and standby. This article covers the commonly used TSA commands, how to add CDC or DSG to a TSA cluster, and how to analyze TSA errors.
Common commands

lsrpdomain/lsrpnode - query domain and node information
    [db2inst1@p0-pbd-pbd-db2 ~]$ lsrpdomain
    Name        OpState RSCTActiveVersion MixedVersions TSPort GSPort
    hadr_domain Online  3.2.4.4           No            12347  12348
    [db2inst1@p0-pbd-pbd-db2 ~]$ lsrpnode
    Name           OpState RSCTVersion
    p0-pbd-pbd-db2 Online  3.2.4.4
    p0-pbd-pbd-db1 Online  3.2.4.4
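A quick health check can be scripted on top of the lsrpnode output. The following is a minimal sketch, assuming the three-column layout shown above (Name, OpState, RSCTVersion). The function name offline_nodes is hypothetical, and the sample input is made up for illustration (one node is set to Offline so that the filter prints something).

```shell
#!/bin/sh
# offline_nodes: read `lsrpnode` output on stdin and print every node
# whose OpState is not "Online". Assumes the column layout shown above.
offline_nodes() {
    awk 'NR > 1 && NF >= 2 && $2 != "Online" { print $1 }'
}

# Made-up sample input; on a cluster node use: lsrpnode | offline_nodes
sample='Name           OpState RSCTVersion
p0-pbd-pbd-db2 Online  3.2.4.4
p0-pbd-pbd-db1 Offline 3.2.4.4'

printf '%s\n' "$sample" | offline_nodes
```

An empty result means every listed node reports Online.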
lssam - query resource status:

    [db2inst1@p0-pbd-pbd-db2 ~]$ lssam
    Online IBM.ResourceGroup:cdc_I2KFK38-rg Nominal=Online
            '- Online IBM.Application:cdc-I2KFK38-rs
                    |- Offline IBM.Application:cdc-I2KFK38-rs:p0-pbd-pbd-db1
                    '- Online IBM.Application:cdc-I2KFK38-rs:p0-pbd-pbd-db2

lsrg -Ab -V -g <resource group> - query resource group status and attributes
    [db2inst1@p0-pbd-pbd-db2 ~]$ lsrg -Ab -V -g cdc_I2KFK38-rg
    Starting to list resource group information.
    lsrg: Executed on Thu Aug 31 09:50:58 2023 at p0-pbd-pbd-db2, master node p0-pbd-pbd-db2.

    Displaying Resource Group information:
    All Attributes
    For Resource Group cdc_I2KFK38-rg.

    Resource Group 1:
        Name             = cdc_I2KFK38-rg
        MemberLocation   = Collocated
        Priority         = 0
        AllowedNode      = ALL
        NominalState     = Online
        ExcludedList     = {}
        Subscription     = {}
        Owner            =
        Description      =
        InfoLink         =
        Requests         = {}
        Force            = 0
        ActivePeerDomain = hadr_domain
        OpState          = Online
        TopGroup         = cdc_I2KFK38-rg
        MoveStatus       = [None]
        ConfigValidity   =
        LockState        = 0
        AutomationDetails[CompoundState] = Satisfactory
                         [DesiredState]    = Online
                         [ObservedState]   = Online
                         [BindingState]    = Bound
                         [AutomationState] = Idle
                         [ControlState]    = Startable
                         [HealthState]     = Not Applicable
    Completed listing resource group information.

chrg -o online (offline) <resource group> - start or stop a resource group and change its Nominal State at the same time
rgreq -o start (stop) <resource group> - start or stop a resource group without changing its Nominal State
rgreq -o lock (unlock) <resource group> - lock or unlock a resource group
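Locking a resource group stops TSA from automatically starting or stopping it based on the resource groups it depends on. You can wait until the resource group it depends on has switched over and is confirmed Online, then unlock the group so that it runs normally. For example, suppose a DB2 HADR pair hosts many CDC instances and DSG replication software instances, and their processes have been added to TSA resource groups that depend on the HADR PRIMARY. Whichever machine holds the PRIMARY role also runs the CDC and DSG processes, which ensures they follow the latest logs.

During a high-availability switchover drill, lock the CDC and DSG resource groups before shutting down the HADR standby machine. If they are not locked and the log gap between the original standby and the primary is large, then once the new PRIMARY is up after the switchover, CDC and DSG cannot find the latest log and fail with errors.

    lsrg | egrep -i 'dsg|cdc' | grep -v db2inst1 | awk '{print "rgreq -o lock " $1}' | sh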
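Because the one-liner above pipes straight into sh, it can help to preview the generated commands first. The following is a minimal sketch of a dry-run variant. The function name lock_commands and the sample group dsg_example-rg are hypothetical. The grep -v db2inst1 step matters because a DB2 group such as db2_db2inst1_db2inst1_TKYLCDC-rg also matches "cdc" case-insensitively.

```shell
#!/bin/sh
# lock_commands: read `lsrg` output (one resource group name per line) on stdin
# and print one `rgreq -o lock` command per CDC/DSG group, without running it.
lock_commands() {
    grep -Ei 'dsg|cdc' | grep -v 'db2inst1' | awk '{ print "rgreq -o lock " $1 }'
}

# Made-up sample input; dsg_example-rg is a hypothetical group name.
sample='Resource Group names:
cdc_I2KFK38-rg
dsg_example-rg
db2_db2inst1_db2inst1_TKYLCDC-rg'

printf '%s\n' "$sample" | lock_commands
```

On a cluster node, review the output of lsrg | lock_commands and append | sh only once the list looks right. Replacing lock with unlock in the awk program gives the matching unlock step for after the switchover.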
lsrsrc IBM.Application - list all resource attributes, including the CDC/Db2 scripts being monitored and their timeout values.
resetrsrc -s "Name == 'db2_db2inst1_0-rs'" IBM.Application - reset the resource state.
    lsrsrc IBM.Application

    resource 57:
        Name              = db2_db2inst1_p0-pbd-pbd-db2_0-rs
        ResourceType      = 0
        AggregateResource = 0x2028 0xffff 0xe38eb1e1 0xa0a9fe1d 0x96244eb2 0x54fb9408
        StartCommand      = /usr/sbin/rsct/sapolicies/db2/db2V105_start.ksh db2inst1 0
        StopCommand       = /usr/sbin/rsct/sapolicies/db2/db2V105_stop.ksh db2inst1 0
        MonitorCommand    = /usr/sbin/rsct/sapolicies/db2/db2V105_monitor.ksh db2inst1 0

    resource 58:
        Name              = db2_db2inst1_p0-pbd-pbd-db2_0-rs
        ResourceType      = 1
        AggregateResource = 0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000
        StartCommand      = /usr/sbin/rsct/sapolicies/db2/db2V105_start.ksh db2inst1 0
        StopCommand       = /usr/sbin/rsct/sapolicies/db2/db2V105_stop.ksh db2inst1 0
        MonitorCommand    = /usr/sbin/rsct/sapolicies/db2/db2V105_monitor.ksh db2inst1 0
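When a cluster holds dozens of resources, the full lsrsrc listing is long. The following is a minimal sketch that reduces it to one line per resource entry, showing the resource name and its monitor script. It assumes each attribute is printed as "Attribute = value" with optional double quotes around the value. The function name monitor_commands is hypothetical, and a value that itself contains " = " would be cut short.

```shell
#!/bin/sh
# monitor_commands: read `lsrsrc IBM.Application` output on stdin and print
# "<resource name><TAB><monitor command>" once per resource entry.
# Assumes lines of the form `Attribute = value` (value optionally quoted).
monitor_commands() {
    awk -F' = ' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the attribute name
            gsub(/"/, "", val)                 # drop optional double quotes
        }
        key == "Name"           { name = val }
        key == "MonitorCommand" { print name "\t" val }
    '
}

# Sample input built from the listing above; use on a cluster node as:
#   lsrsrc IBM.Application | monitor_commands
sample='resource 57:
        Name           = "db2_db2inst1_p0-pbd-pbd-db2_0-rs"
        ResourceType   = 0
        MonitorCommand = "/usr/sbin/rsct/sapolicies/db2/db2V105_monitor.ksh db2inst1 0"'

printf '%s\n' "$sample" | monitor_commands
```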
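The following cdc_tsa.sh script adds a CDC instance to a TSA cluster resource group, for example the cdc_I2KFK38-rg resource group, where I2KFK38 is the CDC instance name.

    vi cdc_tsa.sh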
    OsUser=cdcuser
    InstName=test
    ResourceName=cdc_${InstName}-rs
    ResourceGroupName=cdc_${InstName}-rg
    dependondb2ResourceName=IBM.ResourceGroup:db2_db2inst1_db2inst1_TKYLCDC-rg

    # Create the IBM.Application resource for the CDC instance
    mkrsrc IBM.Application \
        Name="${ResourceName}" \
        ResourceType=1 \
        StartCommand="/usr/sbin/rsct/sapolicies/cdc/${InstName}_start.sh" \
        StopCommand="/usr/sbin/rsct/sapolicies/cdc/${InstName}_stop.sh" \
        MonitorCommand="/usr/sbin/rsct/sapolicies/cdc/${InstName}_jiankong.sh" \
        MonitorCommandPeriod=10 \
        MonitorCommandTimeout=120 \
        StartCommandTimeout=900 \
        StopCommandTimeout=900 \
        UserName="${OsUser}" \
        RunCommandsSync=1 \
        ProtectionMode=0 \
        NodeNameList="{'p0-pbd-pbd-db2','p0-pbd-pbd-db1'}"

    mkrg ${ResourceGroupName}

    # Lock the resource group
    rgreq -o lock ${ResourceGroupName}

    # Take the resource group offline
    chrg -o Offline ${ResourceGroupName}

    # Add the resource to the resource group
    addrgmbr -g ${ResourceGroupName} IBM.Application:${ResourceName}

    # Make the resource depend on the DB2 resource group
    mkrel -p DependsOn -S IBM.Application:${ResourceName} -G ${dependondb2ResourceName} ${ResourceName}_DependsOn_db2-rel

    # Bring the resource group online
    chrg -o Online ${ResourceGroupName}
    # Unlock the resource group
    rgreq -o unlock ${ResourceGroupName}

TSA problem diagnosis

Diagnostic logs
1) /var/log/messages
2) /var/ct/hadr_domain/log/mc

    drwxr-x--- 2 root root    6 Jul 23 14:42 IBM.ConfigRM
    drwxr-xr-x 2 root root 4096 Jul 23 14:42 IBM.GblResRM
    drwxr-xr-x 2 root root 4096 Jul 23 14:42 IBM.RecoveryRM
    drwxr-xr-x 2 root root 4096 Jul 23 14:42 IBM.StorageRM
    drwxr-xr-x 2 root root   78 Jul 23 14:42 IBM.TestRM
As shown above, each resource manager daemon has its own directory. For TSA, focus on GblResRM and RecoveryRM.

1) IBM.GblResRM – The “eyes and hands” of the cluster. Responsible for start, stop, monitor and cleanup of IBM.Application resources. In the context of DB2, it is responsible for managing all DB2 defined entities.
Basically passive. It invokes monitor commands for resources based on defined intervals and services IBM.RecoveryRM requests.
2) IBM.RecoveryRM – The “brain” of the cluster. Inputs are RMC events from other resource managers (IBM.GblResRM for IBM.Application resources, IBM.ConfigRM for hosts and network adapters, etc.), commands from users, and the resource model.
Output is commands issued to other resource managers to start/stop/cleanup resources.
Structured as a rule engine that determines how to respond to incoming events, and an optimizer component (called a “binder”) to determine resource placement if resources need to move between hosts.
Use rpttr -o dtic trace.29.sp to format the trace files of each resource manager.
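Formatting every trace file by hand is tedious when several resource managers are involved. The following is a minimal sketch that prints one rpttr command per trace file instead of running it. It assumes that trace files sit directly inside each IBM.* directory and are named trace* (as in trace.29.sp above). The function name format_trace_commands and the /tmp output file names are hypothetical.

```shell
#!/bin/sh
# format_trace_commands: print one `rpttr -o dtic` command per trace file found
# under the given log directory (one subdirectory per resource manager).
# Assumption: trace files are named trace* (for example trace.29.sp).
format_trace_commands() {
    log_dir=$1
    for f in "$log_dir"/IBM.*/trace*; do
        [ -f "$f" ] || continue                    # skip unmatched glob patterns
        rm_name=$(basename "$(dirname "$f")")      # e.g. IBM.RecoveryRM
        # The /tmp output path is a hypothetical choice for this sketch.
        printf 'rpttr -o dtic %s > /tmp/%s.%s.txt\n' "$f" "$rm_name" "$(basename "$f")"
    done
}

# Defaults to the log directory named in this article.
format_trace_commands "${1:-/var/ct/hadr_domain/log/mc}"
```

Review the printed commands and pipe them to sh as root when they look right. Printing first keeps the sketch safe to run on a machine without TSA installed, where it simply prints nothing.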