V 8 nfs+drbd+heartbeat

nfs+drbd+heartbeat: any single point of failure in nfs (or in a distributed store such as mfs) can be removed with this scheme. In real production environments, nfs is one of the most common storage architectures for small and mid-sized companies: it is simple to deploy and easy to maintain. With nothing more than inotify+rsync you get simple, efficient data synchronization, giving the nfs storage system master-slave replication across machines plus MySQL-style r/w splitting. Multiple read slaves can sit behind LVS or haproxy for load balancing, which both spreads the load of heavy concurrent reads and removes the slaves as a single point of failure.

Storage options on the web servers:

Option 1: read and write on any web server, and use inotify+rsync to sync each web server's data to the others, e.g. web1--web2--web3--web2--web1.

Option 2: configure the load balancer so that writes (file uploads) go only to web3, while reads go to web1/web2; inotify+rsync then syncs web3--web2 and web3--web1.

Option 3: use shared nfs storage. With a single box taking both reads and writes you have a single point, so add a second box as a master/backup pair syncing to each other via inotify+rsync. You can put reads and writes on one box and keep the other purely as a backup, or read from the backup and write to the master (reads usually far outnumber writes). Add further backup boxes — one master, many slaves — to relieve read pressure, syncing master--backup1, master--backup2. If one storage box fails, web{1,2,3} remount the other one. Losing a slave does no harm; losing the master means no more writes, so make the master highly available: master-active and master-inactive, only one of which serves at any moment — master-inactive stays idle and only serves after a failover.

Option 4: drop nfs as a shared mount. Keep master-active and master-inactive only as the write target: data is written to the shared storage, then synced back from the shared storage to the local disks of web{1,2,3}; reads are served straight from local disk.

In the one-master-many-slaves model, if writes must stay available when the master dies — and keep propagating to the slaves — use nfs+drbd+heartbeat to make the master highly available and remove its single point. When the master-active nfs fails, service switches to the master-inactive nfs; the two masters hold identical data, and the master-inactive nfs automatically resumes syncing to all the nfs slaves, giving the nfs storage system a hot standby.

When master-active fails over to the standby master-inactive, the standby node must still be able to sync to the nfs slaves — but now it must sync only the data changed after the switch, not everything. sersync can replace inotify here (via sersync's -r option); alternatively, leave inotify stopped until the standby's heartbeat has started and the filesystem is mounted, and only then start the inotify service.

Note: compared with filesystems such as MFS, FastDFS and GFS, this scheme is simpler to deploy and easier to maintain and control — it follows the keep-it-simple principle. It has drawbacks too: every node carries the complete data set and, as with MySQL replication, syncing large volumes of files is likely to lag. Mitigations: split the sync by data directory (similar to MySQL database/table sharding); against lag, run several sync instances and steer the read/write logic in the application — and make sure the sync state can be monitored.

nfs HA: when the two master nodes switch over, the nfs slaves may be unable to read and hang. Approaches:
- keep the rpcbind service running at all times on the master node, the standby node and every nfs client;
- have each nfs client (nfs slave) monitor its locally mounted nfs share and remount if reads fail;
- have the nfs clients watch for the VIP appearing on the master-inactive node (or drbd's state turning Primary) and remount when it does;
- during an nfs failover, trigger a remount on the nfs clients via SSH or a similar mechanism;
- or monitor with nagios: when the VIP appears on the master-inactive node, run a designated script that remounts on all nfs clients.

The ellipse in the figure marks what this section covers.

Note: a single server needs no dedicated file storage — keep the data local; dedicated storage only makes sense for clusters.

Note: the problem with reads and writes on a single box is poor performance; in a company, ops must think about data protection and 7*24 continuous service.

Note: web1 and web2 usually run LNMP; IMG1 and IMG2 usually run nginx or lighttpd. This scheme solves both the nfs
master single point and the concurrent-read performance problem, but if concurrent writes keep growing it runs into the following issues. It suits roughly 200-300 uploaded images/s — sync efficiency still holds up; above 300 images/s, master-slave sync may start to lag. Remedies: sync with multiple threads; tune the monitored events, disk IO and network IO. With many IMG servers and only one master, the master both takes the writes and feeds every slave, so its load gets heavy. When the image volume is very large, every node still carries the complete data set; above roughly 3T total, a single server may run out of space. Remedies:

1. Borrow MySQL's sharding idea to solve capacity, write performance and sync lag at once: e.g. start with img1--img5, five directories mapped to five domains, and mount those five directories; each imgNUM then becomes its own nfs master/slave HA + r/w splitting cluster (r/w splitting done via POST or WebDAV).
2. Extend to a multi-master architecture via DNS — but every new service added this way is another single point.
3. Use the internal replication of MySQL, Oracle, MongoDB, Cassandra etc. to sync the file data; iQIYI stores images in MongoDB's GridFS. Note: GridFS image storage supports distribution. Design idea: store only one copy of the original image; generate the thumbnail on the first request and save it as a static file; with a fixed URL scheme, different URLs produce different thumbnails (cf. "Abusing Amazon images"). Note: see Facebook's photo management architecture.

Note: giving nfs HA removes the single point at the cost of one idle server. The two nfs masters sync in real time through heartbeat+drbd using drbd's protocol C; nfs(M) and nfs(S) sync asynchronously via inotify+rsync, with nfs(S) syncing from nfs(M) through the VIP. The nfs slaves serve reads and the nfs master serves writes, which solves concurrent-read performance. Alternatively, let the nfs master take writes only and push the data out to the app servers, dropping the nfs mounts altogether. Physical disks: RAID10 or RAID0, chosen by performance and redundancy needs. Between servers, and between server and switch, use dual gigabit NICs with bonding. App servers (web servers and others) reach nfs(M) via the VIP, and the load-balanced nfs(S) pool via other VIPs. nfs(M)'s data lives on the drbd partition. If the data volume is modest, sync straight from nfs(M) to the app servers' local disks: all reads then come from local disk, while writes still go to nfs(M). Under heavy concurrent writes, the inotify+rsync master--slave sync can lag or fall out of sync.

Note: in real operational work, touch the DB and file-storage layers only when there is no other choice. Day to day, adjust the site architecture so that user requests hit the DB and storage as little as possible — file caches, data caches. The core principle of high concurrency: push every user request as far forward as you can, instead of reaching for a distributed storage system first. For small and mid-sized companies, distributed storage is a cannon aimed at a mosquito; in 2012, when Facebook was already very large, it still used an nfs storage system. Distributed systems are no silver bullet — they consume a great deal of manpower and resources, and handled badly they bring disaster.

Note: to ease the load on the site, push the content users fetch as far forward as possible: whatever can live on the user's own machine should not go to the CDN; whatever can live on the CDN should not hit your own servers. Use every layer of caching, and only as a last resort let users reach the backend DB. If even that cannot hold: use ssd+sata; if that is still not enough, use distributed storage.

1. Installing and configuring heartbeat

Environment:
VIP: 10.96.20.8
master: eth0 10.96.20.113, eth1 172.16.1.113 (no gateway or DNS), hostname test-master
backup: eth0 10.96.20.114, eth1 172.16.1.114 (no gateway or DNS), hostname test-backup
Dual NICs, dual disks. Note: eth0 is the management IP; eth1 carries the heartbeat and the drbd transport. In production, if the heartbeat and the data share one NIC, rate-limit so the heartbeat keeps some bandwidth. Note the conventions: label the VMs in VMware and the tabs in Xshell; in a production environment every host should have an entry in /etc/hosts, which makes distribution and maintenance easier.

On test-master, configure the hostname (/etc/sysconfig/network must match uname -n), /etc/hosts, ssh mutual trust, time sync, iptables and selinux:
[root@test-master ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
[root@test-master ~]# uname -rm
2.6.32-431.el6.x86_64 x86_64
[root@test-master ~]# uname -n
test-master
[root@test-master ~]# ifconfig | grep eth0 -A 1
eth0 Link encap:Ethernet
HWaddr 00:0C:29:1F:B6:AC
inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0
[root@test-master ~]# ifconfig | grep eth1 -A 1
eth1 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:B6
inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0
[root@test-master ~]# route add -host 172.16.1.114 dev eth1 # host route so heartbeat traffic leaves via the specified NIC; append this line to /etc/rc.local, or configure a static route instead: vim /etc/sysconfig/network-scripts/route-eth1 and add: 172.16.1.114/24 via 172.16.1.113
[root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''
Generating public/private rsa key pair.
Your identification has been saved in ./.ssh/id_rsa.
Your public key has been saved in ./.ssh/id_rsa.pub.
The key fingerprint is:
29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:12 root@test-master
The key's randomart image is: (randomart box omitted)
[root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup
The authenticity of host 'test-backup (10.96.20.114)' can't be established.
RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'test-backup' (RSA) to the list of known hosts.
root@test-backup's password:
Now try logging into the machine, with "ssh 'root@test-backup'", and check in: .ssh/authorized_keys to make sure we haven't added extra keys that you weren't expecting.
[root@test-master ~]# crontab -l
*/5 * * * * /usr/sbin/ntpdate time.windows.com >/dev/null
[root@test-master ~]# service crond restart
Stopping crond: [ OK ]
Starting crond: [ OK ]
[root@test-master ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@test-master ~]# rpm -ivh epel-release-6-8.noarch.rpm
warning: epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing...
########################################### [100%]
1:epel-release ########################################### [100%]
[root@test-master ~]# yum search heartbeat
……
heartbeat-devel.i686 : Heartbeat development package
heartbeat-devel.x86_64 : Heartbeat development package
heartbeat-libs.i686 : Heartbeat libraries
heartbeat-libs.x86_64 : Heartbeat libraries
heartbeat.x86_64 : Messaging and membership subsystem for High-Availability Linux
[root@test-master ~]# yum -y install heartbeat
[root@test-master ~]# chkconfig heartbeat off
[root@test-master ~]# chkconfig --list heartbeat
heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off

On test-backup:
[root@test-backup ~]# uname -n
test-backup
[root@test-backup ~]# ifconfig | grep eth0 -A 1
eth0 Link encap:Ethernet HWaddr 00:0C:29:15:E6:BB
inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0
[root@test-backup ~]# ifconfig | grep eth1 -A 1
eth1 Link encap:Ethernet HWaddr 00:0C:29:15:E6:C5
inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0
[root@test-backup ~]# route add -host 172.16.1.113 dev eth1
[root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''
Generating public/private rsa key pair.
Your identification has been saved in ./.ssh/id_rsa.
Your public key has been saved in ./.ssh/id_rsa.pub.
The key fingerprint is:
08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:b8 root@test-backup
The key's randomart image is: (randomart box omitted)
[root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master
The authenticity of host 'test-master (10.96.20.113)' can't be established.
RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.
Are you sure you want to continue connecting (yes/no)?
yes
Warning: Permanently added 'test-master' (RSA) to the list of known hosts.
root@test-master's password:
Now try logging into the machine, with "ssh 'root@test-master'", and check in: .ssh/authorized_keys to make sure we haven't added extra keys that you weren't expecting.
[root@test-backup ~]# crontab -l
*/5 * * * * /usr/sbin/ntpdate time.windows.com >/dev/null
[root@test-backup ~]# service crond restart
Stopping crond: [ OK ]
Starting crond: [ OK ]
[root@test-backup ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@test-backup ~]# rpm -ivh epel-release-6-8.noarch.rpm
[root@test-backup ~]# yum -y install heartbeat
[root@test-backup ~]# chkconfig heartbeat off
[root@test-backup ~]# chkconfig --list heartbeat
heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off

Back on test-master:
[root@test-master ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/
[root@test-master ~]# cd /etc/ha.d
[root@test-master ha.d]# ls
authkeys ha.cf harc haresources rc.d README.config resource.d shellfuncs
[root@test-master ha.d]# vim authkeys # generate a random value with: dd if=/dev/random count=1 bs=512 | md5sum ; put it after sha1
auth 1
1 sha1 912d6402295ac8d47109e56b177073b9
[root@test-master ha.d]# chmod 600 authkeys # this file must be mode 600, otherwise the service fails to start
[root@test-master ha.d]# ll !$
ll authkeys
-rw-------.
1 root root 692 Aug 7 21:51 authkeys
[root@test-master ha.d]# vim ha.cf
debugfile /var/log/ha-debug # debug log
logfile /var/log/ha-log
logfacility local1 # configure rsyslog to receive the logs via local1
keepalive 2 # heartbeat interval: send every 2s
deadtime 30 # if the standby node hears no heartbeat from the master for 30s, it immediately takes over the peer's service resources
warntime 10 # heartbeat-delay warning threshold: after 10s without a heartbeat the standby writes a warning to the log, but does not fail over yet
initdead 120 # wait 120s after heartbeat first starts before bringing up the master's resources; this covers waiting for the peer's heartbeat to come up — set it to at least twice deadtime
udpport 694
#bcast eth0 # heartbeat via Ethernet broadcast on eth0; to carry the heartbeat over two physical networks, use: bcast eth0 eth1
mcast eth0 225.0.0.11 694 1 0 # multicast parameters; the multicast address must be unique on the LAN (several heartbeat clusters may coexist); use a class D address (224.0.0.0-239.255.255.255); format: mcast dev mcast_group port ttl loop
auto_failback on # fail back once the master node recovers
node test-master # master hostname; must match uname -n
node test-backup # standby hostname
crm no # whether to enable CRM
[root@test-master ha.d]# vim haresources
test-master IPaddr::10.96.20.8/24/eth0 # equivalent to running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start — IPaddr is a script under /etc/ha.d/resource.d/
[root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/
authkeys 100% 692 0.7KB/s 00:00
ha.cf 100% 10KB 10.3KB/s 00:00
haresources 100% 5944 5.8KB/s 00:00
[root@test-master ha.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ha.d]# ssh test-backup service heartbeat start
Starting High-Availability services: 2016/08/07_22:39:00 INFO: Resource is stopped
Done.
[root@test-master ha.d]# ps aux | grep heartbeat
root 63089 0.0 3.1 50124 7164 ? SLs 22:38 0:00 heartbeat: master control process
root 63093 0.0 3.1 50076 7116 ? SL 22:38 0:00 heartbeat: FIFO reader
root 63094 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: write: mcast eth0
root 63095 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: read: mcast eth0
root 63136 0.0 0.3 103264 836 pts/0 S 22:39 0:00 grep heartbeat
[root@test-master ha.d]# ssh test-backup ps aux | grep heartbeat
root 3050 0.0 3.1 50124 7164 ? SLs 22:39 0:00 heartbeat: master control process
root 3054 0.0 3.1 50076 7116 ?
SL 22:39 0:00 heartbeat: FIFO reader
root 3055 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: write: mcast eth0
root 3056 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: read: mcast eth0
root 3094 0.0 0.5 106104 1368 ? Ss 22:39 0:00 bash -c ps aux | grep heartbeat
root 3108 0.0 0.3 103264 832 ? S 22:39 0:00 grep heartbeat
[root@test-master ha.d]# netstat -tnulp | grep heartbeat
udp 0 0 225.0.0.11:694 0.0.0.0:* 63094/heartbeat:wr
udp 0 0 0.0.0.0:50268 0.0.0.0:* 63094/heartbeat:wr
[root@test-master ha.d]# ssh test-backup netstat -tnulp | grep heartbeat
udp 0 0 0.0.0.0:58019 0.0.0.0:* 3055/heartbeat:wri
udp 0 0 225.0.0.11:694 0.0.0.0:* 3055/heartbeat:wri
[root@test-master ha.d]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ha.d]# ssh test-backup ip addr | grep 10.96.20
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
[root@test-master ha.d]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test-master ha.d]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
[root@test-master ha.d]# ssh test-backup ip addr | grep 10.96.20
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ha.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ha.d]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ha.d]# ssh test-backup ip addr | grep 10.96.20
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
[root@test-master ~]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test-master ~]# ssh test-backup service heartbeat stop
Stopping High-Availability services: Done.
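The failover checks above are done by grepping `ip addr` for the VIP on each node. As a convenience, that check can be wrapped into a small helper; this is a hedged sketch, not part of the original setup — the script name check_vip.sh is hypothetical, and the VIP default is simply the 10.96.20.8 used in this walkthrough.

```shell
#!/bin/bash
# check_vip.sh -- report whether this node currently holds the cluster VIP.
# Hypothetical helper wrapping the manual `ip addr | grep` checks above.

VIP="${1:-10.96.20.8}"

# has_vip VIP -- read `ip addr` output on stdin; succeed if the VIP is bound.
has_vip() {
    grep -q "inet ${1}/"
}

# Only run the live check when invoked with an argument, so the function
# can also be sourced from other scripts without side effects.
if [ $# -gt 0 ]; then
    if ip addr | has_vip "$VIP"; then
        echo "ACTIVE: this node holds $VIP"
    else
        echo "standby: $VIP is not configured here"
    fi
fi
```

Run it as `./check_vip.sh 10.96.20.8` on either node instead of eyeballing the `ip addr` output.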
2. Installing and configuring drbd

On test-master:
[root@test-master ~]# fdisk -l
……
Disk /dev/sdb: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
[root@test-master ~]# parted /dev/sdb # parted handles disks larger than 2T; split the new disk into two partitions — one for the data, the other for drbd's meta data
GNU Parted 2.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) h
(help output listing align-check, check, cp, help, mklabel/mktable, mkfs, mkpart, mkpartfs, move, name, print, quit, rescue, resize, rm, select, set, toggle, unit and version omitted)
(parted) mklabel gpt
(parted) mkpart primary 0 1024
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? Ignore
(parted) mkpart primary 1025 2147
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel?
Ignore
(parted) p
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 2147MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 1024MB 1024MB primary
2 1025MB 2147MB 1122MB primary
[root@test-master ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@test-master ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
warning: elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
Preparing... ########################################### [100%]
1:elrepo-release ########################################### [100%]
[root@test-master ~]# yum -y install drbd kmod-drbd84
[root@test-master ~]# modprobe drbd
FATAL: Module drbd not found.
[root@test-master ~]# yum -y install kernel* # reboot the system after updating the kernel
[root@test-master ~]# uname -r
2.6.32-642.3.1.el6.x86_64
[root@test-master ~]# depmod
[root@test-master ~]# lsmod | grep drbd
drbd 372759 0
libcrc32c 1246 1 drbd
[root@test-master ~]# ll /usr/src/kernels/
total 12
drwxr-xr-x. 22 root root 4096 Mar 31 06:46 2.6.32-431.el6.x86_64
drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64
drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64.debug
[root@test-master ~]# echo 'modprobe drbd >/dev/null 2>&1' >> /etc/sysconfig/modules/drbd.modules
[root@test-master ~]# cat !$
cat /etc/sysconfig/modules/drbd.modules
modprobe drbd >/dev/null 2>&1

On test-backup:
[root@test-backup ~]# parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary 0 4096
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel?
Ignore
(parted) mkpart primary 4097 5368
(parted) p
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 5369MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 4096MB 4096MB primary
2 4097MB 5368MB 1271MB primary
[root@test-backup ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@test-backup ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@test-backup ~]# ll /etc/yum.repos.d/
total 20
-rw-r--r--. 1 root root 1856 Jul 19 00:28 CentOS6-Base-163.repo
-rw-r--r--. 1 root root 2150 Feb 9 2014 elrepo.repo
-rw-r--r--. 1 root root 957 Nov 4 2012 epel.repo
-rw-r--r--. 1 root root 1056 Nov 4 2012 epel-testing.repo
-rw-r--r--. 1 root root 529 Mar 30 23:00 rhel-source.repo.bak
[root@test-backup ~]# yum -y install drbd kmod-drbd84
[root@test-backup ~]# yum -y install kernel*
[root@test-backup ~]# depmod
[root@test-backup ~]# lsmod | grep drbd
drbd 372759 0
libcrc32c 1246 1 drbd
[root@test-backup ~]# chkconfig drbd off
[root@test-backup ~]# chkconfig --list drbd
drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@test-backup ~]# echo 'modprobe drbd >/dev/null 2>&1' >> /etc/sysconfig/modules/drbd.modules
[root@test-backup ~]# cat !$
cat /etc/sysconfig/modules/drbd.modules
modprobe drbd >/dev/null 2>&1

On test-master:
[root@test-master ~]# vim /etc/drbd.d/global_common.conf
[root@test-master ~]# egrep -v '#|^$' /etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    handlers {
    }
    startup {
    }
    options {
    }
    disk {
        on-io-error detach;
    }
    net {
    }
    syncer {
        rate 50M;
        verify-alg crc32c;
    }
}
[root@test-master ~]# vim /etc/drbd.d/data.res
resource data {
    protocol C;
    on test-master {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 172.16.1.113:7788;
        meta-disk /dev/sdb2[0];
    }
    on test-backup {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 172.16.1.114:7788;
        meta-disk /dev/sdb2[0];
    }
}
[root@test-master ~]# cd /etc/drbd.d
[root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/
global_common.conf 100% 2144 2.1KB/s 00:00
data.res 100% 251
0.3KB/s 00:00
[root@test-master drbd.d]# drbdadm --help
(usage summary listing the general options and the attach/detach/connect/disconnect/up/down/primary/secondary/invalidate/invalidate-remote/outdate/resize/verify/pause-sync/resume-sync/adjust/wait-connect/role/cstate/dstate/dump/create-md/get-gi/dump-md/wipe-md/apply-al commands omitted)
[root@test-master drbd.d]# drbdadm create-md data
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
[root@test-master drbd.d]# ssh test-backup drbdadm create-md data
NOT initializing bitmap
initializing activity log
Writing meta data...
New drbd meta data block successfully created.
[root@test-master drbd.d]# drbdadm up data
[root@test-master drbd.d]# ssh test-backup drbdadm up data
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984
[root@test-master drbd.d]# ssh test-backup cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984
[root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data # run only on the master
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash:
3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:660016
    [..............] sync'ed: 34.3% (660016/999984)K
    finish: 0:00:15 speed: 42,496 (42,496) K/sec
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:369200
    [........] sync'ed: 63.3% (369200/999984)K
    finish: 0:00:09 speed: 39,424 (39,424) K/sec
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:57904
    [..] sync'ed: 94.3% (57904/999984)K
    finish: 0:00:01 speed: 39,196 (39,252) K/sec
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-master drbd.d]# ssh test-backup cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-master drbd.d]# mkdir /drbd
[root@test-master drbd.d]# ssh test-backup mkdir /drbd
[root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0 # run only on the master; do not format the meta partition
Writing superblocks and filesystem
accounting information: done
[root@test-master drbd.d]# tune2fs -c -1 /dev/drbd0
tune2fs 1.41.12 (17-May-2010)
Setting maximal mount count to -1
[root@test-master drbd.d]# mount /dev/drbd0 /drbd
[root@test-master drbd.d]# cd /drbd
[root@test-master drbd]# for i in `seq 1 10`; do touch test$i; done
[root@test-master drbd]# ls
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
[root@test-master drbd]# cd
[root@test-master ~]# umount /dev/drbd0
[root@test-master ~]# drbdadm secondary data
[root@test-master ~]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

On test-backup:
[root@test-backup ~]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-backup ~]# drbdadm primary data
[root@test-backup ~]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-backup ~]# mount /dev/drbd0 /drbd
[root@test-backup ~]# ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

3. Wiring heartbeat and drbd together

[root@test-master ~]# ssh test-backup umount /drbd
[root@test-master ~]# ssh test-backup drbdadm secondary data
[root@test-master ~]# service drbd stop
Stopping all DRBD resources: .
[root@test-master ~]# ssh test-backup service drbd stop
Stopping all DRBD resources: .
[root@test-master ~]# service heartbeat status
heartbeat is stopped.
No process
[root@test-master ~]# ssh test-backup service heartbeat status
heartbeat is stopped. No process
[root@test-master ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}
-rwxr-xr-x. 1 root root 3162 Jan 12 2016 /etc/ha.d/resource.d/drbddisk
-rwxr-xr-x. 1 root root 1903 Dec 2 2013 /etc/ha.d/resource.d/Filesystem
[root@test-master ~]# vim /etc/ha.d/haresources
# Each entry on this line works like a script invoked with arguments, e.g.:
# /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop
# /etc/ha.d/resource.d/drbddisk data start|stop
# /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop
# heartbeat drives the resources in exactly this configured order; when heartbeat misbehaves, check the logs and run these commands by hand to locate the fault.
test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4
[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/
haresources 100% 5996 5.9KB/s 00:00
[root@test-master ~]# service drbd start # run on the master node
Starting DRBD resources: [
     create res: data
   prepare disk: data
    adjust disk: data
     adjust net: data
]..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - If this node was already a degraded cluster before the reboot, the timeout is 0 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot, the timeout is 0 seconds.
[wfc-timeout]
 (These values are for resource 'data'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [ 23]:
[root@test-backup ~]# service drbd start # run on the standby node
Starting DRBD resources: [
     create res: data
   prepare disk: data
    adjust disk: data
     adjust net: data
].
[root@test-master ~]# drbdadm role data
Secondary/Secondary
[root@test-master ~]# ssh test-backup drbdadm role data
Secondary/Secondary
[root@test-master ~]# drbdadm -- --overwrite-data-of-peer primary data
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ~]# ssh test-backup service heartbeat start
Starting High-Availability services: 2016/08/09_03:08:11 INFO: Resource is stopped
Done.
[root@test-master ~]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 6.3G 11G 38% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-master ~]# ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
[root@test-master ~]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test-master ~]# ssh test-backup ip addr | grep 10.96.20
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ~]# ssh test-backup df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 3.9G 13G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-master ~]# ssh test-backup ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
[root@test-master ~]# drbdadm role data
Secondary/Primary
[root@test-master ~]# service
heartbeat start # after the master node recovers, first make sure drbd is back in a sane state, then start heartbeat
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 6.3G 11G 38% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-master ~]# ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

Note: if either end shows Primary/Unknown or Secondary/Unknown, recover like this:
# service heartbeat stop — stop heartbeat on both ends
# drbdadm secondary data — demote the standby node's drbd to Secondary
# drbdadm disconnect data
# drbdadm -- --discard-my-data connect data
# drbdadm role data
# drbdadm connect data — run on the master node

4. Installing and configuring nfs

Do the following on both master nodes and on nfs slave1:
[root@test-master ~]# yum -y groupinstall 'NFS file server'
[root@test-master ~]# rpm -qa nfs-utils rpcbind
nfs-utils-1.2.3-70.el6_8.1.x86_64
rpcbind-0.2.0-12.el6.x86_64
[root@test-master ~]# service rpcbind start
[root@test-master ~]# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]
[root@test-master ~]# chkconfig rpcbind on
[root@test-master ~]# chkconfig nfs on
[root@test-master ~]# chkconfig --list rpcbind
rpcbind 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@test-master ~]# chkconfig --list nfs
nfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off

On the two master nodes:
[root@test-master ~]# vim /etc/exports
/drbd 10.96.20.*(rw,sync,all_squash,anonuid=65534,anongid=65534,mp,fsid=2)
[root@test-master ~]# chmod 777 -R /drbd
[root@test-master ~]# service nfs reload # equivalent to: exportfs -r

5. Testing

With heartbeat running on both masters, mounting from nfs-slave works normally.
[root@test-master ~]# service heartbeat stop
Stopping High-Availability services:
/sbin/service: line 66: 17235 Killed env -i PATH=$PATH TERM=$TERM ${SERVICEDIR}/${SERVICE}
${OPTIONS}
[root@test-master ~]# tail -f /var/log/ha-log # during this failover test, stopping heartbeat could never unmount the mounted partition, and in the end the server gets forcibly rebooted
Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:21 INFO: No processes on /drbd were signalled. force_unmount is
Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:22 ERROR: Couldn't unmount /drbd; trying cleanup with KILL
Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:22 INFO: No processes on /drbd were signalled. force_unmount is
Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:23 ERROR: Couldn't unmount /drbd, giving up!
/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[19783]: 2016/08/09_04:36:23 ERROR: Generic error
ResourceManager(default)[17256]: 2016/08/09_04:36:23 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[20014]: 2016/08/09_04:36:23 INFO: Running OK
ResourceManager(default)[17256]: 2016/08/09_04:36:23 CRIT: Resource STOP failure. Reboot required!
ResourceManager(default)[17256]: 2016/08/09_04:36:23 CRIT: Killing heartbeat ungracefully!
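The notes earlier suggest that nfs clients should monitor their mounted share and remount it when reads hang during a failover like the one logged above. A hedged sketch of that watchdog idea follows; the script name, mount point /mnt/nfs and the cron usage are assumptions, not part of the original setup — only the VIP:/drbd export comes from this walkthrough.

```shell
#!/bin/bash
# nfs_watchdog.sh -- hypothetical sketch of the "nfs client remounts on
# failure" idea from the notes; mount point and timings are assumptions.

MNT="${1:-/mnt/nfs}"
SRC="${2:-10.96.20.8:/drbd}"

# mount_readable DIR SECS -- succeed if listing DIR finishes within SECS.
# A hung nfs mount blocks ls indefinitely, hence the timeout(1) wrapper.
mount_readable() {
    timeout "${2:-5}" ls "$1" >/dev/null 2>&1
}

# remount DIR SRC -- lazy-unmount the stale handle, then mount it again.
remount() {
    umount -l "$1" 2>/dev/null
    mount -t nfs "$2" "$1"
}

# Only act when invoked with arguments, so the functions can be sourced.
if [ $# -gt 0 ]; then
    if ! mount_readable "$MNT" 5; then
        echo "$(date) $MNT unreadable, remounting from $SRC"
        remount "$MNT" "$SRC"
    fi
fi
```

It could run from cron on each nfs slave, e.g. `* * * * * /usr/local/sbin/nfs_watchdog.sh /mnt/nfs 10.96.20.8:/drbd`.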
[root@test-backup ~]# drbdadm role data # after the master's server rebooted, the standby has taken over
Primary/Unknown
[root@test-backup ~]# ip addr
……
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff
    inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
    inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
    inet6 fe80::20c:29ff:fe15:e6bb/64 scope link
       valid_lft forever preferred_lft forever
[root@test-backup ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 3.9G 13G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-backup ~]# ls /drbd
lost+found test111 test2 test222.txt test3 test4 test5 test6 test7 test8 test9

Hot standby between the two master nodes now works, but the nfs slave hangs and cannot mount. The nfs server (nfs master-active) keeps state about its clients' mounts, so after a failover the nfs server must be restarted. The fix: add a script to heartbeat's haresources so that nfs is restarted on every switchover.

Stop drbd and heartbeat on both master nodes, then:
[root@test-master ~]# vim /etc/ha.d/haresources
test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 killnfs
[root@test-master ~]# cd /etc/ha.d/resource.d/
[root@test-master resource.d]# vim killnfs
---------------script start-------------
#!/bin/bash
#
for i in {1..10}; do
    killall nfsd
done
service nfs start
exit 0
----------------script end--------------
[root@test-master resource.d]# chmod 755 killnfs
[root@test-master resource.d]# ll killnfs
-rwxr-xr-x.
1 root root 79 Aug 9 21:02 killnfs
[root@test-master resource.d]# scp killnfs root@test-backup:/etc/ha.d/resource.d/
killnfs 100% 79 0.1KB/s 00:00
[root@test-master resource.d]# cd ..
[root@test-master ha.d]# scp haresources root@test-backup:/etc/ha.d/
haresources 100% 6003 5.9KB/s 00:00

Put drbd back in order, start heartbeat again and retest: the nfs slave now rides out a master switchover normally, with no failed or hanging mounts. Note: the one big precondition when debugging is to make sure drbd is healthy before starting heartbeat — do that and there are no problems.

Note: the evolution of Ganji's image architecture.

Note: after a user uploads an image to a web server, the web server POSTs the image to the image server selected by the configured ID; the PHP on that image server receives the POSTed image, writes it to local disk, and returns a success status code; on success, the front-end web server writes the image server's ID and the image path into the DB server; when a user requests a page, the app reads the image server ID and image URL from the DB and fetches the image from that image server.

Reprinted from: https://blog.51cto.com/jowin/1837154
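Appendix: the inotify+rsync master-to-slave sync that the architecture notes rely on can be sketched roughly as below. This is a hedged illustration, not the original author's script: the slave address, the rsync daemon module name 'data', the rsync user and the password-file path are all assumptions, and real deployments typically batch events or use sersync instead of one rsync run per event.

```shell
#!/bin/bash
# Sketch of inotify+rsync push replication from the nfs master to one slave.
# Requires inotify-tools on the master; slave address, module name 'data',
# rsync_user and the password file path are illustrative assumptions.

SRC="/drbd/"
SLAVE="10.96.20.115"

sync_loop() {
    # Watch the tree recursively and emit one line per filesystem event.
    inotifywait -mrq -e create,delete,moved_to,close_write --format '%w%f' "$SRC" |
    while read -r changed; do
        # Mirror the whole tree; --delete keeps the slave an exact copy.
        rsync -az --delete "$SRC" "rsync_user@${SLAVE}::data" \
            --password-file=/etc/rsync.password >/dev/null 2>&1
    done
}
```

A startup script would call sync_loop in the background (e.g. with nohup) once the drbd filesystem is mounted — which also matches the earlier advice to start the sync only after heartbeat has brought the resources up.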