Following up on the previous post, a detailed deployment guide for a GlusterFS distributed storage cluster on CentOS 7, this article adds some supplementary notes, in the hope of deepening understanding of and familiarity with day-to-day GlusterFS storage operations.
========================Cleaning up the GlusterFS storage environment=========================
As noted above, this GlusterFS storage cluster has four nodes:
[root@GlusterFS-master ~]# cat /etc/hosts
.......
192.168.10.239  GlusterFS-master
192.168.10.212  GlusterFS-slave
192.168.10.204  GlusterFS-slave2
192.168.10.220  GlusterFS-slave3

Now delete the storage directory /opt/gluster/data on all four nodes:
[root@GlusterFS-master ~]# rm -rf /opt/gluster
[root@GlusterFS-slave ~]# rm -rf /opt/gluster
[root@GlusterFS-slave2 ~]# rm -rf /opt/gluster
[root@GlusterFS-slave3 ~]# rm -rf /opt/gluster

[root@GlusterFS-master ~]# gluster volume list
models
[root@GlusterFS-master ~]# gluster volume info

Volume Name: models
Type: Distributed-Replicate
Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a
Status: Stopped
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/opt/gluster/data
Brick2: 192.168.10.212:/opt/gluster/data
Brick3: 192.168.10.204:/opt/gluster/data
Brick4: 192.168.10.220:/opt/gluster/data
Options Reconfigured:
performance.write-behind: on
performance.io-thread-count: 32
performance.flush-behind: on
performance.cache-size: 128MB
features.quota: on

Next, delete the previously created models volume:
[root@GlusterFS-master ~]# gluster volume stop models
[root@GlusterFS-master ~]# gluster volume delete models
[root@GlusterFS-master ~]# gluster volume info
No volumes present

Check the cluster peers. As shown below, the cluster reports three peers. Because the command is run
on 192.168.10.239, that node does not list itself; running the same command on any other node would
show 192.168.10.239 as a peer.
[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 3

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

Then detach the nodes from the cluster one by one (a node cannot detach itself from its own gluster shell):
[root@GlusterFS-master ~]# gluster          //commands can also be run from the interactive gluster shell
gluster> peer detach 192.168.10.220
peer detach: success
gluster> peer detach 192.168.10.204
peer detach: success
gluster> peer detach 192.168.10.212
peer detach: success
gluster> peer detach 192.168.10.239
peer detach: failed: 192.168.10.239 is localhost     //a node cannot detach itself; do it from another node
gluster>

Log in to another node and detach the 192.168.10.239 node from the cluster:
[root@GlusterFS-slave ~]# gluster
gluster> peer detach 192.168.10.239
peer detach: success
gluster>

Check the cluster again; no peers remain:
[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 0

All gluster commands can also be executed from the interactive gluster shell:
[root@GlusterFS-master ~]# gluster
gluster> volume info
No volumes present
gluster> peer status
Number of Peers: 0
gluster>
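For convenience, the teardown steps above can be collected into one script. This is only a minimal sketch run from a single node, assuming the volume name models and the peer addresses used in this article; the brick directories still have to be removed on each node separately, and the local node must be detached from one of the remaining peers. The --mode=script flag suppresses the interactive confirmation prompts.

#!/bin/bash
# Sketch: tear down the models volume and detach the peers (run on one node).
VOL=models
PEERS="192.168.10.212 192.168.10.204 192.168.10.220"

gluster --mode=script volume stop "$VOL"
gluster --mode=script volume delete "$VOL"

for p in $PEERS; do
    gluster peer detach "$p"
done
# Note: this node cannot detach itself; run "gluster peer detach <this-node>"
# from any other node before it leaves the pool.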
=====================Creating a virtual partition with dd and creating the storage directory=====================
First use dd to create a virtual partition and set up the storage directory.
[root@GlusterFS-master ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   36G  1.8G   34G   5% /
devtmpfs                 2.9G     0  2.9G   0% /dev
tmpfs                    2.9G     0  2.9G   0% /dev/shm
tmpfs                    2.9G  8.5M  2.9G   1% /run
tmpfs                    2.9G     0  2.9G   0% /sys/fs/cgroup
/dev/vda1               1014M  143M  872M  15% /boot
/dev/mapper/centos-home   18G   33M   18G   1% /home
tmpfs                    581M     0  581M   0% /run/user/0

Use dd to carve out a virtual partition, format it, and mount it under /data:
[root@GlusterFS-master ~]# dd if=/dev/vda1 of=/dev/vdb1
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 2.0979 s, 512 MB/s

[root@GlusterFS-master ~]# du -sh /dev/vdb1
1.0G    /dev/vdb1

[root@GlusterFS-master ~]# mkfs.xfs -f /dev/vdb1      //formatted as xfs here; ext4 would also work
meta-data=/dev/vdb1              isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@GlusterFS-master ~]# mkdir /data
[root@GlusterFS-master ~]# mount /dev/vdb1 /data
[root@GlusterFS-master ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   36G  1.8G   34G   5% /
devtmpfs                 2.9G   34M  2.8G   2% /dev
tmpfs                    2.9G     0  2.9G   0% /dev/shm
tmpfs                    2.9G  8.5M  2.9G   1% /run
tmpfs                    2.9G     0  2.9G   0% /sys/fs/cgroup
/dev/vda1               1014M  143M  872M  15% /boot
/dev/mapper/centos-home   18G   33M   18G   1% /home
tmpfs                    581M     0  581M   0% /run/user/0
/dev/loop0               976M  2.6M  907M   1% /data

[root@GlusterFS-master ~]# fdisk -l
.......
Disk /dev/loop0: 1073 MB, 1073741824 bytes, 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Because /dev/vdb1 here is actually a regular file rather than a block device, the mount goes through a loop device, which is why df and fdisk show /dev/loop0.

Configure automatic mounting at boot:
[root@GlusterFS-master ~]# echo '/dev/loop0 /data xfs defaults 1 2' >> /etc/fstab

Then create the gluster storage directory:
[root@GlusterFS-master ~]# mkdir /data/gluster

The steps above must be repeated on all four nodes to prepare the storage directory environment!
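An equivalent way to get a file-backed test brick, without dd-copying an existing partition, is to fill a plain file from /dev/zero and mount it through a loop device. A minimal sketch follows; the backing file path /gluster_disk.img is an assumption, everything else mirrors the steps above.

#!/bin/bash
# Sketch: create a 1 GiB file-backed XFS "virtual partition" and mount it at /data.
set -e
dd if=/dev/zero of=/gluster_disk.img bs=1M count=1024   # 1 GiB backing file
mkfs.xfs -f /gluster_disk.img                           # format the file as XFS
mkdir -p /data
mount -o loop /gluster_disk.img /data                   # a loop device is attached automatically
mkdir -p /data/gluster                                  # brick directory

# Persist across reboots using the file path instead of a fixed /dev/loopN name
echo '/gluster_disk.img /data xfs loop,defaults 0 0' >> /etc/fstab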
================Creating a distributed (hash) volume and related management operations================
Next, add the nodes back into the cluster. These commands are run on the GlusterFS-master node:
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.212
peer probe: success.
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.204
peer probe: success.
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.220
peer probe: success.

[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 3

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

Log in to any other node and check; GlusterFS-master (192.168.10.239) now shows up in the cluster as well:
[root@GlusterFS-slave ~]# gluster peer status
Number of Peers: 3

Hostname: GlusterFS-master
Uuid: 5dfd40e2-096b-40b5-bee3-003b57a39007
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

----------------------------------------------------------
Now create the volume. This is done on the GlusterFS-master machine; by default a distributed (hash)
volume is created. As shown below, the gluster directory is created automatically under /data on the
192.168.10.212 node (it does not need to be created manually beforehand).
[root@GlusterFS-master ~]# gluster volume create gluster_data 192.168.10.212:/data/gluster force
volume create: gluster_data: success: please start the volume to access data

Log in to 192.168.10.212 and verify; the gluster directory was indeed created automatically under /data:
[root@GlusterFS-slave ~]# ls /data/gluster
[root@GlusterFS-slave ~]# ls /data/          //the /data partition is 1 GB
gluster

Start the volume and check its status:
[root@GlusterFS-master ~]# gluster volume start gluster_data
volume start: gluster_data: success
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data
Type: Distribute
Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 192.168.10.212:/data/gluster

Mounting the volume
On the client machine, mount the GlusterFS storage.
Note: since the brick added above is on the 192.168.10.212 node, the client mounts the storage from 192.168.10.212.
[root@Client ~]# mkdir /opt/gfsmount
[root@Client ~]# mount -t glusterfs 192.168.10.212:gluster_data /opt/gfsmount
[root@Client ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/centos-root        38G  4.3G   33G  12% /
devtmpfs                      1.9G     0  1.9G   0% /dev
tmpfs                         1.9G     0  1.9G   0% /dev/shm
tmpfs                         1.9G  8.6M  1.9G   1% /run
tmpfs                         1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1                    1014M  143M  872M  15% /boot
/dev/mapper/centos-home        19G   33M   19G   1% /home
tmpfs                         380M     0  380M   0% /run/user/0
overlay                        38G  4.3G   33G  12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged
shm                            64M     0   64M   0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm
192.168.10.212:gluster_data  1014M   33M  982M   4% /opt/gfsmount

As shown above, the GlusterFS storage is mounted with a size of 1 GB (the space of the /data partition on
192.168.10.212 that holds the storage directory). Remember: whichever partition a node's storage directory
lives on, that is the space the client gets after mounting.

Test writing data under the client mount point:
[root@Client gfsmount]# mkdir test
[root@Client gfsmount]# touch kevin
[root@Client gfsmount]# ls
kevin  test

On 192.168.10.212 the data shows up in the storage directory as expected:
[root@GlusterFS-slave ~]# cd /data/gluster/
[root@GlusterFS-slave gluster]# ls
kevin  test

----------------------------------------------------------
Adding bricks (i.e., expanding the volume)
[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.239:/data/gluster force
volume add-brick: success
[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.204:/data/gluster force
volume add-brick: success
[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.220:/data/gluster force
volume add-brick: success

Again, the gluster directory is created automatically under /data on these three nodes!

Check the volume status:
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data
Type: Distribute
Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.212:/data/gluster
Brick2: 192.168.10.239:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster

Back on the client, the mount point capacity has grown from 1 GB to 4 GB (the sum of the partitions backing
the storage directories on the four nodes):
[root@Client ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/centos-root        38G  4.3G   33G  12% /
devtmpfs                      1.9G     0  1.9G   0% /dev
tmpfs                         1.9G     0  1.9G   0% /dev/shm
tmpfs                         1.9G  8.6M  1.9G   1% /run
tmpfs                         1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1                    1014M  143M  872M  15% /boot
/dev/mapper/centos-home        19G   33M   19G   1% /home
tmpfs                         380M     0  380M   0% /run/user/0
overlay                        38G  4.3G   33G  12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged
shm                            64M     0   64M   0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm
192.168.10.212:gluster_data   4.0G  130M  3.9G   4% /opt/gfsmount

Summary of the observations above:
1) The capacity of the client mount point is the sum of the partitions hosting the four nodes' storage directories.
2) A directory created under the client mount point is created in every node's storage directory (i.e., every brick).
3) Files created inside such a directory are hash-distributed across the nodes' storage directories.
4) Files created directly at the top level of the client mount point were stored only on the mounted node
   (here, under /data/gluster on 192.168.10.212) and were not distributed to the other nodes (the root
   directory's hash layout still covers only the original brick until a rebalance is run).
5) Removing a brick causes some data loss, because the removed node holds part of the data.

For example:
a) The client creates a directory kevin under the mount point
[root@Client ~]# cd /opt/gfsmount/
[root@Client gfsmount]# mkdir kevin

Check on the nodes:
[root@GlusterFS-master ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave2 ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave3 ~]# ls /data/gluster/
kevin

b) The client creates files inside that directory under the mount point
[root@Client gfsmount]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /opt/gfsmount/kevin/copy-test-$i; done
[root@Client gfsmount]# ls kevin/
copy-test-001 copy-test-002 copy-test-003 copy-test-004 copy-test-005 copy-test-006 copy-test-007 copy-test-008
copy-test-009 copy-test-010 copy-test-011 copy-test-012 copy-test-013 copy-test-014 copy-test-015 copy-test-016
copy-test-017 copy-test-018 copy-test-019 copy-test-020 copy-test-021 copy-test-022 copy-test-023 copy-test-024
copy-test-025 copy-test-026 copy-test-027 copy-test-028 copy-test-029 copy-test-030 copy-test-031 copy-test-032
copy-test-033 copy-test-034 copy-test-035 copy-test-036 copy-test-037 copy-test-038 copy-test-039 copy-test-040
copy-test-041 copy-test-042 copy-test-043 copy-test-044 copy-test-045 copy-test-046 copy-test-047 copy-test-048
copy-test-049 copy-test-050 copy-test-051 copy-test-052 copy-test-053 copy-test-054 copy-test-055 copy-test-056
copy-test-057 copy-test-058 copy-test-059 copy-test-060 copy-test-061 copy-test-062 copy-test-063 copy-test-064
copy-test-065 copy-test-066 copy-test-067 copy-test-068 copy-test-069 copy-test-070 copy-test-071 copy-test-072
copy-test-073 copy-test-074 copy-test-075 copy-test-076 copy-test-077 copy-test-078 copy-test-079 copy-test-080
copy-test-081 copy-test-082 copy-test-083 copy-test-084 copy-test-085 copy-test-086 copy-test-087 copy-test-088
copy-test-089 copy-test-090 copy-test-091 copy-test-092 copy-test-093 copy-test-094 copy-test-095 copy-test-096
copy-test-097 copy-test-098 copy-test-099 copy-test-100

Check on the nodes: the files are hash-distributed across the four bricks.
[root@GlusterFS-master ~]# ls /data/gluster/kevin/
copy-test-002 copy-test-014 copy-test-036 copy-test-045 copy-test-056 copy-test-070 copy-test-075 copy-test-097
copy-test-009 copy-test-020 copy-test-042 copy-test-047 copy-test-062 copy-test-071 copy-test-080
copy-test-010 copy-test-027 copy-test-043 copy-test-053 copy-test-064 copy-test-072 copy-test-084
copy-test-013 copy-test-035 copy-test-044 copy-test-055 copy-test-068 copy-test-074 copy-test-092
[root@GlusterFS-master ~]# ll /data/gluster/kevin/|wc -l
30

[root@GlusterFS-slave ~]# ls /data/gluster/kevin/
copy-test-003 copy-test-018 copy-test-037 copy-test-050 copy-test-061 copy-test-069 copy-test-089
copy-test-005 copy-test-025 copy-test-040 copy-test-058 copy-test-066 copy-test-076 copy-test-091
copy-test-007 copy-test-026 copy-test-049 copy-test-059 copy-test-067 copy-test-085 copy-test-096
[root@GlusterFS-slave ~]# ll /data/gluster/kevin/|wc -l
22

[root@GlusterFS-slave2 gluster]# ls /data/gluster/kevin/
copy-test-004 copy-test-016 copy-test-024 copy-test-046 copy-test-065 copy-test-082 copy-test-088 copy-test-099
copy-test-006 copy-test-017 copy-test-029 copy-test-048 copy-test-078 copy-test-086 copy-test-093
copy-test-015 copy-test-023 copy-test-033 copy-test-052 copy-test-079 copy-test-087 copy-test-095
[root@GlusterFS-slave2 gluster]# ll /data/gluster/kevin/|wc -l
23

[root@GlusterFS-slave3 ~]# ls /data/gluster/kevin/
copy-test-001 copy-test-019 copy-test-030 copy-test-038 copy-test-054 copy-test-073 copy-test-090
copy-test-008 copy-test-021 copy-test-031 copy-test-039 copy-test-057 copy-test-077 copy-test-094
copy-test-011 copy-test-022 copy-test-032 copy-test-041 copy-test-060 copy-test-081 copy-test-098
copy-test-012 copy-test-028 copy-test-034 copy-test-051 copy-test-063 copy-test-083 copy-test-100
[root@GlusterFS-slave3 ~]# ll /data/gluster/kevin/|wc -l
29

c) Files created directly under the client mount point itself are stored only on the mounted node
   (here, under /data/gluster on 192.168.10.212); the other nodes do not receive them.
[root@Client gfsmount]# for i in `seq -w 1 30`; do cp -rp /var/log/messages /opt/gfsmount/haha-test-$i; done
[root@Client gfsmount]# ls
haha-test-01 haha-test-05 haha-test-09 haha-test-13 haha-test-17 haha-test-21 haha-test-25 haha-test-29
haha-test-02 haha-test-06 haha-test-10 haha-test-14 haha-test-18 haha-test-22 haha-test-26 haha-test-30
haha-test-03 haha-test-07 haha-test-11 haha-test-15 haha-test-19 haha-test-23 haha-test-27 kevin
haha-test-04 haha-test-08 haha-test-12 haha-test-16 haha-test-20 haha-test-24 haha-test-28

Check the nodes: only 192.168.10.212 has the 30 files created above; the other three nodes do not.
[root@GlusterFS-master ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave ~]# ls /data/gluster/
haha-test-01 haha-test-05 haha-test-09 haha-test-13 haha-test-17 haha-test-21 haha-test-25 haha-test-29
haha-test-02 haha-test-06 haha-test-10 haha-test-14 haha-test-18 haha-test-22 haha-test-26 haha-test-30
haha-test-03 haha-test-07 haha-test-11 haha-test-15 haha-test-19 haha-test-23 haha-test-27 kevin
haha-test-04 haha-test-08 haha-test-12 haha-test-16 haha-test-20 haha-test-24 haha-test-28
[root@GlusterFS-slave2 ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave3 ~]# ls /data/gluster/
kevin

d) Removing a brick causes some data loss, because the removed node holds part of the data.
Remove one node's brick from the volume:
[root@GlusterFS-master ~]# gluster volume remove-brick gluster_data 192.168.10.220:/data/gluster
Usage: volume remove-brick <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force>
[root@GlusterFS-master ~]# gluster
gluster> volume remove-brick gluster_data 192.168.10.220:/data/gluster force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: success
gluster>

Check the volume info; the mount point capacity has dropped as well (192.168.10.220's storage is gone):
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data
Type: Distribute
Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 192.168.10.212:/data/gluster
Brick2: 192.168.10.239:/data/gluster
Brick3: 192.168.10.204:/data/gluster

The client mount point is now down to 3 GB:
[root@Client ~]# df -h
.......
192.168.10.212:gluster_data  3.0G   98M  2.9G   4% /opt/gfsmount
[root@Client ~]# ll /opt/gfsmount/kevin|wc -l
73

Some of the 100 files created earlier under the kevin directory on the client mount point are missing,
because that portion of the data lived on the 192.168.10.220 node, whose brick has been removed from the
volume, so that data is lost!

Now note the following operation: trying to add the removed 192.168.10.220 brick back in fails!
[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.220:/data/gluster
volume add-brick: failed: Staging failed on 192.168.10.220. Error: /data/gluster is already part of a volume

The directory has to be removed before the brick can be added back:
[root@GlusterFS-slave3 ~]# rm -rf /data/gluster
[root@GlusterFS-slave3 ~]# ll /data
total 0

Adding it again now succeeds:
[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.220:/data/gluster
volume add-brick: success
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data
Type: Distribute
Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.212:/data/gluster
Brick2: 192.168.10.239:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster

After adding it back, the client mount point capacity is back up to 4 GB:
[root@Client ~]# df -h
.......
192.168.10.212:gluster_data  4.0G  131M  3.9G   4% /opt/gfsmount

---------------------------------------------------------------------------------
A rebalance redistributes the files according to the hash layout.
After running a rebalance, the newly added node receives its share of files and directories.
Note: in production, a rebalance is best run while the servers are idle.

As above, the storage directory on the newly re-added 192.168.10.220 node is initially empty:
[root@GlusterFS-slave3 ~]# ls /data/gluster/
[root@GlusterFS-slave3 ~]#

[root@GlusterFS-master ~]# gluster volume rebalance gluster_data start
volume rebalance: gluster_data: success: Initiated rebalance on volume gluster_data.
Execute "gluster volume rebalance <volume-name> status" to check status.
ID: 49277bfa-df25-45c4-b1fb-cbcf8607a23e

Check the storage directory on the newly added 192.168.10.220 node again; it now contains data:
[root@GlusterFS-slave3 ~]# ls /data/gluster/
haha-test-03 haha-test-11 haha-test-14 haha-test-16 haha-test-25 haha-test-30 kevin

After the rebalance, the data under the client mount point is distributed evenly across the nodes:
[root@Client ~]# ls /opt/gfsmount/
haha-test-01 haha-test-05 haha-test-09 haha-test-13 haha-test-17 haha-test-21 haha-test-25 haha-test-29
haha-test-02 haha-test-06 haha-test-10 haha-test-14 haha-test-18 haha-test-22 haha-test-26 haha-test-30
haha-test-03 haha-test-07 haha-test-11 haha-test-15 haha-test-19 haha-test-23 haha-test-27 kevin
haha-test-04 haha-test-08 haha-test-12 haha-test-16 haha-test-20 haha-test-24 haha-test-28
[root@Client ~]# ls /opt/gfsmount/kevin/
copy-test-002 copy-test-014 copy-test-026 copy-test-043 copy-test-053 copy-test-066 copy-test-076 copy-test-088
copy-test-003 copy-test-015 copy-test-027 copy-test-044 copy-test-055 copy-test-067 copy-test-078 copy-test-089
copy-test-004 copy-test-016 copy-test-029 copy-test-045 copy-test-056 copy-test-068 copy-test-079 copy-test-091
copy-test-005 copy-test-017 copy-test-033 copy-test-046 copy-test-058 copy-test-069 copy-test-080 copy-test-092
copy-test-006 copy-test-018 copy-test-035 copy-test-047 copy-test-059 copy-test-070 copy-test-082 copy-test-093
copy-test-007 copy-test-020 copy-test-036 copy-test-048 copy-test-061 copy-test-071 copy-test-084 copy-test-095
copy-test-009 copy-test-023 copy-test-037 copy-test-049 copy-test-062 copy-test-072 copy-test-085 copy-test-096
copy-test-010 copy-test-024 copy-test-040 copy-test-050 copy-test-064 copy-test-074 copy-test-086 copy-test-097
copy-test-013 copy-test-025 copy-test-042 copy-test-052 copy-test-065 copy-test-075 copy-test-087 copy-test-099

[root@GlusterFS-master ~]# ls /data/gluster/
haha-test-06 haha-test-07 haha-test-15 haha-test-19 haha-test-22 haha-test-28 kevin
[root@GlusterFS-master ~]# ls /data/gluster/kevin/
copy-test-002 copy-test-014 copy-test-036 copy-test-047 copy-test-059 copy-test-069 copy-test-080 copy-test-097
copy-test-003 copy-test-018 copy-test-037 copy-test-049 copy-test-061 copy-test-070 copy-test-084
copy-test-005 copy-test-020 copy-test-040 copy-test-050 copy-test-062 copy-test-071 copy-test-085
copy-test-007 copy-test-025 copy-test-042 copy-test-053 copy-test-064 copy-test-072 copy-test-089
copy-test-009 copy-test-026 copy-test-043 copy-test-055 copy-test-066 copy-test-074 copy-test-091
copy-test-010 copy-test-027 copy-test-044 copy-test-056 copy-test-067 copy-test-075 copy-test-092
copy-test-013 copy-test-035 copy-test-045 copy-test-058 copy-test-068 copy-test-076 copy-test-096

[root@GlusterFS-slave ~]# ls /data/gluster/
haha-test-01 haha-test-04 haha-test-08 haha-test-17 haha-test-20 haha-test-26 haha-test-27 haha-test-29 kevin
[root@GlusterFS-slave ~]# ls /data/gluster/kevin/
copy-test-002 copy-test-017 copy-test-040 copy-test-050 copy-test-064 copy-test-074 copy-test-086 copy-test-097
copy-test-004 copy-test-020 copy-test-042 copy-test-052 copy-test-065 copy-test-075 copy-test-087 copy-test-099
copy-test-006 copy-test-023 copy-test-043 copy-test-053 copy-test-066 copy-test-076 copy-test-088
copy-test-009 copy-test-024 copy-test-044 copy-test-055 copy-test-067 copy-test-078 copy-test-089
copy-test-010 copy-test-027 copy-test-045 copy-test-056 copy-test-068 copy-test-079 copy-test-091
copy-test-013 copy-test-029 copy-test-046 copy-test-058 copy-test-069 copy-test-080 copy-test-092
copy-test-014 copy-test-033 copy-test-047 copy-test-059 copy-test-070 copy-test-082 copy-test-093
copy-test-015 copy-test-035 copy-test-048 copy-test-061 copy-test-071 copy-test-084 copy-test-095
copy-test-016 copy-test-036 copy-test-049 copy-test-062 copy-test-072 copy-test-085 copy-test-096

[root@GlusterFS-slave2 ~]# ls /data/gluster/
haha-test-02 haha-test-09 haha-test-12 haha-test-18 haha-test-23 kevin
haha-test-05 haha-test-10 haha-test-13 haha-test-21 haha-test-24
[root@GlusterFS-slave2 ~]# ls /data/gluster/kevin/
copy-test-004 copy-test-016 copy-test-024 copy-test-046 copy-test-065 copy-test-082 copy-test-088 copy-test-099
copy-test-006 copy-test-017 copy-test-029 copy-test-048 copy-test-078 copy-test-086 copy-test-093
copy-test-015 copy-test-023 copy-test-033 copy-test-052 copy-test-079 copy-test-087 copy-test-095

[root@GlusterFS-slave3 ~]# ls /data/gluster/
haha-test-03 haha-test-11 haha-test-14 haha-test-16 haha-test-25 haha-test-30 kevin
[root@GlusterFS-slave3 ~]# ls /data/gluster/kevin/

Check the rebalance status:
[root@GlusterFS-master ~]# gluster volume rebalance gluster_data status
           Node  Rebalanced-files      size  scanned  failures  skipped     status  run time in secs
      ---------  ----------------  --------  -------  --------  -------  ---------  ----------------
      localhost                29   164.8KB      105         0        0  completed              1.00
 192.168.10.212                43   248.7KB      131         0        0  completed              1.00
 192.168.10.204                 0    0Bytes      105         0        0  completed              0.00
 192.168.10.220                 0    0Bytes      105         0        0  completed              0.00
volume rebalance: gluster_data: success:

--------------------------------------------------------------------------
Next, unmount the mount point and stop the volume.
This operation is dangerous, but even after the volume is deleted the data underneath remains:
[root@Client ~]# umount /opt/gfsmount -lf

[root@GlusterFS-master ~]# gluster volume stop gluster_data
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gluster_data: success
[root@GlusterFS-master ~]# gluster volume delete gluster_data
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: gluster_data: success
[root@GlusterFS-master ~]# gluster volume list
No volumes present in cluster
[root@GlusterFS-master ~]# gluster volume info
No volumes present

The gluster_data volume has been deleted, but the data remains under each node's storage directory:
[root@GlusterFS-master ~]# ls /data/gluster/
haha-test-06 haha-test-07 haha-test-15 haha-test-19 haha-test-22 haha-test-28 kevin
[root@GlusterFS-slave ~]# ls /data/gluster/
haha-test-01 haha-test-04 haha-test-08 haha-test-17 haha-test-20 haha-test-26 haha-test-27 haha-test-29 kevin
[root@GlusterFS-slave2 ~]# ls /data/gluster/
haha-test-02 haha-test-09 haha-test-12 haha-test-18 haha-test-23 kevin
haha-test-05 haha-test-10 haha-test-13 haha-test-21 haha-test-24
[root@GlusterFS-slave3 ~]# ls /data/gluster/
haha-test-03 haha-test-11 haha-test-14 haha-test-16 haha-test-25 haha-test-30 kevin

The client mount point no longer shows any data:
[root@Client ~]# ls /opt/gfsmount/
[root@Client ~]#

To purge the data, log in to each node and delete the contents under its brick:
[root@GlusterFS-master ~]# rm -rf /data/gluster
[root@GlusterFS-slave ~]# rm -rf /data/gluster
[root@GlusterFS-slave2 ~]# rm -rf /data/gluster
[root@GlusterFS-slave3 ~]# rm -rf /data/gluster
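The expand-and-rebalance sequence above can also be scripted. A rough sketch, assuming the volume name and brick path from this section; the status check simply greps the CLI output for "in progress", which may need adjusting for other GlusterFS versions.

#!/bin/bash
# Sketch: add a brick to the distributed volume and wait for the rebalance to finish.
set -e
VOL=gluster_data
NEW_BRICK=192.168.10.220:/data/gluster

gluster volume add-brick "$VOL" "$NEW_BRICK" force
gluster volume rebalance "$VOL" start

# Poll until no node reports the rebalance as still running
while gluster volume rebalance "$VOL" status | grep -q "in progress"; do
    sleep 5
done
gluster volume rebalance "$VOL" status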
================Creating a replicated volume and related management operations================
First delete the storage directories that were used above on each node:
[root@GlusterFS-master ~]# rm -rf /data/gluster
[root@GlusterFS-slave ~]# rm -rf /data/gluster
[root@GlusterFS-slave2 ~]# rm -rf /data/gluster
[root@GlusterFS-slave3 ~]# rm -rf /data/gluster

[root@GlusterFS-master ~]# gluster volume info
No volumes present
[root@GlusterFS-master ~]# gluster volume status
No volumes present
[root@GlusterFS-master ~]# gluster volume list
No volumes present in cluster

[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 3

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

Detach the nodes from the cluster:
[root@GlusterFS-master ~]# gluster
gluster> peer detach 192.168.10.220
peer detach: success
gluster> peer detach 192.168.10.204
peer detach: success
gluster> peer detach 192.168.10.212
peer detach: success
gluster> peer detach 192.168.10.239
peer detach: failed: 192.168.10.239 is localhost
gluster>

On another node, detach 192.168.10.239:
[root@GlusterFS-slave ~]# gluster
gluster> peer detach 192.168.10.239
peer detach: success

Check the cluster; no peer information remains:
[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 0

----------------------------------------------------------------------
Now create a replicated volume, as follows.

First re-add a peer to the cluster:
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.212
peer probe: success.
[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 1

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Create the replicated volume; "replica 2" means two copies:
[root@GlusterFS-master ~]# gluster volume create gluster_share replica 2 192.168.10.239:/data/gluster 192.168.10.212:/data/gluster force
volume create: gluster_share: success: please start the volume to access data
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster

The storage directory /data/gluster is generated automatically on both nodes:
[root@GlusterFS-master ~]# ll /data/
total 0
drwxr-xr-x. 2 root root 6 Apr 10 12:58 gluster
[root@GlusterFS-slave ~]# ll /data/
total 0
drwxr-xr-x. 2 root root 6 Apr 10 12:59 gluster

Start the volume and check its status:
[root@GlusterFS-master ~]# gluster volume list
gluster_share
[root@GlusterFS-master ~]# gluster volume start gluster_share
volume start: gluster_share: success
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster

Mount on the client and create some files; the capacity equals a single node's capacity, because this is a replicated volume:
[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/
[root@Client ~]# df -h
.......
192.168.10.239:gluster_share  1014M   33M  982M   4% /opt/gfsmount

Write data at the client mount point:
[root@Client gfsmount]# mkdir test
[root@Client gfsmount]# touch kevin grace

Because the two nodes are replicas of each other, their contents are identical:
[root@GlusterFS-master ~]# ls /data/gluster/
grace  kevin  test
[root@GlusterFS-slave ~]# ls /data/gluster/
grace  kevin  test

------------------------------------------------------------------
Simulating accidental deletion of volume metadata

Delete the volume metadata; it is stored under the path below:
[root@GlusterFS-master ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/
bricks                                         gluster_share-rebalance.vol  rbstate
cksum                                          gluster_share.tcp-fuse.vol   run
gluster_share.192.168.10.212.data-gluster.vol  info                         snapd.info
gluster_share.192.168.10.239.data-gluster.vol  node_state.info              trusted-gluster_share.tcp-fuse.vol
[root@GlusterFS-slave ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/
bricks                                         gluster_share-rebalance.vol  rbstate
cksum                                          gluster_share.tcp-fuse.vol   run
gluster_share.192.168.10.212.data-gluster.vol  info                         snapd.info
gluster_share.192.168.10.239.data-gluster.vol  node_state.info              trusted-gluster_share.tcp-fuse.vol

As an example, delete the volume metadata on the 192.168.10.212 node (GlusterFS-slave):
[root@GlusterFS-slave ~]# rm -rf /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/
[root@GlusterFS-slave ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/
ls: cannot access /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/: No such file or directory

Restoring the volume metadata
Sync the metadata back over: it was deleted on 192.168.10.212, but it is still intact on its replica peer 192.168.10.239.
In the command below, "all" syncs the metadata of all volumes; the volume name gluster_share could be given instead.
Important: back up this volume metadata regularly!
[root@GlusterFS-master ~]# gluster volume sync 192.168.10.239 all     //a node cannot sync from itself
Sync volume may make data inaccessible while the sync is in progress. Do you want to continue? (y/n) y
volume sync: failed: sync from localhost not allowed
[root@GlusterFS-slave ~]# gluster volume sync 192.168.10.239 all      //so run it on another node in the cluster
Sync volume may make data inaccessible while the sync is in progress. Do you want to continue? (y/n) y
volume sync: success

Check the 192.168.10.212 node again; the deleted volume metadata has been restored:
[root@GlusterFS-slave ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/
bricks                                         gluster_share-rebalance.vol  rbstate
cksum                                          gluster_share.tcp-fuse.vol   run
gluster_share.192.168.10.212.data-gluster.vol  info                         snapd.info
gluster_share.192.168.10.239.data-gluster.vol  node_state.info              trusted-gluster_share.tcp-fuse.vol

------------------------------------------------------------------
Adding more nodes
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.204
peer probe: success.
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.220
peer probe: success.

[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 3

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

Check the volume info:
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster

Expanding the volume (adding the two new nodes' bricks to the gluster_share volume above)

First unmount on the client:
[root@Client ~]# umount /opt/gfsmount/

Then stop the gluster_share volume:
[root@GlusterFS-master ~]# gluster volume stop gluster_share
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gluster_share: success
[root@GlusterFS-master ~]# gluster volume status gluster_share
Volume gluster_share is not started

Then expand the volume:
[root@GlusterFS-master ~]# gluster volume add-brick gluster_share 192.168.10.204:/data/gluster 192.168.10.220:/data/gluster force
volume add-brick: success

Check the volume info again; the new bricks have been added:
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Distributed-Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Stopped
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster

Then start the gluster_share volume again:
[root@GlusterFS-master ~]# gluster volume start gluster_share
volume start: gluster_share: success
[root@GlusterFS-master ~]# gluster volume status gluster_share
Status of volume: gluster_share
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/data/gluster              49155   Y       8907
Brick 192.168.10.212:/data/gluster              49156   Y       4049
Brick 192.168.10.204:/data/gluster              49156   Y       11447
Brick 192.168.10.220:/data/gluster              49157   Y       16714
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A
NFS Server on 192.168.10.212                    N/A     N       N/A
Self-heal Daemon on 192.168.10.212              N/A     N       N/A
NFS Server on 192.168.10.220                    N/A     N       N/A
Self-heal Daemon on 192.168.10.220              N/A     N       N/A
NFS Server on 192.168.10.204                    N/A     N       N/A
Self-heal Daemon on 192.168.10.204              N/A     N       N/A

Task Status of Volume gluster_share
------------------------------------------------------------------------------
There are no active volume tasks

As shown above, the Online column is "Y" for all four bricks.

Remount the storage on the client:
[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/
[root@Client ~]# df -h
.......
192.168.10.239:gluster_share  2.0G   65M  2.0G   4% /opt/gfsmount

The client mount point is now 2 GB, i.e., the combined size of the two replica sets.

Writing test data from the client shows that:
1) The storage directories of the newly added nodes do not receive the data that already existed on the other
   nodes; they only receive newly written data.
2) Because the volume was created with two copies (replica 2), without a rebalance all file data written by the
   client goes only to the original replica pair; the two newly added nodes receive nothing. Directories created
   on the client appear on all four nodes, but files go only to the original two.

Next, rebalance. Rebalancing a volume requires at least two brick storage units:
[root@GlusterFS-master ~]# gluster volume rebalance gluster_share start
volume rebalance: gluster_share: success: Initiated rebalance on volume gluster_share.
Execute "gluster volume rebalance <volume-name> status" to check status.
ID: 26035833-3c20-4822-b065-7a5e15d30b85

[root@GlusterFS-master ~]# gluster volume rebalance gluster_share status
           Node  Rebalanced-files      size  scanned  failures  skipped     status  run time in secs
      ---------  ----------------  --------  -------  --------  -------  ---------  ----------------
      localhost               103    10.3MB      305         0        0  completed              1.00
 192.168.10.212                 0    0Bytes      210         0        0  completed              0.00
 192.168.10.204                11    58.9KB      213         0        0  completed              0.00
 192.168.10.220                 0    0Bytes      210         0        0  completed              0.00
volume rebalance: gluster_share: success:

Then delete the data on the client and write fresh data to test:
[root@Client ~]# rm -rf /opt/gfsmount/*
[root@Client ~]# cd /opt/gfsmount/
[root@Client gfsmount]# ls
[root@Client gfsmount]# mkdir kevin

All four nodes have the kevin directory:
[root@GlusterFS-master ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave2 ~]# ls /data/gluster/
kevin
[root@GlusterFS-slave3 ~]# ls /data/gluster/
kevin

Then write data in bulk:
[root@Client gfsmount]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /opt/gfsmount/haha-test-$i; done
[root@Client gfsmount]# ls
haha-test-001 haha-test-002 haha-test-003 haha-test-004 haha-test-005 haha-test-006 haha-test-007 haha-test-008
haha-test-009 haha-test-010 haha-test-011 haha-test-012 haha-test-013 haha-test-014 haha-test-015 haha-test-016
haha-test-017 haha-test-018 haha-test-019 haha-test-020 haha-test-021 haha-test-022 haha-test-023 haha-test-024
haha-test-025 haha-test-026 haha-test-027 haha-test-028 haha-test-029 haha-test-030 haha-test-031 haha-test-032
haha-test-033 haha-test-034 haha-test-035 haha-test-036 haha-test-037 haha-test-038 haha-test-039 haha-test-040
haha-test-041 haha-test-042 haha-test-043 haha-test-044 haha-test-045 haha-test-046 haha-test-047 haha-test-048
haha-test-049 haha-test-050 haha-test-051 haha-test-052 haha-test-053 haha-test-054 haha-test-055 haha-test-056
haha-test-057 haha-test-058 haha-test-059 haha-test-060 haha-test-061 haha-test-062 haha-test-063 haha-test-064
haha-test-065 haha-test-066 haha-test-067 haha-test-068 haha-test-069 haha-test-070 haha-test-071 haha-test-072
haha-test-073 haha-test-074 haha-test-075 haha-test-076 haha-test-077 haha-test-078 haha-test-079 haha-test-080
haha-test-081 haha-test-082 haha-test-083 haha-test-084 haha-test-085 haha-test-086 haha-test-087 haha-test-088
haha-test-089 haha-test-090 haha-test-091 haha-test-092 haha-test-093 haha-test-094 haha-test-095 haha-test-096
haha-test-097 haha-test-098 haha-test-099 haha-test-100 kevin

The files are split between the two replica pairs: one set lands on 192.168.10.239/192.168.10.212 (replicas
of each other) and the other on 192.168.10.204/192.168.10.220:
[root@GlusterFS-master ~]# ls /data/gluster/
haha-test-004 haha-test-007 haha-test-009 haha-test-010 haha-test-011 haha-test-012 haha-test-014 haha-test-015
haha-test-017 haha-test-019 haha-test-020 haha-test-021 haha-test-022 haha-test-027 haha-test-028 haha-test-029
haha-test-032 haha-test-033 haha-test-034 haha-test-036 haha-test-037 haha-test-040 haha-test-042 haha-test-043
haha-test-044 haha-test-047 haha-test-048 haha-test-049 haha-test-050 haha-test-051 haha-test-052 haha-test-056
haha-test-059 haha-test-061 haha-test-069 haha-test-070 haha-test-072 haha-test-073 haha-test-075 haha-test-076
haha-test-078 haha-test-079 haha-test-080 haha-test-081 haha-test-084 haha-test-087 haha-test-088 haha-test-089
haha-test-090 haha-test-091 haha-test-093 haha-test-094 haha-test-095 haha-test-097 haha-test-100 kevin
[root@GlusterFS-slave ~]# ls /data/gluster/
(identical to GlusterFS-master above, its replica partner)
[root@GlusterFS-slave2 ~]# ls /data/gluster/
haha-test-001 haha-test-002 haha-test-003 haha-test-005 haha-test-006 haha-test-008 haha-test-013 haha-test-016
haha-test-018 haha-test-023 haha-test-024 haha-test-025 haha-test-026 haha-test-030 haha-test-031 haha-test-035
haha-test-038 haha-test-039 haha-test-041 haha-test-045 haha-test-046 haha-test-053 haha-test-054 haha-test-055
haha-test-057 haha-test-058 haha-test-060 haha-test-062 haha-test-063 haha-test-064 haha-test-065 haha-test-066
haha-test-067 haha-test-068 haha-test-071 haha-test-074 haha-test-077 haha-test-082 haha-test-083 haha-test-085
haha-test-086 haha-test-092 haha-test-096 haha-test-098 haha-test-099 kevin
[root@GlusterFS-slave3 ~]# ls /data/gluster/
(identical to GlusterFS-slave2 above, its replica partner)

Likewise, files created inside a directory under the client mount point (e.g., /opt/gfsmount/kevin) are split
into the two replica copies in the same way, landing on 192.168.10.239/192.168.10.212 (replicas of each other)
and on 192.168.10.204/192.168.10.220.
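Since the section above stresses that the volume metadata under glusterd's vols directory should be backed up regularly, here is a small sketch of a cron-able backup script. The /usr/local/glusterfs prefix is the source-built layout used throughout this article (packaged installs normally keep this state under /var/lib/glusterd), and the backup directory is an assumption.

#!/bin/bash
# Sketch: periodic backup of GlusterFS volume metadata.
set -e
GLUSTERD_DIR=/usr/local/glusterfs/var/lib/glusterd
BACKUP_DIR=/backup/glusterd

mkdir -p "$BACKUP_DIR"
tar czf "$BACKUP_DIR/glusterd-vols-$(date +%Y%m%d%H%M).tar.gz" -C "$GLUSTERD_DIR" vols

# Keep only the 14 most recent archives
ls -1t "$BACKUP_DIR"/glusterd-vols-*.tar.gz | tail -n +15 | xargs -r rm -f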
================Restricting Gluster access to trusted client IPs================
Allow access only from 192.168.1.*:
[root@GlusterFS-master ~]# gluster volume set gluster_share auth.allow 192.168.1.*
volume set: success

Check the volume info:
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Distributed-Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster
Options Reconfigured:
auth.allow: 192.168.1.*

Note the last line of the volume info: only clients whose IP is in the 192.168.1.* range are allowed to mount.
Mounting from the 192.168.10.213 client therefore fails:
[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/
[root@Client ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   38G  4.3G   33G  12% /
devtmpfs                 1.9G     0  1.9G   0% /dev
tmpfs                    1.9G     0  1.9G   0% /dev/shm
tmpfs                    1.9G  8.6M  1.9G   1% /run
tmpfs                    1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1               1014M  143M  872M  15% /boot
/dev/mapper/centos-home   19G   33M   19G   1% /home
tmpfs                    380M     0  380M   0% /run/user/0
overlay                   38G  4.3G   33G  12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged
shm                       64M     0   64M   0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm

Now change the authorization:
[root@GlusterFS-master ~]# gluster volume set gluster_share auth.allow 192.168.10.*
volume set: success
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Distributed-Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster
Options Reconfigured:
auth.allow: 192.168.10.*

Mount again from the 192.168.10.213 client; it now succeeds:
[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/
[root@Client ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/centos-root        38G  4.3G   33G  12% /
devtmpfs                      1.9G     0  1.9G   0% /dev
tmpfs                         1.9G     0  1.9G   0% /dev/shm
tmpfs                         1.9G  8.6M  1.9G   1% /run
tmpfs                         1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1                    1014M  143M  872M  15% /boot
/dev/mapper/centos-home        19G   33M   19G   1% /home
tmpfs                         380M     0  380M   0% /run/user/0
overlay                        38G  4.3G   33G  12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged
shm                            64M     0   64M   0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm
192.168.10.239:gluster_share  2.0G   67M  2.0G   4% /opt/gfsmount
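A few related variations, as a sketch (the volume name follows this article; check the option names against your GlusterFS version): auth.allow accepts a comma-separated list, auth.reject blocks specific hosts, and gluster volume reset restores the default of allowing all clients.

# Allow a whole subnet plus one extra host
gluster volume set gluster_share auth.allow "192.168.10.*,10.0.0.21"

# Explicitly reject one client (hypothetical address, for illustration)
gluster volume set gluster_share auth.reject "192.168.10.99"

# Drop the restriction again (back to the default: all clients allowed)
gluster volume reset gluster_share auth.allow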
================Gluster performance testing tools================
Start the iperf server on one node; here on GlusterFS-master:
[root@GlusterFS-master ~]# yum install -y iperf
[root@GlusterFS-master ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------

From the client, connect to the GlusterFS-master node to test network throughput:
[root@Client ~]# yum install -y iperf
[root@Client ~]# iperf -c 192.168.10.239
------------------------------------------------------------
Client connecting to 192.168.10.239, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.10.213 port 55276 connected with 192.168.10.239 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  19.5 GBytes  16.7 Gbits/sec

The node server shows the same information. Because this is a virtual machine environment, the numbers are inflated.
[root@GlusterFS-master ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55276
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  19.5 GBytes  16.7 Gbits/sec

To generate more load, run multiple client streams in parallel with the -P option. Client results:
[root@Client ~]# iperf -c 192.168.10.239 -P 10
------------------------------------------------------------
Client connecting to 192.168.10.239, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 12] local 192.168.10.213 port 55296 connected with 192.168.10.239 port 5001
[  6] local 192.168.10.213 port 55284 connected with 192.168.10.239 port 5001
[  7] local 192.168.10.213 port 55286 connected with 192.168.10.239 port 5001
[  3] local 192.168.10.213 port 55278 connected with 192.168.10.239 port 5001
[  5] local 192.168.10.213 port 55282 connected with 192.168.10.239 port 5001
[  4] local 192.168.10.213 port 55280 connected with 192.168.10.239 port 5001
[  8] local 192.168.10.213 port 55288 connected with 192.168.10.239 port 5001
[  9] local 192.168.10.213 port 55290 connected with 192.168.10.239 port 5001
[ 11] local 192.168.10.213 port 55294 connected with 192.168.10.239 port 5001
[ 10] local 192.168.10.213 port 55292 connected with 192.168.10.239 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 12]  0.0-10.0 sec  3.18 GBytes  2.73 Gbits/sec
[ 11]  0.0-10.0 sec   616 MBytes   516 Mbits/sec
[ 10]  0.0-10.0 sec  3.26 GBytes  2.80 Gbits/sec
[  6]  0.0-10.0 sec  3.18 GBytes  2.72 Gbits/sec
[  7]  0.0-10.0 sec  3.18 GBytes  2.73 Gbits/sec
[  3]  0.0-10.0 sec   616 MBytes   516 Mbits/sec
[  4]  0.0-10.0 sec  3.07 GBytes  2.63 Gbits/sec
[  8]  0.0-10.0 sec  2.90 GBytes  2.49 Gbits/sec
[  9]  0.0-10.0 sec  2.89 GBytes  2.48 Gbits/sec
[  5]  0.0-10.0 sec  3.06 GBytes  2.62 Gbits/sec
[SUM]  0.0-10.0 sec  25.9 GBytes  22.2 Gbits/sec

On the node server:
[root@GlusterFS-master ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55276
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  19.5 GBytes  16.7 Gbits/sec
[  4] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55278
[  5] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55280
[  7] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55284
[  6] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55282
[  8] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55286
[  9] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55288
[ 10] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55290
[ 13] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55294
[ 11] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55292
[ 12] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55296
[ 13]  0.0-10.5 sec   616 MBytes   491 Mbits/sec
[  4]  0.0-10.5 sec   616 MBytes   491 Mbits/sec
[  5]  0.0-10.5 sec  3.07 GBytes  2.50 Gbits/sec
[  7]  0.0-10.5 sec  3.18 GBytes  2.59 Gbits/sec
[  8]  0.0-10.5 sec  3.18 GBytes  2.59 Gbits/sec
[  9]  0.0-10.5 sec  2.90 GBytes  2.37 Gbits/sec
[  6]  0.0-10.5 sec  3.06 GBytes  2.49 Gbits/sec
[ 10]  0.0-10.5 sec  2.89 GBytes  2.35 Gbits/sec
[ 11]  0.0-10.5 sec  3.26 GBytes  2.65 Gbits/sec
[ 12]  0.0-10.5 sec  3.18 GBytes  2.59 Gbits/sec
[SUM]  0.0-10.5 sec  25.9 GBytes  21.1 Gbits/sec

-----------------------------------------------------
dd tests: client write and read throughput
[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount
[root@Client ~]# df -h
.......
192.168.10.239:gluster_share  2.0G   68M  2.0G   4% /opt/gfsmount

Test write throughput (writing a 500 MB file):
[root@Client ~]# dd if=/dev/zero of=/opt/gfsmount/grace bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 0.981631 s, 534 MB/s
[root@Client ~]# du -sh /opt/gfsmount/grace
500M    /opt/gfsmount/grace

Test read throughput:
[root@Client ~]# dd if=/opt/gfsmount/grace of=/dev/null bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 1.00698 s, 521 MB/s

Test again; the virtual machines are not very stable:
[root@Client ~]# dd if=/dev/zero of=/opt/gfsmount/grace1 bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 1.15882 s, 452 MB/s
[root@Client ~]# dd if=/opt/gfsmount/grace1 of=/dev/null bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 1.0682 s, 491 MB/s
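Because single dd runs on these virtual machines fluctuate, averaging a few runs gives a steadier number. A minimal sketch under the same mount point; conv=fsync is an added assumption so the measured time includes flushing the data to the bricks rather than just filling the page cache.

#!/bin/bash
# Sketch: repeat the 500 MB write test three times on the GlusterFS mount.
MNT=/opt/gfsmount
for i in 1 2 3; do
    dd if=/dev/zero of="$MNT/ddtest-$i" bs=1M count=500 conv=fsync 2>&1 | tail -n 1
done
rm -f "$MNT"/ddtest-*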
================Handling typical Gluster cluster failures=================
1) Inconsistent data on a replicated volume
Symptom: the data in a two-replica volume becomes inconsistent.
Simulation: delete the data from one of the bricks.
Fix: access the files to trigger self-heal.

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Distributed-Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster
Options Reconfigured:

On the client:
[root@Client ~]# df -h
.......
192.168.10.239:gluster_share  2.0G  1.1G  961M  53% /opt/gfsmount
[root@Client ~]# rm -rf /opt/gfsmount/*
[root@Client ~]# cd /opt/gfsmount/
[root@Client gfsmount]# touch a b c d e f g

The data written by the client is split between the two replica pairs across the four nodes:
[root@GlusterFS-master ~]# ls /data/gluster/
a  b  c  e
[root@GlusterFS-slave ~]# ls /data/gluster/
a  b  c  e
[root@GlusterFS-slave2 ~]# ls /data/gluster/
d  f  g
[root@GlusterFS-slave3 ~]# ls /data/gluster/
d  f  g

Simulate the problem: delete files on the GlusterFS-slave2 machine
[root@GlusterFS-slave2 ~]# cd /data/gluster/
[root@GlusterFS-slave2 gluster]# ls
d  f  g
[root@GlusterFS-slave2 gluster]# rm -rf d
[root@GlusterFS-slave2 gluster]# rm -rf f
[root@GlusterFS-slave2 gluster]# ls
g

GlusterFS-slave2's replica partner GlusterFS-slave3 still has the deleted data:
[root@GlusterFS-slave3 ~]# cd /data/gluster/
[root@GlusterFS-slave3 gluster]# ls
d  f  g

Accessing the files from the client triggers automatic repair:
[root@Client ~]# cd /opt/gfsmount/
[root@Client gfsmount]# ls
a  b  c  d  e  f  g
[root@Client gfsmount]# cat d
[root@Client gfsmount]# cat f

Check the GlusterFS-slave2 node again; the deleted data has been repaired automatically:
[root@GlusterFS-slave2 ~]# ls /data/gluster/
d  f  g

2) Incorrect node configuration in the GlusterFS cluster
Simulation: delete part of server2's configuration.
The configuration lives under /usr/local/glusterfs/var/lib/glusterd.
Fix: trigger recovery by syncing the configuration with the gluster tool:
gluster volume sync server1 all

3) Recovering a brick of a replicated volume
Symptom: one brick of a two-replica volume is damaged.
Recovery procedure:
a) Recreate the failed brick directory
   # setfattr -n trusted.gfid -v 0x00000000000000000000000000000001 /data2
   # setfattr -n trusted.glusterfs.dht -v 0x000000010000000000000000ffffffff /data2
   # setfattr -n trusted.glusterfs.volume-id -v 0xcc51d546c0af4215a72077ad9378c2ac /data2
   (replace the -v values with your own)
b) Set the extended attributes (copy them from the other replica brick)
c) Restart the glusterd service
d) Trigger data self-heal
   # find /data/glusterd -type f -print0 | xargs -0 head -c1 >/dev/null

Simulate deleting a brick:
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Distributed-Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster
Options Reconfigured:

Delete the brick data on the GlusterFS-slave node:
[root@GlusterFS-slave ~]# ls /data/gluster/
a  b  c  e
[root@GlusterFS-slave ~]# rm -rf /data/gluster
[root@GlusterFS-slave ~]# ll /data/gluster
ls: cannot access /data/gluster: No such file or directory

Then, on GlusterFS-slave's replica partner GlusterFS-master, fetch the extended attributes
(use "yum search getfattr" to find which package provides the getfattr tool):
[root@GlusterFS-master ~]# yum install -y attr.x86_64
[root@GlusterFS-master ~]# cd /data/
[root@GlusterFS-master data]# getfattr -d -m . -e hex gluster/
# file: gluster/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffc25
trusted.glusterfs.volume-id=0xa9f989bd7edd4089836ad9f742b8d37a

The following steps use the IDs from the attribute output above.
Recreate the failed brick directory and restore the extended attributes copied from the GlusterFS-master node
(note the volume-id above: 0xa9f989bd7edd4089836ad9f742b8d37a); the order of the setfattr commands does not matter.
[root@GlusterFS-slave ~]# mkdir /data/gluster
[root@GlusterFS-slave ~]# yum install -y attr.x86_64
[root@GlusterFS-slave ~]# getfattr -d -m . -e hex /data/gluster
getfattr: Removing leading '/' from absolute path names
# file: data/gluster
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
[root@GlusterFS-slave ~]# setfattr -n trusted.glusterfs.volume-id -v 0xa9f989bd7edd4089836ad9f742b8d37a /data/gluster
[root@GlusterFS-slave ~]# getfattr -d -m . -e hex /data/gluster
getfattr: Removing leading '/' from absolute path names
# file: data/gluster
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.glusterfs.volume-id=0xa9f989bd7edd4089836ad9f742b8d37a
[root@GlusterFS-slave ~]# setfattr -n trusted.gfid -v 0x00000000000000000000000000000001 /data/gluster
[root@GlusterFS-slave ~]# setfattr -n trusted.glusterfs.dht -v 0x0000000100000000000000007ffffc25 /data/gluster
[root@GlusterFS-slave ~]# getfattr -d -m . -e hex /data/gluster
getfattr: Removing leading '/' from absolute path names
# file: data/gluster
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffc25
trusted.glusterfs.volume-id=0xa9f989bd7edd4089836ad9f742b8d37a

Restart the gluster services:
[root@GlusterFS-slave ~]# ps -ef|grep gluster
root   4909      1  0 15:12 ?      00:00:01 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /usr/local/glusterfs/var/lib/glusterd/glustershd/run/glustershd.pid -l /usr/local/glusterfs/var/log/glusterfs/glustershd.log -S /var/run/f9f65f2dbb0c193ecab167839d75699e.socket --xlator-option *replicate*.node-uuid=f8e69297-4690-488e-b765-c1c404810d6a
root   5069   5044  0 17:31 pts/0  00:00:00 grep --color=auto gluster
root  32450      1  0 Apr08 ?      00:00:26 /usr/local/glusterfs/sbin/glusterd
[root@GlusterFS-slave ~]# ps -ef|grep gluster|awk '{print $2}'|xargs kill -9
kill: sending signal to 5071 failed: No such process
[root@GlusterFS-slave ~]# ps -ef|grep gluster
root   5078   5044  0 17:32 pts/0  00:00:00 grep --color=auto gluster
[root@GlusterFS-slave ~]# /usr/local/glusterfs/sbin/glusterd
[root@GlusterFS-slave ~]# ps -ef|grep gluster
root   5080      1 14 17:32 ?      00:00:00 /usr/local/glusterfs/sbin/glusterd
root   5212   5044  0 17:32 pts/0  00:00:00 grep --color=auto gluster
[root@GlusterFS-slave ~]# ls /data/gluster/
[root@GlusterFS-slave ~]#

Data recovery can then be attempted with the following three methods:
a) After restarting the gluster services, check the data on the GlusterFS-slave node. If it has not been restored, try method b.
b) Under the client mount point, cat the deleted files or write new data to trigger self-heal. If it still has not been restored, try method c.
c) Restart the gluster_share volume:
[root@GlusterFS-master ~]# gluster volume stop gluster_share
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gluster_share: success
[root@GlusterFS-master ~]# gluster volume start gluster_share
volume start: gluster_share: success
[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share
Type: Distributed-Replicate
Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster
[root@GlusterFS-master ~]# gluster volume status gluster_share
Status of volume: gluster_share
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/data/gluster              49155   Y       15508
Brick 192.168.10.212:/data/gluster              49156   Y       5249
Brick 192.168.10.204:/data/gluster              49156   Y       12162
Brick 192.168.10.220:/data/gluster              49157   Y       17402
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     Y       15527
NFS Server on 192.168.10.204                    N/A     N       N/A
Self-heal Daemon on 192.168.10.204              N/A     Y       12181
NFS Server on 192.168.10.220                    N/A     N       N/A
Self-heal Daemon on 192.168.10.220              N/A     Y       17421
NFS Server on 192.168.10.212                    N/A     N       N/A
Self-heal Daemon on 192.168.10.212              N/A     Y       5268

Task Status of Volume gluster_share
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 26035833-3c20-4822-b065-7a5e15d30b85
Status               : completed

[root@GlusterFS-master ~]# gluster volume heal gluster_share info
Brick GlusterFS-master:/data/gluster/
Number of entries: 0

Brick GlusterFS-slave:/data/gluster/
Number of entries: 0

Brick GlusterFS-slave2:/data/gluster/
Number of entries: 0

Brick GlusterFS-slave3:/data/gluster/
Number of entries: 0

Check whether the data has been restored:
[root@GlusterFS-slave ~]# cd /data/gluster/
[root@GlusterFS-slave gluster]# ls
a  b  c

Note: as above, once new data is written at the client mount point, the data on the GlusterFS-slave node is
restored. If the restored data still differs from its replica partner GlusterFS-master, simply cat the
inconsistent files under the client mount point; that triggers the self-heal mechanism and GlusterFS-slave
automatically recovers the inconsistent data.
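The brick-recovery procedure above can be bundled into one script to run on the damaged node. This is a sketch only: the volume-id and dht values must be copied from the healthy replica (getfattr -d -m . -e hex /data/gluster), the paths follow the source-built layout used in this article, and it ends with "gluster volume heal ... full" as an explicit way to kick off self-heal in addition to the cat-from-the-client method described above.

#!/bin/bash
# Sketch: rebuild a lost replica brick directory and trigger self-heal.
set -e
BRICK=/data/gluster
VOLUME_ID=0xa9f989bd7edd4089836ad9f742b8d37a    # copy from the healthy replica brick
DHT_LAYOUT=0x0000000100000000000000007ffffc25   # copy from the healthy replica brick

mkdir -p "$BRICK"
setfattr -n trusted.glusterfs.volume-id -v "$VOLUME_ID" "$BRICK"
setfattr -n trusted.gfid -v 0x00000000000000000000000000000001 "$BRICK"
setfattr -n trusted.glusterfs.dht -v "$DHT_LAYOUT" "$BRICK"

# Restart the gluster daemons so the brick process is respawned
ps -ef | grep '[g]luster' | awk '{print $2}' | xargs -r kill -9
/usr/local/glusterfs/sbin/glusterd

# Ask the self-heal daemon to repair the volume, then check the result
gluster volume heal gluster_share full
gluster volume heal gluster_share info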
================GlusterFS tuning essentials for production scenarios=================
Key system considerations
1) Performance requirements
2) Read/write mix
3) Throughput / IOPS / availability
4) Workload
5) What application?
6) Large files?
7) Small files?
8) Requirements beyond throughput?

System configuration
1) Choose the appropriate volume type for the workload
2) Volume types:
3) DHT – high performance, no redundancy
4) AFR – high availability, good read performance
5) STP – high concurrent-read performance, low write performance, no redundancy
6) Protocol / performance:
7) Native – best overall performance
8) NFS – best performance for certain applications
9) CIFS – only for Windows platforms
10) Data flow:
11) Data flow differs between access protocols
12) A hash + replica volume layout is a must in production

System hardware configuration
1) Node and cluster configuration
2) More CPUs – more concurrent threads
3) More memory – larger cache
4) More network ports – higher throughput
5) A dedicated back-end network for intra-cluster communication
6) NFS/CIFS access requires a dedicated back-end network
7) At least 10GbE is recommended
8) The native protocol is used for communication between nodes

Performance-related experience
1) GlusterFS performance depends heavily on the hardware
2) Configure hardware based on a thorough understanding of the application
3) The default parameters target general-purpose use
4) GlusterFS has a number of performance tuning parameters
5) When troubleshooting performance, rule out disk and network faults first
6) Building one RAID group out of 6 to 8 disks is recommended

System scale and architecture
1) Performance is, in theory, determined by the hardware configuration
2) CPU / memory / disk / network
3) System scale is determined by performance and capacity requirements
4) 2U/4U storage servers plus JBODs are suitable for building bricks
5) Three typical deployment profiles:
6) Capacity-driven applications
7) 2U/4U storage server + multiple JBODs
8) Low CPU/RAM/network requirements
9) Mixed performance-and-capacity applications
10) 2U/4U storage server + a few JBODs
11) High CPU/RAM, modest network requirements
12) Performance-driven applications
13) 1U/2U storage servers (no JBOD)
14) High CPU/RAM, fast disk/network

System tuning (see the sketch after this section for applying these options)
1) Key tuning parameters:
2) performance.write-behind-window-size 65535 (bytes)
3) performance.cache-refresh-timeout 1 (seconds)
4) performance.cache-size 1073741824 (bytes)
5) performance.read-ahead off (only on 1GbE)
6) performance.io-thread-count 24 (number of CPU cores)
7) performance.client-io-threads on (on the client)
8) performance.write-behind on
9) performance.flush-behind on
10) cluster.stripe-block-size 4MB (default 128KB)
11) nfs.disable off (NFS is enabled by default)
12) The default settings suit mixed workloads
13) Tune per application
14) Understand the hardware/firmware configuration and its impact on performance
15) e.g., CPU frequency, InfiniBand, 10GbE, TCP offload

KVM optimization
1) Use the QEMU-GlusterFS (libgfapi) integration
2) gluster volume set <volume> group virt
3) tuned-adm profile rhs-virtualization
4) KVM host: tuned-adm profile virtual-host
5) Use separate volumes for images and application data
6) No more than 2 KVM hosts per gluster node (16 guests per host)
7) To improve response time:
8) Reduce /sys/block/vda/queue/nr_requests
9) Server/Guest: 128/8 (defaults: 256/128)
10) To improve read bandwidth:
11) Increase /sys/block/vda/queue/read_ahead_kb
12) VM readahead: 4096 (default 128)
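As a sketch of how the volume-level options above are applied in practice (the values are just the starting points listed in this section; verify each option name against your GlusterFS release):

#!/bin/bash
# Sketch: apply the tuning parameters from this section to one volume.
VOL=gluster_share

gluster volume set "$VOL" performance.write-behind-window-size 65535
gluster volume set "$VOL" performance.cache-refresh-timeout 1
gluster volume set "$VOL" performance.cache-size 1073741824
gluster volume set "$VOL" performance.io-thread-count 24
gluster volume set "$VOL" performance.client-io-threads on
gluster volume set "$VOL" performance.write-behind on
gluster volume set "$VOL" performance.flush-behind on

gluster volume info "$VOL"    # the changes show up under "Options Reconfigured"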