GlusterFS Distributed Storage Cluster: Supplementary Notes

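This post follows on from the previous one, a detailed deployment guide for a GlusterFS distributed storage cluster on CentOS 7. It records some supplementary notes, in the hope of deepening understanding of and familiarity with GlusterFS storage operations.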

========================Cleaning up the GlusterFS storage environment=========================

As shown in the previous post, this GlusterFS storage cluster has four nodes:

[root@GlusterFS-master ~]# cat /etc/hosts

.......

192.168.10.239 GlusterFS-master

192.168.10.212 GlusterFS-slave

192.168.10.204 GlusterFS-slave2

192.168.10.220 GlusterFS-slave3

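Now delete the storage directory /opt/gluster/data on all four nodes (the commands below remove its parent directory, /opt/gluster):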

[root@GlusterFS-master ~]# rm -rf /opt/gluster

[root@GlusterFS-slave ~]# rm -rf /opt/gluster

[root@GlusterFS-slave2 ~]# rm -rf /opt/gluster

[root@GlusterFS-slave3 ~]# rm -rf /opt/gluster

[root@GlusterFS-master ~]# gluster volume list

models

[root@GlusterFS-master ~]# gluster volume info

Volume Name: models

Type: Distributed-Replicate

Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a

Status: Stopped

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/opt/gluster/data

Brick2: 192.168.10.212:/opt/gluster/data

Brick3: 192.168.10.204:/opt/gluster/data

Brick4: 192.168.10.220:/opt/gluster/data

Options Reconfigured:

performance.write-behind: on

performance.io-thread-count: 32

performance.flush-behind: on

performance.cache-size: 128MB

features.quota: on

Next, delete the previously created models volume:

[root@GlusterFS-master ~]# gluster volume stop models

[root@GlusterFS-master ~]# gluster volume delete models

[root@GlusterFS-master ~]# gluster volume info

No volumes present

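Check the cluster peers. The output below lists three peers in the GlusterFS cluster.

Because the command is run on 192.168.10.239, the local node itself is not listed.

Run the same command on any other node in the cluster and 192.168.10.239 will appear in the list.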

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 3

Hostname: 192.168.10.212

Uuid: f8e69297-4690-488e-b765-c1c404810d6a

State: Peer in Cluster (Connected)

Hostname: 192.168.10.204

Uuid: a989394c-f64a-40c3-8bc5-820f623952c4

State: Peer in Cluster (Connected)

Hostname: 192.168.10.220

Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965

State: Peer in Cluster (Connected)

Then detach the nodes from the cluster one by one (a node cannot detach itself with its own gluster command):

[root@GlusterFS-master ~]# gluster //operations can also be run inside the gluster interactive shell

gluster> peer detach 192.168.10.220

peer detach: success

gluster> peer detach 192.168.10.204

peer detach: success

gluster> peer detach 192.168.10.212

peer detach: success

gluster> peer detach 192.168.10.239

peer detach: failed: 192.168.10.239 is localhost //a node cannot detach itself; this has to be done from another node.

gluster>

Log in to another node and detach 192.168.10.239 from the cluster there:

[root@GlusterFS-slave ~]# gluster

gluster> peer detach 192.168.10.239

peer detach: success

gluster>

Check the cluster again: there are no peers left.

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 0

All gluster commands can also be run from the gluster interactive shell:

[root@GlusterFS-master ~]# gluster

gluster> volume info

No volumes present

gluster> peer status

Number of Peers: 0

gluster>
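
The manual detach sequence above can be sketched as a small script. This is only an illustration, not part of the original procedure: the function names `peer_hostnames` and `detach_peers` and the `DRY_RUN` switch are hypothetical, and the parsing assumes the `Hostname:` lines look like the `gluster peer status` output shown earlier.

```shell
#!/bin/sh
# Sketch: detach every peer reported by `gluster peer status`.
# With DRY_RUN=1 (the default here) the commands are only printed.

# Extract the "Hostname:" values from peer-status text on stdin.
peer_hostnames() {
    awk '$1 == "Hostname:" { print $2 }'
}

# Read one hostname per line from stdin and detach each one.
detach_peers() {
    while read -r host; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "gluster --mode=script peer detach $host"
        else
            # --mode=script suppresses confirmation prompts;
            # </dev/null keeps gluster from consuming the host list.
            gluster --mode=script peer detach "$host" </dev/null
        fi
    done
}

# Demo using the peer list captured earlier in this post.
sample='Number of Peers: 3
Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)
Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)
Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)'
printf '%s\n' "$sample" | peer_hostnames | detach_peers
```

On a live node the pipeline would be `gluster peer status | peer_hostnames | DRY_RUN=0 detach_peers`. The local node still cannot detach itself, so the last remaining peer relationship has to be removed from the other side, as in the transcript.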

=====================Creating a virtual partition with dd and the storage directory=====================

First use dd to create a virtual partition, then create the storage directory.

[root@GlusterFS-master ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/centos-root 36G 1.8G 34G 5% /

devtmpfs 2.9G 0 2.9G 0% /dev

tmpfs 2.9G 0 2.9G 0% /dev/shm

tmpfs 2.9G 8.5M 2.9G 1% /run

tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup

/dev/vda1 1014M 143M 872M 15% /boot

/dev/mapper/centos-home 18G 33M 18G 1% /home

tmpfs 581M 0 581M 0% /run/user/0

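Use dd to create a virtual partition, format it, and mount it on /data. (/dev/vdb1 is not a real block device here, so dd writes a regular 1 GB file at that path, and mount then attaches it through a loop device, which is why /dev/loop0 shows up below.)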

[root@GlusterFS-master ~]# dd if=/dev/vda1 of=/dev/vdb1

2097152+0 records in

2097152+0 records out

1073741824 bytes (1.1 GB) copied, 2.0979 s, 512 MB/s

[root@GlusterFS-master ~]# du -sh /dev/vdb1

1.0G /dev/vdb1

[root@GlusterFS-master ~]# mkfs.xfs -f /dev/vdb1 //formatted as XFS here; ext4 would also work.

meta-data=/dev/vdb1 isize=512 agcount=4, agsize=65536 blks

= sectsz=512 attr=2, projid32bit=1

= crc=1 finobt=0, sparse=0

data = bsize=4096 blocks=262144, imaxpct=25

= sunit=0 swidth=0 blks

naming =version 2 bsize=4096 ascii-ci=0 ftype=1

log =internal log bsize=4096 blocks=2560, version=2

= sectsz=512 sunit=0 blks, lazy-count=1

realtime =none extsz=4096 blocks=0, rtextents=0

[root@GlusterFS-master ~]# mkdir /data

[root@GlusterFS-master ~]# mount /dev/vdb1 /data

[root@GlusterFS-master ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/centos-root 36G 1.8G 34G 5% /

devtmpfs 2.9G 34M 2.8G 2% /dev

tmpfs 2.9G 0 2.9G 0% /dev/shm

tmpfs 2.9G 8.5M 2.9G 1% /run

tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup

/dev/vda1 1014M 143M 872M 15% /boot

/dev/mapper/centos-home 18G 33M 18G 1% /home

tmpfs 581M 0 581M 0% /run/user/0

/dev/loop0 976M 2.6M 907M 1% /data

[root@GlusterFS-master ~]# fdisk -l

.......

Disk /dev/loop0: 1073 MB, 1073741824 bytes, 2097152 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

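Configure automatic mounting at boot. Caution: the binding between /dev/loop0 and its backing file is not persistent, and /dev is an in-memory devtmpfs, so this entry may not bring /data back after a reboot.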

[root@GlusterFS-master ~]# echo '/dev/loop0 /data xfs defaults 1 2' >> /etc/fstab

Then create the gluster storage directory:

[root@GlusterFS-master ~]# mkdir /data/gluster

Repeat all of the steps above on each of the four nodes, so that every node has the storage directory environment.
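
The transcript above copies /dev/vda1 into a file under /dev and then records /dev/loop0 in /etc/fstab. A more conventional way to get a file-backed test brick is to zero-fill an image file on a persistent filesystem and let mount attach the loop device. The sketch below only prints the commands for review; the function name `brick_image_cmds` and the path /opt/brick.img are assumptions made for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch: print the commands for a file-backed XFS brick.
# Nothing is executed; review the output, then run it as root.
# Usage: brick_image_cmds IMAGE_PATH SIZE_MB MOUNT_POINT
brick_image_cmds() {
    img=$1
    size_mb=$2
    mnt=$3
    echo "dd if=/dev/zero of=$img bs=1M count=$size_mb"
    echo "mkfs.xfs -f $img"
    echo "mkdir -p $mnt"
    echo "mount -o loop $img $mnt"
    # The fstab entry names the image file itself and uses the loop option,
    # so it does not depend on which /dev/loopN happens to be free at boot.
    echo "echo '$img $mnt xfs loop,defaults 0 0' >> /etc/fstab"
}

# Hypothetical example: a 1 GB image mounted on /data.
brick_image_cmds /opt/brick.img 1024 /data
```

Keeping the image outside /dev means the backing file survives a reboot, which a file created on devtmpfs would not.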

================Creating a distributed volume (i.e. a hash volume) and related management operations================

Next, add the nodes to the cluster. The commands are run on the GlusterFS-master node here:

[root@GlusterFS-master ~]# gluster peer probe 192.168.10.212

peer probe: success.

[root@GlusterFS-master ~]# gluster peer probe 192.168.10.204

peer probe: success.

[root@GlusterFS-master ~]# gluster peer probe 192.168.10.220

peer probe: success.

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 3

Hostname: 192.168.10.212

Uuid: f8e69297-4690-488e-b765-c1c404810d6a

State: Peer in Cluster (Connected)

Hostname: 192.168.10.204

Uuid: a989394c-f64a-40c3-8bc5-820f623952c4

State: Peer in Cluster (Connected)

Hostname: 192.168.10.220

Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965

State: Peer in Cluster (Connected)

Log in to another node and check: GlusterFS-master (192.168.10.239) is now listed as part of the cluster.

[root@GlusterFS-slave ~]# gluster peer status

Number of Peers: 3

Hostname: GlusterFS-master

Uuid: 5dfd40e2-096b-40b5-bee3-003b57a39007

State: Peer in Cluster (Connected)

Hostname: 192.168.10.204

Uuid: a989394c-f64a-40c3-8bc5-820f623952c4

State: Peer in Cluster (Connected)

Hostname: 192.168.10.220

Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965

State: Peer in Cluster (Connected)

----------------------------------------------------------

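Now create the volume. The commands are run on GlusterFS-master; the volume type created by default is a distributed (hash) volume.

As shown below, gluster automatically creates the gluster directory under /data on 192.168.10.212 (it does not need to be created by hand beforehand).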

[root@GlusterFS-master ~]# gluster volume create gluster_data 192.168.10.212:/data/gluster force

volume create: gluster_data: success: please start the volume to access data

Log in to 192.168.10.212 and check: the gluster directory has indeed been created under /data.

[root@GlusterFS-slave ~]# ls /data/gluster

[root@GlusterFS-slave ~]# ls /data/ //the /data partition is 1G

gluster

Start the volume and check its status:

[root@GlusterFS-master ~]# gluster volume start gluster_data

volume start: gluster_data: success

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data

Type: Distribute

Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3

Status: Started

Number of Bricks: 1

Transport-type: tcp

Bricks:

Brick1: 192.168.10.212:/data/gluster

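Mounting the volume

Run the GlusterFS mount on the client machine.

Note: the volume currently has a single brick, on 192.168.10.212, and the client mounts through that node here. The server named in the mount command is only used to fetch the volume configuration, so any node in the trusted pool would also work.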

[root@Client ~]# mkdir /opt/gfsmount

[root@Client ~]# mount -t glusterfs 192.168.10.212:gluster_data /opt/gfsmount

[root@Client ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/centos-root 38G 4.3G 33G 12% /

devtmpfs 1.9G 0 1.9G 0% /dev

tmpfs 1.9G 0 1.9G 0% /dev/shm

tmpfs 1.9G 8.6M 1.9G 1% /run

tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup

/dev/vda1 1014M 143M 872M 15% /boot

/dev/mapper/centos-home 19G 33M 19G 1% /home

tmpfs 380M 0 380M 0% /run/user/0

overlay 38G 4.3G 33G 12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged

shm 64M 0 64M 0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm

192.168.10.212:gluster_data 1014M 33M 982M 4% /opt/gfsmount

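As shown above, the GlusterFS volume is mounted and is 1G in size (the size of the /data partition that holds the storage directory on 192.168.10.212).

Remember: whichever partition the storage directory lives on, that partition's space is what the client gets after mounting.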

Test writing data under the client mount point:

[root@Client gfsmount]# mkdir test

[root@Client gfsmount]# touch kevin

[root@Client gfsmount]# ls

kevin test

The data shows up in the storage directory on 192.168.10.212 as expected:

[root@GlusterFS-slave ~]# cd /data/gluster/

[root@GlusterFS-slave gluster]# ls

kevin test

----------------------------------------------------------

Adding bricks (i.e. expanding the volume)

[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.239:/data/gluster force

volume add-brick: success

[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.204:/data/gluster force

volume add-brick: success

[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.220:/data/gluster force

volume add-brick: success

Likewise, the gluster directory is created automatically under /data on these three nodes.

Check the volume status:

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data

Type: Distribute

Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3

Status: Started

Number of Bricks: 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.212:/data/gluster

Brick2: 192.168.10.239:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

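Back on the client, the capacity of the mount point has grown from 1G to 4G (the sum of the partitions holding the storage directories on the four nodes).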

[root@Client ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/centos-root 38G 4.3G 33G 12% /

devtmpfs 1.9G 0 1.9G 0% /dev

tmpfs 1.9G 0 1.9G 0% /dev/shm

tmpfs 1.9G 8.6M 1.9G 1% /run

tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup

/dev/vda1 1014M 143M 872M 15% /boot

/dev/mapper/centos-home 19G 33M 19G 1% /home

tmpfs 380M 0 380M 0% /run/user/0

overlay 38G 4.3G 33G 12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged

shm 64M 0 64M 0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm

192.168.10.212:gluster_data 4.0G 130M 3.9G 4% /opt/gfsmount
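
The capacity arithmetic behind this df output can be written down as a small helper. It is a simplified sketch: `usable_capacity_mb` is a hypothetical name, brick sizes are given in MB, and for replicated volumes it assumes all bricks are the same size.

```shell
#!/bin/sh
# Sketch: usable capacity of a volume built from equally sized bricks.
# Usage: usable_capacity_mb REPLICA_COUNT BRICK_SIZE_MB...
#   REPLICA_COUNT is 1 for a plain distributed volume.
usable_capacity_mb() {
    replica=$1
    shift
    total=0
    for size in "$@"; do
        total=$((total + size))
    done
    # Every file is stored REPLICA_COUNT times, so divide the raw total.
    echo $((total / replica))
}

usable_capacity_mb 1 1014 1014 1014 1014   # distributed, four 1014M bricks: prints 4056
usable_capacity_mb 2 1014 1014             # replica 2, two 1014M bricks: prints 1014
```

The first result corresponds to the 4.0G reported for gluster_data above; the second matches what a two-brick replica 2 volume would report.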

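Summary of the operations above

1) The capacity of the client mount point is the sum of the partitions holding the storage directories on the four nodes in the cluster.

2) A directory created under the client mount point is created in the storage directory (i.e. the brick) of every node.

3) Files created inside such a directory are hash-distributed across the storage directories of the four nodes.

4) Files created directly in the root of the client mount point ended up only on the original brick (/data/gluster on 192.168.10.212) and not on the other nodes. The likely reason is that the root directory's hash layout predates the add-brick operations and is not extended to the new bricks until a rebalance is run.

5) Removing a brick from the volume with force loses data, because the removed brick still holds files.

For example: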

a) Create the directory kevin under the mount point on the client

[root@Client ~]# cd /opt/gfsmount/

[root@Client gfsmount]# mkdir kevin

Check on the nodes:

[root@GlusterFS-master ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave2 ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave3 ~]# ls /data/gluster/

kevin

b) Create files inside that directory on the client

[root@Client gfsmount]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /opt/gfsmount/kevin/copy-test-$i; done

[root@Client gfsmount]# ls kevin/

copy-test-001 copy-test-014 copy-test-027 copy-test-040 copy-test-053 copy-test-066 copy-test-079 copy-test-092

copy-test-002 copy-test-015 copy-test-028 copy-test-041 copy-test-054 copy-test-067 copy-test-080 copy-test-093

copy-test-003 copy-test-016 copy-test-029 copy-test-042 copy-test-055 copy-test-068 copy-test-081 copy-test-094

copy-test-004 copy-test-017 copy-test-030 copy-test-043 copy-test-056 copy-test-069 copy-test-082 copy-test-095

copy-test-005 copy-test-018 copy-test-031 copy-test-044 copy-test-057 copy-test-070 copy-test-083 copy-test-096

copy-test-006 copy-test-019 copy-test-032 copy-test-045 copy-test-058 copy-test-071 copy-test-084 copy-test-097

copy-test-007 copy-test-020 copy-test-033 copy-test-046 copy-test-059 copy-test-072 copy-test-085 copy-test-098

copy-test-008 copy-test-021 copy-test-034 copy-test-047 copy-test-060 copy-test-073 copy-test-086 copy-test-099

copy-test-009 copy-test-022 copy-test-035 copy-test-048 copy-test-061 copy-test-074 copy-test-087 copy-test-100

copy-test-010 copy-test-023 copy-test-036 copy-test-049 copy-test-062 copy-test-075 copy-test-088

copy-test-011 copy-test-024 copy-test-037 copy-test-050 copy-test-063 copy-test-076 copy-test-089

copy-test-012 copy-test-025 copy-test-038 copy-test-051 copy-test-064 copy-test-077 copy-test-090

copy-test-013 copy-test-026 copy-test-039 copy-test-052 copy-test-065 copy-test-078 copy-test-091

Check on the nodes: the files are hash-distributed across the bricks of all four nodes.

[root@GlusterFS-master ~]# ls /data/gluster/kevin/

copy-test-002 copy-test-014 copy-test-036 copy-test-045 copy-test-056 copy-test-070 copy-test-075 copy-test-097

copy-test-009 copy-test-020 copy-test-042 copy-test-047 copy-test-062 copy-test-071 copy-test-080

copy-test-010 copy-test-027 copy-test-043 copy-test-053 copy-test-064 copy-test-072 copy-test-084

copy-test-013 copy-test-035 copy-test-044 copy-test-055 copy-test-068 copy-test-074 copy-test-092

[root@GlusterFS-master ~]# ll /data/gluster/kevin/|wc -l

[root@GlusterFS-slave ~]# ls /data/gluster/kevin/

copy-test-003 copy-test-018 copy-test-037 copy-test-050 copy-test-061 copy-test-069 copy-test-089

copy-test-005 copy-test-025 copy-test-040 copy-test-058 copy-test-066 copy-test-076 copy-test-091

copy-test-007 copy-test-026 copy-test-049 copy-test-059 copy-test-067 copy-test-085 copy-test-096

[root@GlusterFS-slave ~]# ll /data/gluster/kevin/|wc -l

[root@GlusterFS-slave2 gluster]# ls /data/gluster/kevin/

copy-test-004 copy-test-016 copy-test-024 copy-test-046 copy-test-065 copy-test-082 copy-test-088 copy-test-099

copy-test-006 copy-test-017 copy-test-029 copy-test-048 copy-test-078 copy-test-086 copy-test-093

copy-test-015 copy-test-023 copy-test-033 copy-test-052 copy-test-079 copy-test-087 copy-test-095

[root@GlusterFS-slave2 gluster]# ll /data/gluster/kevin/|wc -l

[root@GlusterFS-slave3 ~]# ls /data/gluster/kevin/

copy-test-001 copy-test-019 copy-test-030 copy-test-038 copy-test-054 copy-test-073 copy-test-090

copy-test-008 copy-test-021 copy-test-031 copy-test-039 copy-test-057 copy-test-077 copy-test-094

copy-test-011 copy-test-022 copy-test-032 copy-test-041 copy-test-060 copy-test-081 copy-test-098

copy-test-012 copy-test-028 copy-test-034 copy-test-051 copy-test-063 copy-test-083 copy-test-100

[root@GlusterFS-slave3 ~]# ll /data/gluster/kevin/|wc -l

c) Files created directly in the root of the client mount point end up on a single brick only

(here /data/gluster on 192.168.10.212); the other nodes do not receive them.

[root@Client gfsmount]# for i in `seq -w 1 30`; do cp -rp /var/log/messages /opt/gfsmount/haha-test-$i; done

[root@Client gfsmount]# ls

haha-test-01 haha-test-05 haha-test-09 haha-test-13 haha-test-17 haha-test-21 haha-test-25 haha-test-29

haha-test-02 haha-test-06 haha-test-10 haha-test-14 haha-test-18 haha-test-22 haha-test-26 haha-test-30

haha-test-03 haha-test-07 haha-test-11 haha-test-15 haha-test-19 haha-test-23 haha-test-27 kevin

haha-test-04 haha-test-08 haha-test-12 haha-test-16 haha-test-20 haha-test-24 haha-test-28

Check on the nodes: only 192.168.10.212 has the 30 files created above; the other three nodes have none.

[root@GlusterFS-master ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave ~]# ls /data/gluster/

haha-test-01 haha-test-05 haha-test-09 haha-test-13 haha-test-17 haha-test-21 haha-test-25 haha-test-29

haha-test-02 haha-test-06 haha-test-10 haha-test-14 haha-test-18 haha-test-22 haha-test-26 haha-test-30

haha-test-03 haha-test-07 haha-test-11 haha-test-15 haha-test-19 haha-test-23 haha-test-27 kevin

haha-test-04 haha-test-08 haha-test-12 haha-test-16 haha-test-20 haha-test-24 haha-test-28

[root@GlusterFS-slave2 ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave3 ~]# ls /data/gluster/

kevin

d) Removing a brick with force loses data, because the removed brick still holds files.

Remove one node's brick from the volume:

[root@GlusterFS-master ~]# gluster volume remove-brick gluster_data 192.168.10.220:/data/gluster

Usage: volume remove-brick <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force>

[root@GlusterFS-master ~]# gluster

gluster> volume remove-brick gluster_data 192.168.10.220:/data/gluster force

Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y

volume remove-brick commit force: success

gluster>

Check the volume info; the space at the mount point has also dropped (the storage of 192.168.10.220 is gone).

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data

Type: Distribute

Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3

Status: Started

Number of Bricks: 3

Transport-type: tcp

Bricks:

Brick1: 192.168.10.212:/data/gluster

Brick2: 192.168.10.239:/data/gluster

Brick3: 192.168.10.204:/data/gluster

The client mount point has shrunk to 3G:

[root@Client ~]# df -h

Filesystem Size Used Avail Use% Mounted on

.......

192.168.10.212:gluster_data 3.0G 98M 2.9G 4% /opt/gfsmount

[root@Client ~]# ll /opt/gfsmount/kevin|wc -l

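Some of the 100 files created earlier in the kevin directory under the client mount point are now missing. They were stored on 192.168.10.220, and because that node's brick was removed from the volume with force, without migrating its data first, they are lost from the volume.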
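
The usage line printed by gluster earlier lists `start`, `status` and `commit` next to `force`. For a distributed volume, the usual way to avoid this kind of loss is to let `start` migrate the brick's files to the remaining bricks and to `commit` only once `status` reports the migration as completed. The sketch below prints that sequence rather than executing it, because the commit must not run before the migration has finished; `remove_brick_plan` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the three-step remove-brick sequence that migrates data
# off a brick before it is dropped from the volume.
# Usage: remove_brick_plan VOLNAME BRICK
remove_brick_plan() {
    vol=$1
    brick=$2
    # 1. Start migrating files from the brick to the remaining bricks.
    echo "gluster volume remove-brick $vol $brick start"
    # 2. Repeat until the status column shows the migration as completed.
    echo "gluster volume remove-brick $vol $brick status"
    # 3. Only then drop the brick from the volume.
    echo "gluster volume remove-brick $vol $brick commit"
}

remove_brick_plan gluster_data 192.168.10.220:/data/gluster
```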

Pay attention to the next step!

Adding the removed brick on 192.168.10.220 back fails:

[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.220:/data/gluster

volume add-brick: failed: Staging failed on 192.168.10.220. Error: /data/gluster is already part of a volume

The directory has to be deleted before the brick can be added back:

[root@GlusterFS-slave3 ~]# rm -rf /data/gluster

[root@GlusterFS-slave3 ~]# ll /data

total 0

Now the add succeeds:

[root@GlusterFS-master ~]# gluster volume add-brick gluster_data 192.168.10.220:/data/gluster

volume add-brick: success

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_data

Type: Distribute

Volume ID: 0f8b2268-9d2f-4b5c-85df-13408825d6b3

Status: Started

Number of Bricks: 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.212:/data/gluster

Brick2: 192.168.10.239:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

With the brick added back, the capacity of the client mount point is 4G again:

[root@Client ~]# df -h

.......

192.168.10.212:gluster_data 4.0G 131M 3.9G 4% /opt/gfsmount

---------------------------------------------------------------------------------

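A rebalance redistributes files according to the hash layout.

After running a rebalance, the newly added node receives its share of files and directories.

Note: in a real production environment, it is best to run a rebalance when the servers are idle.

The storage directory on the newly added 192.168.10.220 node is empty at first: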

[root@GlusterFS-slave3 ~]# ls /data/gluster/

[root@GlusterFS-slave3 ~]#

[root@GlusterFS-master ~]# gluster volume rebalance gluster_data start

volume rebalance: gluster_data: success: Initiated rebalance on volume gluster_data.

Execute "gluster volume rebalance <volume-name> status" to check status.

ID: 49277bfa-df25-45c4-b1fb-cbcf8607a23e

Check the storage directory on the newly added 192.168.10.220 node again: it now holds data.

[root@GlusterFS-slave3 ~]# ls /data/gluster/

haha-test-03 haha-test-11 haha-test-14 haha-test-16 haha-test-25 haha-test-30 kevin

After the rebalance, the data under the client mount point is spread across the nodes:

[root@Client ~]# ls /opt/gfsmount/

haha-test-01 haha-test-05 haha-test-09 haha-test-13 haha-test-17 haha-test-21 haha-test-25 haha-test-29

haha-test-02 haha-test-06 haha-test-10 haha-test-14 haha-test-18 haha-test-22 haha-test-26 haha-test-30

haha-test-03 haha-test-07 haha-test-11 haha-test-15 haha-test-19 haha-test-23 haha-test-27 kevin

haha-test-04 haha-test-08 haha-test-12 haha-test-16 haha-test-20 haha-test-24 haha-test-28

[root@Client ~]# ls /opt/gfsmount/kevin/

copy-test-002 copy-test-014 copy-test-026 copy-test-043 copy-test-053 copy-test-066 copy-test-076 copy-test-088

copy-test-003 copy-test-015 copy-test-027 copy-test-044 copy-test-055 copy-test-067 copy-test-078 copy-test-089

copy-test-004 copy-test-016 copy-test-029 copy-test-045 copy-test-056 copy-test-068 copy-test-079 copy-test-091

copy-test-005 copy-test-017 copy-test-033 copy-test-046 copy-test-058 copy-test-069 copy-test-080 copy-test-092

copy-test-006 copy-test-018 copy-test-035 copy-test-047 copy-test-059 copy-test-070 copy-test-082 copy-test-093

copy-test-007 copy-test-020 copy-test-036 copy-test-048 copy-test-061 copy-test-071 copy-test-084 copy-test-095

copy-test-009 copy-test-023 copy-test-037 copy-test-049 copy-test-062 copy-test-072 copy-test-085 copy-test-096

copy-test-010 copy-test-024 copy-test-040 copy-test-050 copy-test-064 copy-test-074 copy-test-086 copy-test-097

copy-test-013 copy-test-025 copy-test-042 copy-test-052 copy-test-065 copy-test-075 copy-test-087 copy-test-099

[root@GlusterFS-master ~]# ls /data/gluster/

haha-test-06 haha-test-07 haha-test-15 haha-test-19 haha-test-22 haha-test-28 kevin

[root@GlusterFS-master ~]# ls /data/gluster/kevin/

copy-test-002 copy-test-014 copy-test-036 copy-test-047 copy-test-059 copy-test-069 copy-test-080 copy-test-097

copy-test-003 copy-test-018 copy-test-037 copy-test-049 copy-test-061 copy-test-070 copy-test-084

copy-test-005 copy-test-020 copy-test-040 copy-test-050 copy-test-062 copy-test-071 copy-test-085

copy-test-007 copy-test-025 copy-test-042 copy-test-053 copy-test-064 copy-test-072 copy-test-089

copy-test-009 copy-test-026 copy-test-043 copy-test-055 copy-test-066 copy-test-074 copy-test-091

copy-test-010 copy-test-027 copy-test-044 copy-test-056 copy-test-067 copy-test-075 copy-test-092

copy-test-013 copy-test-035 copy-test-045 copy-test-058 copy-test-068 copy-test-076 copy-test-096

[root@GlusterFS-slave ~]# ls /data/gluster/

haha-test-01 haha-test-04 haha-test-08 haha-test-17 haha-test-20 haha-test-26 haha-test-27 haha-test-29 kevin

[root@GlusterFS-slave ~]# ls /data/gluster/kevin/

copy-test-002 copy-test-017 copy-test-040 copy-test-050 copy-test-064 copy-test-074 copy-test-086 copy-test-097

copy-test-004 copy-test-020 copy-test-042 copy-test-052 copy-test-065 copy-test-075 copy-test-087 copy-test-099

copy-test-006 copy-test-023 copy-test-043 copy-test-053 copy-test-066 copy-test-076 copy-test-088

copy-test-009 copy-test-024 copy-test-044 copy-test-055 copy-test-067 copy-test-078 copy-test-089

copy-test-010 copy-test-027 copy-test-045 copy-test-056 copy-test-068 copy-test-079 copy-test-091

copy-test-013 copy-test-029 copy-test-046 copy-test-058 copy-test-069 copy-test-080 copy-test-092

copy-test-014 copy-test-033 copy-test-047 copy-test-059 copy-test-070 copy-test-082 copy-test-093

copy-test-015 copy-test-035 copy-test-048 copy-test-061 copy-test-071 copy-test-084 copy-test-095

copy-test-016 copy-test-036 copy-test-049 copy-test-062 copy-test-072 copy-test-085 copy-test-096

[root@GlusterFS-slave2 ~]# ls /data/gluster/

haha-test-02 haha-test-09 haha-test-12 haha-test-18 haha-test-23 kevin

haha-test-05 haha-test-10 haha-test-13 haha-test-21 haha-test-24

[root@GlusterFS-slave2 ~]# ls /data/gluster/kevin/

copy-test-004 copy-test-016 copy-test-024 copy-test-046 copy-test-065 copy-test-082 copy-test-088 copy-test-099

copy-test-006 copy-test-017 copy-test-029 copy-test-048 copy-test-078 copy-test-086 copy-test-093

copy-test-015 copy-test-023 copy-test-033 copy-test-052 copy-test-079 copy-test-087 copy-test-095

[root@GlusterFS-slave3 ~]# ls /data/gluster/

haha-test-03 haha-test-11 haha-test-14 haha-test-16 haha-test-25 haha-test-30 kevin

[root@GlusterFS-slave3 ~]# ls /data/gluster/kevin/

Check the rebalance status:

[root@GlusterFS-master ~]# gluster volume rebalance gluster_data status

Node Rebalanced-files size scanned failures skipped status run time in secs

--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------

localhost 29 164.8KB 105 0 0 completed 1.00

192.168.10.212 43 248.7KB 131 0 0 completed 1.00

192.168.10.204 0 0Bytes 105 0 0 completed 0.00

192.168.10.220 0 0Bytes 105 0 0 completed 0.00

volume rebalance: gluster_data: success:
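
When a rebalance runs for longer than it does here, it can be handy to test the status output from a script. The sketch below is one possible check, under the assumption that each node row has a numeric second column and contains the word `completed` when done, as in the output above. The column layout differs between GlusterFS versions, so treat the parsing as illustrative; `rebalance_done` is a hypothetical name.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only if every node row in rebalance-status
# text on stdin reports "completed".
rebalance_done() {
    awk '
        # Node rows are recognised by a numeric "Rebalanced-files" column.
        $2 ~ /^[0-9]+$/ { rows++; if ($0 !~ /completed/) pending++ }
        END { exit !(rows > 0 && pending == 0) }
    '
}

# Demo with two rows of the status output shown above.
status='Node Rebalanced-files size scanned failures skipped status run time in secs
localhost 29 164.8KB 105 0 0 completed 1.00
192.168.10.212 43 248.7KB 131 0 0 completed 1.00'
if printf '%s\n' "$status" | rebalance_done; then
    echo "rebalance finished"
fi
```

On a live cluster the input would come from `gluster volume rebalance gluster_data status`, for instance inside a loop that sleeps between checks.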

--------------------------------------------------------------------------

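Next, unmount the mount point and stop the volume.

This operation is dangerous, but even after the volume is deleted the data underneath is still there: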

[root@Client ~]# umount /opt/gfsmount -lf

[root@GlusterFS-master ~]# gluster volume stop gluster_data

Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y

volume stop: gluster_data: success

[root@GlusterFS-master ~]# gluster volume delete gluster_data

Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y

volume delete: gluster_data: success

[root@GlusterFS-master ~]# gluster volume list

No volumes present in cluster

[root@GlusterFS-master ~]# gluster volume info

No volumes present

The gluster_data volume is deleted, but the data in the storage directories on each node is still there:

[root@GlusterFS-master ~]# ls /data/gluster/

haha-test-06 haha-test-07 haha-test-15 haha-test-19 haha-test-22 haha-test-28 kevin

[root@GlusterFS-slave ~]# ls /data/gluster/

haha-test-01 haha-test-04 haha-test-08 haha-test-17 haha-test-20 haha-test-26 haha-test-27 haha-test-29 kevin

[root@GlusterFS-slave2 ~]# ls /data/gluster/

haha-test-02 haha-test-09 haha-test-12 haha-test-18 haha-test-23 kevin

haha-test-05 haha-test-10 haha-test-13 haha-test-21 haha-test-24

[root@GlusterFS-slave3 ~]# ls /data/gluster/

haha-test-03 haha-test-11 haha-test-14 haha-test-16 haha-test-25 haha-test-30 kevin

Nothing is shown at the client mount point:

[root@Client ~]# ls /opt/gfsmount/

[root@Client ~]#

To clear the data, log in to each node and delete the data under the brick:

[root@GlusterFS-master ~]# rm -rf /data/gluster

[root@GlusterFS-slave ~]# rm -rf /data/gluster

[root@GlusterFS-slave2 ~]# rm -rf /data/gluster

[root@GlusterFS-slave3 ~]# rm -rf /data/gluster

================Creating a replicated volume and related management operations================

First delete the storage directories used above on every node:

[root@GlusterFS-master ~]# rm -rf /data/gluster

[root@GlusterFS-slave ~]# rm -rf /data/gluster

[root@GlusterFS-slave2 ~]# rm -rf /data/gluster

[root@GlusterFS-slave3 ~]# rm -rf /data/gluster

[root@GlusterFS-master ~]# gluster volume info

No volumes present

[root@GlusterFS-master ~]# gluster volume status

No volumes present

[root@GlusterFS-master ~]# gluster volume list

No volumes present in cluster

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 3

Hostname: 192.168.10.212

Uuid: f8e69297-4690-488e-b765-c1c404810d6a

State: Peer in Cluster (Connected)

Hostname: 192.168.10.204

Uuid: a989394c-f64a-40c3-8bc5-820f623952c4

State: Peer in Cluster (Connected)

Hostname: 192.168.10.220

Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965

State: Peer in Cluster (Connected)

Detach the nodes from the cluster:

[root@GlusterFS-master ~]# gluster

gluster> peer detach 192.168.10.220

peer detach: success

gluster> peer detach 192.168.10.204

peer detach: success

gluster> peer detach 192.168.10.212

peer detach: success

gluster> peer detach 192.168.10.239

peer detach: failed: 192.168.10.239 is localhost

gluster>

[root@GlusterFS-slave ~]# gluster

gluster> peer detach 192.168.10.239

peer detach: success

Check the cluster: no peer information remains.

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 0

----------------------------------------------------------------------

Now create the replicated volume as follows.

First add the node to the cluster:

[root@GlusterFS-master ~]# gluster peer probe 192.168.10.212

peer probe: success.

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 1

Hostname: 192.168.10.212

Uuid: f8e69297-4690-488e-b765-c1c404810d6a

State: Peer in Cluster (Connected)

Create the replicated volume; replica 2 is required and means two copies:

[root@GlusterFS-master ~]# gluster volume create gluster_share replica 2 192.168.10.239:/data/gluster 192.168.10.212:/data/gluster force

volume create: gluster_share: success: please start the volume to access data

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Created

Number of Bricks: 1 x 2 = 2

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

The storage directory /data/gluster is created automatically on both nodes:

[root@GlusterFS-master ~]# ll /data/

total 0

drwxr-xr-x. 2 root root 6 Apr 10 12:58 gluster

[root@GlusterFS-slave ~]# ll /data/

total 0

drwxr-xr-x. 2 root root 6 Apr 10 12:59 gluster

Start the volume and check its status:

[root@GlusterFS-master ~]# gluster volume list

gluster_share

[root@GlusterFS-master ~]# gluster volume start gluster_share

volume start: gluster_share: success

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 1 x 2 = 2

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

After mounting on the client, the capacity is that of a single node only, because this is a replicated volume:

[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/

[root@Client ~]# df -h

.......

192.168.10.239:gluster_share 1014M 33M 982M 4% /opt/gfsmount

Write data at the client mount point:

[root@Client gfsmount]# mkdir test

[root@Client gfsmount]# touch kevin grace

Because the two nodes are replicas of each other, their contents are identical:

[root@GlusterFS-master ~]# ls /data/gluster/

grace kevin test

[root@GlusterFS-slave ~]# ls /data/gluster/

grace kevin test

------------------------------------------------------------------

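Simulating accidental deletion of the volume information

Delete the volume information, which lives under the following path: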

[root@GlusterFS-master ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/

bricks gluster_share-rebalance.vol rbstate

cksum gluster_share.tcp-fuse.vol run

gluster_share.192.168.10.212.data-gluster.vol info snapd.info

gluster_share.192.168.10.239.data-gluster.vol node_state.info trusted-gluster_share.tcp-fuse.vol

[root@GlusterFS-slave ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/

bricks gluster_share-rebalance.vol rbstate

cksum gluster_share.tcp-fuse.vol run

gluster_share.192.168.10.212.data-gluster.vol info snapd.info

gluster_share.192.168.10.239.data-gluster.vol node_state.info trusted-gluster_share.tcp-fuse.vol

Here the volume information on 192.168.10.212 (i.e. GlusterFS-slave) is deleted as an example:

[root@GlusterFS-slave ~]# rm -rf /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/

[root@GlusterFS-slave ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/

ls: cannot access /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/: No such file or directory

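Restoring the volume information

Sync the volume information back. It was deleted on 192.168.10.212, but the copy on its replica partner 192.168.10.239 is intact.

In the command below, all means syncing the information of all volumes; a single volume name such as gluster_share can be given instead.

Important: back up this volume information regularly!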

[root@GlusterFS-master ~]# gluster volume sync 192.168.10.239 all //a node cannot sync from itself

Sync volume may make data inaccessible while the sync is in progress. Do you want to continue? (y/n) y

volume sync: failed: sync from localhost not allowed

[root@GlusterFS-slave ~]# gluster volume sync 192.168.10.239 all    # so run it on another node in the cluster (here, the one that lost its metadata)

Sync volume may make data inaccessible while the sync is in progress. Do you want to continue? (y/n) y

volume sync: success

Check the volume metadata on 192.168.10.212 again: the deleted files have been restored

[root@GlusterFS-slave ~]# ls /usr/local/glusterfs/var/lib/glusterd/vols/gluster_share/

bricks gluster_share-rebalance.vol rbstate

cksum gluster_share.tcp-fuse.vol run

gluster_share.192.168.10.212.data-gluster.vol info snapd.info

gluster_share.192.168.10.239.data-gluster.vol node_state.info trusted-gluster_share.tcp-fuse.vol
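Since the text above stresses backing up the volume metadata regularly, here is a minimal sketch of one way to do it with tar. The function name backup_glusterd_vols and the destination directory are assumptions made for illustration; only the source path comes from this article.

```shell
# Archive a glusterd "vols" directory into a timestamped tarball.
# $1: directory to back up (e.g. .../var/lib/glusterd/vols)
# $2: destination directory for the archive (hypothetical, choose your own)
backup_glusterd_vols() {
    local src_dir="$1" dest_dir="$2"
    local stamp archive
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest_dir/glusterd-vols-$stamp.tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C keeps the archive paths relative to the parent of the vols directory.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    # Print the archive path so callers can log or verify it.
    printf '%s\n' "$archive"
}

# Example call (not executed here; /backup/glusterd is a made-up location):
#   backup_glusterd_vols /usr/local/glusterfs/var/lib/glusterd/vols /backup/glusterd
```

Running this from cron on every node would give a local copy to fall back on if no healthy peer is available for gluster volume sync.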

------------------------------------------------------------------

Adding nodes to the cluster

[root@GlusterFS-master ~]# gluster peer probe 192.168.10.204

peer probe: success.

[root@GlusterFS-master ~]# gluster peer probe 192.168.10.220

peer probe: success.

[root@GlusterFS-master ~]# gluster peer status

Number of Peers: 3

Hostname: 192.168.10.212

Uuid: f8e69297-4690-488e-b765-c1c404810d6a

State: Peer in Cluster (Connected)

Hostname: 192.168.10.204

Uuid: a989394c-f64a-40c3-8bc5-820f623952c4

State: Peer in Cluster (Connected)

Hostname: 192.168.10.220

Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965

State: Peer in Cluster (Connected)

Check the volume information

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 1 x 2 = 2

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Expanding the volume (adding the bricks of the two new nodes to the gluster_share volume above)

First unmount the volume on the client

[root@Client ~]# umount /opt/gfsmount/

Then stop the gluster_share volume

[root@GlusterFS-master ~]# gluster volume stop gluster_share

Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y

volume stop: gluster_share: success

[root@GlusterFS-master ~]# gluster volume status gluster_share

Volume gluster_share is not started

Then add the bricks

[root@GlusterFS-master ~]# gluster volume add-brick gluster_share 192.168.10.204:/data/gluster 192.168.10.220:/data/gluster force

volume add-brick: success
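A replicated volume is expanded in whole replica sets, which is why two bricks are added at once to this replica 2 volume. The check below is a small sketch of that rule; the function name bricks_match_replica is hypothetical and is not part of the gluster CLI.

```shell
# Succeeds when the number of bricks being added is a positive multiple
# of the replica count.
# $1: number of bricks to add, $2: replica count of the volume
bricks_match_replica() {
    local new_bricks="$1" replica="$2"
    [ "$replica" -gt 0 ] && [ "$new_bricks" -gt 0 ] && [ $(( new_bricks % replica )) -eq 0 ]
}

# Example: two new bricks on a replica 2 volume form one new replica set.
#   bricks_match_replica 2 2 && echo "ok to add"
```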

Check the volume information again: the new nodes have been added

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Distributed-Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Stopped

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

Then start the gluster_share volume again

[root@GlusterFS-master ~]# gluster volume start gluster_share

volume start: gluster_share: success

[root@GlusterFS-master ~]# gluster volume status gluster_share

Status of volume: gluster_share

Gluster process Port Online Pid

------------------------------------------------------------------------------

Brick 192.168.10.239:/data/gluster 49155 Y 8907

Brick 192.168.10.212:/data/gluster 49156 Y 4049

Brick 192.168.10.204:/data/gluster 49156 Y 11447

Brick 192.168.10.220:/data/gluster 49157 Y 16714

NFS Server on localhost N/A N N/A

Self-heal Daemon on localhost N/A N N/A

NFS Server on 192.168.10.212 N/A N N/A

Self-heal Daemon on 192.168.10.212 N/A N N/A

NFS Server on 192.168.10.220 N/A N N/A

Self-heal Daemon on 192.168.10.220 N/A N N/A

NFS Server on 192.168.10.204 N/A N N/A

Self-heal Daemon on 192.168.10.204 N/A N N/A

Task Status of Volume gluster_share

------------------------------------------------------------------------------

There are no active volume tasks

As shown above, the Online column is "Y" for the bricks on all four nodes

Next, remount the volume on the Client machine

[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/

[root@Client ~]# df -h

.......

192.168.10.239:gluster_share 2.0G 65M 2.0G 4% /opt/gfsmount

As shown above, the client mount point now reports 2G, the combined size of the two replica sets
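The capacity figures can be checked with simple arithmetic: usable space is roughly the number of replica sets times the size of one brick. The sketch below assumes equally sized bricks and ignores filesystem overhead; the function name usable_capacity_mb is made up for this example.

```shell
# Estimate usable capacity of a distributed-replicated volume in MB.
# $1: total bricks, $2: replica count, $3: size of one brick in MB
usable_capacity_mb() {
    local bricks="$1" replica="$2" brick_mb="$3"
    # bricks / replica = number of replica sets (distribute subvolumes)
    echo $(( bricks / replica * brick_mb ))
}

# With the 1014M bricks seen earlier:
#   usable_capacity_mb 2 2 1014   # one replica set
#   usable_capacity_mb 4 2 1014   # two replica sets, about 2G
```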

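Writing test data from the client at this point shows that:

1) The brick directories of the nodes newly added to the gluster_share volume contain none of the data that already existed on the other nodes; they only receive newly written data.

2) Because the volume was created with two replicas (replica 2), without a rebalance the files written from the client still land only on the nodes of the original replica set and are not placed on the two new nodes. Directories created on the client appear on all four nodes, but files go only to the original two.

Next, rebalance the volume. Rebalancing requires at least two bricks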

[root@GlusterFS-master ~]# gluster volume rebalance gluster_share start

volume rebalance: gluster_share: success: Initiated rebalance on volume gluster_share.

Execute "gluster volume rebalance <volume-name> status" to check status.

ID: 26035833-3c20-4822-b065-7a5e15d30b85

[root@GlusterFS-master ~]# gluster volume rebalance gluster_share status

Node Rebalanced-files size scanned failures skipped status run time in secs

--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------

localhost 103 10.3MB 305 0 0 completed 1.00

192.168.10.212 0 0Bytes 210 0 0 completed 0.00

192.168.10.204 11 58.9KB 213 0 0 completed 0.00

192.168.10.220 0 0Bytes 210 0 0 completed 0.00

volume rebalance: gluster_share: success:
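On a larger volume the rebalance takes longer, so it can help to wait for it in a script. The following sketch parses status output in the format shown above; the function name rebalance_all_completed is an assumption for illustration, and the column layout may differ between GlusterFS versions.

```shell
# Reads "gluster volume rebalance <vol> status" output on stdin.
# Returns 0 only if at least one node row exists and every node row
# reports the status "completed".
rebalance_all_completed() {
    awk '
        # Node rows carry a numeric Rebalanced-files count in column 2.
        $2 ~ /^[0-9]+$/ {
            rows++
            # The status is the second-to-last column, before the run time.
            if ($(NF - 1) == "completed") finished++
        }
        END { exit !(rows > 0 && rows == finished) }
    '
}

# Example polling loop (not executed here):
#   until gluster volume rebalance gluster_share status | rebalance_all_completed; do
#       sleep 10
#   done
```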

Then delete the data on the client and write new data to test again

[root@Client ~]# rm -rf /opt/gfsmount/*

[root@Client ~]# cd /opt/gfsmount/

[root@Client gfsmount]# ls

[root@Client gfsmount]# mkdir kevin

The kevin directory now exists on all four nodes

[root@GlusterFS-master ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave2 ~]# ls /data/gluster/

kevin

[root@GlusterFS-slave3 ~]# ls /data/gluster/

kevin

Then write a batch of files

[root@Client gfsmount]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /opt/gfsmount/haha-test-$i; done

[root@Client gfsmount]# ls

haha-test-001 haha-test-014 haha-test-027 haha-test-040 haha-test-053 haha-test-066 haha-test-079 haha-test-092

haha-test-002 haha-test-015 haha-test-028 haha-test-041 haha-test-054 haha-test-067 haha-test-080 haha-test-093

haha-test-003 haha-test-016 haha-test-029 haha-test-042 haha-test-055 haha-test-068 haha-test-081 haha-test-094

haha-test-004 haha-test-017 haha-test-030 haha-test-043 haha-test-056 haha-test-069 haha-test-082 haha-test-095

haha-test-005 haha-test-018 haha-test-031 haha-test-044 haha-test-057 haha-test-070 haha-test-083 haha-test-096

haha-test-006 haha-test-019 haha-test-032 haha-test-045 haha-test-058 haha-test-071 haha-test-084 haha-test-097

haha-test-007 haha-test-020 haha-test-033 haha-test-046 haha-test-059 haha-test-072 haha-test-085 haha-test-098

haha-test-008 haha-test-021 haha-test-034 haha-test-047 haha-test-060 haha-test-073 haha-test-086 haha-test-099

haha-test-009 haha-test-022 haha-test-035 haha-test-048 haha-test-061 haha-test-074 haha-test-087 haha-test-100

haha-test-010 haha-test-023 haha-test-036 haha-test-049 haha-test-062 haha-test-075 haha-test-088 kevin

haha-test-011 haha-test-024 haha-test-037 haha-test-050 haha-test-063 haha-test-076 haha-test-089

haha-test-012 haha-test-025 haha-test-038 haha-test-051 haha-test-064 haha-test-077 haha-test-090

haha-test-013 haha-test-026 haha-test-039 haha-test-052 haha-test-065 haha-test-078 haha-test-091

The files are distributed across the two replica sets: one on 192.168.10.239 and 192.168.10.212 (replicas of each other) and the other on 192.168.10.204 and 192.168.10.220

[root@GlusterFS-master ~]# ls /data/gluster/

haha-test-004 haha-test-015 haha-test-028 haha-test-040 haha-test-050 haha-test-070 haha-test-080 haha-test-091

haha-test-007 haha-test-017 haha-test-029 haha-test-042 haha-test-051 haha-test-072 haha-test-081 haha-test-093

haha-test-009 haha-test-019 haha-test-032 haha-test-043 haha-test-052 haha-test-073 haha-test-084 haha-test-094

haha-test-010 haha-test-020 haha-test-033 haha-test-044 haha-test-056 haha-test-075 haha-test-087 haha-test-095

haha-test-011 haha-test-021 haha-test-034 haha-test-047 haha-test-059 haha-test-076 haha-test-088 haha-test-097

haha-test-012 haha-test-022 haha-test-036 haha-test-048 haha-test-061 haha-test-078 haha-test-089 haha-test-100

haha-test-014 haha-test-027 haha-test-037 haha-test-049 haha-test-069 haha-test-079 haha-test-090 kevin

[root@GlusterFS-slave ~]# ls /data/gluster/

haha-test-004 haha-test-015 haha-test-028 haha-test-040 haha-test-050 haha-test-070 haha-test-080 haha-test-091

haha-test-007 haha-test-017 haha-test-029 haha-test-042 haha-test-051 haha-test-072 haha-test-081 haha-test-093

haha-test-009 haha-test-019 haha-test-032 haha-test-043 haha-test-052 haha-test-073 haha-test-084 haha-test-094

haha-test-010 haha-test-020 haha-test-033 haha-test-044 haha-test-056 haha-test-075 haha-test-087 haha-test-095

haha-test-011 haha-test-021 haha-test-034 haha-test-047 haha-test-059 haha-test-076 haha-test-088 haha-test-097

haha-test-012 haha-test-022 haha-test-036 haha-test-048 haha-test-061 haha-test-078 haha-test-089 haha-test-100

haha-test-014 haha-test-027 haha-test-037 haha-test-049 haha-test-069 haha-test-079 haha-test-090 kevin

[root@GlusterFS-slave2 ~]# ls /data/gluster/

haha-test-001 haha-test-013 haha-test-026 haha-test-041 haha-test-057 haha-test-065 haha-test-077 haha-test-096

haha-test-002 haha-test-016 haha-test-030 haha-test-045 haha-test-058 haha-test-066 haha-test-082 haha-test-098

haha-test-003 haha-test-018 haha-test-031 haha-test-046 haha-test-060 haha-test-067 haha-test-083 haha-test-099

haha-test-005 haha-test-023 haha-test-035 haha-test-053 haha-test-062 haha-test-068 haha-test-085 kevin

haha-test-006 haha-test-024 haha-test-038 haha-test-054 haha-test-063 haha-test-071 haha-test-086

haha-test-008 haha-test-025 haha-test-039 haha-test-055 haha-test-064 haha-test-074 haha-test-092

[root@GlusterFS-slave3 ~]# ls /data/gluster/

haha-test-001 haha-test-013 haha-test-026 haha-test-041 haha-test-057 haha-test-065 haha-test-077 haha-test-096

haha-test-002 haha-test-016 haha-test-030 haha-test-045 haha-test-058 haha-test-066 haha-test-082 haha-test-098

haha-test-003 haha-test-018 haha-test-031 haha-test-046 haha-test-060 haha-test-067 haha-test-083 haha-test-099

haha-test-005 haha-test-023 haha-test-035 haha-test-053 haha-test-062 haha-test-068 haha-test-085 kevin

haha-test-006 haha-test-024 haha-test-038 haha-test-054 haha-test-063 haha-test-071 haha-test-086

haha-test-008 haha-test-025 haha-test-039 haha-test-055 haha-test-064 haha-test-074 haha-test-092

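Files created inside a directory under the client mount point (for example /opt/gfsmount/kevin) are likewise distributed across the two replica sets: 192.168.10.239 and 192.168.10.212 (replicas of each other), and 192.168.10.204 and 192.168.10.220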

================Restricting Gluster access to trusted client IPs================

Allow access only from 192.168.1.*

[root@GlusterFS-master ~]# gluster volume set gluster_share auth.allow 192.168.1.*

volume set: success

Check the volume information

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Distributed-Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

Options Reconfigured:

auth.allow: 192.168.1.*

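Note the last line of the volume information: only clients whose IP matches 192.168.1.* may mount the volume.

Mounting from the client 192.168.10.213 now fails: the volume does not appear in the df output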

[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/

[root@Client ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/centos-root 38G 4.3G 33G 12% /

devtmpfs 1.9G 0 1.9G 0% /dev

tmpfs 1.9G 0 1.9G 0% /dev/shm

tmpfs 1.9G 8.6M 1.9G 1% /run

tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup

/dev/vda1 1014M 143M 872M 15% /boot

/dev/mapper/centos-home 19G 33M 19G 1% /home

tmpfs 380M 0 380M 0% /run/user/0

overlay 38G 4.3G 33G 12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged

shm 64M 0 64M 0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm

Next, change the allowed range

[root@GlusterFS-master ~]# gluster volume set gluster_share auth.allow 192.168.10.*

volume set: success

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Distributed-Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

Options Reconfigured:

auth.allow: 192.168.10.*

Mounting from the client 192.168.10.213 now succeeds

[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount/

[root@Client ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/centos-root 38G 4.3G 33G 12% /

devtmpfs 1.9G 0 1.9G 0% /dev

tmpfs 1.9G 0 1.9G 0% /dev/shm

tmpfs 1.9G 8.6M 1.9G 1% /run

tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup

/dev/vda1 1014M 143M 872M 15% /boot

/dev/mapper/centos-home 19G 33M 19G 1% /home

tmpfs 380M 0 380M 0% /run/user/0

overlay 38G 4.3G 33G 12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged

shm 64M 0 64M 0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm

192.168.10.239:gluster_share 2.0G 67M 2.0G 4% /opt/gfsmount
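The two mount attempts above behave the way wildcard matching would suggest. The sketch below only illustrates that idea with shell glob patterns; it is not Gluster's own matching code, and the function name ip_allowed is hypothetical.

```shell
# Succeeds if the client IP matches any pattern in a comma-separated
# allow list such as "192.168.1.*,10.0.0.5".
# $1: allow list, $2: client IP
ip_allowed() {
    local allow_list="$1" client_ip="$2" pattern matched=1
    local old_ifs="$IFS"
    set -f          # keep "*" in the list from expanding to file names
    IFS=,
    for pattern in $allow_list; do
        case "$client_ip" in
            $pattern) matched=0 ;;   # unquoted, so "*" acts as a wildcard
        esac
    done
    IFS="$old_ifs"
    set +f
    return "$matched"
}

# Example:
#   ip_allowed '192.168.1.*'  192.168.10.213 || echo "would be rejected"
#   ip_allowed '192.168.10.*' 192.168.10.213 && echo "would be allowed"
```

Note that 192.168.1.* does not cover 192.168.10.213, because the third octet must be exactly 1.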

================Gluster performance testing tools================

Start the iperf server on one node, for example on GlusterFS-master:

[root@GlusterFS-master ~]# yum install -y iperf

[root@GlusterFS-master ~]# iperf -s

------------------------------------------------------------

Server listening on TCP port 5001

TCP window size: 85.3 KByte (default)

------------------------------------------------------------

From the client, connect to the iperf server on GlusterFS-master to measure network throughput

[root@Client ~]# yum install -y iperf

[root@Client ~]# iperf -c 192.168.10.239

------------------------------------------------------------

Client connecting to 192.168.10.239, TCP port 5001

TCP window size: 85.0 KByte (default)

------------------------------------------------------------

[ 3] local 192.168.10.213 port 55276 connected with 192.168.10.239 port 5001

[ ID] Interval Transfer Bandwidth

[ 3] 0.0-10.0 sec 19.5 GBytes 16.7 Gbits/sec

The server side shows the same session. The figure is inflated because this is a virtual machine environment.

[root@GlusterFS-master ~]# iperf -s

------------------------------------------------------------

Server listening on TCP port 5001

TCP window size: 85.3 KByte (default)

------------------------------------------------------------

[ 4] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55276

[ ID] Interval Transfer Bandwidth

[ 4] 0.0-10.0 sec 19.5 GBytes 16.7 Gbits/sec

To generate more load, run several parallel client streams with the -P option. The client output looks like this

[root@Client ~]# iperf -c 192.168.10.239 -P 10

------------------------------------------------------------

Client connecting to 192.168.10.239, TCP port 5001

TCP window size: 85.0 KByte (default)

------------------------------------------------------------

[ 12] local 192.168.10.213 port 55296 connected with 192.168.10.239 port 5001

[ 6] local 192.168.10.213 port 55284 connected with 192.168.10.239 port 5001

[ 7] local 192.168.10.213 port 55286 connected with 192.168.10.239 port 5001

[ 3] local 192.168.10.213 port 55278 connected with 192.168.10.239 port 5001

[ 5] local 192.168.10.213 port 55282 connected with 192.168.10.239 port 5001

[ 4] local 192.168.10.213 port 55280 connected with 192.168.10.239 port 5001

[ 8] local 192.168.10.213 port 55288 connected with 192.168.10.239 port 5001

[ 9] local 192.168.10.213 port 55290 connected with 192.168.10.239 port 5001

[ 11] local 192.168.10.213 port 55294 connected with 192.168.10.239 port 5001

[ 10] local 192.168.10.213 port 55292 connected with 192.168.10.239 port 5001

[ ID] Interval Transfer Bandwidth

[ 12] 0.0-10.0 sec 3.18 GBytes 2.73 Gbits/sec

[ 11] 0.0-10.0 sec 616 MBytes 516 Mbits/sec

[ 10] 0.0-10.0 sec 3.26 GBytes 2.80 Gbits/sec

[ 6] 0.0-10.0 sec 3.18 GBytes 2.72 Gbits/sec

[ 7] 0.0-10.0 sec 3.18 GBytes 2.73 Gbits/sec

[ 3] 0.0-10.0 sec 616 MBytes 516 Mbits/sec

[ 4] 0.0-10.0 sec 3.07 GBytes 2.63 Gbits/sec

[ 8] 0.0-10.0 sec 2.90 GBytes 2.49 Gbits/sec

[ 9] 0.0-10.0 sec 2.89 GBytes 2.48 Gbits/sec

[ 5] 0.0-10.0 sec 3.06 GBytes 2.62 Gbits/sec

[SUM] 0.0-10.0 sec 25.9 GBytes 22.2 Gbits/sec

Server-side output:

[root@GlusterFS-master ~]# iperf -s

------------------------------------------------------------

Server listening on TCP port 5001

TCP window size: 85.3 KByte (default)

------------------------------------------------------------

[ 4] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55276

[ ID] Interval Transfer Bandwidth

[ 4] 0.0-10.0 sec 19.5 GBytes 16.7 Gbits/sec

[ 4] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55278

[ 5] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55280

[ 7] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55284

[ 6] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55282

[ 8] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55286

[ 9] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55288

[ 10] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55290

[ 13] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55294

[ 11] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55292

[ 12] local 192.168.10.239 port 5001 connected with 192.168.10.213 port 55296

[ 13] 0.0-10.5 sec 616 MBytes 491 Mbits/sec

[ 4] 0.0-10.5 sec 616 MBytes 491 Mbits/sec

[ 5] 0.0-10.5 sec 3.07 GBytes 2.50 Gbits/sec

[ 7] 0.0-10.5 sec 3.18 GBytes 2.59 Gbits/sec

[ 8] 0.0-10.5 sec 3.18 GBytes 2.59 Gbits/sec

[ 9] 0.0-10.5 sec 2.90 GBytes 2.37 Gbits/sec

[ 6] 0.0-10.5 sec 3.06 GBytes 2.49 Gbits/sec

[ 10] 0.0-10.5 sec 2.89 GBytes 2.35 Gbits/sec

[ 11] 0.0-10.5 sec 3.26 GBytes 2.65 Gbits/sec

[ 12] 0.0-10.5 sec 3.18 GBytes 2.59 Gbits/sec

[SUM] 0.0-10.5 sec 25.9 GBytes 21.1 Gbits/sec

-----------------------------------------------------

Testing with dd: client write and read speed

[root@Client ~]# mount -t glusterfs 192.168.10.239:gluster_share /opt/gfsmount

[root@Client ~]# df -h

.......

192.168.10.239:gluster_share 2.0G 68M 2.0G 4% /opt/gfsmount

Test the write speed (writing a 500M file)

[root@Client ~]# dd if=/dev/zero of=/opt/gfsmount/grace bs=1M count=500

500+0 records in

500+0 records out

524288000 bytes (524 MB) copied, 0.981631 s, 534 MB/s

[root@Client ~]# du -sh /opt/gfsmount/grace

500M /opt/gfsmount/grace

Test the read speed

[root@Client ~]# dd if=/opt/gfsmount/grace of=/dev/null bs=1M count=500

500+0 records in

500+0 records out

524288000 bytes (524 MB) copied, 1.00698 s, 521 MB/s

Run the test again; results on virtual machines are not very stable

[root@Client ~]# dd if=/dev/zero of=/opt/gfsmount/grace1 bs=1M count=500

500+0 records in

500+0 records out

524288000 bytes (524 MB) copied, 1.15882 s, 452 MB/s

[root@Client ~]# dd if=/opt/gfsmount/grace1 of=/dev/null bs=1M count=500

500+0 records in

500+0 records out

524288000 bytes (524 MB) copied, 1.0682 s, 491 MB/s
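The rates dd prints are simply bytes divided by elapsed seconds, in units of 1,000,000 bytes. The sketch below reproduces that calculation and adds a write variant that flushes data before reporting, since an unsynced dd may partly measure caching. Both function names are assumptions for illustration; conv=fdatasync is a GNU dd option.

```shell
# Compute throughput in MB/s (1 MB = 1,000,000 bytes, as dd reports it).
# $1: bytes copied, $2: elapsed seconds
throughput_mb_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'
}

# Write test that forces file data to be flushed before dd reports a rate.
# $1: target file on the Gluster mount, $2: number of 1M blocks
dd_write_synced() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fdatasync
}

# Example, using the figures from the first write test above:
#   throughput_mb_s 524288000 0.981631
# Example synced write (not executed here):
#   dd_write_synced /opt/gfsmount/grace2 500
```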

================Handling typical Gluster cluster failures=================

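1) Inconsistent data in a replicated volume

Symptom: the two replicas of a volume hold inconsistent data

Simulation: delete data from one of the bricks

Fix: access the files to trigger self-heal: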

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Distributed-Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

Options Reconfigured:

On the client:

[root@Client ~]# df -h

.......

192.168.10.239:gluster_share 2.0G 1.1G 961M 53% /opt/gfsmount

[root@Client ~]# rm -rf /opt/gfsmount/*

[root@Client ~]# cd /opt/gfsmount/

[root@Client gfsmount]# touch a b c d e f g

The files written from the client are distributed across the two replica sets on the four nodes

[root@GlusterFS-master ~]# ls /data/gluster/

a b c e

[root@GlusterFS-slave ~]# ls /data/gluster/

a b c e

[root@GlusterFS-slave2 ~]# ls /data/gluster/

d f g

[root@GlusterFS-slave3 ~]# ls /data/gluster/

d f g

Simulating the problem:

Delete files on the GlusterFS-slave2 machine

[root@GlusterFS-slave2 ~]# cd /data/gluster/

[root@GlusterFS-slave2 gluster]# ls

d f g

[root@GlusterFS-slave2 gluster]# rm -rf d

[root@GlusterFS-slave2 gluster]# rm -rf f

[root@GlusterFS-slave2 gluster]# ls

GlusterFS-slave3, the replica peer of GlusterFS-slave2, still holds the deleted files

[root@GlusterFS-slave3 ~]# cd /data/gluster/

[root@GlusterFS-slave3 gluster]# ls

d f g

Accessing the files from the client triggers automatic repair

[root@Client ~]# cd /opt/gfsmount/

[root@Client gfsmount]# ls

a b c d e f g

[root@Client gfsmount]# cat d

[root@Client gfsmount]# cat f

Check GlusterFS-slave2 again: the deleted files have been healed automatically

[root@GlusterFS-slave2 ~]# ls /data/gluster/

d f g

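2) Incorrect configuration on a glusterfs cluster node

Simulation: delete part of the configuration on server2

Configuration location: /usr/local/glusterfs/var/lib/glusterd

Fix: use the gluster CLI to sync the configuration from a healthy node (run on server2)

gluster volume sync server1 all

3) Recovering a brick of a replicated volume

Symptom: one brick of a two-replica volume is damaged

Recovery procedure:

a) Recreate the directory of the failed brick

b) Set its extended attributes, copying the values from the other replica brick

# setfattr -n trusted.gfid -v 0x00000000000000000000000000000001 /data2

# setfattr -n trusted.glusterfs.dht -v 0x000000010000000000000000ffffffff /data2

# setfattr -n trusted.glusterfs.volume-id -v 0xcc51d546c0af4215a72077ad9378c2ac /data2

Replace each -v value with the value read from your own healthy brick

c) Restart the glusterd service

d) Trigger data self-heal by reading the files through a client mount of the volume (replace /data/glusterd with your own mount point)

# find /data/glusterd -type f -print0 | xargs -0 head -c1 >/dev/null

Simulating the loss of a brick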

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Distributed-Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

Options Reconfigured:

Delete the brick data on the GlusterFS-slave node

[root@GlusterFS-slave ~]# ls /data/gluster/

a b c e

[root@GlusterFS-slave ~]# rm -rf /data/gluster

[root@GlusterFS-slave ~]# ll /data/gluster

ls: cannot access /data/gluster: No such file or directory

Next, on GlusterFS-master (the replica peer of GlusterFS-slave), read the extended attributes of the brick (use "yum search getfattr" to find the package that provides the getfattr tool)

[root@GlusterFS-master ~]# yum install -y attr.x86_64

[root@GlusterFS-master ~]# cd /data/

[root@GlusterFS-master data]# getfattr -d -m . -e hex gluster/

# file: gluster/

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000

trusted.gfid=0x00000000000000000000000000000001

trusted.glusterfs.dht=0x0000000100000000000000007ffffc25

trusted.glusterfs.volume-id=0xa9f989bd7edd4089836ad9f742b8d37a

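Note that the following steps use the attribute values shown above

Recreate the directory of the failed brick

To recover, copy the extended attributes obtained from GlusterFS-master (note the volume-id above: 0xa9f989bd7edd4089836ad9f742b8d37a). The order in which the attributes are set does not matter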

[root@GlusterFS-slave ~]# mkdir /data/gluster

[root@GlusterFS-slave ~]# yum install -y attr.x86_64

[root@GlusterFS-slave ~]# getfattr -d -m . -e hex /data/gluster

getfattr: Removing leading '/' from absolute path names

# file: data/gluster

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000

[root@GlusterFS-slave ~]# setfattr -n trusted.glusterfs.volume-id -v 0xa9f989bd7edd4089836ad9f742b8d37a /data/gluster

[root@GlusterFS-slave ~]# getfattr -d -m . -e hex /data/gluster

getfattr: Removing leading '/' from absolute path names

# file: data/gluster

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000

trusted.glusterfs.volume-id=0xa9f989bd7edd4089836ad9f742b8d37a

[root@GlusterFS-slave ~]# setfattr -n trusted.gfid -v 0x00000000000000000000000000000001 /data/gluster

[root@GlusterFS-slave ~]# setfattr -n trusted.glusterfs.dht -v 0x0000000100000000000000007ffffc25 /data/gluster

[root@GlusterFS-slave ~]# getfattr -d -m . -e hex /data/gluster

getfattr: Removing leading '/' from absolute path names

# file: data/gluster

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000

trusted.gfid=0x00000000000000000000000000000001

trusted.glusterfs.dht=0x0000000100000000000000007ffffc25

trusted.glusterfs.volume-id=0xa9f989bd7edd4089836ad9f742b8d37a
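Comparing the outputs above, the trusted.glusterfs.volume-id value is the Volume ID from gluster volume info with the dashes removed and a 0x prefix. A small helper can do that conversion; the name volume_id_to_xattr is hypothetical, and the result should still be compared against getfattr output from the healthy brick.

```shell
# Convert a Volume ID (UUID with dashes) into the hex form that
# setfattr expects for trusted.glusterfs.volume-id.
# $1: Volume ID as printed by "gluster volume info"
volume_id_to_xattr() {
    printf '0x%s\n' "$(printf '%s' "$1" | tr -d '-')"
}

# Example (the setfattr call is not executed here):
#   setfattr -n trusted.glusterfs.volume-id \
#            -v "$(volume_id_to_xattr a9f989bd-7edd-4089-836a-d9f742b8d37a)" /data/gluster
```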

Restart the gluster service

[root@GlusterFS-slave ~]# ps -ef|grep gluster

root 4909 1 0 15:12 ? 00:00:01 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /usr/local/glusterfs/var/lib/glusterd/glustershd/run/glustershd.pid -l /usr/local/glusterfs/var/log/glusterfs/glustershd.log -S /var/run/f9f65f2dbb0c193ecab167839d75699e.socket --xlator-option *replicate*.node-uuid=f8e69297-4690-488e-b765-c1c404810d6a

root 5069 5044 0 17:31 pts/0 00:00:00 grep --color=auto gluster

root 32450 1 0 Apr08 ? 00:00:26 /usr/local/glusterfs/sbin/glusterd

[root@GlusterFS-slave ~]# ps -ef|grep gluster|awk '{print $2}'|xargs kill -9

kill: sending signal to 5071 failed: No such process

[root@GlusterFS-slave ~]# ps -ef|grep gluster

root 5078 5044 0 17:32 pts/0 00:00:00 grep --color=auto gluster

[root@GlusterFS-slave ~]# /usr/local/glusterfs/sbin/glusterd

[root@GlusterFS-slave ~]# ps -ef|grep gluster

root 5080 1 14 17:32 ? 00:00:00 /usr/local/glusterfs/sbin/glusterd

root 5212 5044 0 17:32 pts/0 00:00:00 grep --color=auto gluster

[root@GlusterFS-slave ~]# ls /data/gluster/

[root@GlusterFS-slave ~]#

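Try the following three methods in turn to restore the data

a) After restarting the gluster service, check whether the data on GlusterFS-slave has been restored. If not, try method b)

b) On the client mount point, cat the deleted files or write new data to trigger self-heal. If the data is still not restored, try method c)

c) Restart the gluster_share volume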

[root@GlusterFS-master ~]# gluster volume stop gluster_share

Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y

volume stop: gluster_share: success

[root@GlusterFS-master ~]# gluster volume start gluster_share

volume start: gluster_share: success

[root@GlusterFS-master ~]# gluster volume info

Volume Name: gluster_share

Type: Distributed-Replicate

Volume ID: a9f989bd-7edd-4089-836a-d9f742b8d37a

Status: Started

Number of Bricks: 2 x 2 = 4

Transport-type: tcp

Bricks:

Brick1: 192.168.10.239:/data/gluster

Brick2: 192.168.10.212:/data/gluster

Brick3: 192.168.10.204:/data/gluster

Brick4: 192.168.10.220:/data/gluster

[root@GlusterFS-master ~]# gluster volume status gluster_share

Status of volume: gluster_share

Gluster process Port Online Pid

------------------------------------------------------------------------------

Brick 192.168.10.239:/data/gluster 49155 Y 15508

Brick 192.168.10.212:/data/gluster 49156 Y 5249

Brick 192.168.10.204:/data/gluster 49156 Y 12162

Brick 192.168.10.220:/data/gluster 49157 Y 17402

NFS Server on localhost N/A N N/A

Self-heal Daemon on localhost N/A Y 15527

NFS Server on 192.168.10.204 N/A N N/A

Self-heal Daemon on 192.168.10.204 N/A Y 12181

NFS Server on 192.168.10.220 N/A N N/A

Self-heal Daemon on 192.168.10.220 N/A Y 17421

NFS Server on 192.168.10.212 N/A N N/A

Self-heal Daemon on 192.168.10.212 N/A Y 5268

Task Status of Volume gluster_share

------------------------------------------------------------------------------

Task : Rebalance

ID : 26035833-3c20-4822-b065-7a5e15d30b85

Status : completed

[root@GlusterFS-master ~]# gluster volume heal gluster_share info

Brick GlusterFS-master:/data/gluster/

Number of entries: 0

Brick GlusterFS-slave:/data/gluster/

Number of entries: 0

Brick GlusterFS-slave2:/data/gluster/

Number of entries: 0

Brick GlusterFS-slave3:/data/gluster/

Number of entries: 0
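When there are many bricks, the heal info output is easier to act on as a single number. The sketch below sums the "Number of entries" lines in the format shown above; the function name pending_heal_entries is made up for this example, and other GlusterFS versions may print additional lines.

```shell
# Reads "gluster volume heal <vol> info" output on stdin and prints the
# total number of entries still waiting to be healed across all bricks.
pending_heal_entries() {
    awk -F': *' '
        /^Number of entries:/ { total += $2 }
        END { print total + 0 }   # "+ 0" prints 0 when no line matched
    '
}

# Example (not executed here):
#   gluster volume heal gluster_share info | pending_heal_entries
```

A result of 0, as in the output above, means no brick is reporting pending heals.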

Check whether the data has been restored

[root@GlusterFS-slave ~]# cd /data/gluster/

[root@GlusterFS-slave gluster]# ls

a b c

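Note:

As described above, the data on GlusterFS-slave is restored once new data is written at the client mount point. If the restored data still differs from that of its replica peer GlusterFS-master, cat the inconsistent files at the client mount point to trigger self-heal, and GlusterFS-slave will repair them automatically.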

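================Essentials of tuning glusterfs clusters in production=================

Key system considerations

1) Performance requirements

2) Read/Write

3) Throughput/IOPS/availability

4) Workload

5) Which applications?

6) Large files?

7) Small files?

8) Requirements beyond throughput?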

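System configuration

1) Choose the volume type that suits the workload

2) Volume types

3) DHT (distribute): high performance, no redundancy

4) AFR (replicate): high availability, good read performance

5) STP (stripe): highly concurrent reads, poor write performance, no redundancy

6) Protocol/performance

7) Native: best performance

8) NFS: can give the best performance for specific applications

9) CIFS: Windows platforms only

10) Data flow

11) Data flow differs between access protocols

12) In production, a hash (distribute) + replicate volume is a must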

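System hardware configuration

1) Node and cluster configuration

2) More CPUs: support for more concurrent threads

3) More memory: support for a larger cache

4) More network ports: support for higher throughput

5) A dedicated back-end network for intra-cluster communication

6) NFS/CIFS access requires a dedicated back-end network

7) At least 10GbE is recommended

8) The native protocol is used for communication between nodes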

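Performance lessons learned

1) GlusterFS performance depends heavily on the hardware

2) Size the hardware based on a thorough understanding of the application

3) The default parameters target general-purpose use

4) GlusterFS exposes a number of performance tuning parameters

5) When troubleshooting performance, rule out disk and network faults first

6) Building each RAID set from 6 to 8 disks is recommended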

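System scale and architecture

1) Performance is, in theory, determined by the hardware configuration

2) CPU/Mem/Disk/Network

3) System scale is determined by performance and capacity requirements

4) 2U/4U storage servers and JBODs are well suited to building bricks

5) Three typical deployments

6) Capacity-oriented applications

7) 2U/4U storage servers + multiple JBODs

8) Low CPU/RAM/Network requirements

9) Applications with mixed performance and capacity requirements

10) 2U/4U storage servers + a few JBODs

11) High CPU/RAM, low Network

12) Performance-oriented applications

13) 1U/2U storage servers (no JBOD)

14) High CPU/RAM, fast Disk/Network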

System tuning

1) Key tuning parameters

2) performance.write-behind-window-size 65535 (bytes)

3) performance.cache-refresh-timeout 1 (seconds)

4) performance.cache-size 1073741824 (bytes)

5) performance.read-ahead off (1GbE only)

6) performance.io-thread-count 24 (number of CPU cores)

7) performance.client-io-threads on (client side)

8) performance.write-behind on

9) performance.flush-behind on

10) cluster.stripe-block-size 4MB (default 128KB)

11) nfs.disable off (NFS enabled by default)

12) The default settings suit mixed workloads

13) Tune differently for different applications

14) Understand the hardware/firmware configuration and its effect on performance

15) For example CPU frequency, IB, 10GbE, TCP offload
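The parameters listed above are applied per volume with gluster volume set. The sketch below only prints the commands so they can be reviewed first; the function name print_tuning_commands is hypothetical, the values are copied from the list and are starting points rather than recommendations, and cluster.stripe-block-size is left out because it applies only to striped volumes.

```shell
# Print one "gluster volume set" command per tuning option for a volume.
# $1: volume name
print_tuning_commands() {
    local volume="$1" option value
    while read -r option value; do
        echo "gluster volume set $volume $option $value"
    done <<'EOF'
performance.write-behind-window-size 65535
performance.cache-refresh-timeout 1
performance.cache-size 1073741824
performance.read-ahead off
performance.io-thread-count 24
performance.client-io-threads on
performance.write-behind on
performance.flush-behind on
nfs.disable off
EOF
}

# Review the commands, then pipe them to a shell to apply them:
#   print_tuning_commands gluster_share
#   print_tuning_commands gluster_share | sh
```

Printing first makes it easy to drop options that do not fit the workload before anything is changed on the cluster.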

KVM optimization

1) Use the QEMU-GlusterFS (libgfapi) integration

2) gluster volume set <volume> group virt

3) tuned-adm profile rhs-virtualization

4) KVM host: tuned-adm profile virtual-host

5) Use separate volumes for images and application data

6) No more than 2 KVM hosts per gluster node (16 guests/host)

7) Improve response time

8) Lower /sys/block/vda/queue/nr_requests

9) Server/Guest: 128/8 (defaults 256/128)

10) Increase read bandwidth

11) Raise /sys/block/vda/queue/read_ahead_kb

12) VM readahead: 4096 (default 128)

Published on 2023-06-25 20:11