I. Introduction to GlusterFS
GlusterFS is a scalable distributed file system that aggregates disk storage resources from multiple servers into a single global namespace.
Advantages
- Scales to several petabytes
- Handles thousands of clients
- POSIX compatible
- Runs on commodity hardware
- Works with any on-disk file system that supports extended attributes
- Accessible over industry-standard protocols such as NFS and SMB
- Provides replication, quotas, geo-replication, snapshots, and bitrot detection
- Allows tuning for different workloads
- Open source
A brief comparison with Ceph
(The original comparison image is not available.)
Environment requirements
- At least three nodes
- Fedora 30 (or later) on three nodes named "server1", "server2", and "server3" (the deployment below uses CentOS packages and addresses the nodes by IP, but the requirements are the same)
- Working network connectivity
- At least two virtual disks on each VM, one for the OS installation and one for serving GlusterFS storage (sdb), to keep the GlusterFS storage separate from the OS installation
- NTP configured on every server, so that the applications running on top of the file system behave correctly (a minimal chrony setup is sketched below)
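Time synchronization is easy to overlook; a minimal sketch using chrony (the default NTP implementation on recent Fedora/CentOS, assumed here), run on every node:
$ yum install -y chrony
$ systemctl enable --now chronyd
$ chronyc sources    # at least one time source should be reachable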
II. Deploying GlusterFS
1. Deploy the GlusterFS cluster
Use /dev/sdb1 as the GlusterFS storage disk. Run this step on all nodes, server{1,2,3}:
$ mkfs.xfs -i size=512 /dev/sdb1
$ mkdir -p /data/glusterfs
$ echo '/dev/sdb1 /data/glusterfs xfs defaults 1 2' >> /etc/fstab
$ mount -a && mount
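Before installing GlusterFS it is worth confirming that the brick file system actually mounted (a quick sanity check, nothing more):
$ df -hT /data/glusterfs    # should show /dev/sdb1 mounted as xfs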
Install GlusterFS:
$ yum install -y centos-release-gluster
$ yum install glusterfs-server
Start the GlusterFS service and enable it at boot:
$ systemctl start glusterd
$ systemctl enable glusterd
On server1, add the other nodes to the trusted pool:
$ gluster peer probe 192.168.16.174
$ gluster peer probe 192.168.16.175
Check the trusted pool status:
$ gluster peer status
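Note that gluster peer status reports the other peers from the perspective of the node you run it on; for a compact view that should list all three servers, `gluster pool list` can be run on any node:
$ gluster pool list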
2. Volume types
- Distributed volume: the default volume type. Files are distributed across the bricks in the volume, so a given file, say file1, is stored on either brick1 or brick2 but never on both. There is therefore no data redundancy. The purpose of this volume type is to scale the volume size easily and cheaply; the trade-off is that a brick failure means complete loss of the data on that brick, and protection against data loss must come from the underlying hardware.
Create it as follows:
$ gluster volume create test-volume server1:/data server2:/data
- Replicated volume: the data is kept as copies on all bricks. The replica count is chosen when the volume is created, and at least three replicas are recommended. The main advantage of this volume type is that even if one brick fails, the data can still be read from its replica bricks. Such volumes are used for better reliability and data redundancy.
Create it as follows:
$ gluster volume create test-volume replica 3 server1:/data server2:/data server3:/data
- Distributed replicated volume: files are distributed across replicated sets of bricks. The number of bricks must be a multiple of the replica count, and the order in which the bricks are listed matters, because adjacent bricks become replicas of each other. This volume type is used for data that needs both high availability (through redundancy) and scalable storage. With eight bricks and a replica count of 2, the first two bricks become replicas of each other, then the next two, and so on; such a volume is denoted 4x2. Similarly, with eight bricks and a replica count of 4, each group of four bricks replicates each other, and the volume is denoted 2x4. (A quick layout check follows the create command below.)
Create it as follows:
# three replicas spread over six nodes (a 2 x 3 distributed-replicate layout)
$ gluster volume create test-volume replica 3 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4 server5:/exp5 server6:/exp6
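A quick way to confirm the resulting layout (not part of the original walkthrough): with six bricks and replica 3, `gluster volume info` should report the brick count as 2 x 3 = 6, using the same "distribute x replica" notation seen for gv0 below.
$ gluster volume info test-volume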
3. Create a replicated volume
Create the brick directory on all nodes:
$ mkdir -p /data/brick1/gv0
On any node, create a three-replica GlusterFS volume named "gv0". The force flag is needed here because /data/brick1 is not on the dedicated /data/glusterfs mount created earlier, and GlusterFS refuses to place bricks on the root partition without it:
$ gluster volume create gv0 replica 3 192.168.16.173:/data/brick1/gv0 192.168.16.174:/data/brick1/gv0 192.168.16.175:/data/brick1/gv0 force
volume create: gv0: success: please start the volume to access data
Start the volume:
$ gluster volume start gv0
volume start: gv0: success
View the volume information:
$ gluster volume info
Volume Name: gv0
Type: Replicate
Volume ID: 90b4c648-3fe4-4fcb-8e1e-0a29379eef6c
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.16.173:/data/brick1/gv0
Brick2: 192.168.16.174:/data/brick1/gv0
Brick3: 192.168.16.175:/data/brick1/gv0
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: off
Test the GlusterFS volume by mounting it on any node:
$ mount -t glusterfs 192.168.16.173:/gv0 /mnt
$ touch /mnt/test
$ ls /mnt
test
Check the brick directory on the GlusterFS nodes:
$ ls /data/brick1/gv0
test
Mount the volume on another node and delete the test file:
$ mount -t glusterfs 192.168.16.173:/gv0 /mnt
$ ls /mnt
test
$ rm -f /mnt/test
Check the brick directory on the other nodes:
$ ls /data/brick1/gv0
test
III. Using GlusterFS with Kubernetes
Create the Endpoints for GlusterFS:
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
  namespace: default
subsets:
- addresses:
  - ip: 192.168.16.173
  - ip: 192.168.16.174
  - ip: 192.168.16.175
  ports:
  - port: 49152
    protocol: TCP
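Once the manifest is applied, a quick check should show all three GlusterFS node addresses behind the Endpoints object:
$ kubectl get endpoints glusterfs-cluster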
Create a PV that uses GlusterFS:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  glusterfs:
    endpoints: glusterfs-cluster
    path: gv0          # the volume created above
    readOnly: false
Create the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: glusterfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
Create these resources:
$ kubectl create -f glusterfs-cluster.yaml
endpoints/glusterfs-cluster created
$ kubectl create -f glusterfs-pv.yaml
persistentvolume/glusterfs-pv created
$ kubectl create -f glusterfs-pvc.yaml
persistentvolumeclaim/glusterfs-pvc created
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
glusterfs-pvc Bound glusterfs-pv 10Gi RWX 3s
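Note that the claim requested only 2Gi but bound to the 10Gi PV defined above: with statically created PVs, Kubernetes binds a claim to any available PV large enough to satisfy the request, and the claim then reports the PV's full capacity. The binding details can be inspected with:
$ kubectl describe pv glusterfs-pv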
Create an application that uses the GlusterFS volume:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      name: nginx
  template:
    metadata:
      labels:
        name: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginxglusterfs
          mountPath: "/usr/share/nginx/html"
      volumes:
      - name: nginxglusterfs
        persistentVolumeClaim:
          claimName: glusterfs-pvc
Create the Pods:
$ kubectl apply -f nginx-deployment.yaml
deployment.apps/nginx-deployment created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-5dd796bd85-fb47x 1/1 Running 13 31d
nginx-deployment-7ff54c698c-ffdzn 1/1 Running 0 88s
nginx-deployment-7ff54c698c-ppdj9 1/1 Running 0 2m11s
Check whether the volume was mounted successfully in the container:
$ kubectl exec nginx-deployment-7ff54c698c-ffdzn ls /usr/share/nginx/html/
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
test
$ kubectl exec nginx-deployment-7ff54c698c-ffdzn mount | grep gv0
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
192.168.16.173:gv0 on /usr/share/nginx/html type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
The test file created earlier is visible inside the container, and the container has successfully mounted the GlusterFS gv0 volume.
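As an optional end-to-end check (a sketch only; the pod IP comes from `kubectl get pods -o wide` and must be reachable from wherever you run curl), write a page into the volume from a GlusterFS mount and fetch it through nginx:
$ echo "hello glusterfs" > /mnt/index.html    # on a node where gv0 is mounted at /mnt
$ kubectl get pods -o wide                    # note one of the pod IPs
$ curl http://<pod-ip>/                       # should return the page served from the gluster volume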
IV. Managing GlusterFS with a StorageClass
Heketi is a framework that provides a RESTful API for managing GlusterFS volumes, making it easier for administrators to operate GlusterFS:
- it manages the life cycle of GlusterFS volumes;
- it enables dynamic storage provisioning on cloud platforms such as OpenStack, Kubernetes, and OpenShift (it dynamically selects bricks within the GlusterFS cluster to build volumes);
- it supports managing multiple GlusterFS clusters.
Current environment

| IP address | Hostname | Disk |
| --- | --- | --- |
| 192.168.16.173 | k8s-master01 | /dev/sdc |
| 192.168.16.174 | k8s-node01 | /dev/sdc |
| 192.168.16.175 | k8s-node02 | /dev/sdc |
Download heketi:
wget https://github.com/heketi/heketi/releases/download/v8.0.0/heketi-v8.0.0.linux.amd64.tar.gz
tar xf heketi-v8.0.0.linux.amd64.tar.gz
Heketi manages GlusterFS over SSH, so set up passwordless SSH login to all GlusterFS nodes.
ssh-keygen -f heketi_key -t rsa -N ''
# copy the public key to all GlusterFS nodes
ssh-copy-id -i heketi_key.pub root@192.168.16.173
ssh-copy-id -i heketi_key.pub root@192.168.16.174
ssh-copy-id -i heketi_key.pub root@192.168.16.175
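Before moving on, it is worth confirming that key-based login works from the heketi host (the key path below assumes the /root/heketi directory referenced later in heketi.json):
$ ssh -i /root/heketi/heketi_key root@192.168.16.174 hostname    # should print k8s-node01 without asking for a password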
Edit the heketi.json configuration file:
{
  "_port_comment": "Heketi Server Port Number",
  "port": "18080",    # change the port number
----------------------------------------------------
  "_use_auth": "Enable JWT authorization. Please enable for deployment",
  "use_auth": true,    # enable authentication
  "_jwt": "Private keys for access",
  "jwt": {
    "_admin": "Admin has access to all APIs",
    "admin": {
      "key": "adminkey"    # define the admin key
    },
    "_user": "User only has access to /volumes endpoint",
    "user": {
      "key": "userkey"    # define the user key
    }
  },
-----------------------------------------------
  "executor": "ssh",    # use the SSH executor
  "_sshexec_comment": "SSH username and private key file information",
  "sshexec": {
    "keyfile": "/root/heketi/heketi_key",    # the key generated above
    "user": "root",
    "port": "22",
    "fstab": "/etc/fstab",
    "backup_lvm_metadata": false
  },
----------------------------------------------
  "_db_comment": "Database file name",
  "db": "/root/heketi/heketi.db",    # keep the database under the heketi directory
---------------------------------------------
Add heketi as a systemd service:
$ cat /usr/lib/systemd/system/heketi.service
[Unit]
Description=RESTful based volume management framework for GlusterFS
Wants=network-online.target
After=network-online.target
Documentation=https://github.com/heketi/heketi
[Service]
Type=simple
LimitNOFILE=65536
# adjust the paths below to match your heketi installation
ExecStart=/root/heketi/heketi --config=/root/heketi/heketi.json
KillMode=process
Restart=on-failure
RestartSec=5
SuccessExitStatus=15
StandardOutput=syslog
StandardError=syslog
[Install]
WantedBy=multi-user.target
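After installing the unit file, reload systemd, start heketi, and make sure it is listening on the configured port. Heketi exposes a simple /hello endpoint that works as a basic health check:
$ systemctl daemon-reload
$ systemctl start heketi
$ systemctl enable heketi
$ curl http://192.168.16.173:18080/hello    # a short greeting means the server is up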
Create a Heketi cluster:
$ ./heketi-cli --user admin --server http://192.168.16.173:18080 --secret adminkey --json cluster create
{"id":"6e484a07e4c841666a6927220fcdcfc1","nodes":[],"volumes":[],"block":true,"file":true,"blockvolumes":[]}
Add the three GlusterFS nodes to the cluster as nodes:
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" node add --cluster "6e484a07e4c841666a6927220fcdcfc1" --management-host-name k8s-master01 --storage-host-name 192.168.16.173 --zone 1
Node information:
Id: bc601dd85cf45fa0492da2f53099845b
State: online
Cluster Id: 6e484a07e4c841666a6927220fcdcfc1
Zone: 1
Management Hostname k8s-master01
Storage Hostname 192.168.16.173
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" node add --cluster "6e484a07e4c841666a6927220fcdcfc1" --management-host-name k8s-node01 --storage-host-name 192.168.16.174 --zone 1
Node information:
Id: e87dd0045a03d9636e9d3d15ec5221ee
State: online
Cluster Id: 6e484a07e4c841666a6927220fcdcfc1
Zone: 1
Management Hostname k8s-node01
Storage Hostname 192.168.16.174
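The third node, k8s-node02 (192.168.16.175), is added in exactly the same way; its output (and node ID) is omitted here:
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" node add --cluster "6e484a07e4c841666a6927220fcdcfc1" --management-host-name k8s-node02 --storage-host-name 192.168.16.175 --zone 1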
Add devices. Volumes are built on top of devices; note that heketi currently only accepts raw (unformatted) partitions or disks as devices, not file systems:
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" --json device add --name="/dev/sdc" --node "bc601dd85cf45fa0492da2f53099845b"
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" --json device add --name="/dev/sdc" --node "e87dd0045a03d9636e9d3d15ec5221ee"
Device added successfully
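The same device add command is repeated for the third node's ID (use `node list` to look it up), and the resulting layout can be reviewed with heketi-cli's topology command:
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" node list
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" topology info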
Create a 3 GB volume with 3 replicas:
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" volume create --size 3 --replica 3
Name: vol_20cd37910466e10a65996e2ee826c416
Size: 3
Volume Id: 20cd37910466e10a65996e2ee826c416
Cluster Id: 6e484a07e4c841666a6927220fcdcfc1
Mount: 192.168.16.173:vol_20cd37910466e10a65996e2ee826c416
Mount Options: backup-volfile-servers=192.168.16.174
Block: false
Free Size: 0
Reserved Size: 0
Block Hosting Restriction: (none)
Block Volumes: []
Durability Type: replicate
Distributed+Replica: 3
From the output above we can see that Heketi created a volume named vol_20cd37910466e10a65996e2ee826c416.
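The volume can also be listed and inspected later through heketi-cli:
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" volume list
$ ./heketi-cli --server http://192.168.16.173:18080 --user "admin" --secret "adminkey" volume info 20cd37910466e10a65996e2ee826c416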
Create the StorageClass:
apiVersion: v1
kind: Secret
metadata:
  name: heketi-secret
  namespace: default
data:
  # base64 encoded password. E.g.: echo -n "mypassword" | base64
  key: bXlwYXNzd29yZA==
type: kubernetes.io/glusterfs
---
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: glusterfs
provisioner: kubernetes.io/glusterfs
allowVolumeExpansion: true
parameters:
  resturl: "http://192.168.16.173:18080"
  clusterid: "6e484a07e4c841666a6927220fcdcfc1"
  restauthenabled: "true"
  restuser: "admin"
  #secretNamespace: "default"
  #secretName: "heketi-secret"
  restuserkey: "adminkey"
  gidMin: "40000"
  gidMax: "50000"
  volumetype: "replicate:3"
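One caveat: this StorageClass passes the admin key directly via restuserkey, so the heketi-secret above (whose key is the base64 of the placeholder "mypassword") is not actually used. If you switch to the commented-out secretName/secretNamespace form, the secret must instead contain the base64 encoding of the real heketi admin key, "adminkey" in this setup:
$ echo -n "adminkey" | base64
YWRtaW5rZXk=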
Create a StatefulSet application to test it, using nginx as an example:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "glusterfs" # use the StorageClass created above
      resources:
        requests:
          storage: 2Gi
Create the resources:
$ kubectl apply -f storageclass-glusterfs.yaml
secret/heketi-secret created
storageclass.storage.k8s.io/glusterfs created
$ kubectl apply -f statefulset.yaml
service/nginx created
statefulset.apps/nginx created
Check the status:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-903bbd49-4f71-4a7c-833b-212be8a949ad 2Gi RWO Delete Bound default/www-nginx-0 glusterfs 5s
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
www-nginx-0 Bound pvc-903bbd49-4f71-4a7c-833b-212be8a949ad 2Gi RWO glusterfs 28s
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-0 1/1 Running 0 92s 10.244.2.124 k8s-node01 <none> <none>
nginx-1 1/1 Running 0 2m12s 10.244.0.207 k8s-master01 <none> <none>
nginx-2 1/1 Running 0 2m12s 10.244.0.208 k8s-node02 <none> <none>
The PV and PVC above were created automatically and are Bound, and the Pods are running normally. Now we can look at the mount information inside the containers:
$ kubectl exec nginx-0 mount | grep 192
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
192.168.16.174:vol_97ae4dc4d9782df08970e7c5f8516c8c on /usr/share/nginx/html type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
$ kubectl exec nginx-1 mount | grep 192
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
192.168.16.173:vol_5537441457775eb4e020a3855e994721 on /usr/share/nginx/html type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
Each container has mounted its own GlusterFS volume of the form "192.168.16.17X:vol_XXXX", provisioned through the StorageClass.
Let's look at the volume status:
$ gluster volume status
Status of volume: gv0 # the volume created manually earlier
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.16.173:/data/brick1/gv0 49152 0 Y 13532
Brick 192.168.16.174:/data/brick1/gv0 49152 0 Y 20043
Brick 192.168.16.175:/data/brick1/gv0 49152 0 Y 20049
Self-heal Daemon on localhost N/A N/A Y 13557
Self-heal Daemon on 192.168.16.174 N/A N/A Y 20064
Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks
Volume k8s_data is not started
Volume k8s_data2 is not started
Status of volume: vol_20cd37910466e10a65996e2ee826c416
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.16.174:/var/lib/heketi/mounts
/vg_f54f294ff92934f19e2600f57192640e/brick_
4073c8584f0f625837828186a39e4858/brick 49153 0 Y 32125
Brick 192.168.16.173:/var/lib/heketi/mounts
/vg_ca8cc5d5946ec90c6f63fb42da3f6276/brick_
e50d24410bfe65ee9fd786ee232603b6/brick 49153 0 Y 12563
Brick 192.168.16.175:/var/lib/heketi/mounts
/vg_f231dadw1313mdhafahg3123n41640e/brick_
4073c8584f0f625837828186a39e4858/brick 49153 0 Y 32125
Self-heal Daemon on localhost N/A N/A Y 13557
Self-heal Daemon on 192.168.16.174 N/A N/A Y 20064
Task Status of Volume vol_20cd37910466e10a65996e2ee826c416
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: vol_5537441457775eb4e020a3855e994721
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.16.173:/var/lib/heketi/mounts
/vg_ca8cc5d5946ec90c6f63fb42da3f6276/brick_
01fe9669ec5bafbd81acaa1c6fc54964/brick 49155 0 Y 31882
Brick 192.168.16.174:/var/lib/heketi/mounts
/vg_f54f294ff92934f19e2600f57192640e/brick_
586a0f8d32a2797cdaa15673a9b1685c/brick 49155 0 Y 7040
Brick 192.168.16.175:/var/lib/heketi/mounts
/vg_da194ff92934f19e2600f312dxca21/brick_
586a0f8d32a2797cdaa15673a9b1685c/brick 49155 0 Y 8173
Self-heal Daemon on localhost N/A N/A Y 13557
Self-heal Daemon on 192.168.16.174 N/A N/A Y 20064
Task Status of Volume vol_5537441457775eb4e020a3855e994721
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: vol_97ae4dc4d9782df08970e7c5f8516c8c
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.16.173:/var/lib/heketi/mounts
/vg_ca8cc5d5946ec90c6f63fb42da3f6276/brick_
42e11be3dd1c4fbf1316a8e067fdc047/brick 49154 0 Y 9343
Brick 192.168.16.174:/var/lib/heketi/mounts
/vg_f54f294ff92934f19e2600f57192640e/brick_
4ed51fa7facd881b929192d15278b3cd/brick 49154 0 Y 7501
Brick 192.168.16.175:/var/lib/heketi/mounts
/vg_x12da712j8xaj319dka7d1k8dqabw123/brick_
586a0f8d32a2797cdaa15673a9b1685c/brick 49155 0 Y 7040
Self-heal Daemon on localhost N/A N/A Y 13557
Self-heal Daemon on 192.168.16.174 N/A N/A Y 20064
Task Status of Volume vol_97ae4dc4d9782df08970e7c5f8516c8c
------------------------------------------------------------------------------
There are no active volume tasks
From all of the above, we can see that Heketi is essentially a management layer for GlusterFS: it automatically creates the volumes and mount points that applications need.
References:
https://github.com/heketi/heketi
https://www.cnblogs.com/zhangb8042/p/10254983.html
https://www.jianshu.com/p/4ebf960b2075