k8s Disaster Recovery
1.Data Backup
1.Velero
Velero is a cloud-native disaster recovery and migration tool open-sourced by VMware. It is written in Go and performs migrations, backups, and restores safely.
Velero project site: https://velero.io/
Download: https://github.com/vmware-tanzu/velero/releases/tag/v1.11.0
Compared with other backup and migration tools, Velero is more user-friendly and can back up and restore individual namespaces.
Velero supports object storage such as Ceph and OSS, whereas an etcd snapshot is only a local file.
Velero supports schedules for periodic backups (see the example below); etcd snapshots can also be made periodic with scheduled tasks.
Velero supports creating and restoring snapshots of AWS EBS (Elastic Block Store) volumes.
https://www.qloudx.com/velero-for-kubernetes-backup-restore-stateful-workloads-with-aws-ebs-snapshots/ # AWS EBS support
https://github.com/vmware-tanzu/velero-plugin-for-aws # GitHub repository of the AWS plugin
Velero architecture diagram (image omitted)
Backing up to object storage, optionally on multiple hosts, improves cluster disaster recovery; mainstream object stores such as Ceph, OSS, and AOS are supported.
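Since Velero supports scheduled backups, here is a minimal sketch of a daily schedule (this assumes Velero has already been installed into the velero-system namespace, as done later in this section):
# back up the myapp namespace every day at 03:00 and keep each backup for 7 days
velero schedule create myapp-daily --schedule="0 3 * * *" --include-namespaces myapp --ttl 168h --namespace velero-system
# list schedules and the backups they have produced
velero schedule get --namespace velero-system
velero backup get --namespace velero-system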
Basic usage of velero:
root@ubuntuharbor50:/opt/velero/velero-v1.11.0-linux-arm64# ./velero
Velero is a tool for managing disaster recovery, specifically for Kubernetes
cluster resources. It provides a simple, configurable, and operationally robust
way to back up your application state and associated data.
If you're familiar with kubectl, Velero supports a similar model, allowing you to
execute commands such as 'velero get backup' and 'velero create schedule'. The same
operations can also be performed as 'velero backup get' and 'velero schedule create'.
Usage:
velero [command]
Available Commands:
backup Work with backups
backup-location Work with backup storage locations
bug Report a Velero bug
client Velero client related commands
completion Generate completion script
create Create velero resources
debug Generate debug bundle
delete Delete velero resources
describe Describe velero resources
get Get velero resources
help Help about any command
install Install Velero
plugin Work with plugins
repo Work with repositories
restore Work with restores
schedule Work with schedules
snapshot-location Work with snapshot locations
uninstall Uninstall Velero
version Print the velero version and associated image
Flags:
--add_dir_header If true, adds the file directory to the header of the log messages
--alsologtostderr log to standard error as well as files (no effect when -logtostderr=true)
--colorized optionalBool Show colored output in TTY. Overrides 'colorized' value from $HOME/.config/velero/config.json if present. Enabled by default
--features stringArray Comma-separated list of features to enable for this Velero process. Combines with values from $HOME/.config/velero/config.json if present
-h, --help help for velero
--kubeconfig string Path to the kubeconfig file to use to talk to the Kubernetes apiserver. If unset, try the environment variable KUBECONFIG, as well as in-cluster configuration
--kubecontext string The context to use to talk to the Kubernetes apiserver. If unset defaults to whatever your current-context is (kubectl config current-context)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory (no effect when -logtostderr=true)
--log_file string If non-empty, use this log file (no effect when -logtostderr=true)
--log_file_max_size uint Defines the maximum size a log file can grow to (no effect when -logtostderr=true). Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
--logtostderr log to standard error instead of files (default true)
-n, --namespace string The namespace in which Velero should operate (default "velero")
--one_output If true, only write logs to their native severity level (vs also writing to each lower severity level; no effect when -logtostderr=true)
--skip_headers If true, avoid header prefixes in the log messages
--skip_log_headers If true, avoid headers when opening log files (no effect when -logtostderr=true)
--stderrthreshold severity logs at or above this threshold go to stderr when writing to files and stderr (no effect when -logtostderr=true or -alsologtostderr=false) (default 2)
-v, --v Level number for the log level verbosity
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
Use "velero [command] --help" for more information about a command.
root@ubuntuharbor50:/opt/velero/velero-v1.11.0-linux-arm64#
2.Deploying Velero (this system is arm64)
Download Velero, and be sure to test that ./velero actually runs:
wget https://github.com/vmware-tanzu/velero/releases/download/v1.11.0/velero-v1.11.0-linux-arm64.tar.gz
Prepare a server, update the package sources, and install golang-cfssl:
apt update && apt --fix-broken install && apt install golang-cfssl
The MinIO credentials file (it will be used when MinIO is set up later):
root@ubuntuharbor50:/opt/velero/auth# cat velero-auth.txt
[default]
aws_access_key_id = admin
aws_secret_access_key = 12345678
root@ubuntuharbor50:/opt/velero/auth#
The CSR file for awsuser:
root@ubuntuharbor50:/opt/velero/auth# cat awsuser-csr.json
{
"CN": "awsuser",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
Download the certificate-signing tools (note: these are arm64 builds; download the builds that match your own architecture):
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl_1.6.4_linux_arm64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssljson_1.6.4_linux_arm64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl-certinfo_1.6.4_linux_arm64
You end up with a directory like this:
root@ubuntuharbor50:/opt/velero/auth# ls -alh
total 28M
drwxr-xr-x 2 root root 4.0K Jun 6 01:16 .
drwxr-xr-x 4 root root 4.0K Jun 6 01:04 ..
-rw-r--r-- 1 root root 220 Jun 6 01:05 awsuser-csr.json
-rw------- 1 root root 8.9M Jun 6 01:10 cfssl-certinfo_1.6.4_linux_arm64
-rw------- 1 root root 12M Jun 6 01:10 cfssl_1.6.4_linux_arm64
-rw------- 1 root root 7.2M Jun 6 01:09 cfssljson_1.6.4_linux_arm64
-rw-r--r-- 1 root root 69 Jun 6 01:04 velero-auth.txt
root@ubuntuharbor50:/opt/velero/auth#
Rename the binaries and make them executable:
root@ubuntuharbor50:/opt/velero/auth# mv cfssl-certinfo_1.6.4_linux_arm64 cfssl-certinfo
root@ubuntuharbor50:/opt/velero/auth# mv cfssljson_1.6.4_linux_arm64 cfssljson
root@ubuntuharbor50:/opt/velero/auth# mv cfssl_1.6.4_linux_arm64 cfssl
root@ubuntuharbor50:/opt/velero/auth# cp cfssl cfssl-certinfo cfssljson /usr/local/bin/
root@ubuntuharbor50:/opt/velero/auth# chmod a+x /usr/local/bin/cfssl*
root@ubuntuharbor50:/opt/velero/auth# ls -alh /usr/local/bin/cfssl*
-rwx--x--x 1 root root 12M Jun 6 01:20 /usr/local/bin/cfssl
-rwx--x--x 1 root root 8.9M Jun 6 01:20 /usr/local/bin/cfssl-certinfo
-rwx--x--x 1 root root 7.2M Jun 6 01:20 /usr/local/bin/cfssljson
root@ubuntuharbor50:/opt/velero/auth#
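A quick sanity check that the binaries actually run on this architecture (optional, not part of the original steps):
cfssl version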
Issue the certificate.
This cluster is running Kubernetes >= 1.24:
root@ubuntuharbor50:/opt/velero/auth# kubectl get nodes -A
NAME STATUS ROLES AGE VERSION
k8s-master-01-11 Ready,SchedulingDisabled master 34h v1.26.4
k8s-worker-01-23 Ready node 34h v1.26.4
k8s-worker-02-21 Ready node 34h v1.26.4
k8s-worker-03-22 Ready node 34h v1.26.4
k8s-worker-04-32 Ready node 34h v1.26.4
k8s-worker-05-33 Ready node 34h v1.26.4
k8s-worker-06-12 Ready node 34h v1.26.4
k8s-worker-07-13 Ready node 34h v1.26.4
For >= 1.24 the signing command is:
root@ubuntuharbor50:/opt/velero/auth# /usr/local/bin/cfssl gencert -ca=/etc/kubeasz/clusters/k8s-01/ssl/ca.pem -ca-key=/etc/kubeasz/clusters/k8s-01/ssl/ca-key.pem -config=/etc/kubeasz/clusters/k8s-01/ssl/ca-config.json -profile=kubernetes ./awsuser-csr.json
-ca: the cluster CA certificate. My cluster was created with kubeasz, so it sits under this directory; a kubeadm cluster keeps it under /etc/kubernetes.
-ca-key: the CA private key.
-config: the CA signing configuration (ca-config.json).
For <= 1.23 the signing command is:
root@ubuntuharbor50:/opt/velero/auth# /usr/local/bin/cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem -ca-key=/etc/kubernetes/ssl/ca-key.pem -config=/etc/kubeasz/clusters/k8s-cluster1/ssl/ca-config.json -profile=kubernetes ./awsuser-csr.json
Signing in practice:
root@ubuntuharbor50:/opt/velero/auth# /usr/local/bin/cfssl gencert -hostname=ubuntuharbor50 -ca=/etc/kubeasz/clusters/k8s-01/ssl/ca.pem -ca-key=/etc/kubeasz/clusters/k8s-01/ssl/ca-key.pem -config=/etc/kubeasz/clusters/k8s-01/ssl/ca-config.json -profile=kubernetes ./awsuser-csr.json
2023/06/06 02:37:47 [INFO] generate received request
2023/06/06 02:37:47 [INFO] received CSR
2023/06/06 02:37:47 [INFO] generating key: rsa-2048
2023/06/06 02:37:47 [INFO] encoded CSR
2023/06/06 02:37:47 [INFO] signed certificate with serial number 318858172898098129458887003499476556896675561675
{"cert":"-----BEGIN CERTIFICATE-----
MIID8jCCAtqgAwIBAgIUN9oYUGNY+fpW/jYbOR9syAhoAMswDQYJKoZIhvcNAQEL
BQAwZDELMAkGA1UEBhMCQ04xETAPBgNVBAgTCEhhbmdaaG91MQswCQYDVQQHEwJY
UzEMMAoGA1UEChMDazhzMQ8wDQYDVQQLEwZTeXN0ZW0xFjAUBgNVBAMTDWt1YmVy
bmV0ZXMtY2EwIBcNMjMwNjA1MTgzMzAwWhgPMjA3MzA1MjMxODMzMDBaMGIxCzAJ
BgNVBAYTAkNOMRAwDgYDVQQIEwdCZWlKaW5nMRAwDgYDVQQHEwdCZWlKaW5nMQww
CgYDVQQKEwNrOHMxDzANBgNVBAsTBlN5c3RlbTEQMA4GA1UEAxMHYXdzdXNlcjCC
ASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKgFybuModc2oiYeYUcT0GXA
iIkT42/Rz/iWzN5vc41MaYHScxivHFPeo62hHQD0h7+3sT38N76kJL9psV6DkDkE
JhMRy0/QDIFSoFwKRSIngAnq9hX3vDOi+9CYQZDJEXsJxzTmve/BGv6nnZ9ctoY6
TzTOElceTfydwXfq7Oj15qdzJVpwbPD7xbQrwbvbxbI5KwIq+agfBRZz9uUAKTRS
+SvxFGSAE+06LEiGuNV0vrAnVOcZ4XFO2BlWnHXEW+/GJ4O1DsZBoD6Z6bQzx0Ll
mfG7ax9VCnbGs/0F6Ij1xBanvA3d80n6vW207nZBnyJko/5LuVGXYcbW8f0UCdMC
AwEAAaOBmzCBmDAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEG
CCsGAQUFBwMCMAwGA1UdEwEB/wQCMAAwHQYDVR0OBBYEFAjr/QOYk2+BULVwpUI6
hBcPEa59MB8GA1UdIwQYMBaAFCbn7kDkkQpBXSndmSeBxng1msR8MBkGA1UdEQQS
MBCCDnVidW50dWhhcmJvcjUwMA0GCSqGSIb3DQEBCwUAA4IBAQAW0AYMauakOYuH
eFYhHWwrmfA039tanljwT8TJHot4CKAcmGgTGb/yHJj2rteBEJrgrF0507ntqikU
K48q5zcGXVN3gj0MOvSdhiQwekLzRNx6iif7rgjfXRXk0snTqnpnlxKU848FBy1T
5LS8WSk9geNINRM7FNsx3/klMusoNuPeox1m4eWSJb3VMViQ3emVu5QEnwqKDuAB
Cdu/mZON5l8LfpB8bynuDjtO2oZcxZJ8OfsPVnU9rGXXSZHRThFR9B94owARpTNx
Bvum17GlCHoidJoYLwu3ZpYqJk7fcVvR1GXssY4HLls1SWGnEPqqWJ6qaRuluX+k
3qZS/6zl
-----END CERTIFICATE-----
","csr":"-----BEGIN CERTIFICATE REQUEST-----
MIIC0zCCAbsCAQAwYjELMAkGA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAO
BgNVBAcTB0JlaUppbmcxDDAKBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRAw
DgYDVQQDEwdhd3N1c2VyMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA
qAXJu4yh1zaiJh5hRxPQZcCIiRPjb9HP+JbM3m9zjUxpgdJzGK8cU96jraEdAPSH
v7exPfw3vqQkv2mxXoOQOQQmExHLT9AMgVKgXApFIieACer2Ffe8M6L70JhBkMkR
ewnHNOa978Ea/qedn1y2hjpPNM4SVx5N/J3Bd+rs6PXmp3MlWnBs8PvFtCvBu9vF
sjkrAir5qB8FFnP25QApNFL5K/EUZIAT7TosSIa41XS+sCdU5xnhcU7YGVacdcRb
78Yng7UOxkGgPpnptDPHQuWZ8btrH1UKdsaz/QXoiPXEFqe8Dd3zSfq9bbTudkGf
ImSj/ku5UZdhxtbx/RQJ0wIDAQABoCwwKgYJKoZIhvcNAQkOMR0wGzAZBgNVHREE
EjAQgg51YnVudHVoYXJib3I1MDANBgkqhkiG9w0BAQsFAAOCAQEAFXyNLlbynUOO
xy2OaHNHzy9dkNz2xyLwHqnS5O15i5DBPpWnlL0dWi4QjTBNtykkpeQ9H/KCSf4r
X8B7DcV6TBVgo62alGmHv7OFanGsafEDYA2BX063+gYJBjaUlzxJAAaRM4AbiNHn
jIqrrlFMBjpYppYae2nDY00w7bYf54Do889v43s4YoDm3/3/DMAcXYfK1GBlEV/R
5dZ1+B07uMzZ59z1MtbFcOJZ3VZX5Xo+cKBFDz9ifPzP6xye6vPl/iWU6fJypRIp
UGd4ZK84tVow63kXkcPcqaB0h1oZTEL79Y1+J0hJva0HXzKu9ILKyCA8BL76XkpL
wgykwneblw==
-----END CERTIFICATE REQUEST-----
","key":"-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEAqAXJu4yh1zaiJh5hRxPQZcCIiRPjb9HP+JbM3m9zjUxpgdJz
GK8cU96jraEdAPSHv7exPfw3vqQkv2mxXoOQOQQmExHLT9AMgVKgXApFIieACer2
Ffe8M6L70JhBkMkRewnHNOa978Ea/qedn1y2hjpPNM4SVx5N/J3Bd+rs6PXmp3Ml
WnBs8PvFtCvBu9vFsjkrAir5qB8FFnP25QApNFL5K/EUZIAT7TosSIa41XS+sCdU
5xnhcU7YGVacdcRb78Yng7UOxkGgPpnptDPHQuWZ8btrH1UKdsaz/QXoiPXEFqe8
Dd3zSfq9bbTudkGfImSj/ku5UZdhxtbx/RQJ0wIDAQABAoIBAQCfDUuXxFp3dXot
B1kihXkiuQ0GZdNISJ7MPUQV0/7YZNsDT4owdaMlKX5boEXqX5AZRfP8L0M9rfgz
UgPa6kOeFXVNW+zP0qvjx6mRNw+WcznbKZZl2StI3iHtphN60TtA81Klmz91M6Ew
Ks8kygjmK1BLNj9aRI+icFtx/urg4k0Bs/gvl2lUhUQqATkhOV42rsVOaW49EawK
nwPEVc42kb1BlRFkzDbFDspEP9yoMM6w3Y8UFYV42FzDhJZm091+0g93nEOLqi2s
P9H+IBPPQXOamH4HlRPeiZNExpjKZScE5lHEd2YOi7e1B7UtnEvJFQ8pArz/g3pV
iBxRhPVhAoGBAMq8tcgOt/hJUgFcmp8PdqHW7OPnDBaYPUaV4nP4FdTkea2M08eq
yLRNWVADuz23OFIfaBnsbSwNMx6/2tbkP0AMykeP9ZIWOMy7+J0rYShgQr/vAlE8
iP/mukzbvDgol3geZFsJcrx9OOHCCY4LHFSQ6xH0LkqRCaXpC2mZYzWbAoGBANQq
URw24iNH06gJiYvW8g3g8F9QvMXFInwBrJAsmOlCt+3g4Z5B6LY304a/LZr9slbi
2MxmDVHw2PV06TDArTFEcgpIfbB3oFZvJ7K7Tlu1ZW3fDZv8awVsMx5WSgmSIluL
AR7/i3K5os8qCJwoF5gCP6y0ZlCcc/oSySgR8pwpAoGAQGsl96N1oVbqz7P1DYWE
VHhOXTwVAzjsf3kws1io1zSh1RtiT5dcnq3VKy+EV1/YbX+9PD97kPvAuoyLpKxx
zJBD1elQRlL5SVSQ8p/OB15O113Chr2NaoKNv84ySEXdmzVM/gBKjMndQR6+mnu9
TMGfb9z+uILNZgJetfcfJvECgYEArMRUzkvm8+HOehxiFCyRaTnNo2BEiCuSfDaE
xdZ7Ih+BVUT1lICJNrDZH/XX9kk2i0goULGdkSc2FRMBvQB5SBA7aSJEr4mKWDgl
tIaQNV/OW5zyIR54K69DJSYRHiAQuEjGPe7MKD0AVgAdiMOhCthx73nrgyMT0gSw
J2AOFpkCgYBW9hvYSNysqvZgtNN+Xk+t+xV9+COaJHHoxldzxo7CDTIdQwUjOZ86
WMJoLH9zjeXTAqxbEeeLvmg0ylDgqBhcD9CxiAYhCsgvfgS3OyOYqBW1VmLbEFbb
giTmPdkWz4WJ5RTCZKw6qO5dFAom60Bg5nAWO0d1yQ8wW7SXdzEHvg==
-----END RSA PRIVATE KEY-----
"}
Extracted certificate (awsuser.pem):
-----BEGIN CERTIFICATE-----
MIID8jCCAtqgAwIBAgIUN9oYUGNY+fpW/jYbOR9syAhoAMswDQYJKoZIhvcNAQEL
BQAwZDELMAkGA1UEBhMCQ04xETAPBgNVBAgTCEhhbmdaaG91MQswCQYDVQQHEwJY
UzEMMAoGA1UEChMDazhzMQ8wDQYDVQQLEwZTeXN0ZW0xFjAUBgNVBAMTDWt1YmVy
bmV0ZXMtY2EwIBcNMjMwNjA1MTgzMzAwWhgPMjA3MzA1MjMxODMzMDBaMGIxCzAJ
BgNVBAYTAkNOMRAwDgYDVQQIEwdCZWlKaW5nMRAwDgYDVQQHEwdCZWlKaW5nMQww
CgYDVQQKEwNrOHMxDzANBgNVBAsTBlN5c3RlbTEQMA4GA1UEAxMHYXdzdXNlcjCC
ASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKgFybuModc2oiYeYUcT0GXA
iIkT42/Rz/iWzN5vc41MaYHScxivHFPeo62hHQD0h7+3sT38N76kJL9psV6DkDkE
JhMRy0/QDIFSoFwKRSIngAnq9hX3vDOi+9CYQZDJEXsJxzTmve/BGv6nnZ9ctoY6
TzTOElceTfydwXfq7Oj15qdzJVpwbPD7xbQrwbvbxbI5KwIq+agfBRZz9uUAKTRS
+SvxFGSAE+06LEiGuNV0vrAnVOcZ4XFO2BlWnHXEW+/GJ4O1DsZBoD6Z6bQzx0Ll
mfG7ax9VCnbGs/0F6Ij1xBanvA3d80n6vW207nZBnyJko/5LuVGXYcbW8f0UCdMC
AwEAAaOBmzCBmDAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEG
CCsGAQUFBwMCMAwGA1UdEwEB/wQCMAAwHQYDVR0OBBYEFAjr/QOYk2+BULVwpUI6
hBcPEa59MB8GA1UdIwQYMBaAFCbn7kDkkQpBXSndmSeBxng1msR8MBkGA1UdEQQS
MBCCDnVidW50dWhhcmJvcjUwMA0GCSqGSIb3DQEBCwUAA4IBAQAW0AYMauakOYuH
eFYhHWwrmfA039tanljwT8TJHot4CKAcmGgTGb/yHJj2rteBEJrgrF0507ntqikU
K48q5zcGXVN3gj0MOvSdhiQwekLzRNx6iif7rgjfXRXk0snTqnpnlxKU848FBy1T
5LS8WSk9geNINRM7FNsx3/klMusoNuPeox1m4eWSJb3VMViQ3emVu5QEnwqKDuAB
Cdu/mZON5l8LfpB8bynuDjtO2oZcxZJ8OfsPVnU9rGXXSZHRThFR9B94owARpTNx
Bvum17GlCHoidJoYLwu3ZpYqJk7fcVvR1GXssY4HLls1SWGnEPqqWJ6qaRuluX+k
3qZS/6zl
-----END CERTIFICATE-----
Extracted CSR (awsuser.csr):
-----BEGIN CERTIFICATE REQUEST-----
MIIC0zCCAbsCAQAwYjELMAkGA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAO
BgNVBAcTB0JlaUppbmcxDDAKBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRAw
DgYDVQQDEwdhd3N1c2VyMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA
qAXJu4yh1zaiJh5hRxPQZcCIiRPjb9HP+JbM3m9zjUxpgdJzGK8cU96jraEdAPSH
v7exPfw3vqQkv2mxXoOQOQQmExHLT9AMgVKgXApFIieACer2Ffe8M6L70JhBkMkR
ewnHNOa978Ea/qedn1y2hjpPNM4SVx5N/J3Bd+rs6PXmp3MlWnBs8PvFtCvBu9vF
sjkrAir5qB8FFnP25QApNFL5K/EUZIAT7TosSIa41XS+sCdU5xnhcU7YGVacdcRb
78Yng7UOxkGgPpnptDPHQuWZ8btrH1UKdsaz/QXoiPXEFqe8Dd3zSfq9bbTudkGf
ImSj/ku5UZdhxtbx/RQJ0wIDAQABoCwwKgYJKoZIhvcNAQkOMR0wGzAZBgNVHREE
EjAQgg51YnVudHVoYXJib3I1MDANBgkqhkiG9w0BAQsFAAOCAQEAFXyNLlbynUOO
xy2OaHNHzy9dkNz2xyLwHqnS5O15i5DBPpWnlL0dWi4QjTBNtykkpeQ9H/KCSf4r
X8B7DcV6TBVgo62alGmHv7OFanGsafEDYA2BX063+gYJBjaUlzxJAAaRM4AbiNHn
jIqrrlFMBjpYppYae2nDY00w7bYf54Do889v43s4YoDm3/3/DMAcXYfK1GBlEV/R
5dZ1+B07uMzZ59z1MtbFcOJZ3VZX5Xo+cKBFDz9ifPzP6xye6vPl/iWU6fJypRIp
UGd4ZK84tVow63kXkcPcqaB0h1oZTEL79Y1+J0hJva0HXzKu9ILKyCA8BL76XkpL
wgykwneblw==
-----END CERTIFICATE REQUEST-----
Extracted private key (awsuser-key.pem):
-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEAqAXJu4yh1zaiJh5hRxPQZcCIiRPjb9HP+JbM3m9zjUxpgdJz
GK8cU96jraEdAPSHv7exPfw3vqQkv2mxXoOQOQQmExHLT9AMgVKgXApFIieACer2
Ffe8M6L70JhBkMkRewnHNOa978Ea/qedn1y2hjpPNM4SVx5N/J3Bd+rs6PXmp3Ml
WnBs8PvFtCvBu9vFsjkrAir5qB8FFnP25QApNFL5K/EUZIAT7TosSIa41XS+sCdU
5xnhcU7YGVacdcRb78Yng7UOxkGgPpnptDPHQuWZ8btrH1UKdsaz/QXoiPXEFqe8
Dd3zSfq9bbTudkGfImSj/ku5UZdhxtbx/RQJ0wIDAQABAoIBAQCfDUuXxFp3dXot
B1kihXkiuQ0GZdNISJ7MPUQV0/7YZNsDT4owdaMlKX5boEXqX5AZRfP8L0M9rfgz
UgPa6kOeFXVNW+zP0qvjx6mRNw+WcznbKZZl2StI3iHtphN60TtA81Klmz91M6Ew
Ks8kygjmK1BLNj9aRI+icFtx/urg4k0Bs/gvl2lUhUQqATkhOV42rsVOaW49EawK
nwPEVc42kb1BlRFkzDbFDspEP9yoMM6w3Y8UFYV42FzDhJZm091+0g93nEOLqi2s
P9H+IBPPQXOamH4HlRPeiZNExpjKZScE5lHEd2YOi7e1B7UtnEvJFQ8pArz/g3pV
iBxRhPVhAoGBAMq8tcgOt/hJUgFcmp8PdqHW7OPnDBaYPUaV4nP4FdTkea2M08eq
yLRNWVADuz23OFIfaBnsbSwNMx6/2tbkP0AMykeP9ZIWOMy7+J0rYShgQr/vAlE8
iP/mukzbvDgol3geZFsJcrx9OOHCCY4LHFSQ6xH0LkqRCaXpC2mZYzWbAoGBANQq
URw24iNH06gJiYvW8g3g8F9QvMXFInwBrJAsmOlCt+3g4Z5B6LY304a/LZr9slbi
2MxmDVHw2PV06TDArTFEcgpIfbB3oFZvJ7K7Tlu1ZW3fDZv8awVsMx5WSgmSIluL
AR7/i3K5os8qCJwoF5gCP6y0ZlCcc/oSySgR8pwpAoGAQGsl96N1oVbqz7P1DYWE
VHhOXTwVAzjsf3kws1io1zSh1RtiT5dcnq3VKy+EV1/YbX+9PD97kPvAuoyLpKxx
zJBD1elQRlL5SVSQ8p/OB15O113Chr2NaoKNv84ySEXdmzVM/gBKjMndQR6+mnu9
TMGfb9z+uILNZgJetfcfJvECgYEArMRUzkvm8+HOehxiFCyRaTnNo2BEiCuSfDaE
xdZ7Ih+BVUT1lICJNrDZH/XX9kk2i0goULGdkSc2FRMBvQB5SBA7aSJEr4mKWDgl
tIaQNV/OW5zyIR54K69DJSYRHiAQuEjGPe7MKD0AVgAdiMOhCthx73nrgyMT0gSw
J2AOFpkCgYBW9hvYSNysqvZgtNN+Xk+t+xV9+COaJHHoxldzxo7CDTIdQwUjOZ86
WMJoLH9zjeXTAqxbEeeLvmg0ylDgqBhcD9CxiAYhCsgvfgS3OyOYqBW1VmLbEFbb
giTmPdkWz4WJ5RTCZKw6qO5dFAom60Bg5nAWO0d1yQ8wW7SXdzEHvg==
-----END RSA PRIVATE KEY-----
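Instead of splitting the JSON output by hand as above, the same cfssl command can be piped into cfssljson, which writes the three files directly (a sketch using the same paths as above):
/usr/local/bin/cfssl gencert -hostname=ubuntuharbor50 -ca=/etc/kubeasz/clusters/k8s-01/ssl/ca.pem -ca-key=/etc/kubeasz/clusters/k8s-01/ssl/ca-key.pem -config=/etc/kubeasz/clusters/k8s-01/ssl/ca-config.json -profile=kubernetes ./awsuser-csr.json | cfssljson -bare awsuser
# this produces awsuser.pem (certificate), awsuser-key.pem (private key), and awsuser.csr (CSR)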
Confirm the files are in place:
root@ubuntuharbor50:/opt/velero/auth# ls -alh
total 28M
drwxr-xr-x 2 root root 4.0K Jun 6 02:50 .
drwxr-xr-x 4 root root 4.0K Jun 6 01:04 ..
-rw-r--r-- 1 root root 220 Jun 6 01:05 awsuser-csr.json
-rw-r--r-- 1 root root 1.7K Jun 6 02:50 awsuser-key.pem
-rw-r--r-- 1 root root 1.1K Jun 6 02:50 awsuser.csr
-rw-r--r-- 1 root root 1.4K Jun 6 02:50 awsuser.pem
-rw------- 1 root root 12M Jun 6 01:10 cfssl
-rw------- 1 root root 8.9M Jun 6 01:10 cfssl-certinfo
-rw------- 1 root root 7.2M Jun 6 01:09 cfssljson
-rw-r--r-- 1 root root 69 Jun 6 01:04 velero-auth.txt
root@ubuntuharbor50:/opt/velero/auth#
Copy the files.
I copied them to two nodes: the local deploy node and the master node.
10.0.0.11 is my master node.
root@ubuntuharbor50:/opt/velero/auth# cp awsuser-key.pem /etc/kubeasz/clusters/k8s-01/ssl/
root@ubuntuharbor50:/opt/velero/auth# scp awsuser-key.pem 10.0.0.11:/etc/kubernetes/ssl/
awsuser-key.pem 100% 1679 5.0MB/s 00:00
root@ubuntuharbor50:/opt/velero/auth# cp awsuser.pem /etc/kubeasz/clusters/k8s-01/ssl/
root@ubuntuharbor50:/opt/velero/auth# scp awsuser.pem 10.0.0.11:/etc/kubernetes/ssl/
Verify on the kubeasz (deploy) node:
root@ubuntuharbor50:/opt/velero/auth# ls /etc/kubeasz/clusters/k8s-01/ssl/awsuser*
/etc/kubeasz/clusters/k8s-01/ssl/awsuser-key.pem /etc/kubeasz/clusters/k8s-01/ssl/awsuser.pem
Verify on the master node:
root@k8s-master-01-11:/data/k8s_yaml/app# ls /etc/kubernetes/ssl/awsuser*
/etc/kubernetes/ssl/awsuser-key.pem /etc/kubernetes/ssl/awsuser.pem
Generate the cluster kubeconfig file.
export KUBE_APISERVER="https://10.0.0.11:6443" # declare the environment variable
root@k8s-master-01-11:/opt/velero# kubectl config set-cluster kubernetes
> --certificate-authority=/etc/kubernetes/ssl/ca.pem
> --embed-certs=true
> --server=${KUBE_APISERVER}
> --kubeconfig=./awsuser.kubeconfig
Cluster "kubernetes" set.
root@k8s-master-01-11:/opt/velero#
Generate the client certificate credentials:
root@k8s-master-01-11:/opt/velero# kubectl config set-credentials awsuser
> --client-certificate=/etc/kubernetes/ssl/awsuser.pem
> --client-key=/etc/kubernetes/ssl/awsuser-key.pem
> --embed-certs=true
> --kubeconfig=./awsuser.kubeconfig
User "awsuser" set.
root@k8s-master-01-11:/opt/velero# ls
awsuser.kubeconfig
Set the context parameters:
root@k8s-master-01-11:/opt/velero# kubectl config set-context kubernetes
> --cluster=kubernetes
> --user=awsuser
> --namespace=velero-system
> --kubeconfig=./awsuser.kubeconfig
Context "kubernetes" created.
Set the default context:
root@k8s-master-01-11:/opt/velero# kubectl config use-context kubernetes --kubeconfig=awsuser.kubeconfig
Switched to context "kubernetes".
Create the user's cluster role binding:
root@k8s-master-01-11:/opt/velero# kubectl create clusterrolebinding awsuser --clusterrole=cluster-admin --user=awsuser
clusterrolebinding.rbac.authorization.k8s.io/awsuser created
Create the namespace:
root@k8s-master-01-11:/opt/velero# kubectl create ns velero-system
namespace/velero-system created
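Before installing Velero, the new kubeconfig can be sanity-checked (a quick verification step, not in the original run):
# confirm that awsuser can reach the API server with the new kubeconfig
kubectl --kubeconfig=./awsuser.kubeconfig get ns
kubectl --kubeconfig=./awsuser.kubeconfig -n velero-system get sa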
Pull the images:
root@ubuntuharbor50:/opt/velero# docker pull velero/velero:release-1.10-dev
release-1.10-dev: Pulling from velero/velero
dde16751a2cc: Pull complete
fe5ca62666f0: Pull complete
fff4e558ad3a: Pull complete
fcb6f6d2c998: Pull complete
e8c73c638ae9: Pull complete
1e3d9b7d1452: Pull complete
4aa0ea1413d3: Pull complete
7c881f9ab25e: Pull complete
5627a970d25e: Pull complete
1e0faa398149: Pull complete
de478f74f720: Pull complete
950fde9ccf45: Pull complete
Digest: sha256:078af8052876fd369225ec7bfffe411d34a93ddb265277094d58103f03ec7c8a
Status: Downloaded newer image for velero/velero:release-1.10-dev
docker.io/velero/velero:release-1.10-dev
root@ubuntuharbor50:/opt/velero# docker pull velero/velero-plugin-for-aws:latest
latest: Pulling from velero/velero-plugin-for-aws
f8638652bda4: Pull complete
81f64d343e1d: Pull complete
Digest: sha256:91b0be561037f3b6420ace15536996afed5bd0fba537f7ab8e171587977cdc83
Status: Downloaded newer image for velero/velero-plugin-for-aws:latest
docker.io/velero/velero-plugin-for-aws:latest
root@ubuntuharbor50:/opt/velero#
Run the installation:
root@k8s-master-01-11:/opt/velero# velero --kubeconfig ./awsuser.kubeconfig \
install \
--provider aws \
--plugins www.ghostxin.online/application/velero-plugin-for-aws:latest \
--bucket velerodata \
--secret-file ./velero-auth.txt \
--use-volume-snapshots=false \
--namespace velero-system \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://10.0.0.11:9000
# wait for the pod to finish initializing
root@k8s-master-01-11:/opt/velero# velero --kubeconfig ./awsuser.kubeconfig install --provider aws --plugins www.ghostxin.online/application/velero-plugin-for-aws:latest --bucket velerodata --secret-file ./velero-auth.txt --use-volume-snapshots=false --namespace velero-system --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://10.0.0.11:9000
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource client
CustomResourceDefinition/backuprepositories.velero.io: created
CustomResourceDefinition/backups.velero.io: attempting to create resource
CustomResourceDefinition/backups.velero.io: attempting to create resource client
CustomResourceDefinition/backups.velero.io: created
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource client
CustomResourceDefinition/backupstoragelocations.velero.io: created
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource client
CustomResourceDefinition/deletebackuprequests.velero.io: created
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource client
CustomResourceDefinition/downloadrequests.velero.io: created
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource client
CustomResourceDefinition/podvolumebackups.velero.io: created
CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource
CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource client
CustomResourceDefinition/podvolumerestores.velero.io: created
CustomResourceDefinition/restores.velero.io: attempting to create resource
CustomResourceDefinition/restores.velero.io: attempting to create resource client
CustomResourceDefinition/restores.velero.io: created
CustomResourceDefinition/schedules.velero.io: attempting to create resource
CustomResourceDefinition/schedules.velero.io: attempting to create resource client
CustomResourceDefinition/schedules.velero.io: created
CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource
CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource client
CustomResourceDefinition/serverstatusrequests.velero.io: created
CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource
CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource client
CustomResourceDefinition/volumesnapshotlocations.velero.io: created
Waiting for resources to be ready in cluster...
Namespace/velero-system: attempting to create resource
Namespace/velero-system: attempting to create resource client
Namespace/velero-system: already exists, proceeding
Namespace/velero-system: created
ClusterRoleBinding/velero-velero-system: attempting to create resource
ClusterRoleBinding/velero-velero-system: attempting to create resource client
ClusterRoleBinding/velero-velero-system: created
ServiceAccount/velero: attempting to create resource
ServiceAccount/velero: attempting to create resource client
ServiceAccount/velero: created
Secret/cloud-credentials: attempting to create resource
Secret/cloud-credentials: attempting to create resource client
Secret/cloud-credentials: created
BackupStorageLocation/default: attempting to create resource
BackupStorageLocation/default: attempting to create resource client
BackupStorageLocation/default: created
Deployment/velero: attempting to create resource
Deployment/velero: attempting to create resource client
Deployment/velero: created
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero-system' to view the status.
# Run the installation:
velero --kubeconfig ./awsuser.kubeconfig \
install \
--provider aws \
--plugins www.ghostxin.online/application/velero-plugin-for-aws:latest \
--bucket velerodata \
--secret-file ./velero-auth.txt \
--use-volume-snapshots=false \
--namespace velero-system \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://10.0.0.50:9000 # MinIO address + port
# Verify the installation:
root@k8s-master1:/data/velero# kubectl get pod -n velero-system
root@k8s-master1:/data/velero# kubectl get pod -n velero-system -o wide
root@k8s-node2:~# nerdctl pull velero/velero-plugin-for-aws:v1.5.5
root@k8s-node2:~# nerdctl pull velero/velero:v1.11.0
root@k8s-master1:/data/velero# kubectl get pod -n velero-system
NAME READY STATUS RESTARTS AGE
velero-98bc8c975-q6c5d 1/1 Running 0 2m36s
Watch the pods; creation is complete:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (3h5m ago) 36h
kube-system calico-node-4nxsw 1/1 Running 0 36h
kube-system calico-node-4v5kq 1/1 Running 1 (12h ago) 36h
kube-system calico-node-4zd25 1/1 Running 1 (12h ago) 36h
kube-system calico-node-7dlcz 1/1 Running 1 (12h ago) 36h
kube-system calico-node-cwgcl 1/1 Running 2 (11h ago) 36h
kube-system calico-node-kc866 1/1 Running 1 (12h ago) 36h
kube-system calico-node-kksvs 1/1 Running 1 (12h ago) 36h
kube-system calico-node-v6chq 1/1 Running 1 (12h ago) 36h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 11h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 11h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 149m
velero-system velero-f9b9bc564-qgpj5 1/1 Running 0 2m27s
3.Backing Up a Namespace
Command template:
velero backup create myserver-ns-backup-${DATE} --include-namespaces myserver --kubeconfig=./awsuser.kubeconfig --namespace velero-system
./velero backup create myapp-ns-backup-`date +%Y-%m-%d-%T` --include-namespaces myapp --kubeconfig=/root/.kube/config --namespace ^C
In practice
First confirm that the namespace to be backed up actually contains the app and that Velero itself is running:
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (3h8m ago) 36h
kube-system calico-node-4nxsw 1/1 Running 0 36h
kube-system calico-node-4v5kq 1/1 Running 1 (12h ago) 36h
kube-system calico-node-4zd25 1/1 Running 1 (12h ago) 36h
kube-system calico-node-7dlcz 1/1 Running 1 (12h ago) 36h
kube-system calico-node-cwgcl 1/1 Running 2 (11h ago) 36h
kube-system calico-node-kc866 1/1 Running 1 (12h ago) 36h
kube-system calico-node-kksvs 1/1 Running 1 (12h ago) 36h
kube-system calico-node-v6chq 1/1 Running 1 (12h ago) 36h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 11h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 11h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 152m
velero-system velero-f9b9bc564-qgpj5 1/1 Running 0 5m16s
Backup command:
velero backup create myapp-ns-backup-`date +%Y-%m-%d-%T` --include-namespaces myapp --kubeconfig=/root/.kube/config --namespace velero-system
Backup response:
root@k8s-master-01-11:/opt/velero# velero backup create myapp-ns-backup-`date +%Y-%m-%d` --include-namespaces myapp --kubeconfig=/root/.kube/config --namespace velero-system
Backup request "myapp-ns-backup-2023-06-06" submitted successfully.
Run `velero backup describe myapp-ns-backup-2023-06-06` or `velero backup logs myapp-ns-backup-2023-06-06` for more details.
Verify the backup:
root@k8s-master-01-11:/opt/velero# velero backup create myapp-ns-backup-`date +%Y-%m-%d` --include-namespaces myapp --kubeconfig=/root/.kube/config --namespace velero-system
An error occurred: backups.velero.io "myapp-ns-backup-2023-06-06" already exists
root@k8s-master-01-11:/opt/velero# velero backup describe myapp-ns-backup-2023-06-06 --kubeconfig=./awsuser.kubeconfig --namespace velero-system
4.Minio
Download MinIO
Docker Hub: https://hub.docker.com/r/bitnami/minio/tags
Pull it with Docker and push it to the local image registry:
root@ubuntuharbor50:/opt/velero/auth# docker pull bitnami/minio:latest
latest: Pulling from bitnami/minio
503743688227: Pull complete
Digest: sha256:090a11b3e1bcd3ea118d0f1406bb60c502c40bce68a1bb0933de04168ab87fde
Status: Downloaded newer image for bitnami/minio:latest
docker.io/bitnami/minio:latest
root@ubuntuharbor50:/opt/velero/auth# docker tag bitnami/minio:latest www.ghostxin.online/application/bitnami/minio:latest
root@ubuntuharbor50:/opt/velero/auth# docker push www.ghostxin.online/application/bitnami/minio:latest
The push refers to repository [www.ghostxin.online/application/bitnami/minio]
bc00acc355f8: Pushed
latest: digest: sha256:02d6bc6ca13ac66cce51e411d11b4cb8e3a13dc279d0640a12fae1d213198ccc size: 529
Start MinIO:
root@k8s-deploy:/usr/local/src# docker run --name minio \
-p 9000:9000 \
-p 9001:9001 \
-d --restart=always \
-e "MINIO_ROOT_USER=admin" \
-e "MINIO_ROOT_PASSWORD=12345678" \
-v /data/minio/data:/data www.ghostxin.online/application/bitnami/minio:latest
Log in and test with user admin, password 12345678.
Once you can see data being written, everything is working.
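The velerodata bucket referenced by velero install must exist in MinIO before backups can be written. A sketch of creating it with the MinIO client (assuming a recent mc is installed, using the endpoint and credentials above):
# register the MinIO endpoint and create the bucket Velero writes to
mc alias set velero-minio http://10.0.0.50:9000 admin 12345678
mc mb velero-minio/velerodata
mc ls velero-minio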
Previously backed-up data (screenshot omitted).
Verify the backup: when it was taken the Tomcat pod was running normally; now delete the Tomcat pod and restore it from the backup.
Check the pods:
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h7m ago) 37h
kube-system calico-node-4nxsw 1/1 Running 0 37h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 37h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 37h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 37h
kube-system calico-node-cwgcl 1/1 Running 2 (12h ago) 37h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 37h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 37h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 37h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 12h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 12h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 3h31m
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 4m39s
Delete the Tomcat workload; once it has been deleted, start the restore:
root@k8s-master-01-11:/opt/velero# cd /data/k8s_yaml/app/
root@k8s-master-01-11:/data/k8s_yaml/app# kubectl delete -f tomcat.yaml
deployment.apps "myapp-tomcat-app1-deployment" deleted
service "myapp-tomcat-app1-service" deleted
root@k8s-master-01-11:/data/k8s_yaml/app# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h8m ago) 37h
kube-system calico-node-4nxsw 1/1 Running 0 37h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 37h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 37h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 37h
kube-system calico-node-cwgcl 1/1 Running 2 (13h ago) 37h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 37h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 37h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 37h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 12h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 12h
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 6m12s
root@k8s-master-01-11:/data/k8s_yaml/app#
Restore command template:
velero restore create --from-backup ${BACKUP_NAME} --wait --kubeconfig=./awsuser.kubeconfig --namespace velero-system
Actual restore command:
root@k8s-master-01-11:/opt/velero# velero restore create --from-backup myapp-ns-backup-2023-06-06 --include-namespaces myapp --kubeconfig=./awsuser.kubeconfig --namespace velero-system
Restore request "myapp-ns-backup-2023-06-06-20230606042702" submitted successfully.
Run `velero restore describe myapp-ns-backup-2023-06-06-20230606042702` or `velero restore logs myapp-ns-backup-2023-06-06-20230606042702` for more details.
Check the pods; the restore is complete:
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h12m ago) 37h
kube-system calico-node-4nxsw 1/1 Running 0 37h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 37h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 37h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 37h
kube-system calico-node-cwgcl 1/1 Running 2 (13h ago) 37h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 37h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 37h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 37h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 13h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 13h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 4s
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 9m58s
Restoring a specific pod
Create a demo pod:
kubectl run net-test1 --image=www.ghostxin.online/application/ubuntu:V1 sleep 10000000000 -n myapp
The pod has been created:
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h26m ago) 38h
kube-system calico-node-4nxsw 1/1 Running 0 38h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 38h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 38h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 38h
kube-system calico-node-cwgcl 1/1 Running 2 (13h ago) 38h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 38h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 38h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 38h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 13h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 13h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 13m
myapp net-test1 1/1 Running 0 7s
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 23m
Create a backup that includes the pod:
velero backup create pod-backup-`date +%Y-%m-%d` --include-cluster-resources=true --ordered-resources 'pods=myapp/net-test1,defafut/net-test1' --namespace velero-system --include-namespaces=myapp,defafut
Output of the backup command:
root@k8s-master-01-11:/opt/velero# velero backup create pod-backup-`date +%Y-%m-%d` --include-cluster-resources=true --ordered-resources 'pods=myapp/net-test1,defafut/net-test1' --namespace velero-system --include-namespaces=myapp,defafut
Backup request "pod-backup-2023-06-06" submitted successfully.
Run `velero backup describe pod-backup-2023-06-06` or `velero backup logs pod-backup-2023-06-06` for more details.
Check the MinIO console: the backup has been created.
Delete all pods in the namespace, then restore the backup to see whether net-test1 comes back.
The myapp namespace currently has 2 pods:
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h31m ago) 38h
kube-system calico-node-4nxsw 1/1 Running 0 38h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 38h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 38h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 38h
kube-system calico-node-cwgcl 1/1 Running 2 (13h ago) 38h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 38h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 38h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 38h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 13h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 13h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 19m
myapp net-test1 1/1 Running 0 5m27s
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 29m
Delete everything, then start the restore.
Verify the deletion:
root@k8s-master-01-11:/data/k8s_yaml/app# kubectl delete -f tomcat.yaml
deployment.apps "myapp-tomcat-app1-deployment" deleted
service "myapp-tomcat-app1-service" deleted
root@k8s-master-01-11:/data/k8s_yaml/app# cd -
/opt/velero
root@k8s-master-01-11:/opt/velero# kubectl delete pod -n myapp net-test1
pod "net-test1" deleted
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h33m ago) 38h
kube-system calico-node-4nxsw 1/1 Running 0 38h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 38h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 38h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 38h
kube-system calico-node-cwgcl 1/1 Running 2 (13h ago) 38h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 38h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 38h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 38h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 13h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 13h
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 30m
root@k8s-master-01-11:/opt/velero#
Verify the restore; note that it brings back every pod in the namespace:
root@k8s-master-01-11:/opt/velero# velero restore create --from-backup pod-backup-2023-06-06 --wait --kubeconfig=./awsuser.kubeconfig --namespace velero-system
Restore request "pod-backup-2023-06-06-20230606045100" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.
.................
Restore completed with status: Completed. You may check for more information using the commands `velero restore describe pod-backup-2023-06-06-20230606045100` and `velero restore logs pod-backup-2023-06-06-20230606045100`.
# verify the pods
root@k8s-master-01-11:/opt/velero# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (4h36m ago) 38h
kube-system calico-node-4nxsw 1/1 Running 0 38h
kube-system calico-node-4v5kq 1/1 Running 1 (13h ago) 38h
kube-system calico-node-4zd25 1/1 Running 1 (13h ago) 38h
kube-system calico-node-7dlcz 1/1 Running 1 (13h ago) 38h
kube-system calico-node-cwgcl 1/1 Running 2 (13h ago) 38h
kube-system calico-node-kc866 1/1 Running 1 (13h ago) 38h
kube-system calico-node-kksvs 1/1 Running 1 (13h ago) 38h
kube-system calico-node-v6chq 1/1 Running 1 (13h ago) 38h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 13h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 13h
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 17s
myapp net-test1 1/1 Running 0 17s
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 34m
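The backup and restore objects created so far can be listed and inspected at any time with the standard Velero commands:
velero backup get --kubeconfig=./awsuser.kubeconfig --namespace velero-system
velero restore get --kubeconfig=./awsuser.kubeconfig --namespace velero-system
velero backup describe pod-backup-2023-06-06 --kubeconfig=./awsuser.kubeconfig --namespace velero-system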
Script (back up every namespace)
#!/bin/bash
#********************************************************************
#Author: liujinxin
#QQ: 942207953
#Date: 2023-06-06
#FileName: backup_kubernetes_ns.sh
#E-MAIL: 942207953@qq.com
#Description: The test script
#Copyright (C): 2023 All rights reserved
#********************************************************************
NS_NAME=`kubectl get ns | awk '{if (NR>2) print $1}'`
DATE=`date +%y-%m-%d-%H-%M-%S`
if [ "$(pwd)" = "/opt/velero" ] ;then
for i in ${NS_NAME} ;do
velero backup create ${i}-ns-buckup-${DATE} --include-cluster-resources=true --include-namespaces ${i} --kubeconfig=/root/.kube/config --namespace velero-system
done
else
cd /opt/velero
for i in ${NS_NAME} ;do
velero backup create ${i}-ns-buckup-${DATE} --include-cluster-resources=true --include-namespaces ${i} --kubeconfig=/root/.kube/config --namespace velero-system
done
fi
echo "Task completed,Please log in to mini to view!"
Execution result:
root@k8s-master-01-11:/opt/velero# bash -x backup_kubernetes_ns.sh
++ awk '{if (NR>2) print $1}'
++ kubectl get ns
+ NS_NAME='kube-node-lease
kube-public
kube-system
myapp
velero-system'
++ date +%y-%m-%d-%H-%M-%S
+ DATE=23-06-06-05-09-07
+ '[' pwd -eq /opt/velero ']'
backup_kubernetes_ns.sh: line 13: [: pwd: integer expression expected
+ cd /opt/velero
+ for i in ${NS_NAME}
+ velero backup create kube-node-lease-ns-buckup-23-06-06-05-09-07 --include-cluster-resources=true --include-namespaces kube-node-lease --kubeconfig=/root/.kube/config --namespace velero-system
Backup request "kube-node-lease-ns-buckup-23-06-06-05-09-07" submitted successfully.
Run `velero backup describe kube-node-lease-ns-buckup-23-06-06-05-09-07` or `velero backup logs kube-node-lease-ns-buckup-23-06-06-05-09-07` for more details.
+ for i in ${NS_NAME}
+ velero backup create kube-public-ns-buckup-23-06-06-05-09-07 --include-cluster-resources=true --include-namespaces kube-public --kubeconfig=/root/.kube/config --namespace velero-system
Backup request "kube-public-ns-buckup-23-06-06-05-09-07" submitted successfully.
Run `velero backup describe kube-public-ns-buckup-23-06-06-05-09-07` or `velero backup logs kube-public-ns-buckup-23-06-06-05-09-07` for more details.
+ for i in ${NS_NAME}
+ velero backup create kube-system-ns-buckup-23-06-06-05-09-07 --include-cluster-resources=true --include-namespaces kube-system --kubeconfig=/root/.kube/config --namespace velero-system
Backup request "kube-system-ns-buckup-23-06-06-05-09-07" submitted successfully.
Run `velero backup describe kube-system-ns-buckup-23-06-06-05-09-07` or `velero backup logs kube-system-ns-buckup-23-06-06-05-09-07` for more details.
+ for i in ${NS_NAME}
+ velero backup create myapp-ns-buckup-23-06-06-05-09-07 --include-cluster-resources=true --include-namespaces myapp --kubeconfig=/root/.kube/config --namespace velero-system
Backup request "myapp-ns-buckup-23-06-06-05-09-07" submitted successfully.
Run `velero backup describe myapp-ns-buckup-23-06-06-05-09-07` or `velero backup logs myapp-ns-buckup-23-06-06-05-09-07` for more details.
+ for i in ${NS_NAME}
+ velero backup create velero-system-ns-buckup-23-06-06-05-09-07 --include-cluster-resources=true --include-namespaces velero-system --kubeconfig=/root/.kube/config --namespace velero-system
Backup request "velero-system-ns-buckup-23-06-06-05-09-07" submitted successfully.
Run `velero backup describe velero-system-ns-buckup-23-06-06-05-09-07` or `velero backup logs velero-system-ns-buckup-23-06-06-05-09-07` for more details.
+ echo 'Task completed,Please log in to mini to view!'
Task completed,Please log in to mini to view!
MinIO screenshot (omitted): all namespaces were backed up successfully.
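To make this periodic, the script can simply be run from cron (a sketch; the schedule and log path are examples, not from the original article):
# crontab -e on the master node: run the namespace backup script every night at 02:00
0 2 * * * /bin/bash /opt/velero/backup_kubernetes_ns.sh >> /var/log/velero-ns-backup.log 2>&1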
2.HPA (Horizontal Pod Autoscaling)
1.Introduction to HPA
There are three kinds of dynamic scaling:
(1) Horizontal scaling: adjust the number of pods, i.e. add or remove pod replicas (HPA)
(2) Vertical scaling: increase the resources allocated to a pod (VPA)
(3) Cluster scaling: scale the cluster's worker nodes (CA, Cluster Autoscaler)
Pod replicas are scaled elastically according to resource usage; the deciding metrics are CPU and memory.
During business peaks the workload scales out automatically and scales back in during off-peak hours; on public clouds node-level scaling is also supported.
The HPA (Horizontal Pod Autoscaler) controller automatically adjusts the number of pods running in the k8s cluster based on a configured threshold and the pods' current resource utilization (automatic horizontal scaling).
kube-controller-manager flags relevant to the HPA:
--horizontal-pod-autoscaler-sync-period: how often the controller queries metrics and evaluates utilization, default 15s
--horizontal-pod-autoscaler-downscale-stabilization: the stabilization window before scaling down, default 5 minutes
--horizontal-pod-autoscaler-initial-readiness-delay: the window after pod start during which readiness is treated as not yet settled, default 30s
--horizontal-pod-autoscaler-tolerance: the tolerated deviation between the current and desired metric ratio before the HPA acts, default 0.1
Example of the tolerance:
Suppose the HPA CPU target is 50%. If current CPU usage reaches 80%, the ratio is 80% / 50% = 1.6; since that deviates from 1 by more than 0.1, the HPA scales out. Conversely, at 40% usage the ratio is 40% / 50% = 0.8, which does not trigger a scale-out.
Pod-count formula: TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)
ceil rounds up to a whole number of pods. For example, 2 pods averaging 80% utilization with a 50% target gives ceil(2 * 80 / 50) = ceil(3.2) = 4 pods.
Kubernetes itself is only a container orchestrator and does not collect metrics, so metrics-server must be deployed for the HPA to get its data.
GitHub:
https://github.com/kubernetes-sigs/metrics-server
2.Manual Scaling
Current replica count:
root@k8s-master-01-11:/opt/velero# kubectl get pod -n myapp
NAME READY STATUS RESTARTS AGE
myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 54m
root@k8s-master-01-11:/opt/velero#
Look up the relevant commands:
root@k8s-master-01-11:/opt/velero# kubectl --help | grep sca
scale Set a new size for a deployment, replica set, or replication controller
autoscale Auto-scale a deployment, replica set, stateful set, or replication controller
root@k8s-master-01-11:/opt/velero#
Command usage:
root@k8s-master-01-11:/opt/velero# kubectl scale -h
Set a new size for a deployment, replica set, replication controller, or stateful set.
Scale also allows users to specify one or more preconditions for the scale action.
If --current-replicas or --resource-version is specified, it is validated before the scale is attempted, and it is
guaranteed that the precondition holds true when the scale is sent to the server.
Examples:
# Scale a replica set named 'foo' to 3
kubectl scale --replicas=3 rs/foo
# Scale a resource identified by type and name specified in "foo.yaml" to 3
kubectl scale --replicas=3 -f foo.yaml
# If the deployment named mysql's current size is 2, scale mysql to 3
kubectl scale --current-replicas=2 --replicas=3 deployment/mysql
# Scale multiple replication controllers
kubectl scale --replicas=5 rc/foo rc/bar rc/baz
# Scale stateful set named 'web' to 3
kubectl scale --replicas=3 statefulset/web
Scale the deployment out:
root@k8s-master-01-11:/opt/velero# kubectl scale --replicas=3 deployment/myapp-tomcat-app1-deployment -n myapp
deployment.apps/myapp-tomcat-app1-deployment scaled
root@k8s-master-01-11:/opt/velero#
Verify:
root@k8s-master-01-11:/opt/velero# kubectl get pod -n myapp
NAME READY STATUS RESTARTS AGE
myapp-tomcat-app1-deployment-6d9d8885db-57dzl 1/1 Running 0 <invalid>
myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1/1 Running 0 59m
myapp-tomcat-app1-deployment-6d9d8885db-scj4t 1/1 Running 0 <invalid>
root@k8s-master-01-11:/opt/velero#
Manual scale-out succeeded.
3.Resource-Based Scaling
1.Deploying metrics-server
Download metrics-server: https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/high-availability-1.21+.yaml
Push the image to the local Harbor registry:
root@ubuntuharbor50:/# docker pull bitnami/metrics-server:0.6.3
0.6.3: Pulling from bitnami/metrics-server
44ed08c15e18: Pull complete
Digest: sha256:74d5f5b75131fe83184df19df6790fdce02dfb72b6393eaa9b4c0d16b960f760
Status: Downloaded newer image for bitnami/metrics-server:0.6.3
docker.io/bitnami/metrics-server:0.6.3
root@ubuntuharbor50:/# docker tag bitnami/metrics-server:0.6.3 www.ghostxin.online/application/metrics-server:0.6.3
root@ubuntuharbor50:/# docker push www.ghostxin.online/application/metrics-server:0.6.3
The push refers to repository [www.ghostxin.online/application/metrics-server]
06c7742d35a7: Pushed
0.6.3: digest: sha256:4eed46cd490774867d39cfe67ec4501a1130a86aaaf2a750442a1483c0106123 size: 529
root@ubuntuharbor50:/#
Modified YAML (the image is changed to the local registry):
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rbac.authorization.k8s.io/aggregate-to-view: "true"
name: system:aggregated-metrics-reader
rules:
- apiGroups:
- metrics.k8s.io
resources:
- pods
- nodes
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
name: system:metrics-server
rules:
- apiGroups:
- ""
resources:
- nodes/metrics
verbs:
- get
- apiGroups:
- ""
resources:
- pods
- nodes
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server:system:auth-delegator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: system:metrics-server
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
ports:
- name: https
port: 443
protocol: TCP
targetPort: https
selector:
k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
replicas: 2
selector:
matchLabels:
k8s-app: metrics-server
strategy:
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: metrics-server
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
k8s-app: metrics-server
namespaces:
- kube-system
topologyKey: kubernetes.io/hostname
containers:
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
image: www.ghostxin.online/application/metrics-server:0.6.3
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /livez
port: https
scheme: HTTPS
periodSeconds: 10
name: metrics-server
ports:
- containerPort: 4443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /readyz
port: https
scheme: HTTPS
initialDelaySeconds: 20
periodSeconds: 10
resources:
requests:
cpu: 100m
memory: 200Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- mountPath: /tmp
name: tmp-dir
nodeSelector:
kubernetes.io/os: linux
priorityClassName: system-cluster-critical
serviceAccountName: metrics-server
volumes:
- emptyDir: {}
name: tmp-dir
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: metrics-server
namespace: kube-system
spec:
minAvailable: 1
selector:
matchLabels:
k8s-app: metrics-server
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
labels:
k8s-app: metrics-server
name: v1beta1.metrics.k8s.io
spec:
group: metrics.k8s.io
groupPriorityMinimum: 100
insecureSkipTLSVerify: true
service:
name: metrics-server
namespace: kube-system
version: v1beta1
versionPriority: 100
Before metrics-server is deployed, metrics cannot be queried:
root@k8s-master-01-11:/data/k8s_yaml/app/metric# kubectl top nodes
error: Metrics API not available
root@k8s-master-01-11:/data/k8s_yaml/app/metric# kubectl top pods
error: Metrics API not available
Start the deployment:
root@k8s-master-01-11:/data/k8s_yaml/app/metric# kubectl apply -f metric-server.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
poddisruptionbudget.policy/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
Check the deployed pods:
root@k8s-master-01-11:/data/k8s_yaml/app/metric# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (6h29m ago) 40h
calico-node-4nxsw 1/1 Running 0 40h
calico-node-4v5kq 1/1 Running 1 (15h ago) 40h
calico-node-4zd25 1/1 Running 1 (15h ago) 40h
calico-node-7dlcz 1/1 Running 1 (15h ago) 40h
calico-node-cwgcl 1/1 Running 2 (15h ago) 40h
calico-node-kc866 1/1 Running 1 (15h ago) 40h
calico-node-kksvs 1/1 Running 1 (15h ago) 40h
calico-node-v6chq 1/1 Running 1 (15h ago) 40h
coredns-566564f9fd-j4gsv 1/1 Running 0 15h
coredns-566564f9fd-jv4wz 1/1 Running 0 15h
metrics-server-74446748bf-m4dh8 1/1 Running 0 2m11s
metrics-server-74446748bf-xmrbn 1/1 Running 0 2m11s
kubectl top pod output:
root@k8s-master-01-11:/data/k8s_yaml/app/metric# kubectl top pod -A
NAMESPACE NAME CPU(cores) MEMORY(bytes)
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 2m 14Mi
kube-system calico-node-4v5kq 14m 95Mi
kube-system calico-node-4zd25 12m 93Mi
kube-system calico-node-7dlcz 14m 93Mi
kube-system calico-node-cwgcl 9m 91Mi
kube-system calico-node-kc866 11m 96Mi
kube-system calico-node-kksvs 13m 93Mi
kube-system calico-node-v6chq 15m 94Mi
kube-system coredns-566564f9fd-j4gsv 1m 12Mi
kube-system coredns-566564f9fd-jv4wz 1m 12Mi
kube-system metrics-server-74446748bf-m4dh8 3m 16Mi
kube-system metrics-server-74446748bf-xmrbn 2m 16Mi
myapp myapp-tomcat-app1-deployment-6d9d8885db-57dzl 1m 54Mi
myapp myapp-tomcat-app1-deployment-6d9d8885db-n2twq 1m 62Mi
myapp myapp-tomcat-app1-deployment-6d9d8885db-scj4t 2m 56Mi
velero-system velero-f9b9bc564-fnlkh 1m 21Mi
kubectl top nodes output:
root@k8s-master-01-11:/data/k8s_yaml/app/metric# kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master-01-11 34m 3% 1113Mi 66%
k8s-worker-01-23 32m 1% 935Mi 25%
k8s-worker-02-21 31m 1% 931Mi 25%
k8s-worker-03-22 36m 1% 973Mi 26%
k8s-worker-04-32 35m 1% 878Mi 52%
k8s-worker-05-33 33m 1% 990Mi 59%
k8s-worker-07-13 102m 10% 1071Mi 64%
2.Deploying the HPA
1.Command-line deployment
kubectl autoscale deployment ${DEPLOYMENT_NAME} --min=${MIN_REPLICAS} --max=${MAX_REPLICAS} --cpu-percent=80 -n ${NAMESPACE}
horizontalpodautoscaler.autoscaling/myserver-tomcat-deployment autoscaled
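Applied to the deployment used throughout this article, the command would look like this (an illustration using the names from this cluster; the 20% target matches the HPA YAML below):
kubectl autoscale deployment myapp-tomcat-app1-deployment --min=2 --max=5 --cpu-percent=20 -n myapp
kubectl get hpa -n myapp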
2.YAML deployment
Application YAML:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app1-deployment-label
name: myapp-tomcat-app1-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app1-selector
template:
metadata:
labels:
app: myapp-tomcat-app1-selector
spec:
containers:
- name: myapp-tomcat-app1-container
image: www.ghostxin.online/application/tomcat:latest
#command: ["/apps/tomcat/bin/run_tomcat.sh"]
#imagePullPolicy: IfNotPresent
imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
resources:
limits:
cpu: 1
memory: 500Mi
requests:
cpu: 1
memory: 500Mi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: myapp-tomcat-app1-service-label
name: myapp-tomcat-app1-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 8080
protocol: TCP
targetPort: 8080
nodePort: 30088
selector:
app: myapp-tomcat-app1-selector
The HPA YAML:
root@k8s-master-01-11:/data/k8s_yaml/app/HPA# cat hpa.yaml
#apiVersion: autoscaling/v2beta1
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
namespace: myapp # change to your own namespace
name: myapp-tomcat-app1-deployment # change to your own deployment
labels:
app: myapp-tomcat-app1
version: v2beta1
spec:
scaleTargetRef:
apiVersion: apps/v1
#apiVersion: extensions/v1beta1
kind: Deployment
name: myapp-tomcat-app1-deployment
minReplicas: 2
maxReplicas: 5
targetCPUUtilizationPercentage: 20
#metrics:
#- type: Resource
# resource:
# name: cpu
# targetAverageUtilization: 60
#- type: Resource
# resource:
# name: memory
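Apply the HPA (assuming the manifest is saved as hpa.yaml, as shown in the prompt above):
kubectl apply -f hpa.yaml
kubectl get hpa -n myapp -w   # watch until the TARGETS column shows real utilization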
Verify the deployment:
root@k8s-master-01-11:/data/k8s_yaml/app/HPA# kubectl get hpa -n myapp
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
myapp-tomcat-app1-deployment Deployment/myapp-tomcat-app1-deployment <unknown>/60% 2 5 2 31s
root@k8s-master-01-11:/data/k8s_yaml/app/HPA# kubectl get pod -n myapp
NAME READY STATUS RESTARTS AGE
myapp-tomcat-app1-deployment-85f8cf7cb7-l5h6t 1/1 Running 0 5m51s
myapp-tomcat-app1-deployment-85f8cf7cb7-lxjxc 1/1 Running 0 32s
root@k8s-master-01-11:/data/k8s_yaml/app/HPA# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 2 (6h44m ago) 40h
kube-system calico-node-4nxsw 1/1 Running 0 40h
kube-system calico-node-4v5kq 1/1 Running 1 (16h ago) 40h
kube-system calico-node-4zd25 1/1 Running 1 (16h ago) 40h
kube-system calico-node-7dlcz 1/1 Running 1 (16h ago) 40h
kube-system calico-node-cwgcl 1/1 Running 2 (15h ago) 40h
kube-system calico-node-kc866 1/1 Running 1 (15h ago) 40h
kube-system calico-node-kksvs 1/1 Running 1 (16h ago) 40h
kube-system calico-node-v6chq 1/1 Running 1 (16h ago) 40h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 15h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 15h
kube-system metrics-server-74446748bf-m4dh8 1/1 Running 0 17m
kube-system metrics-server-74446748bf-xmrbn 1/1 Running 0 17m
myapp myapp-tomcat-app1-deployment-85f8cf7cb7-l5h6t 1/1 Running 0 6m5s
myapp myapp-tomcat-app1-deployment-85f8cf7cb7-lxjxc 1/1 Running 0 46s
velero-system velero-f9b9bc564-fnlkh 1/1 Running 0 162m
root@k8s-master-01-11:/data/k8s_yaml/app/HPA#
Check metrics collection:
root@ubuntuharbor50:~# kubectl get hpa -n myapp
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
myapp-tomcat-app1-deployment Deployment/myapp-tomcat-app1-deployment 3%/20% 2 5 2 14h
root@ubuntuharbor50:~#
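To see the HPA actually scale out, the pods need CPU load. A minimal sketch, assuming the service name and port from the Deployment/Service YAML above (the busybox image and the wget loop are illustrative, not part of the original setup):
kubectl run load-generator -n myapp --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://myapp-tomcat-app1-service:8080/ >/dev/null; done"
# watch the replica count follow the CPU target
kubectl get hpa -n myapp -w
kubectl delete pod load-generator -n myapp   # clean up afterwards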
Describe one of the pods:
root@ubuntuharbor50:~# kubectl describe pod -n myapp myapp-tomcat-app1-deployment-dc66b885b-85q8c
Name: myapp-tomcat-app1-deployment-dc66b885b-85q8c
Namespace: myapp
Priority: 0
Service Account: default
Node: k8s-worker-02-21/10.0.0.21
Start Time: Tue, 06 Jun 2023 08:11:32 +0800
Labels: app=myapp-tomcat-app1-selector
pod-template-hash=dc66b885b
Annotations: <none>
Status: Running
IP: 172.16.76.75
IPs:
IP: 172.16.76.75
Controlled By: ReplicaSet/myapp-tomcat-app1-deployment-dc66b885b
Containers:
myapp-tomcat-app1-container:
Container ID: containerd://ab33af1718f7aa465ad94ecb82a6e335a3be183ae0ada5ac78298bd594cdbc65
Image: www.ghostxin.online/application/tomcat:latest
Image ID: www.ghostxin.online/application/tomcat@sha256:06f4422c74e1774bb9121dba1345127283cd49ba0902b0b6b6fdbce142c94b3e
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 06 Jun 2023 08:11:33 +0800
Ready: True
Restart Count: 0
Requests:
cpu: 50m
memory: 500Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j4dtw (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-j4dtw:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 14h kubelet Pulling image "www.ghostxin.online/application/tomcat:latest"
Normal Pulled 14h kubelet Successfully pulled image "www.ghostxin.online/application/tomcat:latest" in 46.300854ms (46.304687ms including waiting)
Normal Created 14h kubelet Created container myapp-tomcat-app1-container
Normal Started 14h kubelet Started container myapp-tomcat-app1-container
Normal Scheduled 14h default-scheduler Successfully assigned myapp/myapp-tomcat-app1-deployment-dc66b885b-85q8c to k8s-worker-02-21
4. Resource limits
CPU limits:
https://kubernetes.io/zh-cn/docs/tasks/configure-pod-container/assign-cpu-resource/
Memory limits:
https://kubernetes.io/zh-cn/docs/tasks/configure-pod-container/assign-memory-resource/
CPU is measured in cores: 1 means one full core, 100m means 0.1 core.
Memory is measured in Ki, Mi, Gi, Ti, Pi, Ei or K, M, G, T, P, E.
Resource constraints are split into requests and limits.
requests is what the workload asks for up front; the scheduler reserves that amount as guaranteed capacity.
limits is the cap, the maximum amount of the resource the object may use.
limits is the final ceiling on usage: for example, with a CPU request of 1 core and a CPU limit of 2 cores, the container can use at most 2 cores.
Memory works the same way: limits is the maximum it can ultimately consume.
1. Kubernetes container resource limits
Resources are limited under containers; here the limit is 1 CPU, while the stress command tries to use 2 CPUs.
apiVersion: apps/v1
kind: Deployment
metadata:
name: limit-test-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels: #
app: limit-test-pod
template:
metadata:
labels:
app: limit-test-pod
spec:
containers:
- name: limit-test-container
image: www.ghostxin.online/application/stress-ng:latest
resources:
limits:
cpu: 1
memory: "256Mi"
requests:
cpu: 1
memory: "256Mi"
args: ["--vm", "2", "--vm-bytes", "256M"]
Check the usage: with the stress args in place, CPU is pinned right at the limit.
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case# kubectl top pods -n myapp
NAME CPU(cores) MEMORY(bytes)
limit-test-deployment-795566866c-7l2qh 1002m 162Mi
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case#
Now adjust requests and limits: set the CPU limit to 1.2 cores and the memory limit to 512Mi.
apiVersion: apps/v1
kind: Deployment
metadata:
name: limit-test-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: limit-test-pod
template:
metadata:
labels:
app: limit-test-pod
spec:
containers:
- name: limit-test-container
image: www.ghostxin.online/application/stress-ng:latest
resources:
limits:
cpu: "1.2"
memory: "512Mi"
requests:
memory: "100Mi"
cpu: "500m"
args: ["--vm", "2", "--vm-bytes", "256M"]
Check top again: CPU is capped at roughly 1200m (millicores), memory usage is about 256Mi, and the memory limit is 512Mi.
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case# kubectl top pods -n myapp
NAME CPU(cores) MEMORY(bytes)
limit-test-deployment-744998dd8b-mlcqf 1202m 259Mi
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case#
2. Kubernetes Pod resource limits (LimitRange)
Official docs: https://kubernetes.io/zh-cn/docs/concepts/policy/limit-range/
Use a LimitRange YAML to constrain Pods and containers within a namespace:
apiVersion: v1
kind: LimitRange
metadata:
name: limitrange-myapp
namespace: myapp
spec:
limits:
- type: Container #the resource type being constrained
max:
cpu: "2" #maximum CPU for a single container
memory: "2Gi" #maximum memory for a single container
min:
cpu: "500m" #minimum CPU for a single container
memory: "512Mi" #minimum memory for a single container
default:
cpu: "500m" #default CPU limit for a single container
memory: "512Mi" #default memory limit for a single container
defaultRequest:
cpu: "500m" #default CPU request for a single container
memory: "512Mi" #default memory request for a single container
maxLimitRequestRatio:
cpu: 2 #maximum CPU limit/request ratio is 2
memory: 2 #maximum memory limit/request ratio is 2
- type: Pod
max:
cpu: "1" #maximum CPU for a single Pod
memory: "1Gi" #maximum memory for a single Pod
- type: PersistentVolumeClaim
max:
storage: 50Gi #maximum requests.storage for a PVC
min:
storage: 30Gi #minimum requests.storage for a PVC
Check the applied resource limits with this command:
kubectl get deployment -n myapp myapp-wordpress-deployment -o json
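To inspect only the container resources that the LimitRange defaulted in, rather than the full JSON, a jsonpath query can be used (deployment and namespace names as in this example):
kubectl get deployment myapp-wordpress-deployment -n myapp \
  -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'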
Apply an app YAML whose resources are set beyond the limits:
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case# cat case4-pod-RequestRatio-limit.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
app: myapp-wordpress-deployment-label
name: myapp-wordpress-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-wordpress-selector
template:
metadata:
labels:
app: myapp-wordpress-selector
spec:
containers:
- name: myapp-wordpress-nginx-container
image: nginx:1.16.1
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
resources:
limits:
cpu: 500m
memory: 0.5Gi
requests:
cpu: 500m
memory: 0.5Gi
- name: myapp-wordpress-php-container
image: php:5.6-fpm-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
resources:
limits:
cpu: 500m
memory: 0.5Gi
requests:
cpu: 500m
memory: 0.5Gi
- name: myapp-wordpress-redis-container
image: redis:4.0.14-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
resources:
limits:
cpu: 1.5
memory: 1.5Gi
requests:
cpu: 1.5
memory: 1.5Gi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: myapp-wordpress-service-label
name: myapp-wordpress-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 30033
selector:
app: myapp-wordpress-selector
Check the deployment's error message:
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case# kubectl get deployment -n myapp myapp-wordpress-deployment -o json
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"annotations": {
"deployment.kubernetes.io/revision": "2",
"kubectl.kubernetes.io/last-applied-configuration": "{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"myapp-wordpress-deployment-label"},"name":"myapp-wordpress-deployment","namespace":"myapp"},"spec":{"replicas":1,"selector":{"matchLabels":{"app":"myapp-wordpress-selector"}},"template":{"metadata":{"labels":{"app":"myapp-wordpress-selector"}},"spec":{"containers":[{"image":"nginx:1.16.1","imagePullPolicy":"Always","name":"myapp-wordpress-nginx-container","ports":[{"containerPort":80,"name":"http","protocol":"TCP"}],"resources":{"limits":{"cpu":"500m","memory":"0.5Gi"},"requests":{"cpu":"500m","memory":"0.5Gi"}}},{"image":"php:5.6-fpm-alpine","imagePullPolicy":"Always","name":"myapp-wordpress-php-container","ports":[{"containerPort":80,"name":"http","protocol":"TCP"}],"resources":{"limits":{"cpu":"500m","memory":"0.5Gi"},"requests":{"cpu":"500m","memory":"0.5Gi"}}},{"image":"redis:4.0.14-alpine","imagePullPolicy":"Always","name":"myapp-wordpress-redis-container","ports":[{"containerPort":80,"name":"http","protocol":"TCP"}],"resources":{"limits":{"cpu":1.5,"memory":"1.5Gi"},"requests":{"cpu":1.5,"memory":"1.5Gi"}}}]}}}}
"
},
"creationTimestamp": "2023-06-06T20:10:04Z",
"generation": 2,
"labels": {
"app": "myapp-wordpress-deployment-label"
},
"name": "myapp-wordpress-deployment",
"namespace": "myapp",
"resourceVersion": "195977",
"uid": "b3449d23-47b3-4b8a-a40b-6f558396a630"
},
"spec": {
"progressDeadlineSeconds": 600,
"replicas": 1,
"revisionHistoryLimit": 10,
"selector": {
"matchLabels": {
"app": "myapp-wordpress-selector"
}
},
"strategy": {
"rollingUpdate": {
"maxSurge": "25%",
"maxUnavailable": "25%"
},
"type": "RollingUpdate"
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "myapp-wordpress-selector"
}
},
"spec": {
"containers": [
{
"image": "nginx:1.16.1",
"imagePullPolicy": "Always",
"name": "myapp-wordpress-nginx-container",
"ports": [
{
"containerPort": 80,
"name": "http",
"protocol": "TCP"
}
],
"resources": {
"limits": {
"cpu": "500m",
"memory": "512Mi"
},
"requests": {
"cpu": "500m",
"memory": "512Mi"
}
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File"
},
{
"image": "php:5.6-fpm-alpine",
"imagePullPolicy": "Always",
"name": "myapp-wordpress-php-container",
"ports": [
{
"containerPort": 80,
"name": "http",
"protocol": "TCP"
}
],
"resources": {
"limits": {
"cpu": "500m",
"memory": "512Mi"
},
"requests": {
"cpu": "500m",
"memory": "512Mi"
}
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File"
},
{
"image": "redis:4.0.14-alpine",
"imagePullPolicy": "Always",
"name": "myapp-wordpress-redis-container",
"ports": [
{
"containerPort": 80,
"name": "http",
"protocol": "TCP"
}
],
"resources": {
"limits": {
"cpu": "1500m",
"memory": "1536Mi"
},
"requests": {
"cpu": "1500m",
"memory": "1536Mi"
}
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File"
}
],
"dnsPolicy": "ClusterFirst",
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"terminationGracePeriodSeconds": 30
}
}
},
"status": {
"conditions": [
{
"lastTransitionTime": "2023-06-06T20:10:04Z",
"lastUpdateTime": "2023-06-06T20:10:04Z",
"message": "Deployment does not have minimum availability.",
"reason": "MinimumReplicasUnavailable",
"status": "False",
"type": "Available"
},
{
"lastTransitionTime": "2023-06-06T20:10:04Z",
"lastUpdateTime": "2023-06-06T20:10:04Z",
"message": "pods "myapp-wordpress-deployment-564c567696-z69xd" is forbidden: [maximum cpu usage per Container is 1, but limit is 2, maximum memory usage per Container is 1Gi, but limit is 2Gi, maximum cpu usage per Pod is 2, but limit is 3, maximum memory usage per Pod is 2Gi, but limit is 3221225472]",
"reason": "FailedCreate",
"status": "True",
"type": "ReplicaFailure"
},
{
"lastTransitionTime": "2023-06-06T20:24:16Z",
"lastUpdateTime": "2023-06-06T20:24:16Z",
"message": "Created new replica set "myapp-wordpress-deployment-565d99c778"",
"reason": "NewReplicaSetCreated",
"status": "True",
"type": "Progressing"
}
],
"observedGeneration": 2,
"unavailableReplicas": 2
}
}
The message field shows the error.
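To pull out just that failure message instead of scanning the whole JSON, a jsonpath filter works (same deployment and namespace as above):
kubectl get deployment myapp-wordpress-deployment -n myapp \
  -o jsonpath='{.status.conditions[?(@.type=="ReplicaFailure")].message}'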
3. Kubernetes namespace resource limits (ResourceQuota)
Official docs: https://kubernetes.io/zh-cn/docs/concepts/policy/resource-quotas/
Limit namespace resources with a ResourceQuota YAML:
apiVersion: v1
kind: ResourceQuota
metadata:
name: quota-myapp
namespace: myapp
spec:
hard:
requests.cpu: "8" #限制请求cpu为8核
limits.cpu: "8" #资源限制8核cpu
requests.memory: 4Gi #请求内存为4G
limits.memory: 4Gi #限制内存4G
requests.nvidia.com/gpu: 4 #GPU4 个
pods: "100" #pod副本数为100个
services: "100" #service 100个
The quota actually used in this environment:
apiVersion: v1
kind: ResourceQuota
metadata:
name: quota-myapp
namespace: myapp
spec:
hard:
requests.cpu: "3"
limits.cpu: "3"
requests.memory: 3Gi
limits.memory: 3Gi
requests.nvidia.com/gpu: 4
pods: "100"
services: "100"
Test YAML with 5 replicas:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
app: myapp-nginx-deployment-label
name: myapp-nginx-deployment
namespace: myapp
spec:
replicas: 5
selector:
matchLabels:
app: myapp-nginx-selector
template:
metadata:
labels:
app: myapp-nginx-selector
spec:
containers:
- name: myapp-nginx-container
image: nginx:1.16.1
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 1
memory: 1Gi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: myapp-nginx-service-label
name: myapp-nginx-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 30033
selector:
app: myapp-nginx-selector
An error is reported: the deployment cannot create all 5 replicas.
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"annotations": {
"deployment.kubernetes.io/revision": "1",
"kubectl.kubernetes.io/last-applied-configuration": "{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"myapp-nginx-deployment-label"},"name":"myapp-nginx-deployment","namespace":"myapp"},"spec":{"replicas":5,"selector":{"matchLabels":{"app":"myapp-nginx-selector"}},"template":{"metadata":{"labels":{"app":"myapp-nginx-selector"}},"spec":{"containers":[{"env":[{"name":"password","value":"123456"},{"name":"age","value":"18"}],"image":"nginx:1.16.1","imagePullPolicy":"Always","name":"myapp-nginx-container","ports":[{"containerPort":80,"name":"http","protocol":"TCP"}],"resources":{"limits":{"cpu":1,"memory":"1Gi"},"requests":{"cpu":1,"memory":"1Gi"}}}]}}}}
"
},
"creationTimestamp": "2023-06-06T21:02:12Z",
"generation": 1,
"labels": {
"app": "myapp-nginx-deployment-label"
},
"name": "myapp-nginx-deployment",
"namespace": "myapp",
"resourceVersion": "201993",
"uid": "a7e530bf-0f4e-4e31-b623-6499cf431ead"
},
"spec": {
"progressDeadlineSeconds": 600,
"replicas": 5,
"revisionHistoryLimit": 10,
"selector": {
"matchLabels": {
"app": "myapp-nginx-selector"
}
},
"strategy": {
"rollingUpdate": {
"maxSurge": "25%",
"maxUnavailable": "25%"
},
"type": "RollingUpdate"
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "myapp-nginx-selector"
}
},
"spec": {
"containers": [
{
"env": [
{
"name": "password",
"value": "123456"
},
{
"name": "age",
"value": "18"
}
],
"image": "nginx:1.16.1",
"imagePullPolicy": "Always",
"name": "myapp-nginx-container",
"ports": [
{
"containerPort": 80,
"name": "http",
"protocol": "TCP"
}
],
"resources": {
"limits": {
"cpu": "1",
"memory": "1Gi"
},
"requests": {
"cpu": "1",
"memory": "1Gi"
}
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File"
}
],
"dnsPolicy": "ClusterFirst",
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"terminationGracePeriodSeconds": 30
}
}
},
"status": {
"availableReplicas": 3,
"conditions": [
{
"lastTransitionTime": "2023-06-06T21:02:12Z",
"lastUpdateTime": "2023-06-06T21:02:12Z",
"message": "Deployment does not have minimum availability.",
"reason": "MinimumReplicasUnavailable",
"status": "False",
"type": "Available"
},
{
"lastTransitionTime": "2023-06-06T21:02:12Z",
"lastUpdateTime": "2023-06-06T21:02:12Z",
"message": "pods "myapp-nginx-deployment-7db9866d86-x29kf" is forbidden: exceeded quota: quota-myapp, requested: limits.cpu=1,limits.memory=1Gi,requests.cpu=1,requests.memory=1Gi, used: limits.cpu=3,limits.memory=3Gi,requests.cpu=3,requests.memory=3Gi, limited: limits.cpu=3,limits.memory=3Gi,requests.cpu=3,requests.memory=3Gi",
"reason": "FailedCreate",
"status": "True",
"type": "ReplicaFailure"
},
{
"lastTransitionTime": "2023-06-06T21:02:12Z",
"lastUpdateTime": "2023-06-06T21:03:43Z",
"message": "ReplicaSet "myapp-nginx-deployment-7db9866d86" is progressing.",
"reason": "ReplicaSetUpdated",
"status": "True",
"type": "Progressing"
}
],
"observedGeneration": 1,
"readyReplicas": 3,
"replicas": 3,
"unavailableReplicas": 2,
"updatedReplicas": 3
}
}
Only three replicas are actually running, which is the maximum the quota allows:
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case# kubectl get pod -n myapp
NAME READY STATUS RESTARTS AGE
myapp-nginx-deployment-7db9866d86-47mp8 1/1 Running 0 113s
myapp-nginx-deployment-7db9866d86-m76fs 1/1 Running 0 3m20s
myapp-nginx-deployment-7db9866d86-ntpf6 1/1 Running 0 3m20s
root@k8s-master-01-11:/data/k8s_yaml/app/limit_request/limit-case#
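To see how much of the quota is already consumed against its hard limits, describe the ResourceQuota created above:
kubectl describe resourcequota quota-myapp -n myapp
# per the FailedCreate message above, used limits.cpu/limits.memory are already 3 / 3Gi, so the 4th and 5th replicas stay unscheduled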
4. Resource scheduling
Node scheduling, official docs:
https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/assign-pod-node/
nodeSelector schedules pods onto specific nodes based on node label selectors.
1. Affinity and anti-affinity: node selector
First label the nodes that the project should schedule onto.
Add two labels: disktype='SSD' and diskplay='Nvdia'.
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC# kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.0.0.12 Ready node 4h29m v1.26.4
k8s-master-01-11 Ready,SchedulingDisabled master 2d16h v1.26.4
k8s-worker-01-23 Ready node 2d16h v1.26.4
k8s-worker-02-21 Ready node 2d16h v1.26.4
k8s-worker-03-22 Ready node 2d16h v1.26.4
k8s-worker-04-32 Ready node 2d16h v1.26.4
k8s-worker-05-33 Ready node 2d16h v1.26.4
k8s-worker-07-13 Ready node 2d16h v1.26.4
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC# kubectl label node 10.0.0.12 disktype='SSD'
node/10.0.0.12 labeled
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC# kubectl label node k8s-worker-07-13 diskplay='Nvdia'
node/k8s-worker-07-13 labeled
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC#
Label a few more nodes:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl label node k8s-worker-05-33 diskplay='Nvdia'
node/k8s-worker-05-33 labeled
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl label node k8s-worker-04-32 diskplay='Nvdia'
node/k8s-worker-04-32 labeled
Nodes now labeled diskplay=Nvdia: k8s-worker-05-33, k8s-worker-04-32, k8s-worker-07-13
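The deployment used for this test is not shown above; a minimal sketch of what it would look like, assuming the myapp namespace, the tomcat:v1 image and the diskplay=Nvdia label added above (the replica count and names follow the output below, but treat this as illustrative):
kind: Deployment
apiVersion: apps/v1
metadata:
  name: myapp-tomcat-app1-deployment
  namespace: myapp
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp-tomcat-app1-selector
  template:
    metadata:
      labels:
        app: myapp-tomcat-app1-selector
    spec:
      nodeSelector:
        diskplay: Nvdia            # only nodes carrying this label are eligible
      containers:
      - name: myapp-tomcat-myapp1
        image: www.ghostxin.online/application/tomcat:v1
        ports:
        - containerPort: 8080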
Check where the pods were scheduled:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 3 (8h ago) 2d16h 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system calico-node-4v5kq 1/1 Running 2 (4h24m ago) 2d16h 10.0.0.33 k8s-worker-05-33 <none> <none>
kube-system calico-node-4zd25 1/1 Running 1 (40h ago) 2d16h 10.0.0.22 k8s-worker-03-22 <none> <none>
kube-system calico-node-7dlcz 1/1 Running 1 (40h ago) 2d16h 10.0.0.21 k8s-worker-02-21 <none> <none>
kube-system calico-node-b96p5 1/1 Running 1 (4h24m ago) 4h52m 10.0.0.12 10.0.0.12 <none> <none>
kube-system calico-node-cwgcl 1/1 Running 2 (40h ago) 2d16h 10.0.0.11 k8s-master-01-11 <none> <none>
kube-system calico-node-kc866 1/1 Running 2 (4h25m ago) 2d16h 10.0.0.13 k8s-worker-07-13 <none> <none>
kube-system calico-node-kksvs 1/1 Running 2 (4h26m ago) 2d16h 10.0.0.32 k8s-worker-04-32 <none> <none>
kube-system calico-node-v6chq 1/1 Running 1 (40h ago) 2d16h 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 40h 172.16.221.65 k8s-worker-03-22 <none> <none>
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 40h 172.16.76.65 k8s-worker-02-21 <none> <none>
kube-system metrics-server-74446748bf-2w2n4 1/1 Running 0 23h 172.16.124.74 k8s-worker-01-23 <none> <none>
kube-system metrics-server-74446748bf-g7b65 1/1 Running 1 (4h24m ago) 23h 172.16.94.8 k8s-worker-05-33 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-5fdf8ff74f-6znsb 1/1 Running 0 100m 172.16.227.133 k8s-worker-07-13 <none> <none>
kubernetes-dashboard kubernetes-dashboard-56cdd85c55-8vpvk 1/1 Running 0 100m 172.16.39.74 10.0.0.12 <none> <none>
myapp myapp-tomcat-app1-deployment-7d6cd8c6bc-4pjsg 1/1 Running 0 29s 172.16.5.7 k8s-worker-04-32 <none> <none>
myapp myapp-tomcat-app1-deployment-7d6cd8c6bc-4w4fg 1/1 Running 0 29s 172.16.227.136 k8s-worker-07-13 <none> <none>
myapp myapp-tomcat-app1-deployment-7d6cd8c6bc-9m46r 1/1 Running 0 29s 172.16.94.17 k8s-worker-05-33 <none> <none>
myapp myapp-tomcat-app1-deployment-7d6cd8c6bc-p524w 1/1 Running 0 29s 172.16.94.18 k8s-worker-05-33 <none> <none>
velero-system velero-f9b9bc564-fnlkh 1/1 Running 1 (4h26m ago) 27h 172.16.5.5 k8s-worker-04-32 <none> <none>
The pods were all scheduled onto the three labeled nodes: k8s-worker-05-33, k8s-worker-04-32 and k8s-worker-07-13.
Describe one of them:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl describe pod -n myapp myapp-tomcat-app1-deployment-7d6cd8c6bc-9m46r
Name: myapp-tomcat-app1-deployment-7d6cd8c6bc-9m46r
Namespace: myapp
Priority: 0
Service Account: default
Node: k8s-worker-05-33/10.0.0.33
Start Time: Wed, 07 Jun 2023 07:28:09 +0800
Labels: app=myapp-tomcat-app1-selector
pod-template-hash=7d6cd8c6bc
Annotations: <none>
Status: Running
IP: 172.16.94.17
IPs:
IP: 172.16.94.17
Controlled By: ReplicaSet/myapp-tomcat-app1-deployment-7d6cd8c6bc
Containers:
myapp-tomcat-myapp1:
Container ID: containerd://f042f06972e7e8a56ef3c0a98be721281afa775e4b2610024cff1557c0d398b6
Image: www.ghostxin.online/application/tomcat:v1
Image ID: www.ghostxin.online/application/tomcat@sha256:d1e29610f432b50e4ffe5869dddaa80ab2a4bf8dc4fa5218fd7ad930aa86f6b8
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Wed, 07 Jun 2023 07:28:11 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5rvdm (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-5rvdm:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: diskplay=Nvdia
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m35s default-scheduler Successfully assigned myapp/myapp-tomcat-app1-deployment-7d6cd8c6bc-9m46r to k8s-worker-05-33
Normal Pulling 2m34s kubelet Pulling image "www.ghostxin.online/application/tomcat:v1"
Normal Pulled 2m33s kubelet Successfully pulled image "www.ghostxin.online/application/tomcat:v1" in 47.955962ms (48.030546ms including waiting)
Normal Created 2m33s kubelet Created container myapp-tomcat-myapp1
Normal Started 2m33s kubelet Started container myapp-tomcat-myapp1
Everything looks fine.
2. Node affinity and anti-affinity
Node affinity and anti-affinity:
Introduced in Kubernetes 1.2 and similar to nodeSelector, affinity controls which nodes pods may be scheduled onto.
requiredDuringSchedulingIgnoredDuringExecution: the pod's match conditions must be satisfied, otherwise it is not scheduled (hard affinity).
preferredDuringSchedulingIgnoredDuringExecution: the match conditions are preferred; if none are satisfied, the pod is still scheduled onto a non-matching node (soft affinity).
IgnoredDuringExecution: if a node's labels change while the pod is running and the affinity rule no longer matches, the pod keeps running on that node.
Affinity and anti-affinity serve the same purpose of steering pod placement, but they are considerably more expressive than nodeSelector.
Comparison of nodeSelector and affinity:
(1) Affinity and anti-affinity are not limited to plain AND matching; the supported operators are In, NotIn, Exists, DoesNotExist, Gt and Lt (a short sketch of these operators follows this list):
In: the label value is in the given list; a match allows scheduling to that node (node affinity).
NotIn: the label value is not in the given list; matching nodes are avoided (anti-affinity).
Gt: the label value is greater than the given value.
Lt: the label value is less than the given value.
Exists: the label key is present on the node, regardless of its value.
(2) Soft and hard affinity can both be configured; with soft affinity, if no node matches, the pod is still scheduled onto a non-matching node.
(3) Affinity can also be defined between pods, for example to forbid replicas from being scheduled onto the same node.
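A short sketch of the less common operators inside a nodeAffinity matchExpressions block (the gpu-count label key is illustrative and not set on this cluster):
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: Exists          # the node only needs to carry the disktype key
        - key: gpu-count
          operator: Gt              # the label value, as an integer, must be greater than 1
          values:
          - "1"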
Hard affinity in practice:
Check the existing labels:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME LABELS
10.0.0.12 Ready node 20h v1.26.4 10.0.0.12 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,disktype=SSD,kubernetes.io/arch=arm64,kubernetes.io/hostname=10.0.0.12,kubernetes.io/os=linux,kubernetes.io/role=node,projeck=python
k8s-master-01-11 Ready,SchedulingDisabled master 3d8h v1.26.4 10.0.0.11 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-master-01-11,kubernetes.io/os=linux,kubernetes.io/role=master
k8s-worker-01-23 Ready node 3d8h v1.26.4 10.0.0.23 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-01-23,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-02-21 Ready node 3d8h v1.26.4 10.0.0.21 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-02-21,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-03-22 Ready node 3d8h v1.26.4 10.0.0.22 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-03-22,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-04-32 Ready node 3d8h v1.26.4 10.0.0.32 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,diskplay=Nvdia,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-04-32,kubernetes.io/os=linux,kubernetes.io/role=node,projeck=go
k8s-worker-05-33 Ready node 3d8h v1.26.4 10.0.0.33 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,diskplay=Nvdia,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-05-33,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-07-13 Ready node 3d8h v1.26.4 10.0.0.13 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,diskplay=Nvdia,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-07-13,kubernetes.io/os=linux,kubernetes.io/role=node
The labels currently in use are diskplay=Nvdia and disktype=SSD; node 10.0.0.12 carries disktype=SSD and project=python,
and k8s-worker-04-32 carries diskplay='Nvdia' and project='go'.
YAML for hard (required) node affinity:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions: #condition 1: one key with several values; matching any one value satisfies this term
- key: disktype
operator: In
values:
- SSD # matching just this one value is enough to schedule
- xxx
- matchExpressions: #condition 2: a second nodeSelectorTerm; terms are ORed, so matching either term allows scheduling
- key: project
operator: In
values:
- xxx #even if this term matches nothing, the pod can still be scheduled: with multiple nodeSelectorTerms, any one term matching any one value is enough
- nnn
Apply this YAML file:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl apply -f case3-1.1-nodeAffinity-requiredDuring-matchExpressions.yaml
deployment.apps/myapp-tomcat-app2-deployment created
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 3 (24h ago) 3d8h
kube-system calico-node-4v5kq 1/1 Running 2 (19h ago) 3d8h
kube-system calico-node-4zd25 1/1 Running 1 (2d8h ago) 3d8h
kube-system calico-node-7dlcz 1/1 Running 1 (2d8h ago) 3d8h
kube-system calico-node-b96p5 1/1 Running 1 (19h ago) 20h
kube-system calico-node-cwgcl 1/1 Running 2 (2d7h ago) 3d8h
kube-system calico-node-kc866 1/1 Running 2 (19h ago) 3d8h
kube-system calico-node-kksvs 1/1 Running 2 (19h ago) 3d8h
kube-system calico-node-v6chq 1/1 Running 1 (2d8h ago) 3d8h
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 2d7h
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 2d7h
kube-system metrics-server-74446748bf-2w2n4 1/1 Running 0 39h
kube-system metrics-server-74446748bf-g7b65 1/1 Running 1 (19h ago) 39h
kubernetes-dashboard dashboard-metrics-scraper-5fdf8ff74f-6znsb 1/1 Running 0 17h
kubernetes-dashboard kubernetes-dashboard-56cdd85c55-8vpvk 1/1 Running 0 17h
myapp myapp-tomcat-app2-deployment-57bcfc65b4-bkjl5 0/1 ContainerCreating 0 5s
velero-system velero-f9b9bc564-fnlkh 1/1 Running 1 (19h ago) 42h
Check the scheduling result with describe:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl describe pod -n myapp myapp-tomcat-app2-deployment-57bcfc65b4-bkjl5
Name: myapp-tomcat-app2-deployment-57bcfc65b4-bkjl5
Namespace: myapp
Priority: 0
Service Account: default
Node: 10.0.0.12/10.0.0.12
Start Time: Thu, 08 Jun 2023 10:37:45 +0800
Labels: app=myapp-tomcat-app2-selector
pod-template-hash=57bcfc65b4
Annotations: <none>
Status: Running
IP: 172.16.39.75
IPs:
IP: 172.16.39.75
Controlled By: ReplicaSet/myapp-tomcat-app2-deployment-57bcfc65b4
Containers:
myapp-tomcat-app2-container:
Container ID: containerd://d2f23f0d05737482555691fc2b3f5730166b4fadb8786f7a3046e0e85e7d3254
Image: www.ghostxin.online/application/tomcat:v1
Image ID: www.ghostxin.online/application/tomcat@sha256:d1e29610f432b50e4ffe5869dddaa80ab2a4bf8dc4fa5218fd7ad930aa86f6b8
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 08 Jun 2023 10:38:09 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jpqq6 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-jpqq6:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 58s default-scheduler Successfully assigned myapp/myapp-tomcat-app2-deployment-57bcfc65b4-bkjl5 to 10.0.0.12
Normal Pulling <invalid> kubelet Pulling image "www.ghostxin.online/application/tomcat:v1"
Normal Pulled <invalid> kubelet Successfully pulled image "www.ghostxin.online/application/tomcat:v1" in 12.716539406s (12.716559324s including waiting)
Normal Created <invalid> kubelet Created container myapp-tomcat-app2-container
Normal Started <invalid> kubelet Started container myapp-tomcat-app2-container
When only one of the terms can be matched, the pod is still scheduled. What happens with a single matchExpressions term that has multiple keys?
The YAML file:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions: #condition 1: for a key with several values, matching any one value counts as matching that key
- key: disktype
operator: In
values:
- SSD
- hdd
- key: project #condition 2: this key must also match one of its values; both keys in this term must match at the same time, otherwise the pod is not scheduled
operator: In
values:
- go
The term requires disktype SSD together with project go, and no node carries both labels, so nothing matches.
First check with kubectl get pod -A -o wide:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 3 (36h ago) 3d20h 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system calico-node-4v5kq 1/1 Running 2 (32h ago) 3d20h 10.0.0.33 k8s-worker-05-33 <none> <none>
kube-system calico-node-4zd25 1/1 Running 1 (2d20h ago) 3d20h 10.0.0.22 k8s-worker-03-22 <none> <none>
kube-system calico-node-7dlcz 1/1 Running 1 (2d20h ago) 3d20h 10.0.0.21 k8s-worker-02-21 <none> <none>
kube-system calico-node-b96p5 1/1 Running 1 (32h ago) 32h 10.0.0.12 10.0.0.12 <none> <none>
kube-system calico-node-cwgcl 1/1 Running 2 (2d19h ago) 3d20h 10.0.0.11 k8s-master-01-11 <none> <none>
kube-system calico-node-kc866 1/1 Running 2 (32h ago) 3d20h 10.0.0.13 k8s-worker-07-13 <none> <none>
kube-system calico-node-kksvs 1/1 Running 2 (32h ago) 3d20h 10.0.0.32 k8s-worker-04-32 <none> <none>
kube-system calico-node-v6chq 1/1 Running 1 (2d20h ago) 3d20h 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system coredns-566564f9fd-j4gsv 1/1 Running 0 2d19h 172.16.221.65 k8s-worker-03-22 <none> <none>
kube-system coredns-566564f9fd-jv4wz 1/1 Running 0 2d19h 172.16.76.65 k8s-worker-02-21 <none> <none>
kube-system metrics-server-74446748bf-2w2n4 1/1 Running 0 2d3h 172.16.124.74 k8s-worker-01-23 <none> <none>
kube-system metrics-server-74446748bf-g7b65 1/1 Running 1 (32h ago) 2d3h 172.16.94.8 k8s-worker-05-33 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-5fdf8ff74f-6znsb 1/1 Running 0 29h 172.16.227.133 k8s-worker-07-13 <none> <none>
kubernetes-dashboard kubernetes-dashboard-56cdd85c55-8vpvk 1/1 Running 0 29h 172.16.39.74 10.0.0.12 <none> <none>
myapp myapp-tomcat-app2-deployment-54bbc97779-qg9x7 0/1 Pending 0 12s <none> <none> <none> <none>
velero-system velero-f9b9bc564-fnlkh 1/1 Running 1 (32h ago) 2d6h 172.16.5.5 k8s-worker-04-32 <none> <none>
The newly created pod has not been scheduled; inspect it with kubectl describe:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl describe pod -n myapp myapp-tomcat-app2-deployment-54bbc97779-qg9x7
Name: myapp-tomcat-app2-deployment-54bbc97779-qg9x7
Namespace: myapp
Priority: 0
Service Account: default
Node: <none>
Labels: app=myapp-tomcat-app2-selector
pod-template-hash=54bbc97779
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/myapp-tomcat-app2-deployment-54bbc97779
Containers:
myapp-tomcat-app2-container:
Image: www.ghostxin.online/application/tomcat:v1
Port: 8080/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nqpqq (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-nqpqq:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 106s default-scheduler 0/8 nodes are available: 1 node(s) were unschedulable, 7 node(s) didn't match Pod's node affinity/selector. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling..
No node matches, so the values for project need to be changed:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions: #condition 1: for a key with several values, matching any one value counts as matching that key
- key: disktype
operator: In
values:
- SSD
- key: project #condition 2: this key must also match one of its values; both keys must match at the same time, otherwise the pod is not scheduled
operator: In
values:
- python #the value changed here
With the project value changed to python, deploy again:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 4 (2m20s ago) 3d21h 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system calico-node-4v5kq 1/1 Running 3 (2m11s ago) 3d21h 10.0.0.33 k8s-worker-05-33 <none> <none>
kube-system calico-node-4zd25 1/1 Running 2 (2m19s ago) 3d21h 10.0.0.22 k8s-worker-03-22 <none> <none>
kube-system calico-node-7dlcz 1/1 Running 2 (2m24s ago) 3d21h 10.0.0.21 k8s-worker-02-21 <none> <none>
kube-system calico-node-b96p5 1/1 Running 2 (119s ago) 33h 10.0.0.12 10.0.0.12 <none> <none>
kube-system calico-node-cwgcl 1/1 Running 3 (2m8s ago) 3d21h 10.0.0.11 k8s-master-01-11 <none> <none>
kube-system calico-node-kc866 1/1 Running 3 (2m29s ago) 3d21h 10.0.0.13 k8s-worker-07-13 <none> <none>
kube-system calico-node-kksvs 1/1 Running 3 (2m14s ago) 3d21h 10.0.0.32 k8s-worker-04-32 <none> <none>
kube-system calico-node-v6chq 1/1 Running 2 (2m20s ago) 3d21h 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system coredns-566564f9fd-j4gsv 1/1 Running 1 (2m19s ago) 2d20h 172.16.221.75 k8s-worker-03-22 <none> <none>
kube-system coredns-566564f9fd-jv4wz 1/1 Running 1 (2m24s ago) 2d20h 172.16.76.80 k8s-worker-02-21 <none> <none>
kube-system metrics-server-74446748bf-2w2n4 1/1 Running 1 (2m20s ago) 2d4h 172.16.124.77 k8s-worker-01-23 <none> <none>
kube-system metrics-server-74446748bf-g7b65 1/1 Running 2 (2m11s ago) 2d4h 172.16.94.19 k8s-worker-05-33 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-5fdf8ff74f-6znsb 1/1 Running 1 (2m29s ago) 30h 172.16.227.137 k8s-worker-07-13 <none> <none>
kubernetes-dashboard kubernetes-dashboard-56cdd85c55-8vpvk 1/1 Running 1 (119s ago) 30h 172.16.39.76 10.0.0.12 <none> <none>
myapp myapp-tomcat-app2-deployment-58d754c65f-mxvdp 1/1 Running 0 46s 172.16.39.78 10.0.0.12 <none> <none>
velero-system velero-f9b9bc564-fnlkh 1/1 Running 2 (2m14s ago) 2d7h 172.16.5.8 k8s-worker-04-32 <none> <none>
Scheduling succeeded; check the describe output:
Name: myapp-tomcat-app2-deployment-58d754c65f-mxvdp
Namespace: myapp
Priority: 0
Service Account: default
Node: 10.0.0.12/10.0.0.12
Start Time: Thu, 08 Jun 2023 11:51:34 +0800
Labels: app=myapp-tomcat-app2-selector
pod-template-hash=58d754c65f
Annotations: <none>
Status: Running
IP: 172.16.39.78
IPs:
IP: 172.16.39.78
Controlled By: ReplicaSet/myapp-tomcat-app2-deployment-58d754c65f
Containers:
myapp-tomcat-app2-container:
Container ID: containerd://d205f5326eafcc61e1970a467de5d5d0412af3476ef98b142b85a22acf390303
Image: www.ghostxin.online/application/tomcat:v1
Image ID: www.ghostxin.online/application/tomcat@sha256:d1e29610f432b50e4ffe5869dddaa80ab2a4bf8dc4fa5218fd7ad930aa86f6b8
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 08 Jun 2023 11:51:34 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-swlpz (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-swlpz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14s default-scheduler Successfully assigned myapp/myapp-tomcat-app2-deployment-58d754c65f-mxvdp to 10.0.0.12
Normal Pulled 14s kubelet Container image "www.ghostxin.online/application/tomcat:v1" already present on machine
Normal Created 14s kubelet Created container myapp-tomcat-app2-container
Normal Started 14s kubelet Started container myapp-tomcat-app2-container
Scheduling succeeded.
Soft affinity in practice
Soft affinity introduces a weight: the higher the weight, the stronger the scheduling preference. If no soft condition matches, the pod is still scheduled onto some other node.
The YAML file:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 80 #soft-affinity condition 1: the larger the weight, the higher the priority for matching and scheduling
preference:
matchExpressions:
- key: project
operator: In
values:
- python
- weight: 60 #soft-affinity condition 2: used when condition 1 cannot be satisfied
preference:
matchExpressions:
- key: disktype
operator: In
values:
- ssd
The pod was scheduled to 10.0.0.12:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-5d45cfb97b-ndhcx 1/1 Running 4 (3h18m ago) 4d 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system calico-node-4v5kq 1/1 Running 3 (3h17m ago) 4d 10.0.0.33 k8s-worker-05-33 <none> <none>
kube-system calico-node-4zd25 1/1 Running 2 (3h18m ago) 4d 10.0.0.22 k8s-worker-03-22 <none> <none>
kube-system calico-node-7dlcz 1/1 Running 2 (3h18m ago) 4d 10.0.0.21 k8s-worker-02-21 <none> <none>
kube-system calico-node-b96p5 1/1 Running 2 (3h17m ago) 36h 10.0.0.12 10.0.0.12 <none> <none>
kube-system calico-node-cwgcl 1/1 Running 3 (3h17m ago) 4d 10.0.0.11 k8s-master-01-11 <none> <none>
kube-system calico-node-kc866 1/1 Running 3 (3h18m ago) 4d 10.0.0.13 k8s-worker-07-13 <none> <none>
kube-system calico-node-kksvs 1/1 Running 3 (3h17m ago) 4d 10.0.0.32 k8s-worker-04-32 <none> <none>
kube-system calico-node-v6chq 1/1 Running 2 (3h18m ago) 4d 10.0.0.23 k8s-worker-01-23 <none> <none>
kube-system coredns-566564f9fd-j4gsv 1/1 Running 1 (3h18m ago) 2d23h 172.16.221.75 k8s-worker-03-22 <none> <none>
kube-system coredns-566564f9fd-jv4wz 1/1 Running 1 (3h18m ago) 2d23h 172.16.76.80 k8s-worker-02-21 <none> <none>
kube-system metrics-server-74446748bf-2w2n4 1/1 Running 1 (3h18m ago) 2d7h 172.16.124.77 k8s-worker-01-23 <none> <none>
kube-system metrics-server-74446748bf-g7b65 1/1 Running 2 (3h17m ago) 2d7h 172.16.94.19 k8s-worker-05-33 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-5fdf8ff74f-6znsb 1/1 Running 1 (3h18m ago) 33h 172.16.227.137 k8s-worker-07-13 <none> <none>
kubernetes-dashboard kubernetes-dashboard-56cdd85c55-8vpvk 1/1 Running 1 (3h17m ago) 33h 172.16.39.76 10.0.0.12 <none> <none>
myapp myapp-tomcat-app2-deployment-7f65676fbc-dsnng 1/1 Running 0 7s 172.16.39.79 10.0.0.12 <none> <none>
velero-system velero-f9b9bc564-fnlkh 1/1 Running 2 (3h17m ago) 2d10h 172.16.5.8 k8s-worker-04-32 <none> <none>
Check the describe output:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl describe pod -n myapp myapp-tomcat-app2-deployment-7f65676fbc-dsnng
Name: myapp-tomcat-app2-deployment-7f65676fbc-dsnng
Namespace: myapp
Priority: 0
Service Account: default
Node: 10.0.0.12/10.0.0.12
Start Time: Thu, 08 Jun 2023 15:07:57 +0800
Labels: app=myapp-tomcat-app2-selector
pod-template-hash=7f65676fbc
Annotations: <none>
Status: Running
IP: 172.16.39.79
IPs:
IP: 172.16.39.79
Controlled By: ReplicaSet/myapp-tomcat-app2-deployment-7f65676fbc
Containers:
myapp-tomcat-app2-container:
Container ID: containerd://b952308bcbea0190e3471dc46e32a2ab6c8e509a3e1ce53e59c7fe120b43763a
Image: www.ghostxin.online/application/tomcat:v1
Image ID: www.ghostxin.online/application/tomcat@sha256:d1e29610f432b50e4ffe5869dddaa80ab2a4bf8dc4fa5218fd7ad930aa86f6b8
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 08 Jun 2023 15:07:57 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b8dvj (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-b8dvj:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m21s default-scheduler Successfully assigned myapp/myapp-tomcat-app2-deployment-7f65676fbc-dsnng to 10.0.0.12
Normal Pulled 8m21s kubelet Container image "www.ghostxin.online/application/tomcat:v1" already present on machine
Normal Created 8m21s kubelet Created container myapp-tomcat-app2-container
Normal Started 8m21s kubelet Started container myapp-tomcat-app2-container
Combining hard and soft affinity
Hard affinity is a mandatory match while soft affinity is a preference; used together, the hard rules enforce the required constraints and the soft rules add scheduling weights.
Hard affinity: schedule onto any node except the master.
Soft affinity: weight 80 prefers nodes with project=python, weight 60 prefers nodes with disktype=ssd.
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy:
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution: #hard affinity
nodeSelectorTerms:
- matchExpressions: #hard match condition 1
- key: "kubernetes.io/role"
operator: NotIn
values:
- "master" #hard rule: only nodes whose kubernetes.io/role label is not master, i.e. never schedule to the master node (node anti-affinity)
preferredDuringSchedulingIgnoredDuringExecution: #soft affinity
- weight: 80
preference:
matchExpressions:
- key: project
operator: In
values:
- python
- weight: 60
preference:
matchExpressions:
- key: disktype
operator: In
values:
- ssd
Change the project values and the weights:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 5
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy:
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution: #hard affinity
nodeSelectorTerms:
- matchExpressions: #hard match condition 1
- key: "kubernetes.io/role"
operator: NotIn
values:
- "master" #hard rule: only nodes whose kubernetes.io/role label is not master, i.e. never schedule to the master node (node anti-affinity)
preferredDuringSchedulingIgnoredDuringExecution: #soft affinity
- weight: 50
preference:
matchExpressions:
- key: project
operator: In
values:
- python
- weight: 50
preference:
matchExpressions:
- key: project
operator: In
values:
- go
Check how the replicas are distributed:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-56d44b9477-bpzwn 1/1 Running 0 4m47s 172.16.39.86 10.0.0.12 <none> <none>
myapp-tomcat-app2-deployment-56d44b9477-mkmsn 1/1 Running 0 4m47s 172.16.39.87 10.0.0.12 <none> <none>
myapp-tomcat-app2-deployment-56d44b9477-sx4qn 1/1 Running 0 4m46s 172.16.94.20 k8s-worker-05-33 <none> <none>
myapp-tomcat-app2-deployment-56d44b9477-vgjgk 1/1 Running 0 4m47s 172.16.39.85 10.0.0.12 <none> <none>
myapp-tomcat-app2-deployment-56d44b9477-xrfbv 1/1 Running 0 4m46s 172.16.39.88 10.0.0.12 <none> <none>
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
Some replicas can still land on other nodes: the weight of 50 is a per-pod scheduling preference, not a ratio for splitting the replicas as a whole.
Anti-affinity in practice
Node anti-affinity keeps pods away from nodes whose labels match the rule: a pod is only scheduled onto nodes that do not carry the listed label values.
The YAML file:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy:
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions: #match condition 1
- key: project
operator: NotIn #the target node must not carry the listed label value
values:
- python #never schedule to a node labeled project=python, i.e. only nodes without that label are candidates
Check the scheduling:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-57969d894-z9llf 1/1 Running 0 2m47s 172.16.94.21 k8s-worker-05-33 <none> <none>
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
The pod was scheduled to k8s-worker-05-33, whose labels include diskplay=Nvdia but not project=python:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get nodes --show-labels -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME LABELS
10.0.0.12 Ready node 37h v1.26.4 10.0.0.12 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,disktype=SSD,kubernetes.io/arch=arm64,kubernetes.io/hostname=10.0.0.12,kubernetes.io/os=linux,kubernetes.io/role=node,project=python
k8s-master-01-11 Ready,SchedulingDisabled master 4d1h v1.26.4 10.0.0.11 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-master-01-11,kubernetes.io/os=linux,kubernetes.io/role=master
k8s-worker-01-23 Ready node 4d1h v1.26.4 10.0.0.23 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-01-23,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-02-21 Ready node 4d1h v1.26.4 10.0.0.21 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-02-21,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-03-22 Ready node 4d1h v1.26.4 10.0.0.22 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-03-22,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-04-32 Ready node 4d1h v1.26.4 10.0.0.32 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,diskplay=Nvdia,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-04-32,kubernetes.io/os=linux,kubernetes.io/role=node,projeck=go
k8s-worker-05-33 Ready node 4d1h v1.26.4 10.0.0.33 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,diskplay=Nvdia,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-05-33,kubernetes.io/os=linux,kubernetes.io/role=node
k8s-worker-07-13 Ready node 4d1h v1.26.4 10.0.0.13 <none> Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.6.20 beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,diskplay=Nvdia,kubernetes.io/arch=arm64,kubernetes.io/hostname=k8s-worker-07-13,kubernetes.io/os=linux,kubernetes.io/role=node
Anti-affinity works as expected.
3. Pod affinity and anti-affinity
Pod affinity and anti-affinity:
(1) Pod affinity and anti-affinity constrain which nodes a new pod may be scheduled onto based on the labels of pods already running there, not on node labels.
(2) The rule is: if node A already runs a pod that satisfies one or more of the new pod's scheduling rules, the new pod is scheduled onto node A; with anti-affinity, it is kept off node A instead.
(3) The rule is a label selector with an optional list of namespaces. Node affinity and anti-affinity are not namespace-scoped, but pods are, so a selector over pod labels must specify the namespaces it applies to.
(4) A node belongs to a topology domain (a single node, a rack, an availability zone, and so on); topologyKey defines whether the affinity or anti-affinity granularity is node level or zone level, so the scheduler can identify and pick the right topology domain.
Valid operators for pod affinity are In, NotIn, Exists and DoesNotExist.
For pod affinity, topologyKey must not be empty in either requiredDuringSchedulingIgnoredDuringExecution or preferredDuringSchedulingIgnoredDuringExecution.
The same holds for pod anti-affinity: topologyKey must not be empty in either field.
For pod anti-affinity required at scheduling time, the LimitPodHardAntiAffinityTopology admission controller restricts topologyKey to kubernetes.io/hostname; to allow custom topologies, modify or disable that admission controller.
Apart from that case, topologyKey can be any legal label key; a zone-level sketch follows below.
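For example, a sketch of pod anti-affinity at availability-zone granularity rather than node granularity, assuming nodes carry the standard topology.kubernetes.io/zone label (the examples in this section all use kubernetes.io/hostname instead):
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: myapp-tomcat-app2-selector       # keep replicas of this app apart
      topologyKey: topology.kubernetes.io/zone  # at most one matching pod per zone
      namespaces:
      - myapp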
Soft pod affinity in practice
First deploy a pod to match against.
The YAML file:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# cat case4-4.1-nginx.yaml
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: python-nginx-deployment-label
name: python-nginx-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: python-nginx-selector
template:
metadata:
labels:
app: python-nginx-selector
project: python
spec:
containers:
- name: python-nginx-container
image: nginx:1.20.2-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
- containerPort: 443
protocol: TCP
name: https
---
kind: Service
apiVersion: v1
metadata:
labels:
app: python-nginx-service-label
name: python-nginx-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
nodePort: 30014
- name: https
port: 443
protocol: TCP
targetPort: 443
nodePort: 30453
selector:
app: python-nginx-selector
project: python #one or more selectors; at least one must match a label on the target pod
It was scheduled to the worker-05 node:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
python-nginx-deployment-7585b46b76-nqx2f 1/1 Running 0 2m33s 172.16.94.22 k8s-worker-05-33 <none> <none>
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
Configure pod affinity against this running pod.
Pod-affinity YAML with a deliberately wrong value:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity: #pod affinity
#requiredDuringSchedulingIgnoredDuringExecution: #hard affinity: must match or the pod is refused scheduling
preferredDuringSchedulingIgnoredDuringExecution: #soft affinity: schedule into a matching topology when possible, otherwise Kubernetes schedules freely
- weight: 100
podAffinityTerm:
labelSelector: #label selector
matchExpressions: #match expressions
- key: project
operator: In
values:
- pythonX #a deliberately wrong value
topologyKey: kubernetes.io/hostname
namespaces:
- myapp
The pod is still created: with soft affinity, a failed match does not block scheduling.
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-74698b84dc-9zs7r 1/1 Running 0 59s 172.16.227.138 k8s-worker-07-13 <none> <none>
python-nginx-deployment-7585b46b76-nqx2f 1/1 Running 0 8m3s 172.16.94.22 k8s-worker-05-33 <none> <none>
Fix the value:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity: #pod affinity
#requiredDuringSchedulingIgnoredDuringExecution: #hard affinity: must match or the pod is refused scheduling
preferredDuringSchedulingIgnoredDuringExecution: #soft affinity: schedule into a matching topology when possible, otherwise Kubernetes schedules freely
- weight: 100
podAffinityTerm:
labelSelector: #label selector
matchExpressions: #match expressions
- key: project
operator: In
values:
- python #the correct value
topologyKey: kubernetes.io/hostname
namespaces:
- myapp
With the correct value in place, check the scheduling again:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-8674878fbf-zhwjt 1/1 Running 0 68s 172.16.94.23 k8s-worker-05-33 <none> <none>
python-nginx-deployment-7585b46b76-nqx2f 1/1 Running 0 14m 172.16.94.22 k8s-worker-05-33 <none> <none>
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
Scheduling succeeded.
Summary of pod soft affinity: when nothing matches, the pod is still created on some node; when a match exists, it is created on the node running the matching pod.
Pod hard affinity in practice
First the hard-affinity YAML (a deliberately failing example):
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy:
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity: #the difference is here
requiredDuringSchedulingIgnoredDuringExecution: #hard affinity
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- pythonX
topologyKey: "kubernetes.io/hostname"
namespaces:
- myapp
Check whether it was scheduled:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-6fc958d64c-88j2c 0/1 Pending 0 3s <none> <none> <none> <none>
python-nginx-deployment-7585b46b76-nqx2f 1/1 Running 0 17m 172.16.94.22 k8s-worker-05-33 <none> <none>
Check the describe output:
Name: myapp-tomcat-app2-deployment-6fc958d64c-88j2c
Namespace: myapp
Priority: 0
Service Account: default
Node: <none>
Labels: app=myapp-tomcat-app2-selector
pod-template-hash=6fc958d64c
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/myapp-tomcat-app2-deployment-6fc958d64c
Containers:
myapp-tomcat-app2-container:
Image: www.ghostxin.online/application/tomcat:v1
Port: 8080/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q7q8g (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-q7q8g:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 39s default-scheduler 0/8 nodes are available: 1 node(s) were unschedulable, 7 node(s) didn't match pod affinity rules. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling..
The pod is not scheduled and stays Pending, which is expected: required affinity found no matching pod.
Fix the YAML and try again:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity: # the difference is here
requiredDuringSchedulingIgnoredDuringExecution: # hard (required) affinity
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: "kubernetes.io/hostname"
namespaces:
- myapp
Check whether it gets scheduled and whether it lands on the same node as the nginx pod:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-5d84896867-rstxr 1/1 Running 0 43s 172.16.94.24 k8s-worker-05-33 <none> <none>
python-nginx-deployment-7585b46b76-nqx2f 1/1 Running 0 20m 172.16.94.22 k8s-worker-05-33 <none> <none>
It is scheduled onto the same node, consistent with the affinity rule.
Summary of pod hard affinity: when the label matches, the pod is co-scheduled with the matching pod; when nothing matches, the pod stays Pending and never runs. The sketch below contrasts the two field shapes.
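For reference, the only structural difference between the two variants is the field shape under podAffinity. A minimal side-by-side sketch with the label selectors elided:
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:   # soft: weighted terms, scheduling never fails
    - weight: 100
      podAffinityTerm:
        labelSelector: {}                              # selector elided
        topologyKey: kubernetes.io/hostname
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:    # hard: terms listed directly, no match means Pending
    - labelSelector: {}                                # selector elided
      topologyKey: kubernetes.io/hostname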
Hands-on: soft (preferred) pod anti-affinity
YAML file:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 5
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAntiAffinity: # the difference is here
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: kubernetes.io/hostname
namespaces:
- myapp
Check where the replicas land:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-79fb6b8745-flvgq 1/1 Running 0 2m43s 172.16.227.144 k8s-worker-07-13 <none> <none>
myapp-tomcat-app2-deployment-79fb6b8745-hckwq 1/1 Running 0 2m44s 172.16.5.13 k8s-worker-04-32 <none> <none>
myapp-tomcat-app2-deployment-79fb6b8745-kkp9z 1/1 Running 0 2m44s 172.16.39.94 10.0.0.12 <none> <none>
myapp-tomcat-app2-deployment-79fb6b8745-q5jxw 1/1 Running 0 2m44s 172.16.124.81 k8s-worker-01-23 <none> <none>
myapp-tomcat-app2-deployment-79fb6b8745-r8tnk 1/1 Running 0 2m42s 172.16.221.80 k8s-worker-03-22 <none> <none>
python-nginx-deployment-7585b46b76-mlv7h 1/1 Running 0 4m8s 172.16.94.30 k8s-worker-05-33 <none> <none>
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
None of the replicas landed on the node running the nginx pod (k8s-worker-05-33).
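A common use of preferred anti-affinity is to spread a deployment's own replicas across nodes by selecting the deployment's own pod label instead of another application's. A minimal sketch, assuming the pod label app: myapp-tomcat-app2-selector from the deployment above:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: myapp-tomcat-app2-selector   # the deployment's own pod label
        topologyKey: kubernetes.io/hostname   # spread replicas across different nodes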
Hands-on: hard (required) pod anti-affinity
YAML:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app2-deployment-label
name: myapp-tomcat-app2-deployment
namespace: myapp
spec:
replicas: 1
selector:
matchLabels:
app: myapp-tomcat-app2-selector
template:
metadata:
labels:
app: myapp-tomcat-app2-selector
spec:
containers:
- name: myapp-tomcat-app2-container
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAntiAffinity: # the difference is here
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: "kubernetes.io/hostname"
namespaces:
- myapp
Check the scheduling:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app2-deployment-7c45bb4579-rqks7 1/1 Running 0 140m 172.16.39.95 10.0.0.12 <none> <none>
python-nginx-deployment-7585b46b76-mlv7h 1/1 Running 0 151m 172.16.94.30 k8s-worker-05-33 <none> <none>
The pod is not scheduled onto the same node as the nginx pod, exactly as the required anti-affinity rule demands.
5. Taints and tolerations
Official documentation on taints and tolerations: https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/taint-and-toleration/
Taints:
A taint lets a node repel pods at scheduling time; it works in the opposite direction of affinity - a tainted node and a pod are in a repelling relationship.
The three taint effects (a quick way to list the taints on your nodes is shown right after this list):
(1) NoSchedule: Kubernetes will not schedule new pods onto the node.
kubectl taint node nodeID key1=key1:NoSchedule #set the taint
kubectl describe node nodeID #inspect the node's taints
kubectl taint node nodeID key1:NoSchedule- #remove the taint
(2) PreferNoSchedule: Kubernetes tries to avoid scheduling pods onto the node.
(3) NoExecute: Kubernetes will not schedule new pods onto the node and also evicts pods that are already running on it.
kubectl taint node nodeID key1=key1:NoExecute #set the taint and evict running pods
kubectl taint node nodeID key1:NoExecute- #remove the taint
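A quick way to inspect which taints are currently set, either across all nodes or on a single node (standard kubectl options, no assumptions beyond cluster access):
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl describe node k8s-worker-07-13 | grep -i taints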
Tolerations:
A toleration lets a pod tolerate a node's taints, so that a tainted node can still receive newly scheduled pods.
(1) tolerations: define which taints the pod accepts; once the taint is tolerated, the pod may be scheduled onto that node.
(2) operator: how the toleration is matched against the taint. With Exists, no value is needed - only the key and effect are matched.
With Equal, a value must be specified and it must equal the taint's value.
Toleration example 1:
First pick a node and apply a NoSchedule taint to it:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl taint node k8s-worker-07-13 key1=key1:NoSchedule
node/k8s-worker-07-13 tainted
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
Start with the Equal-operator YAML:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app1-deployment-label
name: myapp-tomcat-app1-deployment
namespace: myapp
spec:
replicas: 2
selector:
matchLabels:
app: myapp-tomcat-app1-selector
template:
metadata:
labels:
app: myapp-tomcat-app1-selector
spec:
containers:
- name: myapp-tomcat-app1-container
#image: harbor.myapp.net/myapp/tomcat-app1:v7
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
tolerations:
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoSchedule"
---
kind: Service
apiVersion: v1
metadata:
labels:
app: myapp-tomcat-app1-service-label
name: myapp-tomcat-app1-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: myapp-tomcat-app1-selector
Check the scheduling:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app1-deployment-5546587688-6djnq 1/1 Running 0 2m18s 172.16.94.32 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-5546587688-ps7hn 1/1 Running 0 2m18s 172.16.39.96 10.0.0.12 <none> <none>
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
Neither replica lands on the tainted node: the toleration uses operator Equal with value "value1", which does not match the taint's value (key1), so the pods do not actually tolerate the NoSchedule taint.
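For contrast, a toleration that would match the taint applied above (key1=key1:NoSchedule) has to carry the taint's exact value when operator is Equal - a minimal sketch:
tolerations:
- key: "key1"
  operator: "Equal"
  value: "key1"            # must equal the taint's value, not just its key
  effect: "NoSchedule"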
Toleration example 2:
Now a toleration that uses the Exists operator.
YAML:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app1-deployment-label
name: myapp-tomcat-app1-deployment
namespace: myapp
spec:
replicas: 10
selector:
matchLabels:
app: myapp-tomcat-app1-selector
template:
metadata:
labels:
app: myapp-tomcat-app1-selector
spec:
containers:
- name: myapp-tomcat-app1-container
#image: harbor.myapp.net/myapp/tomcat-app1:v7
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
tolerations:
- key: "key1"
operator: "Exists"
# value: "key1"
effect: "NoSchedule"
---
kind: Service
apiVersion: v1
metadata:
labels:
app: myapp-tomcat-app1-service-label
name: myapp-tomcat-app1-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: myapp-tomcat-app1-selector
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app1-deployment-54c76fcfcb-4w8tz 1/1 Running 0 53s 172.16.39.106 10.0.0.12 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-5bl68 1/1 Running 0 53s 172.16.227.154 k8s-worker-07-13 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-7bjdb 1/1 Running 0 51s 172.16.124.84 k8s-worker-01-23 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-d2pcp 1/1 Running 0 53s 172.16.5.17 k8s-worker-04-32 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-gs6wq 1/1 Running 0 49s 172.16.227.155 k8s-worker-07-13 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-hn8nd 1/1 Running 0 53s 172.16.94.42 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-k4c8n 1/1 Running 0 53s 172.16.221.87 k8s-worker-03-22 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-pdkc4 1/1 Running 0 48s 172.16.39.107 10.0.0.12 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-qzg55 1/1 Running 0 51s 172.16.94.43 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-sbxw6 1/1 Running 0 52s 172.16.76.92 k8s-worker-02-21 <none> <none>
Some replicas now run on worker7. The Exists toleration matches the key1/NoSchedule taint, so new pods are allowed onto that node; note also that NoSchedule only affects scheduling decisions and never evicts pods that are already running on the node.
Eviction example 1:
Taint the node with the NoExecute effect:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl taint node k8s-worker-07-13 key2=key2:NoExecute
node/k8s-worker-07-13 tainted
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector#
After this change, the pods that were running on that node are evicted:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app1-deployment-54c76fcfcb-4w8tz 1/1 Running 0 5m18s 172.16.39.106 10.0.0.12 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-5bl68 1/1 Terminating 0 5m18s 172.16.227.154 k8s-worker-07-13 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-7bjdb 1/1 Running 0 5m16s 172.16.124.84 k8s-worker-01-23 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-86jm6 1/1 Running 0 11s 172.16.5.18 k8s-worker-04-32 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-d2pcp 1/1 Running 0 5m18s 172.16.5.17 k8s-worker-04-32 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-gs6wq 1/1 Terminating 0 5m14s 172.16.227.155 k8s-worker-07-13 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-hn8nd 1/1 Running 0 5m18s 172.16.94.42 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-k4c8n 1/1 Running 0 5m18s 172.16.221.87 k8s-worker-03-22 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-ml9cd 1/1 Running 0 11s 172.16.76.93 k8s-worker-02-21 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-pdkc4 1/1 Running 0 5m13s 172.16.39.107 10.0.0.12 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-qzg55 1/1 Running 0 5m16s 172.16.94.43 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-54c76fcfcb-sbxw6 1/1 Running 0 5m17s 172.16.76.92 k8s-worker-02-21 <none> <none>
However, pods that carry the matching toleration are not evicted.
Adjusted YAML that also tolerates NoExecute:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: myapp-tomcat-app1-deployment-label
name: myapp-tomcat-app1-deployment
namespace: myapp
spec:
replicas: 10
selector:
matchLabels:
app: myapp-tomcat-app1-selector
template:
metadata:
labels:
app: myapp-tomcat-app1-selector
spec:
containers:
- name: myapp-tomcat-app1-container
#image: harbor.myapp.net/myapp/tomcat-app1:v7
image: www.ghostxin.online/application/tomcat:v1
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
tolerations:
- key: "key1"
operator: "Exists"
# value: "key1"
effect: "NoSchedule"
- key: "key2"
operator: "Exists"
# value: "key1"
effect: "NoExecute"
---
kind: Service
apiVersion: v1
metadata:
labels:
app: myapp-tomcat-app1-service-label
name: myapp-tomcat-app1-service
namespace: myapp
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: myapp-tomcat-app1-selector
Check that pods can again be scheduled onto worker7:
root@k8s-master-01-11:/data/k8s_yaml/app/node-selector# kubectl get pod -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myapp-tomcat-app1-deployment-55c56584f7-2vr8l 1/1 Running 0 97s 172.16.39.109 10.0.0.12 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-5k9nb 1/1 Running 0 97s 172.16.76.94 k8s-worker-02-21 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-5tffs 1/1 Running 0 97s 172.16.94.46 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-8vzvt 1/1 Running 0 97s 172.16.39.108 10.0.0.12 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-g67z5 1/1 Running 0 97s 172.16.124.85 k8s-worker-01-23 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-lfq4l 1/1 Running 0 97s 172.16.221.88 k8s-worker-03-22 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-ns7jh 1/1 Running 0 97s 172.16.5.19 k8s-worker-04-32 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-qrjg2 1/1 Running 0 97s 172.16.227.156 k8s-worker-07-13 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-rfbrh 1/1 Running 0 97s 172.16.94.45 k8s-worker-05-33 <none> <none>
myapp-tomcat-app1-deployment-55c56584f7-vqrg6 1/1 Running 0 97s 172.16.227.157 k8s-worker-07-13 <none> <none>
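A NoExecute toleration can also be time-limited with tolerationSeconds: the pod stays on the tainted node for the given number of seconds and is evicted afterwards. A minimal sketch (the 3600-second value is just an example):
tolerations:
- key: "key2"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 3600   # tolerate the NoExecute taint for one hour, then get evicted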
Extension:
1. Summary of the kubelet evictionHard, system-reserved and kube-reserved concepts (see section 4 below).
3.RBAC
官方文档:https://kubernetes.io/zh-cn/docs/reference/access-authn-authz/rbac/
1. Kubernetes API authentication and authorization flow
1. Authentication
The kube-apiserver verifies whether the client is a legitimate account; if not, it returns a 401 error immediately.
Authentication methods include token authentication, HTTPS client certificates, and passwords (used by older versions).
2. Authorization
After authentication, the API server checks whether the account has permission to execute the current request.
Requests include verbs such as get, list, delete.
This step also evaluates the configured policies and verbs.
3. Admission
After authorization passes, the API server runs the admission controllers; this step decides whether the operation on the object is admitted,
for example adjusting a deployment's replicas, limits, or requests.
2. Kubernetes API authorization modes
Official documentation on authorization: https://kubernetes.io/zh-cn/docs/reference/access-authn-authz/authorization/
1. Node authorization
Authorizes API requests issued by kubelets: a node's kubelet is allowed to read services, endpoints, secrets, configmaps and related events,
and to update the relationship between the node and its pods.
2. Webhook
An HTTP callback: when certain events happen, the API server calls the configured HTTP endpoint.
3. ABAC (attribute-based access control)
Access is granted by binding attributes directly to accounts; there is no role concept, which makes it hard to manage.
4. RBAC (role-based access control)
Permissions are first associated with roles, and roles are then bound to users, who thereby inherit the role's permissions.
A role can be bound to multiple users.
3. RBAC
RBAC overview
The RBAC API declares four Kubernetes objects: Role, ClusterRole, RoleBinding and ClusterRoleBinding,
that is: role, cluster role, role binding and cluster role binding.
Role: defines a set of rules for accessing Kubernetes resources within a namespace.
RoleBinding: defines the binding between a Role and users.
ClusterRole: defines a set of rules for accessing cluster-scoped Kubernetes resources (across all namespaces).
ClusterRoleBinding: defines the binding between a ClusterRole and users.
Hands-on
First create a service account:
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC# kubectl create serviceaccount app-account -n myapp
serviceaccount/app-account created
Then create a Role using a YAML file; the permissions can be tailored freely for different resources,
for example pods, deployments, and pods/exec (permission to exec into containers):
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
namespace: myapp
name: myapp-role
rules:
- apiGroups: ["*"]
resources: ["pods"]
#verbs: ["*"]
##RO-Role
verbs: ["get", "watch", "list"]
- apiGroups: ["*"]
resources: ["pods/exec"]
#verbs: ["*"]
##RO-Role
verbs: ["get", "watch", "list","put","create"]
- apiGroups: ["extensions", "apps/v1"]
resources: ["deployments"]
#verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
##RO-Role
verbs: ["get", "watch", "list"]
- apiGroups: ["*"]
resources: ["*"]
#verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
##RO-Role
verbs: ["get", "watch", "list"]
RoleBinding:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: role-bind-myapp
namespace: myapp
subjects:
- kind: ServiceAccount
name: app-account # the service account you created
namespace: myapp
roleRef:
kind: Role
name: myapp-role
apiGroup: rbac.authorization.k8s.io
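Before wiring the token into a dashboard, you can verify the effective permissions with kubectl auth can-i while impersonating the service account (run as a cluster admin; the expected answers follow from the Role above):
kubectl auth can-i list pods -n myapp --as=system:serviceaccount:myapp:app-account      # expected: yes
kubectl auth can-i delete pods -n myapp --as=system:serviceaccount:myapp:app-account    # expected: no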
To log in (for example to the dashboard), issue a token for the service account by creating a service-account-token Secret:
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
name: myapp-user-token
namespace: myapp
annotations:
kubernetes.io/service-account.name: "app-account"
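On Kubernetes v1.24 and later, a short-lived token can also be requested directly, without creating a Secret - a minimal alternative, assuming a v1.24+ cluster and kubectl:
kubectl -n myapp create token app-account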
Check whether the secret was created:
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC# kubectl get secrets -A
NAMESPACE NAME TYPE DATA AGE
kube-system calico-etcd-secrets Opaque 3 3h36m
kubernetes-dashboard dashboard-admin-user kubernetes.io/service-account-token 3 23m
kubernetes-dashboard kubernetes-dashboard-certs Opaque 0 23m
kubernetes-dashboard kubernetes-dashboard-csrf Opaque 1 23m
kubernetes-dashboard kubernetes-dashboard-key-holder Opaque 2 23m
myapp myapp-user-token kubernetes.io/service-account-token 3 25s
velero-system cloud-credentials Opaque 1 25h
velero-system velero-repo-credentials Opaque 1 25h
Extract the token to log in:
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC# kubectl describe secrets -n myapp myapp-user-token
Name: myapp-user-token
Namespace: myapp
Labels: <none>
Annotations: kubernetes.io/service-account.name: app-account
kubernetes.io/service-account.uid: 44126052-8556-418e-8304-e3ec405eb29b
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1310 bytes
namespace: 5 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6ImV0TkJCRFREcGt4dEQ3ZFRVTWJqOFRHdDJKZkpDTm5VcmxmdGR2bkUtUFUifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJteWFwcCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJteWFwcC11c2VyLXRva2VuIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFwcC1hY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNDQxMjYwNTItODU1Ni00MThlLTgzMDQtZTNlYzQwNWViMjliIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Om15YXBwOmFwcC1hY2NvdW50In0.c_qHqjBFIrJ8eeFTmG94AfTb-TjvhkCf-vodlMBMg4rMLg65LN89ibGr3Dl4cPF_J6lN8-qP6gdNnEz6hu47mSDnGJ0b9SbuBuJ3Uv-6bOHQtr-Mbrf6WTw2F9mLU0AaIXMMVeV1Vf-NJKOQ7MS4OXymPXL1YNDodngp89DXFia_PusuPxLIi7SVBDEMhWRvuuIgk8rl0gkNCVujmVkc6YB-iv7w7dv15iUKGMhGwOcUhPfNEASdCJbEJFe3g9is7S7-O6TQj_EHyI4uTB5v2zhOHXURLxQYZp_yGZ-1F0dekFc6mTKqQrMlV_ii-Z0ahJc3_hmmGcGeC_YMn0s4DA
root@k8s-master-01-11:/data/k8s_yaml/app/RBAC#
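Instead of copying the token out of the describe output, it can be extracted directly - a one-liner using only standard kubectl and base64:
kubectl -n myapp get secret myapp-user-token -o jsonpath='{.data.token}' | base64 -d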
Log in with this token and check what is visible: the resources permitted by the Role show up as expected - no problem.
4. Extensions
1. kubelet evictionHard
Official documentation: https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/reserve-compute-resources/
Resource reservation sets compute resources aside for system daemons, while evictionHard defines the thresholds at which the kubelet starts evicting pods.
Normally a node runs many daemons that drive the OS and Kubernetes itself; unless resources are reserved for these daemons in advance, they will compete with pods for resources.
kube-reserved is the reservation for Kubernetes components.
system-reserved is the reservation for system (OS) daemons.
eviction-threshold is the hard-eviction threshold.
Allocatable is the amount of node resources actually available to pods:
Allocatable = Node Capacity - kube-reserved - system-reserved - eviction-threshold
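The effect of the reservations can be read off any node by comparing Capacity and Allocatable (standard kubectl output; the node name is one from this cluster):
kubectl describe node k8s-worker-07-13 | grep -A 6 -E 'Capacity|Allocatable'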
Pod eviction priority:
(1) Pods whose limits equal their requests (Guaranteed QoS) have the highest priority and are evicted last.
(2) Pods whose limits and requests are set but not equal (Burstable QoS) have medium priority and are evicted next.
(3) Pods with no requests or limits (BestEffort QoS) have the lowest priority and are evicted first (a resources sketch for the Guaranteed case follows this list).
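A minimal resources sketch for the Guaranteed case (the values are placeholders): setting limits identical to requests gives the pod the Guaranteed QoS class, so it is evicted last:
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 500m          # limits identical to requests -> QoS class Guaranteed
    memory: 512Mi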
1. Eviction conditions
(1) eviction-signal: the kubelet watches eviction signals and uses a resource's "available" value to decide whether to evict.
(2) operator: the comparison operator used to check whether the resource amount crosses the threshold and triggers eviction.
(3) quantity: the threshold amount; eviction is decided against this specified resource quantity.
For example: nodefs.available < 10% - when the node filesystem has less than 10% available, the eviction signal fires.
2. Eviction
1. Soft eviction
Soft eviction does not evict immediately: you define a grace period, and only if the condition still holds after the grace period does the kubelet kill and evict the pod.
Soft-eviction settings (an example of the corresponding kubelet flags follows this list):
eviction-soft: the soft-eviction trigger conditions, e.g. imagefs.available<15%, memory.available<300Mi, nodefs.available<10%, nodefs.inodesFree<5%.
eviction-soft-grace-period: the eviction grace period, e.g. memory.available=1m30s; the condition must hold for this long before the pod is evicted.
eviction-max-pod-grace-period: the maximum pod-termination grace period used when a soft-eviction condition is met.
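A sketch of what these settings look like as kubelet flags, reusing the example thresholds above (the flag names are the standard kubelet ones; the concrete values are illustrative):
--eviction-soft=memory.available<300Mi,nodefs.available<10%
--eviction-soft-grace-period=memory.available=1m30s,nodefs.available=2m
--eviction-max-pod-grace-period=60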
2. Hard eviction
Hard eviction has no grace period: once a hard threshold is crossed, pods are evicted immediately.
3. Hands-on 1
The concrete resource reservations are configured as follows:
root@k8s-master-01-11:/opt# vim /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/opt/kube/bin/kubelet \
--config=/var/lib/kubelet/config.yaml \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--hostname-override=k8s-master-01-11 \
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--root-dir=/var/lib/kubelet \
--kube-reserved=cpu=200m,memory=0.2Gi,ephemeral-storage=1Gi \
--system-reserved=cpu=100m,memory=0.1Gi,ephemeral-storage=1Gi \
--v=2
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
Reload the unit file and restart kubelet:
root@k8s-master-01-11:/opt# systemctl daemon-reload
root@k8s-master-01-11:/opt# systemctl restart kubelet.service
Check kubelet's status:
root@k8s-master-01-11:/opt# systemctl status kubelet.service
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2023-06-08 21:15:29 CST; 12s ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 144725 (kubelet)
Tasks: 8 (limit: 2191)
Memory: 26.4M
CPU: 169ms
CGroup: /system.slice/kubelet.service
└─144725 /opt/kube/bin/kubelet --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///run/containerd/containerd.sock --hostname-override=k8s-master-01-11 --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --root-dir=/var/lib/kubelet --kube-reserved=cpu=200m,memory=0.2Gi,ephemeral-storage=1Gi
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574172 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "var-lib-calico" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-var-lib-calico") pod "calico-node-cwgcl" (UID: "70f9c058-bb9
6-44f6-b7b3-11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574251 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-xtables-lock") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44
f6-b7b3-11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574299 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "sys-fs" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-sys-fs") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6-b7b3-11b0
e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574357 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "bpffs" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-bpffs") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6-b7b3-11b0e8
ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574401 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "nodeproc" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-nodeproc") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6-b7b3-
11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574434 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "cni-bin-dir" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-cni-bin-dir") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6
-b7b3-11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574466 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "cni-log-dir" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-cni-log-dir") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6
-b7b3-11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574585 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "etcd-certs" (UniqueName: "kubernetes.io/secret/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-etcd-certs") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6-b7b3
-11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574746 144725 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for
volume "policysync" (UniqueName: "kubernetes.io/host-path/70f9c058-bb96-44f6-b7b3-11b0e8ac88a6-policysync") pod "calico-node-cwgcl" (UID: "70f9c058-bb96-44f6-b
7b3-11b0e8ac88a6") " pod="kube-system/calico-node-cwgcl"
Jun 08 21:15:30 k8s-master-01-11 kubelet[144725]: I0608 21:15:30.574789 144725 reconciler.go:41] "Reconciler: start to sync state"
root@k8s-master-01-11:/opt#
Everything looks good.
Resource reservation parameters:
--kube-reserved    # resource reservation for Kubernetes components
--system-reserved  # resource reservation for system (OS) services
--eviction-hard    # hard-eviction thresholds for pods (not set here; an example of the flag form is shown below)
Remember to back up the configuration file before making changes.
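A sketch of the --eviction-hard flag form, mirroring the thresholds that appear in the KubeletConfiguration below (the concrete values are illustrative):
--eviction-hard=memory.available<300Mi,nodefs.available<10%,nodefs.inodesFree<5%,imagefs.available<15%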
4. Hands-on 2:
With a binary installation this is already configured automatically in the kubelet configuration file (/var/lib/kubelet/config.yaml):
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
anonymous:
enabled: false
webhook:
cacheTTL: 2m0s
enabled: true
x509:
clientCAFile: /etc/kubernetes/ssl/ca.pem
authorization:
mode: Webhook
webhook:
cacheAuthorizedTTL: 5m0s
cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
cgroupsPerQOS: true
clusterDNS:
- 192.168.0.2
clusterDomain: cluster.local
configMapAndSecretChangeDetectionStrategy: Watch
containerLogMaxFiles: 3
containerLogMaxSize: 10Mi
enforceNodeAllocatable:
- pods
eventBurst: 10
eventRecordQPS: 5
evictionHard:
imagefs.available: 15% # evict when image-filesystem free space drops below 15%
memory.available: 300Mi # evict when available memory drops below 300Mi
nodefs.available: 10% # evict when node-filesystem free space drops below 10%
nodefs.inodesFree: 5% # evict when node-filesystem free inodes drop below 5%
evictionPressureTransitionPeriod: 5m0s # how long the kubelet waits before transitioning out of an eviction-pressure condition
failSwapOn: true
fileCheckFrequency: 40s
hairpinMode: hairpin-veth
healthzBindAddress: 0.0.0.0
healthzPort: 10248
httpCheckFrequency: 40s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMinimumGCAge: 2m0s
kubeAPIBurst: 100
kubeAPIQPS: 50
makeIPTablesUtilChains: true
maxOpenFiles: 1000000
maxPods: 400
nodeLeaseDurationSeconds: 40
nodeStatusReportFrequency: 1m0s
nodeStatusUpdateFrequency: 10s
oomScoreAdj: -999
podPidsLimit: -1
port: 10250
# disable readOnlyPort
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
runtimeRequestTimeout: 2m0s
serializeImagePulls: true
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
tlsCertFile: /etc/kubernetes/ssl/kubelet.pem
tlsPrivateKeyFile: /etc/kubernetes/ssl/kubelet-key.pem
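Note that edits to /var/lib/kubelet/config.yaml only take effect after the kubelet is restarted; a minimal sequence on the node (daemon-reload is only needed when the systemd unit itself changed):
systemctl daemon-reload
systemctl restart kubelet.service
systemctl status kubelet.service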