Repository: Lancger/opsfull
Branch: master
Commit: 5b36608dbe13
Files: 104
Total size: 533.3 KB
Directory structure:
gitextract_y9znwe6w/
├── LICENSE
├── README.md
├── apps/
│ ├── README.md
│ ├── nginx/
│ │ └── README.md
│ ├── ops/
│ │ └── README.md
│ └── wordpress/
│ ├── README.md
│ ├── 基于PV_PVC部署Wordpress 示例.md
│ └── 部署Wordpress 示例.md
├── components/
│ ├── README.md
│ ├── cronjob/
│ │ └── README.md
│ ├── dashboard/
│ │ ├── Kubernetes-Dashboard v2.0.0.md
│ │ └── README.md
│ ├── external-storage/
│ │ ├── 0、nfs服务端搭建.md
│ │ ├── 1、k8s的pv和pvc简述.md
│ │ ├── 2、静态配置PV和PVC.md
│ │ ├── 3、动态申请PV卷.md
│ │ ├── 4、Kubernetes之MySQL持久存储和故障转移.md
│ │ ├── 5、Kubernetes之Nginx动静态PV持久存储.md
│ │ └── README.md
│ ├── heapster/
│ │ └── README.md
│ ├── ingress/
│ │ ├── 0.通俗理解Kubernetes中Service、Ingress与Ingress Controller的作用与关系.md
│ │ ├── 1.kubernetes部署Ingress-nginx单点和高可用.md
│ │ ├── 1.外部服务发现之Ingress介绍.md
│ │ ├── 2.ingress tls配置.md
│ │ ├── 3.ingress-http使用示例.md
│ │ ├── 4.ingress-https使用示例.md
│ │ ├── 5.hello-tls.md
│ │ ├── 6.ingress-https使用示例.md
│ │ ├── README.md
│ │ ├── nginx-ingress/
│ │ │ └── README.md
│ │ ├── traefik-ingress/
│ │ │ ├── 1.traefik反向代理Deamonset模式.md
│ │ │ ├── 2.traefik反向代理Deamonset模式TLS.md
│ │ │ └── README.md
│ │ └── 常用操作.md
│ ├── initContainers/
│ │ └── README.md
│ ├── job/
│ │ └── README.md
│ ├── k8s-monitor/
│ │ └── README.md
│ ├── kube-proxy/
│ │ └── README.md
│ ├── nfs/
│ │ └── README.md
│ └── pressure/
│ ├── README.md
│ ├── calico bgp网络需要物理路由和交换机支持吗.md
│ └── k8s集群更换网段方案.md
├── docs/
│ ├── Envoy的架构与基本术语.md
│ ├── Kubernetes学习笔记.md
│ ├── Kubernetes架构介绍.md
│ ├── Kubernetes集群环境准备.md
│ ├── app.md
│ ├── app2.md
│ ├── ca.md
│ ├── coredns.md
│ ├── dashboard.md
│ ├── dashboard_op.md
│ ├── delete.md
│ ├── docker-install.md
│ ├── etcd-install.md
│ ├── flannel.md
│ ├── k8s-error-resolution.md
│ ├── k8s_pv_local.md
│ ├── k8s重启pod.md
│ ├── master.md
│ ├── node.md
│ ├── operational.md
│ ├── 外部访问K8s中Pod的几种方式.md
│ └── 虚拟机环境准备.md
├── example/
│ ├── coredns/
│ │ └── coredns.yaml
│ └── nginx/
│ ├── nginx-daemonset.yaml
│ ├── nginx-deployment.yaml
│ ├── nginx-ingress.yaml
│ ├── nginx-pod.yaml
│ ├── nginx-rc.yaml
│ ├── nginx-rs.yaml
│ ├── nginx-service-nodeport.yaml
│ └── nginx-service.yaml
├── helm/
│ └── README.md
├── kubeadm/
│ ├── K8S-HA-V1.13.4-关闭防火墙版.md
│ ├── K8S-HA-V1.16.x-云环境-Calico.md
│ ├── K8S-V1.16.2-开启防火墙-Flannel.md
│ ├── Kubernetes 集群变更IP地址.md
│ ├── README.md
│ ├── k8S-HA-V1.15.3-Calico-开启防火墙版.md
│ ├── k8S-HA-V1.15.3-Flannel-开启防火墙版.md
│ ├── k8s清理.md
│ ├── kubeadm.yaml
│ ├── kubeadm初始化k8s集群延长证书过期时间.md
│ └── kubeadm无法下载镜像问题.md
├── manual/
│ ├── README.md
│ ├── v1.14/
│ │ └── README.md
│ └── v1.15.3/
│ └── README.md
├── mysql/
│ ├── README.md
│ └── kubernetes访问外部mysql服务.md
├── redis/
│ ├── K8s上Redis集群动态扩容.md
│ ├── K8s上运行Redis单实例.md
│ ├── K8s上运行Redis集群指南.md
│ └── README.md
├── rke/
│ ├── README.md
│ └── cluster.yml
└── tools/
├── Linux Kernel 升级.md
├── README.md
├── k8s域名解析coredns问题排查过程.md
├── kubernetes-node打标签.md
├── kubernetes-常用操作.md
├── kubernetes-批量删除Pods.md
├── kubernetes访问外部mysql服务.md
└── ssh_copy.sh
================================================
FILE CONTENTS
================================================
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: README.md
================================================
# 1. Kubernetes Guide
- [Kubernetes Architecture Overview](docs/Kubernetes架构介绍.md)
- [Kubernetes Cluster Environment Preparation](docs/Kubernetes集群环境准备.md)
- [Docker Installation](docs/docker-install.md)
- [CA Certificate Creation](docs/ca.md)
- [Etcd Cluster Deployment](docs/etcd-install.md)
- [Master Node Deployment](docs/master.md)
- [Node Deployment](docs/node.md)
- [Flannel Deployment](docs/flannel.md)
- [Application Creation](docs/app.md)
- [Troubleshooting](docs/k8s-error-resolution.md)
- [Operations Handbook](docs/operational.md)
- [Envoy Architecture and Basic Terminology](docs/Envoy的架构与基本术语.md)
- [Kubernetes Study Notes](docs/Kubernetes学习笔记.md)
- [Restarting Pods](docs/k8s%E9%87%8D%E5%90%AFpod.md)
- [Cluster Cleanup](docs/delete.md)
- [Ways to Access Pods from Outside the Cluster](docs/外部访问K8s中Pod的几种方式.md)
- [Application Testing](docs/app2.md)
- [PVC](docs/k8s_pv_local.md)
- [Dashboard Operations](docs/dashboard_op.md)
# User Guide
# 2. Cleaning Up k8s Resources
```
# 1. Clean up Services
$ kubectl delete svc $(kubectl get svc -n mos-namespace|grep -v NAME|awk '{print $1}') -n mos-namespace
service "mysql-production" deleted
service "nginx-test" deleted
service "redis-cluster" deleted
service "redis-production" deleted
# 2. Clean up Deployments
$ kubectl delete deployment $(kubectl get deployment -n mos-namespace|grep -v NAME|awk '{print $1}') -n mos-namespace
deployment.extensions "centos7-app" deleted
# 3. Clean up ConfigMaps
$ kubectl delete cm $(kubectl get cm -n mos-namespace|grep -v NAME|awk '{print $1}') -n mos-namespace
```
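If the goal is simply to empty the namespace, a shorter pass is possible (a sketch: `kubectl delete all` covers Services, Deployments, Pods, and ReplicaSets, but ConfigMaps are not part of the "all" category and still need their own command):
```bash
# Delete everything in the built-in "all" category (svc, deploy, pod, rs, ...)
kubectl delete all --all -n mos-namespace
# ConfigMaps are not included in "all", so delete them separately
kubectl delete cm --all -n mos-namespace
```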
https://www.xiaodianer.net/index.php/kubernetes/istio/41-istio-https-demo
https://mp.weixin.qq.com/s/jnVn6_cyRUILBQ0cBhBNyQ Kubernetes v1.18.2 binary high-availability deployment
================================================
FILE: apps/README.md
================================================
================================================
FILE: apps/nginx/README.md
================================================
================================================
FILE: apps/ops/README.md
================================================
================================================
FILE: apps/wordpress/README.md
================================================
================================================
FILE: apps/wordpress/基于PV_PVC部署Wordpress 示例.md
================================================
# 1. PV (PersistentVolume)
A PersistentVolume (PV) is a piece of storage in an external storage system, created and maintained by an administrator. Like a Volume, a PV is persistent, and its lifecycle is independent of any Pod.
1. PV and PVC bind one-to-one: once a PV is claimed by a PVC it shows as Bound, and no other PVC can use that PV.
2. Once a PVC is bound to a PV it acts as a storage volume, and the PVC can then be used by multiple Pods (whether a PVC supports access by multiple Pods depends on the accessModes definition).
3. If no suitable PV is found, the PVC stays in the Pending state.
4. The PV's reclaim policy options:
Retain (the default): keep the generated data.
Recycle: delete the generated data and reclaim the PV.
Delete: once the PVC is unbound, the PV is deleted automatically.
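A minimal PV sketch tying these options together (the NFS server address and export path below are placeholders):
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # keep the data after the PVC is released
  nfs:
    path: /nfs/data/pv-demo              # placeholder export path
    server: 192.168.56.11                # placeholder NFS server
```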
# 2. PVC
A PersistentVolumeClaim (PVC) is a claim on a PV, usually created and maintained by an ordinary user. When storage needs to be allocated to a Pod, the user creates a PVC stating the required capacity, access mode (e.g. read-only), and so on, and Kubernetes finds and provides a PV that satisfies the claim.
With PersistentVolumeClaims, users only have to tell Kubernetes what kind of storage they need, without caring where the space actually comes from or how it is accessed. Those low-level Storage Provider details are left to the administrator; only the administrator should care about the details of creating PersistentVolumes.
## A PVC must specify:
1. accessModes: the access mode. Options:
ReadWriteOnce (RWO) – the volume can be mounted as read-write by a single node
ReadOnlyMany (ROX) – the volume can be mounted read-only by many nodes
ReadWriteMany (RWX) – the volume can be mounted as read-write by many nodes
2. resources: the capacity request (e.g. requesting 5GB means the matched storage must offer at least 5GB).
3. selector: a label selector; without labels, the best match is searched across all PVs.
4. storageClassName: the storage class name.
5. volumeMode: the mode of the backing volume; usable to restrict which kinds of PV the claim may use.
6. volumeName: the volume name, pinning the claim to a specific PV (equivalent to binding).
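A minimal PVC sketch showing these fields together (names and sizes are illustrative):
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes:
    - ReadWriteOnce       # RWO: read-write on a single node
  resources:
    requests:
      storage: 5Gi        # the matched PV must offer at least 5Gi
  storageClassName: nfs   # only PVs of class "nfs" are considered
  selector:
    matchLabels:
      pv: pv-demo         # optional: pin the claim to PVs carrying this label
```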
# 3. Differences Between the Two
1. A PV is cluster-scoped and cannot be defined inside a namespace.
2. A PVC is namespace-scoped.
References:
https://blog.csdn.net/weixin_42973226/article/details/86501693 Deploying WordPress on rook-ceph
https://www.cnblogs.com/benjamin77/p/9944268.html k8s persistent storage with PV & PVC
================================================
FILE: apps/wordpress/部署Wordpress 示例.md
================================================
# 1. Overview
The WordPress application mainly involves two images: wordpress and mysql. wordpress is the application's core program, and mysql is used for data storage. Now let's see how to deploy this wordpress application. The service consists of two Pod resources, and we prefer to manage the Pods with Deployments.
# 2. Create a MySQL Deployment Object
- 1. Create the namespace, and use a Service to expose the service inside the cluster
```bash
# Clean up any existing wordpress-db resources
kubectl delete -f wordpress-db.yaml
# Write the mysql deployment file
cat > wordpress-db.yaml <<\EOF
---
apiVersion: v1
kind: Namespace
metadata:
name: blog
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: mysql-deploy
namespace: blog
labels:
app: mysql
spec:
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: mysql:5.7
imagePullPolicy: IfNotPresent
ports:
- containerPort: 3306
name: dbport
env:
- name: MYSQL_ROOT_PASSWORD
value: rootPassW0rd
- name: MYSQL_DATABASE
value: wordpress
- name: MYSQL_USER
value: wordpress
- name: MYSQL_PASSWORD
value: wordpress
volumeMounts:
- name: db
mountPath: /var/lib/mysql
volumes:
- name: db
hostPath:
path: /var/lib/mysql
---
apiVersion: v1
kind: Service
metadata:
name: wordpress-mysql
namespace: blog
spec:
selector:
app: mysql
ports:
- name: mysqlport
protocol: TCP
port: 3306
targetPort: dbport
EOF
# Create the resources and the service
kubectl create -f wordpress-db.yaml
```
- 2. Inspect the created Service
```bash
$ kubectl describe svc wordpress-mysql -n blog
Name: wordpress-mysql
Namespace: blog
Labels: <none>
Annotations: <none>
Selector: app=mysql
Type: ClusterIP
IP: 10.104.88.234
Port: mysqlport 3306/TCP
TargetPort: dbport/TCP
Endpoints: 10.244.1.115:3306
Session Affinity: None
Events: <none>
```
- 3. Verify that the created MySQL service is usable
```bash
# Run a throwaway test container from the command line
$ kubectl run mysql-test --rm -it --image=alpine /bin/sh
kubectl run centos7-app --rm -it --image=centos:7.2.1511 -n blog
# Exec into the container
kubectl exec `kubectl get pods -n blog|grep centos7-app|awk '{print $1}'` -it /bin/bash -n blog
# Install a MySQL client
yum install vim net-tools telnet nc -y
yum install -y mariadb.x86_64 mariadb-libs.x86_64
# Check that the MySQL service port is reachable
nc -zv wordpress-mysql 3306
# Connection tests
mysql -h'wordpress-mysql' -u'root' -p'rootPassW0rd' # test via the service DNS name
mysql -h'10.104.88.234' -u'root' -p'rootPassW0rd' # test via the cluster IP (this changes often)
mysql -h'10.244.1.115' -u'root' -p'rootPassW0rd' # test via the endpoint IP (this changes often)
```
# 3. Create the WordPress Deployment Object
```bash
# Clean up any existing wordpress resources
kubectl delete -f wordpress.yaml
# Write the wordpress deployment file
cat > wordpress.yaml <<\EOF
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: wordpress-deploy
namespace: blog
labels:
app: wordpress
spec:
template:
metadata:
labels:
app: wordpress
spec:
containers:
- name: wordpress
image: wordpress
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
name: wdport
env:
- name: WORDPRESS_DB_HOST
value: wordpress-mysql:3306
- name: WORDPRESS_DB_USER
value: wordpress
- name: WORDPRESS_DB_PASSWORD
value: wordpress
---
apiVersion: v1
kind: Service
metadata:
name: wordpress-service
namespace: blog
spec:
type: NodePort
selector:
app: wordpress
ports:
- name: wordpressport
protocol: TCP
port: 80
targetPort: wdport
    nodePort: 32380 # added: pin a fixed node port
EOF
# Create the resources and the service
kubectl create -f wordpress.yaml
# Check the created Pods
kubectl get pods -n blog
# Check the created Services
kubectl get svc -n blog
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
wordpress-mysql     ClusterIP   10.104.88.234    <none>   3306/TCP       3m36s
wordpress-service   NodePort    10.111.212.108   <none>   80:32380/TCP   12s
```
# 4. Access Test
```bash
# The wordpress service exposes NodePort 32380, so we can now reach the application at any node's IP on port 32380. Open it in a browser; if WordPress redirects to the installation page, the installation is fine. If you do not see the expected result, check the Pod logs to find the problem:
http://192.168.56.11:32380/
```

# 5. Improving Stability (Advanced)
`1. When using Kubernetes, have you ever hit the vicious cycle where a Pod dies shortly after starting and then restarts, over and over? Have you wondered how Kubernetes detects whether a Pod is still alive? The container may have started, but how does Kubernetes know whether the process inside is ready to serve traffic? The official article [Configure Liveness and Readiness Probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) explains it.`
`2. The kubelet uses a liveness probe to decide when to restart a container. For example, when an application is running but cannot make further progress, the liveness probe catches the deadlock and restarts the container, so the application can keep running despite the bug (whose code has no bugs?).`
`3. The kubelet uses a readiness probe to decide whether a container is ready to accept traffic. A Pod counts as ready only when all of its containers are ready; this signal controls which Pods act as backends for a Service. Pods that are not ready are removed from the Service's load balancer.`
Now that the WordPress application is deployed, are we done? What if the site's traffic suddenly spikes? What if we need to roll out a new image? What if the mysql service goes down?
To keep the site serving reliably, we can do a few more things. What can we do to improve its stability?
## First: add health checks
As said above, liveness and readiness probes are a very important way to improve application stability:
```bash
livenessProbe:
tcpSocket:
port: 80
initialDelaySeconds: 3
periodSeconds: 3
readinessProbe:
tcpSocket:
port: 80
initialDelaySeconds: 5
periodSeconds: 10
# Add the two probes above: readiness is checked every 10s, liveness every 3s
```
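For an HTTP application like WordPress, httpGet probes are a common alternative to tcpSocket ones (a sketch; the probed path is an assumption, any URL the container answers with 2xx/3xx works):
```yaml
livenessProbe:
  httpGet:
    path: /              # assumed endpoint; must return 2xx/3xx when healthy
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
```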
## Second: add an HPA
Let the application handle traffic peaks automatically:
```bash
# 1. Create the HPA resource (the Pod must declare resource requests, otherwise the HPA will not work)
$ kubectl autoscale deployment wordpress-deploy --cpu-percent=10 --min=1 --max=10 -n blog
deployment "wordpress-deploy" autoscaled
# kubectl autoscale creates an HPA object for wordpress-deploy with a minimum of 1 and a maximum of 10 Pod replicas; the HPA adds or removes Pods based on the configured CPU utilization target (10%). It is of course also best to declare some resource limits for the Pod:
resources:
limits:
cpu: 200m
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
# Inspect the HPA
$ kubectl get HorizontalPodAutoscaler -A
NAMESPACE NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
blog wordpress-deploy Deployment/wordpress-deploy <unknown>/10% 1 10 1 4m19s
# 2. After updating the Deployment, test whether the HPA takes effect:
$ kubectl run -i --tty load-generator --image=busybox -n blog /bin/sh
If you don't see a command prompt, try pressing enter.
while true; do wget -q -O- http://wordpress-service:80; done
# 3. Watch whether the Deployment's replica count changes
$ kubectl get deployment wordpress-deploy -n blog
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
wordpress-deploy 3 3 3 3 4d
# 4. Delete the HPA
$ kubectl delete HorizontalPodAutoscaler wordpress-deploy -n blog
horizontalpodautoscaler.autoscaling "wordpress-deploy" deleted
```
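The same HPA can also be written declaratively instead of via kubectl autoscale (an equivalent sketch for the API versions used in this document):
```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: wordpress-deploy
  namespace: blog
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: wordpress-deploy
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 10  # same 10% CPU target as above
```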
## Third: add a rolling-update strategy
This ensures the service is not interrupted while the application is being updated:
```bash
replicas: 2
revisionHistoryLimit: 10
minReadySeconds: 5
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
```
## Fourth: use the Service name instead of the host IP
`If the mysql service is ever recreated, its clusterIP will very likely change, which breaks the WORDPRESS_DB_HOST environment variable above, and the application can no longer reach the database. Here we can simply use the Service name instead: even if the clusterIP changes, nothing is affected. We will dig into this in the later chapter on service discovery.`
```bash
env:
- name: WORDPRESS_DB_HOST
value: wordpress-mysql:3306
```
## Fifth: container start-up order
`When the wordpress service is deployed, has the mysql service already come up? If it has not, we cannot connect to the database. What to do? Before starting the wordpress application we should check the mysql service, and only start deploying the application once it is healthy; that is exactly what an initContainer is for.`
```bash
initContainers:
- name: init-db
image: busybox
command: ['sh', '-c', 'until nslookup mysql; do echo waiting for mysql service; sleep 2; done;']
# The initContainer exits only once the mysql service exists; only after it finishes does the deployment below proceed.
```
# 6. Merging Everything into One File
```bash
kubectl delete -f wordpress-all.yaml
cat > wordpress-all.yaml <<\EOF
---
apiVersion: v1
kind: Namespace
metadata:
name: blog
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: mysql-deploy
namespace: blog
labels:
app: mysql
spec:
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: mysql:5.7
ports:
- containerPort: 3306
name: dbport
env:
- name: MYSQL_ROOT_PASSWORD
value: rootPassW0rd
- name: MYSQL_DATABASE
value: wordpress
- name: MYSQL_USER
value: wordpress
- name: MYSQL_PASSWORD
value: wordpress
volumeMounts:
- name: db
mountPath: /var/lib/mysql
volumes:
- name: db
hostPath:
path: /var/lib/mysql
---
apiVersion: v1
kind: Service
metadata:
name: wordpress-mysql
namespace: blog
spec:
selector:
app: mysql
ports:
- name: mysqlport
protocol: TCP
port: 3306
targetPort: dbport
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: wordpress-deploy
namespace: blog
labels:
app: wordpress
spec:
revisionHistoryLimit: 10
minReadySeconds: 5
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
template:
metadata:
labels:
app: wordpress
spec:
initContainers:
- name: init-db
image: busybox
command: ['sh', '-c', 'until nslookup wordpress-mysql; do echo waiting for mysql service; sleep 2; done;']
containers:
- name: wordpress
image: wordpress
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
name: wdport
env:
- name: WORDPRESS_DB_HOST
value: wordpress-mysql:3306
- name: WORDPRESS_DB_USER
value: wordpress
- name: WORDPRESS_DB_PASSWORD
value: wordpress
resources:
limits:
cpu: 200m
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
---
apiVersion: v1
kind: Service
metadata:
name: wordpress
namespace: blog
spec:
selector:
app: wordpress
type: NodePort
ports:
- name: wordpressport
protocol: TCP
port: 80
nodePort: 32380
targetPort: wdport
EOF
kubectl apply -f wordpress-all.yaml
watch kubectl get pods -n blog
# Verify the mysql service
$ kubectl run mysql-test --rm -it --image=alpine /bin/sh -n blog
$ nslookup wordpress-mysql
Name: wordpress-mysql
Address 1: 10.99.230.27 wordpress-mysql.blog.svc.cluster.local
$ ping wordpress-mysql
PING wordpress-mysql (10.99.230.27): 56 data bytes
64 bytes from 10.99.230.27: seq=0 ttl=64 time=0.124 ms
64 bytes from 10.99.230.27: seq=1 ttl=64 time=0.124 ms
```
References:
https://www.qikqiak.com/k8s-book/docs/31.%E9%83%A8%E7%BD%B2%20Wordpress%20%E7%A4%BA%E4%BE%8B.html
https://blog.csdn.net/maoreyou/article/details/80050623 The Road to Kubernetes 3 – Resolving Service Dependencies
================================================
FILE: components/README.md
================================================
# ingress
# helm
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#check-required-ports Ports that need to be opened
================================================
FILE: components/cronjob/README.md
================================================
References:
https://www.jianshu.com/p/62b4f0a3134b Kubernetes Objects: CronJob
================================================
FILE: components/dashboard/Kubernetes-Dashboard v2.0.0.md
================================================
```bash
# Install
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml
# Uninstall
kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml
# Account authorization
kubectl delete -f admin.yaml
cat > admin.yaml << \EOF
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: admin
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: admin
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
EOF
kubectl apply -f admin.yaml
kubectl describe secret/$(kubectl get secret -n kube-system |grep admin|awk '{print $1}') -n kube-system
```
References:
http://www.mydlq.club/article/28/
================================================
FILE: components/dashboard/README.md
================================================
# 1. Installing dashboard v1.10.1
## 1. Exposing access via NodePort
1. Download the corresponding yaml file
```
wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
vim kubernetes-dashboard.yaml
# 1. Change the image name
......
spec:
containers:
- name: kubernetes-dashboard
#image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 # replaced with the Aliyun mirror below
image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
......
```
2. Change the Service type to NodePort
```
......
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
  type: NodePort # added: expose via NodePort
ports:
- port: 443
targetPort: 8443
      nodePort: 32370 # added: pin a fixed node port
selector:
k8s-app: kubernetes-dashboard
```
3. The final dashboard manifest
```
cat > kubernetes-dashboard.yaml << \EOF
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ------------------- Dashboard Secret ------------------- #
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-certs
namespace: kube-system
type: Opaque
---
# ------------------- Dashboard Service Account ------------------- #
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
---
# ------------------- Dashboard Role & Role Binding ------------------- #
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: kubernetes-dashboard-minimal
namespace: kube-system
rules:
# Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
resources: ["secrets"]
verbs: ["create"]
# Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["create"]
# Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
verbs: ["get", "update", "delete"]
# Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kubernetes-dashboard-settings"]
verbs: ["get", "update"]
# Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
resources: ["services"]
resourceNames: ["heapster"]
verbs: ["proxy"]
- apiGroups: [""]
resources: ["services/proxy"]
resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kubernetes-dashboard-minimal
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system
---
# ------------------- Dashboard Deployment ------------------- #
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
#image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
  type: NodePort # added: expose via NodePort
ports:
- port: 443
targetPort: 8443
      nodePort: 32370 # added: pin a fixed node port
selector:
k8s-app: kubernetes-dashboard
EOF
kubectl apply -f kubernetes-dashboard.yaml
```
4. Then create a user with full cluster-wide permissions to log in to the Dashboard (admin.yaml):
```
cat > admin.yaml << \EOF
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: admin
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: admin
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
EOF
kubectl delete -f admin.yaml # clean up any previous admin objects first
kubectl apply -f admin.yaml
# Get the login token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}')
```
5. Access test: `https://nodeip:32370`
## 2. Accessing via Ingress
```bash
# Clean up the NodePort-based dashboard
kubectl delete -f kubernetes-dashboard.yaml
rm -f kubernetes-dashboard.yaml
wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl apply -n kube-system -f kubernetes-dashboard.yaml
```
1. Create and install the TLS credentials
Access over HTTPS requires a certificate and key, which Kubernetes can provide via a TLS secret.
```bash
#1. Create the TLS secret
# This creates a self-signed certificate for private use. For a public service, apply to a certificate authority for a proper certificate (for a fee), or get a free one from Let's Encrypt (introduced later). Cloudflare can also generate certificates and terminate HTTPS for you, but the domain must be moved there and the advanced features are paid.
#https://github.com/kubernetes/contrib/blob/master/ingress/controllers/nginx/examples/tls/README.md
kubectl delete secret k8s-dashboard-secret -n kube-system
rm -rf /etc/certs/ssl/
mkdir -p /etc/certs/ssl/default
cd /etc/certs/ssl/default/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_default.key -out tls_default.crt -subj "/CN=dashboard.devops.com"
# This produces two files, tls_default.key and tls_default.crt; you can rename them or place them in a specific directory (if this is for a public server, make sure others cannot access them). dashboard.devops.com at the end is my server's address; change it to your own.
```
2. Install the TLS secret
```bash
# Next, turn these two files into a Kubernetes secret. I name it k8s-dashboard-secret, which the Ingress configuration below refers to. If you change the name, update the Ingress yaml to match.
cd /etc/certs/ssl/default/
kubectl -n kube-system delete secret k8s-dashboard-secret
kubectl -n kube-system create secret tls k8s-dashboard-secret --key=tls_default.key --cert=tls_default.crt
# Inspect the secret
kubectl get secret k8s-dashboard-secret -n kube-system
kubectl describe secret k8s-dashboard-secret -n kube-system
# Notes:
# The -n flag above sets the namespace the secret is installed into.
# For security, all Ingress resources (secret, routes, service) must live in the same namespace.
```
3. Configure the Ingress routes
```bash
# Save the following as dashboard-ingress.yaml. The / path is set to reach the Kubernetes dashboard service; /web is only for testing and placeholding, and without nginx installed it returns a service-not-found message.
cat >dashboard-ingress.yaml<<\EOF
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: k8s-dashboard
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
tls:
  - secretName: traefik-cert # note: must match the certificate configured in traefik.toml
#- secretName: k8s-dashboard-secret
rules:
- host: dashboard.devops.com
http:
paths:
- path: /
backend:
serviceName: kubernetes-dashboard
servicePort: 443
EOF
kubectl apply -n kube-system -f dashboard-ingress.yaml
# Note
# The annotations section above is required to support HTTPS and HTTPS services. Different Ingress controllers (and versions) implement this differently, so configure it for the implementation (version) you installed.
# See issue: https://github.com/kubernetes/ingress-nginx/issues/2460
```
References:
https://my.oschina.net/u/2306127/blog/1930169?from=timeline&isappinstalled=0 Kubernetes dashboard over HTTPS via Ingress
================================================
FILE: components/external-storage/0、nfs服务端搭建.md
================================================
## 1. NFS server
```bash
# Install NFS on all nodes
yum install -y nfs-utils rpcbind
# Create the NFS directory
mkdir -p /nfs/data/
# Adjust permissions
chmod -R 666 /nfs/data
# Edit the exports file
vim /etc/exports
/nfs/data 192.168.56.0/24(rw,async,no_root_squash)
# Setting /nfs/data *(rw,async,no_root_squash) instead makes the export valid for all client IPs
# Common options:
# ro: clients mount read-only (the default)
# rw: read-write access
# sync: write data to memory and disk at the same time
# async: buffer data in memory first, then write it to disk
# secure: require the request's source port to be below 1024
# User mapping:
# root_squash: map the NFS client's root user to the server's anonymous user
# no_root_squash: map the NFS client's root user to the server's root user
# all_squash: map all users to the server's anonymous user
# anonuid=UID: map client users to the given uid
# anongid=GID: map client users to the given gid
# Apply the export configuration
exportfs -r
# Verify the exports
exportfs
# Start and enable the rpcbind and nfs services
systemctl restart rpcbind && systemctl enable rpcbind
systemctl restart nfs && systemctl enable nfs
# Check RPC service registration (note: the services below must be allowed through /etc/hosts.deny)
$ rpcinfo -p localhost
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 20048 mountd
100005 1 tcp 20048 mountd
100005 2 udp 20048 mountd
100005 2 tcp 20048 mountd
100005 3 udp 20048 mountd
100005 3 tcp 20048 mountd
100024 1 udp 34666 status
100024 1 tcp 7951 status
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 3 tcp 2049 nfs_acl
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100227 3 udp 2049 nfs_acl
100021 1 udp 31088 nlockmgr
100021 3 udp 31088 nlockmgr
100021 4 udp 31088 nlockmgr
100021 1 tcp 27131 nlockmgr
100021 3 tcp 27131 nlockmgr
100021 4 tcp 27131 nlockmgr
# Allow rpcbind and friends in /etc/hosts.allow (required on both the NFS server and its clients)
chattr -i /etc/hosts.allow
echo "nfsd:all" >>/etc/hosts.allow
echo "rpcbind:all" >>/etc/hosts.allow
echo "mountd:all" >>/etc/hosts.allow
chattr +i /etc/hosts.allow
# Test with showmount
showmount -e 192.168.56.11
# Test with tcpdmatch
$ tcpdmatch rpcbind 192.168.56.11
client: address 192.168.56.11
server: process rpcbind
access: granted
```
## 2. NFS client
```bash
yum install -y nfs-utils rpcbind
# Create the mount point on the client, then mount
mkdir -p /mnt/nfs # (note: once mounted, any existing data under the mount point is hidden and cannot be found)
mount -t nfs -o nolock,vers=4 192.168.56.11:/nfs/data /mnt/nfs
```
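To confirm the mount took effect (a quick check against the server and path used above):
```bash
# Show the NFS mount and the options it was mounted with
mount | grep /mnt/nfs
# Free space as seen through the NFS mount
df -h /mnt/nfs
```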
## 3. Mounting NFS via fstab
```bash
# Alternatively, write the mount into /etc/fstab
vim /etc/fstab
192.168.56.11:/nfs/data /mnt/nfs/ nfs auto,noatime,nolock,bg,nfsvers=4,intr,tcp,actimeo=1800 0 0
# Mount everything in fstab
mount -a
# Unmount
umount /mnt/nfs
# NFS server statistics
nfsstat -s
# NFS client statistics
nfsstat -c
```
References:
http://www.mydlq.club/article/3/ Setting up an NFS server on CentOS 7
https://blog.rot13.org/2012/05/rpcbind-is-new-portmap-or-how-to-make-nfs-secure.html
https://yq.aliyun.com/articles/694065
https://www.crifan.com/linux_fstab_and_mount_nfs_syntax_and_parameter_meaning/ fstab syntax and NFS mount parameter meanings on Linux
================================================
FILE: components/external-storage/1、k8s的pv和pvc简述.md
================================================
# 1. PV (PersistentVolume)
A PersistentVolume (PV) is a piece of storage in an external storage system, created and maintained by an administrator. Like a Volume, a PV is persistent, and its lifecycle is independent of any Pod.
1. PV and PVC bind one-to-one: once a PV is claimed by a PVC it shows as Bound, and no other PVC can use that PV.
2. Once a PVC is bound to a PV it acts as a storage volume, and the PVC can then be used by multiple Pods (whether a PVC supports access by multiple Pods depends on the accessModes definition).
3. If no suitable PV is found, the PVC stays in the Pending state.
4. The PV's reclaim policy options:
Retain (the default): keep the generated data.
Recycle: delete the generated data and reclaim the PV.
Delete: once the PVC is unbound, the PV is deleted automatically.
# 2. PVC
A PersistentVolumeClaim (PVC) is a claim on a PV, usually created and maintained by an ordinary user. When storage needs to be allocated to a Pod, the user creates a PVC stating the required capacity, access mode (e.g. read-only), and so on, and Kubernetes finds and provides a PV that satisfies the claim.
With PersistentVolumeClaims, users only have to tell Kubernetes what kind of storage they need, without caring where the space actually comes from or how it is accessed. Those low-level Storage Provider details are left to the administrator; only the administrator should care about the details of creating PersistentVolumes.
## A PVC must specify:
1. accessModes: the access mode. Options:
ReadWriteOnce (RWO) – the volume can be mounted as read-write by a single node
ReadOnlyMany (ROX) – the volume can be mounted read-only by many nodes
ReadWriteMany (RWX) – the volume can be mounted as read-write by many nodes
2. resources: the capacity request (e.g. requesting 5GB means the matched storage must offer at least 5GB).
3. selector: a label selector; without labels, the best match is searched across all PVs.
4. storageClassName: the storage class name.
5. volumeMode: the mode of the backing volume; usable to restrict which kinds of PV the claim may use.
6. volumeName: the volume name, pinning the claim to a specific PV (equivalent to binding).
# 3. Differences Between the Two
1. A PV is cluster-scoped and cannot be defined inside a namespace.
2. A PVC is namespace-scoped.
References:
https://blog.csdn.net/weixin_42973226/article/details/86501693 Deploying WordPress on rook-ceph
https://www.cnblogs.com/benjamin77/p/9944268.html k8s persistent storage with PV & PVC
================================================
FILE: components/external-storage/2、静态配置PV和PVC.md
================================================
Table of Contents
=================
* [1. Environment](#1-environment)
* [2. PV Operations](#2-pv-operations)
  * [01. Create the PV volumes](#01-create-the-pv-volumes)
  * [02. PV configuration parameters](#02-pv-configuration-parameters)
  * [03. Create the PV resources](#03-create-the-pv-resources)
  * [04. Inspect the PVs](#04-inspect-the-pvs)
* [3. PVC Operations](#3-pvc-operations)
  * [01. Create the PVC resources](#01-create-the-pvc-resources)
  * [02. Inspect PVC/PV](#02-inspect-pvcpv)
* [4. Using the Storage in a Pod](#4-using-the-storage-in-a-pod)
* [5. Verification](#5-verification)
  * [01. Verify the PV is usable](#01-verify-the-pv-is-usable)
  * [02. Check the mounts inside the Pods](#02-check-the-mounts-inside-the-pods)
  * [03. Delete the Pods](#03-delete-the-pods)
  * [04. Then delete the PVC](#04-then-delete-the-pvc)
  * [05. Then delete the PV](#05-then-delete-the-pv)
# 1. Environment
As preparation, we have already set up an NFS server on a node in the same LAN as the k8s cluster, exporting the directory /data/nfs. PVs are cluster-wide; a PVC can specify a namespace.
# 2. PV Operations
## 01. Create the PV volumes
```bash
# Create the directories backing the PV volumes
mkdir -p /data/nfs/pv001
mkdir -p /data/nfs/pv002
# Configure the exports
$ vim /etc/exports
/data/nfs *(rw,no_root_squash,sync,insecure)
/data/nfs/pv001 *(rw,no_root_squash,sync,insecure)
/data/nfs/pv002 *(rw,no_root_squash,sync,insecure)
# Apply the export configuration
exportfs -r
# Restart the rpcbind and nfs services
systemctl restart rpcbind && systemctl restart nfs
# List the export points
$ showmount -e localhost
Export list for localhost:
/data/nfs/pv002 *
/data/nfs/pv001 *
/data/nfs *
```
## 02. PV configuration parameters
```bash
Configuration notes:
① capacity sets the PV's capacity to 20G.
② accessModes sets the access mode to ReadWriteOnce. Supported modes:
ReadWriteOnce – the PV can be mounted read-write by a single node.
ReadOnlyMany – the PV can be mounted read-only by many nodes.
ReadWriteMany – the PV can be mounted read-write by many nodes.
③ persistentVolumeReclaimPolicy sets the reclaim policy to Recycle. Supported policies:
Retain – keep everything; K8S does nothing, an administrator handles the data in the PV manually and deletes the PV by hand afterwards
Recycle – K8S deletes the data in the PV and flips the PV back to Available, ready to be bound by a new PVC
Delete – K8S automatically deletes the PV and the data inside it
④ storageClassName sets the PV's class to nfs, effectively categorizing the PV; a PVC can request a PV of a specific class.
⑤ nfs specifies the PV's backing directory on the NFS server.
Generally the PV/PVC lifecycle has 5 phases:
Provisioning: the PV is created, either directly (static) or dynamically via a StorageClass
Binding: the PV is assigned to a PVC
Using: a Pod uses the Volume through the PVC
Releasing: the Pod releases the Volume and the PVC is deleted
Reclaiming: the PV is reclaimed, either kept for future use or deleted from the backing storage
Across these 5 phases a Volume can be in one of 4 states:
Available: ready for use
Bound: assigned to a PVC
Released: the PVC was unbound, but the reclaim policy has not run yet
Failed: an error occurred
A PV that becomes Released is reclaimed according to its policy. There are three reclaim policies:
Retain – keep everything; the user handles the data in the PV manually and deletes the PV by hand afterwards
Delete – K8S automatically deletes the PV and the data inside it
Recycle – K8S deletes the data in the PV and flips the PV back to Available, ready to be bound by a new PVC
```
## 03. Create the PV resources
1. nfs-pv001.yaml
```bash
# Clean up the PV resource
kubectl delete -f nfs-pv001.yaml
# Write the PV manifest
cat > nfs-pv001.yaml <<\EOF
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv001
labels:
pv: nfs-pv001
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Recycle
storageClassName: nfs
nfs:
path: /data/nfs/pv001
server: 192.168.56.11
EOF
# Apply the PV to the cluster
kubectl apply -f nfs-pv001.yaml
```
2. nfs-pv002.yaml
```bash
# Clean up the PV resource
kubectl delete -f nfs-pv002.yaml
# Write the PV manifest
cat > nfs-pv002.yaml <<\EOF
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv002
labels:
pv: nfs-pv002
spec:
capacity:
storage: 30Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Recycle
storageClassName: nfs
nfs:
path: /data/nfs/pv002
server: 192.168.56.11
EOF
# Apply the PV to the cluster
kubectl apply -f nfs-pv002.yaml
```
## 04. Inspect the PVs
```bash
# List the PVs
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-pv001 20Gi RWO Recycle Available nfs 68s
nfs-pv002 30Gi RWO Recycle Available nfs 33s
# STATUS Available means the PV is ready and can be claimed by a PVC.
```
# 3. PVC Operations
## 01. Create the PVC resources
Next, create two PVCs named nfs-pvc001 and nfs-pvc002; the config file nfs-pvc001.yaml is as follows:
1. nfs-pvc001.yaml
```bash
# Clean up the PVC resource
kubectl delete -f nfs-pvc001.yaml
# Write the PVC manifest
cat > nfs-pvc001.yaml <<\EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-pvc001
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: nfs
selector:
matchLabels:
pv: nfs-pv001
EOF
# Apply the PVC to the cluster
kubectl apply -f nfs-pvc001.yaml
```
2. nfs-pvc002.yaml
```bash
# Clean up the PVC resource
kubectl delete -f nfs-pvc002.yaml
# Write the PVC manifest
cat > nfs-pvc002.yaml <<\EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-pvc002
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
storageClassName: nfs
selector:
matchLabels:
pv: nfs-pv002
EOF
# Apply the PVC to the cluster
kubectl apply -f nfs-pvc002.yaml
```
## 02. Inspect PVC/PV
```bash
$ kubectl get pvc --show-labels
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-pvc001 Bound nfs-pv001 20Gi RWO nfs 18s
nfs-pvc002 Bound nfs-pv002 30Gi RWO nfs 7s
$ kubectl get pv --show-labels
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-pv001 20Gi RWO Recycle Bound default/nfs-pvc001 nfs 17m
nfs-pv002 30Gi RWO Recycle Bound default/nfs-pvc002 nfs 17m
# The outputs of kubectl get pvc and kubectl get pv show that nfs-pvc001 and nfs-pvc002 are bound to nfs-pv001 and nfs-pv002 respectively; the claims succeeded. Note that pinning a PVC to a specific PV is done here via labels; you can also omit the selector, in which case the PVC binds to a random matching PV.
```
# 4. Using the Storage in a Pod
```As with an ordinary Volume, the volumes section uses persistentVolumeClaim to reference the Volumes claimed through nfs-pvc001 and nfs-pvc002.```
1. nfs-pod001.yaml
```bash
# Clean up the Pod resource
kubectl delete -f nfs-pod001.yaml
# Write the Pod manifest
cat > nfs-pod001.yaml <<\EOF
kind: Pod
apiVersion: v1
metadata:
name: nfs-pod001
spec:
containers:
- name: myfrontend
image: nginx
volumeMounts:
- mountPath: "/var/www/html"
name: nfs-pv001
volumes:
- name: nfs-pv001
persistentVolumeClaim:
claimName: nfs-pvc001
EOF
# Create the Pod
kubectl apply -f nfs-pod001.yaml
```
2. nfs-pod002.yaml
```bash
# Clean up the Pod resource
kubectl delete -f nfs-pod002.yaml
# Write the Pod manifest
cat > nfs-pod002.yaml <<\EOF
kind: Pod
apiVersion: v1
metadata:
name: nfs-pod002
spec:
containers:
- name: myfrontend
image: nginx
volumeMounts:
- mountPath: "/var/www/html"
name: nfs-pv002
volumes:
- name: nfs-pv002
persistentVolumeClaim:
claimName: nfs-pvc002
EOF
# Create the Pod
kubectl apply -f nfs-pod002.yaml
```
# 5. Verification
## 01. Verify the PV is usable
```bash
# Create files from inside the Pods
kubectl exec nfs-pod001 touch /var/www/html/index001.html
kubectl exec nfs-pod002 touch /var/www/html/index002.html
# Log in to the NFS server and check whether the files were created
$ ls /data/nfs/pv001/
index001.html
$ ls /data/nfs/pv002/
index002.html
```
## 02. Check the mounts inside the Pods
```bash
# Verify nfs-pod001's mount
$ kubectl exec -it nfs-pod001 /bin/bash
$ root@nfs-pod001:/# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 711G 85G 627G 12% /
tmpfs 64M 0 64M 0% /dev
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/sda3 711G 85G 627G 12% /etc/hosts
shm 64M 0 64M 0% /dev/shm
192.168.56.11:/data/nfs/pv001 932G 620M 931G 1% /var/www/html
# Verify nfs-pod002's mount
$ kubectl exec -it nfs-pod002 /bin/bash
$ root@nfs-pod002:/# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 711G 85G 627G 12% /
tmpfs 64M 0 64M 0% /dev
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/sda3 711G 85G 627G 12% /etc/hosts
shm 64M 0 64M 0% /dev/shm
192.168.56.11:/data/nfs/pv002 932G 620M 931G 1% /var/www/html
```
## 03. Delete the Pods
The PV and PVC are not deleted, and the data on the NFS share is not deleted either.
```bash
$ kubectl delete -f nfs-pod001.yaml
pod "nfs-pod001" deleted
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-pv001 20Gi RWO Recycle Bound default/nfs-pvc001 nfs 13m
nfs-pv002 30Gi RWO Recycle Bound default/nfs-pvc002 nfs 13m
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-pvc001 Bound nfs-pv001 20Gi RWO nfs 13m
nfs-pvc002 Bound nfs-pv002 30Gi RWO nfs 13m
```
## 04. Then delete the PVC
The PV is released and returns to the Available state, and the data on the NFS share is deleted.
```bash
$ kubectl delete -f nfs-pvc001.yaml
persistentvolumeclaim "nfs-pvc001" deleted
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-pv001 20Gi RWO Recycle Available nfs 18m
nfs-pv002 30Gi RWO Recycle Bound default/nfs-pvc002 nfs 18m
$ ls /data/nfs/pv001/ # the file no longer exists
```
## 05. Then delete the PV
```bash
$ kubectl delete -f nfs-pv001.yaml
persistentvolume "nfs-pv001" deleted
```
References:
https://blog.csdn.net/networken/article/details/86697018 Deploying NFS persistent storage on Kubernetes
================================================
FILE: components/external-storage/3、动态申请PV卷.md
================================================
Table of Contents
=================
* [Deploying an NFS Provisioner in Kubernetes for Dynamic Volume Provisioning](#deploying-an-nfs-provisioner-in-kubernetes-for-dynamic-volume-provisioning)
  * [1. NFS Provisioner overview](#1-nfs-provisioner-overview)
  * [2. How external NFS provisioners work](#2-how-external-nfs-provisioners-work)
    * [1. nfs-client](#1-nfs-client)
    * [2. nfs](#2-nfs)
  * [3. Deploying the service](#3-deploying-the-service)
    * [1. Configure RBAC](#1-configure-rbac)
    * [2. Deploy nfs-client-provisioner](#2-deploy-nfs-client-provisioner)
    * [3. Deploy the NFS Provisioner](#3-deploy-the-nfs-provisioner)
    * [4. Create the StorageClass](#4-create-the-storageclass)
  * [4. Creating a PVC](#4-creating-a-pvc)
    * [01. Create a new namespace, then the PVC](#01-create-a-new-namespace-then-the-pvc)
  * [5. Creating a test Pod](#5-creating-a-test-pod)
    * [01. Verify the file on the NFS server](#01-verify-the-file-on-the-nfs-server)
# Deploying an NFS Provisioner in Kubernetes for Dynamic Volume Provisioning
## 1. NFS Provisioner overview
The NFS Provisioner is an automatic volume provisioner that uses an existing, already-configured NFS server to support dynamic provisioning of Kubernetes PersistentVolumes via PersistentVolumeClaims.
- Provisioned volumes are named ${namespace}-${pvcName}-${pvName}.
## 2. How external NFS provisioners work
Kubernetes' external NFS drivers fall into two classes, depending on whether they act as the NFS server or as an NFS client:
### 1. nfs-client
- This is the class demonstrated below. It uses Kubernetes' built-in NFS driver to mount a remote NFS server onto a local directory, then registers itself as a storage provider associated with a StorageClass. When a user creates a PVC to request a PV, the provisioner compares the claim's requirements with its own properties and, once they match, creates the PV's subdirectory inside the locally mounted NFS directory, providing dynamic storage to Pods.
### 2. nfs
- Unlike nfs-client, this driver does not use Kubernetes' NFS driver to mount the remote NFS share locally and re-allocate it; instead it maps local files directly into the container and runs ganesha.nfsd inside the container to serve NFS itself. Each time a PV is created, it creates the corresponding folder under its local NFS root and exports that subdirectory. This provides dynamic Kubernetes backend storage over NFS.
- This article uses the nfs-client-provisioner application, with an NFS server as Kubernetes' persistent storage backend, providing PVs dynamically. The prerequisite is an NFS server that is already installed and network-reachable from the Kubernetes worker nodes. The nfs-client driver is deployed into the K8S cluster as a Deployment, which then provides the storage service.
`nfs-client-provisioner` is a simple external NFS provisioner for Kubernetes; it does not provide NFS itself and needs an existing NFS server to supply the storage
## 3. Deploying the service
### 1. Configure RBAC
Most Kubernetes clusters today use RBAC-based access control, so create a ServiceAccount with the required permissions and bind it to the "NFS Provisioner" created below.
```bash
# Clean up the RBAC objects
kubectl delete -f nfs-rbac.yaml -n kube-system
# Write the yaml
cat >nfs-rbac.yaml<<-EOF
---
kind: ServiceAccount
apiVersion: v1
metadata:
name: nfs-client-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-client-provisioner-runner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: run-nfs-client-provisioner
subjects:
- kind: ServiceAccount
name: nfs-client-provisioner
namespace: kube-system
roleRef:
kind: ClusterRole
name: nfs-client-provisioner-runner
apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-nfs-client-provisioner
rules:
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-nfs-client-provisioner
subjects:
- kind: ServiceAccount
name: nfs-client-provisioner
# replace with namespace where provisioner is deployed
namespace: kube-system
roleRef:
kind: Role
name: leader-locking-nfs-client-provisioner
apiGroup: rbac.authorization.k8s.io
EOF
# Apply the RBAC objects
kubectl apply -f nfs-rbac.yaml -n kube-system
```
### 2. Deploy nfs-client-provisioner
First, clone the repository to get the yaml files
```
git clone https://github.com/kubernetes-incubator/external-storage.git
cp -R external-storage/nfs-client/deploy/ /root/
cd deploy
```
### 3. Deploy the NFS Provisioner
Modify deployment.yaml: the values to change are the NFS server's IP address (10.198.1.155) and the exported path (/data/nfs/); set both to your actual NFS server and share. Also switch the nfs-client-provisioner image to a mirror registry.
Set up the NFS Provisioner deployment file; here it is deployed into the "kube-system" Namespace.
```bash
# Clean up the NFS Provisioner resources
kubectl delete -f nfs-provisioner-deploy.yaml -n kube-system
export NFS_ADDRESS='10.198.1.155'
export NFS_DIR='/data/nfs'
# Write deployment.yaml
cat >nfs-provisioner-deploy.yaml<<-EOF
---
kind: Deployment
apiVersion: apps/v1
metadata:
name: nfs-client-provisioner
spec:
replicas: 1
selector:
matchLabels:
app: nfs-client-provisioner
strategy:
    type: Recreate #--- upgrade strategy: delete then recreate (the default is rolling update)
template:
metadata:
labels:
app: nfs-client-provisioner
spec:
serviceAccountName: nfs-client-provisioner
containers:
- name: nfs-client-provisioner
        #--- quay.io is blocked in China, so a domestic mirror is used instead
#image: quay-mirror.qiniu.com/external_storage/nfs-client-provisioner:latest
image: registry.cn-hangzhou.aliyuncs.com/open-ali/nfs-client-provisioner:latest
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
          value: nfs-client #--- the provisioner's name; the StorageClass created later must match it
- name: NFS_SERVER
          value: ${NFS_ADDRESS} #--- NFS server address; keep consistent with the volumes section
- name: NFS_PATH
          value: ${NFS_DIR} #--- NFS export path; keep consistent with the volumes section
volumes:
- name: nfs-client-root
nfs:
          server: ${NFS_ADDRESS} #--- NFS server address
          path: ${NFS_DIR} #--- NFS export path
EOF
# Apply deployment.yaml
kubectl apply -f nfs-provisioner-deploy.yaml -n kube-system
# Check the created pod
kubectl get pod -o wide -n kube-system|grep nfs-client
# Tail the pod logs
kubectl logs -f `kubectl get pod -o wide -n kube-system|grep nfs-client|awk '{print $1}'` -n kube-system
```
### 4. Create the StorageClass
In the StorageClass definition, note that the provisioner field must equal the value of the PROVISIONER_NAME environment variable passed to the driver; otherwise the driver does not know how to bind the StorageClass.
You can leave this unchanged, or rename the provisioner, as long as it matches the PROVISIONER_NAME set in the Deployment above.
```bash
# Clean up the StorageClass resource
kubectl delete -f nfs-storage.yaml
# Write the yaml
cat >nfs-storage.yaml<<-EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nfs-storage
annotations:
    storageclass.kubernetes.io/is-default-class: "true" #--- make this the default StorageClass
provisioner: nfs-client #--- the dynamic provisioner's name; must match the PROVISIONER_NAME set above
parameters:
  archiveOnDelete: "true" #--- "false": data is not kept when the PVC is deleted; "true": data is kept (archived)
mountOptions:
  - hard # hard mount
  - nfsvers=4 # NFS version; set according to the NFS server's version
EOF
# Apply the StorageClass
kubectl apply -f nfs-storage.yaml
# Check the created StorageClass (nfs-storage has become the default StorageClass)
$ kubectl get sc
NAME PROVISIONER AGE
nfs-storage (default) nfs-client 3m38s
```
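Because nfs-storage is now the default StorageClass, a PVC may omit storageClassName entirely and still be served by it (a sketch; the claim name is illustrative):
```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: default-class-claim
spec:
  # no storageClassName: the default StorageClass (nfs-storage) is used
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
```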
## 4. Creating a PVC
### 01. Create a new namespace, then the PVC
```bash
# Delete the namespace
kubectl delete ns kube-public
# Create the namespace
kubectl create ns kube-public
# Clean up the PVC
kubectl delete -f test-claim.yaml -n kube-public
# Write the yaml
cat >test-claim.yaml<<\EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: test-claim
spec:
  storageClassName: nfs-storage #--- must match the name of the StorageClass created above
accessModes:
- ReadWriteMany
resources:
requests:
storage: 100Gi
EOF
# Create the PVC
kubectl apply -f test-claim.yaml -n kube-public
# Check the created PV and PVC
$ kubectl get pvc -n kube-public
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
test-claim Bound pvc-593f241f-a75f-459a-af18-a672e5090921 100Gi RWX nfs-storage 3s
kubectl get pv
# Then, in the NFS export directory, a directory for this volume has been created; its name combines the namespace, the PVC name, and a uuid:
# Note: a PVC stuck in Pending may mean the nfs-client-provisioner pod has a problem; image-pull issues can appear when it is deleted and recreated
```
## 5. Creating a test Pod
```bash
# Clean up
kubectl delete -f test-pod.yaml -n kube-public
# Write the yaml
cat > test-pod.yaml <<\EOF
kind: Pod
apiVersion: v1
metadata:
name: test-pod
spec:
containers:
- name: test-pod
image: busybox:latest
command:
- "/bin/sh"
args:
- "-c"
- "touch /mnt/SUCCESS && exit 0 || exit 1"
volumeMounts:
- name: nfs-pvc
mountPath: "/mnt"
restartPolicy: "Never"
volumes:
- name: nfs-pvc
persistentVolumeClaim:
claimName: test-claim
EOF
# Create the pod
kubectl apply -f test-pod.yaml -n kube-public
# Check the created pod
kubectl get pod -o wide -n kube-public
```
### 01. Verify the file on the NFS server
Enter the NFS mount directory on the NFS Server and check whether the file created in the Pod exists:
```bash
$ cd /data/nfs/
$ ls
archived-kube-public-test-claim-pvc-2dd4740d-f2d1-4e88-a0fc-383c00e37255 kube-public-test-claim-pvc-ad304939-e75d-414f-81b5-7586ef17db6c
archived-kube-public-test-claim-pvc-593f241f-a75f-459a-af18-a672e5090921 kube-system-test1-claim-pvc-f84dc09c-b41e-4e67-a239-b14f8d342efc
archived-kube-public-test-claim-pvc-b08b209d-c448-4ce4-ab5c-1bf37cc568e6 pv001
default-test-claim-pvc-4f18ed06-27cd-465b-ac87-b2e0e9565428 pv002
# The SUCCESS file has been created. Directories created by the NFS Provisioner are named "namespace-pvcName-pvName", where the PV name is a random string. As long as the PVC is not deleted, the binding between Kubernetes and the stored data is preserved; deleting the PVC deletes the bound folder, and even recreating a PVC with the same name later produces a different folder, because a new random PV name is generated and the folder name depends on it, so delete PVCs with care.
```
References:
https://blog.csdn.net/qq_25611295/article/details/86065053 k8s PV and PVC persistent storage (static and dynamic)
https://blog.csdn.net/networken/article/details/86697018 Deploying NFS persistent storage on Kubernetes
https://www.jianshu.com/p/5e565a8049fc Deploying NFS persistent storage on Kubernetes (static and dynamic)
================================================
FILE: components/external-storage/4、Kubernetes之MySQL持久存储和故障转移.md
================================================
Table of Contents
=================
* [1. MySQL Persistence Walkthrough](#1-mysql-persistence-walkthrough)
  * [1. Providing persistent storage for the database](#1-providing-persistent-storage-for-the-database)
* [2. Static PV and PVC](#2-static-pv-and-pvc)
  * [1. Create the PV](#1-create-the-pv)
  * [2. Create the PVC](#2-create-the-pvc)
* [3. Deploying MySQL](#3-deploying-mysql)
  * [1. The MySQL manifest mysql.yaml](#1-the-mysql-manifest-mysqlyaml)
  * [2. Write data into MySQL](#2-write-data-into-mysql)
  * [3. Failover](#3-failover)
* [4. Using a fresh namespace](#4-using-a-fresh-namespace)
# 一、MySQL持久化演练
## 1、数据库提供持久化存储,主要分为下面几个步骤:
1、创建 PV 和 PVC
2、部署 MySQL
3、向 MySQL 添加数据
4、模拟节点宕机故障,Kubernetes 将 MySQL 自动迁移到其他节点
5、验证数据一致性
# 二、静态PV PVC
```bash
PV就好比是一个仓库,我们需要先购买一个仓库,即定义一个PV存储服务,例如CEPH,NFS,Local Hostpath等等。
PVC就好比租户,pv和pvc是一对一绑定的,挂载到POD中,一个pvc可以被多个pod挂载。
```
## 1、创建 PV
```bash
# 清理pv资源
kubectl delete -f mysql-static-pv.yaml
# 编写pv yaml资源文件
cat > mysql-static-pv.yaml <<\EOF
apiVersion: v1
kind: PersistentVolume
metadata:
name: mysql-static-pv
spec:
capacity:
storage: 80Gi
accessModes:
- ReadWriteOnce
#ReadWriteOnce - 卷可以由单个节点以读写方式挂载
#ReadOnlyMany - 卷可以由许多节点以只读方式挂载
#ReadWriteMany - 卷可以由许多节点以读写方式挂载
persistentVolumeReclaimPolicy: Retain
#Retain,不清理, 保留 Volume(需要手动清理)
#Recycle,删除数据,即 rm -rf /thevolume/*(只有 NFS 和 HostPath 支持)
#Delete,删除存储资源,比如删除 AWS EBS 卷(只有 AWS EBS, GCE PD, Azure Disk 和 Cinder 支持)
nfs:
path: /data/nfs/mysql/
server: 10.198.1.155
mountOptions:
- vers=4
- minorversion=0
- noresvport
EOF
# 部署pv到集群中
kubectl apply -f mysql-static-pv.yaml
# 查看pv
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
mysql-static-pv 80Gi RWO Retain Available 4m20s
```
## 2、创建PVC
```bash
# 清理pvc资源
kubectl delete -f mysql-pvc.yaml
# 编写pvc yaml资源文件
cat > mysql-pvc.yaml <<\EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-static-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 80Gi
EOF
# 创建pvc资源
kubectl apply -f mysql-pvc.yaml
# 查看pvc
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-static-pvc Bound pvc-c55f8695-2a0b-4127-a60b-5c1aba8b9104 80Gi RWO nfs-storage 81s
```
# 三、部署 MySQL
## 1、MySQL 的配置文件mysql.yaml如下:
```bash
kubectl delete -f mysql.yaml
cat >mysql.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
name: mysql
spec:
ports:
- port: 3306
selector:
app: mysql
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: mysql
spec:
selector:
matchLabels:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: mysql:5.6
env:
- name: MYSQL_ROOT_PASSWORD
value: password
ports:
- name: mysql
containerPort: 3306
volumeMounts:
- name: mysql-persistent-storage
mountPath: /var/lib/mysql
volumes:
- name: mysql-persistent-storage
persistentVolumeClaim:
claimName: mysql-static-pvc
EOF
kubectl apply -f mysql.yaml
# PVC mysql-static-pvc Bound 的 PV mysql-static-pv 将被 mount 到 MySQL 的数据目录 /var/lib/mysql。
```
## 2、更新 MySQL 数据
MySQL 被部署到 k8s-node02,下面通过客户端访问 Service mysql:
```bash
$ kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword
If you don't see a command prompt, try pressing enter.
mysql>
我们在mysql库中创建一个表myid,然后在表里新增几条数据。
mysql> use mysql
Database changed
mysql> drop table myid;
Query OK, 0 rows affected (0.12 sec)
mysql> create table myid(id int(4));
Query OK, 0 rows affected (0.23 sec)
mysql> insert myid values(888);
Query OK, 1 row affected (0.03 sec)
mysql> select * from myid;
+------+
| id |
+------+
| 888 |
+------+
1 row in set (0.00 sec)
```
## 3、故障转移
我们现在把 node02 机器关机,模拟节点宕机故障。
```bash
1、一段时间之后,Kubernetes 将 MySQL 迁移到 k8s-node01
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mysql-7686899cf9-8z6tc 1/1 Running 0 21s 10.244.1.19 node01
mysql-7686899cf9-d4m42 1/1 Terminating 0 23m 10.244.2.17 node02
2、验证数据的一致性
$ kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword
If you don't see a command prompt, try pressing enter.
mysql> use mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from myid;
+------+
| id |
+------+
| 888 |
+------+
1 row in set (0.00 sec)
3、MySQL 服务恢复,数据也完好无损,我们可以可以在存储节点上面查看一下生成的数据库文件。
[root@nfs_server mysql-pv]# ll
-rw-rw---- 1 systemd-bus-proxy ssh_keys 56 12月 14 09:53 auto.cnf
-rw-rw---- 1 systemd-bus-proxy ssh_keys 12582912 12月 14 10:15 ibdata1
-rw-rw---- 1 systemd-bus-proxy ssh_keys 50331648 12月 14 10:15 ib_logfile0
-rw-rw---- 1 systemd-bus-proxy ssh_keys 50331648 12月 14 09:53 ib_logfile1
drwx------ 2 systemd-bus-proxy ssh_keys 4096 12月 14 10:05 mysql
drwx------ 2 systemd-bus-proxy ssh_keys 4096 12月 14 09:53 performance_schema
```
# 四、全新命名空间使用
pv是全局的,pvc可以指定namespace
```bash
kubectl delete ns test-ns
kubectl create ns test-ns
kubectl apply -f mysql-pvc.yaml -n test-ns
kubectl apply -f mysql.yaml -n test-ns
kubectl get pods -n test-ns -o wide
kubectl -n test-ns logs -f $(kubectl get pods -n test-ns|grep mysql|awk '{print $1}')
kubectl run -n test-ns -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword
```
参考文档:
https://blog.51cto.com/wzlinux/2330295 Kubernetes 之 MySQL 持久存储和故障转移(十一)
https://qingmu.io/2019/08/11/Run-mysql-on-kubernetes/ 从部署mysql聊一聊有状态服务和PV及PVC
================================================
FILE: components/external-storage/5、Kubernetes之Nginx动静态PV持久存储.md
================================================
Table of Contents
=================
* [一、nginx使用nfs静态PV](#一nginx使用nfs静态pv)
* [1、静态nfs-static-nginx-rc.yaml](#1静态nfs-static-nginx-rcyaml)
* [2、静态nfs-static-nginx-deployment.yaml](#2静态nfs-static-nginx-deploymentyaml)
* [3、nginx多目录挂载](#3nginx多目录挂载)
* [二、nginx使用nfs动态PV](#二nginx使用nfs动态pv)
* [1、动态nfs-dynamic-nginx.yaml](#1动态nfs-dynamic-nginxyaml)
# 一、nginx使用nfs静态PV
## 1、静态nfs-static-nginx-rc.yaml
```bash
##清理资源
kubectl delete -f nfs-static-nginx-rc.yaml -n test
cat >nfs-static-nginx-rc.yaml<<\EOF
##创建namespace
---
apiVersion: v1
kind: Namespace
metadata:
name: test
labels:
name: test
##创建nfs-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv
labels:
pv: nfs-pv
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略
nfs:
path: /data/nfs/nginx/
server: 10.198.1.155
##创建nfs-pvc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-pvc
namespace: test
labels:
pvc: nfs-pvc
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
storageClassName: nfs
selector:
matchLabels:
pv: nfs-pv
##部署应用nginx
---
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx-test
namespace: test
labels:
name: nginx-test
spec:
replicas: 2
selector:
name: nginx-test
template:
metadata:
labels:
name: nginx-test
spec:
containers:
- name: nginx-test
image: docker.io/nginx
volumeMounts:
- mountPath: /usr/share/nginx/html
name: nginx-data
ports:
- containerPort: 80
volumes:
- name: nginx-data
persistentVolumeClaim:
claimName: nfs-pvc
##创建service
---
apiVersion: v1
kind: Service
metadata:
namespace: test
name: nginx-test
labels:
name: nginx-test
spec:
type: NodePort
ports:
- port: 80
protocol: TCP
targetPort: 80
name: http
nodePort: 30080
selector:
name: nginx-test
EOF
##创建资源
kubectl apply -f nfs-static-nginx-rc.yaml -n test
##查看pv资源
kubectl get pv -n test --show-labels
##查看pvc资源
kubectl get pvc -n test --show-labels
##查看pod
$ kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
nginx-test-r4n2j 1/1 Running 0 54s
nginx-test-zstf5 1/1 Running 0 54s
#可以看到,nginx应用已经部署成功。
#nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口
#切换到到nfs-server服务器上
echo "Test NFS Share discovery with nfs-static-nginx-rc" > /data/nfs/nginx/index.html
#在浏览器上访问kubernetes主节点的 http://master:30080,就能访问到这个页面内容了
```
## 2、静态nfs-static-nginx-deployment.yaml
```bash
##清理资源
kubectl delete -f nfs-static-nginx-deployment.yaml -n test
cat >nfs-static-nginx-deployment.yaml<<\EOF
##创建namespace
---
apiVersion: v1
kind: Namespace
metadata:
name: test
labels:
name: test
##创建nfs-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv
labels:
pv: nfs-pv
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略
nfs:
path: /data/nfs/nginx/
server: 10.198.1.155
##创建nfs-pvc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-pvc
namespace: test
labels:
pvc: nfs-pvc
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
storageClassName: nfs
selector:
matchLabels:
pv: nfs-pv
##部署应用nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
namespace: test
labels:
name: nginx-test
spec:
replicas: 2
selector:
matchLabels:
name: nginx-test
template:
metadata:
labels:
name: nginx-test
spec:
containers:
- name: nginx-test
image: docker.io/nginx
volumeMounts:
- mountPath: /usr/share/nginx/html
name: nginx-data
ports:
- containerPort: 80
volumes:
- name: nginx-data
persistentVolumeClaim:
claimName: nfs-pvc
##创建service
---
apiVersion: v1
kind: Service
metadata:
namespace: test
name: nginx-test
labels:
name: nginx-test
spec:
type: NodePort
ports:
- port: 80
protocol: TCP
targetPort: 80
name: http
nodePort: 30080
selector:
name: nginx-test
EOF
##创建资源
kubectl apply -f nfs-static-nginx-deployment.yaml -n test
##查看pv资源
kubectl get pv -n test --show-labels
##查看pvc资源
kubectl get pvc -n test --show-labels
##查看pod
$ kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
nginx-deployment-64d6f78cdf-8bw8t 1/1 Running 0 55s
nginx-deployment-64d6f78cdf-n5n4q 1/1 Running 0 55s
#可以看到,nginx应用已经部署成功。
#nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口
#切换到到nfs-server服务器上
echo "Test NFS Share discovery with nfs-static-nginx-deployment" > /data/nfs/nginx/index.html
#在浏览器上访问kubernetes主节点的 http://master:30080,就能访问到这个页面内容了
```
## 3、nginx多目录挂载
```
1、PV和PVC是一一对应关系,当有PV被某个PVC所占用时,会显示banding,其它PVC不能再使用绑定过的PV。
2、PVC一旦绑定PV,就相当于是一个存储卷,此时PVC可以被多个Pod所使用。(PVC支不支持被多个Pod访问,取决于访问模型accessMode的定义)。
3、PVC若没有找到合适的PV时,则会处于pending状态。
4、PV是属于集群级别的,不能定义在名称空间中。
5、PVC时属于名称空间级别的。
```
```bash
##清理资源
kubectl delete -f nfs-static-nginx-dp-many.yaml -n test
cat >nfs-static-nginx-dp-many.yaml<<\EOF
##创建namespace
---
apiVersion: v1
kind: Namespace
metadata:
name: test
labels:
name: test
##创建nginx-data-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nginx-data-pv
labels:
pv: nginx-data-pv
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略
nfs:
path: /data/nfs/nginx/
server: 10.198.1.155
##创建nginx-etc-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nginx-etc-pv
labels:
pv: nginx-etc-pv
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略
nfs:
path: /data/nfs/nginx/
server: 10.198.1.155
##创建pvc名字为nfs-nginx-data,存放数据
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-nginx-data
namespace: test
labels:
pvc: nfs-nginx-data
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50Gi
storageClassName: nfs
selector:
matchLabels:
pv: nginx-data-pv
##创建pvc名字为nfs-nginx-etc,存放配置文件
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-nginx-etc
namespace: test
labels:
pvc: nfs-nginx-etc
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50Gi
storageClassName: nfs
selector:
matchLabels:
pv: nginx-etc-pv
##部署应用nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
namespace: test
labels:
name: nginx-test
spec:
replicas: 2
selector:
matchLabels:
name: nginx-test
template:
metadata:
labels:
name: nginx-test
spec:
containers:
- name: nginx-test
image: docker.io/nginx
volumeMounts:
- mountPath: /usr/share/nginx/html
name: nginx-data
# - mountPath: /etc/nginx #--这里需要注意,如果是这么挂载,那么需要事先现在/data/nfs/nginx/目录下把nginx的完整配置提前拷贝好
# name: nginx-etc
ports:
- containerPort: 80
volumes:
- name: nginx-data
persistentVolumeClaim:
claimName: nfs-nginx-data
# - name: nginx-etc
# persistentVolumeClaim:
# claimName: nfs-nginx-etc
##创建service
---
apiVersion: v1
kind: Service
metadata:
namespace: test
name: nginx-test
labels:
name: nginx-test
spec:
type: NodePort
ports:
- port: 80
protocol: TCP
targetPort: 80
name: http
nodePort: 30080
selector:
name: nginx-test
EOF
##创建资源
kubectl apply -f nfs-static-nginx-dp-many.yaml -n test
##查看pv资源
kubectl get pv -n test --show-labels
##查看pvc资源
kubectl get pvc -n test --show-labels
##查看pod
$ kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
nginx-deployment-64d6f78cdf-8bw8t 1/1 Running 0 55s
nginx-deployment-64d6f78cdf-n5n4q 1/1 Running 0 55s
##进入容器
kubectl exec -it nginx-deployment-f687cdf47-xncj8 -n test /bin/bash
#可以看到,nginx应用已经部署成功。
#nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口
#切换到到nfs-server服务器上
echo "Test NFS Share discovery with nfs-static-nginx-dp-many" > /data/nfs/nginx/index.html
#在浏览器上访问kubernetes主节点的 http://master:30080,就能访问到这个页面内容了
```
## 4、参数namespace
```bash
##清理资源
export NAMESPACE="mos-namespace"
kubectl delete -f nfs-static-nginx-dp-many.yaml -n ${NAMESPACE}
cat >nfs-static-nginx-dp-many.yaml<<-EOF
##创建namespace
---
apiVersion: v1
kind: Namespace
metadata:
name: ${NAMESPACE}
labels:
name: ${NAMESPACE}
##创建nginx-data-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nginx-data-pv
labels:
pv: nginx-data-pv
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略
nfs:
path: /data/nfs/nginx/
server: 10.198.1.155
##创建nginx-log-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nginx-log-pv
labels:
pv: nginx-log-pv
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略
nfs:
path: /data/nfs/nginx/
server: 10.198.1.155
##创建pvc名字为nfs-nginx-data,存放数据
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-nginx-data
labels:
pvc: nfs-nginx-data
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50Gi
storageClassName: nfs
selector:
matchLabels:
pv: nginx-data-pv
##创建pvc名字为nfs-nginx-log,存放日志文件
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-nginx-log
labels:
pvc: nfs-nginx-log
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50Gi
storageClassName: nfs
selector:
matchLabels:
pv: nginx-log-pv
##部署应用nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
name: nginx-test
spec:
replicas: 2
selector:
matchLabels:
name: nginx-test
template:
metadata:
labels:
name: nginx-test
spec:
containers:
- name: nginx-test
image: docker.io/nginx
volumeMounts:
- mountPath: /usr/share/nginx/html
name: nginx-data
- mountPath: /var/log/nginx
name: nginx-log
ports:
- containerPort: 80
volumes:
- name: nginx-data
persistentVolumeClaim:
claimName: nfs-nginx-data
- name: nginx-log
persistentVolumeClaim:
claimName: nfs-nginx-log
##创建service
---
apiVersion: v1
kind: Service
metadata:
name: nginx-test
labels:
name: nginx-test
spec:
type: NodePort
ports:
- port: 80
protocol: TCP
targetPort: 80
name: http
nodePort: 30180
selector:
name: nginx-test
EOF
##创建资源
kubectl apply -f nfs-static-nginx-dp-many.yaml -n ${NAMESPACE}
```
# 二、nginx使用nfs动态PV
`https://github.com/Lancger/opsfull/blob/master/components/external-storage/3%E3%80%81%E5%8A%A8%E6%80%81%E7%94%B3%E8%AF%B7PV%E5%8D%B7.md`
## 1、动态nfs-dynamic-nginx.yaml
通过参数控制在哪个命名空间创建
```bash
##清理命名空间
kubectl delete ns k8s-public
##创建命名空间
kubectl create ns k8s-public
##清理资源
kubectl delete -f nfs-dynamic-nginx-deployment.yaml -n k8s-public
cat >nfs-dynamic-nginx-deployment.yaml<<\EOF
##动态申请nfs-dynamic-pvc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-dynamic-claim
spec:
storageClassName: nfs-storage #--需要与上面创建的storageclass的名称一致
accessModes:
- ReadWriteMany
resources:
requests:
storage: 90Gi
##部署应用nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
name: nginx-test
spec:
replicas: 3
selector:
matchLabels:
name: nginx-test
template:
metadata:
labels:
name: nginx-test
spec:
containers:
- name: nginx-test
image: docker.io/nginx
volumeMounts:
- mountPath: /usr/share/nginx/html
name: nginx-data
ports:
- containerPort: 80
volumes:
- name: nginx-data
persistentVolumeClaim:
claimName: nfs-dynamic-claim
##创建service
---
apiVersion: v1
kind: Service
metadata:
name: nginx-test
labels:
name: nginx-test
spec:
type: NodePort
ports:
- port: 80
protocol: TCP
targetPort: 80
name: http
nodePort: 30090
selector:
name: nginx-test
EOF
##创建资源
kubectl apply -f nfs-dynamic-nginx-deployment.yaml -n k8s-public
##查看pv资源
kubectl get pv -n k8s-public --show-labels
##查看pvc资源
kubectl get pvc -n k8s-public --show-labels
##查看pod
$ kubectl get pods -n k8s-public
NAME READY STATUS RESTARTS AGE
nginx-deployment-544f569478-5t8wm 1/1 Running 0 40s
nginx-deployment-544f569478-8gks5 1/1 Running 0 40s
nginx-deployment-544f569478-pw96x 1/1 Running 0 40s
#可以看到,nginx应用已经部署成功。
#nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口
#切换到到nfs-server服务器上
#注意动态的在这个目录,创建的目录命名方式为 “namespace名称-pvc名称-pv名称”
/data/nfs/kube-public-test-claim-pvc-ad304939-e75d-414f-81b5-7586ef17db6c
echo "Test NFS Share discovery with nfs-dynamic-nginx-deployment" > /data/nfs/kube-public-test-claim-pvc-ad304939-e75d-414f-81b5-7586ef17db6c/index.html
#在浏览器上访问kubernetes主节点的 http://master:30090,就能访问到这个页面内容了
```

参考文档:
https://kubernetes.io/zh/docs/tasks/run-application/run-stateless-application-deployment/
https://blog.51cto.com/ylw6006/2071845 在kubernetes集群中运行nginx
================================================
FILE: components/external-storage/README.md
================================================
PersistenVolume(PV):对存储资源创建和使用的抽象,使得存储作为集群中的资源管理
PV分为静态和动态,动态能够自动创建PV
PersistentVolumeClaim(PVC):让用户不需要关心具体的Volume实现细节
容器与PV、PVC之间的关系,可以如下图所示:

总的来说,PV是提供者,PVC是消费者,消费的过程就是绑定
# 问题一
pv挂载正常,pvc一直处于Pending状态
```bash
#在test的命名空间创建pvc
$ kubectl get pvc -n test
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-pvc Pending---这里发现一直处于Pending的状态 nfs-storage 10s
#查看日志
$ kubectl describe pvc nfs-pvc -n test
failed to provision volume with StorageClass "nfs-storage": claim Selector is not supported
#从日志中发现,问题出在标签匹配的地方
```
参考资料:
https://blog.csdn.net/qq_25611295/article/details/86065053 k8s pv与pvc持久化存储(静态与动态)
https://www.jianshu.com/p/5e565a8049fc kubernetes部署NFS持久存储(静态和动态)
================================================
FILE: components/heapster/README.md
================================================
# 一、问题现象
heapster: 已经被k8s给舍弃掉了
```bash
heapster logs这个报错啥情况
E0918 16:56:05.022867 1 manager.go:101] Error in scraping containers from kubelet_summary:10.10.188.242:10255: Get http://10.10.188.242:10255/stats/summary/: dial tcp 10.10.188.242:10255: getsockopt: connection refused
```
# 排查思路
```
1、排查下kubelet,10255是它暴露的端口
service kubelet status #看状态是正常的
#在10.10.188.242上执行
[root@localhost ~]# netstat -lnpt | grep 10255
tcp 0 0 10.10.188.240:10255 0.0.0.0:* LISTEN 9243/kubelet
看了下/var/log/pods/kube-system_heapster-5f848f54bc-rtbv4_abf53b7c-491f-472a-9e8b-815066a6ae3d/heapster下日志 所有的物理节点都是10255 拒绝连接
2、浏览器访问查看数据
10.10.188.242 是你节点的IP吧,正常的话浏览器访问http://IP:10255/stats/summary是有值的,你看下,如果没有那就是kubelet的配置出问题
```


================================================
FILE: components/ingress/0.通俗理解Kubernetes中Service、Ingress与Ingress Controller的作用与关系.md
================================================
# 一、通俗的讲:
- 1、Service 是后端真实服务的抽象,一个 Service 可以代表多个相同的后端服务
- 2、Ingress 是反向代理规则,用来规定 HTTP/S 请求应该被转发到哪个 Service 上,比如根据请求中不同的 Host 和 url 路径让请求落到不同的 Service 上
- 3、Ingress Controller 就是一个反向代理程序,它负责解析 Ingress 的反向代理规则,如果 Ingress 有增删改的变动,所有的 Ingress Controller 都会及时更新自己相应的转发规则,当 Ingress Controller 收到请求后就会根据这些规则将请求转发到对应的 Service
# 二、数据流向图
Kubernetes 并没有自带 Ingress Controller,它只是一种标准,具体实现有多种,需要自己单独安装,常用的是 Nginx Ingress Controller 和 Traefik Ingress Controller。 所以 Ingress 是一种转发规则的抽象,Ingress Controller 的实现需要根据这些 Ingress 规则来将请求转发到对应的 Service,我画了个图方便大家理解:

从图中可以看出,Ingress Controller 收到请求,匹配 Ingress 转发规则,匹配到了就转发到后端 Service,而 Service 可能代表的后端 Pod 有多个,选出一个转发到那个 Pod,最终由那个 Pod 处理请求。
# 三、Ingress Controller对外暴露方式
有同学可能会问,既然 Ingress Controller 要接受外面的请求,而 Ingress Controller 是部署在集群中的,怎么让 Ingress Controller 本身能够被外面访问到呢,有几种方式:
- 1、Ingress Controller 用 Deployment 方式部署,给它添加一个 Service,类型为 LoadBalancer,这样会自动生成一个 IP 地址,通过这个 IP 就能访问到了,并且一般这个 IP 是高可用的(前提是集群支持 LoadBalancer,通常云服务提供商才支持,自建集群一般没有)
- 2、使用集群内部的某个或某些节点作为边缘节点,给 node 添加 label 来标识,Ingress Controller 用 DaemonSet 方式部署,使用 nodeSelector 绑定到边缘节点,保证每个边缘节点启动一个 Ingress Controller 实例,用 hostPort 直接在这些边缘节点宿主机暴露端口,然后我们可以访问边缘节点中 Ingress Controller 暴露的端口,这样外部就可以访问到 Ingress Controller 了
- 3、Ingress Controller 用 Deployment 方式部署,给它添加一个 Service,类型为 NodePort,部署完成后查看会给出一个端口,通过 kubectl get svc 我们可以查看到这个端口,这个端口在集群的每个节点都可以访问,通过访问集群节点的这个端口就可以访问 Ingress Controller 了。但是集群节点这么多,而且端口又不是 80和443,太不爽了,一般我们会在前面自己搭个负载均衡器,比如用 Nginx,将请求转发到集群各个节点的那个端口上,这样我们访问 Nginx 就相当于访问到 Ingress Controller 了
一般比较推荐的是前面两种方式。
参考资料:
https://cloud.tencent.com/developer/article/1326535 通俗理解Kubernetes中Service、Ingress与Ingress Controller的作用与关系
================================================
FILE: components/ingress/1.kubernetes部署Ingress-nginx单点和高可用.md
================================================
# 一、Ingress-nginx简介
Pod的IP以及service IP只能在集群内访问,如果想在集群外访问kubernetes提供的服务,可以使用nodeport、proxy、loadbalacer以及ingress等方式,由于service的IP集群外不能访问,就是使用ingress方式再代理一次,即ingress代理service,service代理pod.
Ingress基本原理图如下:

# 二、部署nginx-ingress-controller
```bash
# github地址
https://github.com/kubernetes/ingress-nginx
https://kubernetes.github.io/ingress-nginx/
# 1、下载nginx-ingress-controller配置文件
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml
# 2、service-nodeport.yaml为ingress通过nodeport对外提供服务,注意默认nodeport暴露端口为随机,可以编辑该文件自定义端口
Using NodePort:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/baremetal/service-nodeport.yaml
# 3、查看ingress-nginx组件状态
root># kubectl get pod -n ingress-nginx
NAME READY STATUS RESTARTS AGE
nginx-ingress-controller-568867bf56-mbvm2 1/1 Running 0 4m46s
查看创建的ingress service暴露的端口:
root># kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx NodePort 10.97.243.123 80:30725/TCP,443:32314/TCP 5m46s
```
# 二、创建ingress-nginx后端服务
1.创建一个Service及后端Deployment(以nginx为例)
```
cat > deploy-demon.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
name: myapp
namespace: default
spec:
selector:
app: myapp
release: canary
ports:
- name: http
port: 80
targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-deploy
spec:
replicas: 2
selector:
matchLabels:
app: myapp
release: canary
template:
metadata:
labels:
app: myapp
release: canary
spec:
containers:
- name: myapp
image: ikubernetes/myapp:v2
ports:
- name: httpd
containerPort: 80
EOF
root># kubectl apply -f deploy-demon.yaml
root># kubectl get pods
root># kubectl get svc myapp
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
myapp ClusterIP 10.106.30.175 80/TCP 59s
# 通过ClusterIP方式内部测试访问Services
root># curl 10.106.30.175
Hello MyApp | Version: v2 | Pod Name
```
# 三、创建myapp的ingress规则
```
cat > ingress-myapp.yaml<<\EOF
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: ingress-myapp
namespace: default
annotations:
kubernets.io/ingress.class: "nginx"
spec:
rules:
- host: www.k8s-devops.com
http:
paths:
- path:
backend:
serviceName: myapp
servicePort: 80
EOF
root># kubectl apply -f ingress-myapp.yaml
root># kubectl get ingress
NAME HOSTS ADDRESS PORTS AGE
ingress-myapp www.k8s-devops.com 10.97.243.123 80 5s
# 通过Ingress方式内部测试访问域名
root># curl -x 10.97.243.123:80 http://www.k8s-devops.com
Hello MyApp | Version: v2 | Pod Name
```
# 四、查看ingress-default-backend的详细信息:
```
root># kubectl exec -n ingress-nginx -it nginx-ingress-controller-568867bf56-mbvm2 -- /bin/sh
$ cat nginx.conf
## start server www.k8s-devops.com
server {
server_name www.k8s-devops.com ;
listen 80 ;
listen 443 ssl http2 ;
set $proxy_upstream_name "-";
ssl_certificate_by_lua_block {
certificate.call()
}
location / {
set $namespace "default";
set $ingress_name "ingress-myapp";
set $service_name "myapp";
set $service_port "80";
set $location_path "/";
```
# 五、测试域名
```
1、这是nginx-ingress-controller采用的deployment部署的多副本
root># kubectl get deployment -A
ingress-nginx nginx-ingress-controller 6/6 6 6 65m (这里有6个副本)
root># kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx NodePort 10.97.243.123 80:30725/TCP,443:32314/TCP 69m
root># kubectl describe svc ingress-nginx -n ingress-nginx
Name: ingress-nginx
Namespace: ingress-nginx
Labels: app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app.kubernetes.io/name":"ingress-nginx","app.kubernetes.io/par...
Selector: app.kubernetes.io/name=ingress-nginx,app.kubernetes.io/part-of=ingress-nginx
Type: NodePort
IP: 10.97.243.123
Port: http 80/TCP
TargetPort: 80/TCP
NodePort: http 30725/TCP
Endpoints: 10.244.154.195:80,10.244.154.196:80,10.244.44.197:80 + 3 more... 这里转到6个pod
Port: https 443/TCP
TargetPort: 443/TCP
NodePort: https 32314/TCP
Endpoints: 10.244.154.195:443,10.244.154.196:443,10.244.44.197:443 + 3 more...
Session Affinity: None
External Traffic Policy: Cluster
Events:
root># kubectl get endpoints -n ingress-nginx
NAME ENDPOINTS AGE
ingress-nginx 10.244.154.195:80,10.244.154.196:80,10.244.44.197:80 + 9 more... 68m
Ingress Controller 用 Deployment 方式部署,给它添加一个 Service,类型为 NodePort,部署完成后查看会给出一个端口,通过 kubectl get svc 我们可以查看到这个端口,这个端口在集群的每个节点都可以访问,通过访问集群节点的这个端口就可以访问 Ingress Controller 了。但是集群节点这么多,而且端口又不是 80和443,太不爽了,一般我们会在前面自己搭个负载均衡器,比如用 Nginx,将请求转发到集群各个节点的那个端口上,这样我们访问 Nginx 就相当于访问到 Ingress Controller 了。
# 通过Nodeport方式测试(主机IP+端口)
curl 10.10.0.24:30725
curl 10.10.0.32:30725
curl 10.10.0.23:30725
curl 10.10.0.25:30725
curl 10.10.0.29:30725
curl 10.10.0.12:30725
2、通过Ingress IP 绑定域名测试
root># kubectl get ingress -A
NAMESPACE NAME HOSTS ADDRESS PORTS AGE
default ingress-myapp www.k8s-devops.com 10.97.243.123 80 45m
root># curl -x 10.97.243.123:80 http://www.k8s-devops.com
```
# 六、Ingress高可用
Ingress高可用,我们可以通过修改deployment的副本数来实现高可用,但是由于ingress承载着整个集群流量的接入,所以生产环境中,建议把ingress通过DaemonSet的方式部署集群中,而且该节点打上污点不允许业务pod进行调度,以避免业务应用与Ingress服务发生资源争抢。然后通过SLB把ingress节点主机添为后端服务器,进行流量转发。
1、修改为DaemonSet方式部署
```
wget -N https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml -O ingress-nginx-mandatory.yaml
1、类型的修改
sed -i 's/kind: Deployment/kind: DaemonSet/g' ingress-nginx-mandatory.yaml
sed -i 's/replicas:/#replicas:/g' ingress-nginx-mandatory.yaml
2、镜像的修改(可忽略)
#sed -i -e 's?quay.io?quay.azk8s.cn?g' -e 's?k8s.gcr.io?gcr.azk8s.cn/google-containers?g' ingress-nginx-mandatory.yaml
3、使pod共享宿主机网络,暴露所监听的端口以及让容器使用K8S的DNS
# spec.template.spec 下面
# serviceAccountName: nginx-ingress-serviceaccount 的前后,平级加上 hostNetwork: true 和 dnsPolicy: "ClusterFirstWithHostNet"
sed -i '/serviceAccountName: nginx-ingress-serviceaccount/a\ hostNetwork: true' ingress-nginx-mandatory.yaml
sed -i '/serviceAccountName: nginx-ingress-serviceaccount/a\ dnsPolicy: "ClusterFirstWithHostNet"' ingress-nginx-mandatory.yaml
4、节点打标签和污点
# 添加节点标签append to serviceAccountName
nodeSelector:
node-ingress: "true"
tolerations:
- key: "node-role.kubernetes.io/master"
operator: "Equal"
value: ""
effect: "NoSchedule"
sed -i '/serviceAccountName: nginx-ingress-serviceaccount/a\ nodeSelector:\n node-ingress: "true"' ingress-nginx-mandatory.yaml
修改参数如下:
kind: Deployment #修改为DaemonSet
replicas: 1 #注销此行,DaemonSet不需要此参数
hostNetwork: true #添加该字段让docker使用物理机网络,在物理机暴露服务端口(80),注意物理机80端口提前不能被占用
dnsPolicy: ClusterFirstWithHostNet #使用hostNetwork后容器会使用物理机网络包括DNS,会无法解析内部service,使用此参数让容器使用K8S的DNS
nodeSelector:node-ingress: "true" #添加节点标签
tolerations: 添加对指定节点容忍度
注意一点,因为我们创建的ingress-controller采用的时hostnetwork模式,所以无需在创建ingress-svc服务来把端口映射到节点主机上。
```
这里我在3台master节点部署(生产环境不要使用master节点,应该部署在独立的节点上),因为我们采用DaemonSet的方式,所以我们需要对3个节点打标签以及容忍度。
```
## 查看标签
root># kubectl get nodes --show-labels
## 给节点打标签
[root@k8s-master-01]# kubectl label nodes k8s-master-01 node-ingress="true"
[root@k8s-master-01]# kubectl label nodes k8s-master-02 node-ingress="true"
[root@k8s-master-01]# kubectl label nodes k8s-master-03 node-ingress="true"
## 节点打污点
### master节点我之前已经打过污点,如果你没有打污点,执行下面3条命令。此污点名称需要与yaml文件中pod的容忍污点对应
[root@k8s-master-01]# kubectl taint nodes k8s-master-01 node-role.kubernetes.io/master=:NoSchedule
[root@k8s-master-01]# kubectl taint nodes k8s-master-02 node-role.kubernetes.io/master=:NoSchedule
[root@k8s-master-01]# kubectl taint nodes k8s-master-03 node-role.kubernetes.io/master=:NoSchedule
```
2、最终配置文件DaemonSet版的Ingress
```
cat >ingress-nginx-mandatory.yaml<<\EOF
apiVersion: v1
kind: Namespace
metadata:
name: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
---
kind: ConfigMap
apiVersion: v1
metadata:
name: nginx-configuration
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
---
kind: ConfigMap
apiVersion: v1
metadata:
name: tcp-services
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
---
kind: ConfigMap
apiVersion: v1
metadata:
name: udp-services
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: nginx-ingress-serviceaccount
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: nginx-ingress-clusterrole
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
- "networking.k8s.io"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
- "networking.k8s.io"
resources:
- ingresses/status
verbs:
- update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: nginx-ingress-role
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
- namespaces
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
# Defaults to "-"
# Here: "-"
# This has to be adapted if you change either parameter
# when launching the nginx-ingress-controller.
- "ingress-controller-leader-nginx"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: nginx-ingress-role-nisa-binding
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: nginx-ingress-role
subjects:
- kind: ServiceAccount
name: nginx-ingress-serviceaccount
namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: nginx-ingress-clusterrole-nisa-binding
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: nginx-ingress-clusterrole
subjects:
- kind: ServiceAccount
name: nginx-ingress-serviceaccount
namespace: ingress-nginx
---
apiVersion: apps/v1
#kind: Deployment #修改为DaemonSet
kind: DaemonSet
metadata:
name: nginx-ingress-controller
namespace: ingress-nginx
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
spec:
#replicas: 1 #注销此行,DaemonSet不需要此参数
selector:
matchLabels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
template:
metadata:
labels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
annotations:
prometheus.io/port: "10254"
prometheus.io/scrape: "true"
spec:
# wait up to five minutes for the drain of connections
terminationGracePeriodSeconds: 300
serviceAccountName: nginx-ingress-serviceaccount
hostNetwork: true #添加该字段让docker使用物理机网络,在物理机暴露服务端口(80),注意物理机80端口提前不能被占用
dnsPolicy: ClusterFirstWithHostNet #使用hostNetwork后容器会使用物理机网络包括DNS,会无法解析内部service,使用此参数让容器使用K8S的DNS
nodeSelector:
kubernetes.io/os: linux
nodeSelector:
node-ingress: "true" #添加节点标签
tolerations: #添加对指定节点容忍度
- key: "node-role.kubernetes.io/master"
operator: "Equal"
value: ""
effect: "NoSchedule"
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
args:
- /nginx-ingress-controller
- --configmap=$(POD_NAMESPACE)/nginx-configuration
- --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
- --udp-services-configmap=$(POD_NAMESPACE)/udp-services
- --publish-service=$(POD_NAMESPACE)/ingress-nginx
- --annotations-prefix=nginx.ingress.kubernetes.io
securityContext:
allowPrivilegeEscalation: true
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
# www-data -> 33
runAsUser: 33
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- name: http
containerPort: 80
- name: https
containerPort: 443
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 10
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 10
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
---
EOF
kubectl apply -f ingress-nginx-mandatory.yaml
```
3、创建资源
```
[root@k8s-master01 ingress-master]# kubectl apply -f ingress-nginx-mandatory.yaml
## 查看资源分布情况
### 可以看到两个ingress-controller已经根据我们选择,部署在3个master节点上
[root@k8s-master01 ingress-master]# kubectl get pod -n ingress-nginx -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-ingress-controller-298dq 1/1 Running 0 134m 172.16.11.122 k8s-master03
nginx-ingress-controller-sh9h2 1/1 Running 0 134m 172.16.11.121 k8s-master02
```
4、测试
```
#配置集群外域名解析,当前测试环境我们使用windows hosts文件进行解析(针对于node节点有公网IP的类型)
92.168.92.56 www.k8s-devops.com
92.168.92.57 www.k8s-devops.com
92.168.92.58 www.k8s-devops.com
使用域名进行访问:
www.k8s-devops.com
```
参考资料:
https://www.cnblogs.com/tchua/p/11174386.html Kubernetes集群Ingress高可用部署
https://github.com/kubernetes/ingress-nginx/blob/04e2ad8fcd51b0741263a37b8e7424ca3979137c/docs/deploy/index.md 官网
https://blog.csdn.net/networken/article/details/85881558 kubernetes部署Ingress-nginx
https://www.jianshu.com/p/a8e18cef13b2 HA ingress-nginx: DaemonSet hostNetwork keepavlied
================================================
FILE: components/ingress/1.外部服务发现之Ingress介绍.md
================================================
# 一、ingress介绍
K8s集群对外暴露服务的方式目前只有三种:`LoadBlancer`、`NodePort`、`Ingress`。前两种熟悉起来比较快,而且使用起来也比较方便,在此就不进行介绍了。
Ingress其实就是从 kuberenets 集群外部访问集群的一个入口,将外部的请求转发到集群内不同的 Service 上,其实就相当于 nginx、haproxy 等负载均衡代理服务器,有的同学可能觉得我们直接使用 nginx 就实现了,但是只使用 nginx 这种方式有很大缺陷,每次有新服务加入的时候怎么改 Nginx 配置?不可能让我们去手动更改或者滚动更新前端的 Nginx Pod 吧?那我们再加上一个服务发现的工具比如 consul 如何?貌似是可以,对吧?而且在之前单独使用 docker 的时候,这种方式已经使用得很普遍了,Ingress 实际上就是这样实现的,只是服务发现的功能自己实现了,不需要使用第三方的服务了,然后再加上一个域名规则定义,路由信息的刷新需要一个靠 Ingress controller 来提供。
其中ingress controller目前主要有两种:基于`nginx`服务的ingress controller和基于`traefik`的ingress controller。而其中traefik的ingress controller,目前支持http和https协议
# 二、ingress的工作原理
## 1、ingress由两部分组成: ingress controller和ingress服务
Ingress controller 可以理解为一个监听器,通过不断地与 kube-apiserver 打交道,实时的感知后端 service、pod 的变化,当得到这些变化信息后,Ingress controller 再结合 Ingress 的配置,更新反向代理负载均衡器,达到服务发现的作用。其实这点和服务发现工具 consul consul-template 非常类似。
## 2、ingress具体的工作原理如下
ingress contronler通过与k8s的api进行交互,动态的去感知k8s集群中ingress服务规则的变化,然后读取它,并按照定义的ingress规则,转发到k8s集群中对应的service。而这个ingress规则写明了哪个域名对应k8s集群中的哪个service,然后再根据ingress-controller中的nginx配置模板,生成一段对应的nginx配置。然后再把该配置动态的写到ingress-controller的pod里,该ingress-controller的pod里面运行着一个nginx服务,控制器会把生成的nginx配置写入到nginx的配置文件中,然后reload一下,使其配置生效。以此来达到域名分配置及动态更新的效果。
# 三、Traefik
Traefik 是一款开源的反向代理与负载均衡工具。它最大的优点是能够与常见的微服务系统直接整合,可以实现自动化动态配置。目前支持 Docker、Swarm、Mesos/Marathon、 Mesos、Kubernetes、Consul、Etcd、Zookeeper、BoltDB、Rest API 等等后端模型。
要使用 traefik,我们同样需要部署 traefik 的 Pod,由于我们演示的集群中只有 master 节点有外网网卡,所以我们这里只有 master 这一个边缘节点,我们将 traefik 部署到该节点上即可。

- 1、 首先,为安全起见我们这里使用 RBAC 安全认证方式:([rbac.yaml](https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-rbac.yaml))
```
# vim rbac.yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- pods
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
```
- 2、直接在集群中创建即可:
```
$ kubectl create -f rbac.yaml
serviceaccount "traefik-ingress-controller" created
clusterrole.rbac.authorization.k8s.io "traefik-ingress-controller" created
clusterrolebinding.rbac.authorization.k8s.io "traefik-ingress-controller" created
```
- 3、然后使用 Deployment 来管理 traefik Pod,直接使用官方的 traefik 镜像部署即可([traefik.yaml](https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-deployment.yaml))
```
# vim traefik.yaml
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: master #默认master是不允许被调度的,加上tolerations后允许被调度,然后这里使用自身机器master的地址,可以使用kubectl get nodes --show-labels来查看
containers:
- image: traefik:v1.7
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
#hostPort: 80
- name: admin
containerPort: 8080
args:
- --api
- --kubernetes
- --logLevel=INFO
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
# 该端口为 traefik ingress-controller的服务端口
port: 80
name: web
# 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围
# 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问
nodePort: 23456
- protocol: TCP
# 该端口为 traefik 的管理WEB界面
port: 8080
name: admin
nodePort: 23457
type: NodePort
```
- 4、直接创建上面的资源对象即可:
```
$ kubectl create -f traefik.yaml
deployment.extensions "traefik-ingress-controller" created
service "traefik-ingress-service" created
```
- 5、要注意上面 yaml 文件:
```
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: master
由于我们这里的特殊性,只有 master 节点有外网访问权限,所以我们使用nodeSelector标签将traefik的固定调度到master这个节点上,那么上面的tolerations是干什么的呢?这个是因为我们集群使用的 kubeadm 安装的,master 节点默认是不能被普通应用调度的,要被调度的话就需要添加这里的 tolerations 属性,当然如果你的集群和我们的不太一样,直接去掉这里的调度策略就行。
nodeSelector 和 tolerations 都属于 Pod 的调度策略,在后面的课程中会为大家讲解。
```
- 6、traefik 还提供了一个 web ui 工具,就是上面的 8080 端口对应的服务,为了能够访问到该服务,我们这里将服务设置成的 NodePort:
```
$ kubectl get pods -n kube-system -l k8s-app=traefik-ingress-lb -o wide
NAME READY STATUS RESTARTS AGE IP NODE
traefik-ingress-controller-57c4f787d9-bfhnl 1/1 Running 0 8m 10.244.0.18 master
$ kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
...
traefik-ingress-service NodePort 10.102.183.112 80:23456/TCP,8080:23457/TCP 8m
...
```
现在在浏览器中输入 [http://master_node_ip:23457 例如 http://16.21.206.156:23457/dashboard/ 注意这里是使用的IP] 就可以访问到 traefik 的 dashboard 了
# 四、Ingress 对象
以上我们是通过 NodePort 来访问 traefik 的 Dashboard 的,那怎样通过 ingress 来访问呢? 首先,需要创建一个 ingress 对象:(ingress.yaml)
```
# vim ingress.yaml
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: traefik-ui.test.com
http:
paths:
- backend:
serviceName: traefik-ingress-service
#servicePort: 8080
servicePort: admin #跟上面service的name对应
```
然后为 traefik dashboard 创建对应的 ingress 对象:
```
$ kubectl create -f ingress.yaml
ingress.extensions "traefik-web-ui" created
```
要注意上面的 ingress 对象的规则,特别是 rules 区域,我们这里是要为 traefik 的 dashboard 建立一个 ingress 对象,所以这里的 serviceName 对应的是上面我们创建的 traefik-ingress-service,端口也要注意对应 8080 端口,为了避免端口更改,这里的 servicePort 的值也可以替换成上面定义的 port 的名字:admin
创建完成后,我们应该怎么来测试呢?
- 1、第一步,在本地的/etc/hosts里面添加上 traefik-ui.test.com 与 master 节点外网 IP 的映射关系
- 2、第二步,在浏览器中访问:http://traefik-ui.test.com 我们会发现并没有得到我们期望的 dashboard 界面,这是因为我们上面部署 traefik 的时候使用的是 NodePort 这种 Service 对象,所以我们只能通过上面的 23456 端口访问到我们的目标对象:[http://traefik-ui.test.com:23456](http://traefik-ui.test.com:23456) 加上端口后我们发现可以访问到 dashboard 了,而且在 dashboard 当中多了一条记录,正是上面我们创建的 ingress 对象的数据,我们还可以切换到 HEALTH 界面中,可以查看当前 traefik 代理的服务的整体的健康状态
注意这里为何是23456而不是23457,因为这里是通过ingress设置的域名来访问的,宿主机的23456端口对应宿主机上traefik-ingress-controller-nginx-pod容器的80端口,然后再经过ingress代理到service对应的pod节点上,如果traefik-ingress-controller-nginx-pod设置了宿主机端口映射,那么可以省略23456端口,下面会讲到hostPort: 80参数的使用,因为走了多层代理,所以直接Nodeport方式的性能会好一些,但是量一多,维护起来就比较麻烦)
- 3、第三步,上面我们可以通过自定义域名加上端口可以访问我们的服务了,但是我们平时服务别人的服务是不是都是直接用的域名啊,http 或者 https 的,几乎很少有在域名后面加上端口访问的吧?为什么?太麻烦啊,端口也记不住,要解决这个问题,怎么办,我们只需要把我们上面的 traefik 的核心应用的端口隐射到 master 节点上的 80 端口,是不是就可以了,因为 http 默认就是访问 80 端口,但是我们在 Service 里面是添加的一个 NodePort 类型的服务,没办法映射 80 端口,怎么办?这里就可以直接在 Pod 中指定一个 hostPort 即可,更改上面的 traefik.yaml 文件中的容器端口:
```
containers:
- image: traefik
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
hostPort: 80
- name: admin
containerPort: 8080
```
添加以后 hostPort: 80,然后更新应用:
```
$ kubectl apply -f traefik.yaml
```
更新完成后,这个时候我们在浏览器中直接使用域名方法测试下:
- 4、第四步,正常来说,我们如果有自己的域名,我们可以将我们的域名添加一条 DNS 记录,解析到 master 的外网 IP 上面,这样任何人都可以通过域名来访问我的暴露的服务了。如果你有多个边缘节点的话,可以在每个边缘节点上部署一个 ingress-controller 服务,然后在边缘节点前面挂一个负载均衡器,比如 nginx,将所有的边缘节点均作为这个负载均衡器的后端,这样就可以实现 ingress-controller 的高可用和负载均衡了。
到这里我们就通过 ingress 对象对外成功暴露了一个服务,下节课我们再来详细了解 traefik 的更多用法。
# 五、traefik 合并文件
1、创建文件 traefik-controller-ingress.yaml
```
vim traefik-controller-ingress.yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- pods
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: master #默认master是不允许被调度的,加上tolerations后允许被调度,然后这里使用自身机器master的地址,可以使用kubectl get nodes来查看
containers:
- image: traefik:v1.7
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
hostPort: 80
- name: admin
containerPort: 8080
args:
- --api
- --kubernetes
- --logLevel=INFO
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
# 该端口为 traefik ingress-controller的服务端口
port: 80
name: web
# 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围
# 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问
nodePort: 23456
- protocol: TCP
# 该端口为 traefik 的管理WEB界面
port: 8080
name: admin
nodePort: 23457
type: NodePort
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: traefik-ui.test.com
http:
paths:
- backend:
serviceName: traefik-ingress-service
#servicePort: 8080
servicePort: admin #跟上面service的name对应
```
2、更新应用
```
$ kubectl apply -f traefik-controller-ingress.yaml
```
3、访问测试
```
http://traefik-ui.test.com 绑定master的公网IP或者VIP
```
https://blog.csdn.net/oyym_mv/article/details/86986510 Kubernetes实录(11) kubernetes使用traefik作为反向代理(Deamonset模式)
================================================
FILE: components/ingress/2.ingress tls配置.md
================================================
# 1、Ingress tls
上节课给大家展示了 traefik 的安装使用以及简单的 ingress 的配置方法,这节课我们来学习一下 ingress tls 以及 path 路径在 ingress 对象中的使用方法。
# 2、TLS 认证
在现在大部分场景下面我们都会使用 https 来访问我们的服务,这节课我们将使用一个自签名的证书,当然你有在一些正规机构购买的 CA 证书是最好的,这样任何人访问你的服务的时候都是受浏览器信任的证书。使用下面的 openssl 命令生成 CA 证书:
```
mkdir -p /ssl/
cd /ssl/
openssl req -newkey rsa:2048 -nodes -keyout tls.key -x509 -days 365 -out tls.crt
```
现在我们有了证书,我们可以使用 kubectl 创建一个 secret 对象来存储上面的证书:
```
kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system
```
# 3、配置 Traefik
前面我们使用的是 Traefik 的默认配置,现在我们来配置 Traefik,让其支持 https:
```
mkdir -p /config/
cd /config/
cat > traefik.toml <<\EOF
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.http.redirect]
entryPoint = "https"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/tls.crt"
KeyFile = "/ssl/tls.key"
EOF
上面的配置文件中我们配置了 http 和 https 两个入口,并且配置了将 http 服务强制跳转到 https 服务,这样我们所有通过 traefik 进来的服务都是 https 的,要访问 https 服务,当然就得配置对应的证书了,可以看到我们指定了 CertFile 和 KeyFile 两个文件,由于 traefik pod 中并没有这两个证书,所以我们要想办法将上面生成的证书挂载到 Pod 中去,是不是前面我们讲解过 secret 对象可以通过 volume 形式挂载到 Pod 中?至于上面的 traefik.toml 这个文件我们要怎么让 traefik pod 能够访问到呢?还记得我们前面讲过的 ConfigMap 吗?我们是不是可以将上面的 traefik.toml 配置文件通过一个 ConfigMap 对象挂载到 traefik pod 中去:
kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system
root># kubectl get configmap -n kube-system
NAME DATA AGE
coredns 1 11h
extension-apiserver-authentication 6 11h
kube-flannel-cfg 2 11h
kube-proxy 2 11h
kubeadm-config 2 11h
kubelet-config-1.15 1 11h
traefik-conf 1 10s
现在就可以更改下上节课的 traefik pod 的 yaml 文件了:
cd /data/components/ingress/
cat > traefik.yaml <<\EOF
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
volumes:
- name: ssl
secret:
secretName: traefik-cert
- name: config
configMap:
name: traefik-conf
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: linux-node1.example.com
containers:
- image: traefik
name: traefik-ingress-lb
volumeMounts:
- mountPath: "/ssl" #这里注意挂载的路径
name: "ssl"
- mountPath: "/config" #这里注意挂载的路径
name: "config"
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
- name: admin
containerPort: 8080
args:
- --configfile=/config/traefik.toml
- --api
- --kubernetes
- --logLevel=INFO
EOF
和之前的比较,我们增加了 443 的端口配置,以及启动参数中通过 configfile 指定了 traefik.toml 配置文件,这个配置文件是通过 volume 挂载进来的。然后更新下 traefik pod:
kubectl apply -f traefik.yaml
kubectl logs -f traefik-ingress-controller-7dcfd9c6df-v58k7 -n kube-system
更新完成后我们查看 traefik pod 的日志,如果出现类似于上面的一些日志信息,证明更新成功了。现在我们去访问 traefik 的 dashboard 会跳转到 https 的地址,并会提示证书相关的报警信息,这是因为我们的证书是我们自建的,并不受浏览器信任,如果你是正规机构购买的证书并不会出现改报警信息,你应该可以看到我们常见的绿色标志:
https://traefik.k8s.com/dashboard/
```
# 4、配置 ingress
其实上面的 TLS 认证方式已经成功了,接下来我们通过一个实例来说明下 ingress 中 path 的用法,这里我们部署了3个简单的 web 服务,通过一个环境变量来标识当前运行的是哪个服务:(backend.yaml)
```
cd /data/components/ingress/
cat > backend.yaml <<\EOF
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: svc1
spec:
replicas: 1
template:
metadata:
labels:
app: svc1
spec:
containers:
- name: svc1
image: cnych/example-web-service
env:
- name: APP_SVC
value: svc1
ports:
- containerPort: 8080
protocol: TCP
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: svc2
spec:
replicas: 1
template:
metadata:
labels:
app: svc2
spec:
containers:
- name: svc2
image: cnych/example-web-service
env:
- name: APP_SVC
value: svc2
ports:
- containerPort: 8080
protocol: TCP
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: svc3
spec:
replicas: 1
template:
metadata:
labels:
app: svc3
spec:
containers:
- name: svc3
image: cnych/example-web-service
env:
- name: APP_SVC
value: svc3
ports:
- containerPort: 8080
protocol: TCP
---
kind: Service
apiVersion: v1
metadata:
labels:
app: svc1
name: svc1
spec:
type: ClusterIP
ports:
- port: 8080
name: http
selector:
app: svc1
---
kind: Service
apiVersion: v1
metadata:
labels:
app: svc2
name: svc2
spec:
type: ClusterIP
ports:
- port: 8080
name: http
selector:
app: svc2
---
kind: Service
apiVersion: v1
metadata:
labels:
app: svc3
name: svc3
spec:
type: ClusterIP
ports:
- port: 8080
name: http
selector:
app: svc3
EOF
可以看到上面我们定义了3个 Deployment,分别对应3个 Service:
kubectl create -f backend.yaml
然后我们创建一个 ingress 对象来访问上面的3个服务:(example-ingress.yaml)
cat > example-ingress.yaml <<\EOF
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: example-web-app
annotations:
kubernetes.io/ingress.class: "traefik"
spec:
rules:
- host: example.k8s.com
http:
paths:
- path: /s1
backend:
serviceName: svc1
servicePort: 8080
- path: /s2
backend:
serviceName: svc2
servicePort: 8080
- path: /
backend:
serviceName: svc3
servicePort: 8080
EOF
注意我们这里定义的 ingress 对象和之前有一个不同的地方是我们增加了 path 路径的定义,不指定的话默认是 '/',创建该 ingress 对象:
kubectl create -f example-ingress.yaml
现在我们可以在本地 hosts 里面给域名 example.k8s.com 添加对应的 hosts 解析,然后就可以在浏览器中访问,可以看到默认也会跳转到 https 的页面:
```
参考文档:
https://www.qikqiak.com/k8s-book/docs/41.ingress%20config.html
================================================
FILE: components/ingress/3.ingress-http使用示例.md
================================================
# 一、ingress-http测试示例
## 1、关键三个点:
注意这3个资源的namespace: kube-system需要一致
Deployment
Service
Ingress
```
$ vim nginx-deployment-http.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
namespace: kube-system
spec:
replicas: 2
template:
metadata:
labels:
app: nginx-pod
spec:
containers:
- name: nginx
image: nginx:1.15.5
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: kube-system
annotations:
traefik.ingress.kubernetes.io/load-balancer-method: drr #动态加权轮训调度
spec:
template:
metadata:
labels:
name: nginx-service
spec:
selector:
app: nginx-pod
ports:
- port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: k8s.nginx.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
```
## 2、创建资源
```
$ kubectl apply -f nginx-deployment-http.yaml
deployment.apps/nginx-pod create
service/nginx-service create
ingress.extensions/nginx-ingress create
```
## 3、访问刚创建的资源
首先这里需要先找到traefik-ingress pod 分布到到了那个节点,这里我们发现是落在了10.199.1.159的节点,然后我们绑定该节点对应的公网IP,这里假设为16.21.26.139
```
16.21.26.139 k8s.nginx.com
```
```
$ kubectl get pod -A -o wide|grep traefik-ingress
kube-system traefik-ingress-controller-7d454d7c68-8qpjq 1/1 Running 0 21h 10.46.2.10 10.199.1.159
```

## 4、清理资源
### 1、清理deployment
```
# 获取deployment
$ kubectl get deploy -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system coredns 2/2 2 2 3d
kube-system heapster 1/1 1 1 3d
kube-system kubernetes-dashboard 1/1 1 1 3d
kube-system metrics-server 1/1 1 1 3d
kube-system nginx-pod 2/2 2 2 25m
kube-system traefik-ingress-controller 1/1 1 1 2d22h
# 清理deployment
$ kubectl delete deploy nginx-pod -n kube-system
deployment.extensions "nginx-pod" deleted
```
### 2、清理service
```
# 获取svc
$ kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.44.0.1 443/TCP 3d
kube-system heapster ClusterIP 10.44.158.46 80/TCP 3d
kube-system kube-dns ClusterIP 10.44.0.2 53/UDP,53/TCP,9153/TCP 3d
kube-system kubernetes-dashboard NodePort 10.44.176.99 443:27008/TCP 3d
kube-system metrics-server ClusterIP 10.44.40.157 443/TCP 3d
kube-system nginx-service ClusterIP 10.44.148.252 80/TCP 28m
kube-system traefik-ingress-service NodePort 10.44.67.195 80:23456/TCP,443:23457/TCP,8080:33192/TCP 2d22h
# 清理svc
$ kubectl delete svc nginx-service -n kube-system
service "nginx-service" deleted
```
### 3、清理ingress
```
# 获取ingress
$ kubectl get ingress -A
NAMESPACE NAME HOSTS ADDRESS PORTS AGE
kube-system kubernetes-dashboard dashboard.test.com 80 2d22h
kube-system nginx-ingress k8s.nginx.com 80 29m
kube-system traefik-web-ui traefik-ui.test.com 80 2d22h
# 清理ingress
$ kubectl delete ingress nginx-ingress -n kube-system
ingress.extensions "nginx-ingress" deleted
```
参考资料:
https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress使用
================================================
FILE: components/ingress/4.ingress-https使用示例.md
================================================
# 一、ingress-https测试示例
1、TLS 认证
在现在大部分场景下面我们都会使用 https 来访问我们的服务,这节课我们将使用一个自签名的证书,当然你有在一些正规机构购买的 CA 证书是最好的,这样任何人访问你的服务的时候都是受浏览器信任的证书。使用下面的 openssl 命令生成 CA 证书:
```
mkdir -p /ssl-k8s/
cd /ssl-k8s/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_k8s.key -out tls_k8s.crt -subj "/CN=hello.k8s.com"
```
现在我们有了证书,我们可以使用 kubectl 创建一个 secret 对象来存储上面的证书:(这个需手动执行创建好)
```
kubectl create secret generic traefik-k8s --from-file=tls_k8s.crt --from-file=tls_k8s.key -n kube-system
```
```
# vim /config/traefik.toml
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/tls_first.crt"
KeyFile = "/ssl/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/tls_second.crt"
KeyFile = "/ssl/tls_second.key"
```
## 1、关键五个点:
注意这5个资源的namespace: kube-system需要一致
secret ---secret 对象来存储ssl证书
configmap ---configmap 用来保存一个或多个key/value信息
Deployment
Service
Ingress
## 2、合并创建secret,configmap以及traefik文件
```
# vim traefik-controller-https.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: traefik-conf
namespace: kube-system
data:
traefik.toml: |
insecureSkipVerify = true
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/tls_first.crt"
KeyFile = "/ssl/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/tls_second.crt"
KeyFile = "/ssl/tls_second.key"
---
kind: Deployment
apiVersion: apps/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
volumes:
- name: ssl
secret:
secretName: traefik-cert
- name: config
configMap:
name: traefik-conf
#nodeSelector:
# node-role.kubernetes.io/traefik: "true"
containers:
- image: traefik:v1.7.12
imagePullPolicy: IfNotPresent
name: traefik-ingress-lb
volumeMounts:
- mountPath: "/ssl"
name: "ssl"
- mountPath: "/config"
name: "config"
resources:
limits:
cpu: 1000m
memory: 800Mi
requests:
cpu: 500m
memory: 600Mi
args:
- --configfile=/config/traefik.toml
- --api
- --kubernetes
- --logLevel=INFO
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
# 该端口为 traefik ingress-controller的服务端口
port: 80
# 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围
# 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问
nodePort: 23456
name: http
- protocol: TCP
#
port: 443
nodePort: 23457
name: https
- protocol: TCP
# traefik admin web UI port
port: 8080
name: admin
type: NodePort
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- pods
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
```
# 2. Application test example
```
$ vim nginx-deployment-https.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
namespace: kube-system
spec:
replicas: 2
template:
metadata:
labels:
app: nginx-pod
spec:
containers:
- name: nginx
image: nginx:1.15.5
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: kube-system
annotations:
traefik.ingress.kubernetes.io/load-balancer-method: drr #dynamic weighted round-robin scheduling
spec:
template:
metadata:
labels:
name: nginx-service
spec:
selector:
app: nginx-pod
ports:
- port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: k8s.nginx.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
tls:
- secretName: traefik-k8s
```
## 2. Create the resources
```
$ kubectl apply -f nginx-deployment-https.yaml
deployment.apps/nginx-deployment created
service/nginx-service created
ingress.extensions/nginx-ingress created
```
## 3. Access the newly created resources
First find which node the traefik-ingress pod landed on. Here it is on 10.199.1.159, so we bind that node's public IP, assumed to be 16.21.26.139, in our hosts file:
```
16.21.26.139 k8s.nginx.com
```
```
$ kubectl get pod -A -o wide|grep traefik-ingress
kube-system traefik-ingress-controller-7d454d7c68-8qpjq 1/1 Running 0 21h 10.46.2.10 10.199.1.159
```
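With the hosts entry in place, the rule can also be exercised straight from the command line (a sketch; -k is needed because the certificate is self-signed):
```bash
# expect an HTTP 200 from the nginx backend behind traefik
curl -kI --resolve k8s.nginx.com:443:16.21.26.139 https://k8s.nginx.com/
```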

## 4. Clean up
### 1. Delete the deployment
```
# list deployments
$ kubectl get deploy -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system coredns 2/2 2 2 3d
kube-system heapster 1/1 1 1 3d
kube-system kubernetes-dashboard 1/1 1 1 3d
kube-system metrics-server 1/1 1 1 3d
kube-system nginx-deployment 2/2 2 2 25m
kube-system traefik-ingress-controller 1/1 1 1 2d22h
# delete the deployment
$ kubectl delete deploy nginx-deployment -n kube-system
deployment.extensions "nginx-deployment" deleted
```
### 2. Delete the service
```
# list services
$ kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.44.0.1 443/TCP 3d
kube-system heapster ClusterIP 10.44.158.46 80/TCP 3d
kube-system kube-dns ClusterIP 10.44.0.2 53/UDP,53/TCP,9153/TCP 3d
kube-system kubernetes-dashboard NodePort 10.44.176.99 443:27008/TCP 3d
kube-system metrics-server ClusterIP 10.44.40.157 443/TCP 3d
kube-system nginx-service ClusterIP 10.44.148.252 80/TCP 28m
kube-system traefik-ingress-service NodePort 10.44.67.195 80:23456/TCP,443:23457/TCP,8080:33192/TCP 2d22h
# delete the svc
$ kubectl delete svc nginx-service -n kube-system
service "nginx-service" deleted
```
### 3. Delete the ingress
```
# list ingresses
$ kubectl get ingress -A
NAMESPACE NAME HOSTS ADDRESS PORTS AGE
kube-system kubernetes-dashboard dashboard.test.com 80 2d22h
kube-system nginx-ingress k8s.nginx.com 80 29m
kube-system traefik-web-ui traefik-ui.test.com 80 2d22h
# delete the ingress
$ kubectl delete ingress nginx-ingress -n kube-system
ingress.extensions "nginx-ingress" deleted
```
References:
https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress usage
http://docs.kubernetes.org.cn/558.html
================================================
FILE: components/ingress/5.hello-tls.md
================================================
# Certificate files
1. Generate the certificates
```
mkdir -p /ssl/{default,first,second}
cd /ssl/default/
openssl req -x509 -nodes -days 165 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=k8s.test.com"
kubectl -n kube-system create secret tls traefik-cert --key=tls.key --cert=tls.crt
cd /ssl/first/
openssl req -x509 -nodes -days 265 -newkey rsa:2048 -keyout tls_first.key -out tls_first.crt -subj "/CN=k8s.first.com"
kubectl create secret generic first-k8s --from-file=tls_first.crt --from-file=tls_first.key -n kube-system
cd /ssl/second/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_second.key -out tls_second.crt -subj "/CN=k8s.second.com"
kubectl create secret generic second-k8s --from-file=tls_second.crt --from-file=tls_second.key -n kube-system
# inspect the secrets
kubectl get secret traefik-cert first-k8s second-k8s -n kube-system
kubectl describe secret traefik-cert first-k8s second-k8s -n kube-system
```
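Before wiring the secrets into traefik, the generated certificates can be sanity-checked locally (a sketch, assuming the paths above):
```bash
# print the subject and validity window of each self-signed cert
openssl x509 -in /ssl/default/tls.crt -noout -subject -dates
openssl x509 -in /ssl/first/tls_first.crt -noout -subject -dates
openssl x509 -in /ssl/second/tls_second.crt -noout -subject -dates
```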
2. Delete the certificates
```
$ kubectl delete secret traefik-cert first-k8s second-k8s -n kube-system
secret "second-k8s" deleted
secret "traefik-cert" deleted
secret "first-k8s" deleted
```
# Certificate configuration
1. Create the configMap (cm)
```
mkdir -p /config/
cd /config/
$ vim traefik.toml
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/default/tls.crt"
KeyFile = "/ssl/default/tls.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/first/tls_first.crt"
KeyFile = "/ssl/first/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/second/tls_second.crt"
KeyFile = "/ssl/second/tls_second.key"
$ kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system
$ kubectl get configmap traefik-conf -n kube-system
$ kubectl describe cm traefik-conf -n kube-system
```
2. Delete the configMap (cm)
```
$ kubectl delete cm traefik-conf -n kube-system
```
# traefik-ingress-controller manifest
1. Create the file
```
$ vim traefik-controller-tls.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: traefik-conf
namespace: kube-system
data:
traefik.toml: |
insecureSkipVerify = true
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/default/tls.crt"
KeyFile = "/ssl/default/tls.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/first/tls_first.crt"
KeyFile = "/ssl/first/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/second/tls_second.crt"
KeyFile = "/ssl/second/tls_second.key"
---
kind: Deployment
apiVersion: apps/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
volumes:
- name: ssl
secret:
secretName: traefik-cert
- name: config
configMap:
name: traefik-conf
#nodeSelector:
# node-role.kubernetes.io/traefik: "true"
containers:
- image: traefik:v1.7.12
imagePullPolicy: IfNotPresent
name: traefik-ingress-lb
volumeMounts:
- mountPath: "/ssl"
name: "ssl"
- mountPath: "/config"
name: "config"
resources:
limits:
cpu: 1000m
memory: 800Mi
requests:
cpu: 500m
memory: 600Mi
args:
- --configfile=/config/traefik.toml
- --api
- --kubernetes
- --logLevel=INFO
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
# service port of the traefik ingress-controller
port: 80
# NODE_PORT_RANGE configured in the cluster hosts file defines the usable NodePort range
# pick a free port in the default 20000~40000 range to expose the ingress-controller externally
nodePort: 23456
name: http
- protocol: TCP
#
port: 443
nodePort: 23457
name: https
- protocol: TCP
# traefik admin web UI port
port: 8080
name: admin
type: NodePort
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- pods
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
```
2. Apply
```
$ kubectl apply -f traefik-controller-tls.yaml
configmap/traefik-conf created
deployment.apps/traefik-ingress-controller created
service/traefik-ingress-service created
clusterrole.rbac.authorization.k8s.io/traefik-ingress-controller created
clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created
serviceaccount/traefik-ingress-controller created
# delete the resources
$ kubectl delete -f traefik-controller-tls.yaml
```
# Test deployment and ingress
```
$ vim nginx-ingress-deploy.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
namespace: kube-system
spec:
replicas: 2
template:
metadata:
labels:
app: nginx-pod
spec:
containers:
- name: nginx
image: nginx:1.15.5
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: kube-system
annotations:
traefik.ingress.kubernetes.io/load-balancer-method: drr #dynamic weighted round-robin scheduling
spec:
template:
metadata:
labels:
name: nginx-service
spec:
selector:
app: nginx-pod
ports:
- port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
tls:
- secretName: first-k8s
- secretName: second-k8s
rules:
- host: k8s.first.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
- host: k8s.second.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
$ kubectl apply -f nginx-ingress-deploy.yaml
$ kubectl delete -f nginx-ingress-deploy.yaml
```
================================================
FILE: components/ingress/6.ingress-https使用示例.md
================================================
# ingress-https test example
# 1. Certificate files
## 1. TLS certificates
Most services today are accessed over https. In this section we will use a self-signed certificate; a certificate bought from a proper CA is better, of course, since every visitor's browser will then trust your service. Generate the certificate with the following openssl command:
```
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=hello.test.com"
```
Now that we have the certificate, we can use kubectl to create a secret object that stores it (this must be created manually):
```
kubectl -n kube-system create secret tls traefik-cert --key=tls.key --cert=tls.crt
```
## 2. Create multiple certificates
```
mkdir -p /ssl/{default,first,second}
cd /ssl/default/
openssl req -x509 -nodes -days 165 -newkey rsa:2048 -keyout tls_default.key -out tls_default.crt -subj "/CN=k8s.test.com"
kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt
cd /ssl/first/
openssl req -x509 -nodes -days 265 -newkey rsa:2048 -keyout tls_first.key -out tls_first.crt -subj "/CN=k8s.first.com"
kubectl -n kube-system create secret tls first-k8s --key=tls_first.key --cert=tls_first.crt
cd /ssl/second/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_second.key -out tls_second.crt -subj "/CN=k8s.second.com"
kubectl -n kube-system create secret tls second-k8s --key=tls_second.key --cert=tls_second.crt
# inspect the secrets
kubectl get secret traefik-cert first-k8s second-k8s -n kube-system
kubectl describe secret traefik-cert first-k8s second-k8s -n kube-system
```
## 3. Delete the certificates
```
$ kubectl delete secret traefik-cert first-k8s second-k8s -n kube-system
secret "second-k8s" deleted
secret "traefik-cert" deleted
secret "first-k8s" deleted
```
## 4. Five key points
```
Note: these 5 resources must all live in the same namespace: kube-system
secret ---the secret object stores the ssl certificate
configmap ---the configmap holds one or more key/value entries
Deployment
Service
Ingress
```
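A quick way to confirm the five pieces exist in the same namespace (a sketch, assuming the resource names used in this file):
```bash
# secrets, configmap, and the traefik/nginx workload objects must all be in kube-system
kubectl -n kube-system get secret traefik-cert first-k8s second-k8s
kubectl -n kube-system get configmap traefik-conf
kubectl -n kube-system get deploy,svc,ingress | grep -E 'traefik|nginx'
```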
# 2. Certificate configuration
## 1. Create the configMap (cm)
```
mkdir -p /config/
cd /config/
$ vim traefik.toml
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/default/tls_default.crt"
KeyFile = "/ssl/default/tls_default.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/first/tls_first.crt"
KeyFile = "/ssl/first/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/second/tls_second.crt"
KeyFile = "/ssl/second/tls_second.key"
$ kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system
$ kubectl get configmap traefik-conf -n kube-system
$ kubectl describe cm traefik-conf -n kube-system
```
## 2. Delete the configMap (cm)
```
$ kubectl delete cm traefik-conf -n kube-system
```
# 3. traefik-ingress-controller manifest
## 1. Create the file
```
$ cd /config/
$ vim traefik-controller-tls.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
name: traefik-conf
namespace: kube-system
data:
traefik.toml: |
insecureSkipVerify = true
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/default/tls.crt"
KeyFile = "/ssl/default/tls.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/first/tls_first.crt"
KeyFile = "/ssl/first/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/ssl/second/tls_second.crt"
KeyFile = "/ssl/second/tls_second.key"
---
kind: Deployment
apiVersion: apps/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
volumes:
- name: ssl
secret:
secretName: traefik-cert
- name: config
configMap:
name: traefik-conf
#nodeSelector:
# node-role.kubernetes.io/traefik: "true"
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: 10.198.1.156 #pin the traefik-ingress-controller to this node
containers:
- image: traefik:v1.7.12
imagePullPolicy: IfNotPresent
name: traefik-ingress-lb
volumeMounts:
- mountPath: "/ssl"
name: "ssl"
- mountPath: "/config"
name: "config"
resources:
limits:
cpu: 1000m
memory: 800Mi
requests:
cpu: 500m
memory: 600Mi
args:
- --configfile=/config/traefik.toml
- --api
- --kubernetes
- --logLevel=INFO
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
# service port of the traefik ingress-controller
port: 80
# NODE_PORT_RANGE configured in the cluster hosts file defines the usable NodePort range
# pick a free port in the default 20000~40000 range to expose the ingress-controller externally
nodePort: 23456
name: http
- protocol: TCP
port: 443
nodePort: 23457
name: https
- protocol: TCP
# traefik admin web UI port
port: 8080
name: admin
type: NodePort
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- pods
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
```
## 2. Apply
```
$ kubectl apply -f traefik-controller-tls.yaml
configmap/traefik-conf created
deployment.apps/traefik-ingress-controller created
service/traefik-ingress-service created
clusterrole.rbac.authorization.k8s.io/traefik-ingress-controller created
clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created
serviceaccount/traefik-ingress-controller created
# delete the resources
$ kubectl delete -f traefik-controller-tls.yaml
```
# 4. Creating an https ingress from the command line
```
# create a sample app
$ kubectl run test-hello --image=nginx:alpine --port=80 --expose -n kube-system
# delete the sample app (kubectl run creates a deployment by default)
$ kubectl delete deployment test-hello -n kube-system
$ kubectl delete svc test-hello -n kube-system
# hello-tls-ingress example
$ cd /config/
$ vim hello-tls.ing.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: hello-tls-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: k8s.test.com
http:
paths:
- backend:
serviceName: test-hello
servicePort: 80
tls:
- secretName: traefik-cert
# create the https ingress
$ kubectl apply -f /config/hello-tls.ing.yaml
# note: the hello example needs the matching secret traefik-cert in kube-system (already created at the start, no need to recreate)
$ kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt
# delete the https ingress
$ kubectl delete -f /config/hello-tls.ing.yaml
```
# Access test (find which node the traefik-controller pod runs on, bind that node's IP in your hosts file, then open the url)
https://k8s.test.com:23457
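The same check works from the command line without editing /etc/hosts (a sketch; <node-ip> stands for the node found above):
```bash
# -k accepts the self-signed cert; --resolve pins the host to the node IP
curl -kI --resolve k8s.test.com:23457:<node-ip> https://k8s.test.com:23457/
```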

# 5. Test deployment and ingress
```
$ vim nginx-ingress-deploy.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
namespace: kube-system
spec:
replicas: 2
template:
metadata:
labels:
app: nginx-pod
spec:
containers:
- name: nginx
image: nginx:1.15.5
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: kube-system
annotations:
traefik.ingress.kubernetes.io/load-balancer-method: drr #dynamic weighted round-robin scheduling
spec:
template:
metadata:
labels:
name: nginx-service
spec:
selector:
app: nginx-pod
ports:
- port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
tls:
- secretName: first-k8s
- secretName: second-k8s
rules:
- host: k8s.first.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
- host: k8s.second.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
$ kubectl apply -f nginx-ingress-deploy.yaml
$ kubectl delete -f nginx-ingress-deploy.yaml
```
# Access test
https://k8s.first.com:23457/

https://k8s.second.com:23457/

References:
https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress usage
http://docs.kubernetes.org.cn/558.html
================================================
FILE: components/ingress/README.md
================================================
References:
https://segmentfault.com/a/1190000019908991 k8s ingress principles and ingress-nginx deployment testing
https://www.cnblogs.com/tchua/p/11174386.html Highly available Ingress deployment on a Kubernetes cluster
================================================
FILE: components/ingress/nginx-ingress/README.md
================================================
================================================
FILE: components/ingress/traefik-ingress/1.traefik反向代理Deamonset模式.md
================================================
# 1. Deploying traefik-ingress-controller in DaemonSet mode
https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-ds.yaml
The DaemonSet deployment here uses only traefik-ds.yaml, traefik-rbac.yaml and ui.yaml.
```bash
kubectl delete -f traefik-ds.yaml
rm -f ./traefik-ds.yaml
cat >traefik-ds.yaml<<\EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
#======= nodeSelector: create only on master nodes =======
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/role: master #masters are unschedulable by default; the tolerations above make them schedulable. Use your own master's labels here; check with kubectl get nodes --show-labels
#===================================================
containers:
- image: traefik:v1.7
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
hostPort: 80
- name: admin
containerPort: 8080
hostPort: 8080
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
args:
- --api
- --kubernetes
- --logLevel=INFO
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
port: 80
name: web
- protocol: TCP
port: 8080
name: admin
EOF
kubectl apply -f traefik-ds.yaml
```
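If the toleration and selector are correct, the controller pods land on the masters (a quick check, assuming the labels above):
```bash
# the DaemonSet should report one ready pod per matching master node
kubectl -n kube-system get ds traefik-ingress-controller
kubectl -n kube-system get pods -l k8s-app=traefik-ingress-lb -o wide
```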
# 2. traefik-rbac configuration
https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-rbac.yaml
```
kubectl delete -f traefik-rbac.yaml
rm -f ./traefik-rbac.yaml
cat >traefik-rbac.yaml<<\EOF
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses/status
verbs:
- update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
EOF
kubectl apply -f traefik-rbac.yaml
```
# 3. Proxying the traefik web UI through traefik itself
https://github.com/containous/traefik/blob/v1.7/examples/k8s/ui.yaml
1. Proxy option one
```bash
kubectl delete -f ui.yaml
rm -f ./ui.yaml
cat >ui.yaml<<\EOF
---
apiVersion: v1
kind: Service
metadata:
name: traefik-web-ui
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- name: web
port: 80
targetPort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
spec:
rules:
- host: traefik-ui.devops.com
http:
paths:
- path: /
backend:
serviceName: traefik-web-ui
servicePort: web
---
EOF
kubectl apply -f ui.yaml
```
2. Proxy option two
```
kubectl delete -f ui.yaml
rm -f ./ui.yaml
cat >ui.yaml<<\EOF
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
# service port of the traefik ingress-controller
port: 80
name: web
- protocol: TCP
# traefik admin web UI port
port: 8080
name: admin
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: traefik-ui.devops.com
http:
paths:
- backend:
serviceName: traefik-ingress-service
#servicePort: 8080
servicePort: admin #matches the port name in the Service above
---
EOF
kubectl apply -f ui.yaml
```
# 4. Access test
`http://traefik-ui.devops.com`
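Behind that URL is just a hosts entry pointing at a node running the DaemonSet (a sketch; 192.168.56.11 is a placeholder for one of your master IPs):
```bash
# map the UI hostname to a master node, then fetch the dashboard page
echo "192.168.56.11 traefik-ui.devops.com" >> /etc/hosts
curl -I http://traefik-ui.devops.com/dashboard/
```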
# 5. All-in-one manifest
```
kubectl delete -f all-ds.yaml
rm -f ./all-ds.yaml
cat >all-ds.yaml<<\EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
#======= nodeSelector: create only on master nodes =======
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/role: master #masters are unschedulable by default; the tolerations above make them schedulable. Use your own master's labels here; check with kubectl get nodes --show-labels
#===================================================
containers:
- image: traefik:v1.7
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
hostPort: 80
- name: admin
containerPort: 8080
hostPort: 8080
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
args:
- --api
- --kubernetes
- --logLevel=INFO
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
port: 80
name: web
- protocol: TCP
port: 8080
name: admin
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses/status
verbs:
- update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
name: traefik-web-ui
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- name: web
port: 80
targetPort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
spec:
rules:
- host: traefik-ui.devops.com
http:
paths:
- path: /
backend:
serviceName: traefik-web-ui
servicePort: web
EOF
kubectl apply -f all-ds.yaml
```
References:
https://blog.csdn.net/oyym_mv/article/details/86986510 Using traefik as a reverse proxy in Kubernetes (DaemonSet mode)
https://www.cnblogs.com/twodoge/p/11663006.html Gotcha: with the newer apps/v1 API, selector becomes a required field in the yaml
================================================
FILE: components/ingress/traefik-ingress/2.traefik反向代理Deamonset模式TLS.md
================================================
# Ingress-Https test example
# 1. Certificate files
## 1. TLS certificates
Most services today are accessed over https. In this section we will use a self-signed certificate; a certificate bought from a proper CA is better, of course, since every visitor's browser will then trust your service. Generate the certificate with the following openssl command:
```bash
rm -rf /etc/certs/ssl/
mkdir -p /etc/certs/ssl/
cd /etc/certs/ssl/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=hello.test.com"
```
Now that we have the certificate, we can use kubectl to create a secret object that stores it (this must be created manually):
```bash
kubectl -n kube-system create secret tls traefik-cert --key=tls.key --cert=tls.crt
```
## 2. Create multiple certificates
```bash
kubectl delete secret traefik-cert first-k8s second-k8s -n kube-system
rm -rf /etc/certs/ssl/
mkdir -p /etc/certs/ssl/{default,first,second}
cd /etc/certs/ssl/default/
openssl req -x509 -nodes -days 165 -newkey rsa:2048 -keyout tls_default.key -out tls_default.crt -subj "/CN=*.devops.com"
kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt
cd /etc/certs/ssl/first/
openssl req -x509 -nodes -days 265 -newkey rsa:2048 -keyout tls_first.key -out tls_first.crt -subj "/CN=k8s.first.com"
kubectl -n kube-system create secret tls first-k8s --key=tls_first.key --cert=tls_first.crt
cd /etc/certs/ssl/second/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_second.key -out tls_second.crt -subj "/CN=k8s.second.com"
kubectl -n kube-system create secret tls second-k8s --key=tls_second.key --cert=tls_second.crt
# inspect the secrets
kubectl get secret traefik-cert first-k8s second-k8s -n kube-system
kubectl describe secret traefik-cert first-k8s second-k8s -n kube-system
```
## 3. Five key points
```bash
Note: these 5 resources must all live in the same namespace: kube-system
secret ---the secret object stores the ssl certificate
configmap ---the configmap holds one or more key/value entries
Deployment or DaemonSet
Service
Ingress
```
# 2. Certificate configuration: create the configMap (cm)
1. http and https side by side
```bash
kubectl delete cm traefik-conf -n kube-system
rm -rf /etc/certs/config/
mkdir -p /etc/certs/config/
cd /etc/certs/config/
cat >traefik.toml<<\EOF
# with insecureSkipVerify = true, ingress rules can target backends on 443 (such as the dashboard)
insecureSkipVerify = true
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/etc/certs/ssl/default/tls_default.crt"
KeyFile = "/etc/certs/ssl/default/tls_default.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/etc/certs/ssl/first/tls_first.crt"
KeyFile = "/etc/certs/ssl/first/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/etc/certs/ssl/second/tls_second.crt"
KeyFile = "/etc/certs/ssl/second/tls_second.key"
EOF
kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system
kubectl get configmap traefik-conf -n kube-system
kubectl describe cm traefik-conf -n kube-system
```
2. Redirect http to https
```bash
kubectl delete cm traefik-conf -n kube-system
rm -rf /etc/certs/config/
mkdir -p /etc/certs/config/
cd /etc/certs/config/
cat >traefik.toml<<\EOF
# lets traefik ignore TLS verification errors toward https backends, so an https backend (e.g. the kubernetes dashboard) can be exposed through traefik just like an http one
insecureSkipVerify = true
#
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.http.redirect]
entryPoint = "https"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[[entryPoints.https.tls.certificates]]
CertFile = "/etc/certs/ssl/default/tls_default.crt"
KeyFile = "/etc/certs/ssl/default/tls_default.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/etc/certs/ssl/first/tls_first.crt"
KeyFile = "/etc/certs/ssl/first/tls_first.key"
[[entryPoints.https.tls.certificates]]
CertFile = "/etc/certs/ssl/second/tls_second.crt"
KeyFile = "/etc/certs/ssl/second/tls_second.key"
EOF
kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system
kubectl get configmap traefik-conf -n kube-system
kubectl describe cm traefik-conf -n kube-system
```
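Once the controller picks up this configmap (a pod restart may be needed), plain-http requests should be answered with a redirect; a quick check, assuming the k8s.first.com host from the ingress examples below:
```bash
# expect an HTTP 30x status with a Location header pointing at https
curl -sI http://k8s.first.com/ | head -n 3
```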
# 3. traefik-ingress-controller manifest
## 1. Create the file
```
kubectl delete -f traefik-controller-tls.yaml
rm -f ./traefik-controller-tls.yaml
cat >traefik-controller-tls.yaml<<\EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
---
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
hostNetwork: true #use the host network so the service port (80) is exposed directly on the node; make sure host port 80 is not already taken
dnsPolicy: ClusterFirstWithHostNet #with hostNetwork the container uses the host's network including its DNS and cannot resolve in-cluster services; this setting keeps it on the K8S DNS
volumes:
- name: ssl
secret:
secretName: traefik-cert
- name: config
configMap:
name: traefik-conf
#======= nodeSelector: create only on master nodes =======
tolerations:
- key: node-role.kubernetes.io/master
operator: "Equal"
value: ""
effect: NoSchedule
nodeSelector:
node-role.kubernetes.io/master: ""
#===================================================
containers:
- image: traefik:v1.7.12
name: traefik-ingress-lb
volumeMounts:
- mountPath: "/etc/certs/ssl"
name: "ssl"
- mountPath: "/etc/certs/config"
name: "config"
resources:
limits:
cpu: 1000m
memory: 800Mi
requests:
cpu: 500m
memory: 600Mi
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
- name: admin
containerPort: 8080
hostPort: 8080
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
args:
- --configfile=/etc/certs/config/traefik.toml
- --api
- --kubernetes
- --logLevel=INFO
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses/status
verbs:
- update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
port: 80
name: http
- protocol: TCP
port: 443
name: https
- protocol: TCP
port: 8080
name: admin
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
spec:
tls:
- secretName: traefik-cert
rules:
- host: traefik-ui.devops.com
http:
paths:
- path: /
backend:
serviceName: traefik-ingress-service
servicePort: admin
---
EOF
kubectl apply -f traefik-controller-tls.yaml
```
## 2. Delete the resources
```
kubectl delete -f traefik-controller-tls.yaml
```
# 4. Creating an https ingress from the command line
```
# create a sample app
$ kubectl run test-hello --image=nginx:alpine --port=80 --expose -n kube-system
# delete the sample app (kubectl run creates a deployment by default)
$ kubectl delete deployment test-hello -n kube-system
$ kubectl delete svc test-hello -n kube-system
# hello-tls-ingress example
$ cd /config/
$ vim hello-tls.ing.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: hello-tls-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: k8s.test.com
http:
paths:
- backend:
serviceName: test-hello
servicePort: 80
tls:
- secretName: traefik-cert
# create the https ingress
$ kubectl apply -f /config/hello-tls.ing.yaml
# note: the hello example needs the matching secret traefik-cert in kube-system (already created at the start, no need to recreate)
$ kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt
# delete the https ingress
$ kubectl delete -f /config/hello-tls.ing.yaml
```
# Access test (find which node the traefik-controller pod runs on, bind that node's IP in your hosts file, then open the url)
https://k8s.test.com:23457

# 5. Test deployment and ingress
```
$ vim nginx-ingress-deploy.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
namespace: kube-system
spec:
replicas: 2
template:
metadata:
labels:
app: nginx-pod
spec:
containers:
- name: nginx
image: nginx:1.15.5
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: kube-system
annotations:
traefik.ingress.kubernetes.io/load-balancer-method: drr #dynamic weighted round-robin scheduling
spec:
template:
metadata:
labels:
name: nginx-service
spec:
selector:
app: nginx-pod
ports:
- port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
tls:
- secretName: first-k8s
- secretName: second-k8s
rules:
- host: k8s.first.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
- host: k8s.second.com
http:
paths:
- backend:
serviceName: nginx-service
servicePort: 80
$ kubectl apply -f nginx-ingress-deploy.yaml
$ kubectl delete -f nginx-ingress-deploy.yaml
```
# Access test
https://k8s.first.com:23457/

https://k8s.second.com:23457/

References:
https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress usage
http://docs.kubernetes.org.cn/558.html
================================================
FILE: components/ingress/traefik-ingress/README.md
================================================
================================================
FILE: components/ingress/常用操作.md
================================================
```
[root@master ingress]# kubectl get ingress -A
NAMESPACE NAME HOSTS ADDRESS PORTS AGE
default nginx-ingress k8s.nginx.com 80 40m
kube-system kubernetes-dashboard dashboard.test.com 80 2d21h
kube-system traefik-web-ui traefik-ui.test.com 80 2d21h
[root@master ingress]# kubectl delete ingress hello-tls-ingress
ingress.extensions "hello-tls-ingress" deleted
```
# 1. rbac.yaml
First, for safety we use RBAC authorization here: (rbac.yaml)
```
mkdir -p /data/components/ingress
cat > /data/components/ingress/rbac.yaml << \EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: traefik-ingress-controller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
EOF
kubectl create -f /data/components/ingress/rbac.yaml
```
# 2. traefik.yaml
Then use a Deployment to manage the Pod, deploying the official traefik image directly (traefik.yaml)
```
cat > /data/components/ingress/traefik.yaml << \EOF
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: traefik-ingress-controller
namespace: kube-system
labels:
k8s-app: traefik-ingress-lb
spec:
replicas: 1
selector:
matchLabels:
k8s-app: traefik-ingress-lb
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 60
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: linux-node1.example.com #masters are unschedulable by default; the tolerations above allow scheduling
containers:
- image: traefik
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
- name: admin
containerPort: 8080
args:
- --api
- --kubernetes
- --logLevel=INFO
---
kind: Service
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
port: 80
name: web
- protocol: TCP
port: 8080
name: admin
type: NodePort
EOF
kubectl create -f /data/components/ingress/traefik.yaml
kubectl apply -f /data/components/ingress/traefik.yaml
```
```
Note the following in the yaml file above:
tolerations:
- operator: "Exists"
nodeSelector:
kubernetes.io/hostname: master
In our particular setup only the master node has outbound internet access, so the nodeSelector label pins traefik to the master. What are the tolerations for? This cluster was installed with kubeadm, where master nodes by default do not schedule ordinary workloads; scheduling onto them requires the tolerations shown here. If your cluster differs, simply drop this scheduling policy.
nodeSelector and tolerations are both Pod scheduling policies; they will be covered in a later lesson.
```
# 3. traefik-ui
traefik also ships a web ui, the service behind port 8080 above; to reach it, we expose the Service as a NodePort
```
root># kubectl get pods -n kube-system -l k8s-app=traefik-ingress-lb -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
traefik-ingress-controller-5b58d5c998-6dn97 1/1 Running 0 88s 10.244.0.2 linux-node1.example.com
root># kubectl get svc -n kube-system|grep traefik-ingress-service
traefik-ingress-service NodePort 10.102.214.49 80:32472/TCP,8080:32482/TCP 44s
Now open master_node_ip:32482 in a browser to reach the traefik dashboard
```
http://192.168.56.11:32482/dashboard/
# 4. The Ingress object
We currently reach the traefik Dashboard through a NodePort; how do we reach it through an ingress instead? First, create an ingress object: (ingress.yaml)
```
cat > /data/components/ingress/ingress.yaml <<\EOF
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: traefik-web-ui
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: traefik.k8s.com
http:
paths:
- backend:
serviceName: traefik-ingress-service
#servicePort: 8080
servicePort: admin #using servicePort: admin is recommended here, so port changes don't break the ingress
EOF
kubectl create -f /data/components/ingress/ingress.yaml
kubectl apply -f /data/components/ingress/ingress.yaml
Pay attention to the rules section of the ingress object above: we are building an ingress for the traefik dashboard, so serviceName must match the traefik-ingress-service created earlier and the port must match 8080. To stay safe against port changes, servicePort can instead use the port's name defined above: admin
```
With everything created, how do we test it?
```
Step 1: in your local /etc/hosts, map traefik.k8s.com to the master node's public IP.
Step 2: open http://traefik.k8s.com in a browser. We do not get the dashboard we expected, because traefik was deployed behind a NodePort Service, so the target is only reachable through port 32482: http://traefik.k8s.com:32482
With the port appended the dashboard loads, and it now shows an extra entry: the ingress object we just created. Switching to the HEALTH tab shows the overall health of the services traefik is proxying.
Step 3: we can now reach our service via a custom domain plus a port, but real-world services are served on plain http/https domains; hardly anyone appends a port, since it is clumsy and nobody remembers it. The fix is to map traefik's core port onto port 80 of the master node, because http defaults to 80. A NodePort Service cannot map port 80 directly, but we can set a hostPort on the Pod instead. Change the container ports in the traefik.yaml above:
containers:
- image: traefik
name: traefik-ingress-lb
ports:
- name: http
containerPort: 80
hostPort: 80 #add this line
- name: admin
containerPort: 8080
After adding hostPort: 80, update the app:
kubectl apply -f traefik.yaml
Once the update finishes, test in the browser with the bare domain:
http://traefik.k8s.com
Step 4: normally, if you own a domain, you can add a DNS record resolving it to the master's public IP, and then anyone can reach your exposed services by name.
If you have multiple edge nodes, run an ingress-controller on each edge node and put a load balancer such as nginx in front, with all edge nodes as its backends; this gives the ingress-controller high availability and load balancing.
```
# 5. ingress tls
The previous section showed how to install traefik and configure a simple ingress; this section covers ingress tls and using paths in ingress objects.
1. TLS certificates
Most services today are accessed over https. In this section we will use a self-signed certificate; a certificate bought from a proper CA is better, of course, since every visitor's browser will then trust your service. Generate the certificate with the following openssl command:
```
openssl req -newkey rsa:2048 -nodes -keyout tls.key -x509 -days 365 -out tls.crt
```
Now that we have the certificate, we can use kubectl to create a secret object that stores it:
```
kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system
```
3. Configure Traefik
So far we have used Traefik's default configuration; now we configure Traefik to support https:
================================================
FILE: components/initContainers/README.md
================================================
References:
https://www.cnblogs.com/yanh0606/p/11395920.html Kubernetes initContainers
https://www.jianshu.com/p/e57c3e17ce8c Understanding Init containers
================================================
FILE: components/job/README.md
================================================
References:
https://www.jianshu.com/p/bd6cd1b4e076 The Kubernetes Job object
https://www.cnblogs.com/lvcisco/p/9670100.html Using k8s Job and Cronjob
================================================
FILE: components/k8s-monitor/README.md
================================================
```
# 1. Persist monitoring data
cat > prometheus-class.yaml <<-EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast
provisioner: fuseim.pri/ifs # or choose another name, must match deployment's env PROVISIONER_NAME'
parameters:
archiveOnDelete: "true"
EOF
# apply class.yaml
kubectl apply -f prometheus-class.yaml
# list the created storageclass
kubectl get sc
#2. Make Prometheus persistent
Prometheus is deployed as a StatefulSet, so the StorageClass can be wired into its spec directly; add the persistence block at the bottom of the yaml below
#cat prometheus/prometheus-prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
labels:
prometheus: k8s
name: k8s
namespace: monitoring
spec:
alerting:
alertmanagers:
- name: alertmanager-main
namespace: monitoring
port: web
baseImage: quay.io/prometheus/prometheus
nodeSelector:
kubernetes.io/os: linux
podMonitorSelector: {}
replicas: 2
resources:
requests:
memory: 400Mi
ruleSelector:
matchLabels:
prometheus: k8s
role: alert-rules
securityContext:
fsGroup: 2000
runAsNonRoot: true
runAsUser: 1000
serviceAccountName: prometheus-k8s
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector: {}
version: v2.11.0
storage: #---- persistence config; use the fast StorageClass created above
volumeClaimTemplate:
spec:
storageClassName: fast #--- use fast
resources:
requests:
storage: 300Gi
kubectl apply -f prometheus/prometheus-prometheus.yaml
#3. Make Grafana persistent
Grafana is deployed as a Deployment, so we create a grafana-pvc.yaml for it in advance with the PVC config below.
#vim grafana-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: grafana
namespace: monitoring #--- must be the monitoring namespace
spec:
storageClassName: fast #--- the fast StorageClass created above
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 200Gi
kubectl apply -f grafana-pvc.yaml
#vim grafana/grafana-deployment.yaml
......
volumes:
- name: grafana-storage #------- new persistence config
persistentVolumeClaim:
claimName: grafana #------- the name of the PVC created above
#- emptyDir: {} #------- old config, commented out
# name: grafana-storage
- name: grafana-datasources
secret:
secretName: grafana-datasources
- configMap:
name: grafana-dashboards
name: grafana-dashboards
......
kubectl apply -f grafana/grafana-deployment.yaml
```
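After applying, both claims should bind before the pods restart cleanly (a quick check, assuming the names above):
```bash
# both PVCs should show STATUS Bound with STORAGECLASS fast
kubectl get pvc -n monitoring
```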
References:
https://www.cnblogs.com/skyflask/articles/11410063.html Kubernetes monitoring with cAdvisor+Heapster+InfluxDB+Grafana
https://www.cnblogs.com/skyflask/p/11480988.html The definitive kubernetes monitoring stack: kube-promethues
http://www.mydlq.club/article/10/#wow1 Monitoring a k8s cluster with Kube-promethues
https://jicki.me/docker/kubernetes/2019/07/22/kube-prometheus/ Coreos kube-prometheus monitoring
================================================
FILE: components/kube-proxy/README.md
================================================
# Kube-Proxy overview
```
Runs on every node, watches Service objects through the API Server, and manages IPtables rules to implement traffic forwarding
Kube-Proxy currently supports three modes:
UserSpace
deprecated since k8s v1.2
IPtables
the current default
IPVS
requires the ipvsadm and ipset packages and the ip_vs kernel modules
```
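On a kubeadm-managed cluster, switching to IPVS mode is usually done through the kube-proxy configmap (a hedged sketch; other installers keep the mode flag in their own config):
```bash
# load the IPVS kernel modules and confirm they registered
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do modprobe $m; done
lsmod | grep ip_vs
# set mode: "ipvs" in the config, then recreate the kube-proxy pods
kubectl -n kube-system edit configmap kube-proxy
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
```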
References:
https://ywnz.com/linuxyffq/2530.html Several ways to access applications in a Kubernetes cluster from outside
https://www.jianshu.com/p/b2d13cec7091 A brief look at k8s service & kube-proxy
https://www.codercto.com/a/90806.html Exploring the iptables routing rules behind K8S Services
https://blog.51cto.com/goome/2369150 k8s in practice 7: how ipvs works together with iptables
https://blog.csdn.net/xinghun_4/article/details/50492041 port vs targetPort vs nodePort in kubernetes, and kube-proxy proxying
================================================
FILE: components/nfs/README.md
================================================
================================================
FILE: components/pressure/README.md
================================================
# 1. Choosing a network plugin for a large production cluster
With calico in route-reflector (RR) mode, roughly how many nodes can it support while keeping performance up?
The RR role comes in two flavors: it can be carried by calico's own node services, or a physical router can act as the RR directly.
Running very large Calico deployments purely over BGP is fine; just plan the network addressing carefully, and container address ranges must not overlap even across different clusters.
# 2. flannel stress test
```
flannel is constrained by CPU pressure
```

# 3. calico stress test
```
calico easily comes within a whisker of bare-host performance
With a single cluster holding a very large number of nodes, physical routers or L3 switches will not cope unless BGP route aggregation is done
```

# 4. calico vs host-network stress test

================================================
FILE: components/pressure/calico bgp网络需要物理路由和交换机支持吗.md
================================================
================================================
FILE: components/pressure/k8s集群更换网段方案.md
================================================
```
1. The servers' IPs are moving to a new subnet. Is there a solution that avoids rebuilding the cluster?
Option 1:
Change the listen addresses and re-issue the cluster certificates;
otherwise it is genuinely hard to pull off.
Option 2:
If etcd was set up statically from the start, you are out of luck;
it has to be based on DNS discovery from day one.
Put briefly:
wherever an IP address would appear,
use an FQDN instead,
in certificates as well as in config files.
That is the whole core of it.
etcd's official docs already cover DNS-discovery deployment;
it is only the k8s part that the official install guides leave out.
```
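For the DNS-discovery route in option 2, etcd bootstraps from SRV records via --discovery-srv; a minimal sketch, where example.com and the etcd0 host name are placeholders:
```bash
# requires _etcd-server._tcp.example.com SRV records pointing at all peers
etcd --name etcd0 \
  --discovery-srv example.com \
  --initial-advertise-peer-urls https://etcd0.example.com:2380 \
  --advertise-client-urls https://etcd0.example.com:2379 \
  --listen-peer-urls https://0.0.0.0:2380 \
  --listen-client-urls https://0.0.0.0:2379
```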
Source: collected from community group discussions
https://github.com/etcd-io/etcd/blob/a4018f25c91fff8f4f15cd2cee9f026650c7e688/Documentation/clustering.md#dns-discovery
================================================
FILE: docs/Envoy的架构与基本术语.md
================================================
Reference:
https://jimmysong.io/kubernetes-handbook/usecases/envoy-terminology.html Envoy architecture and basic terminology
================================================
FILE: docs/Kubernetes学习笔记.md
================================================
Reference:
https://blog.gmem.cc/kubernetes-study-note Kubernetes study notes
================================================
FILE: docs/Kubernetes架构介绍.md
================================================
# Kubernetes Architecture
## Kubernetes architecture

## k8s architecture diagram

## 1. K8S master node
### API Server
The apiserver exposes the REST API for cluster management, covering authentication and authorization, data validation, and cluster state changes.
Only the API Server talks to etcd directly.
Other modules query or modify data through the API Server,
which serves as the hub for data exchange and communication between modules.
### Scheduler
The scheduler assigns Pods to nodes in the cluster.
It watches kube-apiserver for Pods that have not yet been assigned a Node
and binds them to nodes according to the scheduling policy.
### Controller Manager
The controller-manager consists of a series of controllers; it watches the whole cluster's state through the apiserver and keeps the cluster in the desired working state.
### ETCD
All persistent state is stored in ETCD.
## 2. K8S worker node
### Kubelet
1. Manages Pods plus their containers, images, volumes and so on, implementing node-level management for the cluster.
### Kube-proxy
2. Provides network proxying and load balancing, implementing communication with Services.
### Docker Engine
3. Handles container management on the node.
## 3. Resource objects
### 3.1 Replication Controller (RC)
1. RC is the oldest K8s API object for keeping Pods highly available. It keeps a specified number of Pod replicas running in the cluster by monitoring the running Pods.
2. The target count can be one or many; with fewer than specified, RC starts new replicas, and with more, it kills the surplus.
3. Even with a count of 1, running a Pod through an RC is wiser than running it directly, because the RC still provides its high-availability guarantee that one Pod is always running.
### 3.2 Replica Set (RS)
1. RS is the next-generation RC with the same high-availability guarantees; the difference is that RS, arriving later, supports more kinds of selector matching. ReplicaSet objects are rarely used alone; they serve as the desired-state parameter of a Deployment.
2. Introduced in K8S 1.2 as the upgrade of RC, and generally used together with a Deployment.
### 3.3 Deployment
1. A Deployment represents one update operation a user performs on the K8s cluster. It is a broader API object than RS:
2. it can create a new service, update one, or roll one over. Rolling over a service actually creates a new RS, then gradually scales the new RS up to the desired count while scaling the old RS down to 0, as one composite operation;
3. such a composite operation is awkward to describe with a single RS, hence the more general Deployment.
### 3.4 Service
1. RC, RS and Deployment only guarantee the number of Pods backing a service; they do not answer how to access those services. A Pod is just one running instance that may stop on one node at any moment and restart on another node as a new Pod with a new IP, so it cannot offer a stable IP and port.
2. Stable service delivery needs service discovery and load balancing: discovery finds the right backend instances for the service a client is asking for.
3. In a K8s cluster, the object clients access is the Service. Each Service gets a cluster-internal virtual IP, and the service is reached inside the cluster through that virtual IP.
## 4. IP addresses in K8S
1. Node IP: the real IP of the node device hosting containers, e.g. a physical or virtual machine.
2. Pod IP: the Pod's IP address, allocated from the docker0 network segment.
3. Cluster IP: the Service's IP, a virtual IP that only exists for Service objects, managed and allocated by k8s; it must be combined with the service port to be usable, a bare cluster IP has no connectivity, and access from outside the cluster needs extra changes.
4. Inside a K8S cluster, traffic between node IPs, pod IPs and cluster IPs follows routing rules set up by k8s, not ordinary IP routing.
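The three address types are easy to compare on a live cluster (a quick sketch; column names vary slightly across kubectl versions):
```bash
kubectl get nodes -o wide   # INTERNAL-IP column -> Node IP
kubectl get pods -o wide    # IP column          -> Pod IP
kubectl get svc             # CLUSTER-IP column  -> Cluster IP
```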
================================================
FILE: docs/Kubernetes集群环境准备.md
================================================
# 1. Lab environment for the k8s cluster

| Hostname | IP (NAT) | Role |
| --- | --- | --- |
| linux-node1.example.com | eth0:192.168.56.11 | Kubernetes master node / etcd node |
| linux-node2.example.com | eth0:192.168.56.12 | Kubernetes worker node / etcd node |
| linux-node3.example.com | eth0:192.168.56.13 | Kubernetes worker node / etcd node |
# 2. Preparation
1. Set hostnames
```
hostnamectl set-hostname linux-node1
hostnamectl set-hostname linux-node2
hostnamectl set-hostname linux-node3
```
2. Add host entries (reconstructed from the host table above)
```
cat >> /etc/hosts <<EOF
192.168.56.11 linux-node1 linux-node1.example.com
192.168.56.12 linux-node2 linux-node2.example.com
192.168.56.13 linux-node3 linux-node3.example.com
EOF
# set the timezone
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
# fix slow SSH logins
sed -i "s/#UseDNS yes/UseDNS no/" /etc/ssh/sshd_config
sed -i "s/GSSAPIAuthentication yes/GSSAPIAuthentication no/" /etc/ssh/sshd_config
systemctl restart sshd.service
```
6. Download packages
k8s v1.12.0 netdisk download: https://pan.baidu.com/s/1jU427W1f3oSDnzB3bU2s5w
```
# all files are kept under /opt/kubernetes
mkdir -p /opt/kubernetes/{cfg,bin,ssl,log}
# deploy using the official binary release
# download: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#downloads-for-v1121
# add environment variables
vim /root/.bash_profile
PATH=$PATH:$HOME/bin:/opt/kubernetes/bin
source /root/.bash_profile
```

7. Unpack the packages
```
tar -zxvf kubernetes.tar.gz -C /usr/local/src/
tar -zxvf kubernetes-server-linux-amd64.tar.gz -C /usr/local/src/
tar -zxvf kubernetes-client-linux-amd64.tar.gz -C /usr/local/src/
tar -zxvf kubernetes-node-linux-amd64.tar.gz -C /usr/local/src/
```
================================================
FILE: docs/app.md
================================================
1. Create a test deployment
```
[root@linux-node1 ~]# kubectl run net-test --image=alpine --replicas=2 sleep 360000
[root@linux-node1 ~]# kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
net-test 2 2 2 2 2h
[root@linux-node1 ~]# kubectl delete deployment net-test
```
2. Check the assigned IPs
```
[root@linux-node1 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
net-test-5767cb94df-6smfk 1/1 Running 1 1h 10.2.69.3 192.168.56.12
net-test-5767cb94df-ctkhz 1/1 Running 1 1h 10.2.17.3 192.168.56.13
```
3. Test connectivity
```
[root@linux-node1 ~]# ping -c 1 10.2.69.3
PING 10.2.69.3 (10.2.69.3) 56(84) bytes of data.
64 bytes from 10.2.69.3: icmp_seq=1 ttl=61 time=1.39 ms
--- 10.2.69.3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.396/1.396/1.396/0.000 ms
[root@linux-node1 ~]# ping -c 1 10.2.17.3
PING 10.2.17.3 (10.2.17.3) 56(84) bytes of data.
64 bytes from 10.2.17.3: icmp_seq=1 ttl=61 time=1.16 ms
--- 10.2.17.3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.164/1.164/1.164/0.000 ms
# if pod IPs cannot be pinged from the master node, check the flanneld service; below is each node's NIC layout (note that each node's flannel0 sits on a different subnet)
#node1
[root@linux-node1 ~]# ifconfig
docker0: flags=4099 mtu 1500
inet 10.2.41.1 netmask 255.255.255.0 broadcast 10.2.41.255
ether 02:42:77:d9:95:e3 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163 mtu 1500
inet 192.168.56.11 netmask 255.255.255.0 broadcast 192.168.56.255
ether 00:0c:29:e6:00:79 txqueuelen 1000 (Ethernet)
RX packets 75548 bytes 10771254 (10.2 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 74344 bytes 12700211 (12.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel0: flags=4305 mtu 1472
inet 10.2.41.0 netmask 255.255.0.0 destination 10.2.41.0
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC)
RX packets 30 bytes 2520 (2.4 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 30 bytes 2520 (2.4 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73 mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
loop txqueuelen 1000 (Local Loopback)
RX packets 34140 bytes 8049438 (7.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 34140 bytes 8049438 (7.6 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
#node2
[root@linux-node2 ~]# ifconfig
docker0: flags=4163 mtu 1400
inet 10.2.69.1 netmask 255.255.255.0 broadcast 10.2.69.255
ether 02:42:de:56:b5:1e txqueuelen 0 (Ethernet)
RX packets 10 bytes 448 (448.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 9 bytes 546 (546.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163 mtu 1500
inet 192.168.56.12 netmask 255.255.255.0 broadcast 192.168.56.255
ether 00:0c:29:ee:65:40 txqueuelen 1000 (Ethernet)
RX packets 32893 bytes 4996885 (4.7 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 32877 bytes 3737878 (3.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel0: flags=4305 mtu 1472
inet 10.2.69.0 netmask 255.255.0.0 destination 10.2.69.0
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC)
RX packets 3 bytes 252 (252.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3 bytes 252 (252.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73 mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
loop txqueuelen 1000 (Local Loopback)
RX packets 347 bytes 36887 (36.0 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 347 bytes 36887 (36.0 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
veth09ea856c: flags=4163 mtu 1400
ether c6:be:00:bd:a9:18 txqueuelen 0 (Ethernet)
RX packets 10 bytes 588 (588.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 9 bytes 546 (546.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
#node3
[root@linux-node3 ~]# ifconfig
docker0: flags=4163 mtu 1400
inet 10.2.17.1 netmask 255.255.255.0 broadcast 10.2.17.255
ether 02:42:ac:11:ac:3c txqueuelen 0 (Ethernet)
RX packets 32 bytes 2408 (2.3 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 31 bytes 2814 (2.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163 mtu 1500
inet 192.168.56.13 netmask 255.255.255.0 broadcast 192.168.56.255
ether 00:0c:29:53:f4:b1 txqueuelen 1000 (Ethernet)
RX packets 47504 bytes 7138550 (6.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 48402 bytes 8310935 (7.9 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel0: flags=4305 mtu 1472
inet 10.2.17.0 netmask 255.255.0.0 destination 10.2.17.0
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC)
RX packets 27 bytes 2268 (2.2 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 27 bytes 2268 (2.2 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73 mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
loop txqueuelen 1000 (Local Loopback)
RX packets 129 bytes 13510 (13.1 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 129 bytes 13510 (13.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
veth8630a55b: flags=4163 mtu 1400
ether 72:e9:df:4f:f6:64 txqueuelen 0 (Ethernet)
RX packets 32 bytes 2856 (2.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 31 bytes 2814 (2.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
```
4. Create an nginx service
```
# deployment manifest
[root@linux-node1 ~]# vim nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
# create the deployment
[root@linux-node1 ~]# kubectl create -f nginx-deployment.yaml
deployment.apps "nginx-deployment" created
# list deployments
[root@linux-node1 ~]# kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 3 3 3 2 48s
# deployment details
[root@linux-node1 ~]# kubectl describe deployment nginx-deployment
Name: nginx-deployment
Namespace: default
CreationTimestamp: Tue, 09 Oct 2018 15:11:33 +0800
Labels: app=nginx
Annotations: deployment.kubernetes.io/revision=1
Selector: app=nginx
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx
Containers:
nginx:
Image: nginx:1.13.12
Port: 80/TCP
Host Port: 0/TCP
Environment:
Mounts:
Volumes:
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets:
NewReplicaSet: nginx-deployment-6c45fc49cb (3/3 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set nginx-deployment-6c45fc49cb to 3
#查看pod
[root@linux-node1 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-6c45fc49cb-7rwdp 1/1 Running 0 4m 10.2.76.5 192.168.56.12
nginx-deployment-6c45fc49cb-8dgkd 1/1 Running 0 4m 10.2.76.4 192.168.56.12
nginx-deployment-6c45fc49cb-clgkl 1/1 Running 0 4m 10.2.76.4 192.168.56.13
#查看pod详情
[root@linux-node1 ~]# kubectl describe pod nginx-deployment-6c45fc49cb-7rwdp
Name: nginx-deployment-6c45fc49cb-7rwdp
Namespace: default
Node: 192.168.56.12/192.168.56.12
Start Time: Tue, 09 Oct 2018 15:11:33 +0800
Labels: app=nginx
pod-template-hash=2701970576
Annotations:
Status: Running
IP: 10.2.76.5
Controlled By: ReplicaSet/nginx-deployment-6c45fc49cb
Containers:
nginx:
Container ID: docker://0ab9b4f9bf3691f16e9cb6836a7375cb7f886398bfa8a81147e9a24f3634d591
Image: nginx:1.13.12
Image ID: docker-pullable://nginx@sha256:b1d09e9718890e6ebbbd2bc319ef1611559e30ce1b6f56b2e3b479d9da51dc35
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 09 Oct 2018 15:12:33 +0800
Ready: True
Restart Count: 0
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-4cgj8 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-4cgj8:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-4cgj8
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m default-scheduler Successfully assigned nginx-deployment-6c45fc49cb-7rwdp to 192.168.56.12
Normal SuccessfulMountVolume 4m kubelet, 192.168.56.12 MountVolume.SetUp succeeded for volume "default-token-4cgj8"
Normal Pulling 4m kubelet, 192.168.56.12 pulling image "nginx:1.13.12"
Normal Pulled 3m kubelet, 192.168.56.12 Successfully pulled image "nginx:1.13.12"
Normal Created 3m kubelet, 192.168.56.12 Created container
Normal Started 3m kubelet, 192.168.56.12 Started container
#导出资源描述
kubectl get --export -o yaml 命令会以Yaml格式导出系统中已有资源描述
比如,我们可以将系统中 nginx 部署的描述导成 Yaml 文件
kubectl get deployment nginx-deployment-6c45fc49cb-7rwdp --export -o yaml > nginx-deployment.yaml
#测试pod访问
测试访问nginx镜像(在对应的节点上测试,本来是其他节点也可以正常访问的)
[root@linux-node3 ~]# curl --head http://10.2.76.4
HTTP/1.1 200 OK
Server: nginx/1.13.12
Date: Tue, 09 Oct 2018 07:17:55 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Mon, 09 Apr 2018 16:01:09 GMT
Connection: keep-alive
ETag: "5acb8e45-264"
Accept-Ranges: bytes
```
5、更新Deployment
```
#--record 记录日志,方便以后回滚
[root@linux-node1 ~]# kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record
deployment.apps "nginx-deployment" image updated
```
6、查看更新后的Deployment
```
#这里发现镜像已经更新为1.12.1版本了,然后CURRENT(当前镜像数为4个,期望值DESIRED为3个,说明正在进行滚动更新)
[root@linux-node1 ~]# kubectl get deployment -o wide
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
nginx-deployment 3 4 1 3 13m nginx nginx:1.12.1 app=nginx
```
7、查看历史记录
```
[root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment
deployments "nginx-deployment"
REVISION CHANGE-CAUSE
1 ---第一个没有,是因为我们创建的时候没有加上--record参数
4 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.2 --record=true
5 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record=true
```
7、查看具体某一个版本的升级历史
```
[root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment --revision=1
deployments "nginx-deployment" with revision #1
Pod Template:
Labels: app=nginx
pod-template-hash=2701970576
Containers:
nginx:
Image: nginx:1.13.12
Port: 80/TCP
Host Port: 0/TCP
Environment:
Mounts:
Volumes:
```
8、快速回滚到上一个版本
```
[root@linux-node1 ~]# kubectl rollout undo deployment/nginx-deployment
deployment.apps "nginx-deployment"
[root@linux-node1 ~]#
```
9、扩容到5个节点
```
[root@linux-node1 ~]# kubectl get pod -o wide ----之前是3个pod
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12
nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13
nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12
[root@linux-node1 ~]# kubectl scale deployment nginx-deployment --replicas 5
deployment.extensions "nginx-deployment" scaled
[root@linux-node1 ~]# kubectl get pod -o wide ----现在扩容到了5个pod
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-7498dc98f8-28894 1/1 Running 0 8s 10.2.76.10 192.168.56.13
nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12
nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13
nginx-deployment-7498dc98f8-tt7z5 1/1 Running 0 7s 10.2.76.17 192.168.56.12
nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12
```
10、Pod ip 变化频繁, 引入service-ip
```
#创建nginx-server
[root@linux-node1 ~]# cat nginx-service.yaml
kind: Service
apiVersion: v1
metadata:
name: nginx-service
spec:
selector:
app: nginx
ports:
- protocol: TCP
port: 80
targetPort: 80
[root@linux-node1 ~]# kubectl create -f nginx-service.yaml
service "nginx-service" created
#发现给我们创建了一个vip 10.1.46.200 并且通过lvs做了负载均衡
[root@linux-node1 ~]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.1.0.1 443/TCP 3h
nginx-service ClusterIP 10.1.46.200 80/TCP 5m
#在node节点使用ipvsadm -Ln查看负载均衡后端节点
[root@linux-node2 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.46.200:80 rr
-> 10.2.76.11:80 Masq 1 0 0
-> 10.2.76.12:80 Masq 1 0 0
-> 10.2.76.13:80 Masq 1 0 0
-> 10.2.76.18:80 Masq 1 0 0
-> 10.2.76.19:80 Masq 1 0 0
#在master上访问vip不行,是因为没有安装kube-proxy服务,需要在node节点去测试验证
[root@linux-node1 ~]# curl --head http://10.1.46.200
[root@linux-node2 ~]# curl --head http://10.1.46.200
HTTP/1.1 200 OK
Server: nginx/1.10.3
Date: Tue, 09 Oct 2018 07:55:57 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 31 Jan 2017 15:01:11 GMT
Connection: keep-alive
ETag: "5890a6b7-264"
Accept-Ranges: bytes
#每执行一次curl --head http://10.1.46.200请求,后端InActConn连接数就会增加1
[root@linux-node2 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.46.200:80 rr
-> 10.2.76.11:80 Masq 1 0 1
-> 10.2.76.12:80 Masq 1 0 1
-> 10.2.76.13:80 Masq 1 0 2
-> 10.2.76.18:80 Masq 1 0 2
-> 10.2.76.19:80 Masq 1 0 2
```
================================================
FILE: docs/app2.md
================================================
1.查询命名空间
```
[root@linux-node1 ~]# kubectl get namespace --all-namespaces
NAME STATUS AGE
default Active 3d13h
kube-node-lease Active 3d13h
kube-public Active 3d13h
kube-system Active 3d13h
```
2.查询健康状况
```
[root@linux-node1 ~]# kubectl get cs --all-namespaces
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
```
3.查询node
```
[root@linux-node1 ~]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10.33.35.5 Ready,SchedulingDisabled master 3d13h v1.15.2 10.33.35.5 CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://18.9.6
10.33.35.6 Ready node 3d13h v1.15.2 10.33.35.6 CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://18.9.6
10.33.35.7 Ready node 3d13h v1.15.2 10.33.35.7 CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://18.9.6
```
4.创建一个测试用的deployment
```
[root@linux-node1 ~]# kubectl run net-test --image=alpine --replicas=2 sleep 360000
[root@linux-node1 ~]# kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
net-test 2 2 2 2 2h
[root@linux-node1 ~]# kubectl delete deployment net-test
```
5.查看获取IP情况
```
[root@linux-node1 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
net-test-54ddf4f6c7-qfgw9 1/1 Running 0 22s 172.20.2.131 10.33.35.7
net-test-54ddf4f6c7-rwgmc 1/1 Running 0 22s 172.20.1.137 10.33.35.6
```
6、创建nginx服务
```
#创建deployment文件
[root@linux-node1 ~]# vim nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
#创建deployment
[root@linux-node1 ~]# kubectl create -f nginx-deployment.yaml
deployment.apps "nginx-deployment" created
#查看deployment
[root@linux-node1 ~]# kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 3 3 3 2 48s
#查看deployment详情
[root@linux-node1 ~]# kubectl describe deployment nginx-deployment
Name: nginx-deployment
Namespace: default
CreationTimestamp: Tue, 09 Oct 2018 15:11:33 +0800
Labels: app=nginx
Annotations: deployment.kubernetes.io/revision=1
Selector: app=nginx
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx
Containers:
nginx:
Image: nginx:1.13.12
Port: 80/TCP
Host Port: 0/TCP
Environment:
Mounts:
Volumes:
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets:
NewReplicaSet: nginx-deployment-6c45fc49cb (3/3 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set nginx-deployment-6c45fc49cb to 3
#查看pod
[root@linux-node1 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-6c45fc49cb-7rwdp 1/1 Running 0 4m 10.2.76.5 192.168.56.12
nginx-deployment-6c45fc49cb-8dgkd 1/1 Running 0 4m 10.2.76.4 192.168.56.12
nginx-deployment-6c45fc49cb-clgkl 1/1 Running 0 4m 10.2.76.4 192.168.56.13
#查看pod详情
[root@linux-node1 ~]# kubectl describe pod nginx-deployment-6c45fc49cb-7rwdp
Name: nginx-deployment-6c45fc49cb-7rwdp
Namespace: default
Node: 192.168.56.12/192.168.56.12
Start Time: Tue, 09 Oct 2018 15:11:33 +0800
Labels: app=nginx
pod-template-hash=2701970576
Annotations:
Status: Running
IP: 10.2.76.5
Controlled By: ReplicaSet/nginx-deployment-6c45fc49cb
Containers:
nginx:
Container ID: docker://0ab9b4f9bf3691f16e9cb6836a7375cb7f886398bfa8a81147e9a24f3634d591
Image: nginx:1.13.12
Image ID: docker-pullable://nginx@sha256:b1d09e9718890e6ebbbd2bc319ef1611559e30ce1b6f56b2e3b479d9da51dc35
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 09 Oct 2018 15:12:33 +0800
Ready: True
Restart Count: 0
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-4cgj8 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-4cgj8:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-4cgj8
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m default-scheduler Successfully assigned nginx-deployment-6c45fc49cb-7rwdp to 192.168.56.12
Normal SuccessfulMountVolume 4m kubelet, 192.168.56.12 MountVolume.SetUp succeeded for volume "default-token-4cgj8"
Normal Pulling 4m kubelet, 192.168.56.12 pulling image "nginx:1.13.12"
Normal Pulled 3m kubelet, 192.168.56.12 Successfully pulled image "nginx:1.13.12"
Normal Created 3m kubelet, 192.168.56.12 Created container
Normal Started 3m kubelet, 192.168.56.12 Started container
#导出资源描述
kubectl get --export -o yaml 命令会以Yaml格式导出系统中已有资源描述
比如,我们可以将系统中 nginx 部署的描述导成 Yaml 文件
kubectl get deployment nginx-deployment-6c45fc49cb-7rwdp --export -o yaml > nginx-deployment.yaml
#测试pod访问
测试访问nginx镜像(在对应的节点上测试,本来是其他节点也可以正常访问的)
[root@linux-node3 ~]# curl --head http://10.2.76.4
HTTP/1.1 200 OK
Server: nginx/1.13.12
Date: Tue, 09 Oct 2018 07:17:55 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Mon, 09 Apr 2018 16:01:09 GMT
Connection: keep-alive
ETag: "5acb8e45-264"
Accept-Ranges: bytes
```
8、更新Deployment
```
#--record 记录日志,方便以后回滚
[root@linux-node1 ~]# kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record
deployment.apps "nginx-deployment" image updated
```
9、查看更新后的Deployment
```
#这里发现镜像已经更新为1.12.1版本了,然后CURRENT(当前镜像数为4个,期望值DESIRED为3个,说明正在进行滚动更新)
[root@linux-node1 ~]# kubectl get deployment -o wide
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
nginx-deployment 3 4 1 3 13m nginx nginx:1.12.1 app=nginx
```
10、查看历史记录
```
[root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment
deployments "nginx-deployment"
REVISION CHANGE-CAUSE
1 ---第一个没有,是因为我们创建的时候没有加上--record参数
4 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.2 --record=true
5 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record=true
```
11、查看具体某一个版本的升级历史
```
[root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment --revision=1
deployments "nginx-deployment" with revision #1
Pod Template:
Labels: app=nginx
pod-template-hash=2701970576
Containers:
nginx:
Image: nginx:1.13.12
Port: 80/TCP
Host Port: 0/TCP
Environment:
Mounts:
Volumes:
```
12、快速回滚到上一个版本
```
[root@linux-node1 ~]# kubectl rollout undo deployment/nginx-deployment
deployment.apps "nginx-deployment"
[root@linux-node1 ~]#
```
13、扩容到5个节点
```
[root@linux-node1 ~]# kubectl get pod -o wide ----之前是3个pod
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12
nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13
nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12
[root@linux-node1 ~]# kubectl scale deployment nginx-deployment --replicas 5
deployment.extensions "nginx-deployment" scaled
[root@linux-node1 ~]# kubectl get pod -o wide ----现在扩容到了5个pod
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-7498dc98f8-28894 1/1 Running 0 8s 10.2.76.10 192.168.56.13
nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12
nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13
nginx-deployment-7498dc98f8-tt7z5 1/1 Running 0 7s 10.2.76.17 192.168.56.12
nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12
```
14、Pod ip 变化频繁, 引入service-ip
```
#创建nginx-server
[root@linux-node1 ~]# cat nginx-service.yaml
kind: Service
apiVersion: v1
metadata:
name: nginx-service
spec:
selector:
app: nginx
ports:
- protocol: TCP
port: 80
targetPort: 80
[root@linux-node1 ~]# kubectl create -f nginx-service.yaml
service "nginx-service" created
#发现给我们创建了一个vip 10.1.46.200 并且通过lvs做了负载均衡
[root@linux-node1 ~]# kubectl get service --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 172.68.0.1 443/TCP 3d13h
default my-mc-service ClusterIP 172.68.213.121 60001/TCP,60002/TCP 23m
default php-service ClusterIP 172.68.210.6 9898/TCP 18h
default test-hello ClusterIP 172.68.248.205 80/TCP 23h
kube-system heapster ClusterIP 172.68.19.198 80/TCP 3d13h
kube-system kube-dns ClusterIP 172.68.0.2 53/UDP,53/TCP,9153/TCP 3d13h
kube-system kubernetes-dashboard NodePort 172.68.58.252 443:26400/TCP 3d13h
kube-system metrics-server ClusterIP 172.68.31.222 443/TCP 3d13h
kube-system traefik-ingress-service NodePort 172.68.221.108 80:23456/TCP,8080:31477/TCP 3d13h
#删除service
[root@linux-node1 ~]# kubectl delete service nginx-service
service "nginx-service" deleted
#查看service的后端节点
[root@linux-node1 ~]# kubectl describe svc nginx-service
Name: nginx-service
Namespace: default
Labels:
Annotations:
Selector: app=nginx
Type: ClusterIP
IP: 172.68.176.9
Port: 80/TCP
TargetPort: 80/TCP
Endpoints: 172.20.1.138:80,172.20.2.132:80,172.20.2.133:80 --这里发现有3个后端节点
Session Affinity: None
Events:
```
15.创建自定义Ingress
有了ingress-controller,我们就可以创建自定义的Ingress了。这里已提前搭建好了nginx服务,我们针对nginx创建一个Ingress:
```
#vim nginx-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
namespace: default
spec:
rules:
- host: myk8s.com
http:
paths:
- path: /
backend:
serviceName: nginx-service
servicePort: 80
其中:
rules中的host必须为域名,不能为IP,表示Ingress-controller的Pod所在主机域名,也就是Ingress-controller的IP对应的域名。
paths中的path则表示映射的路径。如映射/表示若访问myk8s.com,则会将请求转发至nginx的service,端口为80。
kubectl create -f nginx-ingress.yaml
kubectl get ingress -o wide
kubectl delete ingress nginx-ingress
#需要找出Ingress-controller的Pod所在主机(这里发现是在node2机器)
[root@linux-node1 ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default busybox 1/1 Running 41 41h 172.20.1.27 10.33.35.6
default my-mc-deployment-76f77494c7-kxv85 2/2 Running 0 63m 172.20.2.130 10.33.35.7
kube-system traefik-ingress-controller-766dbfdddd-fzb8d 1/1 Running 1 3d14h 172.20.1.14 10.33.35.6
#然后机器绑定域名
10.33.35.6 myk8s.com
#访问测试
[root@linux-node1 ~]# curl http://myk8s.com -I
HTTP/1.1 200 OK
Server: nginx/1.12.2
Date: Fri, 23 Aug 2019 04:07:44 GMT
Content-Type: text/html
Content-Length: 3700
Last-Modified: Fri, 10 May 2019 08:08:40 GMT
Connection: keep-alive
ETag: "5cd53188-e74"
Accept-Ranges: bytes
```
参考资料:
https://www.jianshu.com/p/feeea0bbd73e
================================================
FILE: docs/ca.md
================================================
# 手动制作CA证书
```
Kubernetes 系统各组件需要使用 TLS 证书对通信进行加密。
CA证书管理工具:
• easyrsa ---openvpn比较常用
• openssl
• cfssl ---使用最多,使用json文件格式,相对简单
```
## 1.安装 CFSSL
```
[root@linux-node1 ~]# cd /usr/local/src
[root@linux-node1 src]# wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
[root@linux-node1 src]# wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
[root@linux-node1 src]# wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
[root@linux-node1 src]# chmod +x cfssl*
[root@linux-node1 src]# mv cfssl-certinfo_linux-amd64 /opt/kubernetes/bin/cfssl-certinfo
[root@linux-node1 src]# mv cfssljson_linux-amd64 /opt/kubernetes/bin/cfssljson
[root@linux-node1 src]# mv cfssl_linux-amd64 /opt/kubernetes/bin/cfssl
#复制cfssl命令文件到k8s-node1和k8s-node2节点。如果实际中多个节点,就都需要同步复制。
[root@linux-node1 ~]# scp /opt/kubernetes/bin/cfssl* 192.168.56.12:/opt/kubernetes/bin
[root@linux-node1 ~]# scp /opt/kubernetes/bin/cfssl* 192.168.56.13:/opt/kubernetes/bin
```
## 2.初始化cfssl
```
[root@linux-node1 src]# mkdir ssl && cd ssl
[root@linux-node1 ssl]# cfssl print-defaults config > config.json --生成ca-config.json的样例(可省略)
[root@linux-node1 ssl]# cfssl print-defaults csr > csr.json --生成ca-csr.json的样例(可省略)
```
## 3.创建用来生成 CA 文件的 JSON 配置文件
```
[root@linux-node1 ssl]#
cat > ca-config.json < ca-csr.json < 53/UDP,53/TCP 2m
#在node节点使用ipvsadm -Ln查看转发的后端节点(TCP和UDP的53端口)
[root@linux-node2 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.0.2:53 rr
-> 10.2.76.14:53 Masq 1 0 0
-> 10.2.76.20:53 Masq 1 0 0
UDP 10.1.0.2:53 rr
-> 10.2.76.14:53 Masq 1 0 0
-> 10.2.76.20:53 Masq 1 0 0
#发现是转到这2个pod容器
[root@linux-node1 ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
coredns-77c989547b-4f9xz 1/1 Running 0 5m 10.2.76.20 192.168.56.12
coredns-77c989547b-9zm4m 1/1 Running 0 5m 10.2.76.14 192.168.56.13
```
## 测试CoreDNS
```
[root@linux-node1 coredns]# kubectl run dns-test --rm -it --image=alpine /bin/sh
If you don't see a command prompt, try pressing enter.
/ # ping www.qq.com
PING www.qq.com (121.51.142.21): 56 data bytes
64 bytes from 121.51.142.21: seq=0 ttl=127 time=20.864 ms
64 bytes from 121.51.142.21: seq=1 ttl=127 time=19.937 ms
```
================================================
FILE: docs/dashboard.md
================================================
# Kubernetes Dashboard
## 创建Dashboard
```
[root@linux-node1 ~]# kubectl create -f /srv/addons/dashboard/
[root@linux-node1 ~]# kubectl cluster-info
Kubernetes master is running at https://192.168.56.11:6443
kubernetes-dashboard is running at https://192.168.56.11:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
## 查看Dashboard信息
```
#发现Dashboard是运行在node3节点
[root@linux-node1 ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-dashboard-66c9d98865-bqwl5 1/1 Running 0 1h 10.2.76.3 192.168.56.13
#查看Dashboard运行日志
[root@linux-node1 ~]# kubectl logs pod/kubernetes-dashboard-66c9d98865-bqwl5 -n kube-system
#查看Dashboard服务IP(可以访问任意node节点的34696端口就可以访问到Dashboard页面 https://192.168.56.13:34696/#!/overview?namespace=default,如何master节点安装了kube-proxy也可以访问)
[root@linux-node1 ~]# kubectl get service -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.1.36.42 443:34696/TCP 1h
```
https://192.168.56.13:34696/#!/overview?namespace=default

## 访问Dashboard
https://192.168.56.11:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
用户名:admin 密码:admin 选择令牌模式登录。
### 获取Token
```
[root@linux-node1 ~]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
Name: admin-user-token-c97bl
Namespace: kube-system
Labels:
Annotations: kubernetes.io/service-account.name=admin-user
kubernetes.io/service-account.uid=379208ff-cb86-11e8-9f1c-080027dc9cd8
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1359 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWM5N2JsIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIzNzkyMDhmZi1jYjg2LTExZTgtOWYxYy0wODAwMjdkYzljZDgiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.LopL7AD9feBZmhAuAUlPNjfthlJ1lJAPG6VXgBl-MZdofZpqNU9m-o-7M4hHa5AXkpeLvQrA1UKWWSR9eWEN06ugIkcH4Pk-tKrSVQUM6CDaE7eBdK91x1ltTonLz62_z_X8IvRYx1piv3wRUijoyRHCdziBnOhg67sT974CSPoRSOpl7ZR0Kn_L0LYRMOE9xfU3w4-sCpSx-jgc5oysAix95NqZgIkaZ6TRANpCnHE66fqL6yUwQxQ5yt7pw7J2iuSE3OxPU_cKArjYlWUvr72zG3SxZaR7dzQEggwmjSSeHRs0OK0968QAtCca1NTmcPaTtKhXYfXXdtusVCx7bA
```

================================================
FILE: docs/dashboard_op.md
================================================
# Kubernetes Dashboard
```
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
cd /etc/ansible/
ansible-playbook 07.cluster-addon.yml
ansible-playbook 90.setup.yml
systemctl restart iptables
systemctl restart kube-scheduler
systemctl restart kube-controller-manager
systemctl restart kube-apiserver
systemctl restart etcd
systemctl restart docker
systemctl restart iptables
systemctl restart kubelet
systemctl restart kube-proxy
systemctl restart etcd
systemctl restart docker
```
## 1、查看deployment
```
[root@node1 ~]# kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
default my-mc-deployment 3/3 3 3 2d18h
default net 3/3 3 3 4d15h
default net-test 2/2 2 2 4d16h
default test-hello 1/1 1 1 6d
default test-jrr 1/1 1 1 43h
kube-system coredns 0/2 2 0 4d15h
kube-system heapster 1/1 1 1 8d
kube-system kubernetes-dashboard 0/1 1 0 4m42s
kube-system metrics-server 0/1 1 0 8d
kube-system traefik-ingress-controller 1/1 1 1 2d18h
#查看deployment详情
[root@node1 ~]# kubectl describe deployment kubernetes-dashboard -n kube-system
#删除deployment
[root@node1 ~]# kubectl delete deployment kubernetes-dashboard -n kube-system
deployment.extensions "kubernetes-dashboard" deleted
```
## 2、查看Service信息
```
[root@tw06a2753 ~]# kubectl get service -A -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default kubernetes ClusterIP 172.68.0.1 443/TCP 8d
default my-mc-service ClusterIP 172.68.113.166 60001/TCP,60002/TCP 3d14h app=products,department=sales
default nginx-service ClusterIP 172.68.176.9 80/TCP 5d app=nginx
default php-service ClusterIP 172.68.210.6 9898/TCP 5d19h app=nginx-php
default test-hello ClusterIP 172.68.248.205 80/TCP 6d run=test-hello
default test-jrr-php-service ClusterIP 172.68.58.202 9090/TCP 43h app=test-jrr-nginx-php
kube-system heapster ClusterIP 172.68.19.198 80/TCP 8d k8s-app=heapster
kube-system kube-dns ClusterIP 172.68.0.2 53/UDP,53/TCP,9153/TCP 4d15h k8s-app=kube-dns
kube-system kubernetes-dashboard NodePort 172.68.46.171 443:29107/TCP 6m31s k8s-app=kubernetes-dashboard
kube-system metrics-server ClusterIP 172.68.31.222 443/TCP 8d k8s-app=metrics-server
kube-system traefik-ingress-service NodePort 172.68.124.46 80:33813/TCP,8080:21315/TCP 2d18h k8s-app=traefik-ingress-lb
kube-system traefik-web-ui ClusterIP 172.68.226.139 80/TCP 2d19h k8s-app=traefik-ingress-lb
#查看service详情
[root@node1 ~]# kubectl describe svc kubernetes-dashboard -n kube-system
#删除service
[root@node1 ~]# kubectl delete svc kubernetes-dashboard -n kube-system
service "kubernetes-dashboard" deleted
```
## 3、查看Service对应的后端节点
```
#查看kubernetes-dashboard
[root@node1 ~]# kubectl describe svc kubernetes-dashboard -n kube-system
#查看服务my-mc-service
[root@node1 ~]# kubectl describe svc my-mc-service -n default
Name: my-mc-service
Namespace: default
Labels:
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-mc-service","namespace":"default"},"spec":{"ports":[{"name":"m...
Selector: app=products,department=sales
Type: ClusterIP
IP: 172.68.113.166
Port: my-first-port 60001/TCP
TargetPort: 50001/TCP
Endpoints: 172.20.1.209:50001,172.20.2.206:50001,172.20.2.208:50001
Port: my-second-port 60002/TCP
TargetPort: 50002/TCP
Endpoints: 172.20.1.209:50002,172.20.2.206:50002,172.20.2.208:50002 ---发现这个service有这3个后端
Session Affinity: None
Events:
```
## 4、Dashboard运行在哪个节点
```
#发现Dashboard是运行在node3节点
[root@linux-node1 ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-dashboard-66c9d98865-bqwl5 1/1 Running 0 1h 10.2.76.3 192.168.56.13
#查看Dashboard运行日志
[root@linux-node1 ~]# kubectl logs pod/kubernetes-dashboard-66c9d98865-bqwl5 -n kube-system
#查看Dashboard服务IP(可以访问任意node节点的34696端口就可以访问到Dashboard页面 https://192.168.56.13:34696/#!/overview?namespace=default,如何master节点安装了kube-proxy也可以访问)
[root@linux-node1 ~]# kubectl get service -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.1.36.42 443:34696/TCP 1h
```
https://192.168.56.13:34696/#!/overview?namespace=default

## 访问Dashboard
https://192.168.56.11:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
用户名:admin 密码:admin 选择令牌模式登录。
### 获取Token
```
[root@linux-node1 ~]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
Name: admin-user-token-c97bl
Namespace: kube-system
Labels:
Annotations: kubernetes.io/service-account.name=admin-user
kubernetes.io/service-account.uid=379208ff-cb86-11e8-9f1c-080027dc9cd8
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1359 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWM5N2JsIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIzNzkyMDhmZi1jYjg2LTExZTgtOWYxYy0wODAwMjdkYzljZDgiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.LopL7AD9feBZmhAuAUlPNjfthlJ1lJAPG6VXgBl-MZdofZpqNU9m-o-7M4hHa5AXkpeLvQrA1UKWWSR9eWEN06ugIkcH4Pk-tKrSVQUM6CDaE7eBdK91x1ltTonLz62_z_X8IvRYx1piv3wRUijoyRHCdziBnOhg67sT974CSPoRSOpl7ZR0Kn_L0LYRMOE9xfU3w4-sCpSx-jgc5oysAix95NqZgIkaZ6TRANpCnHE66fqL6yUwQxQ5yt7pw7J2iuSE3OxPU_cKArjYlWUvr72zG3SxZaR7dzQEggwmjSSeHRs0OK0968QAtCca1NTmcPaTtKhXYfXXdtusVCx7bA
```

================================================
FILE: docs/delete.md
================================================
```
#master
systemctl restart kube-scheduler
systemctl restart kube-controller-manager
systemctl restart kube-apiserver
systemctl restart flannel
systemctl restart etcd
systemctl restart docker
systemctl stop kube-scheduler
systemctl stop kube-controller-manager
systemctl stop kube-apiserver
systemctl stop flannel
systemctl stop etcd
systemctl stop docker
#node
systemctl restart kubelet
systemctl restart kube-proxy
systemctl restart flannel
systemctl restart etcd
systemctl restart docker
systemctl stop kubelet
systemctl stop kube-proxy
systemctl stop flannel
systemctl stop etcd
systemctl stop docker
```
```
# 清理k8s集群
rm -rf /var/lib/etcd/
rm -rf /var/lib/docker
rm -rf /opt/containerd
rm -rf /opt/kubernetes
rm -rf /var/lib/kubelet
rm -rf /var/lib/chrony
rm -rf /var/lib/kube-proxy
rm -rf /srv/*
systemctl disable kube-scheduler
systemctl disable kube-controller-manager
systemctl disable kube-apiserver
systemctl disable flannel
systemctl disable etcd
systemctl disable docker
systemctl disable kubelet
systemctl disable kube-proxy
systemctl disable flannel
systemctl disable etcd
systemctl disable docker
```
================================================
FILE: docs/docker-install.md
================================================
# study_docker
## 0.卸载旧版本
```bash
yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
```
## 1.安装Docker
第一步:使用国内Docker源
```
cd /etc/yum.repos.d/
wget -O docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
#或
yum -y install yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
```
第二步:Docker安装:
```
yum install -y docker-ce
```
第三步:启动后台进程:
```bash
#启动docker服务
systemctl restart docker
#设置docker服务开启自启
systemctl enable docker
#Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
#查看是否成功设置docker服务开启自启
systemctl list-unit-files|grep docker
docker.service enabled
#关闭docker服务开启自启
systemctl disable docker
#Removed symlink /etc/systemd/system/multi-user.target.wants/docker.service.
```
## 2.脚本安装Docker
```bash
#2.1、Docker官方安装脚本
curl -sSL https://get.docker.com/ | sh
#这个脚本会添加docker.repo仓库并且安装Docker
#2.2、阿里云的安装脚本
curl -sSL http://acs-public-mirror.oss-cn-hangzhou.aliyuncs.com/docker-engine/internet | sh -
#2.3、DaoCloud 的安装脚本
curl -sSL https://get.daocloud.io/docker | sh
```
### 3.Docker服务文件
```bash
# Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信,执行下面命令
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
```
## 3.1、配置docker加速器
```bash
mkdir -p /data0/docker-data
cat > /etc/docker/daemon.json << \EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"data-root": "/data0/docker-data",
"registry-mirrors" : [
"https://ot2k4d59.mirror.aliyuncs.com/"
],
"insecure-registries": ["reg.hub.com"]
}
EOF
或者
curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io
```
### 3.2、重新加载docker的配置文件
```bash
systemctl daemon-reload
systemctl restart docker
```
### 3.3、内核参数配置
```bash
#编辑文件
vim /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
#然后执行
sysctl -p
#查看docker信息是否生效
docker info
```
## 4.通过测试镜像运行一个容器来验证Docker是否安装正确
```bash
docker run hello-world
```
================================================
FILE: docs/etcd-install.md
================================================
# 手动部署ETCD集群
## 0.准备etcd软件包
```
[root@linux-node1 src]# wget https://github.com/coreos/etcd/releases/download/v3.2.18/etcd-v3.2.18-linux-amd64.tar.gz
[root@linux-node1 src]# tar zxf etcd-v3.2.18-linux-amd64.tar.gz
[root@linux-node1 src]# cd etcd-v3.2.18-linux-amd64
[root@linux-node1 etcd-v3.2.18-linux-amd64]# cp etcd etcdctl /opt/kubernetes/bin/
[root@linux-node1 etcd-v3.2.18-linux-amd64]# scp etcd etcdctl 192.168.56.12:/opt/kubernetes/bin/
[root@linux-node1 etcd-v3.2.18-linux-amd64]# scp etcd etcdctl 192.168.56.13:/opt/kubernetes/bin/
```
## 1.创建 etcd 证书签名请求:
```
#约定所有证书都放在 /usr/local/src/ssl 目录中,然后同步到其他机器
[root@linux-node1 ~]# cd /usr/local/src/ssl
[root@linux-node1 ssl]#
cat > etcd-csr.json < flanneld-csr.json </dev/null 2>&1
```
启动flannel
```
[root@linux-node1 ~]# systemctl daemon-reload
[root@linux-node1 ~]# systemctl enable flannel
[root@linux-node1 ~]# chmod +x /opt/kubernetes/bin/*
[root@linux-node1 ~]# systemctl start flannel
```
查看服务状态
```
[root@linux-node1 ~]# systemctl status flannel
```
## 配置Docker使用Flannel
```
[root@linux-node1 ~]# vim /usr/lib/systemd/system/docker.service
[Unit] #在Unit下面修改After和增加Requires
After=network-online.target flannel.service
Wants=network-online.target
Requires=flannel.service
[Service] #增加EnvironmentFile=-/run/flannel/docker
Type=notify
EnvironmentFile=-/run/flannel/docker
ExecStart=/usr/bin/dockerd $DOCKER_OPTS
#最终配置
cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target flannel.service
Requires=flannel.service
[Service]
Type=notify
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/opt/kubernetes/cfg/docker
ExecStart=/usr/bin/dockerd $DOCKER_OPT_BIP $DOCKER_OPT_MTU $DOCKER_OPTS
LimitNOFILE=1048576
LimitNPROC=1048576
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
```
将配置复制到另外两个阶段
```
[root@linux-node1 ~]# scp /usr/lib/systemd/system/docker.service 192.168.56.12:/usr/lib/systemd/system/
[root@linux-node1 ~]# scp /usr/lib/systemd/system/docker.service 192.168.56.13:/usr/lib/systemd/system/
```
重启Docker
```
systemctl daemon-reload
systemctl restart docker
```
================================================
FILE: docs/k8s-error-resolution.md
================================================
## 报错一:flanneld 启动不了
```
Oct 10 10:42:19 linux-node1 flanneld: E1010 10:42:19.499080 1816 main.go:349] Couldn't fetch network config: 100: Key not found (/coreos.com) [11]
```
## 解决办法:
```
#首先查看flannel使用的那种类型的网络模式是对应的etcd中的key是哪个(/kubernetes/network/config 或 /coreos.com/network )
[root@linux-node3 cfg]# cat /opt/kubernetes/cfg/flannel
FLANNEL_ETCD="-etcd-endpoints=https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379"
FLANNEL_ETCD_KEY="-etcd-prefix=/coreos.com/network" ----这个参数值
FLANNEL_ETCD_CAFILE="--etcd-cafile=/opt/kubernetes/ssl/ca.pem"
FLANNEL_ETCD_CERTFILE="--etcd-certfile=/opt/kubernetes/ssl/flanneld.pem"
FLANNEL_ETCD_KEYFILE="--etcd-keyfile=/opt/kubernetes/ssl/flanneld-key.pem"
#etcd集群集群执行下面命令,清空etcd数据
rm -rf /var/lib/etcd/default.etcd/
#下面这条只需在一个节点执行就可以
#如果是/coreos.com/network则执行下面的
[root@linux-node1 ~]# /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem \
--cert-file /opt/kubernetes/ssl/flanneld.pem \
--key-file /opt/kubernetes/ssl/flanneld-key.pem \
--no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \
mk /coreos.com/network/config '{"Network":"172.17.0.0/16"}'
#如果是/kubernetes/network/config则执行下面的
[root@linux-node1 ~]# /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem \
--cert-file /opt/kubernetes/ssl/flanneld.pem \
--key-file /opt/kubernetes/ssl/flanneld-key.pem \
--no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \
mk /kubernetes/network/config '{ "Network": "10.2.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 }}'
```
参考文档:https://stackoverflow.com/questions/34439659/flannel-and-docker-dont-start
## 报错二:flanneld 启动不了
```
Oct 10 11:40:11 linux-node1 flanneld: E1010 11:40:11.797324 20669 main.go:349] Couldn't fetch network config: 104: Not a directory (/kubernetes/network/config) [12]
问题原因:在初次配置的时候,把flannel的配置文件中的etcd-prefix-key配置成了/kubernetes/network/config,实际上应该是/kubernetes/network
[root@linux-node1 ~]# cat /opt/kubernetes/cfg/flannel
FLANNEL_ETCD="-etcd-endpoints=https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379"
FLANNEL_ETCD_KEY="-etcd-prefix=/kubernetes/network/config" --正确的应该为 /kubernetes/network/
FLANNEL_ETCD_CAFILE="--etcd-cafile=/opt/kubernetes/ssl/ca.pem"
FLANNEL_ETCD_CERTFILE="--etcd-certfile=/opt/kubernetes/ssl/flanneld.pem"
FLANNEL_ETCD_KEYFILE="--etcd-keyfile=/opt/kubernetes/ssl/flanneld-key.pem"
```
参考文档:https://www.cnblogs.com/lyzw/p/6016789.html
================================================
FILE: docs/k8s_pv_local.md
================================================
参考文档:
https://kubernetes.io/blog/2018/04/13/local-persistent-volumes-beta/
================================================
FILE: docs/k8s重启pod.md
================================================
通过kubectl delete批量删除全部Pod
```
kubectl delete pod --all
```
```
在没有pod 的yaml文件时,强制重启某个pod
kubectl get pod PODNAME -n NAMESPACE -o yaml | kubectl replace --force -f -
```
```
Q:如何进入一个 pod ?
kubectl get pod 查看pod name
kubectl describe pod name_of_pod 查看pod详细信息
进入pod:
[root@test001 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-68c7f5464c-p52rl 1/1 Running 0 17m 172.20.1.22 10.33.35.6
nginx-deployment-68c7f5464c-qfd24 1/1 Running 0 17m 172.20.2.16 10.33.35.7
kubectl exec -it name-of-pod /bin/bash
```
参考资料:
https://www.jianshu.com/p/baa6b11062de
================================================
FILE: docs/master.md
================================================
## 一.部署Kubernetes API服务部署
### 0.准备软件包
```
[root@linux-node1 ~]# cd /usr/local/src/kubernetes
[root@linux-node1 kubernetes]# cp server/bin/kube-apiserver /opt/kubernetes/bin/
[root@linux-node1 kubernetes]# cp server/bin/kube-controller-manager /opt/kubernetes/bin/
[root@linux-node1 kubernetes]# cp server/bin/kube-scheduler /opt/kubernetes/bin/
```
### 1.创建生成CSR的 JSON 配置文件
```
[root@linux-node1 ~]# cd /usr/local/src/ssl
[root@linux-node1 ssl]#
cat > kubernetes-csr.json < admin-csr.json < /etc/cni/net.d/10-default.conf <0{print $1}'| xargs kubectl certificate approve
```
执行完毕后,查看节点状态已经是Ready的状态了
```
[root@linux-node1 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
192.168.56.12 Ready 103s v1.12.1
192.168.56.13 Ready 103s v1.12.1
```
## 部署Kubernetes Proxy
1.配置kube-proxy使用LVS
```
[root@linux-node2 ~]# yum install -y ipvsadm ipset conntrack
```
2.创建 kube-proxy 证书请求
```
[root@linux-node1 ~]# cd /usr/local/src/ssl/
[root@linux-node1 ssl]#
cat > kube-proxy-csr.json < RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.0.1:443 rr persistent 10800
-> 192.168.56.11:6443 Masq 1 0 0
```
如果你在两台实验机器都安装了kubelet和proxy服务,使用下面的命令可以检查状态:
```
[root@linux-node1 ssl]# kubectl get node
NAME STATUS ROLES AGE VERSION
192.168.56.12 Ready 22m v1.10.1
192.168.56.13 Ready 3m v1.10.1
```
linux-node3节点请自行部署。
================================================
FILE: docs/operational.md
================================================
## 一、服务重启
```
#master
systemctl restart kube-scheduler
systemctl restart kube-controller-manager
systemctl restart kube-apiserver
systemctl restart flannel
systemctl restart etcd
systemctl stop kube-scheduler
systemctl stop kube-controller-manager
systemctl stop kube-apiserver
systemctl stop flannel
systemctl stop etcd
systemctl status kube-apiserver
systemctl status kube-scheduler
systemctl status kube-controller-manager
systemctl status etcd
#node
systemctl restart kubelet
systemctl restart kube-proxy
systemctl restart flannel
systemctl restart etcd
systemctl stop kubelet
systemctl stop kube-proxy
systemctl stop flannel
systemctl stop etcd
systemctl status kubelet
systemctl status kube-proxy
systemctl status flannel
systemctl status etcd
```
## 二、常用查询
```
#查询命名空间
[root@linux-node1 ~]# kubectl get namespace --all-namespaces
NAME STATUS AGE
default Active 3d13h
kube-node-lease Active 3d13h
kube-public Active 3d13h
kube-system Active 3d13h
#查询健康状况
[root@linux-node1 ~]# kubectl get cs --all-namespaces
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
#查询node
[root@linux-node1 ~]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
192.168.56.12 Ready 2m v1.10.3 CentOS Linux 7 (Core) 3.10.0-862.el7.x86_64 docker://18.6.1
192.168.56.13 Ready 2m v1.10.3 CentOS Linux 7 (Core) 3.10.0-862.el7.x86_64 docker://18.6.1
#创建测试deployment
[root@linux-node1 ~]# kubectl run net-test --image=alpine --replicas=2 sleep 360000
#查看创建的deployment
kubectl get deployment -o wide --all-namespaces
#查询pod
[root@linux-node1 ~]# kubectl get pod -o wide --all-namespaces
NAME READY STATUS RESTARTS AGE IP NODE
net-test-5767cb94df-6smfk 1/1 Running 1 1h 10.2.69.3 192.168.56.12
net-test-5767cb94df-ctkhz 1/1 Running 1 1h 10.2.17.3 192.168.56.13
#查询service
[root@linux-node1 ~]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.1.0.1 443/TCP 4m
#Etcd集群健康状况查询
[root@linux-node1 ~]# etcdctl --endpoints=https://192.168.56.11:2379 \
--ca-file=/opt/kubernetes/ssl/ca.pem \
--cert-file=/opt/kubernetes/ssl/etcd.pem \
--key-file=/opt/kubernetes/ssl/etcd-key.pem cluster-health
```
## 三、修改POD的IP地址段
```
#修改一
[root@linux-node1 ~]# vim /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
ExecStart=/opt/kubernetes/bin/kube-controller-manager \
--address=127.0.0.1 \
--master=http://127.0.0.1:8080 \
--allocate-node-cidrs=true \
--service-cluster-ip-range=10.1.0.0/16 \
--cluster-cidr=10.2.0.0/16 \ ---POD的IP地址段
--cluster-name=kubernetes \
--cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem \
--service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem \
--root-ca-file=/opt/kubernetes/ssl/ca.pem \
--leader-elect=true \
--v=2 \
--logtostderr=false \
--log-dir=/opt/kubernetes/log
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
#修改二(修改etcd key中的值)
#创建etcd的key值
/opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem --cert-file /opt/kubernetes/ssl/flanneld.pem --key-file /opt/kubernetes/ssl/flanneld-key.pem \
--no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \
mk /kubernetes/network/config '{ "Network": "10.2.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 }}'
获取etcd中key的值
/opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem --cert-file /opt/kubernetes/ssl/flanneld.pem --key-file /opt/kubernetes/ssl/flanneld-key.pem \
--no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \
get /kubernetes/network/config
修改etcd中key的值
/opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem --cert-file /opt/kubernetes/ssl/flanneld.pem --key-file /opt/kubernetes/ssl/flanneld-key.pem \
--no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \
set /kubernetes/network/config '{ "Network": "10.3.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 }}'
```
================================================
FILE: docs/外部访问K8s中Pod的几种方式.md
================================================
```
Ingress是个什么鬼,网上资料很多(推荐官方),大家自行研究。简单来讲,就是一个负载均衡的玩意,其主要用来解决使用NodePort暴露Service的端口时Node IP会漂移的问题。同时,若大量使用NodePort暴露主机端口,管理会非常混乱。
好的解决方案就是让外界通过域名去访问Service,而无需关心其Node IP及Port。那为什么不直接使用Nginx?这是因为在K8S集群中,如果每加入一个服务,我们都在Nginx中添加一个配置,其实是一个重复性的体力活,只要是重复性的体力活,我们都应该通过技术将它干掉。
Ingress就可以解决上面的问题,其包含两个组件Ingress Controller和Ingress:
Ingress
将Nginx的配置抽象成一个Ingress对象,每添加一个新的服务只需写一个新的Ingress的yaml文件即可
Ingress Controller
将新加入的Ingress转化成Nginx的配置文件并使之生效
```
参考文档:
https://blog.csdn.net/qq_23348071/article/details/87185025 从外部访问K8s中Pod的五种方式
================================================
FILE: docs/虚拟机环境准备.md
================================================
# 一、安装环境准备
下载系统镜像:可以在阿里云镜像站点下载 CentOS
镜像: http://mirrors.aliyun.com/centos/7/isos/x86_64/CentOS-7-x86_64-DVD-1804.iso
创建虚拟机:步骤略。
# 二、操作系统安装
为了统一环境,保证实验的通用性,将网卡名称设置为 eth*,不使用 CentOS 7 默认的网卡命名规则。所以需要在安装的时候,增加内核参数。
## 1)光标选择“Install CentOS 7”

## 2)点击 Tab,打开 kernel 启动选项后,增加 net.ifnames=0 biosdevname=0,如下图所示。

# 三、设置网络
## 1.vmware-workstation设置网络。
如果你的默认 NAT 地址段不是 192.168.56.0/24 可以修改 VMware Workstation 的配置,点击编辑 -> 虚拟 网络配置,然后进行配置。

## 2.virtualbox设置网络。


# 四、系统配置
## 1.设置主机名
```
[root@localhost ~]# vi /etc/hostname
linux-node1.example.com
或
#修改本机hostname
[root@localhost ~]# hostnamectl set-hostname linux-node1.example.com
#让主机名修改生效
[root@localhost ~]# su -l
Last login: Sun Sep 30 04:30:53 EDT 2018 on pts/0
[root@linux-node1 ~]#
```
## 2.安装依赖
```
#为了保证各服务器间时间一致,使用ntpdate同步时间。
# 安装ntpdate
[root@linux-node1 ~]# yum install -y wget lrzsz vim net-tools openssh-clients ntpdate unzip xz
$ 加入crontab
1 * * * * (/usr/sbin/ntpdate -s ntp1.aliyun.com;/usr/sbin/hwclock -w) > /dev/null 2>&1
1 * * * * /usr/sbin/ntpdate ntp1.aliyun.com >/dev/null 2>&1
#设置时区
[root@linux-node1 ~]# cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
```
## 3.设置 IP 地址
请配置静态 IP 地址。注意将 UUID 和 MAC 地址已经其它配置删除掉,便于进行虚 拟机克隆,请参考下面的配置。
```
[root@linux-node1 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
TYPE=Ethernet
BOOTPROTO=static
NAME=eth0
DEVICE=eth0
ONBOOT=yes
IPADDR=192.168.56.11
NETMASK=255.255.255.0
#GATEWAY=192.168.56.2
#重启网络服务
[root@linux-node1 ~]# systemctl restart network
```
## 4.关闭 NetworkManager 和防火墙开启自启动
```
[root@linux-node1 ~]# systemctl disable firewalld
[root@linux-node1 ~]# systemctl disable NetworkManager
```
## 5.设置主机名解析
```
[root@linux-node1 ~]#
cat > /etc/hosts <> /etc/sysctl.conf
[root@linux-node1 ~]# sysctl -p
```
## 8.重启
```
[root@linux-node1 ~]# reboot
```
## 9.克隆虚拟机
关闭虚拟机,并克隆当前虚拟机 linux-node1 到 linux-node2 linux-node3,建议选择“创建完整克隆”,而不是“创 建链接克隆”。
克隆完毕后请给 linux-node2 linux-node3 设置正确的 IP 地址和主机名。
## 10.给虚拟机做快照
分别给三台虚拟机做快照。以便于随时回到一个刚初始化完毕的系统中。可以有效的减少学习过程中 的环境准备时间。同时,请确保实验环境的一致性,便于顺利的完成所有实验。
================================================
FILE: example/coredns/coredns.yaml
================================================
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: Reconcile
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: EnsureExists
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: EnsureExists
data:
Corefile: |
.:53 {
errors
health
kubernetes cluster.local. in-addr.arpa ip6.arpa {
pods insecure
upstream
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
proxy . /etc/resolv.conf
cache 30
}
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: coredns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
replicas: 2
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: coredns
template:
metadata:
labels:
k8s-app: coredns
spec:
serviceAccountName: coredns
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
- key: "CriticalAddonsOnly"
operator: "Exists"
containers:
- name: coredns
image: coredns/coredns:1.0.6
imagePullPolicy: IfNotPresent
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: coredns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: coredns
clusterIP: 10.1.0.2
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
================================================
FILE: example/nginx/nginx-daemonset.yaml
================================================
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nginx-daemonset
labels:
app: nginx
spec:
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
================================================
FILE: example/nginx/nginx-deployment.yaml
================================================
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
================================================
FILE: example/nginx/nginx-ingress.yaml
================================================
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: nginx-ingress
spec:
rules:
- host: www.example.com
http:
paths:
- path: /
backend:
serviceName: nginx-service
servicePort: 80
================================================
FILE: example/nginx/nginx-pod.yaml
================================================
apiVersion: v1
kind: Pod
metadata:
name: nginx-pod
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
================================================
FILE: example/nginx/nginx-rc.yaml
================================================
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx-rc
spec:
replicas: 3
selector:
app: nginx
template:
metadata:
name: nginx
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
================================================
FILE: example/nginx/nginx-rs.yaml
================================================
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: nginx-rs
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.13.12
ports:
- containerPort: 80
================================================
FILE: example/nginx/nginx-service-nodeport.yaml
================================================
kind: Service
apiVersion: v1
metadata:
name: nginx-service
spec:
selector:
app: nginx
ports:
- protocol: TCP
port: 80
targetPort: 80
type: NodePort
================================================
FILE: example/nginx/nginx-service.yaml
================================================
kind: Service
apiVersion: v1
metadata:
name: nginx-service
spec:
selector:
app: nginx
ports:
- protocol: TCP
port: 80
targetPort: 80
================================================
FILE: helm/README.md
================================================
# 一、Helm - K8S的包管理器
类似Centos的yum
## 1、Helm架构
```bash
helm包括chart和release.
helm包含2个组件,Helm客户端和Tiller服务器.
```
## 2、Helm客户端安装
1、脚本安装
```bash
#安装
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get |bash
#查看
which helm
#因服务器端还没安装,这里会报无法连接
helm version
#添加命令补全
helm completion bash > .helmrc
echo "source .helmrc" >> .bashrc
```
2、源码安装
```bash
#源码安装
#curl -O https://get.helm.sh/helm-v2.16.0-linux-amd64.tar.gz
wget -O helm-v2.16.0-linux-amd64.tar.gz https://get.helm.sh/helm-v2.16.0-linux-amd64.tar.gz
tar -zxvf helm-v2.16.0-linux-amd64.tar.gz
cd linux-amd64 #若采用容器化部署到kubernetes中,则可以不用管tiller,只需将helm复制到/usr/bin目录即可
cp helm /usr/bin/
echo "source <(helm completion bash)" >> /root/.bashrc # 命令自动补全
```
## 3、Tiller服务器端安装
1、安装
```bash
helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.0 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
#查看
kubectl get --namespace=kube-system service tiller-deploy
kubectl get --namespace=kube-system deployments. tiller-deploy
kubectl get --namespace=kube-system pods|grep tiller-deploy
#能够看到服务器版本信息
helm version
#添加新的repo
helm repo add stable http://mirror.azure.cn/kubernetes/charts/
```
2、创建helm-rbac.yaml文件
```bash
cat >helm-rbac.yaml<<\EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: tiller
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: tiller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: tiller
namespace: kube-system
EOF
kubectl apply -f helm-rbac.yaml
```
## 4、Helm使用
```bash
#搜索
helm search
#执行命名添加权限
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
#安装chart的mysql应用
helm install stable/mysql
会自动部署 Service,Deployment,Secret 和 PersistentVolumeClaim,并给与很多提示信息,比如mysql密码获取,连接端口等.
#查看release各个对象
kubectl get service doltish-beetle-mysql
kubectl get deployments. doltish-beetle-mysql
kubectl get pods doltish-beetle-mysql-75fbddbd9d-f64j4
kubectl get pvc doltish-beetle-mysql
helm list # 显示已经部署的release
#删除
helm delete doltish-beetle
kubectl get pods
kubectl get service
kubectl get deployments.
kubectl get pvc
```
# 二、使用Helm部署Nginx Ingress
## 1、标记标签
我们将kub1(192.168.56.11)做为边缘节点,打上Label
```bash
#查看node标签
kubectl get nodes --show-labels
kubectl label node k8s-master-01 node-role.kubernetes.io/edge=
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master-01 Ready edge,master 59m v1.16.2
k8s-master-02 Ready 58m v1.16.2
k8s-master-03 Ready 58m v1.16.2
```
## 2、编写chart的值文件ingress-nginx.yaml
```bash
cat >ingress-nginx.yaml<<\EOF
controller:
hostNetwork: true
daemonset:
useHostPort: false
hostPorts:
http: 80
https: 443
service:
type: ClusterIP
tolerations:
- operator: "Exists"
nodeSelector:
node-role.kubernetes.io/edge: ''
defaultBackend:
tolerations:
- operator: "Exists"
nodeSelector:
node-role.kubernetes.io/edge: ''
EOF
```
## 3、安装nginx-ingress
```bash
helm del --purge nginx-ingress
helm repo update
helm install stable/nginx-ingress \
--name nginx-ingress \
--namespace kube-system \
-f ingress-nginx.yaml
如果访问 http://192.168.56.11 返回default backend,则部署完成。
#nginx-ingress
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.26.1
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.26.1 quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.26.1
#defaultbackend
docker pull googlecontainer/defaultbackend-amd64:1.5
docker tag googlecontainer/defaultbackend-amd64:1.5 k8s.gcr.io/defaultbackend-amd64:1.5
docker rmi googlecontainer/defaultbackend-amd64:1.5
```
## 4、查看 nginx-ingress 的 Pod
```bash
kubectl get pods -n kube-system | grep nginx-ingress
```
# 三、Helm 安装部署Kubernetes的dashboard
## 1、创建tls secret
```bash
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout ./tls.key -out ./tls.crt -subj "/CN=k8s.test.com"
```
## 2、安装tls secret
```bash
kubectl delete secret dashboard-tls-secret -n kube-system
kubectl -n kube-system create secret tls dashboard-tls-secret --key ./tls.key --cert ./tls.crt
kubectl get secret -n kube-system |grep dashboard
```
## 3、安装
```bash
cat >kubernetes-dashboard.yaml<<\EOF
image:
repository: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64
tag: v1.10.1
ingress:
enabled: true
hosts:
- k8s.test.com
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "false"
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
tls:
- secretName: dashboard-tls-secret
hosts:
- k8s.test.com
nodeSelector:
node-role.kubernetes.io/edge: ''
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/master
operator: Exists
effect: PreferNoSchedule
rbac:
clusterAdminRole: true
EOF
相比默认配置,修改了以下配置项:
ingress.enabled - 置为 true 开启 Ingress,用 Ingress 将 Kubernetes Dashboard 服务暴露出来,以便让我们浏览器能够访问
ingress.annotations - 指定 ingress.class 为 nginx,让我们安装 Nginx Ingress Controller 来反向代理 Kubernetes Dashboard 服务;由于 Kubernetes Dashboard 后端服务是以 https 方式监听的,而 Nginx Ingress Controller 默认会以 HTTP 协议将请求转发给后端服务,用secure-backends这个 annotation 来指示 Nginx Ingress Controller 以 HTTPS 协议将请求转发给后端服务
ingress.hosts - 这里替换为证书配置的域名
Ingress.tls - secretName 配置为 cert-manager 生成的免费证书所在的 Secret 资源名称,hosts 替换为证书配置的域名
rbac.clusterAdminRole - 置为 true 让 dashboard 的权限够大,这样我们可以方便操作多个 namespace
```
## 4、命令安装
1、安装
```bash
#删除
helm delete kubernetes-dashboard
helm del --purge kubernetes-dashboard
#安装
helm install stable/kubernetes-dashboard \
-n kubernetes-dashboard \
--namespace kube-system \
-f kubernetes-dashboard.yaml
```
2、查看pod
```bash
kubectl get pods -n kube-system -o wide
```
3、查看详细信息
```bash
kubectl describe pod `kubectl get pod -A|grep dashboard|awk '{print $2}'` -n kube-system
```
4、访问
```bash
#获取token
kubectl describe -n kube-system secret/`kubectl -n kube-system get secret | grep kubernetes-dashboard-token|awk '{print $1}'`
#访问
https://k8s.test.com
```
参考文档:
https://www.cnblogs.com/hongdada/p/11395200.html 镜像问题
https://www.qikqiak.com/post/install-nginx-ingress/
https://www.cnblogs.com/bugutian/p/11366556.html 国内不fq安装K8S三: 使用helm安装kubernet-dashboard
https://www.cnblogs.com/hongdada/p/11284534.html Helm 安装部署Kubernetes的dashboard
https://www.cnblogs.com/chanix/p/11731388.html Helm - K8S的包管理器
https://www.cnblogs.com/peitianwang/p/11649621.html
================================================
FILE: kubeadm/K8S-HA-V1.13.4-关闭防火墙版.md
================================================
# 环境介绍:
```bash
CentOS: 7.6
Docker: 18.06.1-ce
Kubernetes: 1.13.4
Kuberadm: 1.13.4
Kuberlet: 1.13.4
Kuberctl: 1.13.4
```
# 部署介绍:
创建高可用首先先有一个 Master 节点,然后再让其他服务器加入组成三个 Master 节点高可用,然后再将工作节点 Node 加入。下面将描述每个节点要执行的步骤:
```bash
Master01: 二、三、四、五、六、七、八、九、十一
Master02、Master03: 二、三、五、六、四、九
node01、node02、node03: 二、五、六、九
```
# 集群架构:

# 一、kuberadm 简介
### 1、Kuberadm 作用
Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令作为快速创建 kubernetes 集群的最佳实践。
kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。
相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。
### 2、Kuberadm 功能
```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```
### 3、功能版本
| Area |
Maturity Level |
| Command line UX |
GA |
| Implementation |
GA |
| Config file API |
beta |
| CoreDNS |
GA |
| kubeadm alpha subcommands |
alpha |
| High availability |
alpha |
| DynamicKubeletConfig |
alpha |
| Self-hosting |
alpha |
# 二、前期准备
### 1、虚拟机分配说明
| 地址 |
主机名 |
内存&CPU |
角色 |
| 10.19.2.200 |
- |
- |
vip |
| 10.19.2.56 |
k8s-master-01 |
2C & 2G |
master |
| 10.19.2.57 |
k8s-master-02 |
2C & 2G |
master |
| 10.19.2.58 |
k8s-master-03 |
2C & 2G |
master |
| 10.19.2.246 |
k8s-node-01 |
4C & 8G |
node |
| 10.19.2.247 |
k8s-node-02 |
4C & 8G |
node |
| 10.19.2.248 |
k8s-node-03 |
4C & 8G |
node |
### 2、各个节点端口占用
- Master 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
6443* |
Kubernetes API |
server All |
| TCP |
Inbound 入口 |
2379-2380 |
etcd server |
client API kube-apiserver, etcd |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
10251 |
kube-scheduler |
Self |
| TCP |
Inbound 入口 |
10252 |
kube-controller-manager |
Self |
- node 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
30000-32767 |
NodePort Services** |
All |
### 3、基础环境设置
Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步,主机名称解析,关闭防火墙等等。
1、主机名称解析
分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问人口,因此一般需要有专用的 DNS 服务负责解决各节点主机 不过,考虑到此处部署的是测试集群,因此为了降低系复杂度,这里将基于 hosts 的文件进行主机名称解析。
2、修改hosts和免key登录
```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.19.2.200 k8s-vip master master.k8s.io
10.19.2.56 k8s-master-01 master01 master01.k8s.io
10.19.2.57 k8s-master-02 master02 master02.k8s.io
10.19.2.58 k8s-master-03 master03 master03.k8s.io
10.19.2.246 k8s-node-01 node01 node01.k8s.io
10.19.2.247 k8s-node-02 node02 node02.k8s.io
10.19.2.248 k8s-node-03 node03 node03.k8s.io
EOF
#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```
3、修改hostname
```bash
#分别进入不同的服务器修改 hostname 名称
# 修改 10.19.2.56 服务器
hostnamectl set-hostname k8s-master-01
# 修改 10.19.2.57 服务器
hostnamectl set-hostname k8s-master-02
# 修改 10.19.2.58 服务器
hostnamectl set-hostname k8s-master-03
# 修改 10.19.2.246 服务器
hostnamectl set-hostname k8s-node-01
# 修改 10.19.2.247 服务器
hostnamectl set-hostname k8s-node-02
# 修改 10.19.2.248 服务器
hostnamectl set-hostname k8s-node-03
```
4、主机时间同步
```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
systemctl start chronyd.service
systemctl enable chronyd.service
```
5、关闭防火墙服务
```bash
systemctl stop firewalld
systemctl disable firewalld
```
6、关闭并禁用SELinux
```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0
# 编辑/etc/sysconfig selinux 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# 查看selinux状态
getenforce
如果为permissive,则执行reboot重新启动即可
```
7、禁用 Swap 设备
kubeadm 默认会预先检当前主机是否禁用了 Swap 设备,并在未用时强制止部署 过程因此,在主机内存资惊充裕的条件下,需要禁用所有的 Swap 设备
```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
或
# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 / xfs defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```
8、设置系统参数
设置允许路由转发,不对bridge的数据进行处理
```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
#挂载br_netfilter
modprobe br_netfilter
#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf
#查看是否生成相关文件
ls /proc/sys/net/bridge
```
9、资源配置文件
`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用
```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```
10、安装依赖包以及相关工具
```bash
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```
# 三、安装Keepalived
- keepalived介绍: 是集群管理中保证集群高可用的一个服务软件,其功能类似于heartbeat,用来防止单点故障
- Keepalived作用: 为haproxy提供vip(10.19.2.200)在三个haproxy实例之间提供主备,降低当其中一个haproxy失效的时对服务的影响。
### 1、yum安装Keepalived
```bash
# 安装keepalived
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y keepalived
```
### 2、配置Keepalived
```bash
cat < /etc/keepalived/keepalived.conf
! Configuration File for keepalived
# 主要是配置故障发生时的通知对象以及机器标识。
global_defs {
# 标识本节点的字条串,通常为 hostname,但不一定非得是 hostname。故障发生时,邮件通知会用到。
router_id LVS_k8s
}
# 用来做健康检查的,当时检查失败时会将 vrrp_instance 的 priority 减少相应的值。
vrrp_script check_haproxy {
script "killall -0 haproxy" #根据进程名称检测进程是否存活
interval 3
weight -2
fall 10
rise 2
}
# rp_instance用来定义对外提供服务的 VIP 区域及其相关属性。
vrrp_instance VI_1 {
state MASTER #当前节点为MASTER,其他两个节点设置为BACKUP
interface eth0 #改为自己的网卡
virtual_router_id 51
priority 250
advert_int 1
authentication {
auth_type PASS
auth_pass 35f18af7190d51c9f7f78f37300a0cbd
}
virtual_ipaddress {
10.19.2.200 #虚拟ip,即VIP
}
track_script {
check_haproxy
}
}
EOF
```
当前节点的配置中 state 配置为 MASTER,其它两个节点设置为 BACKUP
```bash
配置说明:
virtual_ipaddress: vip
track_script: 执行上面定义好的检测的script
interface: 节点固有IP(非VIP)的网卡,用来发VRRP包。
virtual_router_id: 取值在0-255之间,用来区分多个instance的VRRP组播
advert_int: 发VRRP包的时间间隔,即多久进行一次master选举(可以认为是健康查检时间间隔)。
authentication: 认证区域,认证类型有PASS和HA(IPSEC),推荐使用PASS(密码只识别前8位)。
state: 可以是MASTER或BACKUP,不过当其他节点keepalived启动时会将priority比较大的节点选举为MASTER,因此该项其实没有实质用途。
priority: 用来选举master的,要成为master,那么这个选项的值最好高于其他机器50个点,该项取值范围是1-255(在此范围之外会被识别成默认值100)。
# 1、注意防火墙需要放开vrrp协议(不然会出现脑裂现象,三台主机都存在VIP的情况)
#-A INPUT -p vrrp -j ACCEPT
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
#2、注意上面配置script "killall -0 haproxy" #根据进程名称检测进程是否存活,会在/var/log/messages每隔一秒执行检测的日志记录
# tail -100f /var/log/message
Sep 27 10:54:16 tw19410s1 Keepalived_vrrp[9113]: /usr/bin/killall -0 haproxy exited with status 1
```
### 3、启动Keepalived
```bash
# 设置开机启动
systemctl enable keepalived
# 启动keepalived
systemctl start keepalived
# 查看启动状态
systemctl status keepalived
```
### 4、查看网络状态
kepplived 配置中 state 为 MASTER 的节点启动后,查看网络状态,可以看到虚拟IP已经加入到绑定的网卡中
```bash
[root@k8s-master-01 ~]# ip address show eth0
2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff
inet 10.19.2.56/22 brd 10.19.3.255 scope global eth0
valid_lft forever preferred_lft forever
inet 10.19.2.200/32 scope global eth0
valid_lft forever preferred_lft forever
当关掉当前节点的keeplived服务后将进行虚拟IP转移,将会推选state 为 BACKUP 的节点的某一节点为新的MASTER,可以在那台节点上查看网卡,将会查看到虚拟IP
```
# 四、安装haproxy
此处的haproxy为apiserver提供反向代理,haproxy将所有请求轮询转发到每个master节点上。相对于仅仅使用keepalived主备模式仅单个master节点承载流量,这种方式更加合理、健壮。
### 1、yum安装haproxy
```bash
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y haproxy
```
### 2、配置haproxy
```bash
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
mode tcp
bind *:16443
option tcplog
default_backend kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
mode tcp
balance roundrobin
server master01.k8s.io 10.19.2.56:6443 check
server master02.k8s.io 10.19.2.57:6443 check
server master03.k8s.io 10.19.2.58:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
bind *:1080
stats auth admin:awesomePassword
stats refresh 5s
stats realm HAProxy\ Statistics
stats uri /admin?stats
EOF
```
haproxy配置在其他master节点上(10.19.2.57和10.19.2.58)相同
### 3、启动并检测haproxy
```bash
# 设置开机启动
systemctl enable haproxy
# 开启haproxy
systemctl start haproxy
# 查看启动状态
systemctl status haproxy
```
### 4、检测haproxy端口
```bash
ss -lnt | grep -E "16443|1080"
```
# 五、安装Docker (所有节点)
### 1、移除之前安装过的Docker
```bash
sudo yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-ce-cli \
docker-engine
# 查看还有没有存在的docker组件
rpm -qa|grep docker
# 有则通过命令 yum -y remove XXX 来删除,比如:
yum remove docker-ce-cli
```
### 2、配置docker的yum源
下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源
- 阿里镜像源
```bash
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
```
- Docker官方镜像源
```bash
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
```
### 3、安装Docker:
```
# 显示docker-ce所有可安装版本:
yum list docker-ce --showduplicates | sort -r
# 安装指定docker版本
sudo yum install docker-ce-18.06.1.ce-3.el7 -y
# 启动docker并设置docker开机启动
systemctl enable docker
systemctl start docker
# 确认一下iptables
确认一下iptables filter表中FOWARD链的默认策略(pllicy)为ACCEPT。
iptables -nvL
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0
Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 1806,发现默认策略又改回了ACCEPT,这个不知道是从哪个版本改回的,因为我们线上版本使用的1706还是需要手动调整这个策略的。
# 执行下面命令
iptables -P FORWARD ACCEPT
# 修改docker的配置
vim /usr/lib/systemd/system/docker.service
# 增加下面命令
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
# 配置docker加速器
cat > /etc/docker/daemon.json << \EOF
{
"registry-mirrors": [
"https://dockerhub.azk8s.cn",
"https://i37dz0y4.mirror.aliyuncs.com"
],
"insecure-registries": ["reg.hub.com"]
}
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
```
### 4、docker最终的服务文件
```
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
```
# 六、安装kubeadm、kubelet
### 1、配置可用的国内yum源用于安装:
```
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```
### 2、安装kubelet
```
# 需要在每台机器上都安装以下的软件包:
kubeadm: 用来初始化集群的指令。
kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。
kubectl: 用来与集群通信的命令行工具。
# 查看kubelet版本列表
yum list kubelet --showduplicates | sort -r
# 安装kubelet
yum install -y kubelet-1.13.4-0
# 启动kubelet并设置开机启动
systemctl enable kubelet
systemctl start kubelet
# 检查状态
检查状态,发现是failed状态,正常,kubelet会10秒重启一次,等初始化master节点后即可正常
systemctl status kubelet
```
### 3、安装kubeadm
```
# 负责初始化集群
# 1、查看kubeadm版本列表
yum list kubeadm --showduplicates | sort -r
# 2、安装kubeadm
yum install -y kubeadm-1.13.4-0
# 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl
# 3、重启服务器
为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作
reboot
```
# 七、初始化第一个kubernetes master节点
```
# 因为需要绑定虚拟IP,所以需要首先先查看虚拟IP启动这几台master机子哪台上
[root@k8s-master-01 ~]# ip address show eth0
2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff
inet 10.19.2.56/22 brd 10.19.3.255 scope global eth0
valid_lft forever preferred_lft forever
inet 10.19.2.200/32 scope global eth0
valid_lft forever preferred_lft forever
可以看到虚拟IP 10.19.2.200 和 服务器IP 10.19.2.56在一台机子上,所以初始化kubernetes第一个master要在master01机子上进行安装
```
### 1、创建kubeadm配置的yaml文件
```
# 1、创建kubeadm配置的yaml文件
cat > kubeadm-config.yaml << EOF
apiServer:
certSANs:
- k8s-master-01
- k8s-master-02
- k8s-master-03
- master.k8s.io
- 10.19.2.56
- 10.19.2.57
- 10.19.2.58
- 10.19.2.200
- 127.0.0.1
extraArgs:
authorization-mode: Node,RBAC
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "master.k8s.io:16443"
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.13.4
networking:
dnsDomain: cluster.local
podSubnet: 10.20.0.0/16
serviceSubnet: 10.10.0.0/16
scheduler: {}
EOF
以下两个地方设置:
- certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上)
- controlPlaneEndpoint: 虚拟IP:监控端口号
配置说明:
imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库)
podSubnet: 10.20.0.0/16 (#pod地址池)
serviceSubnet: 10.10.0.0/16 (#service地址池)
```
### 2、初始化第一个master节点
```
kubeadm init --config kubeadm-config.yaml
```
日志
```
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```
在此处看日志可以知道,通过
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```
来让节点加入集群
### 3、配置kubectl环境变量
```bash
# 配置环境变量
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 4、查看组件状态
```bash
kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
# 查看pod状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动
coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动
etcd-k8s-master-01 1/1 Running 0 6m39s
kube-apiserver-k8s-master-01 1/1 Running 0 6m43s
kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s
kube-proxy-88s74 1/1 Running 0 7m32s
kube-scheduler-k8s-master-01 1/1 Running 0 6m45s
可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态
```
# 八、安装网络插件
### 1、配置flannel插件的yaml文件
```bash
cat > kube-flannel.yaml << EOF
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.20.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: kube-flannel-ds-amd64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
template:
metadata:
labels:
tier: node
app: flannel
spec:
hostNetwork: true
nodeSelector:
beta.kubernetes.io/arch: amd64
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: true
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
EOF
“Network”: “10.20.0.0/16”要和kubeadm-config.yaml配置文件中podSubnet: 10.20.0.0/16相同
```
### 2、创建flanner相关role和pod
```
# 应用生效
[root@k8s-master-01 ~]# kubectl apply -f kube-flannel.yaml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
# 等待一会时间,再次查看各个pods的状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功
coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功
etcd-k8s-master-01 1/1 Running 0 11m
kube-apiserver-k8s-master-01 1/1 Running 0 12m
kube-controller-manager-k8s-master-01 1/1 Running 0 11m
kube-flannel-ds-amd64-7lj6m 1/1 Running 0 13s
kube-proxy-88s74 1/1 Running 0 12m
kube-scheduler-k8s-master-01 1/1 Running 0 12m
```
# 九、加入集群
### 1、Master加入集群构成高可用
```
复制秘钥到各个节点
在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03
如果其他节点为初始化第一个master节点,则将该节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。
```
- 复制文件到 master02
```
ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd
```
- 复制文件到 master03
```
ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd
```
- master节点加入集群
master02 和 master03 服务器上都执行加入集群操作
```bash
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane
```
如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行从“复制秘钥”和“加入集群”这两步
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```bash
# 显示安装过程:
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
```
- 配置kubectl环境变量
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 2、node节点加入集群
除了让master节点加入集群组成高可用外,slave节点也要加入集群中。
这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作
输入初始化k8s master时候提示的加入命令,如下:
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```
node节点加入,不需要加上 –experimental-control-plane 这个参数
### 3、如果忘记加入集群的token和sha256 (如正常则跳过)
- 显示获取token列表
```
kubeadm token list
```
默认情况下 Token 过期是时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token
```
kubeadm token create
```
- 获取ca证书sha256编码hash值
```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```
拼接命令
```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```
### 4、查看各个节点加入集群情况
```
kubectl get nodes -o wide
```
# 十、从集群中删除 Node
- Master节点:
```
kubectl drain --delete-local-data --force --ignore-daemonsets
kubectl delete node
```
- Slave节点:
```
kubeadm reset
```
## 初始化失败
```bash
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -rf /var/lib/etcd/*
```
参考资料:
http://www.mydlq.club/article/4/
================================================
FILE: kubeadm/K8S-HA-V1.16.x-云环境-Calico.md
================================================
# 环境介绍:
```bash
CentOS: 7.6
Docker: docker-ce-18.09.9
Kubernetes: 1.16.2
- calico 3.8.2
- Kubeadm: 1.16.2
- nginx-ingress 1.5.3
- Kubelet: 1.16.2
```
# 部署介绍:
```
三个 master 组成主节点集群,通过内网 loader balancer 实现负载均衡;至少需要三个 master 节点才可组成高可用集群,否则会出现 脑裂 现象
多个 worker 组成工作节点集群,通过外网 loader balancer 实现负载均衡
```
# 集群架构:

# 一、kuberadm 简介
### 1、Kuberadm 作用
Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令作为快速创建 kubernetes 集群的最佳实践。
kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。
相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。
### 2、Kuberadm 功能
```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```
### 3、功能版本
| Area |
Maturity Level |
| Command line UX |
GA |
| Implementation |
GA |
| Config file API |
beta |
| CoreDNS |
GA |
| kubeadm alpha subcommands |
alpha |
| High availability |
alpha |
| DynamicKubeletConfig |
alpha |
| Self-hosting |
alpha |
# 二、前期准备
### 1、虚拟机分配说明
| 地址 |
主机名 |
内存&CPU |
角色 |
| 10.10.1.100 |
- |
- |
vip |
| 10.10.0.24 |
k8s-master-01 |
2C & 2G |
master |
| 10.10.0.32 |
k8s-master-02 |
2C & 2G |
master |
| 10.10.0.23 |
k8s-master-03 |
2C & 2G |
master |
| 10.10.0.25 |
k8s-node-01 |
4C & 8G |
node |
| 10.10.0.29 |
k8s-node-02 |
4C & 8G |
node |
| 10.10.0.12 |
k8s-node-03 |
4C & 8G |
node |
### 2、各个节点端口占用
- Master 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
6443* |
Kubernetes API |
server All |
| TCP |
Inbound 入口 |
2379-2380 |
etcd server |
client API kube-apiserver, etcd |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
10251 |
kube-scheduler |
Self |
| TCP |
Inbound 入口 |
10252 |
kube-controller-manager |
Self |
- node 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
30000-32767 |
NodePort Services** |
All |
### 3、基础环境设置
Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步,主机名称解析,关闭防火墙等等。
1、主机名称解析
分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问人口,因此一般需要有专用的 DNS 服务负责解决各节点主机 不过,考虑到此处部署的是测试集群,因此为了降低系复杂度,这里将基于 hosts 的文件进行主机名称解析。
2、修改hosts和免key登录
```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.10.1.100 k8s-vip master master.k8s.io
10.10.0.24 k8s-master-01 master01 master01.k8s.io
10.10.0.32 k8s-master-02 master02 master02.k8s.io
10.10.0.23 k8s-master-03 master03 master03.k8s.io
10.10.0.25 k8s-node-01 node01 node01.k8s.io
10.10.0.29 k8s-node-02 node02 node02.k8s.io
10.10.0.12 k8s-node-03 node03 node03.k8s.io
EOF
#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```
3、修改hostname
```bash
#分别进入不同的服务器修改 hostname 名称
# 修改 10.10.0.24 服务器
hostnamectl set-hostname k8s-master-01
# 修改 10.10.0.32 服务器
hostnamectl set-hostname k8s-master-02
# 修改 10.10.0.23 服务器
hostnamectl set-hostname k8s-master-03
# 修改 10.10.0.25 服务器
hostnamectl set-hostname k8s-node-01
# 修改 10.10.0.29 服务器
hostnamectl set-hostname k8s-node-02
# 修改 10.10.0.12 服务器
hostnamectl set-hostname k8s-node-03
```
4、主机时间同步
```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
yum install chrony -y
systemctl restart chronyd.service
systemctl enable chronyd.service
```
5、关闭防火墙服务
```bash
systemctl stop firewalld
systemctl disable firewalld
```
6、关闭并禁用SELinux
```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0
# 编辑/etc/sysconfig selinux 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# 查看selinux状态
getenforce
如果为permissive,则执行reboot重新启动即可
```
7、禁用 Swap 设备
kubeadm 默认会预先检当前主机是否禁用了 Swap 设备,并在未用时强制止部署 过程因此,在主机内存资惊充裕的条件下,需要禁用所有的 Swap 设备
```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
cat /etc/fstab
或
# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 / xfs defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```
8、设置系统参数
设置允许路由转发,不对bridge的数据进行处理
```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
#挂载br_netfilter
modprobe br_netfilter
#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf
#查看是否生成相关文件
ls /proc/sys/net/bridge
```
9、资源配置文件
`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用
```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```
10、安装依赖包以及相关工具
```bash
yum install -y epel-release
yum install -y yum-utils nfs-utils expect device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```
# 五、安装Docker (所有节点)
### 1、移除之前安装过的Docker
```bash
sudo yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-ce-cli \
docker-engine
# 查看还有没有存在的docker组件
rpm -qa|grep docker
# 有则通过命令 yum -y remove XXX 来删除,比如:
yum remove docker-ce-cli
```
### 2、配置docker的yum源
下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源
- 阿里镜像源
```bash
yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
```
- Docker官方镜像源
```bash
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
```
### 3、安装Docker:
```
# 显示docker-ce所有可安装版本:
yum list docker-ce --showduplicates | sort -r
# 安装指定docker版本
yum install -y docker-ce-18.09.9 docker-ce-cli-18.09.9 containerd.io
# 启动docker并设置docker开机启动
systemctl enable docker
systemctl start docker
# 确认一下iptables
确认一下iptables filter表中FOWARD链的默认策略(pllicy)为ACCEPT。
iptables -nvL
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0
Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 1806,发现默认策略又改回了ACCEPT,这个不知道是从哪个版本改回的,因为我们线上版本使用的1706还是需要手动调整这个策略的。
# 执行下面命令
iptables -P FORWARD ACCEPT
# 修改docker的配置
vim /usr/lib/systemd/system/docker.service
# 增加下面命令(ExecReload后面新增ExecStartPost=...)
...
ExecReload=/bin/kill -s HUP $MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
...
# 修改docker Cgroup Driver为systemd
# sed -i "s#^ExecStart=/usr/bin/dockerd.*#ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd#g" /usr/lib/systemd/system/docker.service
# 设置 docker 镜像,提高 docker 镜像下载速度和稳定性
curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io
# 或者直接配置文件docker加速器
cat > /etc/docker/daemon.json << \EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": [
"https://dockerhub.azk8s.cn",
"https://i37dz0y4.mirror.aliyuncs.com"
],
"insecure-registries": ["reg.hub.com"]
}
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
docker info|grep -i Cgroup
```
### 4、docker最终的服务文件
```
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
```
# 六、安装kubeadm、kubelet
### 1、配置yum源用于安装:
- 1、配置国内yum源
```
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# 安装kubelet、kubeadm、kubectl
yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet.service
systemctl enable kubelet.service
```
- 2、kubeadm 官方镜像源
```
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
# 安装kubelet、kubeadm、kubectl
yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet.service
systemctl enable kubelet.service
```
### 2、安装kubelet
```
# 需要在每台机器上都安装以下的软件包:
kubeadm: 用来初始化集群的指令。
kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。
kubectl: 用来与集群通信的命令行工具。
# 查看kubelet版本列表
yum list kubelet --showduplicates | sort -r
# 安装kubelet
yum install -y kubelet-1.16.2
# 启动kubelet并设置开机启动
systemctl daemon-reload
systemctl enable kubelet
systemctl restart kubelet
# 检查状态
检查状态,发现是failed状态,正常,kubelet会10秒重启一次,需等下面完成初始化master节点后即可正常
systemctl status kubelet
# 查看kubelet日志
journalctl -u kubelet --no-pager
```
### 3、安装kubeadm
```
# 负责初始化集群
# 1、查看kubeadm版本列表
yum list kubeadm --showduplicates | sort -r
# 2、安装kubeadm
yum install -y kubeadm-1.16.2
# 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl
# 3、重启服务器
为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作
reboot
```
# 七、初始化第一个kubernetes master节点
以 `root` 身份在 `k8s-master-01` 机器上执行
初始化 `master` 节点时,如果因为中间某些步骤的配置出错,想要重新初始化 `master` 节点,请先执行 `yes | kubeadm reset` 操作
```bash
#查看初始化配置文件
kubeadm config view
```
1、精简配置文件初始化
```
# 替换 apiserver.demo 为 您想要的 dnsName
export APISERVER_NAME=master.k8s.io
# Kubernetes 容器组所在的网段,该网段安装完成后,由 kubernetes 创建,事先并不存在于您的物理网络中
export VER=v1.16.2
export POD_SUBNET=10.244.0.0/16
export SVC_SUBNET=10.96.0.0/12
rm -f ./kubeadm-config.yaml
cat < ./kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: ${VER}
#imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
controlPlaneEndpoint: "${APISERVER_NAME}:6443"
networking:
serviceSubnet: "${SVC_SUBNET}"
podSubnet: "${POD_SUBNET}"
dnsDomain: "cluster.local"
EOF
# kubeadm init
# 根据您服务器网速的情况,您需要等候 3 - 10 分钟
kubeadm init --config=kubeadm-config.yaml --upload-certs
# 配置 kubectl
rm -rf /root/.kube/
mkdir /root/.kube/
yes | cp -i /etc/kubernetes/admin.conf /root/.kube/config
```
2、详细配置文件初始化
```
# 1、创建kubeadm配置的yaml文件
rm -f ./kubeadm-config.yaml
export VER=v1.16.2
export MASTER_NODE1=10.10.0.24
export APISERVER_NAME=master.k8s.io
export POD_SUBNET=10.244.0.0/16
export SVC_SUBNET=10.96.0.0/12
cat < ./kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: ${MASTER_NODE1} #这里填写第一个初始化的master的ip
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: k8s-master-01 #注意这里需要调整为自己的节点
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: kubernetes
kubernetesVersion: ${VER}
certificatesDir: /etc/kubernetes/pki
controllerManager: {}
controlPlaneEndpoint: "${APISERVER_NAME}:16443" # 这里写vip的地址或域名加上端口
imageRepository: k8s.gcr.io
#imageRepository: registry.aliyuncs.com/google_containers # 使用阿里云镜像
apiServer:
timeoutForControlPlane: 4m0s
certSANs:
- k8s-master-01
- k8s-master-02
- k8s-master-03
- master.k8s.io
- 10.10.1.100
- 10.10.0.24
- 10.10.0.32
- 10.10.0.23
- 127.0.0.1
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
networking:
dnsDomain: cluster.local
podSubnet: ${POD_SUBNET}
serviceSubnet: ${SVC_SUBNET}
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy 模式
EOF
kubeadm init --config=kubeadm-config.yaml --upload-certs
以下两个地方设置:
- certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上)
- controlPlaneEndpoint: VIP:端口号
配置说明:
imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库)
podSubnet: 10.244.0.0/16 (#pod地址池)
serviceSubnet: 10.96.0.0/12 (#service地址池)
```
3、查看初始化配置文件
```
# 查看kubeadm配置文件
root># kubeadm config view
apiServer:
extraArgs:
authorization-mode: Node,RBAC
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: master.k8s.io:6443
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.16.2
networking:
dnsDomain: cluster.local
podSubnet: 10.244.0.0/16
serviceSubnet: 10.96.0.0/12
scheduler: {}
```
### 2、初始化第一个master节点
```
kubeadm init --config=kubeadm-config.yaml --upload-certs #使用这个就不用做拷贝证书的操作
```
日志
```
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \
--discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 \
--control-plane --certificate-key 6054323448a1aeb661b78763262db5c30e12026c54341400d48401a853194ec2
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \
--discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56
```
### 执行结果中
用于初始化第二、三个 master 节点
```
#初始化第二个master节点
export MASTER_NODE2=10.10.0.32
kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE2} --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
--control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523
#初始化第三个master节点
export MASTER_NODE3=10.10.0.23
kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE3} --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
--control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523
```
用于初始化 worker 节点
```
kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5
```
### 3、配置kubectl环境变量
```bash
# 配置环境变量
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 4、查看组件状态
```bash
kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
# 查看pod状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动
coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动
etcd-k8s-master-01 1/1 Running 0 6m39s
kube-apiserver-k8s-master-01 1/1 Running 0 6m43s
kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s
kube-proxy-88s74 1/1 Running 0 7m32s
kube-scheduler-k8s-master-01 1/1 Running 0 6m45s
可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态
#检查ETCD服务
docker exec -it $(docker ps |grep etcd_etcd|awk '{print $1}') sh
etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key member list
etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health
```
# 八、安装网络插件
### 1、安装 calico 网络插件
```
# 安装 calico 网络插件
# 参考文档 https://docs.projectcalico.org/v3.9/getting-started/kubernetes/
export POD_SUBNET=10.244.0.0/16
rm -f calico.yaml
wget https://docs.projectcalico.org/v3.9/manifests/calico.yaml
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml
kubectl apply -f calico.yaml
```
### 2、等待一会时间,再次查看各个pods的状态
```
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功
coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功
etcd-k8s-master-01 1/1 Running 0 11m
kube-apiserver-k8s-master-01 1/1 Running 0 12m
kube-controller-manager-k8s-master-01 1/1 Running 0 11m
kube-flannel-ds-amd64-7lj6m 1/1 Running 0 13s
kube-proxy-88s74 1/1 Running 0 12m
kube-scheduler-k8s-master-01 1/1 Running 0 12m
```
# 九、加入集群
### 1、Master加入集群构成高可用
```
复制秘钥到各个节点
在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03
如果其他节点为初始化第一个master节点,则将该节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。
```
- 复制文件到 master02
```
ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd
```
- 复制文件到 master03
```
ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd
```
- master节点加入集群
master02 和 master03 服务器上都执行加入集群操作
```bash
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane
```
如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行从“复制秘钥”和“加入集群”这两步
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```bash
# 显示安装过程:
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
```
- 配置kubectl环境变量
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 2、node节点加入集群
除了让master节点加入集群组成高可用外,slave节点也要加入集群中。
这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作
输入初始化k8s master时候提示的加入命令,如下:
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```
node节点加入,不需要加上 –experimental-control-plane 这个参数
### 3、如果忘记加入集群的token和sha256 (如正常则跳过)
- 显示获取token列表
```
kubeadm token list
```
默认情况下 Token 过期是时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token
```
kubeadm token create
```
- 获取ca证书sha256编码hash值
```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```
拼接命令
```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```
### 4、查看各个节点加入集群情况
```
kubectl get nodes -o wide
```
# 十、从集群中删除 Node
- Master节点:
```
kubectl drain --delete-local-data --force --ignore-daemonsets
kubectl delete node
```
- Slave节点:
```
kubeadm reset
```
## 初始化失败
```bash
yes | kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -rf /var/lib/etcd/*
```
# 十一、安装Kubernetes Dashboard 2.0
```
#安装
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml
#卸载
kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml
```
参考资料:
http://www.mydlq.club/article/4/
https://kuboard.cn/install/install-kubernetes.html#%E5%88%9D%E5%A7%8B%E5%8C%96%E7%AC%AC%E4%B8%80%E4%B8%AAmaster%E8%8A%82%E7%82%B9
https://blog.51cto.com/fengwan/2426528?source=dra kubeadm搭建高可用kubernetes 1.15.1
https://segmentfault.com/a/1190000018741112?utm_source=tag-newest Kubernetes的几种主流部署方式02-kubeadm部署高可用集群
================================================
FILE: kubeadm/K8S-V1.16.2-开启防火墙-Flannel.md
================================================
Table of Contents
=================
* [一、防火墙配置](#一防火墙配置)
* [二、初始化](#二初始化)
* [三、初始化集群](#三初始化集群)
* [1、命令行初始化](#1命令行初始化)
* [2、通过配置文件进行初始化](#2通过配置文件进行初始化)
* [3、初始化进行的操作](#3初始化进行的操作)
* [4、单独部署coredns(选择操作)](#4单独部署coredns选择操作)
* [5、集群移除节点](#5集群移除节点)
* [6、kube-proxy开启ipvs](#6kube-proxy开启ipvs)
* [四、Master操作](#四master操作)
* [五、Node操作](#五node操作)
* [六、集群操作](#六集群操作)
* [七、网络插件部署](#七网络插件部署)
* [1、master上部署flannel插件](#1master上部署flannel插件)
* [2、master上部署calico插件](#2master上部署calico插件)
* [3、性能对比](#3性能对比)
* [八、安装 Dashboard](#八安装-dashboard)
* [1、下载yaml文件](#1下载yaml文件)
* [2、修改配置](#2修改配置)
* [3、查看dashboard](#3查看dashboard)
* [4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml)](#4然后创建一个具有全局所有权限的用户来登录dashboardadminyaml)
* [九、问题排查](#九问题排查)
* [1、coredns异常问题](#1coredns异常问题)
* [1.1、解决办法](#11解决办法)
* [2、kubelet异常问题1](#2kubelet异常问题1)
* [3、kubelet异常问题2](#3kubelet异常问题2)
# 一、防火墙配置
```bash
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install iptables iptables-services -y
cat > /etc/sysconfig/iptables << \EOF
# Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP
### k8s ###
-A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT
# serviceSubnet rules
-A RH-Firewall-1-INPUT -s 10.96.0.0/12 -j ACCEPT
# podSubnet rules
-A RH-Firewall-1-INPUT -s 10.244.0.0/16 -j ACCEPT
# keepalived rules
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
# port rules
-A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443,30000:32767 -j ACCEPT
### k8s ###
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Thu Aug 1 01:26:09 2019
EOF
systemctl restart iptables.service
systemctl enable iptables.service
iptables -nvL
```
# 二、初始化
```bash
cat > /etc/hosts << \EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.11 linux-node1 linux-node1.example.com
192.168.56.12 linux-node2 linux-node2.example.com
192.168.56.13 linux-node3 linux-node3.example.com
EOF
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
sed -i 's/SELINUXTYPE=.*/SELINUXTYPE=disabled/g' /etc/selinux/config
# 关闭 swap
swapoff -a
#sed -ir 's/.*swap.*/#&/' /etc/fstab
#或
yes | cp /etc/fstab /etc/fstab_bak
cat /etc/fstab_bak |grep -v swap > /etc/fstab
#export Time=`date "+%Y%m%d%H%M%S"`
#cp /etc/fstab /etc/fstab_$Time
cat > /etc/sysctl.d/k8s.conf << \EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
#加载 br_netfilter 模块
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
#创建/etc/sysconfig/modules/ipvs.modules文件,保证在节点重启后能自动加载所需模块
cat > /etc/sysconfig/modules/ipvs.modules < /etc/docker/daemon.json << \EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"data-root": "/data0/docker-data",
"registry-mirrors" : [
"https://ot2k4d59.mirror.aliyuncs.com/"
],
"insecure-registries": ["reg.hub.com"]
}
EOF
systemctl daemon-reload
systemctl restart docker
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.16.2-0 kubeadm-1.16.2-0 kubectl-1.16.2-0 --disableexcludes=kubernetes
kubeadm version
systemctl daemon-reload
systemctl restart kubelet.service
systemctl enable kubelet.service
systemctl status kubelet
#查看kubelet日志
journalctl -f -u kubelet
#kubelet.service服务位置
ls -l /lib/systemd/system/kubelet.service
```
# 三、初始化集群
## 1、命令行初始化
```bash
#master节点初始化指令
kubeadm init \
--apiserver-advertise-address=192.168.56.11 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.16.2 \
--apiserver-bind-port=6443 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 #这里使用这个是因为官方flannel使用的这个段地址,不然的话,kube-flannel.yml那里需要调整
#其他节点可以先指定image源,先下载需要的镜像
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
#查看集群初始化配置
kubeadm config view
#获取加入集群的指令
kubeadm token create --print-join-command
kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25
#集群初始化如果遇到问题,可以使用下面的命令进行清理
yes | kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f $HOME/.kube/config
systemctl restart kubelet
systemctl status kubelet
journalctl -f -u kubelet
```
## 2、通过配置文件进行初始化
```bash
#在 master 节点配置 kubeadm 初始化文件,可以通过如下命令导出默认的初始化配置:
root># kubeadm config print init-defaults > kubeadm.yaml
```
```bash
#然后根据我们自己的需求修改配置,比如修改 imageRepository 的值,kube-proxy 的模式为 ipvs
如果是 flannel 网络插件的,需要将 networking.podSubnet 设置为默认的 10.244.0.0/16
如果是 Calico 网络插件的,配置成 Calico 的默认网段 podSubnet: 192.168.0.0/16,这个也可以修改Calico的配置文件调整
rm -f kubeadm.yaml
cat > kubeadm.yaml << \EOF
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.56.11 #修改为主节点 IP
bindPort: 6443
#controlPlaneEndpoint: 1.1.1.100 #如果前面配置了负载均衡,此处填写vip地址
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: linux-node1.example.com
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS #dns 类型
etcd:
local:
dataDir: /var/lib/etcd
#imageRepository: k8s.gcr.io
imageRepository: registry.aliyuncs.com/google_containers #国内不能访问 Google,修改为阿里云
kind: ClusterConfiguration
kubernetesVersion: v1.16.2 # 修改版本号
networking:
dnsDomain: cluster.local
# 配置成 flannel 的默认网段
serviceSubnet: 10.96.0.0/12
podSubnet: 10.244.0.0/16
scheduler: {}
---
# 开启 IPVS 模式
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy 模式
EOF
kubeadm init --config kubeadm.yaml
```
## 3、初始化进行的操作
```bash
初始化操作主要经历了下面15个步骤,每个阶段均输出均使用[步骤名称]作为开头:
1、[init]:指定版本进行初始化操作
2、[preflight] :初始化前的检查和下载所需要的Docker镜像文件。
3、[kubelet-start] :生成kubelet的配置文件”/var/lib/kubelet/config.yaml”,没有这个文件kubelet无法启动,所以初始化之前的kubelet实际上启动失败。
4、[certificates]:生成Kubernetes使用的证书,存放在/etc/kubernetes/pki目录中。
5、[kubeconfig] :生成 KubeConfig 文件,存放在/etc/kubernetes目录中,组件之间通信需要使用对应文件。
6、[control-plane]:使用/etc/kubernetes/manifest目录下的YAML文件,安装 Master 组件。
7、[etcd]:使用/etc/kubernetes/manifest/etcd.yaml安装Etcd服务。
8、[wait-control-plane]:等待control-plan部署的Master组件启动。
9、[apiclient]:检查Master组件服务状态。
10、[uploadconfig]:更新配置
11、[kubelet]:使用configMap配置kubelet。
12、[patchnode]:更新CNI信息到Node上,通过注释的方式记录。
13、[mark-control-plane]:为当前节点打标签,打了角色Master,和不可调度标签,这样默认就不会使用Master节点来运行Pod。
14、[bootstrap-token]:生成token记录下来,后边使用kubeadm join往集群中添加节点时会用到
15、[addons]:安装附加组件CoreDNS和kube-proxy
kubectl默认会在执行的用户家目录下面的.kube目录下寻找config文件。这里是将在初始化时[kubeconfig]步骤生成的admin.conf拷贝到.kube/config。
```
## 4、单独部署coredns(选择操作)
```bash
# 不依赖kubeadm的方式,适用于不是使用kubeadm创建的k8s集群,或者kubeadm初始化集群之后,删除了dns相关部署
# 在calico网络中也配置一个coredns # 10.96.0.10 为k8s官方指定的kube-dns地址
rm -f coredns.yaml.sed deploy.sh coredns.yml
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
chmod +x deploy.sh
./deploy.sh -i 10.10.0.10 > coredns.yml #这里从--service-cidr=10.10.0.0/16中选用10.10.0.10作为coredns地址
kubectl apply -f coredns.yml
# 查看
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system
#删除coredns
kubectl delete deployment coredns -n kube-system
kubectl delete svc kube-dns -n kube-system
kubectl delete cm coredns -n kube-system
```
## 5、集群移除节点
```bash
1、#移除work节点
在准备移除的 worker 节点上执行
kubeadm reset
2、在第一个 master 节点 demo-master-a-1 上执行
kubectl delete node demo-worker-x-x
#worker 节点的名字可以通过在第一个 master 节点 demo-master-a-1 上执行 kubectl get nodes 命令获得
```
## 6、kube-proxy开启ipvs
```bash
1、#修改ConfigMap的kube-system/kube-proxy中的config.conf,把 mode: "" 改为mode: “ipvs" 保存退出即可
root># kubectl edit cm kube-proxy -n kube-system
configmap/kube-proxy edited
2、#删除之前的proxy pod
root># kubectl get pod -n kube-system |grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'
3、#查看proxy运行状态
root># kubectl get pod -n kube-system | grep kube-proxy
4、#查看日志,如果有 `Using ipvs Proxier.` 说明kube-proxy的ipvs 开启成功!
root># kubectl logs kube-proxy-54qnw -n kube-system
I0518 20:24:09.319160 1 server_others.go:176] Using ipvs Proxier.
W0518 20:24:09.319751 1 proxier.go:386] IPVS scheduler not specified, use rr by default
I0518 20:24:09.320035 1 server.go:562] Version: v1.14.2
I0518 20:24:09.334372 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0518 20:24:09.334853 1 config.go:102] Starting endpoints config controller
I0518 20:24:09.334916 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I0518 20:24:09.334945 1 config.go:202] Starting service config controller
I0518 20:24:09.334976 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0518 20:24:09.435153 1 controller_utils.go:1034] Caches are synced for service config controller
I0518 20:24:09.435271 1 controller_utils.go:1034] Caches are synced for endpoints config controller
```
# 四、Master操作
```bash
#将 master 节点上面的 $HOME/.kube/config 文件拷贝到 node 节点对应的文件中
mkdir -p $HOME/.kube
yes | cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
scp $HOME/.kube/config root@linux-node2:$HOME/.kube/config
scp $HOME/.kube/config root@linux-node3:$HOME/.kube/config
#指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
# 五、Node操作
```bash
#node节点操作
mkdir -p $HOME/.kube
sudo chown $(id -u):$(id -g) $HOME/.kube/config
#加入集群
kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25
```
# 六、集群操作
```bash
#批量重启docker
docker restart `docker ps -a -q`
root># kubectl get nodes
NAME STATUS ROLES AGE VERSION
linux-node1.example.com NotReady master 11m v1.15.3
linux-node2.example.com NotReady 5m9s v1.15.3
linux-node3.example.com NotReady 4m58s v1.15.3
可以看到是 NotReady 状态,这是因为还没有安装网络插件,接下来安装网络插件,可以在文档 https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 中选择我们自己的网络插件,这里我们安装 flannel:
iptables -I RH-Firewall-1-INPUT -s 10.96.0.0/16 -j ACCEPT
service iptables save
root># kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-5c98db65d4-mk254 1/1 Running 0 14m
coredns-5c98db65d4-ntz98 1/1 Running 0 14m
etcd-linux-node1.example.com 1/1 Running 0 13m
kube-apiserver-linux-node1.example.com 1/1 Running 0 13m
kube-controller-manager-linux-node1.example.com 1/1 Running 0 13m
kube-flannel-ds-amd64-6kx7m 1/1 Running 0 11m
kube-flannel-ds-amd64-cqfnb 1/1 Running 0 11m
kube-flannel-ds-amd64-thxx2 1/1 Running 0 11m
kube-proxy-gdtjg 1/1 Running 0 12m
kube-proxy-lcscl 1/1 Running 0 14m
kube-proxy-sb7d8 1/1 Running 0 12m
kube-scheduler-linux-node1.example.com 1/1 Running 0 13m
kubernetes-dashboard-fcfb4cbc-dqbq9 1/1 Running 0 4m43s
kubectl describe pod/coredns-5c98db65d4-mk254 -n kube-system
#创建Deployment
kubectl run --image=nginx nginx-web-1 --image-pull-policy='IfNotPresent' --replicas=3
#以不同方式暴露出去
kubectl expose deployment nginx-web-1 --port=80 --target-port=80
kubectl expose deployment nginx-web-1 --port=80 --target-port=80 --type=NodePort
root># kubectl exec -it nginx-web-1-5cc49f46bc-kn46r -- \
sh -c "echo hello>/usr/share/nginx/html/index.html"
root># kubectl get svc -A
default nginx-web-1 NodePort 10.10.43.53 80:30163/TCP 101s
root># kubectl get endpoints
nginx-web-1 10.244.154.193:80,10.244.44.193:80,10.244.89.129:80 5m27s
root># curl 10.10.43.53
hello
#显示iptables规则(注意这里kube-proxy需要使用ipvs模式,上面主机预设的iptables策略才生效)
iptables -nvL --line-number
#删除规则
iptables -D RH-Firewall-1-INPUT 4
```
# 七、网络插件部署
## 1、master上部署flannel插件
```bash
#插件镜像 network: flannel image(因墙的问题,需要从国内源下载)
docker pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64
https://www.cnblogs.com/horizonli/p/10855666.html
#部署flannel
rm -f kube-flannel.yml
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
sed -i 's#image: quay.io/coreos/flannel:v0.11.0-amd64#image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64#g' kube-flannel.yml
kubectl apply -f kube-flannel.yml
#另外需要注意的是如果你的节点有多个网卡的话,需要在 kube-flannel.yml 中使用--iface参数指定集群主机内网网卡的名称,否则可能会出现 dns 无法解析。flanneld 启动参数加上--iface=
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
```
## 2、master上部署calico插件
```bash
export POD_SUBNET=10.244.0.0/16
rm -f calico.yaml
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml
kubectl apply -f calico.yaml
https://www.cnblogs.com/goldsunshine/p/10701242.html k8s网络之Calico网络
```
## 3、性能对比
```bash
https://www.2cto.com/net/201701/591629.html kubernetes flannel neutron calico三种网络方案性能测试分析
```
# 八、安装 Dashboard
使用 dashboard 最好把浏览器的默认语言设置为英文,不然在进入容器操作的时候会有bug,会出现重影,然后k8s v1.16.x之后,需要使用Dashboard v2.0以上的版本,不然出现在error_outline 未知服务器错误 (404)
## 1、下载yaml文件
```bash
#下载
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta5/aio/deploy/recommended.yaml
```
## 2、修改配置
```bash
1、#热更新打补丁的方式修改svc
kubectl apply -f recommended.yaml
kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard -p '{"spec":{"type":"NodePort"}}'
kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard -p '{"spec": {"ports": [{"port":443, "nodePort": 30001}]}}'
kubectl get svc -A|grep kubernetes-dashboard
https://www.jianshu.com/p/f38e1767bf19 使用 kubectl patch 更新 API 对象
2、#手动修改recommended.yaml文件,为了方便访问,修改kubernetes-dashboard的Service定义,指定Service的type类型为NodeType,指定nodePort端口
kubectl delete -f recommended.yaml
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
type: NodePort # 新增这一行,指定为NodePort方式
ports:
- port: 443
targetPort: 8443
nodePort: 30001 # 指定端口为30001
selector:
k8s-app: kubernetes-dashboard
---
kubectl apply -f recommended.yaml
#注:dashboard-metrics-scraper的Service不需要修改
Kubernetes Dashboard 默认部署时,只配置了最低权限的 RBAC
参考文档:https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md
```
## 3、查看dashboard
```bash
root># kubectl get pod,deploy,svc -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-76585494d8-ws57d 1/1 Running 0 2m18s
pod/kubernetes-dashboard-6b86b44f87-q26w6 1/1 Running 0 2m18s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/dashboard-metrics-scraper 1/1 1 1 2m18s
deployment.apps/kubernetes-dashboard 1/1 1 1 2m18s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.102.114.143 8000/TCP 2m18s
service/kubernetes-dashboard NodePort 10.111.191.70 443:30001/TCP 2m19s
root># curl https://10.111.191.70:443 -k -I
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: no-store
Content-Length: 1262
Content-Type: text/html; charset=utf-8
Last-Modified: Mon, 14 Oct 2019 16:39:02 GMT
Date: Wed, 13 Nov 2019 02:25:52 GMT
# 我们可以看到官方的dashboard帮我们启动了web-ui,并且帮我们启动了一个Metric服务
# 但是dashboard默认使用的https的443端口
然后可以通过上面的 https://NodeIP:30001 端口去访问 Dashboard,要记住使用 https,Chrome不生效可以使用Firefox测试:
```
## 4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml)
```bash
cat > admin.yaml << \EOF
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: admin
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: admin
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
EOF
kubectl apply -f admin.yaml
kubectl delete -f admin.yaml
#获取token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}')
```
https://192.168.56.12:31513
然后用上面的base64解码后的字符串作为token登录Dashboard即可: k8s dashboard
最终我们就完成了使用 kubeadm 搭建 v1.15.3 版本的 kubernetes 集群、coredns、ipvs、flannel。
# 九、问题排查
## 1、coredns异常问题

```
E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host
E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host
log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-bccdc95cf-vlqxk.unknownuser.log.ERROR.20191006-123053.1: no such file or directory
```
### 1.1、解决办法
```
实际上是主机防火墙的问题,需要添加
iptables -A RH-Firewall-1-INPUT -s 10.10.0.0/16 -j ACCEPT
其他参考
https://medium.com/@cminion/quicknote-kubernetes-networking-issues-78f1e0d06e12
https://github.com/coredns/coredns/issues/2325
```
## 2、kubelet异常问题1
```
问题现象:
kubelet fails to get cgroup stats for docker and kubelet services
解决办法:
cat > /etc/sysconfig/kubelet <<\EOF
KUBELET_EXTRA_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice
EOF
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet
#查看kubelet日志
journalctl -f -u kubelet
https://stackoverflow.com/questions/46726216/kubelet-fails-to-get-cgroup-stats-for-docker-and-kubelet-services
https://www.twblogs.net/a/5cc87d63bd9eee1ac2ed736b
```
## 3、kubelet异常问题2
```
failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
#解决办法
添加如下内容--cgroup-driver=systemd
[root@tw19336 ~]# cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=systemd"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet
https://www.cnblogs.com/hongdada/p/9771857.html
```
参考文档:
https://www.cnblogs.com/liyongjian5179/p/11417794.html 使用kubeadm安装Kubernetes 1.15.3 并开启 ipvs
https://www.jianshu.com/p/8bc61078bded
https://www.cnblogs.com/lovesKey/p/10888006.html centos7下用kubeadm安装k8s集群并使用ipvs做高可用方案
https://github.com/kubernetes/dashboard/wiki/Creating-sample-user
https://www.qikqiak.com/post/use-kubeadm-install-kubernetes-1.15.3/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 官方文档
https://www.jianshu.com/p/d0933d6ae162 kubeadm 1.15 安装
https://yq.aliyun.com/articles/680080/ 单独部署coredns
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/#stacked-etcd-topology etcd-stacked-cluster
https://www.kubernetes.org.cn/5021.html etcd 集群运维实践
================================================
FILE: kubeadm/Kubernetes 集群变更IP地址.md
================================================
参考资料:
https://blog.csdn.net/whywhy0716/article/details/92658111 Kubernetes 集群变更IP地址
================================================
FILE: kubeadm/README.md
================================================
# 一、防火墙配置
```
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install iptables iptables-services -y
cat > /etc/sysconfig/iptables << \EOF
# Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP
### k8s ###
-A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT
# serviceSubnet rules
-A RH-Firewall-1-INPUT -s 10.96.0.0/12 -j ACCEPT
# podSubnet rules
-A RH-Firewall-1-INPUT -s 10.244.0.0/16 -j ACCEPT
# keepalived rules
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
# port rules
-A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443,30000:32767 -j ACCEPT
### k8s ###
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Thu Aug 1 01:26:09 2019
EOF
systemctl restart iptables.service
systemctl enable iptables.service
iptables -nvL
```
# 二、初始化
```bash
cat > /etc/hosts << \EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.11 linux-node1 linux-node1.example.com
192.168.56.12 linux-node2 linux-node2.example.com
192.168.56.13 linux-node3 linux-node3.example.com
EOF
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
sed -i 's/SELINUXTYPE=.*/SELINUXTYPE=disabled/g' /etc/selinux/config
# 关闭 swap
swapoff -a
#sed -ir 's/.*swap.*/#&/' /etc/fstab
#或
yes | cp /etc/fstab /etc/fstab_bak
cat /etc/fstab_bak |grep -v swap > /etc/fstab
#export Time=`date "+%Y%m%d%H%M%S"`
#cp /etc/fstab /etc/fstab_$Time
cat > /etc/sysctl.d/k8s.conf << \EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
#加载 br_netfilter 模块
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
#创建/etc/sysconfig/modules/ipvs.modules文件,保证在节点重启后能自动加载所需模块
cat > /etc/sysconfig/modules/ipvs.modules < /etc/docker/daemon.json << \EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors" : [
"https://ot2k4d59.mirror.aliyuncs.com/"
]
}
EOF
systemctl daemon-reload
systemctl restart docker
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet.service
kubeadm version
systemctl enable kubelet.service
systemctl status kubelet
#查看kubelet日志
journalctl -f -u kubelet
#kubelet.service服务位置
/lib/systemd/system/kubelet.service
```
# 三、初始化集群
1、命令行初始化
```bash
kubeadm init \
--apiserver-advertise-address=192.168.56.11 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.16.2 \
--apiserver-bind-port=6443 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 #这里使用这个是因为官方flannel使用的这个段地址,不然的话,kube-flannel.yml那里需要调整
#其他节点可以先指定image源,先下载需要的镜像
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
#获取加入集群的指令
kubeadm token create --print-join-command
kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25
#集群初始化如果遇到问题,可以使用下面的命令进行清理
yes | kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f $HOME/.kube/config
systemctl restart kubelet
systemctl status kubelet
journalctl -f -u kubelet
```
2、通过配置文件进行初始化
```bash
#在 master 节点配置 kubeadm 初始化文件,可以通过如下命令导出默认的初始化配置:
root># kubeadm config print init-defaults > kubeadm.yaml
```
```
#然后根据我们自己的需求修改配置,比如修改 imageRepository 的值,kube-proxy 的模式为 ipvs
如果是 flannel 网络插件的,需要将 networking.podSubnet 设置为默认的 10.244.0.0/16
如果是 Calico 网络插件的,配置成 Calico 的默认网段 podSubnet: 192.168.0.0/16,这个也可以修改Calico的配置文件调整
rm -f kubeadm.yaml
cat > kubeadm.yaml << \EOF
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.56.11 #修改为主节点 IP
bindPort: 6443
#controlPlaneEndpoint: 1.1.1.100 #如果前面配置了负载均衡,此处填写vip地址
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: linux-node1.example.com
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS #dns 类型
etcd:
local:
dataDir: /var/lib/etcd
#imageRepository: k8s.gcr.io
imageRepository: registry.aliyuncs.com/google_containers #国内不能访问 Google,修改为阿里云
kind: ClusterConfiguration
kubernetesVersion: v1.16.2 # 修改版本号
networking:
dnsDomain: cluster.local
# 配置成 flannel 的默认网段
serviceSubnet: 10.96.0.0/12
podSubnet: 10.244.0.0/16
scheduler: {}
---
# 开启 IPVS 模式
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy 模式
EOF
kubeadm init --config kubeadm.yaml
```
3、初始化进行的操作
```bash
初始化操作主要经历了下面15个步骤,每个阶段均输出均使用[步骤名称]作为开头:
1、[init]:指定版本进行初始化操作
2、[preflight] :初始化前的检查和下载所需要的Docker镜像文件。
3、[kubelet-start] :生成kubelet的配置文件”/var/lib/kubelet/config.yaml”,没有这个文件kubelet无法启动,所以初始化之前的kubelet实际上启动失败。
4、[certificates]:生成Kubernetes使用的证书,存放在/etc/kubernetes/pki目录中。
5、[kubeconfig] :生成 KubeConfig 文件,存放在/etc/kubernetes目录中,组件之间通信需要使用对应文件。
6、[control-plane]:使用/etc/kubernetes/manifest目录下的YAML文件,安装 Master 组件。
7、[etcd]:使用/etc/kubernetes/manifest/etcd.yaml安装Etcd服务。
8、[wait-control-plane]:等待control-plan部署的Master组件启动。
9、[apiclient]:检查Master组件服务状态。
10、[uploadconfig]:更新配置
11、[kubelet]:使用configMap配置kubelet。
12、[patchnode]:更新CNI信息到Node上,通过注释的方式记录。
13、[mark-control-plane]:为当前节点打标签,打了角色Master,和不可调度标签,这样默认就不会使用Master节点来运行Pod。
14、[bootstrap-token]:生成token记录下来,后边使用kubeadm join往集群中添加节点时会用到
15、[addons]:安装附加组件CoreDNS和kube-proxy
kubectl默认会在执行的用户家目录下面的.kube目录下寻找config文件。这里是将在初始化时[kubeconfig]步骤生成的admin.conf拷贝到.kube/config。
```
2、单独部署coredns(选择操作)
```
# 不依赖kubeadm的方式,适用于不是使用kubeadm创建的k8s集群,或者kubeadm初始化集群之后,删除了dns相关部署
# 在calico网络中也配置一个coredns # 10.96.0.10 为k8s官方指定的kube-dns地址
rm -f coredns.yaml.sed deploy.sh coredns.yml
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
chmod +x deploy.sh
./deploy.sh -i 10.10.0.10 > coredns.yml #这里从--service-cidr=10.10.0.0/16中选用10.10.0.10作为coredns地址
kubectl apply -f coredns.yml
# 查看
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system
#删除coredns
kubectl delete deployment coredns -n kube-system
kubectl delete svc kube-dns -n kube-system
kubectl delete cm coredns -n kube-system
```
3、集群移除节点
```
1、#移除work节点
在准备移除的 worker 节点上执行
kubeadm reset
2、在第一个 master 节点 demo-master-a-1 上执行
kubectl delete node demo-worker-x-x
#worker 节点的名字可以通过在第一个 master 节点 demo-master-a-1 上执行 kubectl get nodes 命令获得
```
4、kube-proxy开启ipvs
```
1、#修改ConfigMap的kube-system/kube-proxy中的config.conf,把 mode: "" 改为mode: “ipvs" 保存退出即可
root># kubectl edit cm kube-proxy -n kube-system
configmap/kube-proxy edited
2、#删除之前的proxy pod
root># kubectl get pod -n kube-system |grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'
3、#查看proxy运行状态
root># kubectl get pod -n kube-system | grep kube-proxy
4、#查看日志,如果有 `Using ipvs Proxier.` 说明kube-proxy的ipvs 开启成功!
root># kubectl logs kube-proxy-54qnw -n kube-system
I0518 20:24:09.319160 1 server_others.go:176] Using ipvs Proxier.
W0518 20:24:09.319751 1 proxier.go:386] IPVS scheduler not specified, use rr by default
I0518 20:24:09.320035 1 server.go:562] Version: v1.14.2
I0518 20:24:09.334372 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0518 20:24:09.334853 1 config.go:102] Starting endpoints config controller
I0518 20:24:09.334916 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I0518 20:24:09.334945 1 config.go:202] Starting service config controller
I0518 20:24:09.334976 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0518 20:24:09.435153 1 controller_utils.go:1034] Caches are synced for service config controller
I0518 20:24:09.435271 1 controller_utils.go:1034] Caches are synced for endpoints config controller
```
# 四、Master操作
```
#将 master 节点上面的 $HOME/.kube/config 文件拷贝到 node 节点对应的文件中
mkdir -p $HOME/.kube
yes | cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
scp $HOME/.kube/config root@linux-node2:$HOME/.kube/config
scp $HOME/.kube/config root@linux-node3:$HOME/.kube/config
#指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
# 五、Node操作
```
#node节点操作
mkdir -p $HOME/.kube
sudo chown $(id -u):$(id -g) $HOME/.kube/config
#加入集群
kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25
```
# 六、集群操作
```
#批量重启docker
docker restart `docker ps -a -q`
root># kubectl get nodes
NAME STATUS ROLES AGE VERSION
linux-node1.example.com NotReady master 11m v1.15.3
linux-node2.example.com NotReady 5m9s v1.15.3
linux-node3.example.com NotReady 4m58s v1.15.3
可以看到是 NotReady 状态,这是因为还没有安装网络插件,接下来安装网络插件,可以在文档 https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 中选择我们自己的网络插件,这里我们安装 flannel:
iptables -I RH-Firewall-1-INPUT -s 10.96.0.0/16 -j ACCEPT
service iptables save
root># kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-5c98db65d4-mk254 1/1 Running 0 14m
coredns-5c98db65d4-ntz98 1/1 Running 0 14m
etcd-linux-node1.example.com 1/1 Running 0 13m
kube-apiserver-linux-node1.example.com 1/1 Running 0 13m
kube-controller-manager-linux-node1.example.com 1/1 Running 0 13m
kube-flannel-ds-amd64-6kx7m 1/1 Running 0 11m
kube-flannel-ds-amd64-cqfnb 1/1 Running 0 11m
kube-flannel-ds-amd64-thxx2 1/1 Running 0 11m
kube-proxy-gdtjg 1/1 Running 0 12m
kube-proxy-lcscl 1/1 Running 0 14m
kube-proxy-sb7d8 1/1 Running 0 12m
kube-scheduler-linux-node1.example.com 1/1 Running 0 13m
kubernetes-dashboard-fcfb4cbc-dqbq9 1/1 Running 0 4m43s
kubectl describe pod/coredns-5c98db65d4-mk254 -n kube-system
#创建Deployment
kubectl run --image=nginx nginx-web-1 --image-pull-policy='IfNotPresent' --replicas=3
#以不同方式暴露出去
kubectl expose deployment nginx-web-1 --port=80 --target-port=80
kubectl expose deployment nginx-web-1 --port=80 --target-port=80 --type=NodePort
root># kubectl exec -it nginx-web-1-5cc49f46bc-kn46r -- \
sh -c "echo hello>/usr/share/nginx/html/index.html"
root># kubectl get svc -A
default nginx-web-1 NodePort 10.10.43.53 80:30163/TCP 101s
root># kubectl get endpoints
nginx-web-1 10.244.154.193:80,10.244.44.193:80,10.244.89.129:80 5m27s
root># curl 10.10.43.53
hello
#显示iptables规则(注意这里kube-proxy需要使用ipvs模式,上面主机预设的iptables策略才生效)
iptables -nvL --line-number
#删除规则
iptables -D RH-Firewall-1-INPUT 4
```
# 七、网络插件部署
1、master上部署flannel插件
```
#插件镜像 network: flannel image(因墙的问题,需要从国内源下载)
docker pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64
https://www.cnblogs.com/horizonli/p/10855666.html
#部署flannel
rm -f kube-flannel.yml
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
sed -i 's#image: quay.io/coreos/flannel:v0.11.0-amd64#image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64#g' kube-flannel.yml
kubectl apply -f kube-flannel.yml
#另外需要注意的是如果你的节点有多个网卡的话,需要在 kube-flannel.yml 中使用--iface参数指定集群主机内网网卡的名称,否则可能会出现 dns 无法解析。flanneld 启动参数加上--iface=
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
```
2、master上部署calico插件
```
export POD_SUBNET=10.244.0.0/16
rm -f calico.yaml
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml
kubectl apply -f calico.yaml
https://www.cnblogs.com/goldsunshine/p/10701242.html k8s网络之Calico网络
```
3、性能对比
```
https://www.2cto.com/net/201701/591629.html kubernetes flannel neutron calico三种网络方案性能测试分析
```
# 八、安装 Dashboard
使用 dashboard 最好把浏览器的默认语言设置为英文,不然在进入容器操作的时候会有bug,会出现重影
1、下载yaml文件
```
wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
vim kubernetes-dashboard.yaml
1、# 修改镜像名称
......
spec:
containers:
- name: kubernetes-dashboard
#image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 #这个换成阿里云的镜像
image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
......
2、# 修改Service为NodePort类型
......
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
type: NodePort # 新增这一行,指定为NodePort方式
ports:
- port: 443
targetPort: 8443
nodePort: 32370 #新增这一行,指定固定node端口
selector:
k8s-app: kubernetes-dashboard
```
2、dashboard最终文件
```
cat > kubernetes-dashboard.yaml << \EOF
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ------------------- Dashboard Secret ------------------- #
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-certs
namespace: kube-system
type: Opaque
---
# ------------------- Dashboard Service Account ------------------- #
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
---
# ------------------- Dashboard Role & Role Binding ------------------- #
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: kubernetes-dashboard-minimal
namespace: kube-system
rules:
# Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
resources: ["secrets"]
verbs: ["create"]
# Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["create"]
# Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
verbs: ["get", "update", "delete"]
# Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kubernetes-dashboard-settings"]
verbs: ["get", "update"]
# Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
resources: ["services"]
resourceNames: ["heapster"]
verbs: ["proxy"]
- apiGroups: [""]
resources: ["services/proxy"]
resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kubernetes-dashboard-minimal
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system
---
# ------------------- Dashboard Deployment ------------------- #
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
#image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
type: NodePort # 新增这一行,指定为NodePort方式
ports:
- port: 443
targetPort: 8443
nodePort: 32370 #新增这一行,指定固定node端口
selector:
k8s-app: kubernetes-dashboard
EOF
kubectl apply -f kubernetes-dashboard.yaml
```
3、查看dashboard
```
root># kubectl get pods -n kube-system -l k8s-app=kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-fcfb4cbc-dqbq9 1/1 Running 0 8m5s
root># kubectl get svc -n kube-system -l k8s-app=kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 192.168.56.11 443:32730/TCP 8m25s
然后可以通过上面的 https://NodeIP:32730 端口去访问 Dashboard,要记住使用 https,Chrome不生效可以使用Firefox测试:
```
4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml)
```
cat > admin.yaml << \EOF
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: admin
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: admin
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
EOF
kubectl apply -f admin.yaml
kubectl delete -f admin.yaml
#获取token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}')
```
https://192.168.56.12:31513
然后用上面的base64解码后的字符串作为token登录Dashboard即可: k8s dashboard
最终我们就完成了使用 kubeadm 搭建 v1.15.3 版本的 kubernetes 集群、coredns、ipvs、flannel。
# 九、问题排查
1、coredns异常问题

```
E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host
E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host
log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-bccdc95cf-vlqxk.unknownuser.log.ERROR.20191006-123053.1: no such file or directory
```
解决办法
```
实际上是主机防火墙的问题,需要添加
iptables -A RH-Firewall-1-INPUT -s 10.10.0.0/16 -j ACCEPT
其他参考
https://medium.com/@cminion/quicknote-kubernetes-networking-issues-78f1e0d06e12
https://github.com/coredns/coredns/issues/2325
```
2、kubelet异常问题1
```
问题现象:
kubelet fails to get cgroup stats for docker and kubelet services
解决办法:
cat > /etc/sysconfig/kubelet <<\EOF
KUBELET_EXTRA_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice
EOF
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet
#查看kubelet日志
journalctl -f -u kubelet
https://stackoverflow.com/questions/46726216/kubelet-fails-to-get-cgroup-stats-for-docker-and-kubelet-services
https://www.twblogs.net/a/5cc87d63bd9eee1ac2ed736b
```
3、kubelet异常问题2
```
failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
#解决办法
添加如下内容--cgroup-driver=systemd
[root@tw19336 ~]# cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=systemd"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet
https://www.cnblogs.com/hongdada/p/9771857.html
```
参考文档:
https://www.cnblogs.com/liyongjian5179/p/11417794.html 使用kubeadm安装Kubernetes 1.15.3 并开启 ipvs
https://www.jianshu.com/p/8bc61078bded
https://www.cnblogs.com/lovesKey/p/10888006.html centos7下用kubeadm安装k8s集群并使用ipvs做高可用方案
https://github.com/kubernetes/dashboard/wiki/Creating-sample-user
https://www.qikqiak.com/post/use-kubeadm-install-kubernetes-1.15.3/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 官方文档
https://www.jianshu.com/p/d0933d6ae162 kubeadm 1.15 安装
https://yq.aliyun.com/articles/680080/ 单独部署coredns
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/#stacked-etcd-topology etcd-stacked-cluster
https://www.kubernetes.org.cn/5021.html etcd 集群运维实践
================================================
FILE: kubeadm/k8S-HA-V1.15.3-Calico-开启防火墙版.md
================================================
# 环境介绍:
```bash
CentOS: 7.6
Docker: docker-ce-18.09.9
Kubernetes: 1.15.3
Kubeadm: 1.15.3
Kubelet: 1.15.3
Kubectl: 1.15.3
```
# 部署介绍:
创建高可用首先先有一个 Master 节点,然后再让其他服务器加入组成三个 Master 节点高可用,然后再将工作节点 Node 加入。下面将描述每个节点要执行的步骤:
```bash
Master01: 二、三、四、五、六、七、八、九、十一
Master02、Master03: 二、三、五、六、四、九
node01、node02、node03: 二、五、六、九
```
# 防火墙配置
```bash
yum install iptables iptables-services -y
cat > /etc/sysconfig/iptables << \EOF
# Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP
#k8s
-A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.14/32 -j ACCEPT
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443 -j ACCEPT
#
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Thu Aug 1 01:26:09 2019
EOF
systemctl restart iptables.service
systemctl enable iptables.service
iptables -nvL
```
# 集群架构:

# 一、kuberadm 简介
### 1、Kuberadm 作用
Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令作为快速创建 kubernetes 集群的最佳实践。
kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。
相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。
### 2、Kuberadm 功能
```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```
### 3、功能版本
| Area |
Maturity Level |
| Command line UX |
GA |
| Implementation |
GA |
| Config file API |
beta |
| CoreDNS |
GA |
| kubeadm alpha subcommands |
alpha |
| High availability |
alpha |
| DynamicKubeletConfig |
alpha |
| Self-hosting |
alpha |
# 二、前期准备
### 1、虚拟机分配说明
| 地址 |
主机名 |
内存&CPU |
角色 |
| 192.168.56.200 |
- |
- |
vip |
| 192.168.56.11 |
k8s-master-01 |
2C & 2G |
master |
| 192.168.56.12 |
k8s-master-02 |
2C & 2G |
master |
| 192.168.56.13 |
k8s-master-03 |
2C & 2G |
master |
| 192.168.56.14 |
k8s-node-01 |
4C & 8G |
node |
| 192.168.56.15 |
k8s-node-02 |
4C & 8G |
node |
| 192.168.56.16 |
k8s-node-03 |
4C & 8G |
node |
### 2、各个节点端口占用
- Master 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
6443* |
Kubernetes API |
server All |
| TCP |
Inbound 入口 |
2379-2380 |
etcd server |
client API kube-apiserver, etcd |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
10251 |
kube-scheduler |
Self |
| TCP |
Inbound 入口 |
10252 |
kube-controller-manager |
Self |
- node 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
30000-32767 |
NodePort Services** |
All |
### 3、基础环境设置
Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步,主机名称解析,关闭防火墙等等。
1、主机名称解析
分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问人口,因此一般需要有专用的 DNS 服务负责解决各节点主机 不过,考虑到此处部署的是测试集群,因此为了降低系复杂度,这里将基于 hosts 的文件进行主机名称解析。
2、修改hosts和免key登录
```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.200 k8s-vip master master.k8s.io
192.168.56.11 k8s-master-01 master01 master01.k8s.io
192.168.56.12 k8s-master-02 master02 master02.k8s.io
192.168.56.13 k8s-master-03 master03 master03.k8s.io
192.168.56.14 k8s-node-01 node01 node01.k8s.io
192.168.56.15 k8s-node-02 node02 node02.k8s.io
192.168.56.16 k8s-node-03 node03 node03.k8s.io
EOF
#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```
3、修改hostname
```bash
#分别进入不同的服务器修改 hostname 名称
# 修改 192.168.56.11 服务器
hostnamectl set-hostname k8s-master-01
# 修改 192.168.56.12 服务器
hostnamectl set-hostname k8s-master-02
# 修改 192.168.56.13 服务器
hostnamectl set-hostname k8s-master-03
# 修改 192.168.56.14 服务器
hostnamectl set-hostname k8s-node-01
# 修改 192.168.56.15 服务器
hostnamectl set-hostname k8s-node-02
# 修改 192.168.56.16 服务器
hostnamectl set-hostname k8s-node-03
```
4、主机时间同步
```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
systemctl start chronyd.service
systemctl enable chronyd.service
```
5、关闭防火墙服务
```bash
systemctl stop firewalld
systemctl disable firewalld
```
6、关闭并禁用SELinux
```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0
# 编辑/etc/sysconfig selinux 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# 查看selinux状态
getenforce
如果为permissive,则执行reboot重新启动即可
```
7、禁用 Swap 设备
kubeadm 默认会预先检当前主机是否禁用了 Swap 设备,并在未用时强制止部署 过程因此,在主机内存资惊充裕的条件下,需要禁用所有的 Swap 设备
```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
cat /etc/fstab
或
# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 / xfs defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```
8、设置系统参数
设置允许路由转发,不对bridge的数据进行处理
```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
#挂载br_netfilter
modprobe br_netfilter
#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf
#查看是否生成相关文件
ls /proc/sys/net/bridge
```
9、资源配置文件
`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用
```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```
10、安装依赖包以及相关工具
```bash
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```
# 三、安装Keepalived
- keepalived介绍: 是集群管理中保证集群高可用的一个服务软件,其功能类似于heartbeat,用来防止单点故障
- Keepalived作用: 为haproxy提供vip(192.168.56.200)在三个haproxy实例之间提供主备,降低当其中一个haproxy失效的时对服务的影响。
### 1、yum安装Keepalived
```bash
# 安装keepalived
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y keepalived
```
### 2、配置Keepalived
```bash
cat < /etc/keepalived/keepalived.conf
! Configuration File for keepalived
# 主要是配置故障发生时的通知对象以及机器标识。
global_defs {
# 标识本节点的字条串,通常为 hostname,但不一定非得是 hostname。故障发生时,邮件通知会用到。
router_id LVS_K8S
}
# 用来做健康检查的,当时检查失败时会将 vrrp_instance 的 priority 减少相应的值。
vrrp_script check_haproxy {
script "killall -0 haproxy" #根据进程名称检测进程是否存活
interval 3
weight -2
fall 10
rise 2
}
# rp_instance用来定义对外提供服务的 VIP 区域及其相关属性。
vrrp_instance VI_1 {
state MASTER #当前节点为MASTER,其他两个节点设置为BACKUP
interface eth0 #改为自己的网卡
virtual_router_id 51
priority 250
advert_int 1
authentication {
auth_type PASS
auth_pass 35f18af7190d51c9f7f78f37300a0cbd
}
virtual_ipaddress {
192.168.56.200 #虚拟ip,即VIP
}
track_script {
check_haproxy
}
}
EOF
```
当前节点的配置中 state 配置为 MASTER,其它两个节点设置为 BACKUP
```bash
配置说明:
virtual_ipaddress: vip
track_script: 执行上面定义好的检测的script
interface: 节点固有IP(非VIP)的网卡,用来发VRRP包。
virtual_router_id: 取值在0-255之间,用来区分多个instance的VRRP组播
advert_int: 发VRRP包的时间间隔,即多久进行一次master选举(可以认为是健康查检时间间隔)。
authentication: 认证区域,认证类型有PASS和HA(IPSEC),推荐使用PASS(密码只识别前8位)。
state: 可以是MASTER或BACKUP,不过当其他节点keepalived启动时会将priority比较大的节点选举为MASTER,因此该项其实没有实质用途。
priority: 用来选举master的,要成为master,那么这个选项的值最好高于其他机器50个点,该项取值范围是1-255(在此范围之外会被识别成默认值100)。
# 1、注意防火墙需要放开vrrp协议(不然会出现脑裂现象,三台主机都存在VIP的情况)
#-A INPUT -p vrrp -j ACCEPT
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
#2、注意上面配置script "killall -0 haproxy" #根据进程名称检测进程是否存活,会在/var/log/messages每隔一秒执行检测的日志记录
# tail -100f /var/log/messages
Sep 27 10:54:16 tw19410s1 Keepalived_vrrp[9113]: /usr/bin/killall -0 haproxy exited with status 1
```
### 3、启动Keepalived
```bash
# 设置开机启动
systemctl enable keepalived
# 启动keepalived
systemctl start keepalived
# 查看启动状态
systemctl status keepalived
```
### 4、查看网络状态
kepplived 配置中 state 为 MASTER 的节点启动后,查看网络状态,可以看到虚拟IP已经加入到绑定的网卡中
```bash
[root@k8s-master-01 ~]# ip address show eth0
2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff
inet 192.168.56.11/24 brd 192.168.56.255 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.56.200/32 scope global eth0
valid_lft forever preferred_lft forever
当关掉当前节点的keeplived服务后将进行虚拟IP转移,将会推选state 为 BACKUP 的节点的某一节点为新的MASTER,可以在那台节点上查看网卡,将会查看到虚拟IP
```
# 四、安装haproxy
此处的haproxy为apiserver提供反向代理,haproxy将所有请求轮询转发到每个master节点上。相对于仅仅使用keepalived主备模式仅单个master节点承载流量,这种方式更加合理、健壮。
### 1、yum安装haproxy
```bash
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y haproxy
```
### 2、配置haproxy
```bash
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
mode tcp
bind *:16443
option tcplog
default_backend kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
mode tcp
balance roundrobin
server master01.k8s.io 192.168.56.11:6443 check
server master02.k8s.io 192.168.56.12:6443 check
server master03.k8s.io 192.168.56.13:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
bind *:1080
stats auth admin:awesomePassword
stats refresh 5s
stats realm HAProxy\ Statistics
stats uri /admin?stats
EOF
```
haproxy配置在其他master节点上(192.168.56.12和192.168.56.13)相同
### 3、启动并检测haproxy
```bash
# 设置开机启动
systemctl enable haproxy
# 开启haproxy
systemctl start haproxy
# 查看启动状态
systemctl status haproxy
```
### 4、检测haproxy端口
```bash
ss -lnt | grep -E "16443|1080"
```
# 五、安装Docker (所有节点)
### 1、移除之前安装过的Docker
```bash
sudo yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-ce-cli \
docker-engine
# 查看还有没有存在的docker组件
rpm -qa|grep docker
# 有则通过命令 yum -y remove XXX 来删除,比如:
yum remove docker-ce-cli
```
### 2、配置docker的yum源
下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源
- 阿里镜像源
```bash
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
```
- Docker官方镜像源
```bash
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
```
### 3、安装Docker:
```
# 显示docker-ce所有可安装版本:
yum list docker-ce --showduplicates | sort -r
# 安装指定docker版本
sudo yum install docker-ce-18.09.9-3.el7.x86_64 -y
# 启动docker并设置docker开机启动
systemctl enable docker
systemctl start docker
# 确认一下iptables
确认一下iptables filter表中FOWARD链的默认策略(pllicy)为ACCEPT。
iptables -nvL
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0
Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 1806,发现默认策略又改回了ACCEPT,这个不知道是从哪个版本改回的,因为我们线上版本使用的1706还是需要手动调整这个策略的。
# 执行下面命令
iptables -P FORWARD ACCEPT
# 修改docker的配置
vim /usr/lib/systemd/system/docker.service
# 增加下面命令(ExecReload后面新增ExecStartPost=...)
...
ExecReload=/bin/kill -s HUP $MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
...
# 配置docker加速器
cat > /etc/docker/daemon.json << \EOF
{
"registry-mirrors": [
"https://dockerhub.azk8s.cn",
"https://i37dz0y4.mirror.aliyuncs.com"
],
"insecure-registries": ["reg.hub.com"]
}
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
```
### 4、docker最终的服务文件
```
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
```
# 六、安装kubeadm、kubelet
### 1、配置可用的国内yum源用于安装:
```
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```
### 2、安装kubelet
```
# 需要在每台机器上都安装以下的软件包:
kubeadm: 用来初始化集群的指令。
kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。
kubectl: 用来与集群通信的命令行工具。
# 查看kubelet版本列表
yum list kubelet --showduplicates | sort -r
# 安装kubelet
yum install -y kubelet-1.15.3-0
# 启动kubelet并设置开机启动
systemctl enable kubelet
systemctl start kubelet
# 检查状态
检查状态,发现是failed状态,正常,kubelet会10秒重启一次,需等下面完成初始化master节点后即可正常
systemctl status kubelet
# 查看kubelet日志
journalctl -u kubelet --no-pager
```
### 3、安装kubeadm
```
# 负责初始化集群
# 1、查看kubeadm版本列表
yum list kubeadm --showduplicates | sort -r
# 2、安装kubeadm
yum install -y kubeadm-1.15.3-0
# 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl
# 3、重启服务器
为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作
reboot
```
# 七、初始化第一个kubernetes master节点
```
# 因为需要绑定虚拟IP,所以需要首先先查看虚拟IP启动这几台master机子哪台上
[root@k8s-master-01 ~]# ip address show eth0
2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff
inet 192.168.56.56/22 brd 10.19.3.255 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.56.200/32 scope global eth0
valid_lft forever preferred_lft forever
可以看到虚拟IP 192.168.56.200 和 服务器IP 192.168.56.11在一台机子上,所以初始化kubernetes第一个master要在master01机子上进行安装
```
### 1、创建kubeadm配置的yaml文件
```
# 1、创建kubeadm配置的yaml文件
rm -f ./kubeadm-config.yaml
export APISERVER_NAME=master.k8s.io
export POD_SUBNET=10.20.0.0/16
export SVC_SUBNET=10.96.0.0/16
cat > kubeadm-config.yaml << EOF
apiServer:
certSANs:
- k8s-master-01
- k8s-master-02
- k8s-master-03
- master.k8s.io
- 192.168.56.11
- 192.168.56.12
- 192.168.56.13
- 192.168.56.200
- 127.0.0.1
extraArgs:
authorization-mode: Node,RBAC
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "${APISERVER_NAME}:16443"
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.15.3
networking:
dnsDomain: cluster.local
podSubnet: "${POD_SUBNET}"
serviceSubnet: "${SVC_SUBNET}"
scheduler: {}
EOF
以下两个地方设置:
- certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上)
- controlPlaneEndpoint: 虚拟IP:监控端口号
配置说明:
imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库)
podSubnet: 10.20.0.1/16 (#pod地址池)
serviceSubnet: 10.96.0.1/16 (#service地址池)
```
### 2、初始化第一个master节点
```
kubeadm init --config=kubeadm-config.yaml --upload-certs #使用这个就不用做拷贝证书的操作
```
日志
```
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \
--discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 \
--control-plane --certificate-key 6054323448a1aeb661b78763262db5c30e12026c54341400d48401a853194ec2
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \
--discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56
```
### 执行结果中
用于初始化第二、三个 master 节点
```
kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \
--discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 \
--control-plane --certificate-key 6054323448a1aeb661b78763262db5c30e12026c54341400d48401a853194ec2
```
用于初始化 worker 节点
```
kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \
--discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56
```
### 3、配置kubectl环境变量
```bash
# 配置环境变量
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 4、查看组件状态
```bash
kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
# 查看pod状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动
coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动
etcd-k8s-master-01 1/1 Running 0 6m39s
kube-apiserver-k8s-master-01 1/1 Running 0 6m43s
kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s
kube-proxy-88s74 1/1 Running 0 7m32s
kube-scheduler-k8s-master-01 1/1 Running 0 6m45s
可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态
#检查ETCD服务
docker exec -it $(docker ps |grep etcd_etcd|awk '{print $1}') sh
etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key member list
etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health
```
# 八、安装网络插件
### 1、安装 calico 网络插件
```
# 安装 calico 网络插件
# 参考文档 https://docs.projectcalico.org/v3.8/getting-started/kubernetes/
export POD_SUBNET=10.20.0.0/16
rm -f calico.yaml
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml
kubectl apply -f calico.yaml
```
### 2、等待一会时间,再次查看各个pods的状态
```
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功
coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功
etcd-k8s-master-01 1/1 Running 0 11m
kube-apiserver-k8s-master-01 1/1 Running 0 12m
kube-controller-manager-k8s-master-01 1/1 Running 0 11m
kube-flannel-ds-amd64-7lj6m 1/1 Running 0 13s
kube-proxy-88s74 1/1 Running 0 12m
kube-scheduler-k8s-master-01 1/1 Running 0 12m
```
# 九、加入集群
### 1、Master加入集群构成高可用
```
复制秘钥到各个节点
在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03
如果其他节点为初始化第一个master节点,则将该节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。
```
- 复制文件到 master02
```
ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd
```
- 复制文件到 master03
```
ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd
```
- master节点加入集群
master02 和 master03 服务器上都执行加入集群操作
```bash
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane
```
如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行从“复制秘钥”和“加入集群”这两步
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```bash
# 显示安装过程:
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
```
- 配置kubectl环境变量
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 2、node节点加入集群
除了让master节点加入集群组成高可用外,slave节点也要加入集群中。
这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作
输入初始化k8s master时候提示的加入命令,如下:
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```
node节点加入,不需要加上 –experimental-control-plane 这个参数
### 3、如果忘记加入集群的token和sha256 (如正常则跳过)
- 显示获取token列表
```
kubeadm token list
```
默认情况下 Token 过期是时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token
```
kubeadm token create
```
- 获取ca证书sha256编码hash值
```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```
拼接命令
```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```
### 4、查看各个节点加入集群情况
```
kubectl get nodes -o wide
```
# 十、从集群中删除 Node
- Master节点:
```
kubectl drain --delete-local-data --force --ignore-daemonsets
kubectl delete node
```
- Slave节点:
```
kubeadm reset
```
## 初始化失败
```bash
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -rf /var/lib/etcd/*
```
参考资料:
http://www.mydlq.club/article/4/
https://kuboard.cn/install/install-kubernetes.html#%E5%88%9D%E5%A7%8B%E5%8C%96%E7%AC%AC%E4%B8%80%E4%B8%AAmaster%E8%8A%82%E7%82%B9
https://blog.51cto.com/fengwan/2426528?source=dra kubeadm搭建高可用kubernetes 1.15.1
https://segmentfault.com/a/1190000018741112?utm_source=tag-newest Kubernetes的几种主流部署方式02-kubeadm部署高可用集群
================================================
FILE: kubeadm/k8S-HA-V1.15.3-Flannel-开启防火墙版.md
================================================
# 环境介绍:
```bash
CentOS: 7.6
Docker: docker-ce-18.09.9
Kubernetes: 1.15.3
Kubeadm: 1.15.3
Kubelet: 1.15.3
Kubectl: 1.15.3
```
# 部署介绍:
创建高可用首先先有一个 Master 节点,然后再让其他服务器加入组成三个 Master 节点高可用,然后再将工作节点 Node 加入。下面将描述每个节点要执行的步骤:
```bash
Master01: 二、三、四、五、六、七、八、九、十一
Master02、Master03: 二、三、五、六、四、九
node01、node02、node03: 二、五、六、九
```
# 防火墙配置
```bash
1、防火墙策略
yum install iptables iptables-services -y
cat > /etc/sysconfig/iptables << \EOF
# Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP
# k8s 服务器公网和内网IP,VIP都加上
-A RH-Firewall-1-INPUT -s 192.168.56.200/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.14/32 -j ACCEPT
# keepalived
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
# serviceSubnet rules
-A RH-Firewall-1-INPUT -s 10.96.0.0/12 -j ACCEPT
# podSubnet rules
-A RH-Firewall-1-INPUT -s 10.244.0.0/16 -j ACCEPT
# port rules
-A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443 -j ACCEPT
#
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Thu Aug 1 01:26:09 2019
EOF
systemctl restart iptables.service
systemctl enable iptables.service
iptables -nvL
2、hosts.deny配置(注意需要注释掉)
sed -ri 's/.*all:all.*/#all:all/g' /etc/hosts.deny
cat /etc/hosts.deny
```
# 集群架构:

# 一、kuberadm 简介
### 1、Kuberadm 作用
Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令作为快速创建 kubernetes 集群的最佳实践。
kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。
相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。
### 2、Kuberadm 功能
```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```
### 3、功能版本
| Area |
Maturity Level |
| Command line UX |
GA |
| Implementation |
GA |
| Config file API |
beta |
| CoreDNS |
GA |
| kubeadm alpha subcommands |
alpha |
| High availability |
alpha |
| DynamicKubeletConfig |
alpha |
| Self-hosting |
alpha |
# 二、前期准备
### 1、虚拟机分配说明
| 地址 |
主机名 |
内存&CPU |
角色 |
| 10.199.1.200 |
- |
- |
vip |
| 10.199.1.136 |
k8s-master-01 |
2C & 2G |
master |
| 10.199.1.137 |
k8s-master-02 |
2C & 2G |
master |
| 10.199.1.138 |
k8s-master-03 |
2C & 2G |
master |
| 10.199.1.139 |
k8s-node-01 |
4C & 8G |
node |
| 10.199.1.140 |
k8s-node-02 |
4C & 8G |
node |
| 10.199.1.141 |
k8s-node-03 |
4C & 8G |
node |
### 2、各个节点端口占用
- Master 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
6443* |
Kubernetes API |
server All |
| TCP |
Inbound 入口 |
2379-2380 |
etcd server |
client API kube-apiserver, etcd |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
10251 |
kube-scheduler |
Self |
| TCP |
Inbound 入口 |
10252 |
kube-controller-manager |
Self |
- node 节点
| 规则 |
方向 |
端口范围 |
作用 |
使用者 |
| TCP |
Inbound 入口 |
10250 |
Kubernetes API |
Self, Control plane |
| TCP |
Inbound 入口 |
30000-32767 |
NodePort Services** |
All |
### 3、基础环境设置
Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步,主机名称解析,关闭防火墙等等。
1、主机名称解析
分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问人口,因此一般需要有专用的 DNS 服务负责解决各节点主机 不过,考虑到此处部署的是测试集群,因此为了降低系复杂度,这里将基于 hosts 的文件进行主机名称解析。
2、修改hosts和免key登录
```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.199.1.200 k8s-vip master master.k8s.io
10.199.1.136 k8s-master-01 master01 master01.k8s.io
10.199.1.137 k8s-master-02 master02 master02.k8s.io
10.199.1.138 k8s-master-03 master03 master03.k8s.io
10.199.1.139 k8s-node-01 node01 node01.k8s.io
EOF
#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```
3、修改hostname
```bash
#分别进入不同的服务器修改 hostname 名称
# 修改 10.199.1.136 服务器
hostnamectl set-hostname k8s-master-01
# 修改 10.199.1.137 服务器
hostnamectl set-hostname k8s-master-02
# 修改 10.199.1.138 服务器
hostnamectl set-hostname k8s-master-03
# 修改 10.199.1.139 服务器
hostnamectl set-hostname k8s-node-01
# 修改 10.199.1.140 服务器
hostnamectl set-hostname k8s-node-02
# 修改 10.199.1.141 服务器
hostnamectl set-hostname k8s-node-03
```
4、主机时间同步
```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
systemctl restart chronyd.service
systemctl enable chronyd.service
```
5、关闭防火墙服务
```bash
systemctl stop firewalld
systemctl disable firewalld
```
6、关闭并禁用SELinux
```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0
# 编辑/etc/sysconfig selinux 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# 查看selinux状态
getenforce
如果为permissive,则执行reboot重新启动即可
```
7、禁用 Swap 设备
kubeadm 默认会预先检当前主机是否禁用了 Swap 设备,并在未用时强制止部署 过程因此,在主机内存资惊充裕的条件下,需要禁用所有的 Swap 设备
```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
cat /etc/fstab
或
# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 / xfs defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```
8、设置系统参数
设置允许路由转发,不对bridge的数据进行处理
```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
#挂载br_netfilter
modprobe br_netfilter
#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf
#查看是否生成相关文件
ls /proc/sys/net/bridge
```
9、资源配置文件
`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用
```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```
10、安装依赖包以及相关工具
```bash
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```
# 三、安装Keepalived
- keepalived介绍: 是集群管理中保证集群高可用的一个服务软件,其功能类似于heartbeat,用来防止单点故障
- Keepalived作用: 为haproxy提供vip(192.168.56.200)在三个haproxy实例之间提供主备,降低当其中一个haproxy失效的时对服务的影响。
### 1、yum安装Keepalived
```bash
# 安装keepalived
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y keepalived
```
### 2、配置Keepalived
```bash
cat < /etc/keepalived/keepalived.conf
! Configuration File for keepalived
# 主要是配置故障发生时的通知对象以及机器标识。
global_defs {
# 标识本节点的字条串,通常为 hostname,但不一定非得是 hostname。故障发生时,邮件通知会用到。
router_id LVS_K8S
}
# 用来做健康检查的,当时检查失败时会将 vrrp_instance 的 priority 减少相应的值。
vrrp_script check_haproxy {
script "killall -0 haproxy" #根据进程名称检测进程是否存活
interval 3
weight -2
fall 10
rise 2
}
# rp_instance用来定义对外提供服务的 VIP 区域及其相关属性。
vrrp_instance VI_1 {
state MASTER #当前节点为MASTER,其他两个节点设置为 BACKUP
interface bond0 #改为自己的网卡
virtual_router_id 51
priority 200
advert_int 1
authentication {
auth_type PASS
auth_pass 35f18af7190d51c9f7f78f37300a0cbd
}
virtual_ipaddress {
10.199.1.200/22 #虚拟VIP,即VIP,注意掩码一定要写,不然会出现VIP端口,部分机器正常,部分机器异常问题
}
track_script {
check_haproxy
}
}
EOF
```
当前节点的配置中 state 配置为 MASTER,其它两个节点设置为 BACKUP
```bash
配置说明:
virtual_ipaddress: vip
track_script: 执行上面定义好的检测的script
interface: 节点固有IP(非VIP)的网卡,用来发VRRP包。
virtual_router_id: 取值在0-255之间,用来区分多个instance的VRRP组播
advert_int: 发VRRP包的时间间隔,即多久进行一次master选举(可以认为是健康查检时间间隔)。
authentication: 认证区域,认证类型有PASS和HA(IPSEC),推荐使用PASS(密码只识别前8位)。
state: 可以是MASTER或BACKUP,不过当其他节点keepalived启动时会将priority比较大的节点选举为MASTER,因此该项其实没有实质用途。
priority: 用来选举master的,要成为master,那么这个选项的值最好高于其他机器50个点,该项取值范围是1-255(在此范围之外会被识别成默认值100)。
# 1、注意防火墙需要放开vrrp协议(不然会出现脑裂现象,三台主机都存在VIP的情况)
#-A INPUT -p vrrp -j ACCEPT
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
# 2、注意上面配置script "killall -0 haproxy" #根据进程名称检测进程是否存活,会在/var/log/messages每隔一秒执行检测的日志记录
# tail -100f /var/log/messages
Sep 27 10:54:16 tw19410s1 Keepalived_vrrp[9113]: /usr/bin/killall -0 haproxy exited with status 1
# 3、“VRRP实例的绑定到IP”对于所使用的网卡需要合法
比如使用网卡“bond0”,该网卡的掩码为“255.255.255.0”,那么所使用的“VRRP实例的绑定到IP”的掩码也必须为“255.255.255.0”,即具有“xxx.xxx.xxx.xxx/24”的形式。
tcpdump -ani any vrrp | grep vrid
特别需要注意的是,同一网段中的virtual_router_id(vrid)的值不能重复,否则会干扰其他Keepalived集群的正常运行。
```
### 3、启动Keepalived
```bash
# 设置开机启动
systemctl enable keepalived
# 启动keepalived
systemctl restart keepalived
# 查看启动状态
systemctl status keepalived
```
### 4、查看网络状态
kepplived 配置中 state 为 MASTER 的节点启动后,查看网络状态,可以看到虚拟IP已经加入到绑定的网卡中
```bash
[root@k8s-master-01 ~]# ip address show bond0
6: bond0: mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 6c:92:bf:27:9e:ed brd ff:ff:ff:ff:ff:ff
inet 10.199.1.136/22 brd 10.19.3.255 scope global bond0
valid_lft forever preferred_lft forever
inet 10.199.1.200/32 scope global bond0
valid_lft forever preferred_lft forever
当关掉当前节点的keeplived服务后将进行虚拟IP转移,将会推选state 为 BACKUP 的节点的某一节点为新的MASTER,可以在那台节点上查看网卡,将会查看到虚拟IP
```
# 四、安装haproxy
此处的haproxy为apiserver提供反向代理,haproxy将所有请求轮询转发到每个master节点上。相对于仅仅使用keepalived主备模式仅单个master节点承载流量,这种方式更加合理、健壮。
### 1、yum安装haproxy
```bash
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y haproxy
```
### 2、配置haproxy
```bash
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
mode tcp
bind *:16443
option tcplog
default_backend kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
mode tcp
balance roundrobin
server master01.k8s.io 10.199.1.136:6443 check
server master02.k8s.io 10.199.1.137:6443 check
server master03.k8s.io 10.199.1.138:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
bind *:1080
stats auth admin:awesomePassword
stats refresh 5s
stats realm HAProxy\ Statistics
stats uri /admin?stats
EOF
```
haproxy配置在其他master节点上(10.199.1.137和10.199.1.138)相同
### 3、启动并检测haproxy
```bash
# 设置开机启动
systemctl enable haproxy
# 开启haproxy
systemctl restart haproxy
# 查看启动状态
systemctl status haproxy
```
### 4、检测haproxy端口
```bash
ss -lnt | grep -E "16443|1080"
nc -zv master.k8s.io 16443
nc -zv master.k8s.io 1080
```
# 五、安装Docker (所有节点)
### 1、移除之前安装过的Docker
```bash
sudo yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-ce-cli \
docker-engine
# 查看还有没有存在的docker组件
rpm -qa|grep docker
# 有则通过命令 yum -y remove XXX 来删除,比如:
yum remove docker-ce-cli
```
### 2、配置docker的yum源
下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源
- 阿里镜像源
```bash
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
```
- Docker官方镜像源
```bash
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
```
### 3、安装Docker:
```
# 显示docker-ce所有可安装版本:
yum list docker-ce --showduplicates | sort -r
# 安装指定docker版本
sudo yum install docker-ce-18.09.9-3.el7.x86_64 -y
# 启动docker并设置docker开机启动
systemctl enable docker
systemctl start docker
# 确认一下iptables
确认一下iptables filter表中FOWARD链的默认策略(pllicy)为ACCEPT。
iptables -nvL
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0
Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 1806,发现默认策略又改回了ACCEPT,这个不知道是从哪个版本改回的,因为我们线上版本使用的1706还是需要手动调整这个策略的。
# 执行下面命令
iptables -P FORWARD ACCEPT
# 修改docker的配置
vim /usr/lib/systemd/system/docker.service
# 增加下面命令(ExecReload后面新增ExecStartPost=...)
...
ExecReload=/bin/kill -s HUP $MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
...
# 配置docker加速器
cat > /etc/docker/daemon.json << \EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors" : [
"https://ot2k4d59.mirror.aliyuncs.com/"
]
}
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
docker info|grep -i cgroup
```
### 4、docker最终的服务文件
```
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
# 重启Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
```
# 六、安装kubeadm、kubelet
### 1、配置可用的国内yum源用于安装:
```
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```
### 2、安装kubelet
```
# 需要在每台机器上都安装以下的软件包:
kubeadm: 用来初始化集群的指令。
kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。
kubectl: 用来与集群通信的命令行工具。
# 查看kubelet版本列表
yum list kubelet --showduplicates | sort -r
# 安装kubelet
yum install -y kubelet-1.15.3-0
# 启动kubelet并设置开机启动
systemctl enable kubelet
systemctl start kubelet
# 检查状态
检查状态,发现是failed状态,正常,kubelet会10秒重启一次,需等下面完成初始化master节点后即可正常
systemctl status kubelet
# 查看kubelet日志
journalctl -u kubelet --no-pager
```
### 3、安装kubeadm
```
# 负责初始化集群
# 1、查看kubeadm版本列表
yum list kubeadm --showduplicates | sort -r
# 2、安装kubeadm
yum install -y kubeadm-1.15.3-0
# 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl
# 3、重启服务器
为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作
reboot
```
# 七、初始化第一个kubernetes master节点
```
# 因为需要绑定虚拟IP,所以需要首先先查看虚拟IP启动这几台master机子哪台上
[root@k8s-master-01 ~]# ip address show bond0
6: bond0: mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 6c:92:bf:27:9e:ed brd ff:ff:ff:ff:ff:ff
inet 10.199.1.136/22 brd 10.19.3.255 scope global bond0
valid_lft forever preferred_lft forever
inet 10.199.1.200/32 scope global bond0
valid_lft forever preferred_lft forever
7: bond0.101@bond0: mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 6c:92:bf:27:9e:ed brd ff:ff:ff:ff:ff:ff
inet 16.201.26.36/24 brd 16.201.26.255 scope global bond0.101
valid_lft forever preferred_lft forever
可以看到虚拟IP 10.199.1.200 和 服务器IP 10.199.1.136 在一台机子上,所以初始化kubernetes第一个master要在master01机子上进行安装
```
### 1、创建kubeadm配置的yaml文件
```
# 1、创建kubeadm配置的yaml文件
rm -f ./kubeadm-config.yaml
export MASTER_NODE1=10.199.1.136
export APISERVER_NAME=master.k8s.io
export POD_SUBNET=10.244.0.0/16
export SVC_SUBNET=10.96.0.0/12
cat < ./kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: ${MASTER_NODE1} #这里填写第一个初始化的master的ip
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: k8s-master-01 #注意这里需要调整为自己的节点
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: kubernetes
kubernetesVersion: v1.15.3
certificatesDir: /etc/kubernetes/pki
controllerManager: {}
controlPlaneEndpoint: "${APISERVER_NAME}:16443" # 这里写vip的地址或域名加上端口
imageRepository: registry.aliyuncs.com/google_containers # 使用阿里云镜像
apiServer:
timeoutForControlPlane: 4m0s
certSANs:
- k8s-master-01
- k8s-master-02
- k8s-master-03
- master.k8s.io
- 10.199.1.200
- 10.199.1.136
- 10.199.1.137
- 10.199.1.138
- 127.0.0.1
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
networking:
dnsDomain: cluster.local
podSubnet: ${POD_SUBNET}
serviceSubnet: ${SVC_SUBNET}
scheduler: {}
---
# 开启 IPVS 模式
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy 模式
EOF
kubeadm init --config=kubeadm-config.yaml --upload-certs
以下两个地方设置:
- certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上)
- controlPlaneEndpoint: 虚拟IP:监控端口号
配置说明:
imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库)
podSubnet: 10.244.0.0/16 (#pod地址池)
serviceSubnet: 10.96.0.0/12 (#service地址池)
```
### 2、初始化第一个master节点
```
kubeadm init --config=kubeadm-config.yaml --upload-certs #使用这个就不用做拷贝证书的操作
kubeadm init --config kubeadm-config.yaml #使用这个还需要手动做拷贝证书的操作
#验证下端口是否通
nc -zv master.k8s.io 6443
nc -zv master.k8s.io 16443
```
日志
```
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
--control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5
```
### 执行结果中
用于初始化第二、三个 master 节点
```
#初始化第二个master节点
export MASTER_NODE2=10.199.1.137
kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE2} --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
--control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523
#初始化第三个master节点
export MASTER_NODE3=10.199.1.138
kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE3} --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
--control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523
```
用于初始化 worker 节点
```
kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5
```
### 3、配置kubectl环境变量
```bash
# 配置环境变量
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 4、查看组件状态
```bash
kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
# 查看pod状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动
coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动
etcd-k8s-master-01 1/1 Running 0 6m39s
kube-apiserver-k8s-master-01 1/1 Running 0 6m43s
kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s
kube-proxy-88s74 1/1 Running 0 7m32s
kube-scheduler-k8s-master-01 1/1 Running 0 6m45s
可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态
#检查ETCD服务
docker exec -it $(docker ps |grep etcd_etcd|awk '{print $1}') sh
etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key member list
etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health
```
# 八、安装网络插件
### 1、安装 calico 网络插件
```
# 安装 calico 网络插件
# 参考文档 https://docs.projectcalico.org/v3.8/getting-started/kubernetes/
export POD_SUBNET=10.244.0.0/16
rm -f calico.yaml
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml
kubectl apply -f calico.yaml
```
### 2、安装 flannel 网络插件
```bash
export POD_SUBNET=10.244.0.0/16
cat > kube-flannel.yaml << EOF
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "${POD_SUBNET}",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: kube-flannel-ds-amd64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
template:
metadata:
labels:
tier: node
app: flannel
spec:
hostNetwork: true
nodeSelector:
beta.kubernetes.io/arch: amd64
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=bond0
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: true
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
EOF
“Network”: “10.244.0.0/16”要和kubeadm-config.yaml配置文件中podSubnet: 10.244.0.0/16相同
```
### 2、创建flanner相关role和pod
```
# 应用生效
[root@k8s-master-01 ~]# kubectl apply -f kube-flannel.yaml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
# 等待一会时间,再次查看各个pods的状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功
coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功
etcd-k8s-master-01 1/1 Running 0 11m
kube-apiserver-k8s-master-01 1/1 Running 0 12m
kube-controller-manager-k8s-master-01 1/1 Running 0 11m
kube-flannel-ds-amd64-7lj6m 1/1 Running 0 13s
kube-proxy-88s74 1/1 Running 0 12m
kube-scheduler-k8s-master-01 1/1 Running 0 12m
# 加入更换了网络插件,需要把coredns的pod重新创建,不然网络coredns的pod网络不通
# 查看
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system
#删除coredns
kubectl delete deployment coredns -n kube-system
kubectl delete svc kube-dns -n kube-system
kubectl delete cm coredns -n kube-system
#重新部署coredns
rm -f coredns.yaml.sed deploy.sh coredns.yml
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
chmod +x deploy.sh
./deploy.sh -i 10.96.0.10 > coredns.yml #这里从--service-cidr=10.96.0.0/16中选用10.96.0.10作为coredns地址
kubectl apply -f coredns.yml
```
# 九、加入集群
### 1、Master加入集群构成高可用
```
复制秘钥到各个节点
在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03
如果其他节点为初始化第一个master节点,则将该节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。
```
- 复制文件到 master02
```
ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd
```
- 复制文件到 master03
```
ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd
```
- master节点加入集群
master02 和 master03 服务器上都执行加入集群操作
```bash
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane
```
如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行从“复制秘钥”和“加入集群”这两步
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```bash
# 显示安装过程:
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
```
- 配置kubectl环境变量
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
### 2、node节点加入集群
除了让master节点加入集群组成高可用外,slave节点也要加入集群中。
这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作
输入初始化k8s master时候提示的加入命令,如下:
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```
node节点加入,不需要加上 –experimental-control-plane 这个参数
### 3、如果忘记加入集群的token和sha256 (如正常则跳过)
- 显示获取token列表
```
kubeadm token list
```
默认情况下 Token 过期是时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token
```
kubeadm token create
```
- 获取ca证书sha256编码hash值
```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```
拼接命令
```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a
如果是master加入,请在最后面加上 –experimental-control-plane 这个参数
```
### 4、查看各个节点加入集群情况
```
kubectl get nodes -o wide
```
# 十、从集群中删除 Node
- Master节点:
```
kubectl drain --delete-local-data --force --ignore-daemonsets
kubectl delete node
```
- Slave节点:
```
kubeadm reset
```
## 初始化失败
```bash
yes | kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f $HOME/.kube/config
systemctl restart docker
systemctl restart kubelet
systemctl status kubelet
journalctl -f -u kubelet
```
## 问题汇总:
1、多网卡监听问题
```
k8s master组件在多网卡环境下,会监听到服务器外网IP问题
#注意--hostname-override的值写kubectl get nodes显示的结果
#修改kubelet启动参数
cat > /etc/sysconfig/kubelet <<\EOF
KUBELET_EXTRA_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --hostname-override=k8s-master-01 --node-ip=10.199.1.136
EOF
#重启kubelet服务
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet
#查看kubelet日志
journalctl -f -u kubelet
https://blog.csdn.net/qianghaohao/article/details/98588427 kubeadm + vagrant 部署多节点 k8s 的一个坑(多网卡问题)
https://github.com/kubernetes/kubernetes/issues/33618
https://kubernetes.io/zh/docs/setup/independent/install-kubeadm/ kubeadm init 和 kubeadm join 用于为 kubelet 获取 额外的用户参数。
#解决方案
@danielschonfeld The kubelet flag you should set is --hostname-override
```
参考资料:
https://github.com/kubernetes/kubernetes/issues/33618 Issue when using kubeadm with multiple network interfaces #33618
http://www.mydlq.club/article/4/
https://kuboard.cn/install/install-kubernetes.html#%E5%88%9D%E5%A7%8B%E5%8C%96%E7%AC%AC%E4%B8%80%E4%B8%AAmaster%E8%8A%82%E7%82%B9
https://blog.51cto.com/fengwan/2426528?source=dra kubeadm搭建高可用kubernetes 1.15.1
https://segmentfault.com/a/1190000018741112?utm_source=tag-newest Kubernetes的几种主流部署方式02-kubeadm部署高可用集群
https://www.cnblogs.com/hongdada/p/9771857.html Docker中的Cgroup Driver:Cgroupfs 与 Systemd
https://juejin.im/entry/5b0aa39551882538be0d2e21 centos7使用kubeadm配置高可用集群(多master 多网卡,需主动修改组件信息)
================================================
FILE: kubeadm/k8s清理.md
================================================
# 一、清理资源
```
systemctl stop kubelet
systemctl stop docker
kubeadm reset
#yum remove -y kubelet kubeadm kubectl --disableexcludes=kubernetes
rm -rf /etc/kubernetes/
rm -rf /root/.kube/
rm -rf $HOME/.kube/
rm -rf /var/lib/etcd/
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/
rm -rf /etc/cni/
rm -rf /opt/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
ip link delete cni0
ip link delete flannel.1
#docker rmi -f $(docker images -q)
#docker rm -f `docker ps -a -q`
#yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
kubeadm version
systemctl restart kubelet.service
systemctl enable kubelet.service
```
# 二、重新初始化
```
swapoff -a
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g' |sh -x
docker images |grep google_containers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#registry.cn-hangzhou.aliyuncs.com/google_containers#k8s.gcr.io#2' |sh -x
docker images |grep google_containers |awk '{print "docker rmi ", $1":"$2}' |sh -x
docker pull coredns/coredns:1.3.1
docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
docker rmi coredns/coredns:1.3.1
kubeadm init --kubernetes-version=v1.15.3 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.56.11 --apiserver-bind-port=6443
#获取加入集群的指令
kubeadm token create --print-join-command
```
# 三、Node操作
```
mkdir -p $HOME/.kube
```
# 四、Master操作
```
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
scp $HOME/.kube/config root@linux-node2:$HOME/.kube/config
scp $HOME/.kube/config root@linux-node3:$HOME/.kube/config
scp $HOME/.kube/config root@linux-node4:$HOME/.kube/config
```
# 五、Master和Node节点
```
chown $(id -u):$(id -g) $HOME/.kube/config
```
参考资料:
https://blog.51cto.com/wutengfei/2121202 kubernetes中网络报错问题
================================================
FILE: kubeadm/kubeadm.yaml
================================================
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.56.11
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: linux-node1.example.com
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
networking:
dnsDomain: cluster.local
podSubnet: 172.168.0.0/16
serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy 模式
================================================
FILE: kubeadm/kubeadm初始化k8s集群延长证书过期时间.md
================================================
# 一、前言
kubeadm初始化k8s集群,签发的CA证书有效期默认是10年,签发的apiserver证书有效期默认是1年,到期之后请求apiserver会报错,使用openssl命令查询相关证书是否到期。
以下延长证书过期的方法适合kubernetes1.14、1.15、1.16、1.17、1.18版本
# 二、查看证书有效时间
```bash
openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -text |grep Not
显示如下,通过下面可看到ca证书有效期是10年,从2020到2030年:
Not Before: Apr 22 04:09:07 2020 GMT
Not After : Apr 20 04:09:07 2030 GMT
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not
显示如下,通过下面可看到apiserver证书有效期是1年,从2020到2021年:
Not Before: Apr 22 04:09:07 2020 GMT
Not After : Apr 22 04:09:07 2021 GMT
```
# 三、延长证书过期时间
```bash
1.把update-kubeadm-cert.sh文件上传到master1、master2、master3节点
update-kubeadm-cert.sh文件所在的github地址如下:
https://github.com/luckylucky421/kubernetes1.17.3
把update-kubeadm-cert.sh文件clone和下载下来,拷贝到master1,master2,master3节点上
2.在每个节点都执行如下命令
1)给update-kubeadm-cert.sh证书授权可执行权限
chmod +x update-kubeadm-cert.sh
2)执行下面命令,修改证书过期时间,把时间延长到10年
./update-kubeadm-cert.sh all
3)在master1节点查询Pod是否正常,能查询出数据说明证书签发完成
kubectl get pods -n kube-system
显示如下,能够看到pod信息,说明证书签发正常:
......
calico-node-b5ks5 1/1 Running 0 157m
calico-node-r6bfr 1/1 Running 0 155m
calico-node-r8qzv 1/1 Running 0 7h1m
coredns-66bff467f8-5vk2q 1/1 Running 0 7h30m
......
```
# 四、验证证书有效时间是否延长到10年
```bash
openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -text |grep Not
显示如下,通过下面可看到ca证书有效期是10年,从2020到2030年:
Not Before: Apr 22 04:09:07 2020 GMT
Not After : Apr 20 04:09:07 2030 GMT
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not
显示如下,通过下面可看到apiserver证书有效期是10年,从2020到2030年:
Not Before: Apr 22 11:15:53 2020 GMT
Not After : Apr 20 11:15:53 2030 GMT
openssl x509 -in /etc/kubernetes/pki/apiserver-etcd-client.crt -noout -text |grep Not
显示如下,通过下面可看到etcd证书有效期是10年,从2020到2030年:
Not Before: Apr 22 11:32:24 2020 GMT
Not After : Apr 20 11:32:24 2030 GMT
openssl x509 -in /etc/kubernetes/pki/front-proxy-ca.crt -noout -text |grep Not
显示如下,通过下面可看到fron-proxy证书有效期是10年,从2020到2030年:
Not Before: Apr 22 04:09:08 2020 GMT
Not After : Apr 20 04:09:08 2030 GMT
```
参考资料:
https://mp.weixin.qq.com/s/N7WRT0OkyJHec35BH_X1Hg kubeadm初始化k8s集群延长证书过期时间
================================================
FILE: kubeadm/kubeadm无法下载镜像问题.md
================================================
0、kubeadm镜像介绍
```
kubeadm 是kubernetes 的集群安装工具,能够快速安装kubernetes 集群。
kubeadm init 命令默认使用的docker镜像仓库为k8s.gcr.io,国内无法直接访问,于是需要变通一下。
```
1、首先查看需要使用哪些镜像
```
kubeadm config images list
#输出如下结果
k8s.gcr.io/kube-apiserver:v1.15.3
k8s.gcr.io/kube-controller-manager:v1.15.3
k8s.gcr.io/kube-scheduler:v1.15.3
k8s.gcr.io/kube-proxy:v1.15.3
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1
我们通过 docker.io/mirrorgooglecontainers 中转一下
```
2、批量下载及转换标签
```
#docker.io/mirrorgooglecontainers中转镜像
kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#docker.io/mirrorgooglecontainers#g' |sh -x
docker images |grep mirrorgooglecontainers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#mirrorgooglecontainers#k8s.gcr.io#2' |sh -x
docker images |grep mirrorgooglecontainers |awk '{print "docker rmi ", $1":"$2}' |sh -x
docker pull coredns/coredns:1.3.1
docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
docker rmi coredns/coredns:1.3.1
注:coredns没包含在docker.io/mirrorgooglecontainers中,需要手工从coredns官方镜像转换下。
#阿里云的中转镜像
kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g' |sh -x
docker images |grep google_containers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#registry.cn-hangzhou.aliyuncs.com/google_containers#k8s.gcr.io#2' |sh -x
docker images |grep google_containers |awk '{print "docker rmi ", $1":"$2}' |sh -x
docker pull coredns/coredns:1.3.1
docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1
docker rmi coredns/coredns:1.3.1
```
3、查看镜像列表
```
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.15.3 232b5c793146 2 weeks ago 82.4MB
k8s.gcr.io/kube-scheduler v1.15.3 703f9c69a5d5 2 weeks ago 81.1MB
k8s.gcr.io/kube-controller-manager v1.15.3 e77c31de5547 2 weeks ago 159MB
k8s.gcr.io/coredns 1.3.1 eb516548c180 7 months ago 40.3MB
k8s.gcr.io/etcd 3.3.10 2c4adeb21b4f 9 months ago 258MB
k8s.gcr.io/pause 3.1 da86e6ba6ca1 20 months ago 742kB
docker rmi -f $(docker images -q)
docker rm -f `docker ps -a -q`
```
参考文档:
https://cloud.tencent.com/info/6db42438f5dd7842bcecb6baf61833aa.html kubeadm 无法下载镜像问题
https://juejin.im/post/5b8a4536e51d4538c545645c 使用kubeadm 部署 Kubernetes(国内环境)
================================================
FILE: manual/README.md
================================================
# 内核升级
```
# 载入公钥
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# 安装ELRepo
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# 载入elrepo-kernel元数据
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
# 查看可用的rpm包
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
# 安装长期支持版本的kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
# 删除旧版本工具包
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y
# 安装新版本工具包
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64
#查看默认启动顺序
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.183-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.10.1.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-c52097a1078c403da03b8eddeac5080b) 7 (Core)
#默认启动的顺序是从0开始,新内核是从头插入(目前位置在0,而4.4.4的是在1),所以需要选择0。
grub2-set-default 0
#重启并检查
reboot
```
参考资料
https://github.com/easzlab/kubeasz/blob/master/docs/guide/kernel_upgrade.md
================================================
FILE: manual/v1.14/README.md
================================================
================================================
FILE: manual/v1.15.3/README.md
================================================
# 一、Kubernetes 1.15 二进制集群安装
本系列文档将介绍如何使用二进制部署Kubernetes v1.15.3 集群的所有部署,而不是使用自动化部署(kubeadm)集群。在部署过程中,将详细列出各个组件启动参数,以及相关配置说明。在学习完本文档后,将理解k8s各个组件的交互原理,并且可以快速解决实际问题。
## 1.1、组件版本
```
Kubernetes 1.15.3
Docker 18.09 (docker使用官方的脚本安装,后期可能升级为新的版本,但是不影响)
Etcd 3.3.13
Flanneld 0.11.0
```
## 1.2、组件说明
### kube-apiserver
```
使用节点本地Nginx 4层透明代理实现高可用 (也可以使用haproxy,只是起到代理apiserver的作用)
关闭非安全端口8080和匿名访问
使用安全端口6443接受https请求
严格的认知和授权策略 (x509、token、rbac)
开启bootstrap token认证,支持kubelet TLS bootstrapping;
使用https访问kubelet、etcd
```
### kube-controller-manager
```
3节点高可用 (在k8s中,有些组件需要选举,所以使用奇数为集群高可用方案)
关闭非安全端口,使用10252接受https请求
使用kubeconfig访问apiserver的安全扣
使用approve kubelet证书签名请求(CSR),证书过期后自动轮转
各controller使用自己的ServiceAccount访问apiserver
```
### kube-scheduler
```
3节点高可用;
使用kubeconfig访问apiserver安全端口
```
### kubelet
```
使用kubeadm动态创建bootstrap token
使用TLS bootstrap机制自动生成client和server证书,过期后自动轮转
在kubeletConfiguration类型的JSON文件配置主要参数
关闭只读端口,在安全端口10250接受https请求,对请求进行认真和授权,拒绝匿名访问和非授权访问
使用kubeconfig访问apiserver的安全端口
```
### kube-proxy
```
使用kubeconfig访问apiserver的安全端口
在KubeProxyConfiguration类型JSON文件配置为主要参数
使用ipvs代理模式
```
### 集群插件
```
DNS 使用功能、性能更好的coredns
网络 使用Flanneld 作为集群网络插件
```
# 二、初始化环境
## 1.1、集群机器
```
#master节点
192.168.0.50 k8s-01
192.168.0.51 k8s-02
192.168.0.52 k8s-03
#node节点
192.168.0.53 k8s-04 #node节点只运行node,但是设置证书的时候要添加这个ip
```
本文档的所有etcd集群、master集群、worker节点均使用以上三台机器,并且初始化步骤需要在所有机器上执行命令。如果没有特殊命令,所有操作均在192.168.0.50上进行操作
node节点后面会有操作,但是在初始化这步,是所有集群机器。包括node节点,我上面没有列出node节点
## 1.2、修改主机名
所有机器设置永久主机名
```
hostnamectl set-hostname abcdocker-k8s01 #所有机器按照要求修改
bash #刷新主机名
```
接下来我们需要在所有机器上添加hosts解析
```
cat >> /etc/hosts <>/etc/profile
[root@abcdocker-k8s01 ~]# source /etc/profile
[root@abcdocker-k8s01 ~]# env|grep PATH
PATH=/opt/k8s/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
```
## 1.4、安装依赖包
在每台服务器上安装依赖包
```
yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget
```
关闭防火墙 Linux 以及swap分区
```
systemctl stop firewalld
systemctl disable firewalld
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
iptables -P FORWARD ACCEPT
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
#如果开启了swap分区,kubelet会启动失败(可以通过设置参数——-fail-swap-on设置为false)
```
升级内核
```
```
参考资料
https://i4t.com/4253.html Kubernetes 1.14 二进制集群安装
https://github.com/kubernetes/kubernetes/releases/tag/v1.15.3 下载链接
================================================
FILE: mysql/README.md
================================================
================================================
FILE: mysql/kubernetes访问外部mysql服务.md
================================================
Table of Contents
=================
* [Table of Contents](#table-of-contents)
* [一、创建endpoints](#一创建endpoints)
* [二、创建service](#二创建service)
* [三、文件合并](#三文件合并)
* [四、安装centos7基础镜像](#四安装centos7基础镜像)
* [五、测试数据库连接](#五测试数据库连接)
`k8s访问集群外独立的服务最好的方式是采用Endpoint方式(可以看作是将k8s集群之外的服务抽象为内部服务),以mysql服务为例`
# 一、创建endpoints
(带注释的操作,建议分步操作,被这个坑了很久,或者可以直接使用合并文件一步执行)
```bash
# 删除 mysql-endpoints
kubectl delete -f mysql-endpoints.yaml -n mos-namespace
# 创建 mysql-endpoints.yaml
cat >mysql-endpoints.yaml<<\EOF
apiVersion: v1
kind: Endpoints
metadata:
name: mysql-production
subsets:
- addresses:
- ip: 10.198.1.155 #-注意目前服务器的数据库需要放开权限
ports:
- port: 3306
protocol: TCP
EOF
# 创建 mysql-endpoints
kubectl apply -f mysql-endpoints.yaml -n mos-namespace
# 查看 mysql-endpoints
kubectl get endpoints mysql-production -n mos-namespace
# 查看 mysql-endpoints详情
kubectl describe endpoints mysql-production -n mos-namespace
# 探测服务是否可达
nc -zv 10.198.1.155 3306
```
# 二、创建service
```bash
# 删除 mysql-service
kubectl delete -f mysql-service.yaml -n mos-namespace
# 编写 mysql-service.yaml
cat >mysql-service.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
name: mysql-production
spec:
ports:
- port: 3306
protocol: TCP
EOF
# 创建 mysql-service
kubectl apply -f mysql-service.yaml -n mos-namespace
# 查看 mysql-service
kubectl get svc mysql-production -n mos-namespace
# 查看 mysql-service详情
kubectl describe svc mysql-production -n mos-namespace
# 验证service ip的连通性
nc -zv `kubectl get svc mysql-production -n mos-namespace|grep mysql-production|awk '{print $3}'` 3306
```
# 三、文件合并
`注意点: Endpoints类型,可以打标签,但是Service不可以通过标签来选择,直接不写selector: name: mysql-endpoints 不然会出现异常,找不到endpoints节点`
```
cat << EOF > mysql-service-new.yaml
apiVersion: v1
kind: Service
metadata:
name: mysql-production
spec:
#selector: ---注意这里用标签选择,直接取消
# name: mysql-endpoints
ports:
- port: 3306
protocol: TCP
EOF
```
完整文件
```bash
kubectl delete -f mysql-endpoints-new.yaml -n mos-namespace
kubectl delete -f mysql-service-new.yaml -n mos-namespace
cat << EOF > mysql-endpoints-new.yaml
apiVersion: v1
kind: Endpoints
metadata:
name: mysql-production
labels:
name: mysql-endpoints
subsets:
- addresses:
- ip: 10.198.1.155
ports:
- port: 3306
protocol: TCP
EOF
cat << EOF > mysql-service-new.yaml
apiVersion: v1
kind: Service
metadata:
name: mysql-production
spec:
ports:
- port: 3306
protocol: TCP
EOF
kubectl apply -f mysql-endpoints-new.yaml -n mos-namespace
kubectl apply -f mysql-service-new.yaml -n mos-namespace
nc -zv `kubectl get svc mysql-production -n mos-namespace|grep mysql-production|awk '{print $3}'` 3306
```
# 四、安装centos7基础镜像
```bash
# 查看 mos-namespace 下的pod资源
kubectl get pods -n mos-namespace
# 清理命令行创建的deployment
kubectl delete deployment centos7-app -n mos-namespace
# 命令行跑一个centos7的bash基础容器
#kubectl run --rm --image=centos:7.2.1511 centos7-app -it --port=8080 --replicas=1 -n mos-namespace
kubectl run --image=centos:7.2.1511 centos7-app -it --port=8080 --replicas=1 -n mos-namespace
# 安装mysql客户端
yum install vim net-tools telnet nc -y
yum install -y mariadb.x86_64 mariadb-libs.x86_64
```
# 五、测试数据库连接
```bash
# 进入到容器
kubectl exec `kubectl get pods -n mos-namespace|grep centos7-app|awk '{print $1}'` -it /bin/bash -n mos-namespace
# 检查网络连通性
ping mysql-production
# 测试mysql服务端口是否OK
nc -zv mysql-production 3306
# 连接测试
mysql -h'mysql-production' -u'root' -p'password'
```
参考资料:
https://blog.csdn.net/hxpjava1/article/details/80040407 使用kubernetes访问外部服务mysql/redis
https://blog.csdn.net/liyingke112/article/details/76204038
https://blog.csdn.net/ybt_c_index/article/details/80881157 istio 0.8 用ServiceEntry访问外部服务(如RDS)
================================================
FILE: redis/K8s上Redis集群动态扩容.md
================================================
参考资料:
http://redisdoc.com/topic/cluster-tutorial.html#id10 Redis 命令参考
https://cloud.tencent.com/developer/article/1392872
================================================
FILE: redis/K8s上运行Redis单实例.md
================================================
Table of Contents
=================
* [一、创建namespace](#一创建namespace)
* [二、创建一个 configmap](#二创建一个-configmap)
* [三、创建 redis 容器](#三创建-redis-容器)
* [四、创建redis-service服务](#四创建redis-service服务)
* [五、验证redis实例](#五验证redis实例)
# 一、创建namespace
```bash
# 清理 namespace
kubectl delete -f mos-namespace.yaml
# 创建一个专用的 namespace
cat > mos-namespace.yaml <<\EOF
---
apiVersion: v1
kind: Namespace
metadata:
name: mos-namespace
EOF
kubectl apply -f mos-namespace.yaml
# 查看 namespace
kubectl get namespace -A
```
# 二、创建一个 configmap
```bash
mkdir config && cd config
# 清理configmap
kubectl delete configmap redis-conf -n mos-namespace
# 创建redis配置文件
cat >redis.conf <<\EOF
#daemonize yes
pidfile /data/redis.pid
port 6379
tcp-backlog 30000
timeout 0
tcp-keepalive 10
loglevel notice
logfile /data/redis.log
databases 16
#save 900 1
#save 300 10
#save 60 10000
stop-writes-on-bgsave-error no
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /data
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
requirepass redispassword
maxclients 30000
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events KEA
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 1000
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
EOF
# 在mos-namespace中创建 configmap
kubectl create configmap redis-conf --from-file=redis.conf -n mos-namespace
```
# 三、创建 redis 容器
```bash
# 清理pod
kubectl delete -f mos_redis.yaml
cat > mos_redis.yaml <<\EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: mos-redis
namespace: mos-namespace
spec:
selector:
matchLabels:
name: mos-redis
replicas: 1
template:
metadata:
labels:
name: mos-redis
spec:
containers:
- name: mos-redis
image: redis
volumeMounts:
- name: mos
mountPath: "/usr/local/etc"
command:
- "redis-server"
args:
- "/usr/local/etc/redis/redis.conf"
volumes:
- name: mos
configMap:
name: redis-conf
items:
- key: redis.conf
path: redis/redis.conf
EOF
# 创建和查看 pod
kubectl apply -f mos_redis.yaml
kubectl get pods -n mos-namespace
# 注意:configMap 会挂在 /usr/local/etc/redis/redis.conf 上。与 mountPath 和 configMap 下的 path 一同指定
```
# 四、创建redis-service服务
```bash
# 删除service
kubectl delete -f redis-service.yaml -n mos-namespace
# 编写redis-service.yaml
cat >redis-service.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
name: redis-production
namespace: mos-namespace
spec:
selector:
name: mos-redis
ports:
- port: 6379
protocol: TCP
EOF
# 创建service
kubectl apply -f redis-service.yaml -n mos-namespace
# 查看service
kubectl get svc redis-production -n mos-namespace
# 查看service详情
kubectl describe svc redis-production -n mos-namespace
```
# 五、验证redis实例
1、普通方式验证
```bash
# 进入到容器
kubectl exec -it `kubectl get pods -n mos-namespace|grep redis|awk '{print $1}'` /bin/bash -n mos-namespace
redis-cli -h 127.0.0.1 -a redispassword
# 127.0.0.1:6379> set a b
# 127.0.0.1:6379> get a
"b"
# 查看日志(因为配置文件中有配置日志写到容器里的/data/redis.log文件)
kubectl exec -it `kubectl get pods -n mos-namespace|grep redis|awk '{print $1}'` /bin/bash -n mos-namespace
$ tail -100f /data/redis.log
1:C 14 Nov 2019 06:46:13.476 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 14 Nov 2019 06:46:13.476 # Redis version=5.0.6, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 14 Nov 2019 06:46:13.476 # Configuration loaded
1:M 14 Nov 2019 06:46:13.478 * Running mode=standalone, port=6379.
1:M 14 Nov 2019 06:46:13.478 # WARNING: The TCP backlog setting of 30000 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 14 Nov 2019 06:46:13.478 # Server initialized
1:M 14 Nov 2019 06:46:13.478 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 14 Nov 2019 06:46:13.478 * Ready to accept connections
```
2、通过暴露的service验证
```bash
# 命令行跑一个centos7的bash基础容器
kubectl run --image=centos:7.2.1511 centos7-app -it --port=8080 --replicas=1 -n mos-namespace
# 通过service方式验证
kubectl exec `kubectl get pods -n mos-namespace|grep centos7-app|awk '{print $1}'` -it /bin/bash -n mos-namespace
yum install -y epel-release
yum install -y redis
redis-cli -h redis-production -a redispassword
```
参考文档:
https://www.cnblogs.com/klvchen/p/10862607.html
================================================
FILE: redis/K8s上运行Redis集群指南.md
================================================
Table of Contents
=================
* [一、前言](#一前言)
* [二、准备操作](#二准备操作)
* [三、StatefulSet简介](#三statefulset简介)
* [四、部署过程](#四部署过程)
* [1、创建NFS存储](#1创建nfs存储)
* [2、创建PV](#2创建pv)
* [3、创建Configmap](#3创建configmap)
* [4、创建Headless service](#4创建headless-service)
* [4、创建Redis 集群节点](#4创建redis-集群节点)
* [5、初始化Redis集群](#5初始化redis集群)
* [6、创建用于访问Service](#6创建用于访问service)
* [五、测试主从切换](#五测试主从切换)
* [六、疑问点](#六疑问点)
# 一、前言
架构原理:
`每个Master都可以拥有多个Slave。当Master下线后,Redis集群会从多个Slave中选举出一个新的Master作为替代,而旧Master重新上线后变成新Master的Slave。`
# 二、准备操作
本次部署主要基于该项目:
`https://github.com/zuxqoj/kubernetes-redis-cluster`
其包含了两种部署Redis集群的方式:
```bash
StatefulSet
Service & Deployment
```
两种方式各有优劣,对于像Redis、Mongodb、Zookeeper等有状态的服务,使用StatefulSet是首选方式。本文将主要介绍如何使用StatefulSet进行Redis集群的部署。
# 三、StatefulSet简介
- 1、RC、Deployment、DaemonSet都是面向无状态的服务,它们所管理的Pod的IP、名字,启停顺序等都是随机的,而StatefulSet是什么?顾名思义,有状态的集合,管理所有有状态的服务,比如MySQL、MongoDB集群等。
- 2、StatefulSet本质上是Deployment的一种变体,在v1.9版本中已成为GA版本,它为了解决有状态服务的问题,它所管理的Pod拥有固定的Pod名称,启停顺序,在StatefulSet中,Pod名字称为网络标识(hostname),还必须要用到共享存储。
- 3、在Deployment中,与之对应的服务是service,而在StatefulSet中与之对应的headless service,headless service,即无头服务,与service的区别就是它没有Cluster IP,解析它的名称时将返回该Headless Service对应的全部Pod的Endpoint列表。
- 4、除此之外,StatefulSet在Headless Service的基础上又为StatefulSet控制的每个Pod副本创建了一个DNS域名,这个域名的格式为:
```bash
$(podname).(headless server name)
FQDN: $(podname).(headless server name).namespace.svc.cluster.local
```
- 5、也即是说,对于有状态服务,我们最好使用固定的网络标识(如域名信息)来标记节点,当然这也需要应用程序的支持(如Zookeeper就支持在配置文件中写入主机域名)。
- 6、StatefulSet基于Headless Service(即没有Cluster IP的Service)为Pod实现了稳定的网络标志(包括Pod的hostname和DNS Records),在Pod重新调度后也保持不变。同时,结合PV/PVC,StatefulSet可以实现稳定的持久化存储,就算Pod重新调度后,还是能访问到原先的持久化数据。
- 7、以下为使用StatefulSet部署Redis的架构,无论是Master还是Slave,都作为StatefulSet的一个副本,并且数据通过PV进行持久化,对外暴露为一个Service,接受客户端请求。
# 四、部署过程
```bash
1.创建NFS存储
2.创建PV
3.创建PVC
4.创建Configmap
5.创建headless服务
6.创建Redis StatefulSet
7.初始化Redis集群
```
## 1、创建NFS存储
创建NFS存储主要是为了给Redis提供稳定的后端存储,当Redis的Pod重启或迁移后,依然能获得原先的数据。这里,我们先要创建NFS,然后通过使用PV为Redis挂载一个远程的NFS路径。
```bash
yum -y install nfs-utils #主包提供文件系统
yum -y install rpcbind #提供rpc协议
```
然后,新增/etc/exports文件,用于设置需要共享的路径
```bash
$ cat /etc/exports
/data/nfs/redis/pv1 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv2 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv3 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv4 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv5 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv6 *(rw,no_root_squash,sync,insecure)
#创建相应目录
mkdir -p /data/nfs/redis/pv{1..6}
#接着,启动NFS和rpcbind服务
systemctl restart rpcbind
systemctl restart nfs
systemctl enable nfs
systemctl enable rpcbind
#查看
exportfs -v
#客户端
yum -y install nfs-utils
#查看存储端共享
showmount -e localhost
```
## 2、创建PV
每一个Redis Pod都需要一个独立的PV来存储自己的数据,因此可以创建一个pv.yaml文件,包含6个PV
```bash
kubectl delete -f pv.yaml
cat >pv.yaml<<\EOF
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv1
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: 10.198.1.155
path: "/data/nfs/redis/pv1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv2
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: 10.198.1.155
path: "/data/nfs/redis/pv2"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv3
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: 10.198.1.155
path: "/data/nfs/redis/pv3"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv4
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: 10.198.1.155
path: "/data/nfs/redis/pv4"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv5
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: 10.198.1.155
path: "/data/nfs/redis/pv5"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv6
spec:
capacity:
storage: 20Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: 10.198.1.155
path: "/data/nfs/redis/pv6"
EOF
kubectl apply -f pv.yaml
```
## 3、创建Configmap
这里,我们可以直接将Redis的配置文件转化为Configmap,这是一种更方便的配置读取方式。配置文件redis.conf如下
```bash
#配置文件redis.conf
cat >redis.conf<<\EOF
appendonly yes
cluster-enabled yes
cluster-config-file /var/lib/redis/nodes.conf
cluster-node-timeout 5000
dir /var/lib/redis
port 6379
EOF
#删除名为redis-conf的Configmap
kubectl delete configmap redis-conf
#创建名为redis-conf的Configmap
kubectl create configmap redis-conf --from-file=redis.conf
#查看创建的configmap
$ kubectl describe cm redis-conf
Name: redis-conf
Namespace: default
Labels:
Annotations:
Data
====
redis.conf:
----
appendonly yes
cluster-enabled yes
cluster-config-file /var/lib/redis/nodes.conf
cluster-node-timeout 5000
dir /var/lib/redis
port 6379
Events:
#如上,redis.conf中的所有配置项都保存到redis-conf这个Configmap中。
```
## 4、创建Headless service
Headless service是StatefulSet实现稳定网络标识的基础,我们需要提前创建。准备文件headless-service.yaml如下:
```bash
#删除svc
kubectl delete -f headless-service.yaml
#编写svc
cat >headless-service.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
name: redis-service
labels:
app: redis
spec:
ports:
- name: redis-port
port: 6379
clusterIP: None
selector:
app: redis
appCluster: redis-cluster
EOF
#创建svc
kubectl create -f headless-service.yaml
#查看service
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-service ClusterIP None 6379/TCP 0s
```
可以看到,服务名称为redis-service,其CLUSTER-IP为None,表示这是一个“无头”服务。
## 4、创建Redis 集群节点
创建好Headless service后,就可以利用StatefulSet创建Redis 集群节点,这也是本文的核心内容。我们先创建redis.yml文件:
```bash
#清理pvc资源
kubectl delete pvc redis-data-redis-app-{0..5}
#清理pod资源
kubectl delete -f redis.yaml
#编写yaml
cat >redis.yaml<<\EOF
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: redis-app
spec:
serviceName: "redis-service"
replicas: 6
template:
metadata:
labels:
app: redis
appCluster: redis-cluster
spec:
terminationGracePeriodSeconds: 20
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- redis
topologyKey: kubernetes.io/hostname
containers:
- name: redis
image: redis
command:
- "redis-server" #redis启动命令
args:
- "/etc/redis/redis.conf" #redis-server后面跟的参数,换行代表空格
- "--protected-mode" #允许外网访问
- "no"
resources: #资源
requests: #请求的资源
cpu: "100m" #m代表千分之,相当于0.1 个cpu资源
memory: "100Mi" #内存100m大小
ports:
- name: redis
containerPort: 6379
protocol: "TCP"
- name: cluster
containerPort: 16379
protocol: "TCP"
volumeMounts:
- name: "redis-conf" #挂载configmap生成的文件
mountPath: "/etc/redis" #挂载到哪个路径下
- name: "redis-data" #挂载持久卷的路径
mountPath: "/var/lib/redis"
volumes:
- name: "redis-conf" #引用configMap卷
configMap:
name: "redis-conf"
items:
- key: "redis.conf" #创建configMap指定的名称
path: "redis.conf" #里面的那个文件--from-file参数后面的文件
volumeClaimTemplates: #进行pvc持久卷声明
- metadata:
name: redis-data
spec:
accessModes: [ "ReadWriteMany" ]
storageClassName: "nfs" #--注意这里是使用nfs storageClass,如果没有改默认的,可以忽略不写
resources:
requests:
storage: 20Gi
EOF
#创建资源
kubectl apply -f redis.yaml
PodAntiAffinity:表示反亲和性,其决定了某个pod不可以和哪些Pod部署在同一拓扑域,可以用于将一个服务的POD分散在不同的主机或者拓扑域中,提高服务本身的稳定性。
matchExpressions:规定了Redis_Pod要尽量不要调度到包含app为redis的Node上,也即是说已经存在Redis的Node上尽量不要再分配Redis Pod了.
另外,根据StatefulSet的规则,我们生成的Redis的6个Pod的hostname会被依次命名为$(statefulset名称)-$(序号),如下图所示:
```
```bash
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
redis-app-0 1/1 Running 0 2h 172.17.24.3 192.168.0.144
redis-app-1 1/1 Running 0 2h 172.17.63.8 192.168.0.148
redis-app-2 1/1 Running 0 2h 172.17.24.8 192.168.0.144
redis-app-3 1/1 Running 0 2h 172.17.63.9 192.168.0.148
redis-app-4 1/1 Running 0 2h 172.17.24.9 192.168.0.144
redis-app-5 1/1 ContainerCreating 0 2h 172.17.63.10 192.168.0.148
如上,可以看到这些Pods在部署时是以{0…N-1}的顺序依次创建的。注意,直到redis-app-0状态启动后达到Running状态之后,redis-app-1 才开始启动。
同时,每个Pod都会得到集群内的一个DNS域名,格式为$(podname).$(service name).$(namespace).svc.cluster.local ,也即是:
redis-app-0.redis-service.default.svc.cluster.local
redis-app-1.redis-service.default.svc.cluster.local
...以此类推...
这里我们可以验证一下
#kubectl run --rm curl --image=radial/busyboxplus:curl -it
kubectl run --rm -i --tty busybox --image=busybox:1.28 /bin/sh
$ nslookup redis-app-0.redis-service #注意格式 $(podname).$(service name).$(namespace)
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: redis-app-0.redis-service
Address 1: 172.17.24.3 redis-app-0.redis-service.default.svc.cluster.local
在K8S集群内部,这些Pod就可以利用该域名互相通信。我们可以使用busybox镜像的nslookup检验这些域名(一条命令)
$ kubectl run -it --rm --image=busybox:1.28 --restart=Never busybox -- nslookup redis-app-0.redis-service
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: redis-app-0.redis-service
Address 1: 172.17.24.3 redis-app-0.redis-service.default.svc.cluster.local
pod "busybox" deleted
可以看到, redis-app-0 的IP为172.17.24.3。当然,若Redis Pod迁移或是重启(我们可以手动删除掉一个Redis Pod来测试),IP是会改变的,但是Pod的域名、SRV records、A record都不会改变。
另外可以发现,我们之前创建的pv都被成功绑定了:
$ kubectl get pv|grep nfs-pv
nfs-pv1 20Gi RWX Retain Bound default/redis-data-redis-app-1 nfs 65s
nfs-pv2 20Gi RWX Retain Bound default/redis-data-redis-app-0 nfs 65s
nfs-pv3 20Gi RWX Retain Bound default/redis-data-redis-app-2 nfs 65s
nfs-pv4 20Gi RWX Retain Bound default/redis-data-redis-app-5 nfs 65s
nfs-pv5 20Gi RWX Retain Bound default/redis-data-redis-app-3 nfs 65s
nfs-pv6 20Gi RWX Retain Bound default/redis-data-redis-app-4 nfs 65s
查看pvc资源
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
redis-data-redis-app-0 Bound nfs-pv2 20Gi RWX nfs 96s
redis-data-redis-app-1 Bound nfs-pv1 20Gi RWX nfs 86s
redis-data-redis-app-2 Bound nfs-pv3 20Gi RWX nfs 75s
redis-data-redis-app-3 Bound nfs-pv5 20Gi RWX nfs 69s
redis-data-redis-app-4 Bound nfs-pv6 20Gi RWX nfs 62s
redis-data-redis-app-5 Bound nfs-pv4 20Gi RWX nfs 56s
```
## 5、初始化Redis集群
创建好6个Redis Pod后,我们还需要利用常用的Redis-tribe工具进行集群的初始化
创建Ubuntu容器
由于Redis集群必须在所有节点启动后才能进行初始化,而如果将初始化逻辑写入Statefulset中,则是一件非常复杂而且低效的行为。这里,本人不得不称赞一下原项目作者的思路,值得学习。也就是说,我们可以在K8S上创建一个额外的容器,专门用于进行K8S集群内部某些服务的管理控制。
这里,我们专门启动一个Ubuntu的容器,可以在该容器中安装Redis-tribe,进而初始化Redis集群,执行:
```bash
1、#创建一个ubuntu容器
kubectl run -it ubuntu --image=ubuntu --restart=Never /bin/bash
#进入到容器
kubectl exec -it ubuntu /bin/bash
2、#我们使用阿里云的Ubuntu源,执行
$ cat > /etc/apt/sources.list << EOF
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
EOF
3、#成功后,原项目要求执行如下命令安装基本的软件环境:
apt-get update
apt-get install -y vim wget python2.7 python-pip redis-tools dnsutils
4、#初始化集群
首先,我们需要安装redis-trib
pip install redis-trib==0.5.1
然后,创建只有Master节点的集群
redis-trib.py create \
`dig +short redis-app-0.redis-service.default.svc.cluster.local`:6379 \
`dig +short redis-app-1.redis-service.default.svc.cluster.local`:6379 \
`dig +short redis-app-2.redis-service.default.svc.cluster.local`:6379
其次,为每个Master添加Slave
redis-trib.py replicate \
--master-addr `dig +short redis-app-0.redis-service.default.svc.cluster.local`:6379 \
--slave-addr `dig +short redis-app-3.redis-service.default.svc.cluster.local`:6379
redis-trib.py replicate \
--master-addr `dig +short redis-app-1.redis-service.default.svc.cluster.local`:6379 \
--slave-addr `dig +short redis-app-4.redis-service.default.svc.cluster.local`:6379
redis-trib.py replicate \
--master-addr `dig +short redis-app-2.redis-service.default.svc.cluster.local`:6379 \
--slave-addr `dig +short redis-app-5.redis-service.default.svc.cluster.local`:6379
至此,我们的Redis集群就真正创建完毕了,连到任意一个Redis Pod中检验一下:
$ kubectl exec -it redis-app-2 /bin/bash
root@redis-app-2:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> cluster nodes
5d3e77f6131c6f272576530b23d1cd7592942eec 172.17.24.3:6379@16379 master - 0 1559628533000 1 connected 0-5461
a4b529c40a920da314c6c93d17dc603625d6412c 172.17.63.10:6379@16379 master - 0 1559628531670 6 connected 10923-16383
368971dc8916611a86577a8726e4f1f3a69c5eb7 172.17.24.9:6379@16379 slave 0025e6140f85cb243c60c214467b7e77bf819ae3 0 1559628533672 4 connected
0025e6140f85cb243c60c214467b7e77bf819ae3 172.17.63.8:6379@16379 master - 0 1559628533000 2 connected 5462-10922
6d5ee94b78b279e7d3c77a55437695662e8c039e 172.17.24.8:6379@16379 myself,slave a4b529c40a920da314c6c93d17dc603625d6412c 0 1559628532000 5 connected
2eb3e06ce914e0e285d6284c4df32573e318bc01 172.17.63.9:6379@16379 slave 5d3e77f6131c6f272576530b23d1cd7592942eec 0 1559628533000 3 connected
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:6
cluster_stats_messages_ping_sent:14910
cluster_stats_messages_pong_sent:15139
cluster_stats_messages_sent:30049
cluster_stats_messages_ping_received:15139
cluster_stats_messages_pong_received:14910
cluster_stats_messages_received:30049
127.0.0.1:6379>
另外,还可以在NFS上查看Redis挂载的数据:
$ ll /data/nfs/redis/pv3
total 12
-rw-r--r-- 1 root root 92 Jun 4 11:36 appendonly.aof
-rw-r--r-- 1 root root 175 Jun 4 11:36 dump.rdb
-rw-r--r-- 1 root root 794 Jun 4 11:49 nodes.conf
```
## 6、创建用于访问Service
前面我们创建了用于实现StatefulSet的Headless Service,但该Service没有Cluster Ip,因此不能用于外界访问。所以,我们还需要创建一个Service,专用于为Redis集群提供访问和负载均衡:
```bash
#删除服务
kubectl delete -f redis-access-service.yaml
#编写yaml
cat >redis-access-service.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
name: redis-access-service
labels:
app: redis
spec:
type: NodePort
ports:
- name: redis-port
protocol: "TCP"
port: 6379
targetPort: 6379
nodePort: 30010
selector:
app: redis
appCluster: redis-cluster
EOF
#如上,该Service名称为 redis-access-service,在K8S集群中暴露6379端口,并且会对labels name为app: redis或appCluster: redis-cluster的pod进行负载均衡。
#创建服务
kubectl apply -f redis-access-service.yaml
#查看svc
$ kubectl get svc redis-access-service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
redis-access-service NodePort 10.111.59.191 6379:30010/TCP 83m app=redis,appCluster=redis-cluster
#如上,在K8S集群中,所有应用都可以通过 10.111.59.191:6379 来访问Redis集群。当然,为了方便测试,我们也可以为Service添加一个NodePort映射到物理机30010上。
#查看svc详情
$ kubectl describe svc redis-access-service
Name: redis-access-service
Namespace: default
Labels: app=redis
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"redis"},"name":"redis-access-service","namespace":"defau...
Selector: app=redis,appCluster=redis-cluster
Type: NodePort
IP: 10.111.59.191
Port: redis-port 6379/TCP
TargetPort: 6379/TCP
NodePort: redis-port 30010/TCP
Endpoints: 10.244.1.230:6379,10.244.1.231:6379,10.244.1.232:6379 + 3 more...
Session Affinity: None
External Traffic Policy: Cluster
Events:
#集群内测试(service ip 测试)
yum install redis -y
redis-cli -h 10.111.59.191 -p 6379 -c
10.111.59.191:6379> CLUSTER info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:5
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:3
cluster_stats_messages_ping_sent:766
cluster_stats_messages_pong_sent:790
cluster_stats_messages_meet_sent:2
cluster_stats_messages_sent:1558
cluster_stats_messages_ping_received:787
cluster_stats_messages_pong_received:768
cluster_stats_messages_meet_received:3
cluster_stats_messages_received:1558
#宿主机端口测试(使用集群协议测试)
redis-cli -h 10.198.1.156 -p 30010 -c
10.198.1.156:30010> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:5
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:2
cluster_stats_messages_ping_sent:907
cluster_stats_messages_pong_sent:901
cluster_stats_messages_meet_sent:3
cluster_stats_messages_sent:1811
cluster_stats_messages_ping_received:900
cluster_stats_messages_pong_received:910
cluster_stats_messages_meet_received:1
cluster_stats_messages_received:1811
```
# 五、测试主从切换
在K8S上搭建完好Redis集群后,我们最关心的就是其原有的高可用机制是否正常。这里,我们可以任意挑选一个Master的Pod来测试集群的主从切换机制,如redis-app-0:
```bash
[root@master redis]# kubectl get pods redis-app-0 -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
redis-app-1 1/1 Running 0 3h 172.17.24.3 192.168.0.144
进入redis-app-0查看:
[root@master redis]# kubectl exec -it redis-app-0 /bin/bash
root@redis-app-0:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> role
1) "master"
2) (integer) 13370
3) 1) 1) "172.17.63.9"
2) "6379"
3) "13370"
127.0.0.1:6379>
如上可以看到,app-0为master,slave为172.17.63.9即redis-app-3。
接着,我们手动删除redis-app-0:
[root@master redis]# kubectl delete pod redis-app-0
pod "redis-app-0" deleted
[root@master redis]# kubectl get pod redis-app-0 -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
redis-app-0 1/1 Running 0 4m 172.17.24.3 192.168.0.144
我们再进入redis-app-0内部查看:
[root@master redis]# kubectl exec -it redis-app-0 /bin/bash
root@redis-app-0:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> role
1) "slave"
2) "172.17.63.9"
3) (integer) 6379
4) "connected"
5) (integer) 13958
如上,redis-app-0变成了slave,从属于它之前的从节点172.17.63.9即redis-app-3
```
# 六、疑问点
1、pod重启,ip变了,集群健康性如何维护
```
至此,大家可能会疑惑,前面讲了这么多似乎并没有体现出StatefulSet的作用,其提供的稳定标志redis-app-*仅在初始化集群的时候用到,而后续Redis Pod的通信或配置文件中并没有使用该标志。我想说,是的,本文使用StatefulSet部署Redis确实没有体现出其优势,还不如介绍Zookeeper集群来的明显,不过没关系,学到知识就好。
那为什么没有使用稳定的标志,Redis Pod也能正常进行故障转移呢?这涉及了Redis本身的机制。因为,Redis集群中每个节点都有自己的NodeId(保存在自动生成的nodes.conf中),并且该NodeId不会随着IP的变化和变化,这其实也是一种固定的网络标志。也就是说,就算某个Redis Pod重启了,该Pod依然会加载保存的NodeId来维持自己的身份。我们可以在NFS上查看redis-app-1的nodes.conf文件
$ cat /usr/local/k8s/redis/pv1/nodes.conf
96689f2018089173e528d3a71c4ef10af68ee462 192.168.169.209:6379@16379 slave d884c4971de9748f99b10d14678d864187a9e5d3 0 1526460952651 4 connected
237d46046d9b75a6822f02523ab894928e2300e6 192.168.169.200:6379@16379 slave c15f378a604ee5b200f06cc23e9371cbc04f4559 0 1526460952651 1 connected
c15f378a604ee5b200f06cc23e9371cbc04f4559 192.168.169.197:6379@16379 master - 0 1526460952651 1 connected 10923-16383
d884c4971de9748f99b10d14678d864187a9e5d3 192.168.169.205:6379@16379 master - 0 1526460952651 4 connected 5462-10922
c3b4ae23c80ffe31b7b34ef29dd6f8d73beaf85f 192.168.169.198:6379@16379 myself,slave c8a8f70b4c29333de6039c47b2f3453ed11fb5c2 0 1526460952565 3 connected
c8a8f70b4c29333de6039c47b2f3453ed11fb5c2 192.168.169.201:6379@16379 master - 0 1526460952651 6 connected 0-5461
vars currentEpoch 6 lastVoteEpoch 4
如上,第一列为NodeId,稳定不变;第二列为IP和端口信息,可能会改变。
这里,我们介绍NodeId的两种使用场景:
当某个Slave Pod断线重连后IP改变,但是Master发现其NodeId依旧, 就认为该Slave还是之前的Slave。
当某个Master Pod下线后,集群在其Slave中选举重新的Master。待旧Master上线后,集群发现其NodeId依旧,会让旧Master变成新Master的slave。
```
2、pvc绑定不上报错(storageclass.storage.k8s.io "nfs" not found报错)
```
$ kubectl describe pvc redis-data-redis-app-0
Warning ProvisioningFailed 14s (x2 over 24s) persistentvolume-controller storageclass.storage.k8s.io "nfs" not found
#原因为创建pv的时候,没有指定
storageClassName: nfs
```
参考文档:
https://cloud.tencent.com/developer/article/1392872 redis动态扩容
https://blog.csdn.net/zhutongcloud/article/details/90768390 部署Redis集群
https://www.jianshu.com/p/65c4baadf5d9 redis故障切换nodeid原因
================================================
FILE: redis/README.md
================================================
参考资料:
https://mp.weixin.qq.com/s/noVUEO5tbdcdx8AzYNrsMw Kubernetes上通过sts测试Redis Cluster集群
================================================
FILE: rke/README.md
================================================
# 一、基础配置优化
```
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
groupadd docker
useradd -g docker docker
echo "1Qaz2Wsx3Edc" | passwd --stdin docker
usermod docker -G docker #注意这里需要将数组改为docker属组,不然会报错
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config # 关闭selinux
systemctl daemon-reload
systemctl stop firewalld.service && systemctl disable firewalld.service # 关闭防火墙
#echo 'LANG="en_US.UTF-8"' >> /etc/profile; source /etc/profile # 修改系统语言
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime # 修改时区(如果需要修改)
# 性能调优
cat >> /etc/sysctl.conf< /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
sysctl --system
#docker用户免密登录
mkdir -p /home/docker/.ssh/
chmod 700 /home/docker/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhl11JKxTz7s49nNYaNDIwB13KaNpvBEHVoW3frUnP+RnIKIIDsr1QCr9t64D9TE99mbNkEvDXr021UQi12Bf4KP/8gfYK3hDMRuX634/K8yu7+IaO1vEPNT8HDo9XGcvrOD1QGV+is8mrU53Xa2qTsto7AOb2J8M6n1mSZxgNz2oGc6ZDuN1iMBfHm4O/s5VEgbttzB2PtI0meKeaLt8VaqwTth631EN1ryjRYUuav7bf docker@k8s-master-01' > /home/docker/.ssh/authorized_keys
chmod 400 /home/docker/.ssh/authorized_keys
```
## 二、基础环境准备
```
mkdir -p /etc/yum.repos.d_bak/
mv /etc/yum.repos.d/* /etc/yum.repos.d_bak/
curl http://mirrors.aliyun.com/repo/Centos-7.repo >/etc/yum.repos.d/Centos-7.repo
curl http://mirrors.aliyun.com/repo/epel-7.repo >/etc/yum.repos.d/epel-7.repo
sed -i '/aliyuncs/d' /etc/yum.repos.d/Centos-7.repo
yum clean all && yum makecache fast
yum -y install yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y device-mapper-persistent-data lvm2
yum install docker-ce -y
#从docker1.13版本开始,docker会自动设置iptables的FORWARD默认策略为DROP,所以需要修改docker的启动配置文件/usr/lib/systemd/system/docker.service
cat > /usr/lib/systemd/system/docker.service << \EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
ExecReload=/bin/kill -s HUP \$MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
#设置加速器
curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://41935bf4.m.daocloud.io
#这个脚本在centos 7上有个bug,脚本会改变docker的配置文件/etc/docker/daemon.json但修改的时候多了一个逗号,导致docker无法启动
#或者直接执行这个指令
tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://1z45x7d0.mirror.aliyuncs.com"],
"insecure-registries": ["192.168.56.11:5000"],
"storage-driver": "overlay2",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m",
"max-file": "3"
}
}
EOF
systemctl daemon-reload
systemctl restart docker
#查看加速器是否生效
root># docker info
Registry Mirrors:
https://1z45x7d0.mirror.aliyuncs.com/ --发现参数已经生效
Live Restore Enabled: false
```
## 三、RKE安装
使用RKE安装,需要先安装好docker和设置好root和普通用户的免key登录
1、下载RKE
```
#可以从https://github.com/rancher/rke/releases下载安装包,本文使用版本v0.3.0.下载完后将安装包上传至任意节点.
wget https://github.com/rancher/rke/releases/download/v0.2.8/rke_linux-amd64
chmod 777 rke_linux-amd64
mv rke_linux-amd64 /usr/local/bin/rke
```
2、创建集群配置文件
```
cat >/tmp/cluster.yml <> ~/.bashrc
```
# 四、helm将rancher部署在k8s集群
1、安装并配置helm客户端
```
#使用官方提供的脚本一键安装
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh
#手动下载安装
#下载 Helm
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.9.1-linux-amd64.tar.gz
#解压 Helm
tar -zxvf helm-v2.9.1-linux-amd64.tar.gz
#复制客户端执行文件到 bin 目录下
cp linux-amd64/helm /usr/local/bin/
```
2、配置helm客户端具有访问k8s集群的权限
```
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
```
3、将helm server(titler)部署到k8s集群
```
helm init --service-account tiller --tiller-image hongxiaolu/tiller:v2.12.3 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
```
4、为helm客户端配置chart仓库
```
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
```
5、检查rancher chart仓库可用
```
helm search rancher
```
```
安装证书管理器
helm install stable/cert-manager \
--name cert-manager \
--namespace kube-system
kubectl get pods --all-namespaces|grep cert-manager
helm install rancher-stable/rancher \
--name rancher \
--namespace cattle-system \
--set hostname=acai.rancher.com
```
参考资料:
http://www.acaiblog.cn/2019/03/15/RKE%E9%83%A8%E7%BD%B2rancher%E9%AB%98%E5%8F%AF%E7%94%A8%E9%9B%86%E7%BE%A4/
https://blog.csdn.net/login_sonata/article/details/93847888
================================================
FILE: rke/cluster.yml
================================================
# If you intened to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: 10.198.1.156
port: "22"
internal_address: ""
role:
- controlplane
- worker
- etcd
hostname_override: ""
user: k8s
docker_socket: /var/run/docker.sock
ssh_key: ""
ssh_key_path: ~/.ssh/id_rsa
labels: {}
- address: 10.198.1.157
port: "22"
internal_address: ""
role:
- controlplane
- worker
- etcd
hostname_override: ""
user: k8s
docker_socket: /var/run/docker.sock
ssh_key: ""
ssh_key_path: ~/.ssh/id_rsa
labels: {}
- address: 10.198.1.158
port: "22"
internal_address: ""
role:
- worker
hostname_override: ""
user: k8s
docker_socket: /var/run/docker.sock
ssh_key: ""
ssh_key_path: ~/.ssh/id_rsa
labels: {}
- address: 10.198.1.159
port: "22"
internal_address: ""
role:
- worker
hostname_override: ""
user: k8s
docker_socket: /var/run/docker.sock
ssh_key: ""
ssh_key_path: ~/.ssh/id_rsa
labels: {}
- address: 10.198.1.160
port: "22"
internal_address: ""
role:
- worker
hostname_override: ""
user: k8s
docker_socket: /var/run/docker.sock
ssh_key: ""
ssh_key_path: ~/.ssh/id_rsa
labels: {}
services:
etcd:
image: ""
extra_args: {}
extra_binds: []
extra_env: []
external_urls: []
ca_cert: ""
cert: ""
key: ""
path: ""
snapshot: null
retention: ""
creation: ""
kube-api:
image: ""
extra_args:
enable-admission-plugins: NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Initializers
runtime-config: api/all=true,admissionregistration.k8s.io/v1alpha1=true
extra_binds: []
extra_env: []
service_cluster_ip_range: 10.44.0.0/16
service_node_port_range: ""
pod_security_policy: true
kube-controller:
image: ""
extra_args: {}
extra_binds: []
extra_env: []
cluster_cidr: 10.46.0.0/16
service_cluster_ip_range: 10.44.0.0/16
scheduler:
image: ""
extra_args: {}
extra_binds: []
extra_env: []
kubelet:
image: ""
extra_args:
enforce-node-allocatable: "pods,kube-reserved,system-reserved"
system-reserved-cgroup: "/system.slice"
system-reserved: "cpu=500m,memory=1Gi"
kube-reserved-cgroup: "/system.slice/kubelet.service"
kube-reserved: "cpu=1,memory=2Gi"
eviction-soft: "memory.available<10%,nodefs.available<10%,imagefs.available<10%"
eviction-soft-grace-period: "memory.available=2m,nodefs.available=2m,imagefs.available=2m"
extra_binds: []
extra_env: []
cluster_domain: k8s.test.net
infra_container_image: ""
cluster_dns_server: 10.44.0.10
fail_swap_on: false
kubeproxy:
image: ""
extra_args: {}
extra_binds: []
extra_env: []
network:
plugin: calico
options: {}
authentication:
strategy: x509
options: {}
sans: []
addons: ""
addons_include: []
system_images:
etcd: rancher/coreos-etcd:v3.2.24
alpine: rancher/rke-tools:v0.1.25
nginx_proxy: rancher/rke-tools:v0.1.25
cert_downloader: rancher/rke-tools:v0.1.25
kubernetes_services_sidecar: rancher/rke-tools:v0.1.25
kubedns: rancher/k8s-dns-kube-dns-amd64:1.14.13
dnsmasq: rancher/k8s-dns-dnsmasq-nanny-amd64:1.14.13
kubedns_sidecar: rancher/k8s-dns-sidecar-amd64:1.14.13
kubedns_autoscaler: rancher/cluster-proportional-autoscaler-amd64:1.0.0
kubernetes: rancher/hyperkube:v1.12.6-rancher1
flannel: rancher/coreos-flannel:v0.10.0
flannel_cni: rancher/coreos-flannel-cni:v0.3.0
calico_node: rancher/calico-node:v3.1.3
calico_cni: rancher/calico-cni:v3.1.3
calico_controllers: ""
calico_ctl: rancher/calico-ctl:v2.0.0
canal_node: rancher/calico-node:v3.1.3
canal_cni: rancher/calico-cni:v3.1.3
canal_flannel: rancher/coreos-flannel:v0.10.0
wave_node: weaveworks/weave-kube:2.1.2
weave_cni: weaveworks/weave-npc:2.1.2
pod_infra_container: rancher/pause-amd64:3.1
ingress: rancher/nginx-ingress-controller:0.21.0-rancher1
ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.4
metrics_server: rancher/metrics-server-amd64:v0.3.1
ssh_key_path: ~/.ssh/id_rsa
ssh_agent_auth: false
authorization:
mode: rbac
options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries: []
ingress:
provider: ""
options: {}
node_selector: {}
extra_args: {}
cluster_name: ""
cloud_provider:
name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
address: ""
port: ""
user: ""
ssh_key: ""
ssh_key_path: ""
monitoring:
provider: ""
options: {}
================================================
FILE: tools/Linux Kernel 升级.md
================================================
# Linux Kernel 升级
k8s,docker,cilium等很多功能、特性需要较新的linux内核支持,所以有必要在集群部署前对内核进行升级;CentOS7 和 Ubuntu16.04可以很方便的完成内核升级。
## CentOS7
红帽企业版 Linux 仓库网站 https://www.elrepo.org,主要提供各种硬件驱动(显卡、网卡、声卡等)和内核升级相关资源;兼容 CentOS7 内核升级。如下按照网站提示载入elrepo公钥及最新elrepo版本,然后按步骤升级内核(以安装长期支持版本 kernel-lt 为例)
``` bash
# 载入公钥
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# 安装ELRepo
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm
# 载入elrepo-kernel元数据
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
# 查看可用的rpm包
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
# 安装长期支持版本的kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
# 删除旧版本工具包
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y
# 安装新版本工具包
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64
# 查看默认启动顺序
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.208-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-1062.9.1.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-957.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-292a31ba53a34a6aa077e3467b6f9541) 7 (Core)
# 默认启动的顺序是从0开始,新内核是从头插入(目前位置在0,而4.4.4的是在1),所以需要选择0。
grub2-set-default 0
# 将第一个内核作为默认内核
sed -i 's/GRUB_DEFAULT=saved/GRUB_DEFAULT=0/g' /etc/default/grub
# 更新 grub
grub2-mkconfig -o /boot/grub2/grub.cfg
# 重启并检查
reboot
```
## Ubuntu16.04
``` bash
打开 http://kernel.ubuntu.com/~kernel-ppa/mainline/ 并选择列表中选择你需要的版本(以4.16.3为例)。
接下来,根据你的系统架构下载 如下.deb 文件:
Build for amd64 succeeded (see BUILD.LOG.amd64):
linux-headers-4.16.3-041603_4.16.3-041603.201804190730_all.deb
linux-headers-4.16.3-041603-generic_4.16.3-041603.201804190730_amd64.deb
linux-image-4.16.3-041603-generic_4.16.3-041603.201804190730_amd64.deb
#安装后重启即可
$ sudo dpkg -i *.deb
```
参考文档:
https://github.com/easzlab/kubeasz/blob/master/docs/guide/kernel_upgrade.md
================================================
FILE: tools/README.md
================================================
# 同步工具
1、同步主机host文件
```
[root@master01 ~]# ./ssh_copy.sh /etc/hosts
spawn scp /etc/hosts root@master01:/etc/hosts
hosts 100% 440 940.4KB/s 00:00
spawn scp /etc/hosts root@master02:/etc/hosts
hosts 100% 440 774.6KB/s 00:00
spawn scp /etc/hosts root@master03:/etc/hosts
hosts 100% 440 1.4MB/s 00:00
spawn scp /etc/hosts root@slave01:/etc/hosts
hosts 100% 440 912.6KB/s 00:00
spawn scp /etc/hosts root@slave02:/etc/hosts
hosts 100% 440 826.8KB/s 00:00
spawn scp /etc/hosts root@slave03:/etc/hosts
hosts
```
2、iptables多端口
```bash
#iptables多端口
-A RH-Firewall-1-INPUT -s 13.138.33.20/32 -p tcp -m tcp -m multiport --dports 80,443,6443,20000:40000 -j ACCEPT
#同步防火墙
./ssh_copy.sh /etc/sysconfig/iptables
```
================================================
FILE: tools/k8s域名解析coredns问题排查过程.md
================================================
参考资料:
https://segmentfault.com/a/1190000019823091?utm_source=tag-newest
================================================
FILE: tools/kubernetes-node打标签.md
================================================
```
kubectl get nodes -A --show-labels
kubectl label nodes 10.199.1.159 node=10.199.1.159
kubectl label nodes 10.199.1.160 node=10.199.1.160
```
================================================
FILE: tools/kubernetes-常用操作.md
================================================
# 一、节点调度配置
```
[root@master01 ~]# kubectl get nodes -A
NAME STATUS ROLES AGE VERSION
10.19.2.246 Ready node 3h13m v1.15.2
10.19.2.247 Ready node 3h13m v1.15.2
10.19.2.248 Ready node 3h13m v1.15.2
10.19.2.56 Ready,SchedulingDisabled master 4h55m v1.15.2
10.19.2.57 Ready,SchedulingDisabled master 4h55m v1.15.2
10.19.2.58 Ready,SchedulingDisabled master 4h55m v1.15.2
#方法一
[root@master01 ~]# kubectl uncordon 10.19.2.56
node/10.19.2.56 uncordoned
[root@master01 ~]# kubectl get nodes -A
NAME STATUS ROLES AGE VERSION
10.19.2.246 Ready node 3h13m v1.15.2
10.19.2.247 Ready node 3h13m v1.15.2
10.19.2.248 Ready node 3h13m v1.15.2
10.19.2.56 Ready master 4h56m v1.15.2
10.19.2.57 Ready,SchedulingDisabled master 4h56m v1.15.2
10.19.2.58 Ready,SchedulingDisabled master 4h56m v1.15.2
#方法二
[root@master01 ~]# kubectl patch node 10.19.2.56 -p '{"spec":{"unschedulable":false}}'
node/10.19.2.56 patched
[root@master01 ~]# kubectl get nodes -A
NAME STATUS ROLES AGE VERSION
10.19.2.246 Ready node 3h17m v1.15.2
10.19.2.247 Ready node 3h17m v1.15.2
10.19.2.248 Ready node 3h17m v1.15.2
10.19.2.56 Ready master 5h v1.15.2
10.19.2.57 Ready,SchedulingDisabled master 5h v1.15.2
10.19.2.58 Ready,SchedulingDisabled master 5h v1.15.2
```
# 二、标签查看
```
[root@master01 ~]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
10.19.2.246 Ready node 3h15m v1.15.2 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.246,kubernetes.io/os=linux,kubernetes.io/role=node
10.19.2.247 Ready node 3h15m v1.15.2 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.247,kubernetes.io/os=linux,kubernetes.io/role=node
10.19.2.248 Ready node 3h15m v1.15.2 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.248,kubernetes.io/os=linux,kubernetes.io/role=node
10.19.2.56 Ready master 4h57m v1.15.2 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.56,kubernetes.io/os=linux,kubernetes.io/role=master
10.19.2.57 Ready,SchedulingDisabled master 4h57m v1.15.2 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.57,kubernetes.io/os=linux,kubernetes.io/role=master
10.19.2.58 Ready,SchedulingDisabled master 4h57m v1.15.2 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.58,kubernetes.io/os=linux,kubernetes.io/role=master
```
参考文档:
https://blog.csdn.net/miss1181248983/article/details/88181434 Kubectl常用命令
================================================
FILE: tools/kubernetes-批量删除Pods.md
================================================
# 一、批量删除处于Pending状态的pod
```
kubectl get pods | grep Pending | awk '{print $1}' | xargs kubectl delete pod
```
# 二、批量删除处于Evicted状态的pod
```
kubectl get pods | grep Evicted | awk '{print $1}' | xargs kubectl delete pod
```
参考文档:
https://blog.csdn.net/weixin_39686421/article/details/80574131 kubernetes-批量删除Evicted Pods
================================================
FILE: tools/kubernetes访问外部mysql服务.md
================================================
`
k8s访问集群外独立的服务最好的方式是采用Endpoint方式(可以看作是将k8s集群之外的服务抽象为内部服务),以mysql服务为例
`
# 一、创建endpoints
```bash
#创建 mysql-endpoints.yaml
cat > mysql-endpoints.yaml <<\EOF
kind: Endpoints
apiVersion: v1
metadata:
name: mysql-production
namespace: default
subsets:
- addresses:
- ip: 10.198.1.155
ports:
- port: 3306
EOF
kubectl apply -f mysql-endpoints.yaml
```
# 二、创建service
```bash
#创建 mysql-service.yaml
cat > mysql-service.yaml <<\EOF
apiVersion: v1
kind: Service
metadata:
name: mysql-production
spec:
ports:
- port: 3306
EOF
kubectl apply -f mysql-service.yaml
```
# 三、测试连接数据库
```bash
cat > mysql-rc.yaml <<\EOF
apiVersion: v1
kind: ReplicationController
metadata:
name: mysql
spec:
replicas: 1
selector:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: docker.io/mysql:5.7
imagePullPolicy: IfNotPresent
ports:
- containerPort: 3306
env:
- name: MYSQL_ROOT_PASSWORD
value: "123456"
EOF
kubectl apply -f mysql-rc.yaml
```
参考资料:
https://blog.csdn.net/hxpjava1/article/details/80040407 使用kubernetes访问外部服务mysql/redis
================================================
FILE: tools/ssh_copy.sh
================================================
#!/bin/bash
for i in `echo master01 master02 master03 slave01 slave02 slave03`;do
expect -c "
spawn scp $1 root@$i:$1
expect {
\"*yes/no*\" {send \"yes\r\"; exp_continue}
\"*password*\" {send \"123456\r\"; exp_continue}
\"*Password*\" {send \"123456\r\";}
} "
done