Repository: Lancger/opsfull Branch: master Commit: 5b36608dbe13 Files: 104 Total size: 533.3 KB Directory structure: gitextract_y9znwe6w/ ├── LICENSE ├── README.md ├── apps/ │ ├── README.md │ ├── nginx/ │ │ └── README.md │ ├── ops/ │ │ └── README.md │ └── wordpress/ │ ├── README.md │ ├── 基于PV_PVC部署Wordpress 示例.md │ └── 部署Wordpress 示例.md ├── components/ │ ├── README.md │ ├── cronjob/ │ │ └── README.md │ ├── dashboard/ │ │ ├── Kubernetes-Dashboard v2.0.0.md │ │ └── README.md │ ├── external-storage/ │ │ ├── 0、nfs服务端搭建.md │ │ ├── 1、k8s的pv和pvc简述.md │ │ ├── 2、静态配置PV和PVC.md │ │ ├── 3、动态申请PV卷.md │ │ ├── 4、Kubernetes之MySQL持久存储和故障转移.md │ │ ├── 5、Kubernetes之Nginx动静态PV持久存储.md │ │ └── README.md │ ├── heapster/ │ │ └── README.md │ ├── ingress/ │ │ ├── 0.通俗理解Kubernetes中Service、Ingress与Ingress Controller的作用与关系.md │ │ ├── 1.kubernetes部署Ingress-nginx单点和高可用.md │ │ ├── 1.外部服务发现之Ingress介绍.md │ │ ├── 2.ingress tls配置.md │ │ ├── 3.ingress-http使用示例.md │ │ ├── 4.ingress-https使用示例.md │ │ ├── 5.hello-tls.md │ │ ├── 6.ingress-https使用示例.md │ │ ├── README.md │ │ ├── nginx-ingress/ │ │ │ └── README.md │ │ ├── traefik-ingress/ │ │ │ ├── 1.traefik反向代理Deamonset模式.md │ │ │ ├── 2.traefik反向代理Deamonset模式TLS.md │ │ │ └── README.md │ │ └── 常用操作.md │ ├── initContainers/ │ │ └── README.md │ ├── job/ │ │ └── README.md │ ├── k8s-monitor/ │ │ └── README.md │ ├── kube-proxy/ │ │ └── README.md │ ├── nfs/ │ │ └── README.md │ └── pressure/ │ ├── README.md │ ├── calico bgp网络需要物理路由和交换机支持吗.md │ └── k8s集群更换网段方案.md ├── docs/ │ ├── Envoy的架构与基本术语.md │ ├── Kubernetes学习笔记.md │ ├── Kubernetes架构介绍.md │ ├── Kubernetes集群环境准备.md │ ├── app.md │ ├── app2.md │ ├── ca.md │ ├── coredns.md │ ├── dashboard.md │ ├── dashboard_op.md │ ├── delete.md │ ├── docker-install.md │ ├── etcd-install.md │ ├── flannel.md │ ├── k8s-error-resolution.md │ ├── k8s_pv_local.md │ ├── k8s重启pod.md │ ├── master.md │ ├── node.md │ ├── operational.md │ ├── 外部访问K8s中Pod的几种方式.md │ └── 虚拟机环境准备.md ├── example/ │ ├── coredns/ │ │ └── coredns.yaml │ └── nginx/ │ ├── nginx-daemonset.yaml │ ├── nginx-deployment.yaml │ ├── nginx-ingress.yaml │ ├── nginx-pod.yaml │ ├── nginx-rc.yaml │ ├── nginx-rs.yaml │ ├── nginx-service-nodeport.yaml │ └── nginx-service.yaml ├── helm/ │ └── README.md ├── kubeadm/ │ ├── K8S-HA-V1.13.4-关闭防火墙版.md │ ├── K8S-HA-V1.16.x-云环境-Calico.md │ ├── K8S-V1.16.2-开启防火墙-Flannel.md │ ├── Kubernetes 集群变更IP地址.md │ ├── README.md │ ├── k8S-HA-V1.15.3-Calico-开启防火墙版.md │ ├── k8S-HA-V1.15.3-Flannel-开启防火墙版.md │ ├── k8s清理.md │ ├── kubeadm.yaml │ ├── kubeadm初始化k8s集群延长证书过期时间.md │ └── kubeadm无法下载镜像问题.md ├── manual/ │ ├── README.md │ ├── v1.14/ │ │ └── README.md │ └── v1.15.3/ │ └── README.md ├── mysql/ │ ├── README.md │ └── kubernetes访问外部mysql服务.md ├── redis/ │ ├── K8s上Redis集群动态扩容.md │ ├── K8s上运行Redis单实例.md │ ├── K8s上运行Redis集群指南.md │ └── README.md ├── rke/ │ ├── README.md │ └── cluster.yml └── tools/ ├── Linux Kernel 升级.md ├── README.md ├── k8s域名解析coredns问题排查过程.md ├── kubernetes-node打标签.md ├── kubernetes-常用操作.md ├── kubernetes-批量删除Pods.md ├── kubernetes访问外部mysql服务.md └── ssh_copy.sh ================================================ FILE CONTENTS ================================================ ================================================ FILE: LICENSE ================================================ Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. 
"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. 
This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 
================================================
FILE: README.md
================================================
# 一、Kubernetes Guide

- [Kubernetes Architecture Overview](docs/Kubernetes架构介绍.md)
- [Kubernetes Cluster Environment Preparation](docs/Kubernetes集群环境准备.md)
- [Docker Installation](docs/docker-install.md)
- [CA Certificate Creation](docs/ca.md)
- [ETCD Cluster Deployment](docs/etcd-install.md)
- [Master Node Deployment](docs/master.md)
- [Node Deployment](docs/node.md)
- [Flannel Deployment](docs/flannel.md)
- [Creating Applications](docs/app.md)
- [Troubleshooting Notes](docs/k8s-error-resolution.md)
- [Operations Handbook](docs/operational.md)
- [Envoy Architecture and Basic Terminology](docs/Envoy的架构与基本术语.md)
- [K8S Study Notes](docs/Kubernetes学习笔记.md)
- [Restarting Pods in K8S](docs/k8s%E9%87%8D%E5%90%AFpod.md)
- [K8S Cleanup](docs/delete.md)
- [Ways to Access Pods in K8s from Outside the Cluster](docs/外部访问K8s中Pod的几种方式.md)
- [Application Testing](docs/app2.md)
- [PVC](docs/k8s_pv_local.md)
- [Dashboard Operations](docs/dashboard_op.md)

# User Guide
Manual deployment:

1. Prepare the Kubernetes cluster environment
2. Install Docker
3. Create the CA certificates
4. Deploy the etcd cluster
5. Deploy the Master nodes
6. Deploy the Node (worker) nodes
7. Deploy Flannel
8. Create an application
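After step 8, it is worth sanity-checking the control plane before moving on to the add-ons. A minimal check, assuming `kubectl` is already configured against the new cluster:

```bash
# Control-plane components (scheduler, controller-manager, etcd) should be Healthy
kubectl get componentstatuses

# Every node registered in step 6 should be Ready
kubectl get nodes -o wide

# Any system pods deployed so far (e.g. flannel) should be Running
kubectl get pods -n kube-system -o wide
```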
Essential add-ons:

1. Deploy CoreDNS
2. Deploy the Dashboard
3. Deploy Heapster
4. Deploy an Ingress controller
5. CI/CD
6. Deploy Helm
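Once these add-ons are installed, a quick status pass confirms they are actually serving. A minimal sketch, assuming CoreDNS, the Dashboard and Heapster were deployed into `kube-system` as in the component docs in this repo, and Helm v3:

```bash
# CoreDNS, the Dashboard and Heapster are all deployed into kube-system here
kubectl get pods -n kube-system -o wide | grep -E 'coredns|dashboard|heapster'

# Exercise cluster DNS through CoreDNS from a throwaway pod
kubectl run dns-test --rm -it --image=busybox:1.28 --restart=Never \
  -- nslookup kubernetes.default

# Helm v3 talks directly to the API server, so this also validates kubeconfig
helm version
```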
# 二、k8s资源清理 ``` 1、# svc清理 $ kubectl delete svc $(kubectl get svc -n mos-namespace|grep -v NAME|awk '{print $1}') -n mos-namespace service "mysql-production" deleted service "nginx-test" deleted service "redis-cluster" deleted service "redis-production" deleted 2、# deployment清理 $ kubectl delete deployment $(kubectl get deployment -n mos-namespace|grep -v NAME|awk '{print $1}') -n mos-namespace deployment.extensions "centos7-app" deleted 3、# configmap清理 $ kubectl delete cm $(kubectl get cm -n mos-namespace|grep -v NAME|awk '{print $1}') -n mos-namespace ``` https://www.xiaodianer.net/index.php/kubernetes/istio/41-istio-https-demo https://mp.weixin.qq.com/s/jnVn6_cyRUILBQ0cBhBNyQ Kubernetes v1.18.2 二进制高可用部署 ================================================ FILE: apps/README.md ================================================ ================================================ FILE: apps/nginx/README.md ================================================ ================================================ FILE: apps/ops/README.md ================================================ ================================================ FILE: apps/wordpress/README.md ================================================ ================================================ FILE: apps/wordpress/基于PV_PVC部署Wordpress 示例.md ================================================ # 一、PV(PersistentVolume) PersistentVolume (PV) 是外部存储系统中的一块存储空间,由管理员创建和维护。与 Volume 一样,PV 具有持久性,生命周期独立于 Pod。 1、PV和PVC是一一对应关系,当有PV被某个PVC所占用时,会显示banding,其它PVC不能再使用绑定过的PV。 2、PVC一旦绑定PV,就相当于是一个存储卷,此时PVC可以被多个Pod所使用。(PVC支不支持被多个Pod访问,取决于访问模型accessMode的定义)。 3、PVC若没有找到合适的PV时,则会处于pending状态。 4、PV的reclaim policy选项: 默认是Retain保留,保留生成的数据。 可以改为recycle回收,删除生成的数据,回收pv delete,删除,pvc解除绑定后,pv也就自动删除。 # 二、PVC PersistentVolumeClaim (PVC) 是对 PV 的申请 (Claim)。PVC 通常由普通用户创建和维护。需要为 Pod 分配存储资源时,用户可以创建一个 PVC,指明存储资源的容量大小和访问模式(比如只读)等信息,Kubernetes 会查找并提供满足条件的 PV。 有了 PersistentVolumeClaim,用户只需要告诉 Kubernetes 需要什么样的存储资源,而不必关心真正的空间从哪里分配,如何访问等底层细节信息。这些 Storage Provider 的底层信息交给管理员来处理,只有管理员才应该关心创建 PersistentVolume 的细节信息。 ## PVC资源需要指定: 1、accessMode:访问模型;对象列表: ReadWriteOnce – the volume can be mounted as read-write by a single node: RWO - ReadWriteOnce 一人读写 ReadOnlyMany – the volume can be mounted read-only by many nodes: ROX - ReadOnlyMany 多人只读 ReadWriteMany – the volume can be mounted as read-write by many nodes: RWX - ReadWriteMany 多人读写 2、resource:资源限制(比如:定义5GB空间,我们期望对应的存储空间至少5GB。) 3、selector:标签选择器。不加标签,就会在所有PV找最佳匹配。 4、storageClassName:存储类名称: 5、volumeMode:指后端存储卷的模式。可以用于做类型限制,哪种类型的PV可以被当前claim所使用。 6、volumeName:卷名称,指定后端PVC(相当于绑定) # 三、两者差异 1、PV是属于集群级别的,不能定义在名称空间中 2、PVC时属于名称空间级别的。 参考文档: https://blog.csdn.net/weixin_42973226/article/details/86501693 基于rook-ceph部署wordpress https://www.cnblogs.com/benjamin77/p/9944268.html k8s的持久化存储PV&&PVC ================================================ FILE: apps/wordpress/部署Wordpress 示例.md ================================================ # 一、简述  Wordpress应用主要涉及到两个镜像:wordpress 和 mysql,wordpress 是应用的核心程序,mysql 是用于数据存储的。现在我们来看看如何来部署我们的这个wordpress应用。这个服务主要有2个pod资源,优先使用Deployment来管理我们的Pod。 # 二、创建一个MySQL的Deployment对象 - 1、创建namespace空间,并使用Service暴露服务给集群内部使用 ```bash # 清理wordpress-db资源 kubectl delete -f wordpress-db.yaml # 编写mysql的deployment文件 cat > wordpress-db.yaml <<\EOF --- apiVersion: v1 kind: Namespace metadata: name: blog --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: mysql-deploy namespace: blog labels: app: mysql spec: template: metadata: labels: app: mysql spec: containers: - name: mysql image: mysql:5.7 imagePullPolicy: IfNotPresent ports: - containerPort: 
3306
          name: dbport
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: rootPassW0rd
        - name: MYSQL_DATABASE
          value: wordpress
        - name: MYSQL_USER
          value: wordpress
        - name: MYSQL_PASSWORD
          value: wordpress
        volumeMounts:
        - name: db
          mountPath: /var/lib/mysql
      volumes:
      - name: db
        hostPath:
          path: /var/lib/mysql
---
apiVersion: v1
kind: Service
metadata:
  name: wordpress-mysql
  namespace: blog
spec:
  selector:
    app: mysql
  ports:
  - name: mysqlport
    protocol: TCP
    port: 3306
    targetPort: dbport
EOF

# 创建资源和服务
kubectl create -f wordpress-db.yaml
```

- 2、查看创建的svc服务

```bash
$ kubectl describe svc wordpress-mysql -n blog
Name:              wordpress-mysql
Namespace:         blog
Labels:            <none>
Annotations:       <none>
Selector:          app=mysql
Type:              ClusterIP
IP:                10.104.88.234
Port:              mysqlport  3306/TCP
TargetPort:        dbport/TCP
Endpoints:         10.244.1.115:3306
Session Affinity:  None
Events:            <none>
```

- 3、验证创建的mysql资源服务可用性

```bash
# 命令行跑一个bash基础容器
$ kubectl run mysql-test --rm -it --image=alpine /bin/sh
kubectl run centos7-app --rm -it --image=centos:7.2.1511 -n blog

# 进入到容器
kubectl exec `kubectl get pods -n blog|grep centos7-app|awk '{print $1}'` -it /bin/bash -n blog

# 安装mysql客户端
yum install vim net-tools telnet nc -y
yum install -y mariadb.x86_64 mariadb-libs.x86_64

# 测试mysql服务端口是否OK
nc -zv wordpress-mysql 3306

# 连接测试
mysql -h'wordpress-mysql' -u'root' -p'rootPassW0rd'   # 这里使用域名测试
mysql -h'10.104.88.234' -u'root' -p'rootPassW0rd'     # 这里使用集群IP测试,这个经常会变
mysql -h'10.244.1.115' -u'root' -p'rootPassW0rd'      # 这里使用Endpoints IP测试,这个经常会变
```

# 三、创建Wordpress服务Deployment对象

```bash
# 清理wordpress资源
kubectl delete -f wordpress.yaml

# 编写wordpress的deployment文件
cat > wordpress.yaml <<\EOF
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: wordpress-deploy
  namespace: blog
  labels:
    app: wordpress
spec:
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
      - name: wordpress
        image: wordpress
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
          name: wdport
        env:
        - name: WORDPRESS_DB_HOST
          value: wordpress-mysql:3306
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          value: wordpress
---
apiVersion: v1
kind: Service
metadata:
  name: wordpress-service
  namespace: blog
spec:
  type: NodePort
  selector:
    app: wordpress
  ports:
  - name: wordpressport
    protocol: TCP
    port: 80
    targetPort: wdport
    nodePort: 32380  #新增这一行,指定固定node端口
EOF

# 创建资源和服务
kubectl create -f wordpress.yaml

# 查看创建的pod资源
kubectl get pods -n blog

# 查看创建的svc资源
kubectl get svc -n blog
NAME                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
wordpress-mysql     ClusterIP   10.104.88.234    <none>        3306/TCP       3m36s
wordpress-service   NodePort    10.111.212.108   <none>        80:32380/TCP   12s
```

# 四、访问测试

```bash
# 可以看到wordpress服务产生了一个32380的端口,现在我们就可以通过任意节点的NodeIP加上32380端口来访问我们的wordpress应用了。在浏览器中打开,如果看到wordpress跳转到了安装页面,证明我们的安装是没有任何问题的;如果没有出现预期的效果,那么就需要去查看下Pod的日志来排查问题了:
http://192.168.56.11:32380/
```

![wordpress](https://github.com/Lancger/opsfull/blob/master/images/wordpress-01.png)

# 五、提高稳定性(进阶)

`1、当你使用kubernetes的时候,有没有遇到过Pod在启动后一会就挂掉然后又重新启动这样的恶性循环?你有没有想过kubernetes是如何检测pod是否还存活?虽然容器已经启动,但是kubernetes如何知道容器的进程是否准备好对外提供服务了呢?让我们通过kubernetes官网的这篇文章[Configure Liveness and Readiness Probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/),来一探究竟。`

`2、Kubelet使用liveness probe(存活探针)来确定何时重启容器。例如,当应用程序处于运行状态但无法做进一步操作,liveness探针将捕获到deadlock,重启处于该状态下的容器,使应用程序在存在bug的情况下依然能够继续运行下去(谁的程序还没几个bug呢)。`

`3、Kubelet使用readiness probe(就绪探针)来确定容器是否已经就绪可以接受流量。只有当Pod中的容器都处于就绪状态时kubelet才会认定该Pod处于就绪状态。该信号的作用是控制哪些Pod应该作为service的后端。如果Pod处于非就绪状态,那么它们将会被从service的load balancer中移除。`
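Both probe types can be confirmed from the cluster side once they are configured. A small check, assuming the `blog` namespace and the `app=wordpress` label from this walkthrough (the Liveness/Readiness fields stay empty until the probes in the first item below are added):

```bash
# Restart counts hint at liveness-probe kills or crashing containers
kubectl get pods -n blog -l app=wordpress

# describe shows the configured Liveness/Readiness probes and probe-related events
kubectl describe pod -n blog -l app=wordpress | grep -iE 'liveness|readiness|restart'
```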
`现在wordpress应用已经部署成功了,那么就万事大吉了吗?如果我们的网站访问量突然变大了怎么办,如果我们要更新我们的镜像该怎么办?如果我们的mysql服务挂掉了怎么办?所以要保证我们的网站能够非常稳定地提供服务,我们做得还不够。我们可以通过做些什么事情来提高网站的稳定性呢?`

## 第一. 增加健康检测

我们前面说过liveness probe和readiness probe是提高应用稳定性非常重要的方法:

```bash
livenessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 3
  periodSeconds: 3
readinessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
# 增加上面两个探针,每10s检测一次应用是否可读,每3s检测一次应用是否存活
```

## 第二. 增加 HPA

让我们的应用能够自动应对流量高峰期:

```bash
1、创建HPA资源(一定要设置Pod的资源限制参数: request, 否则HPA不会工作)
$ kubectl autoscale deployment wordpress-deploy --cpu-percent=10 --min=1 --max=10 -n blog
deployment "wordpress-deploy" autoscaled

# 我们用kubectl autoscale命令为我们的wordpress-deploy创建一个HPA对象,最小的 pod 副本数为1,最大为10,HPA会根据设定的 cpu使用率(10%)动态地增加或者减少pod数量。当然最好我们也为Pod声明一些资源限制:
resources:
  limits:
    cpu: 200m
    memory: 200Mi
  requests:
    cpu: 100m
    memory: 100Mi

# 查看HPA
$ kubectl get HorizontalPodAutoscaler -A
NAMESPACE   NAME               REFERENCE                     TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
blog        wordpress-deploy   Deployment/wordpress-deploy   <unknown>/10%   1         10        1          4m19s

2、更新Deployment后,我们可以来测试下上面的HPA是否会生效:
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
while true; do wget -q -O- http://wordpress:80; done

3、观察Deployment的副本数是否有变化
$ kubectl get deployment wordpress-deploy
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
wordpress-deploy   3         3         3            3           4d

4、删除HPA
$ kubectl delete HorizontalPodAutoscaler wordpress-deploy -n blog
horizontalpodautoscaler.autoscaling "wordpress-deploy" deleted
```

## 第三. 增加滚动更新策略

这样可以保证我们在更新应用的时候服务不会被中断:

```bash
replicas: 2
revisionHistoryLimit: 10
minReadySeconds: 5
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1
```

## 第四. 使用Service的名称来代替host

`如果mysql服务被重新创建了的话,它的clusterIP非常有可能就变化了,所以上面我们环境变量中的WORDPRESS_DB_HOST的值就会有问题,就会导致访问不了数据库服务了。这个地方我们可以直接使用Service的名称来代替host,这样即使clusterIP变化了,也不会有任何影响,这个我们会在后面的服务发现的章节和大家深入讲解`

```bash
env:
- name: WORDPRESS_DB_HOST
  value: wordpress-mysql:3306
```

## 第五.
容器启动顺序 `在部署wordpress服务的时候,mysql服务以前启动起来了吗?如果没有启动起来是不是我们也没办法连接数据库了啊?该怎么办,是不是在启动wordpress应用之前应该去检查一下mysql服务,如果服务正常的话我们就开始部署应用了,这是不是就是InitContainer的用法` ```bash initContainers: - name: init-db image: busybox command: ['sh', '-c', 'until nslookup mysql; do echo waiting for mysql service; sleep 2; done;'] # 直到mysql服务创建完成后,initContainer才结束,结束完成后我们才开始下面的部署。 ``` # 六、优化文件合并 ```bash kubectl delete -f wordpress-all.yaml cat > wordpress-all.yaml <<\EOF --- apiVersion: v1 kind: Namespace metadata: name: blog --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: mysql-deploy namespace: blog labels: app: mysql spec: template: metadata: labels: app: mysql spec: containers: - name: mysql image: mysql:5.7 ports: - containerPort: 3306 name: dbport env: - name: MYSQL_ROOT_PASSWORD value: rootPassW0rd - name: MYSQL_DATABASE value: wordpress - name: MYSQL_USER value: wordpress - name: MYSQL_PASSWORD value: wordpress volumeMounts: - name: db mountPath: /var/lib/mysql volumes: - name: db hostPath: path: /var/lib/mysql --- apiVersion: v1 kind: Service metadata: name: wordpress-mysql namespace: blog spec: selector: app: mysql ports: - name: mysqlport protocol: TCP port: 3306 targetPort: dbport --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: wordpress-deploy namespace: blog labels: app: wordpress spec: revisionHistoryLimit: 10 minReadySeconds: 5 strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 maxUnavailable: 1 template: metadata: labels: app: wordpress spec: initContainers: - name: init-db image: busybox command: ['sh', '-c', 'until nslookup wordpress-mysql; do echo waiting for mysql service; sleep 2; done;'] containers: - name: wordpress image: wordpress imagePullPolicy: IfNotPresent ports: - containerPort: 80 name: wdport env: - name: WORDPRESS_DB_HOST value: wordpress-mysql:3306 - name: WORDPRESS_DB_USER value: wordpress - name: WORDPRESS_DB_PASSWORD value: wordpress resources: limits: cpu: 200m memory: 200Mi requests: cpu: 100m memory: 100Mi --- apiVersion: v1 kind: Service metadata: name: wordpress namespace: blog spec: selector: app: wordpress type: NodePort ports: - name: wordpressport protocol: TCP port: 80 nodePort: 32380 targetPort: wdport EOF kubectl apply -f wordpress-all.yaml watch kubectl get pods -n blog # 检测mysql服务 $ kubectl run mysql-test --rm -it --image=alpine /bin/sh -n blog $ nslookup wordpress-mysql Name: wordpress-mysql Address 1: 10.99.230.27 wordpress-mysql.blog.svc.cluster.local $ ping wordpress-mysql PING wordpress-mysql (10.99.230.27): 56 data bytes 64 bytes from 10.99.230.27: seq=0 ttl=64 time=0.124 ms 64 bytes from 10.99.230.27: seq=0 ttl=64 time=0.124 ms ``` 参考文档: https://www.qikqiak.com/k8s-book/docs/31.%E9%83%A8%E7%BD%B2%20Wordpress%20%E7%A4%BA%E4%BE%8B.html https://blog.csdn.net/maoreyou/article/details/80050623 Kubernetes之路 3 - 解决服务依赖 ================================================ FILE: components/README.md ================================================ # ingress # helm https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#check-required-ports 需要开放的端口 ================================================ FILE: components/cronjob/README.md ================================================ 参考资料: https://www.jianshu.com/p/62b4f0a3134b Kubernetes对象之CronJob ================================================ FILE: components/dashboard/Kubernetes-Dashboard v2.0.0.md ================================================ ```bash #安装 kubectl apply -f 
https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml #卸载 kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml #账号授权 kubectl delete -f admin.yaml cat > admin.yaml << \EOF kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: admin annotations: rbac.authorization.kubernetes.io/autoupdate: "true" roleRef: kind: ClusterRole name: cluster-admin apiGroup: rbac.authorization.k8s.io subjects: - kind: ServiceAccount name: admin namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: admin namespace: kube-system labels: kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile EOF kubectl apply -f admin.yaml kubectl describe secret/$(kubectl get secret -n kube-system |grep admin|awk '{print $1}') -n kube-system ``` 参考文档: http://www.mydlq.club/article/28/ ================================================ FILE: components/dashboard/README.md ================================================ # 一、安装dashboard v1.10.1 ## 1、使用NodePort方式暴露访问 1、下载对应的yaml文件 ``` wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml vim kubernetes-dashboard.yaml 1、# 修改镜像名称 ...... spec: containers: - name: kubernetes-dashboard #image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 #这个换成阿里云的镜像 image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1 ports: - containerPort: 8443 protocol: TCP args: - --auto-generate-certificates ...... ``` 2、# 修改Service为NodePort类型 ``` ...... kind: Service apiVersion: v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system spec: type: NodePort # 新增这一行,指定为NodePort方式 ports: - port: 443 targetPort: 8443 nodePort: 32370 #新增这一行,指定固定node端口 selector: k8s-app: kubernetes-dashboard ``` 3、dashboard最终文件 ``` cat > kubernetes-dashboard.yaml << \EOF # Copyright 2017 The Kubernetes Authors. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # ------------------- Dashboard Secret ------------------- # apiVersion: v1 kind: Secret metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard-certs namespace: kube-system type: Opaque --- # ------------------- Dashboard Service Account ------------------- # apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system --- # ------------------- Dashboard Role & Role Binding ------------------- # kind: Role apiVersion: rbac.authorization.k8s.io/v1 metadata: name: kubernetes-dashboard-minimal namespace: kube-system rules: # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret. - apiGroups: [""] resources: ["secrets"] verbs: ["create"] # Allow Dashboard to create 'kubernetes-dashboard-settings' config map. - apiGroups: [""] resources: ["configmaps"] verbs: ["create"] # Allow Dashboard to get, update and delete Dashboard exclusive secrets. 
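# Note: the resourceNames scoping on the rule below is what keeps this Role
# minimal: it grants get/update/delete only on the two named secrets, not on
# every Secret in the kube-system namespace (least privilege).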
- apiGroups: [""] resources: ["secrets"] resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"] verbs: ["get", "update", "delete"] # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map. - apiGroups: [""] resources: ["configmaps"] resourceNames: ["kubernetes-dashboard-settings"] verbs: ["get", "update"] # Allow Dashboard to get metrics from heapster. - apiGroups: [""] resources: ["services"] resourceNames: ["heapster"] verbs: ["proxy"] - apiGroups: [""] resources: ["services/proxy"] resourceNames: ["heapster", "http:heapster:", "https:heapster:"] verbs: ["get"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: kubernetes-dashboard-minimal namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: kubernetes-dashboard-minimal subjects: - kind: ServiceAccount name: kubernetes-dashboard namespace: kube-system --- # ------------------- Dashboard Deployment ------------------- # kind: Deployment apiVersion: apps/v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system spec: replicas: 1 revisionHistoryLimit: 10 selector: matchLabels: k8s-app: kubernetes-dashboard template: metadata: labels: k8s-app: kubernetes-dashboard spec: containers: - name: kubernetes-dashboard #image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1 ports: - containerPort: 8443 protocol: TCP args: - --auto-generate-certificates # Uncomment the following line to manually specify Kubernetes API server Host # If not specified, Dashboard will attempt to auto discover the API server and connect # to it. Uncomment only if the default does not work. # - --apiserver-host=http://my-address:port volumeMounts: - name: kubernetes-dashboard-certs mountPath: /certs # Create on-disk volume to store exec logs - mountPath: /tmp name: tmp-volume livenessProbe: httpGet: scheme: HTTPS path: / port: 8443 initialDelaySeconds: 30 timeoutSeconds: 30 volumes: - name: kubernetes-dashboard-certs secret: secretName: kubernetes-dashboard-certs - name: tmp-volume emptyDir: {} serviceAccountName: kubernetes-dashboard # Comment the following tolerations if Dashboard must not be deployed on master tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule --- # ------------------- Dashboard Service ------------------- # kind: Service apiVersion: v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system spec: type: NodePort # 新增这一行,指定为NodePort方式 ports: - port: 443 targetPort: 8443 nodePort: 32370 #新增这一行,指定固定node端口 selector: k8s-app: kubernetes-dashboard EOF kubectl apply -f kubernetes-dashboard.yaml ``` 4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml) ``` cat > admin.yaml << \EOF kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: admin annotations: rbac.authorization.kubernetes.io/autoupdate: "true" roleRef: kind: ClusterRole name: cluster-admin apiGroup: rbac.authorization.k8s.io subjects: - kind: ServiceAccount name: admin namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: admin namespace: kube-system labels: kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile EOF kubectl apply -f admin.yaml kubectl delete -f admin.yaml #获取token kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}') ``` 5、访问测试 `https://nodeip:32370` ## 
2、使用Ingress方式访问 ```bash #清理NodePort方式的dashboard kubectl delete -f kubernetes-dashboard.yaml rm -f kubernetes-dashboard.yaml wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml kubectl apply -n kube-system -f kubernetes-dashboard.yaml ``` 1、创建和安装加密访问凭证 通过https进行访问必需要使用证书和密钥,在Kubernetes中可以通过配置一个加密凭证(TLS secret)来提供。 ```bash #1、创建 tls secret #这里只是拿来自己使用,创建一个自己签名的证书。如果是公共服务,建议去数字证书颁发机构去申请一个正式的数字证书(需要一些服务费用);或者使用Let's encrypt去申请一个免费的(后面有介绍);如果使用Cloudflare可以自动生成证书和https转接服务,但是需要将域名迁移过去,高级功能是收费的。 #https://github.com/kubernetes/contrib/blob/master/ingress/controllers/nginx/examples/tls/README.md kubectl delete secret k8s-dashboard-secret -n kube-system rm -rf /etc/certs/ssl/ mkdir -p /etc/certs/ssl/default cd /etc/certs/ssl/default/ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_default.key -out tls_default.crt -subj "/CN=dashboard.devops.com" #将会产生两个文件tls_default.key和tls_default.crt,你可以改成自己的文件名或放在特定的目录下(如果你是为公共服务器创建的,请保证这个不会被别人访问到)。后面的dashboard.devops.com是我的服务器IP地址,你可以改成自己的。 ``` 2、安装 tls secret ```bash #下一步,将这两个文件的信息创建为一个Kubernetes的secret访问凭证,我将名称指定为 k8s-dashboard-secret ,这在后面的Ingress配置时将会用到。如果你修改了这个名字,注意后面的Ingress配置yaml文件也需要同步修改。 cd /etc/certs/ssl/default/ kubectl -n kube-system delete secret k8s-dashboard-secret kubectl -n kube-system create secret tls k8s-dashboard-secret --key=tls_default.key --cert=tls_default.crt #查看证书 kubectl get secret k8s-dashboard-secret -n kube-system kubectl describe secret k8s-dashboard-secret -n kube-system #注意: #上面命令的参数 -n 指定凭证安装的命名空间。 #为了安全考虑,Ingress所有的资源(凭证、路由、服务)必须在同一个命名空间。 ``` 3、配置Ingress 路由 ```bash #将下面的内容保存为文件dashboard-ingress.yaml。里面的 / 设定为访问Kubernetes dashboard服务,/web 只是为了测试和占位,如果没有安装nginx,将会返回找不到服务的消息。 cat >dashboard-ingress.yaml<<\EOF apiVersion: extensions/v1beta1 kind: Ingress metadata: name: k8s-dashboard namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: tls: - secretName: traefik-cert #注意这里需要跟traefik.toml文件设置的证书挂钩 #- secretName: k8s-dashboard-secret rules: - host: dashboard.devops.com http: paths: - path: / backend: serviceName: kubernetes-dashboard servicePort: 443 EOF kubectl apply -n kube-system -f dashboard-ingress.yaml #注意 #上面的annotations部分是必须的,以提供https和https service的支持。不过,不同的Ingress Controller可能的实现(或版本)有所不同,需要安装相应的实现(版本)进行设置。 #参见,#issue:https://github.com/kubernetes/ingress-nginx/issues/2460 ``` 参考资料: https://my.oschina.net/u/2306127/blog/1930169?from=timeline&isappinstalled=0 Kubernetes dashboard 通过 Ingress 提供HTTPS访问 ================================================ FILE: components/external-storage/0、nfs服务端搭建.md ================================================ ## 一、nfs服务端 ```bash #所有节点安装nfs yum install -y nfs-utils rpcbind #创建nfs目录 mkdir -p /nfs/data/ #修改权限 chmod -R 666 /nfs/data #编辑export文件 vim /etc/exports /nfs/data 192.168.56.0/24(rw,async,no_root_squash) #如果设置为 /nfs/data *(rw,async,no_root_squash) 则对所以的IP都有效 常用选项: ro:客户端挂载后,其权限为只读,默认选项; rw:读写权限; sync:同时将数据写入到内存与硬盘中; async:异步,优先将数据保存到内存,然后再写入硬盘; Secure:要求请求源的端口小于1024 用户映射: root_squash:当NFS客户端使用root用户访问时,映射到NFS服务器的匿名用户; no_root_squash:当NFS客户端使用root用户访问时,映射到NFS服务器的root用户; all_squash:全部用户都映射为服务器端的匿名用户; anonuid=UID:将客户端登录用户映射为此处指定的用户uid; anongid=GID:将客户端登录用户映射为此处指定的用户gid #配置生效 exportfs -r #查看生效 exportfs #启动rpcbind、nfs服务 systemctl restart rpcbind && systemctl enable rpcbind systemctl restart nfs && systemctl enable nfs #查看 RPC 服务的注册状况 (注意/etc/hosts.deny 里面需要放开以下服务) $ rpcinfo -p localhost program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 
111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100005 1 udp 20048 mountd 100005 1 tcp 20048 mountd 100005 2 udp 20048 mountd 100005 2 tcp 20048 mountd 100005 3 udp 20048 mountd 100005 3 tcp 20048 mountd 100024 1 udp 34666 status 100024 1 tcp 7951 status 100003 3 tcp 2049 nfs 100003 4 tcp 2049 nfs 100227 3 tcp 2049 nfs_acl 100003 3 udp 2049 nfs 100003 4 udp 2049 nfs 100227 3 udp 2049 nfs_acl 100021 1 udp 31088 nlockmgr 100021 3 udp 31088 nlockmgr 100021 4 udp 31088 nlockmgr 100021 1 tcp 27131 nlockmgr 100021 3 tcp 27131 nlockmgr 100021 4 tcp 27131 nlockmgr #修改/etc/hosts.allow放开rpcbind(nfs服务端和客户端都要加上) chattr -i /etc/hosts.allow echo "nfsd:all" >>/etc/hosts.allow echo "rpcbind:all" >>/etc/hosts.allow echo "mountd:all" >>/etc/hosts.allow chattr +i /etc/hosts.allow #showmount测试 showmount -e 192.168.56.11 #tcpdmatch测试 $ tcpdmatch rpcbind 192.168.56.11 client: address 192.168.56.11 server: process rpcbind access: granted ``` ## 二、nfs客户端 ```bash yum install -y nfs-utils rpcbind #客户端创建目录,然后执行挂载 mkdir -p /mnt/nfs #(注意挂载成功后,/mnt下原有数据将会被隐藏,无法找到) mount -t nfs -o nolock,vers=4 192.168.56.11:/nfs/data /mnt/nfs ``` ## 三、挂载nfs ```bash #或者直接写到/etc/fstab文件中 vim /etc/fstab 192.168.56.11:/nfs/data /mnt/nfs/ nfs auto,noatime,nolock,bg,nfsvers=4,intr,tcp,actimeo=1800 0 0 #挂载 mount -a #卸载挂载 umount /mnt/nfs #查看nfs服务端信息 nfsstat -s #查看nfs客户端信息 nfsstat -c ``` 参考文档: http://www.mydlq.club/article/3/ CentOS7 搭建 NFS 服务器 https://blog.rot13.org/2012/05/rpcbind-is-new-portmap-or-how-to-make-nfs-secure.html https://yq.aliyun.com/articles/694065 https://www.crifan.com/linux_fstab_and_mount_nfs_syntax_and_parameter_meaning/ Linux中fstab的语法和参数含义和mount NFS时相关参数含义 ================================================ FILE: components/external-storage/1、k8s的pv和pvc简述.md ================================================ # 一、PV(PersistentVolume) PersistentVolume (PV) 是外部存储系统中的一块存储空间,由管理员创建和维护。与 Volume 一样,PV 具有持久性,生命周期独立于 Pod。 1、PV和PVC是一一对应关系,当有PV被某个PVC所占用时,会显示banding,其它PVC不能再使用绑定过的PV。 2、PVC一旦绑定PV,就相当于是一个存储卷,此时PVC可以被多个Pod所使用。(PVC支不支持被多个Pod访问,取决于访问模型accessMode的定义)。 3、PVC若没有找到合适的PV时,则会处于pending状态。 4、PV的reclaim policy选项: 默认是Retain保留,保留生成的数据。 可以改为recycle回收,删除生成的数据,回收pv delete,删除,pvc解除绑定后,pv也就自动删除。 # 二、PVC PersistentVolumeClaim (PVC) 是对 PV 的申请 (Claim)。PVC 通常由普通用户创建和维护。需要为 Pod 分配存储资源时,用户可以创建一个 PVC,指明存储资源的容量大小和访问模式(比如只读)等信息,Kubernetes 会查找并提供满足条件的 PV。 有了 PersistentVolumeClaim,用户只需要告诉 Kubernetes 需要什么样的存储资源,而不必关心真正的空间从哪里分配,如何访问等底层细节信息。这些 Storage Provider 的底层信息交给管理员来处理,只有管理员才应该关心创建 PersistentVolume 的细节信息。 ## PVC资源需要指定: 1、accessMode:访问模型;对象列表: ReadWriteOnce – the volume can be mounted as read-write by a single node: RWO - ReadWriteOnce 一人读写 ReadOnlyMany – the volume can be mounted read-only by many nodes: ROX - ReadOnlyMany 多人只读 ReadWriteMany – the volume can be mounted as read-write by many nodes: RWX - ReadWriteMany 多人读写 2、resource:资源限制(比如:定义5GB空间,我们期望对应的存储空间至少5GB。) 3、selector:标签选择器。不加标签,就会在所有PV找最佳匹配。 4、storageClassName:存储类名称: 5、volumeMode:指后端存储卷的模式。可以用于做类型限制,哪种类型的PV可以被当前claim所使用。 6、volumeName:卷名称,指定后端PVC(相当于绑定) # 三、两者差异 1、PV是属于集群级别的,不能定义在名称空间中 2、PVC时属于名称空间级别的。 参考文档: https://blog.csdn.net/weixin_42973226/article/details/86501693 基于rook-ceph部署wordpress https://www.cnblogs.com/benjamin77/p/9944268.html k8s的持久化存储PV&&PVC ================================================ FILE: components/external-storage/2、静态配置PV和PVC.md ================================================ Table of Contents ================= * [一、环境介绍](#一环境介绍) * [二、PV操作](#二pv操作) * [01、创建PV卷](#01创建pv卷) * [02、PV配置参数介绍](#02pv配置参数介绍) * 
[03、创建PV资源](#03创建pv资源) * [04、查看PV](#04查看pv) * [三、PVC操作](#三pvc操作) * [01、创建PVC资源](#01创建pvc资源) * [02、查看PVC/PV](#02查看pvcpv) * [四、Pod中使用存储](#四pod中使用存储) * [五、验证](#五验证) * [01、验证PV是否可用](#01验证pv是否可用) * [02、进入pod查看挂载情况](#02进入pod查看挂载情况) * [03、删除pod](#03删除pod) * [04、继续删除pvc](#04继续删除pvc) * [05、继续删除pv](#05继续删除pv) # 一、环境介绍 作为准备工作,我们已经在 k8s同一局域内网节点上搭建了一个 NFS 服务器,目录为 /data/nfs, pv是全局的,pvc可以指定namespace。 # 二、PV操作 ## 01、创建PV卷 ```bash # 创建pv卷对应的目录 mkdir -p /data/nfs/pv001 mkdir -p /data/nfs/pv002 # 配置exportrs $ vim /etc/exports /data/nfs *(rw,no_root_squash,sync,insecure) /data/nfs/pv001 *(rw,no_root_squash,sync,insecure) /data/nfs/pv002 *(rw,no_root_squash,sync,insecure) # 配置生效 exportfs -r # 重启rpcbind、nfs服务 systemctl restart rpcbind && systemctl restart nfs # 查看挂载点 $ showmount -e localhost Export list for localhost: /data/nfs/pv002 * /data/nfs/pv001 * /data/nfs * ``` ## 02、PV配置参数介绍 ```bash 配置说明: ① capacity 指定 PV 的容量为 20G。 ② accessModes 指定访问模式为 ReadWriteOnce,支持的访问模式有: ReadWriteOnce – PV 能以 read-write 模式 mount 到单个节点。 ReadOnlyMany – PV 能以 read-only 模式 mount 到多个节点。 ReadWriteMany – PV 能以 read-write 模式 mount 到多个节点。 ③ persistentVolumeReclaimPolicy 指定当 PV 的回收策略为 Recycle,支持的策略有: Retain – 就是保留现场,K8S什么也不做,需要管理员手动去处理PV里的数据,处理完后,再手动删除PV Recycle – K8S会将PV里的数据删除,然后把PV的状态变成Available,又可以被新的PVC绑定使用 Delete – K8S会自动删除该PV及里面的数据 ④ storageClassName 指定 PV 的 class 为 nfs。相当于为 PV 设置了一个分类,PVC 可以指定 class 申请相应 class 的 PV。 ⑤ 指定 PV 在 NFS 服务器上对应的目录。 一般来说,PV和PVC的生命周期分为5个阶段: Provisioning,即PV的创建,可以直接创建PV(静态方式),也可以使用StorageClass动态创建 Binding,将PV分配给PVC Using,Pod通过PVC使用该Volume Releasing,Pod释放Volume并删除PVC Reclaiming,回收PV,可以保留PV以便下次使用,也可以直接从云存储中删除 根据这5个阶段,Volume的状态有以下4种: Available:可用 Bound:已经分配给PVC Released:PVC解绑但还未执行回收策略 Failed:发生错误 变成Released的PV会根据定义的回收策略做相应的回收工作。有三种回收策略: Retain 就是保留现场,K8S什么也不做,等待用户手动去处理PV里的数据,处理完后,再手动删除PV Delete K8S会自动删除该PV及里面的数据 Recycle K8S会将PV里的数据删除,然后把PV的状态变成Available,又可以被新的PVC绑定使用 ``` ## 03、创建PV资源 1、nfs-pv001.yaml ```bash # 清理pv资源 kubectl delete -f nfs-pv001.yaml # 编写pv资源文件 cat > nfs-pv001.yaml <<\EOF apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv001 labels: pv: nfs-pv001 spec: capacity: storage: 20Gi accessModes: - ReadWriteOnce persistentVolumeReclaimPolicy: Recycle storageClassName: nfs nfs: path: /data/nfs/pv001 server: 192.168.56.11 EOF # 部署pv到集群中 kubectl apply -f nfs-pv001.yaml ``` 2、nfs-pv002.yaml ```bash # 清理pv资源 kubectl delete -f nfs-pv002.yaml # 编写pv资源文件 cat > nfs-pv002.yaml <<\EOF apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv002 labels: pv: nfs-pv002 spec: capacity: storage: 30Gi accessModes: - ReadWriteOnce persistentVolumeReclaimPolicy: Recycle storageClassName: nfs nfs: path: /data/nfs/pv002 server: 192.168.56.11 EOF # 部署pv到集群中 kubectl apply -f nfs-pv002.yaml ``` ## 04、查看PV ```bash # 查看pv $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE nfs-pv001 20Gi RWO Recycle Available nfs 68s nfs-pv002 30Gi RWO Recycle Available nfs 33s #STATUS 为 Available,表示 pv 就绪,可以被 PVC 申请。 ``` # 三、PVC操作 ## 01、创建PVC资源 接下来创建2个名为pvc001和pvc002的PVC,配置文件 nfs-pvc001.yaml 如下: 1、nfs-pvc001.yaml ```bash # 清理pvc资源 kubectl delete -f nfs-pvc001.yaml # 编写pvc资源文件 cat > nfs-pvc001.yaml <<\EOF apiVersion: v1 kind: PersistentVolumeClaim metadata: name: nfs-pvc001 spec: accessModes: - ReadWriteOnce resources: requests: storage: 20Gi storageClassName: nfs selector: matchLabels: pv: nfs-pv001 EOF # 部署pvc到集群中 kubectl apply -f nfs-pvc001.yaml ``` 2、nfs-pvc002.yaml ```bash # 清理pvc资源 kubectl delete -f nfs-pvc002.yaml # 编写pvc资源文件 cat > nfs-pvc002.yaml <<\EOF apiVersion: v1 kind: 
PersistentVolumeClaim metadata: name: nfs-pvc002 spec: accessModes: - ReadWriteOnce resources: requests: storage: 30Gi storageClassName: nfs selector: matchLabels: pv: nfs-pv002 EOF # 部署pvc到集群中 kubectl apply -f nfs-pvc002.yaml ``` ## 02、查看PVC/PV ```bash $ kubectl get pvc --show-labels NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE nfs-pvc001 Bound nfs-pv001 20Gi RWO nfs 18s nfs-pvc002 Bound nfs-pv002 30Gi RWO nfs 7s $ kubectl get pv --show-labels NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE nfs-pv001 20Gi RWO Recycle Bound default/nfs-pvc001 nfs 17m nfs-pv002 30Gi RWO Recycle Bound default/nfs-pvc002 nfs 17m # 从 kubectl get pvc 和 kubectl get pv 的输出可以看到 pvc001 和pvc002分别绑定到pv001和pv002,申请成功。注意pvc绑定到对应pv通过labels标签方式实现,也可以不指定,将随机绑定到pv。 ``` # 四、Pod中使用存储 ```与使用普通 Volume 的格式类似,在 volumes 中通过 persistentVolumeClaim 指定使用nfs-pvc001和nfs-pvc002申请的 Volume。``` 1、nfs-pod001.yaml ```bash # 清理pod资源 kubectl delete -f nfs-pod001.yaml # 编写pod资源文件 cat > nfs-pod001.yaml <<\EOF kind: Pod apiVersion: v1 metadata: name: nfs-pod001 spec: containers: - name: myfrontend image: nginx volumeMounts: - mountPath: "/var/www/html" name: nfs-pv001 volumes: - name: nfs-pv001 persistentVolumeClaim: claimName: nfs-pvc001 EOF # 创建pod资源 kubectl apply -f nfs-pod001.yaml ``` 2、nfs-pod002.yaml ```bash # 清理pod资源 kubectl delete -f nfs-pod002.yaml # 编写pod资源文件 cat > nfs-pod002.yaml <<\EOF kind: Pod apiVersion: v1 metadata: name: nfs-pod002 spec: containers: - name: myfrontend image: nginx volumeMounts: - mountPath: "/var/www/html" name: nfs-pv002 volumes: - name: nfs-pv002 persistentVolumeClaim: claimName: nfs-pvc002 EOF # 创建pod资源 kubectl apply -f nfs-pod002.yaml ``` # 五、验证 ## 01、验证PV是否可用 ```bash # 进入到pod创建文件 kubectl exec nfs-pod001 touch /var/www/html/index001.html kubectl exec nfs-pod002 touch /var/www/html/index002.html # 登录到nfs-server上面查看文件是否创建成功 $ ls /data/nfs/pv001/ index001.html $ ls /data/nfs/pv002/ index002.html ``` ## 02、进入pod查看挂载情况 ```bash # 验证pod001的挂载 $ kubectl exec -it nfs-pod001 /bin/bash $ root@nfs-pod001:/# df -h Filesystem Size Used Avail Use% Mounted on overlay 711G 85G 627G 12% / tmpfs 64M 0 64M 0% /dev tmpfs 16G 0 16G 0% /sys/fs/cgroup /dev/sda3 711G 85G 627G 12% /etc/hosts shm 64M 0 64M 0% /dev/shm 192.168.56.11:/data/nfs/pv001 932G 620M 931G 1% /var/www/html # 验证pod002的挂载 $ kubectl exec -it nfs-pod002 /bin/bash $ root@nfs-pod002:/# df -h Filesystem Size Used Avail Use% Mounted on overlay 711G 85G 627G 12% / tmpfs 64M 0 64M 0% /dev tmpfs 16G 0 16G 0% /sys/fs/cgroup /dev/sda3 711G 85G 627G 12% /etc/hosts shm 64M 0 64M 0% /dev/shm 192.168.56.11:/data/nfs/pv002 932G 620M 931G 1% /var/www/html ``` ## 03、删除pod pv和pvc不会被删除,nfs存储的数据不会被删除 ```bash $ kubectl delete -f nfs-pod001.yaml pod "nfs-pod001" deleted $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE nfs-pv001 20Gi RWO Recycle Bound default/nfs-pvc001 nfs 13m nfs-pv002 30Gi RWO Recycle Bound default/nfs-pvc002 nfs 13m $ kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE nfs-pvc001 Bound nfs-pv001 20Gi RWO nfs 13m nfs-pvc002 Bound nfs-pv002 30Gi RWO nfs 13m ``` ## 04、继续删除pvc pv将被释放,处于 Available 可用状态,并且nfs存储中的数据被删除。 ```bash $ kubectl delete -f nfs-pvc001.yaml persistentvolumeclaim "nfs-pvc001" deleted $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE nfs-pv001 20Gi RWO Recycle Available nfs 18m nfs-pv002 30Gi RWO Recycle Bound default/nfs-pvc002 nfs 18m $ ls /nfs/data/pv001/ # 文件不存在 ``` ## 05、继续删除pv ```bash $ 
kubectl delete -f nfs-pv001.yaml persistentvolume "nfs-pv001" deleted ``` 参考文档: https://blog.csdn.net/networken/article/details/86697018 kubernetes部署NFS持久存储 ================================================ FILE: components/external-storage/3、动态申请PV卷.md ================================================ Table of Contents ================= * [Kubernetes 中部署 NFS Provisioner 为 NFS 提供动态分配卷](#kubernetes-中部署-nfs-provisioner-为-nfs-提供动态分配卷) * [一、NFS Provisioner 简介](#一nfs-provisioner-简介) * [二、External NFS驱动的工作原理](#二external-nfs驱动的工作原理) * [1、nfs-client](#1nfs-client) * [2、nfs](#2nfs) * [三、部署服务](#三部署服务) * [1、配置授权](#1配置授权) * [2、部署nfs-client-provisioner](#2部署nfs-client-provisioner) * [3、部署NFS Provisioner](#3部署nfs-provisioner) * [4、创建StorageClass](#4创建storageclass) * [四、创建PVC](#四创建pvc) * [01、创建一个新的namespace,然后创建pvc资源](#01创建一个新的namespace然后创建pvc资源) * [五、创建测试Pod](#五创建测试pod) * [01、进入 NFS Server 服务器验证是否创建对应文件](#01进入-nfs-server-服务器验证是否创建对应文件) # Kubernetes 中部署 NFS Provisioner 为 NFS 提供动态分配卷 ## 一、NFS Provisioner 简介 NFS Provisioner 是一个自动配置卷程序,它使用现有的和已配置的 NFS 服务器来支持通过持久卷声明动态配置 Kubernetes 持久卷。 - 持久卷被配置为:namespace−{pvcName}-${pvName}。 ## 二、External NFS驱动的工作原理 K8S的外部NFS驱动,可以按照其工作方式(是作为NFS server还是NFS client)分为两类: ### 1、nfs-client - 也就是我们接下来演示的这一类,它通过K8S的内置的NFS驱动挂载远端的NFS服务器到本地目录;然后将自身作为storage provider,关联storage class。当用户创建对应的PVC来申请PV时,该provider就将PVC的要求与自身的属性比较,一旦满足就在本地挂载好的NFS目录中创建PV所属的子目录,为Pod提供动态的存储服务。 ### 2、nfs - 与nfs-client不同,该驱动并不使用k8s的NFS驱动来挂载远端的NFS到本地再分配,而是直接将本地文件映射到容器内部,然后在容器内使用ganesha.nfsd来对外提供NFS服务;在每次创建PV的时候,直接在本地的NFS根目录中创建对应文件夹,并export出该子目录。利用NFS动态提供Kubernetes后端存储卷 - 本文将介绍使用nfs-client-provisioner这个应用,利用NFS Server给Kubernetes作为持久存储的后端,并且动态提供PV。前提条件是有已经安装好的NFS服务器,并且NFS服务器与Kubernetes的Slave节点都能网络连通。将nfs-client驱动做一个deployment部署到K8S集群中,然后对外提供存储服务。 `nfs-client-provisioner` 是一个Kubernetes的简易NFS的外部 provisioner,本身不提供NFS,需要现有的NFS服务器提供存储 ## 三、部署服务 ### 1、配置授权 现在的 Kubernetes 集群大部分是基于 RBAC 的权限控制,所以创建一个一定权限的 ServiceAccount 与后面要创建的 “NFS Provisioner” 绑定,赋予一定的权限。 ```bash # 清理rbac授权 kubectl delete -f nfs-rbac.yaml -n kube-system # 编写yaml cat >nfs-rbac.yaml<<-EOF --- kind: ServiceAccount apiVersion: v1 metadata: name: nfs-client-provisioner --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: nfs-client-provisioner-runner rules: - apiGroups: [""] resources: ["persistentvolumes"] verbs: ["get", "list", "watch", "create", "delete"] - apiGroups: [""] resources: ["persistentvolumeclaims"] verbs: ["get", "list", "watch", "update"] - apiGroups: ["storage.k8s.io"] resources: ["storageclasses"] verbs: ["get", "list", "watch"] - apiGroups: [""] resources: ["events"] verbs: ["create", "update", "patch"] --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: run-nfs-client-provisioner subjects: - kind: ServiceAccount name: nfs-client-provisioner namespace: kube-system roleRef: kind: ClusterRole name: nfs-client-provisioner-runner apiGroup: rbac.authorization.k8s.io --- kind: Role apiVersion: rbac.authorization.k8s.io/v1 metadata: name: leader-locking-nfs-client-provisioner rules: - apiGroups: [""] resources: ["endpoints"] verbs: ["get", "list", "watch", "create", "update", "patch"] --- kind: RoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: leader-locking-nfs-client-provisioner subjects: - kind: ServiceAccount name: nfs-client-provisioner # replace with namespace where provisioner is deployed namespace: kube-system roleRef: kind: Role name: leader-locking-nfs-client-provisioner apiGroup: rbac.authorization.k8s.io EOF # 应用授权 kubectl apply -f nfs-rbac.yaml -n 
kube-system ``` ### 2、部署nfs-client-provisioner 首先克隆仓库获取yaml文件 ``` git clone https://github.com/kubernetes-incubator/external-storage.git cp -R external-storage/nfs-client/deploy/ /root/ cd deploy ``` ### 3、部署NFS Provisioner 修改deployment.yaml文件,这里修改的参数包括NFS服务器所在的IP地址(10.198.1.155),以及NFS服务器共享的路径(/data/nfs/),两处都需要修改为你实际的NFS服务器和共享目录。另外修改nfs-client-provisioner镜像从七牛云拉取。 设置 NFS Provisioner 部署文件,这里将其部署到 “kube-system” Namespace 中。 ```bash # 清理NFS Provisioner资源 kubectl delete -f nfs-provisioner-deploy.yaml -n kube-system export NFS_ADDRESS='10.198.1.155' export NFS_DIR='/data/nfs' # 编写deployment.yaml cat >nfs-provisioner-deploy.yaml<<-EOF --- kind: Deployment apiVersion: apps/v1 metadata: name: nfs-client-provisioner spec: replicas: 1 selector: matchLabels: app: nfs-client-provisioner strategy: type: Recreate #---设置升级策略为删除再创建(默认为滚动更新) template: metadata: labels: app: nfs-client-provisioner spec: serviceAccountName: nfs-client-provisioner containers: - name: nfs-client-provisioner #---由于quay.io仓库国内被墙,所以替换成七牛云的仓库 #image: quay-mirror.qiniu.com/external_storage/nfs-client-provisioner:latest image: registry.cn-hangzhou.aliyuncs.com/open-ali/nfs-client-provisioner:latest volumeMounts: - name: nfs-client-root mountPath: /persistentvolumes env: - name: PROVISIONER_NAME value: nfs-client #---nfs-provisioner的名称,以后设置的storageclass要和这个保持一致 - name: NFS_SERVER value: ${NFS_ADDRESS} #---NFS服务器地址,和 valumes 保持一致 - name: NFS_PATH value: ${NFS_DIR} #---NFS服务器目录,和 valumes 保持一致 volumes: - name: nfs-client-root nfs: server: ${NFS_ADDRESS} #---NFS服务器地址 path: ${NFS_DIR} #---NFS服务器目录 EOF # 部署deployment.yaml kubectl apply -f nfs-provisioner-deploy.yaml -n kube-system # 查看创建的pod kubectl get pod -o wide -n kube-system|grep nfs-client # 查看pod日志 kubectl logs -f `kubectl get pod -o wide -n kube-system|grep nfs-client|awk '{print $1}'` -n kube-system ``` ### 4、创建StorageClass storage class的定义,需要注意的是:provisioner属性要等于驱动所传入的环境变量`PROVISIONER_NAME`的值。否则,驱动不知道知道如何绑定storage class。 此处可以不修改,或者修改provisioner的名字,需要与上面的deployment的`PROVISIONER_NAME`名字一致。 ```bash # 清理storageclass资源 kubectl delete -f nfs-storage.yaml # 编写yaml cat >nfs-storage.yaml<<-EOF apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: nfs-storage annotations: storageclass.kubernetes.io/is-default-class: "true" #---设置为默认的storageclass provisioner: nfs-client #---动态卷分配者名称,必须和上面创建的"PROVISIONER_NAME"变量中设置的Name一致 parameters: archiveOnDelete: "true" #---设置为"false"时删除PVC不会保留数据,"true"则保留数据 mountOptions: - hard #指定为硬挂载方式 - nfsvers=4 #指定NFS版本,这个需要根据 NFS Server 版本号设置 EOF #部署class.yaml kubectl apply -f nfs-storage.yaml #查看创建的storageclass(这里可以看到nfs-storage已经变为默认的storageclass了) $ kubectl get sc NAME PROVISIONER AGE nfs-storage (default) nfs-client 3m38s ``` ## 四、创建PVC ### 01、创建一个新的namespace,然后创建pvc资源 ```bash # 删除命令空间 kubectl delete ns kube-public # 创建命名空间 kubectl create ns kube-public # 清理pvc kubectl delete -f test-claim.yaml -n kube-public # 编写yaml cat >test-claim.yaml<<\EOF kind: PersistentVolumeClaim apiVersion: v1 metadata: name: test-claim spec: storageClassName: nfs-storage #---需要与上面创建的storageclass的名称一致 accessModes: - ReadWriteMany resources: requests: storage: 100Gi EOF #创建PVC kubectl apply -f test-claim.yaml -n kube-public #查看创建的PV和PVC $ kubectl get pvc -n kube-public NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE test-claim Bound pvc-593f241f-a75f-459a-af18-a672e5090921 100Gi RWX nfs-storage 3s kubectl get pv #然后,我们进入到NFS的export目录,可以看到对应该volume name的目录已经创建出来了。其中volume的名字是namespace,PVC name以及uuid的组合: #注意,出现pvc在pending的原因可能为nfs-client-provisioner pod 
出现了问题,删除重建的时候会出现镜像问题 ``` ## 五、创建测试Pod ```bash # 清理资源 kubectl delete -f test-pod.yaml -n kube-public # 编写yaml cat > test-pod.yaml <<\EOF kind: Pod apiVersion: v1 metadata: name: test-pod spec: containers: - name: test-pod image: busybox:latest command: - "/bin/sh" args: - "-c" - "touch /mnt/SUCCESS && exit 0 || exit 1" volumeMounts: - name: nfs-pvc mountPath: "/mnt" restartPolicy: "Never" volumes: - name: nfs-pvc persistentVolumeClaim: claimName: test-claim EOF #创建pod kubectl apply -f test-pod.yaml -n kube-public #查看创建的pod kubectl get pod -o wide -n kube-public ``` ### 01、进入 NFS Server 服务器验证是否创建对应文件 进入 NFS Server 服务器的 NFS 挂载目录,查看是否存在 Pod 中创建的文件: ```bash $ cd /data/nfs/ $ ls archived-kube-public-test-claim-pvc-2dd4740d-f2d1-4e88-a0fc-383c00e37255 kube-public-test-claim-pvc-ad304939-e75d-414f-81b5-7586ef17db6c archived-kube-public-test-claim-pvc-593f241f-a75f-459a-af18-a672e5090921 kube-system-test1-claim-pvc-f84dc09c-b41e-4e67-a239-b14f8d342efc archived-kube-public-test-claim-pvc-b08b209d-c448-4ce4-ab5c-1bf37cc568e6 pv001 default-test-claim-pvc-4f18ed06-27cd-465b-ac87-b2e0e9565428 pv002 # 可以看到已经生成 SUCCESS 该文件,并且可知通过 NFS Provisioner 创建的目录命名方式为 “namespace名称-pvc名称-pv名称”,pv 名称是随机字符串,所以每次只要不删除 PVC,那么 Kubernetes 中的与存储绑定将不会丢失,要是删除 PVC 也就意味着删除了绑定的文件夹,下次就算重新创建相同名称的 PVC,生成的文件夹名称也不会一致,因为 PV 名是随机生成的字符串,而文件夹命名又跟 PV 有关,所以删除 PVC 需谨慎。 ``` 参考文档: https://blog.csdn.net/qq_25611295/article/details/86065053 k8s pv与pvc持久化存储(静态与动态) https://blog.csdn.net/networken/article/details/86697018 kubernetes部署NFS持久存储 https://www.jianshu.com/p/5e565a8049fc kubernetes部署NFS持久存储(静态和动态) ================================================ FILE: components/external-storage/4、Kubernetes之MySQL持久存储和故障转移.md ================================================ Table of Contents ================= * [一、MySQL持久化演练](#一mysql持久化演练) * [1、数据库提供持久化存储,主要分为下面几个步骤:](#1数据库提供持久化存储主要分为下面几个步骤) * [二、静态PV PVC](#二静态pv-pvc) * [1、创建 PV](#1创建-pv) * [2、创建PVC](#2创建pvc) * [三、部署 MySQL](#三部署-mysql) * [1、MySQL 的配置文件mysql.yaml如下:](#1mysql-的配置文件mysqlyaml如下) * [2、更新 MySQL 数据](#2更新-mysql-数据) * [3、故障转移](#3故障转移) * [四、全新命名空间使用](#四全新命名空间使用) # 一、MySQL持久化演练 ## 1、数据库提供持久化存储,主要分为下面几个步骤: 1、创建 PV 和 PVC 2、部署 MySQL 3、向 MySQL 添加数据 4、模拟节点宕机故障,Kubernetes 将 MySQL 自动迁移到其他节点 5、验证数据一致性 # 二、静态PV PVC ```bash PV就好比是一个仓库,我们需要先购买一个仓库,即定义一个PV存储服务,例如CEPH,NFS,Local Hostpath等等。 PVC就好比租户,pv和pvc是一对一绑定的,挂载到POD中,一个pvc可以被多个pod挂载。 ``` ## 1、创建 PV ```bash # 清理pv资源 kubectl delete -f mysql-static-pv.yaml # 编写pv yaml资源文件 cat > mysql-static-pv.yaml <<\EOF apiVersion: v1 kind: PersistentVolume metadata: name: mysql-static-pv spec: capacity: storage: 80Gi accessModes: - ReadWriteOnce #ReadWriteOnce - 卷可以由单个节点以读写方式挂载 #ReadOnlyMany - 卷可以由许多节点以只读方式挂载 #ReadWriteMany - 卷可以由许多节点以读写方式挂载 persistentVolumeReclaimPolicy: Retain #Retain,不清理, 保留 Volume(需要手动清理) #Recycle,删除数据,即 rm -rf /thevolume/*(只有 NFS 和 HostPath 支持) #Delete,删除存储资源,比如删除 AWS EBS 卷(只有 AWS EBS, GCE PD, Azure Disk 和 Cinder 支持) nfs: path: /data/nfs/mysql/ server: 10.198.1.155 mountOptions: - vers=4 - minorversion=0 - noresvport EOF # 部署pv到集群中 kubectl apply -f mysql-static-pv.yaml # 查看pv $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE mysql-static-pv 80Gi RWO Retain Available 4m20s ``` ## 2、创建PVC ```bash # 清理pvc资源 kubectl delete -f mysql-pvc.yaml # 编写pvc yaml资源文件 cat > mysql-pvc.yaml <<\EOF apiVersion: v1 kind: PersistentVolumeClaim metadata: name: mysql-static-pvc spec: accessModes: - ReadWriteOnce resources: requests: storage: 80Gi EOF # 创建pvc资源 kubectl apply -f mysql-pvc.yaml # 查看pvc $ kubectl get pvc NAME STATUS 
VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE mysql-static-pvc Bound pvc-c55f8695-2a0b-4127-a60b-5c1aba8b9104 80Gi RWO nfs-storage 81s ``` # 三、部署 MySQL ## 1、MySQL 的配置文件mysql.yaml如下: ```bash kubectl delete -f mysql.yaml cat >mysql.yaml<<\EOF apiVersion: v1 kind: Service metadata: name: mysql spec: ports: - port: 3306 selector: app: mysql --- apiVersion: extensions/v1beta1 kind: Deployment metadata: name: mysql spec: selector: matchLabels: app: mysql template: metadata: labels: app: mysql spec: containers: - name: mysql image: mysql:5.6 env: - name: MYSQL_ROOT_PASSWORD value: password ports: - name: mysql containerPort: 3306 volumeMounts: - name: mysql-persistent-storage mountPath: /var/lib/mysql volumes: - name: mysql-persistent-storage persistentVolumeClaim: claimName: mysql-static-pvc EOF kubectl apply -f mysql.yaml # PVC mysql-static-pvc Bound 的 PV mysql-static-pv 将被 mount 到 MySQL 的数据目录 /var/lib/mysql。 ``` ## 2、更新 MySQL 数据 MySQL 被部署到 k8s-node02,下面通过客户端访问 Service mysql: ```bash $ kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword If you don't see a command prompt, try pressing enter. mysql> 我们在mysql库中创建一个表myid,然后在表里新增几条数据。 mysql> use mysql Database changed mysql> drop table myid; Query OK, 0 rows affected (0.12 sec) mysql> create table myid(id int(4)); Query OK, 0 rows affected (0.23 sec) mysql> insert myid values(888); Query OK, 1 row affected (0.03 sec) mysql> select * from myid; +------+ | id | +------+ | 888 | +------+ 1 row in set (0.00 sec) ``` ## 3、故障转移 我们现在把 node02 机器关机,模拟节点宕机故障。 ```bash 1、一段时间之后,Kubernetes 将 MySQL 迁移到 k8s-node01 $ kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES mysql-7686899cf9-8z6tc 1/1 Running 0 21s 10.244.1.19 node01 mysql-7686899cf9-d4m42 1/1 Terminating 0 23m 10.244.2.17 node02 2、验证数据的一致性 $ kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword If you don't see a command prompt, try pressing enter. 
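# (此时旧 pod 仍处于 Terminating,新 pod 已在 node01 上 Running)
# 重新连上 Service mysql 后,检查故障转移前写入的数据是否仍然存在: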
mysql> use mysql Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> select * from myid; +------+ | id | +------+ | 888 | +------+ 1 row in set (0.00 sec) 3、MySQL 服务恢复,数据也完好无损,我们可以可以在存储节点上面查看一下生成的数据库文件。 [root@nfs_server mysql-pv]# ll -rw-rw---- 1 systemd-bus-proxy ssh_keys 56 12月 14 09:53 auto.cnf -rw-rw---- 1 systemd-bus-proxy ssh_keys 12582912 12月 14 10:15 ibdata1 -rw-rw---- 1 systemd-bus-proxy ssh_keys 50331648 12月 14 10:15 ib_logfile0 -rw-rw---- 1 systemd-bus-proxy ssh_keys 50331648 12月 14 09:53 ib_logfile1 drwx------ 2 systemd-bus-proxy ssh_keys 4096 12月 14 10:05 mysql drwx------ 2 systemd-bus-proxy ssh_keys 4096 12月 14 09:53 performance_schema ``` # 四、全新命名空间使用 pv是全局的,pvc可以指定namespace ```bash kubectl delete ns test-ns kubectl create ns test-ns kubectl apply -f mysql-pvc.yaml -n test-ns kubectl apply -f mysql.yaml -n test-ns kubectl get pods -n test-ns -o wide kubectl -n test-ns logs -f $(kubectl get pods -n test-ns|grep mysql|awk '{print $1}') kubectl run -n test-ns -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword ``` 参考文档: https://blog.51cto.com/wzlinux/2330295 Kubernetes 之 MySQL 持久存储和故障转移(十一) https://qingmu.io/2019/08/11/Run-mysql-on-kubernetes/ 从部署mysql聊一聊有状态服务和PV及PVC ================================================ FILE: components/external-storage/5、Kubernetes之Nginx动静态PV持久存储.md ================================================ Table of Contents ================= * [一、nginx使用nfs静态PV](#一nginx使用nfs静态pv) * [1、静态nfs-static-nginx-rc.yaml](#1静态nfs-static-nginx-rcyaml) * [2、静态nfs-static-nginx-deployment.yaml](#2静态nfs-static-nginx-deploymentyaml) * [3、nginx多目录挂载](#3nginx多目录挂载) * [二、nginx使用nfs动态PV](#二nginx使用nfs动态pv) * [1、动态nfs-dynamic-nginx.yaml](#1动态nfs-dynamic-nginxyaml) # 一、nginx使用nfs静态PV ## 1、静态nfs-static-nginx-rc.yaml ```bash ##清理资源 kubectl delete -f nfs-static-nginx-rc.yaml -n test cat >nfs-static-nginx-rc.yaml<<\EOF ##创建namespace --- apiVersion: v1 kind: Namespace metadata: name: test labels: name: test ##创建nfs-pv --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv labels: pv: nfs-pv spec: capacity: storage: 10Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略 nfs: path: /data/nfs/nginx/ server: 10.198.1.155 ##创建nfs-pvc --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-pvc namespace: test labels: pvc: nfs-pvc spec: accessModes: - ReadWriteMany resources: requests: storage: 10Gi storageClassName: nfs selector: matchLabels: pv: nfs-pv ##部署应用nginx --- apiVersion: v1 kind: ReplicationController metadata: name: nginx-test namespace: test labels: name: nginx-test spec: replicas: 2 selector: name: nginx-test template: metadata: labels: name: nginx-test spec: containers: - name: nginx-test image: docker.io/nginx volumeMounts: - mountPath: /usr/share/nginx/html name: nginx-data ports: - containerPort: 80 volumes: - name: nginx-data persistentVolumeClaim: claimName: nfs-pvc ##创建service --- apiVersion: v1 kind: Service metadata: namespace: test name: nginx-test labels: name: nginx-test spec: type: NodePort ports: - port: 80 protocol: TCP targetPort: 80 name: http nodePort: 30080 selector: name: nginx-test EOF ##创建资源 kubectl apply -f nfs-static-nginx-rc.yaml -n test ##查看pv资源 kubectl get pv -n test --show-labels ##查看pvc资源 kubectl get pvc -n test --show-labels ##查看pod $ kubectl get pods -n test NAME READY STATUS RESTARTS AGE 
nginx-test-r4n2j 1/1 Running 0 54s nginx-test-zstf5 1/1 Running 0 54s #可以看到,nginx应用已经部署成功。 #nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口 #切换到到nfs-server服务器上 echo "Test NFS Share discovery with nfs-static-nginx-rc" > /data/nfs/nginx/index.html #在浏览器上访问kubernetes主节点的 http://master:30080,就能访问到这个页面内容了 ``` ## 2、静态nfs-static-nginx-deployment.yaml ```bash ##清理资源 kubectl delete -f nfs-static-nginx-deployment.yaml -n test cat >nfs-static-nginx-deployment.yaml<<\EOF ##创建namespace --- apiVersion: v1 kind: Namespace metadata: name: test labels: name: test ##创建nfs-pv --- apiVersion: v1 kind: PersistentVolume metadata: name: nfs-pv labels: pv: nfs-pv spec: capacity: storage: 10Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略 nfs: path: /data/nfs/nginx/ server: 10.198.1.155 ##创建nfs-pvc --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-pvc namespace: test labels: pvc: nfs-pvc spec: accessModes: - ReadWriteMany resources: requests: storage: 10Gi storageClassName: nfs selector: matchLabels: pv: nfs-pv ##部署应用nginx --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment namespace: test labels: name: nginx-test spec: replicas: 2 selector: matchLabels: name: nginx-test template: metadata: labels: name: nginx-test spec: containers: - name: nginx-test image: docker.io/nginx volumeMounts: - mountPath: /usr/share/nginx/html name: nginx-data ports: - containerPort: 80 volumes: - name: nginx-data persistentVolumeClaim: claimName: nfs-pvc ##创建service --- apiVersion: v1 kind: Service metadata: namespace: test name: nginx-test labels: name: nginx-test spec: type: NodePort ports: - port: 80 protocol: TCP targetPort: 80 name: http nodePort: 30080 selector: name: nginx-test EOF ##创建资源 kubectl apply -f nfs-static-nginx-deployment.yaml -n test ##查看pv资源 kubectl get pv -n test --show-labels ##查看pvc资源 kubectl get pvc -n test --show-labels ##查看pod $ kubectl get pods -n test NAME READY STATUS RESTARTS AGE nginx-deployment-64d6f78cdf-8bw8t 1/1 Running 0 55s nginx-deployment-64d6f78cdf-n5n4q 1/1 Running 0 55s #可以看到,nginx应用已经部署成功。 #nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口 #切换到到nfs-server服务器上 echo "Test NFS Share discovery with nfs-static-nginx-deployment" > /data/nfs/nginx/index.html #在浏览器上访问kubernetes主节点的 http://master:30080,就能访问到这个页面内容了 ``` ## 3、nginx多目录挂载 ``` 1、PV和PVC是一一对应关系,当有PV被某个PVC所占用时,会显示banding,其它PVC不能再使用绑定过的PV。 2、PVC一旦绑定PV,就相当于是一个存储卷,此时PVC可以被多个Pod所使用。(PVC支不支持被多个Pod访问,取决于访问模型accessMode的定义)。 3、PVC若没有找到合适的PV时,则会处于pending状态。 4、PV是属于集群级别的,不能定义在名称空间中。 5、PVC时属于名称空间级别的。 ``` ```bash ##清理资源 kubectl delete -f nfs-static-nginx-dp-many.yaml -n test cat >nfs-static-nginx-dp-many.yaml<<\EOF ##创建namespace --- apiVersion: v1 kind: Namespace metadata: name: test labels: name: test ##创建nginx-data-pv --- apiVersion: v1 kind: PersistentVolume metadata: name: nginx-data-pv labels: pv: nginx-data-pv spec: capacity: storage: 50Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略 nfs: path: /data/nfs/nginx/ server: 10.198.1.155 ##创建nginx-etc-pv --- apiVersion: v1 kind: PersistentVolume metadata: name: nginx-etc-pv labels: pv: nginx-etc-pv spec: capacity: storage: 50Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略 nfs: path: 
/data/nfs/nginx/ server: 10.198.1.155 ##创建pvc名字为nfs-nginx-data,存放数据 --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-nginx-data namespace: test labels: pvc: nfs-nginx-data spec: accessModes: - ReadWriteMany resources: requests: storage: 50Gi storageClassName: nfs selector: matchLabels: pv: nginx-data-pv ##创建pvc名字为nfs-nginx-etc,存放配置文件 --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-nginx-etc namespace: test labels: pvc: nfs-nginx-etc spec: accessModes: - ReadWriteMany resources: requests: storage: 50Gi storageClassName: nfs selector: matchLabels: pv: nginx-etc-pv ##部署应用nginx --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment namespace: test labels: name: nginx-test spec: replicas: 2 selector: matchLabels: name: nginx-test template: metadata: labels: name: nginx-test spec: containers: - name: nginx-test image: docker.io/nginx volumeMounts: - mountPath: /usr/share/nginx/html name: nginx-data # - mountPath: /etc/nginx #--这里需要注意,如果是这么挂载,那么需要事先现在/data/nfs/nginx/目录下把nginx的完整配置提前拷贝好 # name: nginx-etc ports: - containerPort: 80 volumes: - name: nginx-data persistentVolumeClaim: claimName: nfs-nginx-data # - name: nginx-etc # persistentVolumeClaim: # claimName: nfs-nginx-etc ##创建service --- apiVersion: v1 kind: Service metadata: namespace: test name: nginx-test labels: name: nginx-test spec: type: NodePort ports: - port: 80 protocol: TCP targetPort: 80 name: http nodePort: 30080 selector: name: nginx-test EOF ##创建资源 kubectl apply -f nfs-static-nginx-dp-many.yaml -n test ##查看pv资源 kubectl get pv -n test --show-labels ##查看pvc资源 kubectl get pvc -n test --show-labels ##查看pod $ kubectl get pods -n test NAME READY STATUS RESTARTS AGE nginx-deployment-64d6f78cdf-8bw8t 1/1 Running 0 55s nginx-deployment-64d6f78cdf-n5n4q 1/1 Running 0 55s ##进入容器 kubectl exec -it nginx-deployment-f687cdf47-xncj8 -n test /bin/bash #可以看到,nginx应用已经部署成功。 #nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口 #切换到到nfs-server服务器上 echo "Test NFS Share discovery with nfs-static-nginx-dp-many" > /data/nfs/nginx/index.html #在浏览器上访问kubernetes主节点的 http://master:30080,就能访问到这个页面内容了 ``` ## 4、参数namespace ```bash ##清理资源 export NAMESPACE="mos-namespace" kubectl delete -f nfs-static-nginx-dp-many.yaml -n ${NAMESPACE} cat >nfs-static-nginx-dp-many.yaml<<-EOF ##创建namespace --- apiVersion: v1 kind: Namespace metadata: name: ${NAMESPACE} labels: name: ${NAMESPACE} ##创建nginx-data-pv --- apiVersion: v1 kind: PersistentVolume metadata: name: nginx-data-pv labels: pv: nginx-data-pv spec: capacity: storage: 50Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略 nfs: path: /data/nfs/nginx/ server: 10.198.1.155 ##创建nginx-log-pv --- apiVersion: v1 kind: PersistentVolume metadata: name: nginx-log-pv labels: pv: nginx-log-pv spec: capacity: storage: 50Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: nfs # 注意这里使用nfs的storageClassName,如果没改k8s的默认storageClassName的话,这里可以省略 nfs: path: /data/nfs/nginx/ server: 10.198.1.155 ##创建pvc名字为nfs-nginx-data,存放数据 --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-nginx-data labels: pvc: nfs-nginx-data spec: accessModes: - ReadWriteMany resources: requests: storage: 50Gi storageClassName: nfs selector: matchLabels: pv: nginx-data-pv ##创建pvc名字为nfs-nginx-log,存放日志文件 --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-nginx-log labels: pvc: nfs-nginx-log spec: accessModes: - 
ReadWriteMany resources: requests: storage: 50Gi storageClassName: nfs selector: matchLabels: pv: nginx-log-pv ##部署应用nginx --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment labels: name: nginx-test spec: replicas: 2 selector: matchLabels: name: nginx-test template: metadata: labels: name: nginx-test spec: containers: - name: nginx-test image: docker.io/nginx volumeMounts: - mountPath: /usr/share/nginx/html name: nginx-data - mountPath: /var/log/nginx name: nginx-log ports: - containerPort: 80 volumes: - name: nginx-data persistentVolumeClaim: claimName: nfs-nginx-data - name: nginx-log persistentVolumeClaim: claimName: nfs-nginx-log ##创建service --- apiVersion: v1 kind: Service metadata: name: nginx-test labels: name: nginx-test spec: type: NodePort ports: - port: 80 protocol: TCP targetPort: 80 name: http nodePort: 30180 selector: name: nginx-test EOF ##创建资源 kubectl apply -f nfs-static-nginx-dp-many.yaml -n ${NAMESPACE} ``` # 二、nginx使用nfs动态PV `https://github.com/Lancger/opsfull/blob/master/components/external-storage/3%E3%80%81%E5%8A%A8%E6%80%81%E7%94%B3%E8%AF%B7PV%E5%8D%B7.md` ## 1、动态nfs-dynamic-nginx.yaml 通过参数控制在哪个命名空间创建 ```bash ##清理命名空间 kubectl delete ns k8s-public ##创建命名空间 kubectl create ns k8s-public ##清理资源 kubectl delete -f nfs-dynamic-nginx-deployment.yaml -n k8s-public cat >nfs-dynamic-nginx-deployment.yaml<<\EOF ##动态申请nfs-dynamic-pvc --- kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-dynamic-claim spec: storageClassName: nfs-storage #--需要与上面创建的storageclass的名称一致 accessModes: - ReadWriteMany resources: requests: storage: 90Gi ##部署应用nginx --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment labels: name: nginx-test spec: replicas: 3 selector: matchLabels: name: nginx-test template: metadata: labels: name: nginx-test spec: containers: - name: nginx-test image: docker.io/nginx volumeMounts: - mountPath: /usr/share/nginx/html name: nginx-data ports: - containerPort: 80 volumes: - name: nginx-data persistentVolumeClaim: claimName: nfs-dynamic-claim ##创建service --- apiVersion: v1 kind: Service metadata: name: nginx-test labels: name: nginx-test spec: type: NodePort ports: - port: 80 protocol: TCP targetPort: 80 name: http nodePort: 30090 selector: name: nginx-test EOF ##创建资源 kubectl apply -f nfs-dynamic-nginx-deployment.yaml -n k8s-public ##查看pv资源 kubectl get pv -n k8s-public --show-labels ##查看pvc资源 kubectl get pvc -n k8s-public --show-labels ##查看pod $ kubectl get pods -n k8s-public NAME READY STATUS RESTARTS AGE nginx-deployment-544f569478-5t8wm 1/1 Running 0 40s nginx-deployment-544f569478-8gks5 1/1 Running 0 40s nginx-deployment-544f569478-pw96x 1/1 Running 0 40s #可以看到,nginx应用已经部署成功。 #nginx应用的数据目录是使用的nfs共享存储,我们在nfs共享的目录里加入index.html文件,然后再访问nginx-service暴露的端口 #切换到到nfs-server服务器上 #注意动态的在这个目录,创建的目录命名方式为 “namespace名称-pvc名称-pv名称” /data/nfs/kube-public-test-claim-pvc-ad304939-e75d-414f-81b5-7586ef17db6c echo "Test NFS Share discovery with nfs-dynamic-nginx-deployment" > /data/nfs/kube-public-test-claim-pvc-ad304939-e75d-414f-81b5-7586ef17db6c/index.html #在浏览器上访问kubernetes主节点的 http://master:30090,就能访问到这个页面内容了 ``` ![](https://github.com/Lancger/opsfull/blob/master/images/dynamic-pv.png) 参考文档: https://kubernetes.io/zh/docs/tasks/run-application/run-stateless-application-deployment/ https://blog.51cto.com/ylw6006/2071845 在kubernetes集群中运行nginx ================================================ FILE: components/external-storage/README.md ================================================ PersistenVolume(PV):对存储资源创建和使用的抽象,使得存储作为集群中的资源管理 
PV分为静态和动态,动态能够自动创建PV PersistentVolumeClaim(PVC):让用户不需要关心具体的Volume实现细节 容器与PV、PVC之间的关系,可以如下图所示: ![PV](https://github.com/Lancger/opsfull/blob/master/images/pv01.png) 总的来说,PV是提供者,PVC是消费者,消费的过程就是绑定 # 问题一 pv挂载正常,pvc一直处于Pending状态 ```bash #在test的命名空间创建pvc $ kubectl get pvc -n test NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE nfs-pvc Pending---这里发现一直处于Pending的状态 nfs-storage 10s #查看日志 $ kubectl describe pvc nfs-pvc -n test failed to provision volume with StorageClass "nfs-storage": claim Selector is not supported #从日志中发现,问题出在标签匹配的地方 ``` 参考资料: https://blog.csdn.net/qq_25611295/article/details/86065053 k8s pv与pvc持久化存储(静态与动态) https://www.jianshu.com/p/5e565a8049fc kubernetes部署NFS持久存储(静态和动态) ================================================ FILE: components/heapster/README.md ================================================ # 一、问题现象 heapster: 已经被k8s给舍弃掉了 ```bash heapster logs这个报错啥情况 E0918 16:56:05.022867 1 manager.go:101] Error in scraping containers from kubelet_summary:10.10.188.242:10255: Get http://10.10.188.242:10255/stats/summary/: dial tcp 10.10.188.242:10255: getsockopt: connection refused ``` # 排查思路 ``` 1、排查下kubelet,10255是它暴露的端口 service kubelet status #看状态是正常的 #在10.10.188.242上执行 [root@localhost ~]# netstat -lnpt | grep 10255 tcp 0 0 10.10.188.240:10255 0.0.0.0:* LISTEN 9243/kubelet 看了下/var/log/pods/kube-system_heapster-5f848f54bc-rtbv4_abf53b7c-491f-472a-9e8b-815066a6ae3d/heapster下日志 所有的物理节点都是10255 拒绝连接 2、浏览器访问查看数据 10.10.188.242 是你节点的IP吧,正常的话浏览器访问http://IP:10255/stats/summary是有值的,你看下,如果没有那就是kubelet的配置出问题 ``` ![heapster获取数据异常1](https://github.com/Lancger/opsfull/blob/master/images/heapster-01.png) ![heapster获取数据异常2](https://github.com/Lancger/opsfull/blob/master/images/heapster-02.png) ================================================ FILE: components/ingress/0.通俗理解Kubernetes中Service、Ingress与Ingress Controller的作用与关系.md ================================================ # 一、通俗的讲: - 1、Service 是后端真实服务的抽象,一个 Service 可以代表多个相同的后端服务 - 2、Ingress 是反向代理规则,用来规定 HTTP/S 请求应该被转发到哪个 Service 上,比如根据请求中不同的 Host 和 url 路径让请求落到不同的 Service 上 - 3、Ingress Controller 就是一个反向代理程序,它负责解析 Ingress 的反向代理规则,如果 Ingress 有增删改的变动,所有的 Ingress Controller 都会及时更新自己相应的转发规则,当 Ingress Controller 收到请求后就会根据这些规则将请求转发到对应的 Service # 二、数据流向图 Kubernetes 并没有自带 Ingress Controller,它只是一种标准,具体实现有多种,需要自己单独安装,常用的是 Nginx Ingress Controller 和 Traefik Ingress Controller。 所以 Ingress 是一种转发规则的抽象,Ingress Controller 的实现需要根据这些 Ingress 规则来将请求转发到对应的 Service,我画了个图方便大家理解: ![Ingress Controller数据流向图](https://github.com/Lancger/opsfull/blob/master/images/Ingress%20Controller01.png) 从图中可以看出,Ingress Controller 收到请求,匹配 Ingress 转发规则,匹配到了就转发到后端 Service,而 Service 可能代表的后端 Pod 有多个,选出一个转发到那个 Pod,最终由那个 Pod 处理请求。 # 三、Ingress Controller对外暴露方式 有同学可能会问,既然 Ingress Controller 要接受外面的请求,而 Ingress Controller 是部署在集群中的,怎么让 Ingress Controller 本身能够被外面访问到呢,有几种方式: - 1、Ingress Controller 用 Deployment 方式部署,给它添加一个 Service,类型为 LoadBalancer,这样会自动生成一个 IP 地址,通过这个 IP 就能访问到了,并且一般这个 IP 是高可用的(前提是集群支持 LoadBalancer,通常云服务提供商才支持,自建集群一般没有) - 2、使用集群内部的某个或某些节点作为边缘节点,给 node 添加 label 来标识,Ingress Controller 用 DaemonSet 方式部署,使用 nodeSelector 绑定到边缘节点,保证每个边缘节点启动一个 Ingress Controller 实例,用 hostPort 直接在这些边缘节点宿主机暴露端口,然后我们可以访问边缘节点中 Ingress Controller 暴露的端口,这样外部就可以访问到 Ingress Controller 了 - 3、Ingress Controller 用 Deployment 方式部署,给它添加一个 Service,类型为 NodePort,部署完成后查看会给出一个端口,通过 kubectl get svc 我们可以查看到这个端口,这个端口在集群的每个节点都可以访问,通过访问集群节点的这个端口就可以访问 Ingress Controller 了。但是集群节点这么多,而且端口又不是 80和443,太不爽了,一般我们会在前面自己搭个负载均衡器,比如用 Nginx,将请求转发到集群各个节点的那个端口上,这样我们访问 Nginx 就相当于访问到 Ingress Controller 了 一般比较推荐的是前面两种方式。 参考资料: 
https://cloud.tencent.com/developer/article/1326535 通俗理解Kubernetes中Service、Ingress与Ingress Controller的作用与关系 ================================================ FILE: components/ingress/1.kubernetes部署Ingress-nginx单点和高可用.md ================================================ # 一、Ingress-nginx简介  Pod的IP以及service IP只能在集群内访问,如果想在集群外访问kubernetes提供的服务,可以使用nodeport、proxy、loadbalacer以及ingress等方式,由于service的IP集群外不能访问,就是使用ingress方式再代理一次,即ingress代理service,service代理pod. Ingress基本原理图如下: ![Ingress-nginx](https://github.com/Lancger/opsfull/blob/master/images/Ingress-nginx.png) # 二、部署nginx-ingress-controller ```bash # github地址 https://github.com/kubernetes/ingress-nginx https://kubernetes.github.io/ingress-nginx/ # 1、下载nginx-ingress-controller配置文件 kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml # 2、service-nodeport.yaml为ingress通过nodeport对外提供服务,注意默认nodeport暴露端口为随机,可以编辑该文件自定义端口 Using NodePort: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/baremetal/service-nodeport.yaml # 3、查看ingress-nginx组件状态 root># kubectl get pod -n ingress-nginx NAME READY STATUS RESTARTS AGE nginx-ingress-controller-568867bf56-mbvm2 1/1 Running 0 4m46s 查看创建的ingress service暴露的端口: root># kubectl get svc -n ingress-nginx NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE ingress-nginx NodePort 10.97.243.123 80:30725/TCP,443:32314/TCP 5m46s ``` # 二、创建ingress-nginx后端服务 1.创建一个Service及后端Deployment(以nginx为例) ``` cat > deploy-demon.yaml<<\EOF apiVersion: v1 kind: Service metadata: name: myapp namespace: default spec: selector: app: myapp release: canary ports: - name: http port: 80 targetPort: 80 --- apiVersion: apps/v1 kind: Deployment metadata: name: myapp-deploy spec: replicas: 2 selector: matchLabels: app: myapp release: canary template: metadata: labels: app: myapp release: canary spec: containers: - name: myapp image: ikubernetes/myapp:v2 ports: - name: httpd containerPort: 80 EOF root># kubectl apply -f deploy-demon.yaml root># kubectl get pods root># kubectl get svc myapp NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE myapp ClusterIP 10.106.30.175 80/TCP 59s # 通过ClusterIP方式内部测试访问Services root># curl 10.106.30.175 Hello MyApp | Version: v2 | Pod Name ``` # 三、创建myapp的ingress规则 ``` cat > ingress-myapp.yaml<<\EOF apiVersion: extensions/v1beta1 kind: Ingress metadata: name: ingress-myapp namespace: default annotations: kubernets.io/ingress.class: "nginx" spec: rules: - host: www.k8s-devops.com http: paths: - path: backend: serviceName: myapp servicePort: 80 EOF root># kubectl apply -f ingress-myapp.yaml root># kubectl get ingress NAME HOSTS ADDRESS PORTS AGE ingress-myapp www.k8s-devops.com 10.97.243.123 80 5s # 通过Ingress方式内部测试访问域名 root># curl -x 10.97.243.123:80 http://www.k8s-devops.com Hello MyApp | Version: v2 | Pod Name ``` # 四、查看ingress-default-backend的详细信息: ``` root># kubectl exec -n ingress-nginx -it nginx-ingress-controller-568867bf56-mbvm2 -- /bin/sh $ cat nginx.conf ## start server www.k8s-devops.com server { server_name www.k8s-devops.com ; listen 80 ; listen 443 ssl http2 ; set $proxy_upstream_name "-"; ssl_certificate_by_lua_block { certificate.call() } location / { set $namespace "default"; set $ingress_name "ingress-myapp"; set $service_name "myapp"; set $service_port "80"; set $location_path "/"; ``` # 五、测试域名 ``` 1、这是nginx-ingress-controller采用的deployment部署的多副本 root># kubectl get deployment -A ingress-nginx nginx-ingress-controller 6/6 6 6 65m (这里有6个副本) root># kubectl get svc -n ingress-nginx NAME TYPE CLUSTER-IP 
EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx   NodePort   10.97.243.123   <none>   80:30725/TCP,443:32314/TCP   69m

root># kubectl describe svc ingress-nginx -n ingress-nginx
Name:                     ingress-nginx
Namespace:                ingress-nginx
Labels:                   app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
Annotations:              kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app.kubernetes.io/name":"ingress-nginx","app.kubernetes.io/par...
Selector:                 app.kubernetes.io/name=ingress-nginx,app.kubernetes.io/part-of=ingress-nginx
Type:                     NodePort
IP:                       10.97.243.123
Port:                     http  80/TCP
TargetPort:               80/TCP
NodePort:                 http  30725/TCP
Endpoints:                10.244.154.195:80,10.244.154.196:80,10.244.44.197:80 + 3 more...    #---这里转到6个pod
Port:                     https  443/TCP
TargetPort:               443/TCP
NodePort:                 https  32314/TCP
Endpoints:                10.244.154.195:443,10.244.154.196:443,10.244.44.197:443 + 3 more...
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

root># kubectl get endpoints -n ingress-nginx
NAME            ENDPOINTS                                               AGE
ingress-nginx   10.244.154.195:80,10.244.154.196:80,10.244.44.197:80 + 9 more...   68m

Ingress Controller 用 Deployment 方式部署,给它添加一个 Service,类型为 NodePort,部署完成后查看会给出一个端口,通过 kubectl get svc 我们可以查看到这个端口,这个端口在集群的每个节点都可以访问,通过访问集群节点的这个端口就可以访问 Ingress Controller 了。但是集群节点这么多,而且端口又不是 80 和 443,太不爽了,一般我们会在前面自己搭个负载均衡器,比如用 Nginx,将请求转发到集群各个节点的那个端口上,这样我们访问 Nginx 就相当于访问到 Ingress Controller 了。

# 通过NodePort方式测试(主机IP+端口)
curl 10.10.0.24:30725
curl 10.10.0.32:30725
curl 10.10.0.23:30725
curl 10.10.0.25:30725
curl 10.10.0.29:30725
curl 10.10.0.12:30725

2、通过Ingress IP 绑定域名测试
root># kubectl get ingress -A
NAMESPACE   NAME            HOSTS                ADDRESS         PORTS   AGE
default     ingress-myapp   www.k8s-devops.com   10.97.243.123   80      45m

root># curl -x 10.97.243.123:80 http://www.k8s-devops.com
```

# 六、Ingress高可用

Ingress 高可用,我们可以通过修改 deployment 的副本数来实现,但是由于 ingress 承载着整个集群流量的接入,所以生产环境中,建议把 ingress 通过 DaemonSet 的方式部署到集群中,并给这些节点打上污点、不允许业务 pod 进行调度,以避免业务应用与 Ingress 服务发生资源争抢,然后通过 SLB 把 ingress 节点主机添加为后端服务器,进行流量转发。

1、修改为DaemonSet方式部署
```
wget -N https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml -O ingress-nginx-mandatory.yaml

# 1、类型的修改
sed -i 's/kind: Deployment/kind: DaemonSet/g' ingress-nginx-mandatory.yaml
sed -i 's/replicas:/#replicas:/g' ingress-nginx-mandatory.yaml

# 2、镜像的修改(可忽略)
#sed -i -e 's?quay.io?quay.azk8s.cn?g' -e 's?k8s.gcr.io?gcr.azk8s.cn/google-containers?g' ingress-nginx-mandatory.yaml

# 3、使pod共享宿主机网络,暴露所监听的端口以及让容器使用K8S的DNS
#   在 spec.template.spec 下面 serviceAccountName: nginx-ingress-serviceaccount 的前后,
#   平级加上 hostNetwork: true 和 dnsPolicy: "ClusterFirstWithHostNet"
sed -i '/serviceAccountName: nginx-ingress-serviceaccount/a\      hostNetwork: true' ingress-nginx-mandatory.yaml
sed -i '/serviceAccountName: nginx-ingress-serviceaccount/a\      dnsPolicy: "ClusterFirstWithHostNet"' ingress-nginx-mandatory.yaml

# 4、节点打标签和污点
#   在 serviceAccountName 之后追加节点标签选择器和容忍度:
#   nodeSelector:
#     node-ingress: "true"
#   tolerations:
#   - key: "node-role.kubernetes.io/master"
#     operator: "Equal"
#     value: ""
#     effect: "NoSchedule"
sed -i '/serviceAccountName: nginx-ingress-serviceaccount/a\      nodeSelector:\n        node-ingress: "true"' ingress-nginx-mandatory.yaml

# 修改参数如下:
#   kind: Deployment                     #修改为DaemonSet
#   replicas: 1                          #注释此行,DaemonSet不需要此参数
#   hostNetwork: true                    #添加该字段让容器使用物理机网络,在物理机暴露服务端口(80),注意物理机80端口提前不能被占用
#   dnsPolicy: ClusterFirstWithHostNet   #使用hostNetwork后容器会使用物理机网络包括DNS,会无法解析内部service,使用此参数让容器使用K8S的DNS
#   nodeSelector:                        #添加节点标签
#     node-ingress: "true"
#   tolerations:                         #添加对指定节点的容忍度

注意一点,因为我们创建的 ingress-controller 采用的是 hostNetwork 模式,所以无需再创建 ingress-svc 服务来把端口映射到节点主机上。
```
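在应用修改后的 ingress-nginx-mandatory.yaml 之前,可以先确认上面几条 sed 都已生效。下面是一个最小的检查脚本(仅作示意,假设文件就在当前目录、文件名与上文一致):

```bash
# 确认控制器类型已改为 DaemonSet,且 replicas 已被注释掉
grep -nE 'kind: (Deployment|DaemonSet)|replicas:' ingress-nginx-mandatory.yaml

# 确认 hostNetwork / dnsPolicy / nodeSelector 已插入到 serviceAccountName 附近
grep -n -A4 'serviceAccountName: nginx-ingress-serviceaccount' ingress-nginx-mandatory.yaml

# hostNetwork 模式下控制器会直接占用边缘节点的 80/443 端口,先确认端口未被占用
netstat -lnpt | grep -E ':80 |:443 '
```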
这里我在3台master节点部署(生产环境不要使用master节点,应该部署在独立的节点上),因为我们采用DaemonSet的方式,所以我们需要对3个节点打标签以及容忍度。 ``` ## 查看标签 root># kubectl get nodes --show-labels ## 给节点打标签 [root@k8s-master-01]# kubectl label nodes k8s-master-01 node-ingress="true" [root@k8s-master-01]# kubectl label nodes k8s-master-02 node-ingress="true" [root@k8s-master-01]# kubectl label nodes k8s-master-03 node-ingress="true" ## 节点打污点 ### master节点我之前已经打过污点,如果你没有打污点,执行下面3条命令。此污点名称需要与yaml文件中pod的容忍污点对应 [root@k8s-master-01]# kubectl taint nodes k8s-master-01 node-role.kubernetes.io/master=:NoSchedule [root@k8s-master-01]# kubectl taint nodes k8s-master-02 node-role.kubernetes.io/master=:NoSchedule [root@k8s-master-01]# kubectl taint nodes k8s-master-03 node-role.kubernetes.io/master=:NoSchedule ``` 2、最终配置文件DaemonSet版的Ingress ``` cat >ingress-nginx-mandatory.yaml<<\EOF apiVersion: v1 kind: Namespace metadata: name: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- kind: ConfigMap apiVersion: v1 metadata: name: nginx-configuration namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- kind: ConfigMap apiVersion: v1 metadata: name: tcp-services namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- kind: ConfigMap apiVersion: v1 metadata: name: udp-services namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- apiVersion: v1 kind: ServiceAccount metadata: name: nginx-ingress-serviceaccount namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: ClusterRole metadata: name: nginx-ingress-clusterrole labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - endpoints - nodes - pods - secrets verbs: - list - watch - apiGroups: - "" resources: - nodes verbs: - get - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - "" resources: - events verbs: - create - patch - apiGroups: - "extensions" - "networking.k8s.io" resources: - ingresses verbs: - get - list - watch - apiGroups: - "extensions" - "networking.k8s.io" resources: - ingresses/status verbs: - update --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: Role metadata: name: nginx-ingress-role namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - pods - secrets - namespaces verbs: - get - apiGroups: - "" resources: - configmaps resourceNames: # Defaults to "-" # Here: "-" # This has to be adapted if you change either parameter # when launching the nginx-ingress-controller. 
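  # (上面两处 "-" 在上游 mandatory.yaml 中原为 "<election-id>-<ingress-class>" 与
  #  "<ingress-controller-leader>-<nginx>",按默认启动参数拼接即为下面的名字)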
- "ingress-controller-leader-nginx" verbs: - get - update - apiGroups: - "" resources: - configmaps verbs: - create - apiGroups: - "" resources: - endpoints verbs: - get --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: RoleBinding metadata: name: nginx-ingress-role-nisa-binding namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: nginx-ingress-role subjects: - kind: ServiceAccount name: nginx-ingress-serviceaccount namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: ClusterRoleBinding metadata: name: nginx-ingress-clusterrole-nisa-binding labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: nginx-ingress-clusterrole subjects: - kind: ServiceAccount name: nginx-ingress-serviceaccount namespace: ingress-nginx --- apiVersion: apps/v1 #kind: Deployment #修改为DaemonSet kind: DaemonSet metadata: name: nginx-ingress-controller namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx spec: #replicas: 1 #注销此行,DaemonSet不需要此参数 selector: matchLabels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx template: metadata: labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx annotations: prometheus.io/port: "10254" prometheus.io/scrape: "true" spec: # wait up to five minutes for the drain of connections terminationGracePeriodSeconds: 300 serviceAccountName: nginx-ingress-serviceaccount hostNetwork: true #添加该字段让docker使用物理机网络,在物理机暴露服务端口(80),注意物理机80端口提前不能被占用 dnsPolicy: ClusterFirstWithHostNet #使用hostNetwork后容器会使用物理机网络包括DNS,会无法解析内部service,使用此参数让容器使用K8S的DNS nodeSelector: kubernetes.io/os: linux nodeSelector: node-ingress: "true" #添加节点标签 tolerations: #添加对指定节点容忍度 - key: "node-role.kubernetes.io/master" operator: "Equal" value: "" effect: "NoSchedule" containers: - name: nginx-ingress-controller image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1 args: - /nginx-ingress-controller - --configmap=$(POD_NAMESPACE)/nginx-configuration - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services - --udp-services-configmap=$(POD_NAMESPACE)/udp-services - --publish-service=$(POD_NAMESPACE)/ingress-nginx - --annotations-prefix=nginx.ingress.kubernetes.io securityContext: allowPrivilegeEscalation: true capabilities: drop: - ALL add: - NET_BIND_SERVICE # www-data -> 33 runAsUser: 33 env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace ports: - name: http containerPort: 80 - name: https containerPort: 443 livenessProbe: failureThreshold: 3 httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 10 readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 10254 scheme: HTTP periodSeconds: 10 successThreshold: 1 timeoutSeconds: 10 lifecycle: preStop: exec: command: - /wait-shutdown --- EOF kubectl apply -f ingress-nginx-mandatory.yaml ``` 3、创建资源 ``` [root@k8s-master01 ingress-master]# kubectl apply -f ingress-nginx-mandatory.yaml ## 查看资源分布情况 ### 可以看到两个ingress-controller已经根据我们选择,部署在3个master节点上 [root@k8s-master01 ingress-master]# kubectl get pod -n ingress-nginx -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-ingress-controller-298dq 1/1 
Running 0 134m 172.16.11.122 k8s-master03 nginx-ingress-controller-sh9h2 1/1 Running 0 134m 172.16.11.121 k8s-master02 ``` 4、测试 ``` #配置集群外域名解析,当前测试环境我们使用windows hosts文件进行解析(针对于node节点有公网IP的类型) 92.168.92.56 www.k8s-devops.com 92.168.92.57 www.k8s-devops.com 92.168.92.58 www.k8s-devops.com 使用域名进行访问: www.k8s-devops.com ``` 参考资料: https://www.cnblogs.com/tchua/p/11174386.html Kubernetes集群Ingress高可用部署 https://github.com/kubernetes/ingress-nginx/blob/04e2ad8fcd51b0741263a37b8e7424ca3979137c/docs/deploy/index.md 官网 https://blog.csdn.net/networken/article/details/85881558 kubernetes部署Ingress-nginx https://www.jianshu.com/p/a8e18cef13b2 HA ingress-nginx: DaemonSet hostNetwork keepavlied ================================================ FILE: components/ingress/1.外部服务发现之Ingress介绍.md ================================================ # 一、ingress介绍   K8s集群对外暴露服务的方式目前只有三种:`LoadBlancer`、`NodePort`、`Ingress`。前两种熟悉起来比较快,而且使用起来也比较方便,在此就不进行介绍了。   Ingress其实就是从 kuberenets 集群外部访问集群的一个入口,将外部的请求转发到集群内不同的 Service 上,其实就相当于 nginx、haproxy 等负载均衡代理服务器,有的同学可能觉得我们直接使用 nginx 就实现了,但是只使用 nginx 这种方式有很大缺陷,每次有新服务加入的时候怎么改 Nginx 配置?不可能让我们去手动更改或者滚动更新前端的 Nginx Pod 吧?那我们再加上一个服务发现的工具比如 consul 如何?貌似是可以,对吧?而且在之前单独使用 docker 的时候,这种方式已经使用得很普遍了,Ingress 实际上就是这样实现的,只是服务发现的功能自己实现了,不需要使用第三方的服务了,然后再加上一个域名规则定义,路由信息的刷新需要一个靠 Ingress controller 来提供。   其中ingress controller目前主要有两种:基于`nginx`服务的ingress controller和基于`traefik`的ingress controller。而其中traefik的ingress controller,目前支持http和https协议 # 二、ingress的工作原理 ## 1、ingress由两部分组成: ingress controller和ingress服务   Ingress controller 可以理解为一个监听器,通过不断地与 kube-apiserver 打交道,实时的感知后端 service、pod 的变化,当得到这些变化信息后,Ingress controller 再结合 Ingress 的配置,更新反向代理负载均衡器,达到服务发现的作用。其实这点和服务发现工具 consul consul-template 非常类似。 ## 2、ingress具体的工作原理如下   ingress contronler通过与k8s的api进行交互,动态的去感知k8s集群中ingress服务规则的变化,然后读取它,并按照定义的ingress规则,转发到k8s集群中对应的service。而这个ingress规则写明了哪个域名对应k8s集群中的哪个service,然后再根据ingress-controller中的nginx配置模板,生成一段对应的nginx配置。然后再把该配置动态的写到ingress-controller的pod里,该ingress-controller的pod里面运行着一个nginx服务,控制器会把生成的nginx配置写入到nginx的配置文件中,然后reload一下,使其配置生效。以此来达到域名分配置及动态更新的效果。 # 三、Traefik   Traefik 是一款开源的反向代理与负载均衡工具。它最大的优点是能够与常见的微服务系统直接整合,可以实现自动化动态配置。目前支持 Docker、Swarm、Mesos/Marathon、 Mesos、Kubernetes、Consul、Etcd、Zookeeper、BoltDB、Rest API 等等后端模型。   要使用 traefik,我们同样需要部署 traefik 的 Pod,由于我们演示的集群中只有 master 节点有外网网卡,所以我们这里只有 master 这一个边缘节点,我们将 traefik 部署到该节点上即可。 ![traefik原理图](https://github.com/Lancger/opsfull/blob/master/images/traefik-architecture.png) - 1、 首先,为安全起见我们这里使用 RBAC 安全认证方式:([rbac.yaml](https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-rbac.yaml)) ``` # vim rbac.yaml --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - pods - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system ``` - 2、直接在集群中创建即可: ``` $ kubectl create -f rbac.yaml serviceaccount "traefik-ingress-controller" created clusterrole.rbac.authorization.k8s.io "traefik-ingress-controller" created clusterrolebinding.rbac.authorization.k8s.io "traefik-ingress-controller" created ``` - 3、然后使用 
Deployment 来管理 traefik Pod,直接使用官方的 traefik 镜像部署即可([traefik.yaml](https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-deployment.yaml)) ``` # vim traefik.yaml --- kind: Deployment apiVersion: extensions/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: master #默认master是不允许被调度的,加上tolerations后允许被调度,然后这里使用自身机器master的地址,可以使用kubectl get nodes --show-labels来查看 containers: - image: traefik:v1.7 name: traefik-ingress-lb ports: - name: http containerPort: 80 #hostPort: 80 - name: admin containerPort: 8080 args: - --api - --kubernetes - --logLevel=INFO --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP # 该端口为 traefik ingress-controller的服务端口 port: 80 name: web # 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围 # 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问 nodePort: 23456 - protocol: TCP # 该端口为 traefik 的管理WEB界面 port: 8080 name: admin nodePort: 23457 type: NodePort ``` - 4、直接创建上面的资源对象即可: ``` $ kubectl create -f traefik.yaml deployment.extensions "traefik-ingress-controller" created service "traefik-ingress-service" created ``` - 5、要注意上面 yaml 文件: ``` tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: master 由于我们这里的特殊性,只有 master 节点有外网访问权限,所以我们使用nodeSelector标签将traefik的固定调度到master这个节点上,那么上面的tolerations是干什么的呢?这个是因为我们集群使用的 kubeadm 安装的,master 节点默认是不能被普通应用调度的,要被调度的话就需要添加这里的 tolerations 属性,当然如果你的集群和我们的不太一样,直接去掉这里的调度策略就行。 nodeSelector 和 tolerations 都属于 Pod 的调度策略,在后面的课程中会为大家讲解。 ``` - 6、traefik 还提供了一个 web ui 工具,就是上面的 8080 端口对应的服务,为了能够访问到该服务,我们这里将服务设置成的 NodePort: ``` $ kubectl get pods -n kube-system -l k8s-app=traefik-ingress-lb -o wide NAME READY STATUS RESTARTS AGE IP NODE traefik-ingress-controller-57c4f787d9-bfhnl 1/1 Running 0 8m 10.244.0.18 master $ kubectl get svc -n kube-system NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE ... traefik-ingress-service NodePort 10.102.183.112 80:23456/TCP,8080:23457/TCP 8m ... ``` 现在在浏览器中输入 [http://master_node_ip:23457 例如 http://16.21.206.156:23457/dashboard/ 注意这里是使用的IP] 就可以访问到 traefik 的 dashboard 了 # 四、Ingress 对象 以上我们是通过 NodePort 来访问 traefik 的 Dashboard 的,那怎样通过 ingress 来访问呢? 首先,需要创建一个 ingress 对象:(ingress.yaml) ``` # vim ingress.yaml --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: traefik-web-ui namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: traefik-ui.test.com http: paths: - backend: serviceName: traefik-ingress-service #servicePort: 8080 servicePort: admin #跟上面service的name对应 ``` 然后为 traefik dashboard 创建对应的 ingress 对象: ``` $ kubectl create -f ingress.yaml ingress.extensions "traefik-web-ui" created ``` 要注意上面的 ingress 对象的规则,特别是 rules 区域,我们这里是要为 traefik 的 dashboard 建立一个 ingress 对象,所以这里的 serviceName 对应的是上面我们创建的 traefik-ingress-service,端口也要注意对应 8080 端口,为了避免端口更改,这里的 servicePort 的值也可以替换成上面定义的 port 的名字:admin 创建完成后,我们应该怎么来测试呢? 
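在执行下面两个步骤之前,也可以先用命令行做一次连通性确认(下面的 16.21.206.156 沿用上文示例中 master 的外网 IP,仅为假设值,请替换成自己的地址):

```bash
# 不改 hosts,直接带 Host 头访问 traefik 的 web 入口(NodePort 23456),等价于通过 ingress 规则访问 dashboard
curl -sI -H "Host: traefik-ui.test.com" http://16.21.206.156:23456/dashboard/

# 或者先写入本地 hosts,再用域名+端口访问
echo "16.21.206.156 traefik-ui.test.com" >> /etc/hosts
curl -sI http://traefik-ui.test.com:23456/dashboard/
```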
- 1、第一步,在本地的/etc/hosts里面添加上 traefik-ui.test.com 与 master 节点外网 IP 的映射关系 - 2、第二步,在浏览器中访问:http://traefik-ui.test.com 我们会发现并没有得到我们期望的 dashboard 界面,这是因为我们上面部署 traefik 的时候使用的是 NodePort 这种 Service 对象,所以我们只能通过上面的 23456 端口访问到我们的目标对象:[http://traefik-ui.test.com:23456](http://traefik-ui.test.com:23456) 加上端口后我们发现可以访问到 dashboard 了,而且在 dashboard 当中多了一条记录,正是上面我们创建的 ingress 对象的数据,我们还可以切换到 HEALTH 界面中,可以查看当前 traefik 代理的服务的整体的健康状态   注意这里为何是23456而不是23457,因为这里是通过ingress设置的域名来访问的,宿主机的23456端口对应宿主机上traefik-ingress-controller-nginx-pod容器的80端口,然后再经过ingress代理到service对应的pod节点上,如果traefik-ingress-controller-nginx-pod设置了宿主机端口映射,那么可以省略23456端口,下面会讲到hostPort: 80参数的使用,因为走了多层代理,所以直接Nodeport方式的性能会好一些,但是量一多,维护起来就比较麻烦) - 3、第三步,上面我们可以通过自定义域名加上端口可以访问我们的服务了,但是我们平时服务别人的服务是不是都是直接用的域名啊,http 或者 https 的,几乎很少有在域名后面加上端口访问的吧?为什么?太麻烦啊,端口也记不住,要解决这个问题,怎么办,我们只需要把我们上面的 traefik 的核心应用的端口隐射到 master 节点上的 80 端口,是不是就可以了,因为 http 默认就是访问 80 端口,但是我们在 Service 里面是添加的一个 NodePort 类型的服务,没办法映射 80 端口,怎么办?这里就可以直接在 Pod 中指定一个 hostPort 即可,更改上面的 traefik.yaml 文件中的容器端口: ``` containers: - image: traefik name: traefik-ingress-lb ports: - name: http containerPort: 80 hostPort: 80 - name: admin containerPort: 8080 ``` 添加以后 hostPort: 80,然后更新应用: ``` $ kubectl apply -f traefik.yaml ``` 更新完成后,这个时候我们在浏览器中直接使用域名方法测试下: - 4、第四步,正常来说,我们如果有自己的域名,我们可以将我们的域名添加一条 DNS 记录,解析到 master 的外网 IP 上面,这样任何人都可以通过域名来访问我的暴露的服务了。如果你有多个边缘节点的话,可以在每个边缘节点上部署一个 ingress-controller 服务,然后在边缘节点前面挂一个负载均衡器,比如 nginx,将所有的边缘节点均作为这个负载均衡器的后端,这样就可以实现 ingress-controller 的高可用和负载均衡了。 到这里我们就通过 ingress 对象对外成功暴露了一个服务,下节课我们再来详细了解 traefik 的更多用法。 # 五、traefik 合并文件 1、创建文件 traefik-controller-ingress.yaml ``` vim traefik-controller-ingress.yaml --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - pods - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- kind: Deployment apiVersion: extensions/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: master #默认master是不允许被调度的,加上tolerations后允许被调度,然后这里使用自身机器master的地址,可以使用kubectl get nodes来查看 containers: - image: traefik:v1.7 name: traefik-ingress-lb ports: - name: http containerPort: 80 hostPort: 80 - name: admin containerPort: 8080 args: - --api - --kubernetes - --logLevel=INFO --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP # 该端口为 traefik ingress-controller的服务端口 port: 80 name: web # 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围 # 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问 nodePort: 23456 - protocol: TCP # 该端口为 traefik 的管理WEB界面 port: 8080 name: admin nodePort: 23457 type: NodePort --- apiVersion: extensions/v1beta1 kind: Ingress 
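# 为 traefik dashboard 创建的 Ingress 规则,servicePort 使用上面 Service 中定义的命名端口 admin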
metadata: name: traefik-web-ui namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: traefik-ui.test.com http: paths: - backend: serviceName: traefik-ingress-service #servicePort: 8080 servicePort: admin #跟上面service的name对应 ``` 2、更新应用 ``` $ kubectl apply -f traefik-controller-ingress.yaml ``` 3、访问测试 ``` http://traefik-ui.test.com 绑定master的公网IP或者VIP ``` https://blog.csdn.net/oyym_mv/article/details/86986510 Kubernetes实录(11) kubernetes使用traefik作为反向代理(Deamonset模式) ================================================ FILE: components/ingress/2.ingress tls配置.md ================================================ # 1、Ingress tls 上节课给大家展示了 traefik 的安装使用以及简单的 ingress 的配置方法,这节课我们来学习一下 ingress tls 以及 path 路径在 ingress 对象中的使用方法。 # 2、TLS 认证 在现在大部分场景下面我们都会使用 https 来访问我们的服务,这节课我们将使用一个自签名的证书,当然你有在一些正规机构购买的 CA 证书是最好的,这样任何人访问你的服务的时候都是受浏览器信任的证书。使用下面的 openssl 命令生成 CA 证书: ``` mkdir -p /ssl/ cd /ssl/ openssl req -newkey rsa:2048 -nodes -keyout tls.key -x509 -days 365 -out tls.crt ``` 现在我们有了证书,我们可以使用 kubectl 创建一个 secret 对象来存储上面的证书: ``` kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system ``` # 3、配置 Traefik 前面我们使用的是 Traefik 的默认配置,现在我们来配置 Traefik,让其支持 https: ``` mkdir -p /config/ cd /config/ cat > traefik.toml <<\EOF defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.http.redirect] entryPoint = "https" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/ssl/tls.crt" KeyFile = "/ssl/tls.key" EOF 上面的配置文件中我们配置了 http 和 https 两个入口,并且配置了将 http 服务强制跳转到 https 服务,这样我们所有通过 traefik 进来的服务都是 https 的,要访问 https 服务,当然就得配置对应的证书了,可以看到我们指定了 CertFile 和 KeyFile 两个文件,由于 traefik pod 中并没有这两个证书,所以我们要想办法将上面生成的证书挂载到 Pod 中去,是不是前面我们讲解过 secret 对象可以通过 volume 形式挂载到 Pod 中?至于上面的 traefik.toml 这个文件我们要怎么让 traefik pod 能够访问到呢?还记得我们前面讲过的 ConfigMap 吗?我们是不是可以将上面的 traefik.toml 配置文件通过一个 ConfigMap 对象挂载到 traefik pod 中去: kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system root># kubectl get configmap -n kube-system NAME DATA AGE coredns 1 11h extension-apiserver-authentication 6 11h kube-flannel-cfg 2 11h kube-proxy 2 11h kubeadm-config 2 11h kubelet-config-1.15 1 11h traefik-conf 1 10s 现在就可以更改下上节课的 traefik pod 的 yaml 文件了: cd /data/components/ingress/ cat > traefik.yaml <<\EOF kind: Deployment apiVersion: extensions/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 volumes: - name: ssl secret: secretName: traefik-cert - name: config configMap: name: traefik-conf tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: linux-node1.example.com containers: - image: traefik name: traefik-ingress-lb volumeMounts: - mountPath: "/ssl" #这里注意挂载的路径 name: "ssl" - mountPath: "/config" #这里注意挂载的路径 name: "config" ports: - name: http containerPort: 80 hostPort: 80 - name: https containerPort: 443 hostPort: 443 - name: admin containerPort: 8080 args: - --configfile=/config/traefik.toml - --api - --kubernetes - --logLevel=INFO EOF 和之前的比较,我们增加了 443 的端口配置,以及启动参数中通过 configfile 指定了 traefik.toml 配置文件,这个配置文件是通过 volume 挂载进来的。然后更新下 traefik pod: kubectl apply -f traefik.yaml kubectl logs -f traefik-ingress-controller-7dcfd9c6df-v58k7 -n kube-system 更新完成后我们查看 traefik pod 
的日志,如果出现类似于上面的一些日志信息,证明更新成功了。现在我们去访问 traefik 的 dashboard 会跳转到 https 的地址,并会提示证书相关的报警信息,这是因为我们的证书是我们自建的,并不受浏览器信任,如果你是正规机构购买的证书并不会出现改报警信息,你应该可以看到我们常见的绿色标志: https://traefik.k8s.com/dashboard/ ``` # 4、配置 ingress 其实上面的 TLS 认证方式已经成功了,接下来我们通过一个实例来说明下 ingress 中 path 的用法,这里我们部署了3个简单的 web 服务,通过一个环境变量来标识当前运行的是哪个服务:(backend.yaml) ``` cd /data/components/ingress/ cat > backend.yaml <<\EOF kind: Deployment apiVersion: extensions/v1beta1 metadata: name: svc1 spec: replicas: 1 template: metadata: labels: app: svc1 spec: containers: - name: svc1 image: cnych/example-web-service env: - name: APP_SVC value: svc1 ports: - containerPort: 8080 protocol: TCP --- kind: Deployment apiVersion: extensions/v1beta1 metadata: name: svc2 spec: replicas: 1 template: metadata: labels: app: svc2 spec: containers: - name: svc2 image: cnych/example-web-service env: - name: APP_SVC value: svc2 ports: - containerPort: 8080 protocol: TCP --- kind: Deployment apiVersion: extensions/v1beta1 metadata: name: svc3 spec: replicas: 1 template: metadata: labels: app: svc3 spec: containers: - name: svc3 image: cnych/example-web-service env: - name: APP_SVC value: svc3 ports: - containerPort: 8080 protocol: TCP --- kind: Service apiVersion: v1 metadata: labels: app: svc1 name: svc1 spec: type: ClusterIP ports: - port: 8080 name: http selector: app: svc1 --- kind: Service apiVersion: v1 metadata: labels: app: svc2 name: svc2 spec: type: ClusterIP ports: - port: 8080 name: http selector: app: svc2 --- kind: Service apiVersion: v1 metadata: labels: app: svc3 name: svc3 spec: type: ClusterIP ports: - port: 8080 name: http selector: app: svc3 EOF 可以看到上面我们定义了3个 Deployment,分别对应3个 Service: kubectl create -f backend.yaml 然后我们创建一个 ingress 对象来访问上面的3个服务:(example-ingress.yaml) cat > example-ingress.yaml <<\EOF apiVersion: extensions/v1beta1 kind: Ingress metadata: name: example-web-app annotations: kubernetes.io/ingress.class: "traefik" spec: rules: - host: example.k8s.com http: paths: - path: /s1 backend: serviceName: svc1 servicePort: 8080 - path: /s2 backend: serviceName: svc2 servicePort: 8080 - path: / backend: serviceName: svc3 servicePort: 8080 EOF 注意我们这里定义的 ingress 对象和之前有一个不同的地方是我们增加了 path 路径的定义,不指定的话默认是 '/',创建该 ingress 对象: kubectl create -f example-ingress.yaml 现在我们可以在本地 hosts 里面给域名 example.k8s.com 添加对应的 hosts 解析,然后就可以在浏览器中访问,可以看到默认也会跳转到 https 的页面: ``` 参考文档: https://www.qikqiak.com/k8s-book/docs/41.ingress%20config.html ================================================ FILE: components/ingress/3.ingress-http使用示例.md ================================================ # 一、ingress-http测试示例 ## 1、关键三个点: 注意这3个资源的namespace: kube-system需要一致 Deployment Service Ingress ``` $ vim nginx-deployment-http.yaml --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: nginx-deployment namespace: kube-system spec: replicas: 2 template: metadata: labels: app: nginx-pod spec: containers: - name: nginx image: nginx:1.15.5 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: nginx-service namespace: kube-system annotations: traefik.ingress.kubernetes.io/load-balancer-method: drr #动态加权轮训调度 spec: template: metadata: labels: name: nginx-service spec: selector: app: nginx-pod ports: - port: 80 targetPort: 80 --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: nginx-ingress namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: k8s.nginx.com http: paths: - backend: serviceName: nginx-service servicePort: 80 ``` ## 2、创建资源 ``` $ kubectl apply -f nginx-deployment-http.yaml 
deployment.apps/nginx-pod create service/nginx-service create ingress.extensions/nginx-ingress create ``` ## 3、访问刚创建的资源 首先这里需要先找到traefik-ingress pod 分布到到了那个节点,这里我们发现是落在了10.199.1.159的节点,然后我们绑定该节点对应的公网IP,这里假设为16.21.26.139 ``` 16.21.26.139 k8s.nginx.com ``` ``` $ kubectl get pod -A -o wide|grep traefik-ingress kube-system traefik-ingress-controller-7d454d7c68-8qpjq 1/1 Running 0 21h 10.46.2.10 10.199.1.159 ``` ![ingress测试示例1](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-01.png) ## 4、清理资源 ### 1、清理deployment ``` # 获取deployment $ kubectl get deploy -A NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE kube-system coredns 2/2 2 2 3d kube-system heapster 1/1 1 1 3d kube-system kubernetes-dashboard 1/1 1 1 3d kube-system metrics-server 1/1 1 1 3d kube-system nginx-pod 2/2 2 2 25m kube-system traefik-ingress-controller 1/1 1 1 2d22h # 清理deployment $ kubectl delete deploy nginx-pod -n kube-system deployment.extensions "nginx-pod" deleted ``` ### 2、清理service ``` # 获取svc $ kubectl get svc -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default kubernetes ClusterIP 10.44.0.1 443/TCP 3d kube-system heapster ClusterIP 10.44.158.46 80/TCP 3d kube-system kube-dns ClusterIP 10.44.0.2 53/UDP,53/TCP,9153/TCP 3d kube-system kubernetes-dashboard NodePort 10.44.176.99 443:27008/TCP 3d kube-system metrics-server ClusterIP 10.44.40.157 443/TCP 3d kube-system nginx-service ClusterIP 10.44.148.252 80/TCP 28m kube-system traefik-ingress-service NodePort 10.44.67.195 80:23456/TCP,443:23457/TCP,8080:33192/TCP 2d22h # 清理svc $ kubectl delete svc nginx-service -n kube-system service "nginx-service" deleted ``` ### 3、清理ingress ``` # 获取ingress $ kubectl get ingress -A NAMESPACE NAME HOSTS ADDRESS PORTS AGE kube-system kubernetes-dashboard dashboard.test.com 80 2d22h kube-system nginx-ingress k8s.nginx.com 80 29m kube-system traefik-web-ui traefik-ui.test.com 80 2d22h # 清理ingress $ kubectl delete ingress nginx-ingress -n kube-system ingress.extensions "nginx-ingress" deleted ``` 参考资料: https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress使用 ================================================ FILE: components/ingress/4.ingress-https使用示例.md ================================================ # 一、ingress-https测试示例 1、TLS 认证 在现在大部分场景下面我们都会使用 https 来访问我们的服务,这节课我们将使用一个自签名的证书,当然你有在一些正规机构购买的 CA 证书是最好的,这样任何人访问你的服务的时候都是受浏览器信任的证书。使用下面的 openssl 命令生成 CA 证书: ``` mkdir -p /ssl-k8s/ cd /ssl-k8s/ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_k8s.key -out tls_k8s.crt -subj "/CN=hello.k8s.com" ``` 现在我们有了证书,我们可以使用 kubectl 创建一个 secret 对象来存储上面的证书:(这个需手动执行创建好) ``` kubectl create secret generic traefik-k8s --from-file=tls_k8s.crt --from-file=tls_k8s.key -n kube-system ``` ``` # vim /config/traefik.toml defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/ssl/tls_first.crt" KeyFile = "/ssl/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/tls_second.crt" KeyFile = "/ssl/tls_second.key" ``` ## 1、关键五个点: 注意这5个资源的namespace: kube-system需要一致 secret ---secret 对象来存储ssl证书 configmap ---configmap 用来保存一个或多个key/value信息 Deployment Service Ingress ## 2、合并创建secret,configmap以及traefik文件 ``` # vim traefik-controller-https.yaml --- apiVersion: v1 kind: ConfigMap metadata: name: traefik-conf namespace: kube-system data: traefik.toml: | insecureSkipVerify = true defaultEntryPoints = ["http", "https"] [entryPoints] 
[entryPoints.http] address = ":80" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/ssl/tls_first.crt" KeyFile = "/ssl/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/tls_second.crt" KeyFile = "/ssl/tls_second.key" --- kind: Deployment apiVersion: apps/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 volumes: - name: ssl secret: secretName: traefik-cert - name: config configMap: name: traefik-conf #nodeSelector: # node-role.kubernetes.io/traefik: "true" containers: - image: traefik:v1.7.12 imagePullPolicy: IfNotPresent name: traefik-ingress-lb volumeMounts: - mountPath: "/ssl" name: "ssl" - mountPath: "/config" name: "config" resources: limits: cpu: 1000m memory: 800Mi requests: cpu: 500m memory: 600Mi args: - --configfile=/config/traefik.toml - --api - --kubernetes - --logLevel=INFO securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE ports: - name: http containerPort: 80 hostPort: 80 - name: https containerPort: 443 hostPort: 443 --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP # 该端口为 traefik ingress-controller的服务端口 port: 80 # 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围 # 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问 nodePort: 23456 name: http - protocol: TCP # port: 443 nodePort: 23457 name: https - protocol: TCP # 该端口为 traefik 的管理WEB界面 port: 8080 name: admin type: NodePort --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - pods - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system ``` # 二、应用测试示例 ``` $ vim nginx-deployment-https.yaml --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: nginx-deployment namespace: kube-system spec: replicas: 2 template: metadata: labels: app: nginx-pod spec: containers: - name: nginx image: nginx:1.15.5 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: nginx-service namespace: kube-system annotations: traefik.ingress.kubernetes.io/load-balancer-method: drr #动态加权轮训调度 spec: template: metadata: labels: name: nginx-service spec: selector: app: nginx-pod ports: - port: 80 targetPort: 80 --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: nginx-ingress namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: k8s.nginx.com http: paths: - backend: serviceName: nginx-service servicePort: 80 tls: - secretName: traefik-k8s ``` ## 2、创建资源 ``` $ kubectl apply -f nginx-deployment-https.yaml deployment.apps/nginx-pod create service/nginx-service create ingress.extensions/nginx-ingress create ``` ## 
3. Access the newly created resources

First, find out which node the traefik-ingress pod was scheduled to. Here we see it landed on node 10.199.1.159, so we bind that node's public IP, assumed here to be 16.21.26.139:

```
16.21.26.139 k8s.nginx.com
```

```
$ kubectl get pod -A -o wide|grep traefik-ingress
kube-system   traefik-ingress-controller-7d454d7c68-8qpjq   1/1   Running   0   21h   10.46.2.10   10.199.1.159
```

![ingress test example 1](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-01.png)

## 4. Clean up the resources

### 1. Delete the deployment
```
# list the deployments
$ kubectl get deploy -A
NAMESPACE     NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   coredns                      2/2     2            2           3d
kube-system   heapster                     1/1     1            1           3d
kube-system   kubernetes-dashboard         1/1     1            1           3d
kube-system   metrics-server               1/1     1            1           3d
kube-system   nginx-pod                    2/2     2            2           25m
kube-system   traefik-ingress-controller   1/1     1            1           2d22h

# delete the deployment
$ kubectl delete deploy nginx-pod -n kube-system
deployment.extensions "nginx-pod" deleted
```

### 2. Delete the service
```
# list the services
$ kubectl get svc -A
NAMESPACE     NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                     AGE
default       kubernetes                ClusterIP   10.44.0.1       <none>        443/TCP                                     3d
kube-system   heapster                  ClusterIP   10.44.158.46    <none>        80/TCP                                      3d
kube-system   kube-dns                  ClusterIP   10.44.0.2       <none>        53/UDP,53/TCP,9153/TCP                      3d
kube-system   kubernetes-dashboard      NodePort    10.44.176.99    <none>        443:27008/TCP                               3d
kube-system   metrics-server            ClusterIP   10.44.40.157    <none>        443/TCP                                     3d
kube-system   nginx-service             ClusterIP   10.44.148.252   <none>        80/TCP                                      28m
kube-system   traefik-ingress-service   NodePort    10.44.67.195    <none>        80:23456/TCP,443:23457/TCP,8080:33192/TCP   2d22h

# delete the service
$ kubectl delete svc nginx-service -n kube-system
service "nginx-service" deleted
```

### 3. Delete the ingress
```
# list the ingresses
$ kubectl get ingress -A
NAMESPACE     NAME                   HOSTS                 ADDRESS   PORTS   AGE
kube-system   kubernetes-dashboard   dashboard.test.com              80      2d22h
kube-system   nginx-ingress          k8s.nginx.com                   80      29m
kube-system   traefik-web-ui         traefik-ui.test.com             80      2d22h

# delete the ingress
$ kubectl delete ingress nginx-ingress -n kube-system
ingress.extensions "nginx-ingress" deleted
```

References:

https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/  Kubernetes traefik ingress使用
http://docs.kubernetes.org.cn/558.html

================================================
FILE: components/ingress/5.hello-tls.md
================================================

# Certificate files

1. Generate the certificates
```
mkdir -p /ssl/{default,first,second}

cd /ssl/default/
openssl req -x509 -nodes -days 165 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=k8s.test.com"
kubectl -n kube-system create secret tls traefik-cert --key=tls.key --cert=tls.crt

cd /ssl/first/
openssl req -x509 -nodes -days 265 -newkey rsa:2048 -keyout tls_first.key -out tls_first.crt -subj "/CN=k8s.first.com"
kubectl create secret generic first-k8s --from-file=tls_first.crt --from-file=tls_first.key -n kube-system

cd /ssl/second/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_second.key -out tls_second.crt -subj "/CN=k8s.second.com"
kubectl create secret generic second-k8s --from-file=tls_second.crt --from-file=tls_second.key -n kube-system

# inspect the certificates
kubectl get secret traefik-cert first-k8s second-k8s -n kube-system
kubectl describe secret traefik-cert first-k8s second-k8s -n kube-system
```

2. Delete the certificates
```
$ kubectl delete secret traefik-cert first-k8s second-k8s -n kube-system
secret "second-k8s" deleted
secret "traefik-cert" deleted
secret "first-k8s" deleted
```

# Certificate configuration

1. Create the configMap (cm)
```
mkdir -p /config/
cd /config/

$ vim traefik.toml
defaultEntryPoints = ["http", "https"]
[entryPoints]
  [entryPoints.http]
  address = ":80"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.tls]
      [[entryPoints.https.tls.certificates]]
      CertFile =
"/ssl/default/tls.crt" KeyFile = "/ssl/default/tls.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/first/tls_first.crt" KeyFile = "/ssl/first/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/second/tls_second.crt" KeyFile = "/ssl/second/tls_second.key" $ kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system $ kubectl get configmap traefik-conf -n kube-system $ kubectl describe cm traefik-conf -n kube-system ``` 2、删除configMap(cm) ``` $ kubectl delete cm traefik-conf -n kube-system ``` # traefik-ingress-controller文件 1、创建文件 ``` $ vim traefik-controller-tls.yaml --- apiVersion: v1 kind: ConfigMap metadata: name: traefik-conf namespace: kube-system data: traefik.toml: | insecureSkipVerify = true defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/ssl/default/tls.crt" KeyFile = "/ssl/default/tls.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/first/tls_first.crt" KeyFile = "/ssl/first/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/second/tls_second.crt" KeyFile = "/ssl/second/tls_second.key" --- kind: Deployment apiVersion: apps/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 volumes: - name: ssl secret: secretName: traefik-cert - name: config configMap: name: traefik-conf #nodeSelector: # node-role.kubernetes.io/traefik: "true" containers: - image: traefik:v1.7.12 imagePullPolicy: IfNotPresent name: traefik-ingress-lb volumeMounts: - mountPath: "/ssl" name: "ssl" - mountPath: "/config" name: "config" resources: limits: cpu: 1000m memory: 800Mi requests: cpu: 500m memory: 600Mi args: - --configfile=/config/traefik.toml - --api - --kubernetes - --logLevel=INFO securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE ports: - name: http containerPort: 80 hostPort: 80 - name: https containerPort: 443 hostPort: 443 --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP # 该端口为 traefik ingress-controller的服务端口 port: 80 # 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围 # 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问 nodePort: 23456 name: http - protocol: TCP # port: 443 nodePort: 23457 name: https - protocol: TCP # 该端口为 traefik 的管理WEB界面 port: 8080 name: admin type: NodePort --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - pods - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system ``` 2、应用生效 ``` $ kubectl apply -f traefik-controller-tls.yaml configmap/traefik-conf created 
deployment.apps/traefik-ingress-controller created
service/traefik-ingress-service created
clusterrole.rbac.authorization.k8s.io/traefik-ingress-controller created
clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created
serviceaccount/traefik-ingress-controller created

# delete the resources
$ kubectl delete -f traefik-controller-tls.yaml
```

# Test the deployment and ingress
```
$ vim nginx-ingress-deploy.yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: kube-system
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.5
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: kube-system
  annotations:
    traefik.ingress.kubernetes.io/load-balancer-method: drr  # dynamic weighted round-robin scheduling
spec:
  selector:
    app: nginx-pod
  ports:
  - port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx-ingress
  namespace: kube-system
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  tls:
  - secretName: first-k8s
  - secretName: second-k8s
  rules:
  - host: k8s.first.com
    http:
      paths:
      - backend:
          serviceName: nginx-service
          servicePort: 80
  - host: k8s.second.com
    http:
      paths:
      - backend:
          serviceName: nginx-service
          servicePort: 80

$ kubectl apply -f nginx-ingress-deploy.yaml
$ kubectl delete -f nginx-ingress-deploy.yaml
```

================================================
FILE: components/ingress/6.ingress-https使用示例.md
================================================

# ingress-https test example

# I. Certificate files

## 1. TLS authentication

These days most services are accessed over https, so this section uses a self-signed certificate. A CA certificate bought from a recognized authority is of course better, since anyone visiting your service then sees a browser-trusted certificate. Generate the certificate with the following openssl command:

```
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=hello.test.com"
```

Now that we have the certificate, we can use kubectl to create a secret object to store it (create this manually beforehand):

```
kubectl -n kube-system create secret tls traefik-cert --key=tls.key --cert=tls.crt
```

## 2. Create multiple certificates
```
mkdir -p /ssl/{default,first,second}

cd /ssl/default/
openssl req -x509 -nodes -days 165 -newkey rsa:2048 -keyout tls_default.key -out tls_default.crt -subj "/CN=k8s.test.com"
kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt

cd /ssl/first/
openssl req -x509 -nodes -days 265 -newkey rsa:2048 -keyout tls_first.key -out tls_first.crt -subj "/CN=k8s.first.com"
kubectl -n kube-system create secret tls first-k8s --key=tls_first.key --cert=tls_first.crt

cd /ssl/second/
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_second.key -out tls_second.crt -subj "/CN=k8s.second.com"
kubectl -n kube-system create secret tls second-k8s --key=tls_second.key --cert=tls_second.crt

# inspect the certificates
kubectl get secret traefik-cert first-k8s second-k8s -n kube-system
kubectl describe secret traefik-cert first-k8s second-k8s -n kube-system
```

## 3. Delete the certificates
```
$ kubectl delete secret traefik-cert first-k8s second-k8s -n kube-system
secret "second-k8s" deleted
secret "traefik-cert" deleted
secret "first-k8s" deleted
```

## 4. Five key points
```
All 5 of these resources must share the namespace: kube-system
secret     --- a secret object stores the ssl certificates
configmap  --- a configmap holds one or more key/value entries
Deployment
Service
Ingress
```

# II. Certificate configuration

## 1. Create the configMap (cm)
```
mkdir -p /config/
cd /config/

$ vim traefik.toml
defaultEntryPoints = ["http", "https"]
[entryPoints]
  [entryPoints.http]
  address = ":80"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.tls]
      [[entryPoints.https.tls.certificates]]
      CertFile = "/ssl/default/tls_default.crt"
KeyFile = "/ssl/default/tls_default.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/first/tls_first.crt" KeyFile = "/ssl/first/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/second/tls_second.crt" KeyFile = "/ssl/second/tls_second.key" $ kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system $ kubectl get configmap traefik-conf -n kube-system $ kubectl describe cm traefik-conf -n kube-system ``` ## 2、删除configMap(cm) ``` $ kubectl delete cm traefik-conf -n kube-system ``` # 三、traefik-ingress-controller控制文件 ## 1、创建文件 ``` $ cd /config/ $ vim traefik-controller-tls.yaml --- apiVersion: v1 kind: ConfigMap metadata: name: traefik-conf namespace: kube-system data: traefik.toml: | insecureSkipVerify = true defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/ssl/default/tls.crt" KeyFile = "/ssl/default/tls.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/first/tls_first.crt" KeyFile = "/ssl/first/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/ssl/second/tls_second.crt" KeyFile = "/ssl/second/tls_second.key" --- kind: Deployment apiVersion: apps/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 volumes: - name: ssl secret: secretName: traefik-cert - name: config configMap: name: traefik-conf #nodeSelector: # node-role.kubernetes.io/traefik: "true" tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: 10.198.1.156 #指定traefik-ingress-controller跑在这个node节点上面 containers: - image: traefik:v1.7.12 imagePullPolicy: IfNotPresent name: traefik-ingress-lb volumeMounts: - mountPath: "/ssl" name: "ssl" - mountPath: "/config" name: "config" resources: limits: cpu: 1000m memory: 800Mi requests: cpu: 500m memory: 600Mi args: - --configfile=/config/traefik.toml - --api - --kubernetes - --logLevel=INFO securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE ports: - name: http containerPort: 80 hostPort: 80 - name: https containerPort: 443 hostPort: 443 --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP # 该端口为 traefik ingress-controller的服务端口 port: 80 # 集群hosts文件中设置的 NODE_PORT_RANGE 作为 NodePort的可用范围 # 从默认20000~40000之间选一个可用端口,让ingress-controller暴露给外部的访问 nodePort: 23456 name: http - protocol: TCP port: 443 nodePort: 23457 name: https - protocol: TCP # 该端口为 traefik 的管理WEB界面 port: 8080 name: admin type: NodePort --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - pods - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: 
traefik-ingress-controller namespace: kube-system ``` ## 2、应用生效 ``` $ kubectl apply -f traefik-controller-tls.yaml configmap/traefik-conf created deployment.apps/traefik-ingress-controller created service/traefik-ingress-service created clusterrole.rbac.authorization.k8s.io/traefik-ingress-controller created clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created serviceaccount/traefik-ingress-controller created #删除资源 $ kubectl delete -f traefik-controller-tls.yaml ``` # 四、命令行创建 https ingress 例子 ``` # 创建示例应用 $ kubectl run test-hello --image=nginx:alpine --port=80 --expose -n kube-system # 删除示例应用(kubectl run 默认创建的是deployment资源应用 ) $ kubectl delete deployment test-hello -n kube-system $ kubectl delete svc test-hello -n kube-system # hello-tls-ingress 示例 $ cd /config/ $ vim hello-tls.ing.yaml apiVersion: extensions/v1beta1 kind: Ingress metadata: name: hello-tls-ingress namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: k8s.test.com http: paths: - backend: serviceName: test-hello servicePort: 80 tls: - secretName: traefik-cert # 创建 https ingress $ kubectl apply -f /config/hello-tls.ing.yaml # 注意根据hello示例,需要在kube-system命名空间创建对应的secret: traefik-cert(这步在开篇已经创建了,无须再创建) $ kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt # 删除 https ingress $ kubectl delete -f /config/hello-tls.ing.yaml ``` #测试访问(找到traefik-controller pod运行在哪个node节点上,然后绑定该节点的IP,然后访问该url) https://k8s.test.com:23457 ![ingress测试](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-02.png) # 五、测试deployment和ingress ``` $ vim nginx-ingress-deploy.yaml --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: nginx-deployment namespace: kube-system spec: replicas: 2 template: metadata: labels: app: nginx-pod spec: containers: - name: nginx image: nginx:1.15.5 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: nginx-service namespace: kube-system annotations: traefik.ingress.kubernetes.io/load-balancer-method: drr #动态加权轮训调度 spec: template: metadata: labels: name: nginx-service spec: selector: app: nginx-pod ports: - port: 80 targetPort: 80 --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: nginx-ingress namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: tls: - secretName: first-k8s - secretName: second-k8s rules: - host: k8s.first.com http: paths: - backend: serviceName: nginx-service servicePort: 80 - host: k8s.second.com http: paths: - backend: serviceName: nginx-service servicePort: 80 $ kubectl apply -f nginx-ingress-deploy.yaml $ kubectl delete -f nginx-ingress-deploy.yaml ``` #访问测试 https://k8s.first.com:23457/ ![ingress测试](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-03.png) https://k8s.second.com:23457/ ![ingress测试](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-04.png) 参考资料: https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress使用 http://docs.kubernetes.org.cn/558.html ================================================ FILE: components/ingress/README.md ================================================ 参考资料: https://segmentfault.com/a/1190000019908991 k8s ingress原理及ingress-nginx部署测试 https://www.cnblogs.com/tchua/p/11174386.html Kubernetes集群Ingress高可用部署 ================================================ FILE: components/ingress/nginx-ingress/README.md ================================================ 
================================================ FILE: components/ingress/traefik-ingress/1.traefik反向代理Deamonset模式.md ================================================ # 一、Deamonset方式部署traefik-controller-ingress https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-ds.yaml 这里使用的DaemonSet,只是用traefik-ds.yaml ,traefik-rbac.yaml , ui.yaml ```bash kubectl delete -f traefik-ds.yaml rm -f ./traefik-ds.yaml cat >traefik-ds.yaml<<\EOF --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system --- kind: DaemonSet apiVersion: apps/v1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 #=======添加nodeSelector信息:只在master节点创建======= tolerations: - operator: "Exists" nodeSelector: kubernetes.io/role: master #默认master是不允许被调度的,加上tolerations后允许被调度,然后这里使用自身机器master的地址,可以使用kubectl get nodes --show-labels来查看 #=================================================== containers: - image: traefik:v1.7 name: traefik-ingress-lb ports: - name: http containerPort: 80 hostPort: 80 - name: admin containerPort: 8080 hostPort: 8080 securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE args: - --api - --kubernetes - --logLevel=INFO --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP port: 80 name: web - protocol: TCP port: 8080 name: admin EOF kubectl apply -f traefik-ds.yaml ``` # 二、traefik-rbac配置 https://github.com/containous/traefik/blob/v1.7/examples/k8s/traefik-rbac.yaml ``` kubectl delete -f traefik-rbac.yaml rm -f ./traefik-rbac.yaml cat >traefik-rbac.yaml<<\EOF --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses/status verbs: - update --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- EOF kubectl apply -f traefik-rbac.yaml ``` # 三、traefik-ui使用traefik进行代理 https://github.com/containous/traefik/blob/v1.7/examples/k8s/ui.yaml 1、代理方式一 ```bash kubectl delete -f ui.yaml rm -f ./ui.yaml cat >ui.yaml<<\EOF --- apiVersion: v1 kind: Service metadata: name: traefik-web-ui namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - name: web port: 80 targetPort: 8080 --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: traefik-web-ui namespace: kube-system spec: rules: - host: traefik-ui.devops.com http: paths: - path: / backend: serviceName: traefik-web-ui servicePort: web --- EOF kubectl apply -f ui.yaml ``` 2、代理方式二 ``` kubectl delete -f ui.yaml rm -f ./ui.yaml cat >ui.yaml<<\EOF --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP # 该端口为 traefik ingress-controller的服务端口 port: 80 name: web - protocol: TCP # 该端口为 
traefik 的管理WEB界面 port: 8080 name: admin --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: traefik-web-ui namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: traefik-ui.devops.com http: paths: - backend: serviceName: traefik-ingress-service #servicePort: 8080 servicePort: admin #跟上面service的name对应 --- EOF kubectl apply -f ui.yaml ``` # 四、访问测试 `http://traefik-ui.devops.com` # 五、汇总 ``` kubectl delete -f all-ds.yaml rm -f ./all-ds.yaml cat >all-ds.yaml<<\EOF --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system --- kind: DaemonSet apiVersion: apps/v1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 #=======添加nodeSelector信息:只在master节点创建======= tolerations: - operator: "Exists" nodeSelector: kubernetes.io/role: master #默认master是不允许被调度的,加上tolerations后允许被调度,然后这里使用自身机器master的地址,可以使用kubectl get nodes --show-labels来查看 #=================================================== containers: - image: traefik:v1.7 name: traefik-ingress-lb ports: - name: http containerPort: 80 hostPort: 80 - name: admin containerPort: 8080 hostPort: 8080 securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE args: - --api - --kubernetes - --logLevel=INFO --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP port: 80 name: web - protocol: TCP port: 8080 name: admin --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses/status verbs: - update --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- apiVersion: v1 kind: Service metadata: name: traefik-web-ui namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - name: web port: 80 targetPort: 8080 --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: traefik-web-ui namespace: kube-system spec: rules: - host: traefik-ui.devops.com http: paths: - path: / backend: serviceName: traefik-web-ui servicePort: web EOF kubectl apply -f all-ds.yaml ``` 参考资料: https://blog.csdn.net/oyym_mv/article/details/86986510 kubernetes使用traefik作为反向代理(Deamonset模式) https://www.cnblogs.com/twodoge/p/11663006.html 第二个坑新版本的 apps.v1 API需要在yaml文件中,selector变为必选项 ================================================ FILE: components/ingress/traefik-ingress/2.traefik反向代理Deamonset模式TLS.md ================================================ # Ingress-Https测试示例 # 一、证书文件 ## 1、TLS 认证  在现在大部分场景下面我们都会使用 https 来访问我们的服务,这节课我们将使用一个自签名的证书,当然你有在一些正规机构购买的 CA 证书是最好的,这样任何人访问你的服务的时候都是受浏览器信任的证书。使用下面的 openssl 命令生成 CA 证书: ```bash rm -rf /etc/certs/ssl/ mkdir -p /etc/certs/ssl/ cd /etc/certs/ssl/ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj 
"/CN=hello.test.com" ```  现在我们有了证书,我们可以使用 kubectl 创建一个 secret 对象来存储上面的证书:(这个需手动执行创建好) ```bash kubectl -n kube-system create secret tls traefik-cert --key=tls.key --cert=tls.crt ``` ## 2、多证书创建 ```bash kubectl delete secret traefik-cert first-k8s second-k8s -n kube-system rm -rf /etc/certs/ssl/ mkdir -p /etc/certs/ssl/{default,first,second} cd /etc/certs/ssl/default/ openssl req -x509 -nodes -days 165 -newkey rsa:2048 -keyout tls_default.key -out tls_default.crt -subj "/CN=*.devops.com" kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt cd /etc/certs/ssl/first/ openssl req -x509 -nodes -days 265 -newkey rsa:2048 -keyout tls_first.key -out tls_first.crt -subj "/CN=k8s.first.com" kubectl -n kube-system create secret tls first-k8s --key=tls_first.key --cert=tls_first.crt cd /etc/certs/ssl/second/ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls_second.key -out tls_second.crt -subj "/CN=k8s.second.com" kubectl -n kube-system create secret tls second-k8s --key=tls_second.key --cert=tls_second.crt #查看证书 kubectl get secret traefik-cert first-k8s second-k8s -n kube-system kubectl describe secret traefik-cert first-k8s second-k8s -n kube-system ``` ## 3、关键5个点 ```bash 注意这5个资源的namespace: kube-system 需要一致 secret ---secret 对象来存储ssl证书 configmap ---configmap 用来保存一个或多个key/value信息 Deployment or DaemonSet Service Ingress ``` # 二、证书配置,创建configMap(cm) 1、http和https并存 ```bash kubectl delete cm traefik-conf -n kube-system rm -rf /etc/certs/config/ mkdir -p /etc/certs/config/ cd /etc/certs/config/ cat >traefik.toml<<\EOF # 设置insecureSkipVerify = true,可以配置backend为443(比如dashboard)的ingress规则 insecureSkipVerify = true defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/etc/certs/ssl/default/tls_default.crt" KeyFile = "/etc/certs/ssl/default/tls_default.key" [[entryPoints.https.tls.certificates]] CertFile = "/etc/certs/ssl/first/tls_first.crt" KeyFile = "/etc/certs/ssl/first/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/etc/certs/ssl/second/tls_second.crt" KeyFile = "/etc/certs/ssl/second/tls_second.key" EOF kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system kubectl get configmap traefik-conf -n kube-system kubectl describe cm traefik-conf -n kube-system ``` 2、http跳转到https ```bash kubectl delete cm traefik-conf -n kube-system rm -rf /etc/certs/config/ mkdir -p /etc/certs/config/ cd /etc/certs/config/ cat >traefik.toml<<\EOF # 指定了 "traefik" 在访问 "https" 后端时可以忽略TLS证书验证错误,从而使得 "https" 的后端,可以像http后端一样直接通过 "traefik" 透出,如kubernetes dashboard insecureSkipVerify = true # defaultEntryPoints = ["http", "https"] [entryPoints] [entryPoints.http] address = ":80" [entryPoints.http.redirect] entryPoint = "https" [entryPoints.https] address = ":443" [entryPoints.https.tls] [[entryPoints.https.tls.certificates]] CertFile = "/etc/certs/ssl/default/tls_default.crt" KeyFile = "/etc/certs/ssl/default/tls_default.key" [[entryPoints.https.tls.certificates]] CertFile = "/etc/certs/ssl/first/tls_first.crt" KeyFile = "/etc/certs/ssl/first/tls_first.key" [[entryPoints.https.tls.certificates]] CertFile = "/etc/certs/ssl/second/tls_second.crt" KeyFile = "/etc/certs/ssl/second/tls_second.key" EOF kubectl create configmap traefik-conf --from-file=traefik.toml -n kube-system kubectl get configmap traefik-conf -n kube-system kubectl describe cm traefik-conf -n kube-system ``` # 
三、traefik-ingress-controller控制文件 ## 1、创建文件 ``` kubectl delete -f traefik-controller-tls.yaml rm -f ./traefik-controller-tls.yaml cat >traefik-controller-tls.yaml<<\EOF --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system --- kind: DaemonSet apiVersion: extensions/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 hostNetwork: true #添加该字段让docker使用物理机网络,在物理机暴露服务端口(80),注意物理机80端口提前不能被占用 dnsPolicy: ClusterFirstWithHostNet #使用hostNetwork后容器会使用物理机网络包括DNS,会无法解析内部service,使用此参数让容器使用K8S的DNS volumes: - name: ssl secret: secretName: traefik-cert - name: config configMap: name: traefik-conf #=======添加nodeSelector信息:只在master节点创建======= tolerations: - key: node-role.kubernetes.io/master operator: "Equal" value: "" effect: NoSchedule nodeSelector: node-role.kubernetes.io/master: "" #=================================================== containers: - image: traefik:v1.7.12 name: traefik-ingress-lb volumeMounts: - mountPath: "/etc/certs/ssl" name: "ssl" - mountPath: "/etc/certs/config" name: "config" resources: limits: cpu: 1000m memory: 800Mi requests: cpu: 500m memory: 600Mi ports: - name: http containerPort: 80 hostPort: 80 - name: https containerPort: 443 hostPort: 443 - name: admin containerPort: 8080 hostPort: 8080 securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE args: - --configfile=/etc/certs/config/traefik.toml - --api - --kubernetes - --logLevel=INFO --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses/status verbs: - update --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP port: 80 name: http - protocol: TCP port: 443 name: https - protocol: TCP port: 8080 name: admin --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: traefik-web-ui namespace: kube-system spec: tls: - secretName: traefik-cert rules: - host: traefik-ui.devops.com http: paths: - path: / backend: serviceName: traefik-ingress-service servicePort: admin --- EOF kubectl apply -f traefik-controller-tls.yaml ``` ## 2、删除资源 ``` kubectl delete -f traefik-controller-tls.yaml ``` # 四、命令行创建 https ingress 例子 ``` # 创建示例应用 $ kubectl run test-hello --image=nginx:alpine --port=80 --expose -n kube-system # 删除示例应用(kubectl run 默认创建的是deployment资源应用 ) $ kubectl delete deployment test-hello -n kube-system $ kubectl delete svc test-hello -n kube-system # hello-tls-ingress 示例 $ cd /config/ $ vim hello-tls.ing.yaml apiVersion: extensions/v1beta1 kind: Ingress metadata: name: hello-tls-ingress namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: k8s.test.com http: paths: - backend: serviceName: test-hello servicePort: 80 tls: 
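# TLS terminates at traefik using the traefik-cert secret created in section one of this file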
- secretName: traefik-cert # 创建 https ingress $ kubectl apply -f /config/hello-tls.ing.yaml # 注意根据hello示例,需要在kube-system命名空间创建对应的secret: traefik-cert(这步在开篇已经创建了,无须再创建) $ kubectl -n kube-system create secret tls traefik-cert --key=tls_default.key --cert=tls_default.crt # 删除 https ingress $ kubectl delete -f /config/hello-tls.ing.yaml ``` #测试访问(找到traefik-controller pod运行在哪个node节点上,然后绑定该节点的IP,然后访问该url) https://k8s.test.com:23457 ![ingress测试](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-02.png) # 五、测试deployment和ingress ``` $ vim nginx-ingress-deploy.yaml --- apiVersion: apps/v1beta1 kind: Deployment metadata: name: nginx-deployment namespace: kube-system spec: replicas: 2 template: metadata: labels: app: nginx-pod spec: containers: - name: nginx image: nginx:1.15.5 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: nginx-service namespace: kube-system annotations: traefik.ingress.kubernetes.io/load-balancer-method: drr #动态加权轮训调度 spec: template: metadata: labels: name: nginx-service spec: selector: app: nginx-pod ports: - port: 80 targetPort: 80 --- apiVersion: extensions/v1beta1 kind: Ingress metadata: name: nginx-ingress namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: tls: - secretName: first-k8s - secretName: second-k8s rules: - host: k8s.first.com http: paths: - backend: serviceName: nginx-service servicePort: 80 - host: k8s.second.com http: paths: - backend: serviceName: nginx-service servicePort: 80 $ kubectl apply -f nginx-ingress-deploy.yaml $ kubectl delete -f nginx-ingress-deploy.yaml ``` #访问测试 https://k8s.first.com:23457/ ![ingress测试](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-03.png) https://k8s.second.com:23457/ ![ingress测试](https://github.com/Lancger/opsfull/blob/master/images/ingress-k8s-04.png) 参考资料: https://xuchao918.github.io/2019/03/01/Kubernetes-traefik-ingress%E4%BD%BF%E7%94%A8/ Kubernetes traefik ingress使用 http://docs.kubernetes.org.cn/558.html ================================================ FILE: components/ingress/traefik-ingress/README.md ================================================ ================================================ FILE: components/ingress/常用操作.md ================================================ ``` [root@master ingress]# kubectl get ingress -A NAMESPACE NAME HOSTS ADDRESS PORTS AGE default nginx-ingress k8s.nginx.com 80 40m kube-system kubernetes-dashboard dashboard.test.com 80 2d21h kube-system traefik-web-ui traefik-ui.test.com 80 2d21h [root@master ingress]# kubectl delete ingress hello-tls-ingress ingress.extensions "hello-tls-ingress" deleted ``` # 1、rbac.yaml 首先,为安全起见我们这里使用 RBAC 安全认证方式:(rbac.yaml) ``` mkdir -p /data/components/ingress cat > /data/components/ingress/rbac.yaml << \EOF --- apiVersion: v1 kind: ServiceAccount metadata: name: traefik-ingress-controller namespace: kube-system --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller rules: - apiGroups: - "" resources: - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: traefik-ingress-controller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: traefik-ingress-controller subjects: - kind: ServiceAccount name: traefik-ingress-controller namespace: kube-system EOF kubectl create -f /data/components/ingress/rbac.yaml ``` # 2、traefik.yaml 然后使用 
Deployment 来管理 Pod,直接使用官方的 traefik 镜像部署即可(traefik.yaml) ``` cat > /data/components/ingress/traefik.yaml << \EOF --- kind: Deployment apiVersion: extensions/v1beta1 metadata: name: traefik-ingress-controller namespace: kube-system labels: k8s-app: traefik-ingress-lb spec: replicas: 1 selector: matchLabels: k8s-app: traefik-ingress-lb template: metadata: labels: k8s-app: traefik-ingress-lb name: traefik-ingress-lb spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 60 tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: linux-node1.example.com #默认master是不允许被调度的,加上tolerations后允许被调度 containers: - image: traefik name: traefik-ingress-lb ports: - name: http containerPort: 80 - name: admin containerPort: 8080 args: - --api - --kubernetes - --logLevel=INFO --- kind: Service apiVersion: v1 metadata: name: traefik-ingress-service namespace: kube-system spec: selector: k8s-app: traefik-ingress-lb ports: - protocol: TCP port: 80 name: web - protocol: TCP port: 8080 name: admin type: NodePort EOF kubectl create -f /data/components/ingress/traefik.yaml kubectl apply -f /data/components/ingress/traefik.yaml ``` ``` 要注意上面 yaml 文件: tolerations: - operator: "Exists" nodeSelector: kubernetes.io/hostname: master 由于我们这里的特殊性,只有 master 节点有外网访问权限,所以我们使用nodeSelector标签将traefik的固定调度到master这个节点上,那么上面的tolerations是干什么的呢?这个是因为我们集群使用的 kubeadm 安装的,master 节点默认是不能被普通应用调度的,要被调度的话就需要添加这里的 tolerations 属性,当然如果你的集群和我们的不太一样,直接去掉这里的调度策略就行。 nodeSelector 和 tolerations 都属于 Pod 的调度策略,在后面的课程中会为大家讲解。 ``` # 3、traefik-ui traefik 还提供了一个 web ui 工具,就是上面的 8080 端口对应的服务,为了能够访问到该服务,我们这里将服务设置成的 NodePort ``` root># kubectl get pods -n kube-system -l k8s-app=traefik-ingress-lb -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES traefik-ingress-controller-5b58d5c998-6dn97 1/1 Running 0 88s 10.244.0.2 linux-node1.example.com root># kubectl get svc -n kube-system|grep traefik-ingress-service traefik-ingress-service NodePort 10.102.214.49 80:32472/TCP,8080:32482/TCP 44s 现在在浏览器中输入 master_node_ip:32303 就可以访问到 traefik 的 dashboard 了 ``` http://192.168.56.11:32482/dashboard/ # 4、Ingress 对象 现在我们是通过 NodePort 来访问 traefik 的 Dashboard 的,那怎样通过 ingress 来访问呢? 首先,需要创建一个 ingress 对象:(ingress.yaml) ``` cat > /data/components/ingress/ingress.yaml <<\EOF apiVersion: extensions/v1beta1 kind: Ingress metadata: name: traefik-web-ui namespace: kube-system annotations: kubernetes.io/ingress.class: traefik spec: rules: - host: traefik.k8s.com http: paths: - backend: serviceName: traefik-ingress-service #servicePort: 8080 servicePort: admin #这里建议使用servicePort: admin,这样就避免端口的调整 EOF kubectl create -f /data/components/ingress/ingress.yaml kubectl apply -f /data/components/ingress/ingress.yaml 要注意上面的 ingress 对象的规则,特别是 rules 区域,我们这里是要为 traefik 的 dashboard 建立一个 ingress 对象,所以这里的 serviceName 对应的是上面我们创建的 traefik-ingress-service,端口也要注意对应 8080 端口,为了避免端口更改,这里的 servicePort 的值也可以替换成上面定义的 port 的名字:admin ``` 创建完成后,我们应该怎么来测试呢? 
``` 第一步,在本地的/etc/hosts里面添加上 traefik.k8s.com 与 master 节点外网 IP 的映射关系 第二步,在浏览器中访问:http://traefik.k8s.com 我们会发现并没有得到我们期望的 dashboard 界面,这是因为我们上面部署 traefik 的时候使用的是 NodePort 这种 Service 对象,所以我们只能通过上面的 32482 端口访问到我们的目标对象:http://traefik.k8s.com:32482 加上端口后我们发现可以访问到 dashboard 了,而且在 dashboard 当中多了一条记录,正是上面我们创建的 ingress 对象的数据,我们还可以切换到 HEALTH 界面中,可以查看当前 traefik 代理的服务的整体的健康状态 第三步,上面我们可以通过自定义域名加上端口可以访问我们的服务了,但是我们平时服务别人的服务是不是都是直接用的域名啊,http 或者 https 的,几乎很少有在域名后面加上端口访问的吧?为什么?太麻烦啊,端口也记不住,要解决这个问题,怎么办,我们只需要把我们上面的 traefik 的核心应用的端口隐射到 master 节点上的 80 端口,是不是就可以了,因为 http 默认就是访问 80 端口,但是我们在 Service 里面是添加的一个 NodePort 类型的服务,没办法映射 80 端口,怎么办?这里就可以直接在 Pod 中指定一个 hostPort 即可,更改上面的 traefik.yaml 文件中的容器端口: containers: - image: traefik name: traefik-ingress-lb ports: - name: http containerPort: 80 hostPort: 80 #新增这行 - name: admin containerPort: 8080 添加以后hostPort: 80,然后更新应用 kubectl apply -f traefik.yaml 更新完成后,这个时候我们在浏览器中直接使用域名方法测试下 http://traefik.k8s.com 第四步,正常来说,我们如果有自己的域名,我们可以将我们的域名添加一条 DNS 记录,解析到 master 的外网 IP 上面,这样任何人都可以通过域名来访问我的暴露的服务了。 如果你有多个边缘节点的话,可以在每个边缘节点上部署一个 ingress-controller 服务,然后在边缘节点前面挂一个负载均衡器,比如 nginx,将所有的边缘节点均作为这个负载均衡器的后端,这样就可以实现 ingress-controller 的高可用和负载均衡了。 ``` # 5、ingress tls 上节课给大家展示了 traefik 的安装使用以及简单的 ingress 的配置方法,这节课我们来学习一下 ingress tls 以及 path 路径在 ingress 对象中的使用方法。 1、TLS 认证 在现在大部分场景下面我们都会使用 https 来访问我们的服务,这节课我们将使用一个自签名的证书,当然你有在一些正规机构购买的 CA 证书是最好的,这样任何人访问你的服务的时候都是受浏览器信任的证书。使用下面的 openssl 命令生成 CA 证书: ``` openssl req -newkey rsa:2048 -nodes -keyout tls.key -x509 -days 365 -out tls.crt ``` 现在我们有了证书,我们可以使用 kubectl 创建一个 secret 对象来存储上面的证书: ``` kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system ``` 3、配置 Traefik 前面我们使用的是 Traefik 的默认配置,现在我们来配置 Traefik,让其支持 https: ================================================ FILE: components/initContainers/README.md ================================================ 参考资料: https://www.cnblogs.com/yanh0606/p/11395920.html Kubernetes的初始化容器initContainers https://www.jianshu.com/p/e57c3e17ce8c 理解 Init 容器 ================================================ FILE: components/job/README.md ================================================ 参考资料: https://www.jianshu.com/p/bd6cd1b4e076 Kubernetes对象之Job https://www.cnblogs.com/lvcisco/p/9670100.html k8s Job、Cronjob 的使用 ================================================ FILE: components/k8s-monitor/README.md ================================================ ``` # 1、持久化监控数据 cat > prometheus-class.yaml <<-EOF apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: fast provisioner: fuseim.pri/ifs # or choose another name, must match deployment's env PROVISIONER_NAME' parameters: archiveOnDelete: "true" EOF #部署class.yaml kubectl apply -f prometheus-class.yaml #查看创建的storageclass kubectl get sc #2、修改 Prometheus 持久化 prometheus是一种 StatefulSet 有状态集的部署模式,所以直接将 StorageClass 配置到里面,在下面的yaml中最下面添加持久化配置 #cat prometheus/prometheus-prometheus.yaml apiVersion: monitoring.coreos.com/v1 kind: Prometheus metadata: labels: prometheus: k8s name: k8s namespace: monitoring spec: alerting: alertmanagers: - name: alertmanager-main namespace: monitoring port: web baseImage: quay.io/prometheus/prometheus nodeSelector: kubernetes.io/os: linux podMonitorSelector: {} replicas: 2 resources: requests: memory: 400Mi ruleSelector: matchLabels: prometheus: k8s role: alert-rules securityContext: fsGroup: 2000 runAsNonRoot: true runAsUser: 1000 serviceAccountName: prometheus-k8s serviceMonitorNamespaceSelector: {} serviceMonitorSelector: {} version: v2.11.0 storage: #----添加持久化配置,指定StorageClass为上面创建的fast volumeClaimTemplate: spec: 
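# one PVC per Prometheus replica is provisioned from the StorageClass named below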
storageClassName: fast #---指定为fast resources: requests: storage: 300Gi kubectl apply -f prometheus/prometheus-prometheus.yaml #3、修改 Grafana 持久化配置 由于 Grafana 是部署模式为 Deployment,所以我们提前为其创建一个 grafana-pvc.yaml 文件,加入下面 PVC 配置。 #vim grafana-pvc.yaml kind: PersistentVolumeClaim apiVersion: v1 metadata: name: grafana namespace: monitoring #---指定namespace为monitoring spec: storageClassName: fast #---指定StorageClass为上面创建的fast accessModes: - ReadWriteOnce resources: requests: storage: 200Gi kubectl apply -f grafana-pvc.yaml #vim grafana/grafana-deployment.yaml ...... volumes: - name: grafana-storage #-------新增持久化配置 persistentVolumeClaim: claimName: grafana #-------设置为创建的PVC名称 #- emptyDir: {} #-------注释掉旧的配置 # name: grafana-storage - name: grafana-datasources secret: secretName: grafana-datasources - configMap: name: grafana-dashboards name: grafana-dashboards ...... kubectl apply -f grafana/grafana-deployment.yaml ``` 参考资料: https://www.cnblogs.com/skyflask/articles/11410063.html kubernetes监控方案--cAdvisor+Heapster+InfluxDB+Grafana https://www.cnblogs.com/skyflask/p/11480988.html kubernetes监控终极方案-kube-promethues http://www.mydlq.club/article/10/#wow1 Kube-promethues监控k8s集群 https://jicki.me/docker/kubernetes/2019/07/22/kube-prometheus/ Coreos kube-prometheus 监控 ================================================ FILE: components/kube-proxy/README.md ================================================ # Kube-Proxy简述 ``` 运行在每个节点上,监听 API Server 中服务对象的变化,再通过管理 IPtables 来实现网络的转发 Kube-Proxy 目前支持三种模式: UserSpace k8s v1.2 后就已经淘汰 IPtables 目前默认方式 IPVS 需要安装ipvsadm、ipset 工具包和加载 ip_vs 内核模块 ``` 参考资料: https://ywnz.com/linuxyffq/2530.html 解析从外部访问Kubernetes集群中应用的几种方法 https://www.jianshu.com/p/b2d13cec7091 浅谈 k8s service&kube-proxy https://www.codercto.com/a/90806.html 探究K8S Service内部iptables路由规则 https://blog.51cto.com/goome/2369150 k8s实践7:ipvs结合iptables使用过程分析 https://blog.csdn.net/xinghun_4/article/details/50492041 kubernetes中port、target port、node port的对比分析,以及kube-proxy代理 ================================================ FILE: components/nfs/README.md ================================================ ================================================ FILE: components/pressure/README.md ================================================ # 一、生产大规模集群,网络组件选择 如果用calico-RR反射器这种模式,保证性能的情况下大概能支撑好多个节点? 
RR反射器还分为两种 可以由calico的节点服务承载 也可以是直接的物理路由器做RR 超大规模Calico如果全以BGP来跑没什么问题 只是要做好网络地址规划 即便是不同集群容器地址也不能重叠 # 二、flanel网络组件压测 ``` flannel受限于cpu压力 ``` ![k8s网络组件flannel压测](https://github.com/Lancger/opsfull/blob/master/images/pressure_flannel_01.png) # 三、calico网络组件压测 ``` calico则轻轻松松与宿主机性能相差无几 如果单单一个集群 节点数超级多 如果不做BGP路由聚合 物理路由器或三层交换机会扛不住的 ``` ![k8s网络组件calico压测](https://github.com/Lancger/opsfull/blob/master/images/pressure_calico_01.png) # 四、calico网络和宿主机压测 ![k8s网络组件压测](https://github.com/Lancger/opsfull/blob/master/images/pressure_physical_01.png) ================================================ FILE: components/pressure/calico bgp网络需要物理路由和交换机支持吗.md ================================================ ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_01.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_02.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_03.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_04.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_05.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_06.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_07.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_08.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_09.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_10.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_11.png) ![](https://github.com/Lancger/opsfull/blob/master/images/calico_bgp_12.png) ================================================ FILE: components/pressure/k8s集群更换网段方案.md ================================================ ``` 1、服务器IP更换网段 有什么解决方案吗?不重新搭建集群的话? 
方案一: 改监听地址,重做集群证书 不然还真不好搞的 方案二: 如果etcd一开始是静态的 那就不好玩了 得一开始就是基于dns discovery方式 简明扼要的说 就是但凡涉及IP地址的地方 全部用fqdn 无论是证书还是配置文件 这四句话核心就够了 etcd官方本来就有正式文档讲dns discovery部署 只是k8s部分,官方部署没有提 ``` ![](https://github.com/Lancger/opsfull/blob/master/images/change_ip_01.png) ![](https://github.com/Lancger/opsfull/blob/master/images/change_ip_02.png) ![](https://github.com/Lancger/opsfull/blob/master/images/change_ip_05.png) ![](https://github.com/Lancger/opsfull/blob/master/images/change_ip_06.png) 来自: 广大群友讨论集锦 https://github.com/etcd-io/etcd/blob/a4018f25c91fff8f4f15cd2cee9f026650c7e688/Documentation/clustering.md#dns-discovery ================================================ FILE: docs/Envoy的架构与基本术语.md ================================================ 参考文档: https://jimmysong.io/kubernetes-handbook/usecases/envoy-terminology.html Envoy 的架构与基本术语 ================================================ FILE: docs/Kubernetes学习笔记.md ================================================ 参考文档: https://blog.gmem.cc/kubernetes-study-note Kubernetes学习笔记 ================================================ FILE: docs/Kubernetes架构介绍.md ================================================ # Kubernetes架构介绍 ## Kubernetes架构 ![](https://github.com/Lancger/opsfull/blob/master/images/kubernetes%E6%9E%B6%E6%9E%84.jpg) ## k8s架构图 ![](https://github.com/Lancger/opsfull/blob/master/images/k8s%E6%9E%B6%E6%9E%84%E5%9B%BE.jpg) ## 一、K8S Master节点 ### API Server apiserver提供集群管理的REST API接口,包括认证授权、数据校验以 及集群状态变更等 只有API Server才直接操作etcd 其他模块通过API Server查询或修改数据 提供其他模块之间的数据交互和通信的枢纽 ### Scheduler scheduler负责分配调度Pod到集群内的node节点 监听kube-apiserver,查询还未分配Node的Pod 根据调度策略为这些Pod分配节点 ### Controller Manager controller-manager由一系列的控制器组成,它通过apiserver监控整个 集群的状态,并确保集群处于预期的工作状态 ### ETCD 所有持久化的状态信息存储在ETCD中 ## 二、K8S Node节点 ### Kubelet 1. 管理Pods以及容器、镜像、Volume等,实现对集群 对节点的管理。 ### Kube-proxy 2. 提供网络代理以及负载均衡,实现与Service通信。 ### Docker Engine 3. 负责节点的容器的管理工作。 ## 三、资源对象介绍 ### 3.1 Replication Controller,RC 1. RC是K8s集群中最早的保证Pod高可用的API对象。通过监控运行中 的Pod来保证集群中运行指定数目的Pod副本。 2. 指定的数目可以是多个也可以是1个;少于指定数目,RC就会启动运 行新的Pod副本;多于指定数目,RC就会杀死多余的Pod副本。 3. 即使在指定数目为1的情况下,通过RC运行Pod也比直接运行Pod更 明智,因为RC也可以发挥它高可用的能力,保证永远有1个Pod在运 行。 ### 3.2 Replica Set,RS 1. RS是新一代RC,提供同样的高可用能力,区别主要在于RS后来居上, 能支持更多中的匹配模式。副本集对象一般不单独使用,而是作为部 署的理想状态参数使用。 2. 是K8S 1.2中出现的概念,是RC的升级。一般和Deployment共同使用。 ### 3.3 Deployment 1. Deployment表示用户对K8s集群的一次更新操作。Deployment是 一个比RS应用模式更广的API对象, 2. 可以是创建一个新的服务,更新一个新的服务,也可以是滚动升 级一个服务。滚动升级一个服务,实际是创建一个新的RS,然后 逐渐将新RS中副本数增加到理想状态,将旧RS中的副本数减小 到0的复合操作; 3. 这样一个复合操作用一个RS是不太好描述的,所以用一个更通用 的Deployment来描述。 ### 3.4 Service 1. RC、RS和Deployment只是保证了支撑服务的POD的数量,但是没有解 决如何访问这些服务的问题。一个Pod只是一个运行服务的实例,随时可 能在一个节点上停止,在另一个节点以一个新的IP启动一个新的Pod,因 此不能以确定的IP和端口号提供服务。 2. 要稳定地提供服务需要服务发现和负载均衡能力。服务发现完成的工作, 是针对客户端访问的服务,找到对应的的后端服务实例。 3. 在K8集群中,客户端需要访问的服务就是Service对象。每个Service会对 应一个集群内部有效的虚拟IP,集群内部通过虚拟IP访问一个服务。 ## 四、K8S的IP地址 1. Node IP: 节点设备的IP,如物理机,虚拟机等容器宿主的实际IP。 2. Pod IP: Pod 的IP地址,是根据docker0网格IP段进行分配的。 3. Cluster IP: Service的IP,是一个虚拟IP,仅作用于service对象,由k8s 管理和分配,需要结合service port才能使用,单独的IP没有通信功能, 集群外访问需要一些修改。 4. 在K8S集群内部,nodeip podip clusterip的通信机制是由k8s制定的路由 规则,不是IP路由。 ================================================ FILE: docs/Kubernetes集群环境准备.md ================================================ # 一、k8s集群实验环境准备 ![架构图](https://github.com/Lancger/opsfull/blob/master/images/K8S.png)
| Hostname | IP address (NAT) | Description |
| --- | --- | --- |
| linux-node1.example.com | eth0: 192.168.56.11 | Kubernetes master node / etcd node |
| linux-node2.example.com | eth0: 192.168.56.12 | Kubernetes node / etcd node |
| linux-node3.example.com | eth0: 192.168.56.13 | Kubernetes node / etcd node |
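Before starting the preparation steps below, it can help to confirm that all three nodes are reachable. A minimal check (not part of the original guide), assuming the NAT addresses from the table above:

```
# ping each lab node once and report reachability
for ip in 192.168.56.11 192.168.56.12 192.168.56.13; do
  if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
    echo "$ip is reachable"
  else
    echo "$ip is NOT reachable"
  fi
done
```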
# II. Preparation

1. Set the hostnames
```
hostnamectl set-hostname linux-node1
hostnamectl set-hostname linux-node2
hostnamectl set-hostname linux-node3
```

2. Bind the host entries
```
cat > /etc/hosts <<EOF
192.168.56.11 linux-node1 linux-node1.example.com
192.168.56.12 linux-node2 linux-node2.example.com
192.168.56.13 linux-node3 linux-node3.example.com
EOF
```

3. Basic optimizations
```
# set the timezone
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

# fix slow SSH logins
sed -i "s/#UseDNS yes/UseDNS no/" /etc/ssh/sshd_config
sed -i "s/GSSAPIAuthentication yes/GSSAPIAuthentication no/" /etc/ssh/sshd_config
systemctl restart sshd.service
```

4. Download the packages

Netdisk address for the k8s v1.12.0 packages: https://pan.baidu.com/s/1jU427W1f3oSDnzB3bU2s5w
```
# all files are kept under /opt/kubernetes
mkdir -p /opt/kubernetes/{cfg,bin,ssl,log}

# deployment uses the official binaries; download address:
# https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#downloads-for-v1121

# add the environment variable
vim /root/.bash_profile
PATH=$PATH:$HOME/bin:/opt/kubernetes/bin
source /root/.bash_profile
```
![official download link](https://github.com/Lancger/opsfull/blob/master/images/k8s-soft.jpg)

5. Unpack the packages
```
tar -zxvf kubernetes.tar.gz -C /usr/local/src/
tar -zxvf kubernetes-server-linux-amd64.tar.gz -C /usr/local/src/
tar -zxvf kubernetes-client-linux-amd64.tar.gz -C /usr/local/src/
tar -zxvf kubernetes-node-linux-amd64.tar.gz -C /usr/local/src/
```

================================================
FILE: docs/app.md
================================================

1. Create a test deployment
```
[root@linux-node1 ~]# kubectl run net-test --image=alpine --replicas=2 sleep 360000
[root@linux-node1 ~]# kubectl get deployment
NAME       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
net-test   2         2         2            2           2h
[root@linux-node1 ~]# kubectl delete deployment net-test
```

2. Check the assigned IPs
```
[root@linux-node1 ~]# kubectl get pod -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP          NODE
net-test-5767cb94df-6smfk   1/1     Running   1          1h    10.2.69.3   192.168.56.12
net-test-5767cb94df-ctkhz   1/1     Running   1          1h    10.2.17.3   192.168.56.13
```

3. Test connectivity
```
[root@linux-node1 ~]# ping -c 1 10.2.69.3
PING 10.2.69.3 (10.2.69.3) 56(84) bytes of data.
64 bytes from 10.2.69.3: icmp_seq=1 ttl=61 time=1.39 ms

--- 10.2.69.3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.396/1.396/1.396/0.000 ms

[root@linux-node1 ~]# ping -c 1 10.2.17.3
PING 10.2.17.3 (10.2.17.3) 56(84) bytes of data.
64 bytes from 10.2.17.3: icmp_seq=1 ttl=61 time=1.16 ms --- 10.2.17.3 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 1.164/1.164/1.164/0.000 ms #如果要在master节点不能ping通pod的IP,则需要检查flanneld服务,下面是各节点的网卡ip情况(发现各节点的flannel0的ip网段都是不一样的) #node1 [root@linux-node1 ~]# ifconfig docker0: flags=4099 mtu 1500 inet 10.2.41.1 netmask 255.255.255.0 broadcast 10.2.41.255 ether 02:42:77:d9:95:e3 txqueuelen 0 (Ethernet) RX packets 0 bytes 0 (0.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 0 bytes 0 (0.0 B) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth0: flags=4163 mtu 1500 inet 192.168.56.11 netmask 255.255.255.0 broadcast 192.168.56.255 ether 00:0c:29:e6:00:79 txqueuelen 1000 (Ethernet) RX packets 75548 bytes 10771254 (10.2 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 74344 bytes 12700211 (12.1 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 flannel0: flags=4305 mtu 1472 inet 10.2.41.0 netmask 255.255.0.0 destination 10.2.41.0 unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC) RX packets 30 bytes 2520 (2.4 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 30 bytes 2520 (2.4 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73 mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 loop txqueuelen 1000 (Local Loopback) RX packets 34140 bytes 8049438 (7.6 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 34140 bytes 8049438 (7.6 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 #node2 [root@linux-node2 ~]# ifconfig docker0: flags=4163 mtu 1400 inet 10.2.69.1 netmask 255.255.255.0 broadcast 10.2.69.255 ether 02:42:de:56:b5:1e txqueuelen 0 (Ethernet) RX packets 10 bytes 448 (448.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 9 bytes 546 (546.0 B) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth0: flags=4163 mtu 1500 inet 192.168.56.12 netmask 255.255.255.0 broadcast 192.168.56.255 ether 00:0c:29:ee:65:40 txqueuelen 1000 (Ethernet) RX packets 32893 bytes 4996885 (4.7 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 32877 bytes 3737878 (3.5 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 flannel0: flags=4305 mtu 1472 inet 10.2.69.0 netmask 255.255.0.0 destination 10.2.69.0 unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC) RX packets 3 bytes 252 (252.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 3 bytes 252 (252.0 B) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73 mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 loop txqueuelen 1000 (Local Loopback) RX packets 347 bytes 36887 (36.0 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 347 bytes 36887 (36.0 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 veth09ea856c: flags=4163 mtu 1400 ether c6:be:00:bd:a9:18 txqueuelen 0 (Ethernet) RX packets 10 bytes 588 (588.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 9 bytes 546 (546.0 B) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 #node3 [root@linux-node3 ~]# ifconfig docker0: flags=4163 mtu 1400 inet 10.2.17.1 netmask 255.255.255.0 broadcast 10.2.17.255 ether 02:42:ac:11:ac:3c txqueuelen 0 (Ethernet) RX packets 32 bytes 2408 (2.3 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 31 bytes 2814 (2.7 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth0: flags=4163 mtu 1500 inet 192.168.56.13 netmask 255.255.255.0 broadcast 192.168.56.255 ether 00:0c:29:53:f4:b1 txqueuelen 1000 (Ethernet) RX packets 47504 bytes 
7138550 (6.8 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 48402 bytes 8310935 (7.9 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 flannel0: flags=4305 mtu 1472 inet 10.2.17.0 netmask 255.255.0.0 destination 10.2.17.0 unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC) RX packets 27 bytes 2268 (2.2 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 27 bytes 2268 (2.2 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73 mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 loop txqueuelen 1000 (Local Loopback) RX packets 129 bytes 13510 (13.1 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 129 bytes 13510 (13.1 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 veth8630a55b: flags=4163 mtu 1400 ether 72:e9:df:4f:f6:64 txqueuelen 0 (Ethernet) RX packets 32 bytes 2856 (2.7 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 31 bytes 2814 (2.7 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 ``` 4、创建nginx服务 ``` #创建deployment文件 [root@linux-node1 ~]# vim nginx-deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment labels: app: nginx spec: replicas: 3 selector: matchLabels: app: nginx template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.13.12 ports: - containerPort: 80 #创建deployment [root@linux-node1 ~]# kubectl create -f nginx-deployment.yaml deployment.apps "nginx-deployment" created #查看deployment [root@linux-node1 ~]# kubectl get deployment NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE nginx-deployment 3 3 3 2 48s #查看deployment详情 [root@linux-node1 ~]# kubectl describe deployment nginx-deployment Name: nginx-deployment Namespace: default CreationTimestamp: Tue, 09 Oct 2018 15:11:33 +0800 Labels: app=nginx Annotations: deployment.kubernetes.io/revision=1 Selector: app=nginx Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable StrategyType: RollingUpdate MinReadySeconds: 0 RollingUpdateStrategy: 25% max unavailable, 25% max surge Pod Template: Labels: app=nginx Containers: nginx: Image: nginx:1.13.12 Port: 80/TCP Host Port: 0/TCP Environment: Mounts: Volumes: Conditions: Type Status Reason ---- ------ ------ Available True MinimumReplicasAvailable Progressing True NewReplicaSetAvailable OldReplicaSets: NewReplicaSet: nginx-deployment-6c45fc49cb (3/3 replicas created) Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set nginx-deployment-6c45fc49cb to 3 #查看pod [root@linux-node1 ~]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE nginx-deployment-6c45fc49cb-7rwdp 1/1 Running 0 4m 10.2.76.5 192.168.56.12 nginx-deployment-6c45fc49cb-8dgkd 1/1 Running 0 4m 10.2.76.4 192.168.56.12 nginx-deployment-6c45fc49cb-clgkl 1/1 Running 0 4m 10.2.76.4 192.168.56.13 #查看pod详情 [root@linux-node1 ~]# kubectl describe pod nginx-deployment-6c45fc49cb-7rwdp Name: nginx-deployment-6c45fc49cb-7rwdp Namespace: default Node: 192.168.56.12/192.168.56.12 Start Time: Tue, 09 Oct 2018 15:11:33 +0800 Labels: app=nginx pod-template-hash=2701970576 Annotations: Status: Running IP: 10.2.76.5 Controlled By: ReplicaSet/nginx-deployment-6c45fc49cb Containers: nginx: Container ID: docker://0ab9b4f9bf3691f16e9cb6836a7375cb7f886398bfa8a81147e9a24f3634d591 Image: nginx:1.13.12 Image ID: docker-pullable://nginx@sha256:b1d09e9718890e6ebbbd2bc319ef1611559e30ce1b6f56b2e3b479d9da51dc35 Port: 80/TCP Host Port: 0/TCP State: Running Started: Tue, 09 Oct 2018 
15:12:33 +0800 Ready: True Restart Count: 0 Environment: Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-4cgj8 (ro) Conditions: Type Status Initialized True Ready True PodScheduled True Volumes: default-token-4cgj8: Type: Secret (a volume populated by a Secret) SecretName: default-token-4cgj8 Optional: false QoS Class: BestEffort Node-Selectors: Tolerations: Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 4m default-scheduler Successfully assigned nginx-deployment-6c45fc49cb-7rwdp to 192.168.56.12 Normal SuccessfulMountVolume 4m kubelet, 192.168.56.12 MountVolume.SetUp succeeded for volume "default-token-4cgj8" Normal Pulling 4m kubelet, 192.168.56.12 pulling image "nginx:1.13.12" Normal Pulled 3m kubelet, 192.168.56.12 Successfully pulled image "nginx:1.13.12" Normal Created 3m kubelet, 192.168.56.12 Created container Normal Started 3m kubelet, 192.168.56.12 Started container #导出资源描述 kubectl get --export -o yaml 命令会以Yaml格式导出系统中已有资源描述 比如,我们可以将系统中 nginx 部署的描述导成 Yaml 文件 kubectl get deployment nginx-deployment-6c45fc49cb-7rwdp --export -o yaml > nginx-deployment.yaml #测试pod访问 测试访问nginx镜像(在对应的节点上测试,本来是其他节点也可以正常访问的) [root@linux-node3 ~]# curl --head http://10.2.76.4 HTTP/1.1 200 OK Server: nginx/1.13.12 Date: Tue, 09 Oct 2018 07:17:55 GMT Content-Type: text/html Content-Length: 612 Last-Modified: Mon, 09 Apr 2018 16:01:09 GMT Connection: keep-alive ETag: "5acb8e45-264" Accept-Ranges: bytes ``` 5、更新Deployment ``` #--record 记录日志,方便以后回滚 [root@linux-node1 ~]# kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record deployment.apps "nginx-deployment" image updated ``` 6、查看更新后的Deployment ``` #这里发现镜像已经更新为1.12.1版本了,然后CURRENT(当前镜像数为4个,期望值DESIRED为3个,说明正在进行滚动更新) [root@linux-node1 ~]# kubectl get deployment -o wide NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR nginx-deployment 3 4 1 3 13m nginx nginx:1.12.1 app=nginx ``` 7、查看历史记录 ``` [root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment deployments "nginx-deployment" REVISION CHANGE-CAUSE 1 ---第一个没有,是因为我们创建的时候没有加上--record参数 4 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.2 --record=true 5 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record=true ``` 7、查看具体某一个版本的升级历史 ``` [root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment --revision=1 deployments "nginx-deployment" with revision #1 Pod Template: Labels: app=nginx pod-template-hash=2701970576 Containers: nginx: Image: nginx:1.13.12 Port: 80/TCP Host Port: 0/TCP Environment: Mounts: Volumes: ``` 8、快速回滚到上一个版本 ``` [root@linux-node1 ~]# kubectl rollout undo deployment/nginx-deployment deployment.apps "nginx-deployment" [root@linux-node1 ~]# ``` 9、扩容到5个节点 ``` [root@linux-node1 ~]# kubectl get pod -o wide ----之前是3个pod NAME READY STATUS RESTARTS AGE IP NODE nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12 nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13 nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12 [root@linux-node1 ~]# kubectl scale deployment nginx-deployment --replicas 5 deployment.extensions "nginx-deployment" scaled [root@linux-node1 ~]# kubectl get pod -o wide ----现在扩容到了5个pod NAME READY STATUS RESTARTS AGE IP NODE nginx-deployment-7498dc98f8-28894 1/1 Running 0 8s 10.2.76.10 192.168.56.13 nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12 nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13 
nginx-deployment-7498dc98f8-tt7z5 1/1 Running 0 7s 10.2.76.17 192.168.56.12 nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12 ``` 10、Pod ip 变化频繁, 引入service-ip ``` #创建nginx-server [root@linux-node1 ~]# cat nginx-service.yaml kind: Service apiVersion: v1 metadata: name: nginx-service spec: selector: app: nginx ports: - protocol: TCP port: 80 targetPort: 80 [root@linux-node1 ~]# kubectl create -f nginx-service.yaml service "nginx-service" created #发现给我们创建了一个vip 10.1.46.200 并且通过lvs做了负载均衡 [root@linux-node1 ~]# kubectl get service NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.1.0.1 443/TCP 3h nginx-service ClusterIP 10.1.46.200 80/TCP 5m #在node节点使用ipvsadm -Ln查看负载均衡后端节点 [root@linux-node2 ~]# ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.46.200:80 rr -> 10.2.76.11:80 Masq 1 0 0 -> 10.2.76.12:80 Masq 1 0 0 -> 10.2.76.13:80 Masq 1 0 0 -> 10.2.76.18:80 Masq 1 0 0 -> 10.2.76.19:80 Masq 1 0 0 #在master上访问vip不行,是因为没有安装kube-proxy服务,需要在node节点去测试验证 [root@linux-node1 ~]# curl --head http://10.1.46.200 [root@linux-node2 ~]# curl --head http://10.1.46.200 HTTP/1.1 200 OK Server: nginx/1.10.3 Date: Tue, 09 Oct 2018 07:55:57 GMT Content-Type: text/html Content-Length: 612 Last-Modified: Tue, 31 Jan 2017 15:01:11 GMT Connection: keep-alive ETag: "5890a6b7-264" Accept-Ranges: bytes #每执行一次curl --head http://10.1.46.200请求,后端InActConn连接数就会增加1 [root@linux-node2 ~]# ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.46.200:80 rr -> 10.2.76.11:80 Masq 1 0 1 -> 10.2.76.12:80 Masq 1 0 1 -> 10.2.76.13:80 Masq 1 0 2 -> 10.2.76.18:80 Masq 1 0 2 -> 10.2.76.19:80 Masq 1 0 2 ``` ================================================ FILE: docs/app2.md ================================================ 1.查询命名空间 ``` [root@linux-node1 ~]# kubectl get namespace --all-namespaces NAME STATUS AGE default Active 3d13h kube-node-lease Active 3d13h kube-public Active 3d13h kube-system Active 3d13h ``` 2.查询健康状况 ``` [root@linux-node1 ~]# kubectl get cs --all-namespaces NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy {"health":"true"} etcd-2 Healthy {"health":"true"} etcd-1 Healthy {"health":"true"} ``` 3.查询node ``` [root@linux-node1 ~]# kubectl get node -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME 10.33.35.5 Ready,SchedulingDisabled master 3d13h v1.15.2 10.33.35.5 CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://18.9.6 10.33.35.6 Ready node 3d13h v1.15.2 10.33.35.6 CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://18.9.6 10.33.35.7 Ready node 3d13h v1.15.2 10.33.35.7 CentOS Linux 7 (Core) 3.10.0-957.27.2.el7.x86_64 docker://18.9.6 ``` 4.创建一个测试用的deployment ``` [root@linux-node1 ~]# kubectl run net-test --image=alpine --replicas=2 sleep 360000 [root@linux-node1 ~]# kubectl get deployment NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE net-test 2 2 2 2 2h [root@linux-node1 ~]# kubectl delete deployment net-test ``` 5.查看获取IP情况 ``` [root@linux-node1 ~]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE net-test-54ddf4f6c7-qfgw9 1/1 Running 0 22s 172.20.2.131 10.33.35.7 net-test-54ddf4f6c7-rwgmc 1/1 Running 0 22s 172.20.1.137 10.33.35.6 ``` 6、创建nginx服务 ``` #创建deployment文件 [root@linux-node1 ~]# vim nginx-deployment.yaml apiVersion: apps/v1 kind: 
Deployment metadata: name: nginx-deployment labels: app: nginx spec: replicas: 3 selector: matchLabels: app: nginx template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.13.12 ports: - containerPort: 80 #创建deployment [root@linux-node1 ~]# kubectl create -f nginx-deployment.yaml deployment.apps "nginx-deployment" created #查看deployment [root@linux-node1 ~]# kubectl get deployment NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE nginx-deployment 3 3 3 2 48s #查看deployment详情 [root@linux-node1 ~]# kubectl describe deployment nginx-deployment Name: nginx-deployment Namespace: default CreationTimestamp: Tue, 09 Oct 2018 15:11:33 +0800 Labels: app=nginx Annotations: deployment.kubernetes.io/revision=1 Selector: app=nginx Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable StrategyType: RollingUpdate MinReadySeconds: 0 RollingUpdateStrategy: 25% max unavailable, 25% max surge Pod Template: Labels: app=nginx Containers: nginx: Image: nginx:1.13.12 Port: 80/TCP Host Port: 0/TCP Environment: Mounts: Volumes: Conditions: Type Status Reason ---- ------ ------ Available True MinimumReplicasAvailable Progressing True NewReplicaSetAvailable OldReplicaSets: NewReplicaSet: nginx-deployment-6c45fc49cb (3/3 replicas created) Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set nginx-deployment-6c45fc49cb to 3 #查看pod [root@linux-node1 ~]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE nginx-deployment-6c45fc49cb-7rwdp 1/1 Running 0 4m 10.2.76.5 192.168.56.12 nginx-deployment-6c45fc49cb-8dgkd 1/1 Running 0 4m 10.2.76.4 192.168.56.12 nginx-deployment-6c45fc49cb-clgkl 1/1 Running 0 4m 10.2.76.4 192.168.56.13 #查看pod详情 [root@linux-node1 ~]# kubectl describe pod nginx-deployment-6c45fc49cb-7rwdp Name: nginx-deployment-6c45fc49cb-7rwdp Namespace: default Node: 192.168.56.12/192.168.56.12 Start Time: Tue, 09 Oct 2018 15:11:33 +0800 Labels: app=nginx pod-template-hash=2701970576 Annotations: Status: Running IP: 10.2.76.5 Controlled By: ReplicaSet/nginx-deployment-6c45fc49cb Containers: nginx: Container ID: docker://0ab9b4f9bf3691f16e9cb6836a7375cb7f886398bfa8a81147e9a24f3634d591 Image: nginx:1.13.12 Image ID: docker-pullable://nginx@sha256:b1d09e9718890e6ebbbd2bc319ef1611559e30ce1b6f56b2e3b479d9da51dc35 Port: 80/TCP Host Port: 0/TCP State: Running Started: Tue, 09 Oct 2018 15:12:33 +0800 Ready: True Restart Count: 0 Environment: Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-4cgj8 (ro) Conditions: Type Status Initialized True Ready True PodScheduled True Volumes: default-token-4cgj8: Type: Secret (a volume populated by a Secret) SecretName: default-token-4cgj8 Optional: false QoS Class: BestEffort Node-Selectors: Tolerations: Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 4m default-scheduler Successfully assigned nginx-deployment-6c45fc49cb-7rwdp to 192.168.56.12 Normal SuccessfulMountVolume 4m kubelet, 192.168.56.12 MountVolume.SetUp succeeded for volume "default-token-4cgj8" Normal Pulling 4m kubelet, 192.168.56.12 pulling image "nginx:1.13.12" Normal Pulled 3m kubelet, 192.168.56.12 Successfully pulled image "nginx:1.13.12" Normal Created 3m kubelet, 192.168.56.12 Created container Normal Started 3m kubelet, 192.168.56.12 Started container #导出资源描述 kubectl get --export -o yaml 命令会以Yaml格式导出系统中已有资源描述 比如,我们可以将系统中 nginx 部署的描述导成 Yaml 文件 kubectl get deployment nginx-deployment-6c45fc49cb-7rwdp --export -o yaml > 
nginx-deployment.yaml #测试pod访问 测试访问nginx镜像(在对应的节点上测试,本来是其他节点也可以正常访问的) [root@linux-node3 ~]# curl --head http://10.2.76.4 HTTP/1.1 200 OK Server: nginx/1.13.12 Date: Tue, 09 Oct 2018 07:17:55 GMT Content-Type: text/html Content-Length: 612 Last-Modified: Mon, 09 Apr 2018 16:01:09 GMT Connection: keep-alive ETag: "5acb8e45-264" Accept-Ranges: bytes ``` 8、更新Deployment ``` #--record 记录日志,方便以后回滚 [root@linux-node1 ~]# kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record deployment.apps "nginx-deployment" image updated ``` 9、查看更新后的Deployment ``` #这里发现镜像已经更新为1.12.1版本了,然后CURRENT(当前镜像数为4个,期望值DESIRED为3个,说明正在进行滚动更新) [root@linux-node1 ~]# kubectl get deployment -o wide NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR nginx-deployment 3 4 1 3 13m nginx nginx:1.12.1 app=nginx ``` 10、查看历史记录 ``` [root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment deployments "nginx-deployment" REVISION CHANGE-CAUSE 1 ---第一个没有,是因为我们创建的时候没有加上--record参数 4 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.2 --record=true 5 kubectl set image deployment/nginx-deployment nginx=nginx:1.12.1 --record=true ``` 11、查看具体某一个版本的升级历史 ``` [root@linux-node1 ~]# kubectl rollout history deployment/nginx-deployment --revision=1 deployments "nginx-deployment" with revision #1 Pod Template: Labels: app=nginx pod-template-hash=2701970576 Containers: nginx: Image: nginx:1.13.12 Port: 80/TCP Host Port: 0/TCP Environment: Mounts: Volumes: ``` 12、快速回滚到上一个版本 ``` [root@linux-node1 ~]# kubectl rollout undo deployment/nginx-deployment deployment.apps "nginx-deployment" [root@linux-node1 ~]# ``` 13、扩容到5个节点 ``` [root@linux-node1 ~]# kubectl get pod -o wide ----之前是3个pod NAME READY STATUS RESTARTS AGE IP NODE nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12 nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13 nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12 [root@linux-node1 ~]# kubectl scale deployment nginx-deployment --replicas 5 deployment.extensions "nginx-deployment" scaled [root@linux-node1 ~]# kubectl get pod -o wide ----现在扩容到了5个pod NAME READY STATUS RESTARTS AGE IP NODE nginx-deployment-7498dc98f8-28894 1/1 Running 0 8s 10.2.76.10 192.168.56.13 nginx-deployment-7498dc98f8-48lqg 1/1 Running 0 2m 10.2.76.15 192.168.56.12 nginx-deployment-7498dc98f8-g4zkp 1/1 Running 0 2m 10.2.76.9 192.168.56.13 nginx-deployment-7498dc98f8-tt7z5 1/1 Running 0 7s 10.2.76.17 192.168.56.12 nginx-deployment-7498dc98f8-z2466 1/1 Running 0 2m 10.2.76.16 192.168.56.12 ``` 14、Pod ip 变化频繁, 引入service-ip ``` #创建nginx-server [root@linux-node1 ~]# cat nginx-service.yaml kind: Service apiVersion: v1 metadata: name: nginx-service spec: selector: app: nginx ports: - protocol: TCP port: 80 targetPort: 80 [root@linux-node1 ~]# kubectl create -f nginx-service.yaml service "nginx-service" created #发现给我们创建了一个vip 10.1.46.200 并且通过lvs做了负载均衡 [root@linux-node1 ~]# kubectl get service --all-namespaces NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default kubernetes ClusterIP 172.68.0.1 443/TCP 3d13h default my-mc-service ClusterIP 172.68.213.121 60001/TCP,60002/TCP 23m default php-service ClusterIP 172.68.210.6 9898/TCP 18h default test-hello ClusterIP 172.68.248.205 80/TCP 23h kube-system heapster ClusterIP 172.68.19.198 80/TCP 3d13h kube-system kube-dns ClusterIP 172.68.0.2 53/UDP,53/TCP,9153/TCP 3d13h kube-system kubernetes-dashboard NodePort 172.68.58.252 443:26400/TCP 3d13h kube-system metrics-server ClusterIP 172.68.31.222 
443/TCP 3d13h kube-system traefik-ingress-service NodePort 172.68.221.108 80:23456/TCP,8080:31477/TCP 3d13h #删除service [root@linux-node1 ~]# kubectl delete service nginx-service service "nginx-service" deleted #查看service的后端节点 [root@linux-node1 ~]# kubectl describe svc nginx-service Name: nginx-service Namespace: default Labels: Annotations: Selector: app=nginx Type: ClusterIP IP: 172.68.176.9 Port: 80/TCP TargetPort: 80/TCP Endpoints: 172.20.1.138:80,172.20.2.132:80,172.20.2.133:80 --这里发现有3个后端节点 Session Affinity: None Events: ``` 15.创建自定义Ingress 有了ingress-controller,我们就可以创建自定义的Ingress了。这里已提前搭建好了nginx服务,我们针对nginx创建一个Ingress: ``` #vim nginx-ingress.yaml apiVersion: extensions/v1beta1 kind: Ingress metadata: name: nginx-ingress namespace: default spec: rules: - host: myk8s.com http: paths: - path: / backend: serviceName: nginx-service servicePort: 80 其中: rules中的host必须为域名,不能为IP,表示Ingress-controller的Pod所在主机域名,也就是Ingress-controller的IP对应的域名。 paths中的path则表示映射的路径。如映射/表示若访问myk8s.com,则会将请求转发至nginx的service,端口为80。 kubectl create -f nginx-ingress.yaml kubectl get ingress -o wide kubectl delete ingress nginx-ingress #需要找出Ingress-controller的Pod所在主机(这里发现是在node2机器) [root@linux-node1 ~]# kubectl get pods --all-namespaces -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES default busybox 1/1 Running 41 41h 172.20.1.27 10.33.35.6 default my-mc-deployment-76f77494c7-kxv85 2/2 Running 0 63m 172.20.2.130 10.33.35.7 kube-system traefik-ingress-controller-766dbfdddd-fzb8d 1/1 Running 1 3d14h 172.20.1.14 10.33.35.6 #然后机器绑定域名 10.33.35.6 myk8s.com #访问测试 [root@linux-node1 ~]# curl http://myk8s.com -I HTTP/1.1 200 OK Server: nginx/1.12.2 Date: Fri, 23 Aug 2019 04:07:44 GMT Content-Type: text/html Content-Length: 3700 Last-Modified: Fri, 10 May 2019 08:08:40 GMT Connection: keep-alive ETag: "5cd53188-e74" Accept-Ranges: bytes ``` 参考资料: https://www.jianshu.com/p/feeea0bbd73e ================================================ FILE: docs/ca.md ================================================ # 手动制作CA证书 ``` Kubernetes 系统各组件需要使用 TLS 证书对通信进行加密。 CA证书管理工具: • easyrsa ---openvpn比较常用 • openssl • cfssl ---使用最多,使用json文件格式,相对简单 ``` ## 1.安装 CFSSL ``` [root@linux-node1 ~]# cd /usr/local/src [root@linux-node1 src]# wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 [root@linux-node1 src]# wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 [root@linux-node1 src]# wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 [root@linux-node1 src]# chmod +x cfssl* [root@linux-node1 src]# mv cfssl-certinfo_linux-amd64 /opt/kubernetes/bin/cfssl-certinfo [root@linux-node1 src]# mv cfssljson_linux-amd64 /opt/kubernetes/bin/cfssljson [root@linux-node1 src]# mv cfssl_linux-amd64 /opt/kubernetes/bin/cfssl #复制cfssl命令文件到k8s-node1和k8s-node2节点。如果实际中多个节点,就都需要同步复制。 [root@linux-node1 ~]# scp /opt/kubernetes/bin/cfssl* 192.168.56.12:/opt/kubernetes/bin [root@linux-node1 ~]# scp /opt/kubernetes/bin/cfssl* 192.168.56.13:/opt/kubernetes/bin ``` ## 2.初始化cfssl ``` [root@linux-node1 src]# mkdir ssl && cd ssl [root@linux-node1 ssl]# cfssl print-defaults config > config.json --生成ca-config.json的样例(可省略) [root@linux-node1 ssl]# cfssl print-defaults csr > csr.json --生成ca-csr.json的样例(可省略) ``` ## 3.创建用来生成 CA 文件的 JSON 配置文件 ``` [root@linux-node1 ssl]# cat > ca-config.json < ca-csr.json < 53/UDP,53/TCP 2m #在node节点使用ipvsadm -Ln查看转发的后端节点(TCP和UDP的53端口) [root@linux-node2 ~]# ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.0.2:53 
rr -> 10.2.76.14:53 Masq 1 0 0 -> 10.2.76.20:53 Masq 1 0 0 UDP 10.1.0.2:53 rr -> 10.2.76.14:53 Masq 1 0 0 -> 10.2.76.20:53 Masq 1 0 0 #发现是转到这2个pod容器 [root@linux-node1 ~]# kubectl get pod -n kube-system -o wide NAME READY STATUS RESTARTS AGE IP NODE coredns-77c989547b-4f9xz 1/1 Running 0 5m 10.2.76.20 192.168.56.12 coredns-77c989547b-9zm4m 1/1 Running 0 5m 10.2.76.14 192.168.56.13 ``` ## 测试CoreDNS ``` [root@linux-node1 coredns]# kubectl run dns-test --rm -it --image=alpine /bin/sh If you don't see a command prompt, try pressing enter. / # ping www.qq.com PING www.qq.com (121.51.142.21): 56 data bytes 64 bytes from 121.51.142.21: seq=0 ttl=127 time=20.864 ms 64 bytes from 121.51.142.21: seq=1 ttl=127 time=19.937 ms ``` ================================================ FILE: docs/dashboard.md ================================================ # Kubernetes Dashboard ## 创建Dashboard ``` [root@linux-node1 ~]# kubectl create -f /srv/addons/dashboard/ [root@linux-node1 ~]# kubectl cluster-info Kubernetes master is running at https://192.168.56.11:6443 kubernetes-dashboard is running at https://192.168.56.11:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. ``` ## 查看Dashboard信息 ``` #发现Dashboard是运行在node3节点 [root@linux-node1 ~]# kubectl get pod -n kube-system -o wide NAME READY STATUS RESTARTS AGE IP NODE kubernetes-dashboard-66c9d98865-bqwl5 1/1 Running 0 1h 10.2.76.3 192.168.56.13 #查看Dashboard运行日志 [root@linux-node1 ~]# kubectl logs pod/kubernetes-dashboard-66c9d98865-bqwl5 -n kube-system #查看Dashboard服务IP(可以访问任意node节点的34696端口就可以访问到Dashboard页面 https://192.168.56.13:34696/#!/overview?namespace=default,如何master节点安装了kube-proxy也可以访问) [root@linux-node1 ~]# kubectl get service -n kube-system NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes-dashboard NodePort 10.1.36.42 443:34696/TCP 1h ``` https://192.168.56.13:34696/#!/overview?namespace=default ![dashboard登录](https://github.com/Lancger/opsfull/blob/master/images/Dashboard-login.jpg) ## 访问Dashboard https://192.168.56.11:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy 用户名:admin 密码:admin 选择令牌模式登录。 ### 获取Token ``` [root@linux-node1 ~]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') Name: admin-user-token-c97bl Namespace: kube-system Labels: Annotations: kubernetes.io/service-account.name=admin-user kubernetes.io/service-account.uid=379208ff-cb86-11e8-9f1c-080027dc9cd8 Type: kubernetes.io/service-account-token Data ==== ca.crt: 1359 bytes namespace: 11 bytes token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWM5N2JsIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIzNzkyMDhmZi1jYjg2LTExZTgtOWYxYy0wODAwMjdkYzljZDgiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.LopL7AD9feBZmhAuAUlPNjfthlJ1lJAPG6VXgBl-MZdofZpqNU9m-o-7M4hHa5AXkpeLvQrA1UKWWSR9eWEN06ugIkcH4Pk-tKrSVQUM6CDaE7eBdK91x1ltTonLz62_z_X8IvRYx1piv3wRUijoyRHCdziBnOhg67sT974CSPoRSOpl7ZR0Kn_L0LYRMOE9xfU3w4-sCpSx-jgc5oysAix95NqZgIkaZ6TRANpCnHE66fqL6yUwQxQ5yt7pw7J2iuSE3OxPU_cKArjYlWUvr72zG3SxZaR7dzQEggwmjSSeHRs0OK0968QAtCca1NTmcPaTtKhXYfXXdtusVCx7bA ``` 
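上面的取 token 命令依赖一个名为 admin-user 的 ServiceAccount(其创建过程文中未给出)。下面是一个最小示意,把 admin-user 绑定到 cluster-admin(资源名与权限范围均为假设,生产环境请按需收紧):

```
#示意:创建 admin-user 并授予 cluster-admin 权限
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
```

保存为 admin-user.yaml 并执行 kubectl create -f admin-user.yaml 之后,再运行上面的 describe secret 命令即可得到登录令牌。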
![dashboard预览](https://github.com/Lancger/opsfull/blob/master/images/Dashboard.jpg) ================================================ FILE: docs/dashboard_op.md ================================================ # Kubernetes Dashboard ``` chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow* cd /etc/ansible/ ansible-playbook 07.cluster-addon.yml ansible-playbook 90.setup.yml systemctl restart iptables systemctl restart kube-scheduler systemctl restart kube-controller-manager systemctl restart kube-apiserver systemctl restart etcd systemctl restart docker systemctl restart iptables systemctl restart kubelet systemctl restart kube-proxy systemctl restart etcd systemctl restart docker ``` ## 1、查看deployment ``` [root@node1 ~]# kubectl get deployment -A NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE default my-mc-deployment 3/3 3 3 2d18h default net 3/3 3 3 4d15h default net-test 2/2 2 2 4d16h default test-hello 1/1 1 1 6d default test-jrr 1/1 1 1 43h kube-system coredns 0/2 2 0 4d15h kube-system heapster 1/1 1 1 8d kube-system kubernetes-dashboard 0/1 1 0 4m42s kube-system metrics-server 0/1 1 0 8d kube-system traefik-ingress-controller 1/1 1 1 2d18h #查看deployment详情 [root@node1 ~]# kubectl describe deployment kubernetes-dashboard -n kube-system #删除deployment [root@node1 ~]# kubectl delete deployment kubernetes-dashboard -n kube-system deployment.extensions "kubernetes-dashboard" deleted ``` ## 2、查看Service信息 ``` [root@tw06a2753 ~]# kubectl get service -A -o wide NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR default kubernetes ClusterIP 172.68.0.1 443/TCP 8d default my-mc-service ClusterIP 172.68.113.166 60001/TCP,60002/TCP 3d14h app=products,department=sales default nginx-service ClusterIP 172.68.176.9 80/TCP 5d app=nginx default php-service ClusterIP 172.68.210.6 9898/TCP 5d19h app=nginx-php default test-hello ClusterIP 172.68.248.205 80/TCP 6d run=test-hello default test-jrr-php-service ClusterIP 172.68.58.202 9090/TCP 43h app=test-jrr-nginx-php kube-system heapster ClusterIP 172.68.19.198 80/TCP 8d k8s-app=heapster kube-system kube-dns ClusterIP 172.68.0.2 53/UDP,53/TCP,9153/TCP 4d15h k8s-app=kube-dns kube-system kubernetes-dashboard NodePort 172.68.46.171 443:29107/TCP 6m31s k8s-app=kubernetes-dashboard kube-system metrics-server ClusterIP 172.68.31.222 443/TCP 8d k8s-app=metrics-server kube-system traefik-ingress-service NodePort 172.68.124.46 80:33813/TCP,8080:21315/TCP 2d18h k8s-app=traefik-ingress-lb kube-system traefik-web-ui ClusterIP 172.68.226.139 80/TCP 2d19h k8s-app=traefik-ingress-lb #查看service详情 [root@node1 ~]# kubectl describe svc kubernetes-dashboard -n kube-system #删除service [root@node1 ~]# kubectl delete svc kubernetes-dashboard -n kube-system service "kubernetes-dashboard" deleted ``` ## 3、查看Service对应的后端节点 ``` #查看kubernetes-dashboard [root@node1 ~]# kubectl describe svc kubernetes-dashboard -n kube-system #查看服务my-mc-service [root@node1 ~]# kubectl describe svc my-mc-service -n default Name: my-mc-service Namespace: default Labels: Annotations: kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-mc-service","namespace":"default"},"spec":{"ports":[{"name":"m... 
Selector: app=products,department=sales Type: ClusterIP IP: 172.68.113.166 Port: my-first-port 60001/TCP TargetPort: 50001/TCP Endpoints: 172.20.1.209:50001,172.20.2.206:50001,172.20.2.208:50001 Port: my-second-port 60002/TCP TargetPort: 50002/TCP Endpoints: 172.20.1.209:50002,172.20.2.206:50002,172.20.2.208:50002 ---发现这个service有这3个后端 Session Affinity: None Events: ``` ## 4、Dashboard运行在哪个节点 ``` #发现Dashboard是运行在node3节点 [root@linux-node1 ~]# kubectl get pod -n kube-system -o wide NAME READY STATUS RESTARTS AGE IP NODE kubernetes-dashboard-66c9d98865-bqwl5 1/1 Running 0 1h 10.2.76.3 192.168.56.13 #查看Dashboard运行日志 [root@linux-node1 ~]# kubectl logs pod/kubernetes-dashboard-66c9d98865-bqwl5 -n kube-system #查看Dashboard服务IP(可以访问任意node节点的34696端口就可以访问到Dashboard页面 https://192.168.56.13:34696/#!/overview?namespace=default,如何master节点安装了kube-proxy也可以访问) [root@linux-node1 ~]# kubectl get service -n kube-system NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes-dashboard NodePort 10.1.36.42 443:34696/TCP 1h ``` https://192.168.56.13:34696/#!/overview?namespace=default ![dashboard登录](https://github.com/Lancger/opsfull/blob/master/images/Dashboard-login.jpg) ## 访问Dashboard https://192.168.56.11:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy 用户名:admin 密码:admin 选择令牌模式登录。 ### 获取Token ``` [root@linux-node1 ~]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') Name: admin-user-token-c97bl Namespace: kube-system Labels: Annotations: kubernetes.io/service-account.name=admin-user kubernetes.io/service-account.uid=379208ff-cb86-11e8-9f1c-080027dc9cd8 Type: kubernetes.io/service-account-token Data ==== ca.crt: 1359 bytes namespace: 11 bytes token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWM5N2JsIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiIzNzkyMDhmZi1jYjg2LTExZTgtOWYxYy0wODAwMjdkYzljZDgiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.LopL7AD9feBZmhAuAUlPNjfthlJ1lJAPG6VXgBl-MZdofZpqNU9m-o-7M4hHa5AXkpeLvQrA1UKWWSR9eWEN06ugIkcH4Pk-tKrSVQUM6CDaE7eBdK91x1ltTonLz62_z_X8IvRYx1piv3wRUijoyRHCdziBnOhg67sT974CSPoRSOpl7ZR0Kn_L0LYRMOE9xfU3w4-sCpSx-jgc5oysAix95NqZgIkaZ6TRANpCnHE66fqL6yUwQxQ5yt7pw7J2iuSE3OxPU_cKArjYlWUvr72zG3SxZaR7dzQEggwmjSSeHRs0OK0968QAtCca1NTmcPaTtKhXYfXXdtusVCx7bA ``` ![dashboard预览](https://github.com/Lancger/opsfull/blob/master/images/Dashboard.jpg) ================================================ FILE: docs/delete.md ================================================ ``` #master systemctl restart kube-scheduler systemctl restart kube-controller-manager systemctl restart kube-apiserver systemctl restart flannel systemctl restart etcd systemctl restart docker systemctl stop kube-scheduler systemctl stop kube-controller-manager systemctl stop kube-apiserver systemctl stop flannel systemctl stop etcd systemctl stop docker #node systemctl restart kubelet systemctl restart kube-proxy systemctl restart flannel systemctl restart etcd systemctl restart docker systemctl stop kubelet systemctl stop kube-proxy systemctl stop flannel systemctl stop etcd systemctl stop docker ``` ``` # 清理k8s集群 rm -rf /var/lib/etcd/ rm -rf /var/lib/docker rm -rf /opt/containerd rm -rf /opt/kubernetes rm -rf 
/var/lib/kubelet rm -rf /var/lib/chrony rm -rf /var/lib/kube-proxy rm -rf /srv/* systemctl disable kube-scheduler systemctl disable kube-controller-manager systemctl disable kube-apiserver systemctl disable flannel systemctl disable etcd systemctl disable docker systemctl disable kubelet systemctl disable kube-proxy systemctl disable flannel systemctl disable etcd systemctl disable docker ``` ================================================ FILE: docs/docker-install.md ================================================ # study_docker ## 0.卸载旧版本 ```bash yum remove -y docker \ docker-client \ docker-client-latest \ docker-common \ docker-latest \ docker-latest-logrotate \ docker-logrotate \ docker-selinux \ docker-engine-selinux \ docker-engine ``` ## 1.安装Docker 第一步:使用国内Docker源 ``` cd /etc/yum.repos.d/ wget -O docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo #或 yum -y install yum-utils yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo yum install -y yum-utils \ device-mapper-persistent-data \ lvm2 ``` 第二步:Docker安装: ``` yum install -y docker-ce ``` 第三步:启动后台进程: ```bash #启动docker服务 systemctl restart docker #设置docker服务开启自启 systemctl enable docker #Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service. #查看是否成功设置docker服务开启自启 systemctl list-unit-files|grep docker docker.service enabled #关闭docker服务开启自启 systemctl disable docker #Removed symlink /etc/systemd/system/multi-user.target.wants/docker.service. ``` ## 2.脚本安装Docker ```bash #2.1、Docker官方安装脚本 curl -sSL https://get.docker.com/ | sh #这个脚本会添加docker.repo仓库并且安装Docker #2.2、阿里云的安装脚本 curl -sSL http://acs-public-mirror.oss-cn-hangzhou.aliyuncs.com/docker-engine/internet | sh - #2.3、DaoCloud 的安装脚本 curl -sSL https://get.daocloud.io/docker | sh ``` ### 3.Docker服务文件 ```bash # Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信,执行下面命令 #注意,有变量的地方需要使用转义符号 cat > /usr/lib/systemd/system/docker.service << EOF [Unit] Description=Docker Application Container Engine Documentation=https://docs.docker.com BindsTo=containerd.service After=network-online.target firewalld.service containerd.service Wants=network-online.target Requires=docker.socket [Service] Type=notify # the default is not to use systemd for cgroups because the delegate issues still # exists and systemd currently does not support the cgroup feature set required # for containers run by docker ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd ExecReload=/bin/kill -s HUP \$MAINPID ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT TimeoutSec=0 RestartSec=2 Restart=always # Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229. # Both the old, and new location are accepted by systemd 229 and up, so using the old location # to make them work for either version of systemd. StartLimitBurst=3 # Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230. # Both the old, and new name are accepted by systemd 230 and up, so using the old name to make # this option work for either version of systemd. StartLimitInterval=60s # Having non-zero Limit*s causes performance problems due to accounting overhead # in the kernel. We recommend using cgroups to do container-local accounting. LimitNOFILE=infinity LimitNPROC=infinity LimitCORE=infinity # Comment TasksMax if your systemd version does not support it. 
# Only systemd 226 and above support this option. TasksMax=infinity # set delegate yes so that systemd does not reset the cgroups of docker containers Delegate=yes # kill only the docker process, not all processes in the cgroup KillMode=process [Install] WantedBy=multi-user.target EOF ``` ## 3.1、配置docker加速器 ```bash mkdir -p /data0/docker-data cat > /etc/docker/daemon.json << \EOF { "exec-opts": ["native.cgroupdriver=systemd"], "data-root": "/data0/docker-data", "registry-mirrors" : [ "https://ot2k4d59.mirror.aliyuncs.com/" ], "insecure-registries": ["reg.hub.com"] } EOF 或者 curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io ``` ### 3.2、重新加载docker的配置文件 ```bash systemctl daemon-reload systemctl restart docker ``` ### 3.3、内核参数配置 ```bash #编辑文件 vim /etc/sysctl.conf net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1 #然后执行 sysctl -p #查看docker信息是否生效 docker info ``` ## 4.通过测试镜像运行一个容器来验证Docker是否安装正确 ```bash docker run hello-world ``` ================================================ FILE: docs/etcd-install.md ================================================ # 手动部署ETCD集群 ## 0.准备etcd软件包 ``` [root@linux-node1 src]# wget https://github.com/coreos/etcd/releases/download/v3.2.18/etcd-v3.2.18-linux-amd64.tar.gz [root@linux-node1 src]# tar zxf etcd-v3.2.18-linux-amd64.tar.gz [root@linux-node1 src]# cd etcd-v3.2.18-linux-amd64 [root@linux-node1 etcd-v3.2.18-linux-amd64]# cp etcd etcdctl /opt/kubernetes/bin/ [root@linux-node1 etcd-v3.2.18-linux-amd64]# scp etcd etcdctl 192.168.56.12:/opt/kubernetes/bin/ [root@linux-node1 etcd-v3.2.18-linux-amd64]# scp etcd etcdctl 192.168.56.13:/opt/kubernetes/bin/ ``` ## 1.创建 etcd 证书签名请求: ``` #约定所有证书都放在 /usr/local/src/ssl 目录中,然后同步到其他机器 [root@linux-node1 ~]# cd /usr/local/src/ssl [root@linux-node1 ssl]# cat > etcd-csr.json < flanneld-csr.json </dev/null 2>&1 ``` 启动flannel ``` [root@linux-node1 ~]# systemctl daemon-reload [root@linux-node1 ~]# systemctl enable flannel [root@linux-node1 ~]# chmod +x /opt/kubernetes/bin/* [root@linux-node1 ~]# systemctl start flannel ``` 查看服务状态 ``` [root@linux-node1 ~]# systemctl status flannel ``` ## 配置Docker使用Flannel ``` [root@linux-node1 ~]# vim /usr/lib/systemd/system/docker.service [Unit] #在Unit下面修改After和增加Requires After=network-online.target flannel.service Wants=network-online.target Requires=flannel.service [Service] #增加EnvironmentFile=-/run/flannel/docker Type=notify EnvironmentFile=-/run/flannel/docker ExecStart=/usr/bin/dockerd $DOCKER_OPTS #最终配置 cat /usr/lib/systemd/system/docker.service [Unit] Description=Docker Application Container Engine Documentation=http://docs.docker.com After=network.target flannel.service Requires=flannel.service [Service] Type=notify EnvironmentFile=-/run/flannel/docker EnvironmentFile=-/opt/kubernetes/cfg/docker ExecStart=/usr/bin/dockerd $DOCKER_OPT_BIP $DOCKER_OPT_MTU $DOCKER_OPTS LimitNOFILE=1048576 LimitNPROC=1048576 ExecReload=/bin/kill -s HUP $MAINPID # Having non-zero Limit*s causes performance problems due to accounting overhead # in the kernel. We recommend using cgroups to do container-local accounting. LimitNOFILE=infinity LimitNPROC=infinity LimitCORE=infinity # Uncomment TasksMax if your systemd version supports it. # Only systemd 226 and above support this version. 
#TasksMax=infinity TimeoutStartSec=0 # set delegate yes so that systemd does not reset the cgroups of docker containers Delegate=yes # kill only the docker process, not all processes in the cgroup KillMode=process # restart the docker process if it exits prematurely Restart=on-failure StartLimitBurst=3 StartLimitInterval=60s [Install] WantedBy=multi-user.target ``` 将配置复制到另外两个阶段 ``` [root@linux-node1 ~]# scp /usr/lib/systemd/system/docker.service 192.168.56.12:/usr/lib/systemd/system/ [root@linux-node1 ~]# scp /usr/lib/systemd/system/docker.service 192.168.56.13:/usr/lib/systemd/system/ ``` 重启Docker ``` systemctl daemon-reload systemctl restart docker ``` ================================================ FILE: docs/k8s-error-resolution.md ================================================ ## 报错一:flanneld 启动不了 ``` Oct 10 10:42:19 linux-node1 flanneld: E1010 10:42:19.499080 1816 main.go:349] Couldn't fetch network config: 100: Key not found (/coreos.com) [11] ``` ## 解决办法: ``` #首先查看flannel使用的那种类型的网络模式是对应的etcd中的key是哪个(/kubernetes/network/config 或 /coreos.com/network ) [root@linux-node3 cfg]# cat /opt/kubernetes/cfg/flannel FLANNEL_ETCD="-etcd-endpoints=https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379" FLANNEL_ETCD_KEY="-etcd-prefix=/coreos.com/network" ----这个参数值 FLANNEL_ETCD_CAFILE="--etcd-cafile=/opt/kubernetes/ssl/ca.pem" FLANNEL_ETCD_CERTFILE="--etcd-certfile=/opt/kubernetes/ssl/flanneld.pem" FLANNEL_ETCD_KEYFILE="--etcd-keyfile=/opt/kubernetes/ssl/flanneld-key.pem" #etcd集群集群执行下面命令,清空etcd数据 rm -rf /var/lib/etcd/default.etcd/ #下面这条只需在一个节点执行就可以 #如果是/coreos.com/network则执行下面的 [root@linux-node1 ~]# /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem \ --cert-file /opt/kubernetes/ssl/flanneld.pem \ --key-file /opt/kubernetes/ssl/flanneld-key.pem \ --no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \ mk /coreos.com/network/config '{"Network":"172.17.0.0/16"}' #如果是/kubernetes/network/config则执行下面的 [root@linux-node1 ~]# /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem \ --cert-file /opt/kubernetes/ssl/flanneld.pem \ --key-file /opt/kubernetes/ssl/flanneld-key.pem \ --no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \ mk /kubernetes/network/config '{ "Network": "10.2.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 }}' ``` 参考文档:https://stackoverflow.com/questions/34439659/flannel-and-docker-dont-start ## 报错二:flanneld 启动不了 ``` Oct 10 11:40:11 linux-node1 flanneld: E1010 11:40:11.797324 20669 main.go:349] Couldn't fetch network config: 104: Not a directory (/kubernetes/network/config) [12] 问题原因:在初次配置的时候,把flannel的配置文件中的etcd-prefix-key配置成了/kubernetes/network/config,实际上应该是/kubernetes/network [root@linux-node1 ~]# cat /opt/kubernetes/cfg/flannel FLANNEL_ETCD="-etcd-endpoints=https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379" FLANNEL_ETCD_KEY="-etcd-prefix=/kubernetes/network/config" --正确的应该为 /kubernetes/network/ FLANNEL_ETCD_CAFILE="--etcd-cafile=/opt/kubernetes/ssl/ca.pem" FLANNEL_ETCD_CERTFILE="--etcd-certfile=/opt/kubernetes/ssl/flanneld.pem" FLANNEL_ETCD_KEYFILE="--etcd-keyfile=/opt/kubernetes/ssl/flanneld-key.pem" ``` 参考文档:https://www.cnblogs.com/lyzw/p/6016789.html ================================================ FILE: docs/k8s_pv_local.md ================================================ 参考文档: https://kubernetes.io/blog/2018/04/13/local-persistent-volumes-beta/ ================================================ FILE: docs/k8s重启pod.md 
================================================ 通过kubectl delete批量删除全部Pod ``` kubectl delete pod --all ``` ``` 在没有pod 的yaml文件时,强制重启某个pod kubectl get pod PODNAME -n NAMESPACE -o yaml | kubectl replace --force -f - ``` ``` Q:如何进入一个 pod ? kubectl get pod 查看pod name kubectl describe pod name_of_pod 查看pod详细信息 进入pod: [root@test001 ~]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-deployment-68c7f5464c-p52rl 1/1 Running 0 17m 172.20.1.22 10.33.35.6 nginx-deployment-68c7f5464c-qfd24 1/1 Running 0 17m 172.20.2.16 10.33.35.7 kubectl exec -it name-of-pod /bin/bash ``` 参考资料: https://www.jianshu.com/p/baa6b11062de ================================================ FILE: docs/master.md ================================================ ## 一.部署Kubernetes API服务部署 ### 0.准备软件包 ``` [root@linux-node1 ~]# cd /usr/local/src/kubernetes [root@linux-node1 kubernetes]# cp server/bin/kube-apiserver /opt/kubernetes/bin/ [root@linux-node1 kubernetes]# cp server/bin/kube-controller-manager /opt/kubernetes/bin/ [root@linux-node1 kubernetes]# cp server/bin/kube-scheduler /opt/kubernetes/bin/ ``` ### 1.创建生成CSR的 JSON 配置文件 ``` [root@linux-node1 ~]# cd /usr/local/src/ssl [root@linux-node1 ssl]# cat > kubernetes-csr.json < admin-csr.json < /etc/cni/net.d/10-default.conf <0{print $1}'| xargs kubectl certificate approve ``` 执行完毕后,查看节点状态已经是Ready的状态了 ``` [root@linux-node1 ~]# kubectl get node NAME STATUS ROLES AGE VERSION 192.168.56.12 Ready 103s v1.12.1 192.168.56.13 Ready 103s v1.12.1 ``` ## 部署Kubernetes Proxy 1.配置kube-proxy使用LVS ``` [root@linux-node2 ~]# yum install -y ipvsadm ipset conntrack ``` 2.创建 kube-proxy 证书请求 ``` [root@linux-node1 ~]# cd /usr/local/src/ssl/ [root@linux-node1 ssl]# cat > kube-proxy-csr.json < RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.0.1:443 rr persistent 10800 -> 192.168.56.11:6443 Masq 1 0 0 ``` 如果你在两台实验机器都安装了kubelet和proxy服务,使用下面的命令可以检查状态: ``` [root@linux-node1 ssl]# kubectl get node NAME STATUS ROLES AGE VERSION 192.168.56.12 Ready 22m v1.10.1 192.168.56.13 Ready 3m v1.10.1 ``` linux-node3节点请自行部署。 ================================================ FILE: docs/operational.md ================================================ ## 一、服务重启 ``` #master systemctl restart kube-scheduler systemctl restart kube-controller-manager systemctl restart kube-apiserver systemctl restart flannel systemctl restart etcd systemctl stop kube-scheduler systemctl stop kube-controller-manager systemctl stop kube-apiserver systemctl stop flannel systemctl stop etcd systemctl status kube-apiserver systemctl status kube-scheduler systemctl status kube-controller-manager systemctl status etcd #node systemctl restart kubelet systemctl restart kube-proxy systemctl restart flannel systemctl restart etcd systemctl stop kubelet systemctl stop kube-proxy systemctl stop flannel systemctl stop etcd systemctl status kubelet systemctl status kube-proxy systemctl status flannel systemctl status etcd ``` ## 二、常用查询 ``` #查询命名空间 [root@linux-node1 ~]# kubectl get namespace --all-namespaces NAME STATUS AGE default Active 3d13h kube-node-lease Active 3d13h kube-public Active 3d13h kube-system Active 3d13h #查询健康状况 [root@linux-node1 ~]# kubectl get cs --all-namespaces NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy {"health":"true"} etcd-2 Healthy {"health":"true"} etcd-1 Healthy {"health":"true"} #查询node [root@linux-node1 ~]# kubectl get node -o wide NAME STATUS ROLES AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME 
192.168.56.12 Ready 2m v1.10.3 CentOS Linux 7 (Core) 3.10.0-862.el7.x86_64 docker://18.6.1 192.168.56.13 Ready 2m v1.10.3 CentOS Linux 7 (Core) 3.10.0-862.el7.x86_64 docker://18.6.1 #创建测试deployment [root@linux-node1 ~]# kubectl run net-test --image=alpine --replicas=2 sleep 360000 #查看创建的deployment kubectl get deployment -o wide --all-namespaces #查询pod [root@linux-node1 ~]# kubectl get pod -o wide --all-namespaces NAME READY STATUS RESTARTS AGE IP NODE net-test-5767cb94df-6smfk 1/1 Running 1 1h 10.2.69.3 192.168.56.12 net-test-5767cb94df-ctkhz 1/1 Running 1 1h 10.2.17.3 192.168.56.13 #查询service [root@linux-node1 ~]# kubectl get service NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.1.0.1 443/TCP 4m #Etcd集群健康状况查询 [root@linux-node1 ~]# etcdctl --endpoints=https://192.168.56.11:2379 \ --ca-file=/opt/kubernetes/ssl/ca.pem \ --cert-file=/opt/kubernetes/ssl/etcd.pem \ --key-file=/opt/kubernetes/ssl/etcd-key.pem cluster-health ``` ## 三、修改POD的IP地址段 ``` #修改一 [root@linux-node1 ~]# vim /usr/lib/systemd/system/kube-controller-manager.service [Unit] Description=Kubernetes Controller Manager Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] ExecStart=/opt/kubernetes/bin/kube-controller-manager \ --address=127.0.0.1 \ --master=http://127.0.0.1:8080 \ --allocate-node-cidrs=true \ --service-cluster-ip-range=10.1.0.0/16 \ --cluster-cidr=10.2.0.0/16 \ ---POD的IP地址段 --cluster-name=kubernetes \ --cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem \ --cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem \ --service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem \ --root-ca-file=/opt/kubernetes/ssl/ca.pem \ --leader-elect=true \ --v=2 \ --logtostderr=false \ --log-dir=/opt/kubernetes/log Restart=on-failure RestartSec=5 [Install] WantedBy=multi-user.target #修改二(修改etcd key中的值) #创建etcd的key值 /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem --cert-file /opt/kubernetes/ssl/flanneld.pem --key-file /opt/kubernetes/ssl/flanneld-key.pem \ --no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \ mk /kubernetes/network/config '{ "Network": "10.2.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 }}' 获取etcd中key的值 /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem --cert-file /opt/kubernetes/ssl/flanneld.pem --key-file /opt/kubernetes/ssl/flanneld-key.pem \ --no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \ get /kubernetes/network/config 修改etcd中key的值 /opt/kubernetes/bin/etcdctl --ca-file /opt/kubernetes/ssl/ca.pem --cert-file /opt/kubernetes/ssl/flanneld.pem --key-file /opt/kubernetes/ssl/flanneld-key.pem \ --no-sync -C https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \ set /kubernetes/network/config '{ "Network": "10.3.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 }}' ``` ================================================ FILE: docs/外部访问K8s中Pod的几种方式.md ================================================ ``` Ingress是个什么鬼,网上资料很多(推荐官方),大家自行研究。简单来讲,就是一个负载均衡的玩意,其主要用来解决使用NodePort暴露Service的端口时Node IP会漂移的问题。同时,若大量使用NodePort暴露主机端口,管理会非常混乱。 好的解决方案就是让外界通过域名去访问Service,而无需关心其Node IP及Port。那为什么不直接使用Nginx?这是因为在K8S集群中,如果每加入一个服务,我们都在Nginx中添加一个配置,其实是一个重复性的体力活,只要是重复性的体力活,我们都应该通过技术将它干掉。 Ingress就可以解决上面的问题,其包含两个组件Ingress Controller和Ingress: Ingress 将Nginx的配置抽象成一个Ingress对象,每添加一个新的服务只需写一个新的Ingress的yaml文件即可 Ingress Controller 将新加入的Ingress转化成Nginx的配置文件并使之生效 ``` 参考文档: https://blog.csdn.net/qq_23348071/article/details/87185025 从外部访问K8s中Pod的五种方式 
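如上所述,每新增一个服务只需要再写一个 Ingress 的 yaml。下面给出一个最小示意(与本仓库 example/nginx/nginx-ingress.yaml 相同,域名 www.example.com 和后端 nginx-service 请按实际情况替换):

```
#示意:将 www.example.com 的 / 请求转发到集群内的 nginx-service:80
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx-ingress
spec:
  rules:
  - host: www.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: nginx-service
          servicePort: 80
```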
================================================ FILE: docs/虚拟机环境准备.md ================================================ # 一、安装环境准备 下载系统镜像:可以在阿里云镜像站点下载 CentOS 镜像: http://mirrors.aliyun.com/centos/7/isos/x86_64/CentOS-7-x86_64-DVD-1804.iso 创建虚拟机:步骤略。 # 二、操作系统安装 为了统一环境,保证实验的通用性,将网卡名称设置为 eth*,不使用 CentOS 7 默认的网卡命名规则。所以需要在安装的时候,增加内核参数。 ## 1)光标选择“Install CentOS 7” ![](https://github.com/Lancger/opsfull/blob/master/images/install%20centos7.png) ## 2)点击 Tab,打开 kernel 启动选项后,增加 net.ifnames=0 biosdevname=0,如下图所示。 ![](https://github.com/Lancger/opsfull/blob/master/images/change%20network.png) # 三、设置网络 ## 1.vmware-workstation设置网络。 如果你的默认 NAT 地址段不是 192.168.56.0/24 可以修改 VMware Workstation 的配置,点击编辑 -> 虚拟 网络配置,然后进行配置。 ![](https://github.com/Lancger/opsfull/blob/master/images/vmware-network.png) ## 2.virtualbox设置网络。 ![eth0](https://github.com/Lancger/opsfull/blob/master/images/virtualbox-network-eth0.jpg) ![eth1](https://github.com/Lancger/opsfull/blob/master/images/virtualbox-network-eth1.png) # 四、系统配置 ## 1.设置主机名 ``` [root@localhost ~]# vi /etc/hostname linux-node1.example.com 或 #修改本机hostname [root@localhost ~]# hostnamectl set-hostname linux-node1.example.com #让主机名修改生效 [root@localhost ~]# su -l Last login: Sun Sep 30 04:30:53 EDT 2018 on pts/0 [root@linux-node1 ~]# ``` ## 2.安装依赖 ``` #为了保证各服务器间时间一致,使用ntpdate同步时间。 # 安装ntpdate [root@linux-node1 ~]# yum install -y wget lrzsz vim net-tools openssh-clients ntpdate unzip xz $ 加入crontab 1 * * * * (/usr/sbin/ntpdate -s ntp1.aliyun.com;/usr/sbin/hwclock -w) > /dev/null 2>&1 1 * * * * /usr/sbin/ntpdate ntp1.aliyun.com >/dev/null 2>&1 #设置时区 [root@linux-node1 ~]# cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime ``` ## 3.设置 IP 地址 请配置静态 IP 地址。注意将 UUID 和 MAC 地址已经其它配置删除掉,便于进行虚 拟机克隆,请参考下面的配置。 ``` [root@linux-node1 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0 TYPE=Ethernet BOOTPROTO=static NAME=eth0 DEVICE=eth0 ONBOOT=yes IPADDR=192.168.56.11 NETMASK=255.255.255.0 #GATEWAY=192.168.56.2 #重启网络服务 [root@linux-node1 ~]# systemctl restart network ``` ## 4.关闭 NetworkManager 和防火墙开启自启动 ``` [root@linux-node1 ~]# systemctl disable firewalld [root@linux-node1 ~]# systemctl disable NetworkManager ``` ## 5.设置主机名解析 ``` [root@linux-node1 ~]# cat > /etc/hosts <> /etc/sysctl.conf [root@linux-node1 ~]# sysctl -p ``` ## 8.重启 ``` [root@linux-node1 ~]# reboot ``` ## 9.克隆虚拟机 关闭虚拟机,并克隆当前虚拟机 linux-node1 到 linux-node2 linux-node3,建议选择“创建完整克隆”,而不是“创 建链接克隆”。 克隆完毕后请给 linux-node2 linux-node3 设置正确的 IP 地址和主机名。 ## 10.给虚拟机做快照 分别给三台虚拟机做快照。以便于随时回到一个刚初始化完毕的系统中。可以有效的减少学习过程中 的环境准备时间。同时,请确保实验环境的一致性,便于顺利的完成所有实验。 ================================================ FILE: example/coredns/coredns.yaml ================================================ apiVersion: v1 kind: ServiceAccount metadata: name: coredns namespace: kube-system labels: kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: kubernetes.io/bootstrapping: rbac-defaults addonmanager.kubernetes.io/mode: Reconcile name: system:coredns rules: - apiGroups: - "" resources: - endpoints - services - pods - namespaces verbs: - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" labels: kubernetes.io/bootstrapping: rbac-defaults addonmanager.kubernetes.io/mode: EnsureExists name: system:coredns roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:coredns subjects: - kind: ServiceAccount name: coredns namespace: kube-system --- 
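# 下面的 ConfigMap 提供 CoreDNS 的 Corefile:kubernetes 插件负责 cluster.local 及反向解析,
# prometheus 在 :9153 暴露指标,proxy 将其余查询转发给宿主机的 /etc/resolv.conf,cache 缓存 30 秒。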
apiVersion: v1 kind: ConfigMap metadata: name: coredns namespace: kube-system labels: addonmanager.kubernetes.io/mode: EnsureExists data: Corefile: | .:53 { errors health kubernetes cluster.local. in-addr.arpa ip6.arpa { pods insecure upstream fallthrough in-addr.arpa ip6.arpa } prometheus :9153 proxy . /etc/resolv.conf cache 30 } --- apiVersion: extensions/v1beta1 kind: Deployment metadata: name: coredns namespace: kube-system labels: k8s-app: coredns kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile kubernetes.io/name: "CoreDNS" spec: replicas: 2 strategy: type: RollingUpdate rollingUpdate: maxUnavailable: 1 selector: matchLabels: k8s-app: coredns template: metadata: labels: k8s-app: coredns spec: serviceAccountName: coredns tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule - key: "CriticalAddonsOnly" operator: "Exists" containers: - name: coredns image: coredns/coredns:1.0.6 imagePullPolicy: IfNotPresent resources: limits: memory: 170Mi requests: cpu: 100m memory: 70Mi args: [ "-conf", "/etc/coredns/Corefile" ] volumeMounts: - name: config-volume mountPath: /etc/coredns ports: - containerPort: 53 name: dns protocol: UDP - containerPort: 53 name: dns-tcp protocol: TCP livenessProbe: httpGet: path: /health port: 8080 scheme: HTTP initialDelaySeconds: 60 timeoutSeconds: 5 successThreshold: 1 failureThreshold: 5 dnsPolicy: Default volumes: - name: config-volume configMap: name: coredns items: - key: Corefile path: Corefile --- apiVersion: v1 kind: Service metadata: name: coredns namespace: kube-system labels: k8s-app: coredns kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile kubernetes.io/name: "CoreDNS" spec: selector: k8s-app: coredns clusterIP: 10.1.0.2 ports: - name: dns port: 53 protocol: UDP - name: dns-tcp port: 53 protocol: TCP ================================================ FILE: example/nginx/nginx-daemonset.yaml ================================================ apiVersion: apps/v1 kind: DaemonSet metadata: name: nginx-daemonset labels: app: nginx spec: selector: matchLabels: app: nginx template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.13.12 ports: - containerPort: 80 ================================================ FILE: example/nginx/nginx-deployment.yaml ================================================ apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment labels: app: nginx spec: replicas: 3 selector: matchLabels: app: nginx template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.13.12 ports: - containerPort: 80 ================================================ FILE: example/nginx/nginx-ingress.yaml ================================================ apiVersion: extensions/v1beta1 kind: Ingress metadata: name: nginx-ingress spec: rules: - host: www.example.com http: paths: - path: / backend: serviceName: nginx-service servicePort: 80 ================================================ FILE: example/nginx/nginx-pod.yaml ================================================ apiVersion: v1 kind: Pod metadata: name: nginx-pod labels: app: nginx spec: containers: - name: nginx image: nginx:1.13.12 ports: - containerPort: 80 ================================================ FILE: example/nginx/nginx-rc.yaml ================================================ apiVersion: v1 kind: ReplicationController metadata: name: nginx-rc spec: replicas: 3 selector: app: nginx template: metadata: name: nginx labels: app: nginx spec: containers: - name: 
nginx image: nginx:1.13.12 ports: - containerPort: 80 ================================================ FILE: example/nginx/nginx-rs.yaml ================================================ apiVersion: apps/v1 kind: ReplicaSet metadata: name: nginx-rs labels: app: nginx spec: replicas: 3 selector: matchLabels: app: nginx template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.13.12 ports: - containerPort: 80 ================================================ FILE: example/nginx/nginx-service-nodeport.yaml ================================================ kind: Service apiVersion: v1 metadata: name: nginx-service spec: selector: app: nginx ports: - protocol: TCP port: 80 targetPort: 80 type: NodePort ================================================ FILE: example/nginx/nginx-service.yaml ================================================ kind: Service apiVersion: v1 metadata: name: nginx-service spec: selector: app: nginx ports: - protocol: TCP port: 80 targetPort: 80 ================================================ FILE: helm/README.md ================================================ # 一、Helm - K8S的包管理器 类似Centos的yum ## 1、Helm架构 ```bash helm包括chart和release. helm包含2个组件,Helm客户端和Tiller服务器. ``` ## 2、Helm客户端安装 1、脚本安装 ```bash #安装 curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get |bash #查看 which helm #因服务器端还没安装,这里会报无法连接 helm version #添加命令补全 helm completion bash > .helmrc echo "source .helmrc" >> .bashrc ``` 2、源码安装 ```bash #源码安装 #curl -O https://get.helm.sh/helm-v2.16.0-linux-amd64.tar.gz wget -O helm-v2.16.0-linux-amd64.tar.gz https://get.helm.sh/helm-v2.16.0-linux-amd64.tar.gz tar -zxvf helm-v2.16.0-linux-amd64.tar.gz cd linux-amd64 #若采用容器化部署到kubernetes中,则可以不用管tiller,只需将helm复制到/usr/bin目录即可 cp helm /usr/bin/ echo "source <(helm completion bash)" >> /root/.bashrc # 命令自动补全 ``` ## 3、Tiller服务器端安装 1、安装 ```bash helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.0 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts #查看 kubectl get --namespace=kube-system service tiller-deploy kubectl get --namespace=kube-system deployments. tiller-deploy kubectl get --namespace=kube-system pods|grep tiller-deploy #能够看到服务器版本信息 helm version #添加新的repo helm repo add stable http://mirror.azure.cn/kubernetes/charts/ ``` 2、创建helm-rbac.yaml文件 ```bash cat >helm-rbac.yaml<<\EOF apiVersion: v1 kind: ServiceAccount metadata: name: tiller namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1beta1 kind: ClusterRoleBinding metadata: name: tiller roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: tiller namespace: kube-system EOF kubectl apply -f helm-rbac.yaml ``` ## 4、Helm使用 ```bash #搜索 helm search #执行命名添加权限 kubectl create serviceaccount --namespace kube-system tiller kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}' #安装chart的mysql应用 helm install stable/mysql 会自动部署 Service,Deployment,Secret 和 PersistentVolumeClaim,并给与很多提示信息,比如mysql密码获取,连接端口等. #查看release各个对象 kubectl get service doltish-beetle-mysql kubectl get deployments. doltish-beetle-mysql kubectl get pods doltish-beetle-mysql-75fbddbd9d-f64j4 kubectl get pvc doltish-beetle-mysql helm list # 显示已经部署的release #删除 helm delete doltish-beetle kubectl get pods kubectl get service kubectl get deployments. 
kubectl get pvc ``` # 二、使用Helm部署Nginx Ingress ## 1、标记标签 我们将kub1(192.168.56.11)做为边缘节点,打上Label ```bash #查看node标签 kubectl get nodes --show-labels kubectl label node k8s-master-01 node-role.kubernetes.io/edge= $ kubectl get node NAME STATUS ROLES AGE VERSION k8s-master-01 Ready edge,master 59m v1.16.2 k8s-master-02 Ready 58m v1.16.2 k8s-master-03 Ready 58m v1.16.2 ``` ## 2、编写chart的值文件ingress-nginx.yaml ```bash cat >ingress-nginx.yaml<<\EOF controller: hostNetwork: true daemonset: useHostPort: false hostPorts: http: 80 https: 443 service: type: ClusterIP tolerations: - operator: "Exists" nodeSelector: node-role.kubernetes.io/edge: '' defaultBackend: tolerations: - operator: "Exists" nodeSelector: node-role.kubernetes.io/edge: '' EOF ``` ## 3、安装nginx-ingress ```bash helm del --purge nginx-ingress helm repo update helm install stable/nginx-ingress \ --name nginx-ingress \ --namespace kube-system \ -f ingress-nginx.yaml 如果访问 http://192.168.56.11 返回default backend,则部署完成。 #nginx-ingress docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.26.1 docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.26.1 quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.26.1 #defaultbackend docker pull googlecontainer/defaultbackend-amd64:1.5 docker tag googlecontainer/defaultbackend-amd64:1.5 k8s.gcr.io/defaultbackend-amd64:1.5 docker rmi googlecontainer/defaultbackend-amd64:1.5 ``` ## 4、查看 nginx-ingress 的 Pod ```bash kubectl get pods -n kube-system | grep nginx-ingress ``` # 三、Helm 安装部署Kubernetes的dashboard ## 1、创建tls secret ```bash openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout ./tls.key -out ./tls.crt -subj "/CN=k8s.test.com" ``` ## 2、安装tls secret ```bash kubectl delete secret dashboard-tls-secret -n kube-system kubectl -n kube-system create secret tls dashboard-tls-secret --key ./tls.key --cert ./tls.crt kubectl get secret -n kube-system |grep dashboard ``` ## 3、安装 ```bash cat >kubernetes-dashboard.yaml<<\EOF image: repository: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64 tag: v1.10.1 ingress: enabled: true hosts: - k8s.test.com annotations: nginx.ingress.kubernetes.io/ssl-redirect: "false" nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" tls: - secretName: dashboard-tls-secret hosts: - k8s.test.com nodeSelector: node-role.kubernetes.io/edge: '' tolerations: - key: node-role.kubernetes.io/master operator: Exists effect: NoSchedule - key: node-role.kubernetes.io/master operator: Exists effect: PreferNoSchedule rbac: clusterAdminRole: true EOF 相比默认配置,修改了以下配置项: ingress.enabled - 置为 true 开启 Ingress,用 Ingress 将 Kubernetes Dashboard 服务暴露出来,以便让我们浏览器能够访问 ingress.annotations - 指定 ingress.class 为 nginx,让我们安装 Nginx Ingress Controller 来反向代理 Kubernetes Dashboard 服务;由于 Kubernetes Dashboard 后端服务是以 https 方式监听的,而 Nginx Ingress Controller 默认会以 HTTP 协议将请求转发给后端服务,用secure-backends这个 annotation 来指示 Nginx Ingress Controller 以 HTTPS 协议将请求转发给后端服务 ingress.hosts - 这里替换为证书配置的域名 Ingress.tls - secretName 配置为 cert-manager 生成的免费证书所在的 Secret 资源名称,hosts 替换为证书配置的域名 rbac.clusterAdminRole - 置为 true 让 dashboard 的权限够大,这样我们可以方便操作多个 namespace ``` ## 4、命令安装 1、安装 ```bash #删除 helm delete kubernetes-dashboard helm del --purge kubernetes-dashboard #安装 helm install stable/kubernetes-dashboard \ -n kubernetes-dashboard \ --namespace kube-system \ -f kubernetes-dashboard.yaml ``` 2、查看pod ```bash kubectl get pods -n kube-system -o 
## 4、命令安装

1、安装

```bash
#删除
helm delete kubernetes-dashboard
helm del --purge kubernetes-dashboard

#安装
helm install stable/kubernetes-dashboard \
-n kubernetes-dashboard \
--namespace kube-system \
-f kubernetes-dashboard.yaml
```

2、查看pod

```bash
kubectl get pods -n kube-system -o wide
```

3、查看详细信息

```bash
kubectl describe pod `kubectl get pod -A|grep dashboard|awk '{print $2}'` -n kube-system
```

4、访问

```bash
#获取token
kubectl describe -n kube-system secret/`kubectl -n kube-system get secret | grep kubernetes-dashboard-token|awk '{print $1}'`

#访问
https://k8s.test.com
```

参考文档:

- https://www.cnblogs.com/hongdada/p/11395200.html 镜像问题
- https://www.qikqiak.com/post/install-nginx-ingress/
- https://www.cnblogs.com/bugutian/p/11366556.html 国内不fq安装K8S三: 使用helm安装kubernet-dashboard
- https://www.cnblogs.com/hongdada/p/11284534.html Helm 安装部署Kubernetes的dashboard
- https://www.cnblogs.com/chanix/p/11731388.html Helm - K8S的包管理器
- https://www.cnblogs.com/peitianwang/p/11649621.html

================================================
FILE: kubeadm/K8S-HA-V1.13.4-关闭防火墙版.md
================================================

# 环境介绍:

```bash
CentOS: 7.6
Docker: 18.06.1-ce
Kubernetes: 1.13.4
Kubeadm: 1.13.4
Kubelet: 1.13.4
Kubectl: 1.13.4
```

# 部署介绍:

创建高可用集群首先需要有一个 Master 节点,然后再让其他服务器加入,组成三个 Master 节点的高可用,最后再将工作节点 Node 加入。下面是每个节点要执行的步骤:

```bash
Master01: 二、三、四、五、六、七、八、九、十一
Master02、Master03: 二、三、五、六、四、九
node01、node02、node03: 二、五、六、九
```

# 集群架构:

![kubeadm高可用架构图](https://github.com/Lancger/opsfull/blob/master/images/kubeadm-ha.jpg)

# 一、Kubeadm 简介

### 1、Kubeadm 作用

Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令,作为快速创建 kubernetes 集群的最佳实践。

kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。

相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。

### 2、Kubeadm 功能

```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```

### 3、功能版本
| Area | Maturity Level |
| ---- | -------------- |
| Command line UX | GA |
| Implementation | GA |
| Config file API | beta |
| CoreDNS | GA |
| kubeadm alpha subcommands | alpha |
| High availability | alpha |
| DynamicKubeletConfig | alpha |
| Self-hosting | alpha |
# 二、前期准备

### 1、虚拟机分配说明

| 地址 | 主机名 | 内存&CPU | 角色 |
| ---- | ------ | -------- | ---- |
| 10.19.2.200 | - | - | vip |
| 10.19.2.56 | k8s-master-01 | 2C & 2G | master |
| 10.19.2.57 | k8s-master-02 | 2C & 2G | master |
| 10.19.2.58 | k8s-master-03 | 2C & 2G | master |
| 10.19.2.246 | k8s-node-01 | 4C & 8G | node |
| 10.19.2.247 | k8s-node-02 | 4C & 8G | node |
| 10.19.2.248 | k8s-node-03 | 4C & 8G | node |
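
按上表规划好各节点后,可以用一段简单的脚本把 /etc/hosts 批量分发到所有节点(示意写法,假设 root 免密登录已按后文步骤配置好):

```bash
# 将本机维护好的 /etc/hosts 分发到集群所有节点
for ip in 10.19.2.56 10.19.2.57 10.19.2.58 10.19.2.246 10.19.2.247 10.19.2.248; do
  scp /etc/hosts root@${ip}:/etc/hosts
done
```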
### 2、各个节点端口占用

- Master 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| ---- | ---- | -------- | ---- | ------ |
| TCP | Inbound(入口) | 6443* | Kubernetes API server | All |
| TCP | Inbound(入口) | 2379-2380 | etcd server client API | kube-apiserver, etcd |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 10251 | kube-scheduler | Self |
| TCP | Inbound(入口) | 10252 | kube-controller-manager | Self |
- node 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| ---- | ---- | -------- | ---- | ------ |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 30000-32767 | NodePort Services** | All |
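
部署前可以先确认上述端口没有被其他进程占用,下面是一个简单的检查脚本(示意写法):

```bash
# 逐个检查关键端口,无输出说明端口空闲
for port in 6443 2379 2380 10250 10251 10252; do
  ss -lnt | grep ":${port} " && echo "端口 ${port} 已被占用"
done
```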
### 3、基础环境设置

Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步、主机名称解析、关闭防火墙等等。

1、主机名称解析

分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问入口,因此一般需要有专用的 DNS 服务负责解析各节点主机名。不过,考虑到此处部署的是测试集群,为了降低系统复杂度,这里将基于 hosts 文件进行主机名称解析。

2、修改hosts和免key登录

```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.19.2.200 k8s-vip       master   master.k8s.io
10.19.2.56  k8s-master-01 master01 master01.k8s.io
10.19.2.57  k8s-master-02 master02 master02.k8s.io
10.19.2.58  k8s-master-03 master03 master03.k8s.io
10.19.2.246 k8s-node-01   node01   node01.k8s.io
10.19.2.247 k8s-node-02   node02   node02.k8s.io
10.19.2.248 k8s-node-03   node03   node03.k8s.io
EOF

#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```

3、修改hostname

```bash
#分别进入不同的服务器修改 hostname 名称
# 修改 10.19.2.56 服务器
hostnamectl set-hostname k8s-master-01
# 修改 10.19.2.57 服务器
hostnamectl set-hostname k8s-master-02
# 修改 10.19.2.58 服务器
hostnamectl set-hostname k8s-master-03
# 修改 10.19.2.246 服务器
hostnamectl set-hostname k8s-node-01
# 修改 10.19.2.247 服务器
hostnamectl set-hostname k8s-node-02
# 修改 10.19.2.248 服务器
hostnamectl set-hostname k8s-node-03
```

4、主机时间同步

```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
systemctl start chronyd.service
systemctl enable chronyd.service
```

5、关闭防火墙服务

```bash
systemctl stop firewalld
systemctl disable firewalld
```

6、关闭并禁用SELinux

```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0

# 编辑 /etc/selinux/config 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config

# 查看selinux状态
getenforce

如果为permissive,则执行reboot重新启动即可
```

7、禁用 Swap 设备

kubeadm 默认会预先检查当前主机是否禁用了 Swap 设备,并在未禁用时强制终止部署过程。因此,在主机内存资源充裕的条件下,需要禁用所有的 Swap 设备。

```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab

或

# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 /     xfs  defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs  defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```

8、设置系统参数

设置允许路由转发,不对bridge的数据进行处理

```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

#挂载br_netfilter
modprobe br_netfilter

#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf

#查看是否生成相关文件
ls /proc/sys/net/bridge
```

9、资源配置文件

`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用

```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```

10、安装依赖包以及相关工具

```bash
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```

# 三、安装Keepalived

- keepalived介绍: 是集群管理中保证集群高可用的一个服务软件,其功能类似于heartbeat,用来防止单点故障
- Keepalived作用: 为haproxy提供vip(10.19.2.200),在三个haproxy实例之间提供主备,降低当其中一个haproxy失效时对服务的影响。

###
1、yum安装Keepalived

```bash
# 安装keepalived
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y keepalived
```

### 2、配置Keepalived

```bash
cat > /etc/keepalived/keepalived.conf << \EOF
! Configuration File for keepalived

# 主要是配置故障发生时的通知对象以及机器标识。
global_defs {
   # 标识本节点的字条串,通常为 hostname,但不一定非得是 hostname。故障发生时,邮件通知会用到。
   router_id LVS_k8s
}

# 用来做健康检查的,当检查失败时会将 vrrp_instance 的 priority 减少相应的值。
vrrp_script check_haproxy {
    script "killall -0 haproxy"   #根据进程名称检测进程是否存活
    interval 3
    weight -2
    fall 10
    rise 2
}

# vrrp_instance 用来定义对外提供服务的 VIP 区域及其相关属性。
vrrp_instance VI_1 {
    state MASTER      #当前节点为MASTER,其他两个节点设置为BACKUP
    interface eth0    #改为自己的网卡
    virtual_router_id 51
    priority 250
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 35f18af7190d51c9f7f78f37300a0cbd
    }
    virtual_ipaddress {
        10.19.2.200   #虚拟ip,即VIP
    }
    track_script {
        check_haproxy
    }
}
EOF
```

当前节点的配置中 state 配置为 MASTER,其它两个节点设置为 BACKUP

配置说明:

```bash
virtual_ipaddress: vip
track_script: 执行上面定义好的检测的script
interface: 节点固有IP(非VIP)的网卡,用来发VRRP包。
virtual_router_id: 取值在0-255之间,用来区分多个instance的VRRP组播。
advert_int: 发VRRP包的时间间隔,即多久进行一次master选举(可以认为是健康查检时间间隔)。
authentication: 认证区域,认证类型有PASS和HA(IPSEC),推荐使用PASS(密码只识别前8位)。
state: 可以是MASTER或BACKUP,不过当其他节点keepalived启动时会将priority比较大的节点选举为MASTER,因此该项其实没有实质用途。
priority: 用来选举master的,要成为master,那么这个选项的值最好高于其他机器50个点,该项取值范围是1-255(在此范围之外会被识别成默认值100)。

# 1、注意防火墙需要放开vrrp协议(不然会出现脑裂现象,三台主机都存在VIP的情况)
#-A INPUT -p vrrp -j ACCEPT
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT

# 2、注意上面配置的 script "killall -0 haproxy" 是根据进程名称检测进程是否存活,检测失败时会在 /var/log/messages 中每隔一秒记录一条日志
# tail -100f /var/log/messages
Sep 27 10:54:16 tw19410s1 Keepalived_vrrp[9113]: /usr/bin/killall -0 haproxy exited with status 1
```

### 3、启动Keepalived

```bash
# 设置开机启动
systemctl enable keepalived

# 启动keepalived
systemctl start keepalived

# 查看启动状态
systemctl status keepalived
```

### 4、查看网络状态

keepalived 配置中 state 为 MASTER 的节点启动后,查看网络状态,可以看到虚拟IP已经加入到绑定的网卡中

```bash
[root@k8s-master-01 ~]# ip address show eth0
2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff
    inet 10.19.2.56/22 brd 10.19.3.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.19.2.200/32 scope global eth0
       valid_lft forever preferred_lft forever

当关掉当前节点的keepalived服务后将进行虚拟IP转移,将会推选 state 为 BACKUP 的某一节点为新的MASTER,可以在那台节点上查看网卡,将会查看到虚拟IP
```
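
配置完成后建议做一次主备切换演练,确认 VIP 能正常漂移(示意步骤,节点与网卡名以实际环境为准):

```bash
# 在当前持有 VIP 的 MASTER 节点上停止 keepalived
systemctl stop keepalived

# 在某台 BACKUP 节点上观察 VIP 是否漂移过来
ip address show eth0 | grep 10.19.2.200

# 演练结束后恢复服务
systemctl start keepalived
```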
# 四、安装haproxy

此处的haproxy为apiserver提供反向代理,haproxy将所有请求轮询转发到每个master节点上。相对于仅仅使用keepalived主备模式、只有单个master节点承载流量的方式,这种方式更加合理、健壮。

### 1、yum安装haproxy

```bash
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y haproxy
```

### 2、配置haproxy

```bash
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
    mode                 tcp
    bind                 *:16443
    option               tcplog
    default_backend      kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
    mode        tcp
    balance     roundrobin
    server      master01.k8s.io   10.19.2.56:6443 check
    server      master02.k8s.io   10.19.2.57:6443 check
    server      master03.k8s.io   10.19.2.58:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
    bind                 *:1080
    stats auth           admin:awesomePassword
    stats refresh        5s
    stats realm          HAProxy\ Statistics
    stats uri            /admin?stats
EOF
```

其他 master 节点(10.19.2.57 和 10.19.2.58)上的 haproxy 配置与此相同。

### 3、启动并检测haproxy

```bash
# 设置开机启动
systemctl enable haproxy

# 开启haproxy
systemctl start haproxy

# 查看启动状态
systemctl status haproxy
```

### 4、检测haproxy端口

```bash
ss -lnt | grep -E "16443|1080"
```
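
此时 apiserver 还没有部署,16443 端口只能验证监听和转发链路是否正常;可以顺手通过 VIP 做一次连通性检查(示意命令):

```bash
# 确认 VIP 上的 16443 端口可以建立 TCP 连接
nc -zv 10.19.2.200 16443

# 用上面 listen stats 中配置的账号密码访问 haproxy 统计页面
curl -u admin:awesomePassword "http://10.19.2.200:1080/admin?stats" | head -20
```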
# 五、安装Docker (所有节点)

### 1、移除之前安装过的Docker

```bash
sudo yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-ce-cli \
docker-engine

# 查看还有没有存在的docker组件
rpm -qa|grep docker

# 有则通过命令 yum -y remove XXX 来删除,比如:
yum remove docker-ce-cli
```

### 2、配置docker的yum源

下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源

- 阿里镜像源

```bash
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
```

- Docker官方镜像源

```bash
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
```

### 3、安装Docker:

```
# 显示docker-ce所有可安装版本:
yum list docker-ce --showduplicates | sort -r

# 安装指定docker版本
sudo yum install docker-ce-18.06.1.ce-3.el7 -y

# 启动docker并设置docker开机启动
systemctl enable docker
systemctl start docker

# 确认一下iptables filter表中FORWARD链的默认策略(policy)为ACCEPT。
iptables -nvL
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0

Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FORWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 18.06,发现默认策略又改回了ACCEPT,不确定是从哪个版本改回的;我们线上使用的17.06版本仍然需要手动调整这个策略。

# 执行下面命令
iptables -P FORWARD ACCEPT

# 修改docker的配置
vim /usr/lib/systemd/system/docker.service
# 增加下面命令
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT

# 配置docker加速器
cat > /etc/docker/daemon.json << \EOF
{
  "registry-mirrors": [
    "https://dockerhub.azk8s.cn",
    "https://i37dz0y4.mirror.aliyuncs.com"
  ],
  "insecure-registries": ["reg.hub.com"]
}
EOF

# 重启Docker
systemctl daemon-reload
systemctl restart docker
```

### 4、docker最终的服务文件

```
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity

# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes

# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
```
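
Docker 安装完成后,可以快速自检 cgroup driver 与 FORWARD 链策略是否符合预期(示意命令):

```bash
# cgroup driver 应为 systemd
docker info | grep -i cgroup

# FORWARD 链默认策略应为 ACCEPT
iptables -nvL FORWARD | head -3
```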
# 六、安装kubeadm、kubelet

### 1、配置可用的国内yum源用于安装:

```
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```

### 2、安装kubelet

```
# 需要在每台机器上都安装以下的软件包:
kubeadm: 用来初始化集群的指令。
kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。
kubectl: 用来与集群通信的命令行工具。

# 查看kubelet版本列表
yum list kubelet --showduplicates | sort -r

# 安装kubelet
yum install -y kubelet-1.13.4-0

# 启动kubelet并设置开机启动
systemctl enable kubelet
systemctl start kubelet

# 检查状态
# 此时发现是failed状态,属于正常现象:kubelet会10秒重启一次,等初始化master节点后即可正常
systemctl status kubelet
```

### 3、安装kubeadm

```
# 负责初始化集群
# 1、查看kubeadm版本列表
yum list kubeadm --showduplicates | sort -r

# 2、安装kubeadm
yum install -y kubeadm-1.13.4-0
# 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl

# 3、重启服务器
# 为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作
reboot
```

# 七、初始化第一个kubernetes master节点

```
# 因为需要绑定虚拟IP,所以需要首先查看虚拟IP落在这几台master机器的哪一台上
[root@k8s-master-01 ~]# ip address show eth0
2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff
    inet 10.19.2.56/22 brd 10.19.3.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.19.2.200/32 scope global eth0
       valid_lft forever preferred_lft forever

可以看到虚拟IP 10.19.2.200 和服务器IP 10.19.2.56 在同一台机器上,所以初始化kubernetes第一个master要在master01上进行安装
```

### 1、创建kubeadm配置的yaml文件

```
# 1、创建kubeadm配置的yaml文件
cat > kubeadm-config.yaml << EOF
apiServer:
  certSANs:
    - k8s-master-01
    - k8s-master-02
    - k8s-master-03
    - master.k8s.io
    - 10.19.2.56
    - 10.19.2.57
    - 10.19.2.58
    - 10.19.2.200
    - 127.0.0.1
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "master.k8s.io:16443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.13.4
networking:
  dnsDomain: cluster.local
  podSubnet: 10.20.0.0/16
  serviceSubnet: 10.10.0.0/16
scheduler: {}
EOF
```

以下两个地方设置:

- certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上)
- controlPlaneEndpoint: 虚拟IP:监听端口号

配置说明:

- imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库)
- podSubnet: 10.20.0.0/16 (pod地址池)
- serviceSubnet: 10.10.0.0/16 (service地址池)

### 2、初始化第一个master节点

```
kubeadm init --config kubeadm-config.yaml
```

日志

```
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ You can now join any number of machines by running the following on each node as root: kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f ``` 在此处看日志可以知道,通过 ``` kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f ``` 来让节点加入集群 ### 3、配置kubectl环境变量 ```bash # 配置环境变量 mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config # 指令补全 yum install bash-completion -y source <(kubectl completion bash) echo "source <(kubectl completion bash)" >> ~/.bashrc ``` ### 4、查看组件状态 ```bash kubectl get cs NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy {"health": "true"} # 查看pod状态 [root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system NAME READY STATUS RESTARTS AGE coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动 coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动 etcd-k8s-master-01 1/1 Running 0 6m39s kube-apiserver-k8s-master-01 1/1 Running 0 6m43s kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s kube-proxy-88s74 1/1 Running 0 7m32s kube-scheduler-k8s-master-01 1/1 Running 0 6m45s 可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态 ``` # 八、安装网络插件 ### 1、配置flannel插件的yaml文件 ```bash cat > kube-flannel.yaml << EOF --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: flannel rules: - apiGroups: - "" resources: - pods verbs: - get - apiGroups: - "" resources: - nodes verbs: - list - watch - apiGroups: - "" resources: - nodes/status verbs: - patch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: flannel roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: flannel subjects: - kind: ServiceAccount name: flannel namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: flannel namespace: kube-system --- kind: ConfigMap apiVersion: v1 metadata: name: kube-flannel-cfg namespace: kube-system labels: tier: node app: flannel data: cni-conf.json: | { "name": "cbr0", "plugins": [ { "type": "flannel", "delegate": { "hairpinMode": true, "isDefaultGateway": true } }, { "type": "portmap", "capabilities": { "portMappings": true } } ] } net-conf.json: | { "Network": "10.20.0.0/16", "Backend": { "Type": "vxlan" } } --- apiVersion: extensions/v1beta1 kind: DaemonSet metadata: name: kube-flannel-ds-amd64 namespace: kube-system labels: tier: node app: flannel spec: template: metadata: labels: tier: node app: flannel spec: hostNetwork: true nodeSelector: beta.kubernetes.io/arch: amd64 tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64 command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64 command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: "100m" memory: "50Mi" limits: cpu: "100m" memory: "50Mi" securityContext: 
privileged: true env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg EOF “Network”: “10.20.0.0/16”要和kubeadm-config.yaml配置文件中podSubnet: 10.20.0.0/16相同 ``` ### 2、创建flanner相关role和pod ``` # 应用生效 [root@k8s-master-01 ~]# kubectl apply -f kube-flannel.yaml clusterrole.rbac.authorization.k8s.io/flannel created clusterrolebinding.rbac.authorization.k8s.io/flannel created serviceaccount/flannel created configmap/kube-flannel-cfg created daemonset.extensions/kube-flannel-ds-amd64 created # 等待一会时间,再次查看各个pods的状态 [root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system NAME READY STATUS RESTARTS AGE coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功 coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功 etcd-k8s-master-01 1/1 Running 0 11m kube-apiserver-k8s-master-01 1/1 Running 0 12m kube-controller-manager-k8s-master-01 1/1 Running 0 11m kube-flannel-ds-amd64-7lj6m 1/1 Running 0 13s kube-proxy-88s74 1/1 Running 0 12m kube-scheduler-k8s-master-01 1/1 Running 0 12m ``` # 九、加入集群 ### 1、Master加入集群构成高可用 ``` 复制秘钥到各个节点 在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03 如果其他节点为初始化第一个master节点,则将该节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。 ``` - 复制文件到 master02 ``` ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd ``` - 复制文件到 master03 ``` ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd ``` - master节点加入集群  master02 和 master03 服务器上都执行加入集群操作 ```bash kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane ```  如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行从“复制秘钥”和“加入集群”这两步  如果是master加入,请在最后面加上 –experimental-control-plane 这个参数 ```bash # 显示安装过程: This node has joined the cluster and a new control plane instance was created: * Certificate signing request was sent to apiserver and approval was received. * The Kubelet was informed of the new secure connection details. * Master label and taint were applied to the new node. * The Kubernetes control plane instances scaled up. * A new etcd member was added to the local/stacked etcd cluster. To start administering your cluster from this node, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Run 'kubectl get nodes' to see this node join the cluster. 
```

- 配置kubectl环境变量

```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

### 2、node节点加入集群

除了让master节点加入集群组成高可用外,slave节点也要加入集群中。

这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作。

输入初始化k8s master时候提示的加入命令,如下:

```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```

node节点加入,不需要加上 --experimental-control-plane 这个参数

### 3、如果忘记加入集群的token和sha256 (如正常则跳过)

- 显示获取token列表

```
kubeadm token list
```

默认情况下 Token 的过期时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token

```
kubeadm token create
```

- 获取ca证书sha256编码hash值

```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```

拼接命令

```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a

如果是master加入,请在最后面加上 --experimental-control-plane 这个参数
```

### 4、查看各个节点加入集群情况

```
kubectl get nodes -o wide
```

# 十、从集群中删除 Node

- Master节点:

```
kubectl drain <node-name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node-name>
```

- Slave节点:

```
kubeadm reset
```

## 初始化失败

```bash
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -rf /var/lib/etcd/*
```

参考资料:

- http://www.mydlq.club/article/4/

================================================
FILE: kubeadm/K8S-HA-V1.16.x-云环境-Calico.md
================================================

# 环境介绍:

```bash
CentOS: 7.6
Docker: docker-ce-18.09.9
Kubernetes: 1.16.2
Kubeadm: 1.16.2
Kubelet: 1.16.2
calico: 3.8.2
nginx-ingress: 1.5.3
```

# 部署介绍:

```
三个 master 组成主节点集群,通过内网 load balancer 实现负载均衡;至少需要三个 master 节点才可组成高可用集群,否则会出现 脑裂 现象
多个 worker 组成工作节点集群,通过外网 load balancer 实现负载均衡
```

# 集群架构:

![kubeadm高可用架构图](https://github.com/Lancger/opsfull/blob/master/images/kubeadm-ha.jpg)

# 一、Kubeadm 简介

### 1、Kubeadm 作用

Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令,作为快速创建 kubernetes 集群的最佳实践。

kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。

相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。

### 2、Kubeadm 功能

```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```

### 3、功能版本
| Area | Maturity Level |
| ---- | -------------- |
| Command line UX | GA |
| Implementation | GA |
| Config file API | beta |
| CoreDNS | GA |
| kubeadm alpha subcommands | alpha |
| High availability | alpha |
| DynamicKubeletConfig | alpha |
| Self-hosting | alpha |
# 二、前期准备

### 1、虚拟机分配说明

| 地址 | 主机名 | 内存&CPU | 角色 |
| ---- | ------ | -------- | ---- |
| 10.10.1.100 | - | - | vip |
| 10.10.0.24 | k8s-master-01 | 2C & 2G | master |
| 10.10.0.32 | k8s-master-02 | 2C & 2G | master |
| 10.10.0.23 | k8s-master-03 | 2C & 2G | master |
| 10.10.0.25 | k8s-node-01 | 4C & 8G | node |
| 10.10.0.29 | k8s-node-02 | 4C & 8G | node |
| 10.10.0.12 | k8s-node-03 | 4C & 8G | node |
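
分配好地址后,可以先用一段小脚本确认各节点网络互通(示意写法,IP 以上表为准):

```bash
# 逐个探测各节点连通性
for ip in 10.10.0.24 10.10.0.32 10.10.0.23 10.10.0.25 10.10.0.29 10.10.0.12; do
  ping -c1 -W1 ${ip} >/dev/null && echo "${ip} ok" || echo "${ip} unreachable"
done
```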
### 2、各个节点端口占用

- Master 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| ---- | ---- | -------- | ---- | ------ |
| TCP | Inbound(入口) | 6443* | Kubernetes API server | All |
| TCP | Inbound(入口) | 2379-2380 | etcd server client API | kube-apiserver, etcd |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 10251 | kube-scheduler | Self |
| TCP | Inbound(入口) | 10252 | kube-controller-manager | Self |
- node 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| ---- | ---- | -------- | ---- | ------ |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 30000-32767 | NodePort Services** | All |
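
本文后面的基础环境设置会直接关闭 firewalld;如果环境要求保留防火墙,可以参考下面的示意命令放行上述端口(仓库中另有"开启防火墙"的部署版本):

```bash
# master 节点放行 K8S 相关端口(保留 firewalld 时)
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250-10252/tcp
# node 节点还需放行 NodePort 范围
firewall-cmd --permanent --add-port=30000-32767/tcp
firewall-cmd --reload
```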
### 3、基础环境设置  Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步,主机名称解析,关闭防火墙等等。 1、主机名称解析  分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问人口,因此一般需要有专用的 DNS 服务负责解决各节点主机 不过,考虑到此处部署的是测试集群,因此为了降低系复杂度,这里将基于 hosts 的文件进行主机名称解析。 2、修改hosts和免key登录 ```bash #分别进入不同服务器,进入 /etc/hosts 进行编辑 cat > /etc/hosts << \EOF 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 10.10.1.100 k8s-vip master master.k8s.io 10.10.0.24 k8s-master-01 master01 master01.k8s.io 10.10.0.32 k8s-master-02 master02 master02.k8s.io 10.10.0.23 k8s-master-03 master03 master03.k8s.io 10.10.0.25 k8s-node-01 node01 node01.k8s.io 10.10.0.29 k8s-node-02 node02 node02.k8s.io 10.10.0.12 k8s-node-03 node03 node03.k8s.io EOF #root用户免密登录 mkdir -p /root/.ssh/ chmod 700 /root/.ssh/ echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys chmod 400 /root/.ssh/authorized_keys ``` 3、修改hostname ```bash #分别进入不同的服务器修改 hostname 名称 # 修改 10.10.0.24 服务器 hostnamectl set-hostname k8s-master-01 # 修改 10.10.0.32 服务器 hostnamectl set-hostname k8s-master-02 # 修改 10.10.0.23 服务器 hostnamectl set-hostname k8s-master-03 # 修改 10.10.0.25 服务器 hostnamectl set-hostname k8s-node-01 # 修改 10.10.0.29 服务器 hostnamectl set-hostname k8s-node-02 # 修改 10.10.0.12 服务器 hostnamectl set-hostname k8s-node-03 ``` 4、主机时间同步 ```bash #将各个服务器的时间同步,并设置开机启动同步时间服务 yum install chrony -y systemctl restart chronyd.service systemctl enable chronyd.service ``` 5、关闭防火墙服务 ```bash systemctl stop firewalld systemctl disable firewalld ``` 6、关闭并禁用SELinux ```bash # 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive setenforce 0 # 编辑/etc/sysconfig selinux 文件,以彻底禁用 SELinux sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config # 查看selinux状态 getenforce 如果为permissive,则执行reboot重新启动即可 ``` 7、禁用 Swap 设备  kubeadm 默认会预先检当前主机是否禁用了 Swap 设备,并在未用时强制止部署 过程因此,在主机内存资惊充裕的条件下,需要禁用所有的 Swap 设备 ``` # 关闭当前已启用的所有 Swap 设备 swapoff -a && sysctl -w vm.swappiness=0 sed -ri 's/.*swap.*/#&/' /etc/fstab cat /etc/fstab 或 # 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行 vi /etc/fstab UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 / xfs defaults,noatime 0 0 UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs defaults,noatime 0 0 #UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0 ``` 8、设置系统参数  设置允许路由转发,不对bridge的数据进行处理 ```bash #创建 /etc/sysctl.d/k8s.conf 文件 cat > /etc/sysctl.d/k8s.conf << \EOF vm.swappiness = 0 net.ipv4.ip_forward = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1 EOF #挂载br_netfilter modprobe br_netfilter #生效配置文件 sysctl -p /etc/sysctl.d/k8s.conf #查看是否生成相关文件 ls /proc/sys/net/bridge ``` 9、资源配置文件 `/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用 ```bash echo "* soft nofile 65536" >> /etc/security/limits.conf echo "* hard nofile 65536" >> /etc/security/limits.conf echo "* soft nproc 65536" >> /etc/security/limits.conf echo "* hard nproc 65536" >> /etc/security/limits.conf echo "* soft memlock unlimited" >> /etc/security/limits.conf echo "* hard memlock unlimited" >> /etc/security/limits.conf ``` 10、安装依赖包以及相关工具 ```bash yum install -y epel-release yum install -y yum-utils nfs-utils expect device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl ``` # 五、安装Docker (所有节点) ### 1、移除之前安装过的Docker ```bash sudo yum remove -y docker \ docker-client \ docker-client-latest \ docker-common \ 
docker-latest \ docker-latest-logrotate \ docker-logrotate \ docker-selinux \ docker-engine-selinux \ docker-ce-cli \ docker-engine # 查看还有没有存在的docker组件 rpm -qa|grep docker # 有则通过命令 yum -y remove XXX 来删除,比如: yum remove docker-ce-cli ``` ### 2、配置docker的yum源 下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源 - 阿里镜像源 ```bash yum install -y yum-utils \ device-mapper-persistent-data \ lvm2 yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo ``` - Docker官方镜像源 ```bash yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo ``` ### 3、安装Docker: ``` # 显示docker-ce所有可安装版本: yum list docker-ce --showduplicates | sort -r # 安装指定docker版本 yum install -y docker-ce-18.09.9 docker-ce-cli-18.09.9 containerd.io # 启动docker并设置docker开机启动 systemctl enable docker systemctl start docker # 确认一下iptables 确认一下iptables filter表中FOWARD链的默认策略(pllicy)为ACCEPT。 iptables -nvL Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 0 0 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0 0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0 0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED 0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0 0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0 0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0 Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 1806,发现默认策略又改回了ACCEPT,这个不知道是从哪个版本改回的,因为我们线上版本使用的1706还是需要手动调整这个策略的。 # 执行下面命令 iptables -P FORWARD ACCEPT # 修改docker的配置 vim /usr/lib/systemd/system/docker.service # 增加下面命令(ExecReload后面新增ExecStartPost=...) ... ExecReload=/bin/kill -s HUP $MAINPID ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT ... # 修改docker Cgroup Driver为systemd # sed -i "s#^ExecStart=/usr/bin/dockerd.*#ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd#g" /usr/lib/systemd/system/docker.service # 设置 docker 镜像,提高 docker 镜像下载速度和稳定性 curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io # 或者直接配置文件docker加速器 cat > /etc/docker/daemon.json << \EOF { "exec-opts": ["native.cgroupdriver=systemd"], "registry-mirrors": [ "https://dockerhub.azk8s.cn", "https://i37dz0y4.mirror.aliyuncs.com" ], "insecure-registries": ["reg.hub.com"] } EOF # 重启Docker systemctl daemon-reload systemctl restart docker docker info|grep -i Cgroup ``` ### 4、docker最终的服务文件 ``` #注意,有变量的地方需要使用转义符号 cat > /usr/lib/systemd/system/docker.service << EOF [Unit] Description=Docker Application Container Engine Documentation=https://docs.docker.com BindsTo=containerd.service After=network-online.target firewalld.service containerd.service Wants=network-online.target Requires=docker.socket [Service] Type=notify # the default is not to use systemd for cgroups because the delegate issues still # exists and systemd currently does not support the cgroup feature set required # for containers run by docker ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd ExecReload=/bin/kill -s HUP \$MAINPID ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT TimeoutSec=0 RestartSec=2 Restart=always # Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229. # Both the old, and new location are accepted by systemd 229 and up, so using the old location # to make them work for either version of systemd. 
StartLimitBurst=3 # Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230. # Both the old, and new name are accepted by systemd 230 and up, so using the old name to make # this option work for either version of systemd. StartLimitInterval=60s # Having non-zero Limit*s causes performance problems due to accounting overhead # in the kernel. We recommend using cgroups to do container-local accounting. LimitNOFILE=infinity LimitNPROC=infinity LimitCORE=infinity # Comment TasksMax if your systemd version does not support it. # Only systemd 226 and above support this option. TasksMax=infinity # set delegate yes so that systemd does not reset the cgroups of docker containers Delegate=yes # kill only the docker process, not all processes in the cgroup KillMode=process [Install] WantedBy=multi-user.target EOF # 重启Docker systemctl daemon-reload systemctl restart docker systemctl enable docker ``` # 六、安装kubeadm、kubelet ### 1、配置yum源用于安装: - 1、配置国内yum源 ``` cat < /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/ enabled=1 gpgcheck=0 repo_gpgcheck=0 gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg EOF # 安装kubelet、kubeadm、kubectl yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes systemctl daemon-reload systemctl restart kubelet.service systemctl enable kubelet.service ``` - 2、kubeadm 官方镜像源 ``` cat < /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 enabled=1 gpgcheck=1 repo_gpgcheck=1 gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg EOF # 安装kubelet、kubeadm、kubectl yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes systemctl daemon-reload systemctl restart kubelet.service systemctl enable kubelet.service ``` ### 2、安装kubelet ``` # 需要在每台机器上都安装以下的软件包: kubeadm: 用来初始化集群的指令。 kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。 kubectl: 用来与集群通信的命令行工具。 # 查看kubelet版本列表 yum list kubelet --showduplicates | sort -r # 安装kubelet yum install -y kubelet-1.16.2 # 启动kubelet并设置开机启动 systemctl daemon-reload systemctl enable kubelet systemctl restart kubelet # 检查状态 检查状态,发现是failed状态,正常,kubelet会10秒重启一次,需等下面完成初始化master节点后即可正常 systemctl status kubelet # 查看kubelet日志 journalctl -u kubelet --no-pager ``` ### 3、安装kubeadm ``` # 负责初始化集群 # 1、查看kubeadm版本列表 yum list kubeadm --showduplicates | sort -r # 2、安装kubeadm yum install -y kubeadm-1.16.2 # 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl # 3、重启服务器 为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作 reboot ``` # 七、初始化第一个kubernetes master节点 以 `root` 身份在 `k8s-master-01` 机器上执行 初始化 `master` 节点时,如果因为中间某些步骤的配置出错,想要重新初始化 `master` 节点,请先执行 `yes | kubeadm reset` 操作 ```bash #查看初始化配置文件 kubeadm config view ``` 1、精简配置文件初始化 ``` # 替换 apiserver.demo 为 您想要的 dnsName export APISERVER_NAME=master.k8s.io # Kubernetes 容器组所在的网段,该网段安装完成后,由 kubernetes 创建,事先并不存在于您的物理网络中 export VER=v1.16.2 export POD_SUBNET=10.244.0.0/16 export SVC_SUBNET=10.96.0.0/12 rm -f ./kubeadm-config.yaml cat < ./kubeadm-config.yaml apiVersion: kubeadm.k8s.io/v1beta2 kind: ClusterConfiguration kubernetesVersion: ${VER} #imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers controlPlaneEndpoint: "${APISERVER_NAME}:6443" networking: serviceSubnet: "${SVC_SUBNET}" podSubnet: "${POD_SUBNET}" dnsDomain: "cluster.local" EOF # kubeadm 
init # 根据您服务器网速的情况,您需要等候 3 - 10 分钟 kubeadm init --config=kubeadm-config.yaml --upload-certs # 配置 kubectl rm -rf /root/.kube/ mkdir /root/.kube/ yes | cp -i /etc/kubernetes/admin.conf /root/.kube/config ``` 2、详细配置文件初始化 ``` # 1、创建kubeadm配置的yaml文件 rm -f ./kubeadm-config.yaml export VER=v1.16.2 export MASTER_NODE1=10.10.0.24 export APISERVER_NAME=master.k8s.io export POD_SUBNET=10.244.0.0/16 export SVC_SUBNET=10.96.0.0/12 cat < ./kubeadm-config.yaml apiVersion: kubeadm.k8s.io/v1beta2 bootstrapTokens: - groups: - system:bootstrappers:kubeadm:default-node-token token: abcdef.0123456789abcdef ttl: 24h0m0s usages: - signing - authentication kind: InitConfiguration localAPIEndpoint: advertiseAddress: ${MASTER_NODE1} #这里填写第一个初始化的master的ip bindPort: 6443 nodeRegistration: criSocket: /var/run/dockershim.sock name: k8s-master-01 #注意这里需要调整为自己的节点 taints: - effect: NoSchedule key: node-role.kubernetes.io/master --- apiVersion: kubeadm.k8s.io/v1beta2 kind: ClusterConfiguration clusterName: kubernetes kubernetesVersion: ${VER} certificatesDir: /etc/kubernetes/pki controllerManager: {} controlPlaneEndpoint: "${APISERVER_NAME}:16443" # 这里写vip的地址或域名加上端口 imageRepository: k8s.gcr.io #imageRepository: registry.aliyuncs.com/google_containers # 使用阿里云镜像 apiServer: timeoutForControlPlane: 4m0s certSANs: - k8s-master-01 - k8s-master-02 - k8s-master-03 - master.k8s.io - 10.10.1.100 - 10.10.0.24 - 10.10.0.32 - 10.10.0.23 - 127.0.0.1 dns: type: CoreDNS etcd: local: dataDir: /var/lib/etcd networking: dnsDomain: cluster.local podSubnet: ${POD_SUBNET} serviceSubnet: ${SVC_SUBNET} scheduler: {} --- apiVersion: kubeproxy.config.k8s.io/v1alpha1 kind: KubeProxyConfiguration mode: ipvs # kube-proxy 模式 EOF kubeadm init --config=kubeadm-config.yaml --upload-certs 以下两个地方设置: - certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上) - controlPlaneEndpoint: VIP:端口号 配置说明: imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库) podSubnet: 10.244.0.0/16 (#pod地址池) serviceSubnet: 10.96.0.0/12 (#service地址池) ``` 3、查看初始化配置文件 ``` # 查看kubeadm配置文件 root># kubeadm config view apiServer: extraArgs: authorization-mode: Node,RBAC timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta2 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controlPlaneEndpoint: master.k8s.io:6443 controllerManager: {} dns: type: CoreDNS etcd: local: dataDir: /var/lib/etcd imageRepository: k8s.gcr.io kind: ClusterConfiguration kubernetesVersion: v1.16.2 networking: dnsDomain: cluster.local podSubnet: 10.244.0.0/16 serviceSubnet: 10.96.0.0/12 scheduler: {} ``` ### 2、初始化第一个master节点 ``` kubeadm init --config=kubeadm-config.yaml --upload-certs #使用这个就不用做拷贝证书的操作 ``` 日志 ``` Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config You should now deploy a pod network to the cluster. 
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ You can now join any number of the control-plane node running the following command on each as root: kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \ --discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 \ --control-plane --certificate-key 6054323448a1aeb661b78763262db5c30e12026c54341400d48401a853194ec2 Please note that the certificate-key gives access to cluster sensitive data, keep it secret! As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use "kubeadm init phase upload-certs --upload-certs" to reload certs afterward. Then you can join any number of worker nodes by running the following on each as root: kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \ --discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 ``` ### 执行结果中 用于初始化第二、三个 master 节点 ``` #初始化第二个master节点 export MASTER_NODE2=10.10.0.32 kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE2} --token abcdef.0123456789abcdef \ --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \ --control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523 #初始化第三个master节点 export MASTER_NODE3=10.10.0.23 kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE3} --token abcdef.0123456789abcdef \ --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \ --control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523 ``` 用于初始化 worker 节点 ``` kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \ --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 ``` ### 3、配置kubectl环境变量 ```bash # 配置环境变量 rm -rf $HOME/.kube mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config # 指令补全 yum install bash-completion -y source <(kubectl completion bash) echo "source <(kubectl completion bash)" >> ~/.bashrc ``` ### 4、查看组件状态 ```bash kubectl get cs NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy {"health": "true"} # 查看pod状态 [root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system NAME READY STATUS RESTARTS AGE coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动 coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动 etcd-k8s-master-01 1/1 Running 0 6m39s kube-apiserver-k8s-master-01 1/1 Running 0 6m43s kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s kube-proxy-88s74 1/1 Running 0 7m32s kube-scheduler-k8s-master-01 1/1 Running 0 6m45s 可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态 #检查ETCD服务 docker exec -it $(docker ps |grep etcd_etcd|awk '{print $1}') sh etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key member list etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health ``` # 八、安装网络插件 ### 1、安装 calico 网络插件 ``` # 安装 calico 网络插件 # 参考文档 https://docs.projectcalico.org/v3.9/getting-started/kubernetes/ 
export POD_SUBNET=10.244.0.0/16 rm -f calico.yaml wget https://docs.projectcalico.org/v3.9/manifests/calico.yaml sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml kubectl apply -f calico.yaml ``` ### 2、等待一会时间,再次查看各个pods的状态 ``` [root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system NAME READY STATUS RESTARTS AGE coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功 coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功 etcd-k8s-master-01 1/1 Running 0 11m kube-apiserver-k8s-master-01 1/1 Running 0 12m kube-controller-manager-k8s-master-01 1/1 Running 0 11m kube-flannel-ds-amd64-7lj6m 1/1 Running 0 13s kube-proxy-88s74 1/1 Running 0 12m kube-scheduler-k8s-master-01 1/1 Running 0 12m ``` # 九、加入集群 ### 1、Master加入集群构成高可用 ``` 复制秘钥到各个节点 在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03 如果其他节点为初始化第一个master节点,则将该节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。 ``` - 复制文件到 master02 ``` ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd ``` - 复制文件到 master03 ``` ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd ``` - master节点加入集群  master02 和 master03 服务器上都执行加入集群操作 ```bash kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane ```  如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行从“复制秘钥”和“加入集群”这两步  如果是master加入,请在最后面加上 –experimental-control-plane 这个参数 ```bash # 显示安装过程: This node has joined the cluster and a new control plane instance was created: * Certificate signing request was sent to apiserver and approval was received. * The Kubelet was informed of the new secure connection details. * Master label and taint were applied to the new node. * The Kubernetes control plane instances scaled up. * A new etcd member was added to the local/stacked etcd cluster. To start administering your cluster from this node, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Run 'kubectl get nodes' to see this node join the cluster. 
``` - 配置kubectl环境变量 ```bash mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config # 指令补全 yum install bash-completion -y source <(kubectl completion bash) echo "source <(kubectl completion bash)" >> ~/.bashrc ``` ### 2、node节点加入集群  除了让master节点加入集群组成高可用外,slave节点也要加入集群中。  这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作  输入初始化k8s master时候提示的加入命令,如下: ``` kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f ```  node节点加入,不需要加上 –experimental-control-plane 这个参数 ### 3、如果忘记加入集群的token和sha256 (如正常则跳过) - 显示获取token列表 ``` kubeadm token list ``` 默认情况下 Token 过期是时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token ``` kubeadm token create ``` - 获取ca证书sha256编码hash值 ``` openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //' ``` 拼接命令 ``` kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a 如果是master加入,请在最后面加上 –experimental-control-plane 这个参数 ``` ### 4、查看各个节点加入集群情况 ``` kubectl get nodes -o wide ``` # 十、从集群中删除 Node - Master节点: ``` kubectl drain --delete-local-data --force --ignore-daemonsets kubectl delete node ``` - Slave节点: ``` kubeadm reset ``` ## 初始化失败 ```bash yes | kubeadm reset ifconfig cni0 down ip link delete cni0 ifconfig flannel.1 down ip link delete flannel.1 rm -rf /var/lib/cni/ rm -rf /var/lib/etcd/* ``` # 十一、安装Kubernetes Dashboard 2.0 ``` #安装 kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml #卸载 kubectl delete -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml ``` 参考资料: http://www.mydlq.club/article/4/ https://kuboard.cn/install/install-kubernetes.html#%E5%88%9D%E5%A7%8B%E5%8C%96%E7%AC%AC%E4%B8%80%E4%B8%AAmaster%E8%8A%82%E7%82%B9 https://blog.51cto.com/fengwan/2426528?source=dra kubeadm搭建高可用kubernetes 1.15.1 https://segmentfault.com/a/1190000018741112?utm_source=tag-newest Kubernetes的几种主流部署方式02-kubeadm部署高可用集群 ================================================ FILE: kubeadm/K8S-V1.16.2-开启防火墙-Flannel.md ================================================ Table of Contents ================= * [一、防火墙配置](#一防火墙配置) * [二、初始化](#二初始化) * [三、初始化集群](#三初始化集群) * [1、命令行初始化](#1命令行初始化) * [2、通过配置文件进行初始化](#2通过配置文件进行初始化) * [3、初始化进行的操作](#3初始化进行的操作) * [4、单独部署coredns(选择操作)](#4单独部署coredns选择操作) * [5、集群移除节点](#5集群移除节点) * [6、kube-proxy开启ipvs](#6kube-proxy开启ipvs) * [四、Master操作](#四master操作) * [五、Node操作](#五node操作) * [六、集群操作](#六集群操作) * [七、网络插件部署](#七网络插件部署) * [1、master上部署flannel插件](#1master上部署flannel插件) * [2、master上部署calico插件](#2master上部署calico插件) * [3、性能对比](#3性能对比) * [八、安装 Dashboard](#八安装-dashboard) * [1、下载yaml文件](#1下载yaml文件) * [2、修改配置](#2修改配置) * [3、查看dashboard](#3查看dashboard) * [4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml)](#4然后创建一个具有全局所有权限的用户来登录dashboardadminyaml) * [九、问题排查](#九问题排查) * [1、coredns异常问题](#1coredns异常问题) * [1.1、解决办法](#11解决办法) * [2、kubelet异常问题1](#2kubelet异常问题1) * [3、kubelet异常问题2](#3kubelet异常问题2) # 一、防火墙配置 ```bash chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow* yum install iptables iptables-services -y cat > /etc/sysconfig/iptables << \EOF # Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] :RH-Firewall-1-INPUT 
- [0:0] -A INPUT -j RH-Firewall-1-INPUT -A FORWARD -j RH-Firewall-1-INPUT -A RH-Firewall-1-INPUT -i lo -j ACCEPT -A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT -A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP ### k8s ### -A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT # serviceSubnet rules -A RH-Firewall-1-INPUT -s 10.96.0.0/12 -j ACCEPT # podSubnet rules -A RH-Firewall-1-INPUT -s 10.244.0.0/16 -j ACCEPT # keepalived rules -A RH-Firewall-1-INPUT -p vrrp -j ACCEPT # port rules -A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443,30000:32767 -j ACCEPT ### k8s ### -A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited COMMIT # Completed on Thu Aug 1 01:26:09 2019 EOF systemctl restart iptables.service systemctl enable iptables.service iptables -nvL ``` # 二、初始化 ```bash cat > /etc/hosts << \EOF 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 192.168.56.11 linux-node1 linux-node1.example.com 192.168.56.12 linux-node2 linux-node2.example.com 192.168.56.13 linux-node3 linux-node3.example.com EOF systemctl stop firewalld systemctl disable firewalld setenforce 0 sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config sed -i 's/SELINUXTYPE=.*/SELINUXTYPE=disabled/g' /etc/selinux/config # 关闭 swap swapoff -a #sed -ir 's/.*swap.*/#&/' /etc/fstab #或 yes | cp /etc/fstab /etc/fstab_bak cat /etc/fstab_bak |grep -v swap > /etc/fstab #export Time=`date "+%Y%m%d%H%M%S"` #cp /etc/fstab /etc/fstab_$Time cat > /etc/sysctl.d/k8s.conf << \EOF net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1 net.ipv4.ip_forward = 1 vm.swappiness = 0 EOF #加载 br_netfilter 模块 modprobe br_netfilter sysctl -p /etc/sysctl.d/k8s.conf #创建/etc/sysconfig/modules/ipvs.modules文件,保证在节点重启后能自动加载所需模块 cat > /etc/sysconfig/modules/ipvs.modules < /etc/docker/daemon.json << \EOF { "exec-opts": ["native.cgroupdriver=systemd"], "data-root": "/data0/docker-data", "registry-mirrors" : [ "https://ot2k4d59.mirror.aliyuncs.com/" ], "insecure-registries": ["reg.hub.com"] } EOF systemctl daemon-reload systemctl restart docker cat < /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64 enabled=1 gpgcheck=0 repo_gpgcheck=0 gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg EOF yum install -y kubelet-1.16.2-0 kubeadm-1.16.2-0 kubectl-1.16.2-0 --disableexcludes=kubernetes kubeadm version systemctl daemon-reload systemctl restart kubelet.service systemctl enable kubelet.service systemctl status kubelet #查看kubelet日志 journalctl -f -u kubelet #kubelet.service服务位置 ls -l /lib/systemd/system/kubelet.service ``` # 三、初始化集群 ## 1、命令行初始化 ```bash #master节点初始化指令 kubeadm init \ --apiserver-advertise-address=192.168.56.11 \ --image-repository registry.aliyuncs.com/google_containers \ --kubernetes-version v1.16.2 \ --apiserver-bind-port=6443 \ --service-cidr=10.96.0.0/12 \ --pod-network-cidr=10.244.0.0/16 #这里使用这个是因为官方flannel使用的这个段地址,不然的话,kube-flannel.yml那里需要调整 #其他节点可以先指定image源,先下载需要的镜像 kubeadm config images pull --image-repository 
registry.aliyuncs.com/google_containers #查看集群初始化配置 kubeadm config view #获取加入集群的指令 kubeadm token create --print-join-command kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25 #集群初始化如果遇到问题,可以使用下面的命令进行清理 yes | kubeadm reset ifconfig cni0 down ip link delete cni0 ifconfig flannel.1 down ip link delete flannel.1 rm -rf /var/lib/cni/ rm -f $HOME/.kube/config systemctl restart kubelet systemctl status kubelet journalctl -f -u kubelet ``` ## 2、通过配置文件进行初始化 ```bash #在 master 节点配置 kubeadm 初始化文件,可以通过如下命令导出默认的初始化配置: root># kubeadm config print init-defaults > kubeadm.yaml ``` ```bash #然后根据我们自己的需求修改配置,比如修改 imageRepository 的值,kube-proxy 的模式为 ipvs 如果是 flannel 网络插件的,需要将 networking.podSubnet 设置为默认的 10.244.0.0/16 如果是 Calico 网络插件的,配置成 Calico 的默认网段 podSubnet: 192.168.0.0/16,这个也可以修改Calico的配置文件调整 rm -f kubeadm.yaml cat > kubeadm.yaml << \EOF apiVersion: kubeadm.k8s.io/v1beta2 bootstrapTokens: - groups: - system:bootstrappers:kubeadm:default-node-token token: abcdef.0123456789abcdef ttl: 24h0m0s usages: - signing - authentication kind: InitConfiguration localAPIEndpoint: advertiseAddress: 192.168.56.11 #修改为主节点 IP bindPort: 6443 #controlPlaneEndpoint: 1.1.1.100 #如果前面配置了负载均衡,此处填写vip地址 nodeRegistration: criSocket: /var/run/dockershim.sock name: linux-node1.example.com taints: - effect: NoSchedule key: node-role.kubernetes.io/master --- apiServer: timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta2 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controllerManager: {} dns: type: CoreDNS #dns 类型 etcd: local: dataDir: /var/lib/etcd #imageRepository: k8s.gcr.io imageRepository: registry.aliyuncs.com/google_containers #国内不能访问 Google,修改为阿里云 kind: ClusterConfiguration kubernetesVersion: v1.16.2 # 修改版本号 networking: dnsDomain: cluster.local # 配置成 flannel 的默认网段 serviceSubnet: 10.96.0.0/12 podSubnet: 10.244.0.0/16 scheduler: {} --- # 开启 IPVS 模式 apiVersion: kubeproxy.config.k8s.io/v1alpha1 kind: KubeProxyConfiguration mode: ipvs # kube-proxy 模式 EOF kubeadm init --config kubeadm.yaml ``` ## 3、初始化进行的操作 ```bash 初始化操作主要经历了下面15个步骤,每个阶段均输出均使用[步骤名称]作为开头: 1、[init]:指定版本进行初始化操作 2、[preflight] :初始化前的检查和下载所需要的Docker镜像文件。 3、[kubelet-start] :生成kubelet的配置文件”/var/lib/kubelet/config.yaml”,没有这个文件kubelet无法启动,所以初始化之前的kubelet实际上启动失败。 4、[certificates]:生成Kubernetes使用的证书,存放在/etc/kubernetes/pki目录中。 5、[kubeconfig] :生成 KubeConfig 文件,存放在/etc/kubernetes目录中,组件之间通信需要使用对应文件。 6、[control-plane]:使用/etc/kubernetes/manifest目录下的YAML文件,安装 Master 组件。 7、[etcd]:使用/etc/kubernetes/manifest/etcd.yaml安装Etcd服务。 8、[wait-control-plane]:等待control-plan部署的Master组件启动。 9、[apiclient]:检查Master组件服务状态。 10、[uploadconfig]:更新配置 11、[kubelet]:使用configMap配置kubelet。 12、[patchnode]:更新CNI信息到Node上,通过注释的方式记录。 13、[mark-control-plane]:为当前节点打标签,打了角色Master,和不可调度标签,这样默认就不会使用Master节点来运行Pod。 14、[bootstrap-token]:生成token记录下来,后边使用kubeadm join往集群中添加节点时会用到 15、[addons]:安装附加组件CoreDNS和kube-proxy kubectl默认会在执行的用户家目录下面的.kube目录下寻找config文件。这里是将在初始化时[kubeconfig]步骤生成的admin.conf拷贝到.kube/config。 ``` ## 4、单独部署coredns(选择操作) ```bash # 不依赖kubeadm的方式,适用于不是使用kubeadm创建的k8s集群,或者kubeadm初始化集群之后,删除了dns相关部署 # 在calico网络中也配置一个coredns # 10.96.0.10 为k8s官方指定的kube-dns地址 rm -f coredns.yaml.sed deploy.sh coredns.yml wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh chmod +x deploy.sh ./deploy.sh -i 10.10.0.10 > coredns.yml 
#这里从--service-cidr=10.10.0.0/16中选用10.10.0.10作为coredns地址(若使用本文的 --service-cidr=10.96.0.0/12,则应选用 10.96.0.10)
kubectl apply -f coredns.yml

# 查看
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system

#删除coredns
kubectl delete deployment coredns -n kube-system
kubectl delete svc kube-dns -n kube-system
kubectl delete cm coredns -n kube-system
```

## 5、集群移除节点

```bash
1、#移除worker节点
在准备移除的 worker 节点上执行
kubeadm reset

2、在第一个 master 节点 demo-master-a-1 上执行
kubectl delete node demo-worker-x-x

#worker 节点的名字可以通过在第一个 master 节点 demo-master-a-1 上执行 kubectl get nodes 命令获得
```

## 6、kube-proxy开启ipvs

```bash
1、#修改ConfigMap的kube-system/kube-proxy中的config.conf,把 mode: "" 改为 mode: "ipvs" 保存退出即可
root># kubectl edit cm kube-proxy -n kube-system
configmap/kube-proxy edited

2、#删除之前的proxy pod
root># kubectl get pod -n kube-system |grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'

3、#查看proxy运行状态
root># kubectl get pod -n kube-system | grep kube-proxy

4、#查看日志,如果有 `Using ipvs Proxier.` 说明kube-proxy的ipvs 开启成功!
root># kubectl logs kube-proxy-54qnw -n kube-system
I0518 20:24:09.319160       1 server_others.go:176] Using ipvs Proxier.
W0518 20:24:09.319751       1 proxier.go:386] IPVS scheduler not specified, use rr by default
I0518 20:24:09.320035       1 server.go:562] Version: v1.14.2
I0518 20:24:09.334372       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0518 20:24:09.334853       1 config.go:102] Starting endpoints config controller
I0518 20:24:09.334916       1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I0518 20:24:09.334945       1 config.go:202] Starting service config controller
I0518 20:24:09.334976       1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0518 20:24:09.435153       1 controller_utils.go:1034] Caches are synced for service config controller
I0518 20:24:09.435271       1 controller_utils.go:1034] Caches are synced for endpoints config controller
```

# 四、Master操作

```bash
#将 master 节点上面的 $HOME/.kube/config 文件拷贝到 node 节点对应的文件中
mkdir -p $HOME/.kube
yes | cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
scp $HOME/.kube/config root@linux-node2:$HOME/.kube/config
scp $HOME/.kube/config root@linux-node3:$HOME/.kube/config

#指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

# 五、Node操作

```bash
#node节点操作
mkdir -p $HOME/.kube
sudo chown $(id -u):$(id -g) $HOME/.kube/config

#加入集群
kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25
```

# 六、集群操作

```bash
#批量重启docker
docker restart `docker ps -a -q`

root># kubectl get nodes
NAME                      STATUS     ROLES    AGE     VERSION
linux-node1.example.com   NotReady   master   11m     v1.15.3
linux-node2.example.com   NotReady   <none>   5m9s    v1.15.3
linux-node3.example.com   NotReady   <none>   4m58s   v1.15.3

可以看到是 NotReady 状态,这是因为还没有安装网络插件,接下来安装网络插件,可以在文档 https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 中选择我们自己的网络插件,这里我们安装 flannel:

iptables -I RH-Firewall-1-INPUT -s 10.96.0.0/16 -j ACCEPT
service iptables save

root># kubectl get pods -n kube-system
NAME                                              READY   STATUS    RESTARTS   AGE
coredns-5c98db65d4-mk254                          1/1     Running   0          14m
coredns-5c98db65d4-ntz98                          1/1     Running   0          14m
etcd-linux-node1.example.com                      1/1     Running   0          13m
kube-apiserver-linux-node1.example.com            1/1     Running   0          13m
kube-controller-manager-linux-node1.example.com   1/1     Running   0          13m
kube-flannel-ds-amd64-6kx7m                       1/1     Running   0          11m
kube-flannel-ds-amd64-cqfnb                       1/1     Running   0          11m
kube-flannel-ds-amd64-thxx2                       1/1     Running   0          11m
kube-proxy-gdtjg                                  1/1     Running   0          12m
kube-proxy-lcscl                                  1/1     Running   0          14m
kube-proxy-sb7d8                                  1/1     Running   0          12m
kube-scheduler-linux-node1.example.com            1/1     Running   0          13m
kubernetes-dashboard-fcfb4cbc-dqbq9               1/1     Running   0          4m43s

kubectl describe pod/coredns-5c98db65d4-mk254 -n kube-system

#创建Deployment
kubectl run --image=nginx nginx-web-1 --image-pull-policy='IfNotPresent' --replicas=3

#以不同方式暴露出去
kubectl expose deployment nginx-web-1 --port=80 --target-port=80
kubectl expose deployment nginx-web-1 --port=80 --target-port=80 --type=NodePort

root># kubectl exec -it nginx-web-1-5cc49f46bc-kn46r -- \
  sh -c "echo hello>/usr/share/nginx/html/index.html"

root># kubectl get svc -A
default   nginx-web-1   NodePort   10.10.43.53   <none>   80:30163/TCP   101s

root># kubectl get endpoints nginx-web-1
10.244.154.193:80,10.244.44.193:80,10.244.89.129:80   5m27s

root># curl 10.10.43.53
hello

#显示iptables规则(注意这里kube-proxy需要使用ipvs模式,上面主机预设的iptables策略才生效)
iptables -nvL --line-number

#删除规则
iptables -D RH-Firewall-1-INPUT 4
```

# 七、网络插件部署

## 1、master上部署flannel插件

```bash
#插件镜像 network: flannel image(因墙的问题,需要从国内源下载)
docker pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64
https://www.cnblogs.com/horizonli/p/10855666.html

#部署flannel
rm -f kube-flannel.yml
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
sed -i 's#image: quay.io/coreos/flannel:v0.11.0-amd64#image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64#g' kube-flannel.yml
kubectl apply -f kube-flannel.yml

#另外需要注意的是如果你的节点有多个网卡的话,需要在 kube-flannel.yml 中使用--iface参数指定集群主机内网网卡的名称,否则可能会出现 dns 无法解析。flanneld 启动参数加上--iface=<iface-name>
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
```

## 2、master上部署calico插件

```bash
export POD_SUBNET=10.244.0.0/16
rm -f calico.yaml
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml
kubectl apply -f calico.yaml

https://www.cnblogs.com/goldsunshine/p/10701242.html  k8s网络之Calico网络
```

## 3、性能对比

```bash
https://www.2cto.com/net/201701/591629.html  kubernetes flannel neutron calico三种网络方案性能测试分析
```

# 八、安装 Dashboard

使用 dashboard 最好把浏览器的默认语言设置为英文,不然在进入容器操作的时候会有bug,会出现重影;另外 k8s v1.16.x 之后,需要使用 Dashboard v2.0 以上的版本,否则会出现 error_outline 未知服务器错误 (404)

## 1、下载yaml文件

```bash
#下载
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta5/aio/deploy/recommended.yaml
```

## 2、修改配置

```bash
1、#热更新打补丁的方式修改svc
kubectl apply -f recommended.yaml
kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard -p '{"spec":{"type":"NodePort"}}'
kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard -p '{"spec": {"ports": [{"port":443, "nodePort": 30001}]}}'
kubectl get svc -A|grep kubernetes-dashboard

https://www.jianshu.com/p/f38e1767bf19  使用 kubectl patch 更新 API 对象

2、#手动修改recommended.yaml文件,为了方便访问,修改kubernetes-dashboard的Service定义,指定Service的type类型为NodePort,指定nodePort端口
kubectl delete -f recommended.yaml

---
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort   # 新增这一行,指定为NodePort方式
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001   # 指定端口为30001
  selector:
    k8s-app: kubernetes-dashboard
---

kubectl apply -f recommended.yaml

#注:dashboard-metrics-scraper的Service不需要修改
Kubernetes Dashboard 默认部署时,只配置了最低权限的 RBAC
参考文档:https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md
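# Sketch (not from the original doc): after either approach, it may help to confirm
# the Service really ended up as a NodePort on 30001 before opening the browser
kubectl -n kubernetes-dashboard get svc kubernetes-dashboard \
  -o jsonpath='{.spec.type} {.spec.ports[0].nodePort}{"\n"}'   # expect: NodePort 30001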
``` ## 3、查看dashboard ```bash root># kubectl get pod,deploy,svc -n kubernetes-dashboard NAME READY STATUS RESTARTS AGE pod/dashboard-metrics-scraper-76585494d8-ws57d 1/1 Running 0 2m18s pod/kubernetes-dashboard-6b86b44f87-q26w6 1/1 Running 0 2m18s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/dashboard-metrics-scraper 1/1 1 1 2m18s deployment.apps/kubernetes-dashboard 1/1 1 1 2m18s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/dashboard-metrics-scraper ClusterIP 10.102.114.143 8000/TCP 2m18s service/kubernetes-dashboard NodePort 10.111.191.70 443:30001/TCP 2m19s root># curl https://10.111.191.70:443 -k -I HTTP/1.1 200 OK Accept-Ranges: bytes Cache-Control: no-store Content-Length: 1262 Content-Type: text/html; charset=utf-8 Last-Modified: Mon, 14 Oct 2019 16:39:02 GMT Date: Wed, 13 Nov 2019 02:25:52 GMT # 我们可以看到官方的dashboard帮我们启动了web-ui,并且帮我们启动了一个Metric服务 # 但是dashboard默认使用的https的443端口 然后可以通过上面的 https://NodeIP:30001 端口去访问 Dashboard,要记住使用 https,Chrome不生效可以使用Firefox测试: ``` ## 4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml) ```bash cat > admin.yaml << \EOF kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: admin annotations: rbac.authorization.kubernetes.io/autoupdate: "true" roleRef: kind: ClusterRole name: cluster-admin apiGroup: rbac.authorization.k8s.io subjects: - kind: ServiceAccount name: admin namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: admin namespace: kube-system labels: kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile EOF kubectl apply -f admin.yaml kubectl delete -f admin.yaml #获取token kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}') ``` https://192.168.56.12:31513 然后用上面的base64解码后的字符串作为token登录Dashboard即可: k8s dashboard 最终我们就完成了使用 kubeadm 搭建 v1.15.3 版本的 kubernetes 集群、coredns、ipvs、flannel。 # 九、问题排查 ## 1、coredns异常问题 ![coredns异常问题](https://github.com/Lancger/opsfull/blob/master/images/coredns-01.png) ``` E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-bccdc95cf-vlqxk.unknownuser.log.ERROR.20191006-123053.1: no such file or directory ``` ### 1.1、解决办法 ``` 实际上是主机防火墙的问题,需要添加 iptables -A RH-Firewall-1-INPUT -s 10.10.0.0/16 -j ACCEPT 其他参考 https://medium.com/@cminion/quicknote-kubernetes-networking-issues-78f1e0d06e12 https://github.com/coredns/coredns/issues/2325 ``` ## 2、kubelet异常问题1 ``` 问题现象: kubelet fails to get cgroup stats for docker and kubelet services 解决办法: cat > /etc/sysconfig/kubelet <<\EOF KUBELET_EXTRA_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice EOF systemctl daemon-reload systemctl restart kubelet systemctl status kubelet #查看kubelet日志 journalctl -f -u kubelet https://stackoverflow.com/questions/46726216/kubelet-fails-to-get-cgroup-stats-for-docker-and-kubelet-services https://www.twblogs.net/a/5cc87d63bd9eee1ac2ed736b ``` ## 3、kubelet异常问题2 ``` failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different 
from docker cgroup driver: "systemd" #解决办法 添加如下内容--cgroup-driver=systemd [root@tw19336 ~]# cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf # Note: This dropin only works with kubeadm and kubelet v1.11+ [Service] Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=systemd" Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml" # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file. EnvironmentFile=-/etc/sysconfig/kubelet ExecStart= ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS systemctl daemon-reload systemctl restart kubelet systemctl status kubelet https://www.cnblogs.com/hongdada/p/9771857.html ``` 参考文档: https://www.cnblogs.com/liyongjian5179/p/11417794.html 使用kubeadm安装Kubernetes 1.15.3 并开启 ipvs https://www.jianshu.com/p/8bc61078bded https://www.cnblogs.com/lovesKey/p/10888006.html centos7下用kubeadm安装k8s集群并使用ipvs做高可用方案 https://github.com/kubernetes/dashboard/wiki/Creating-sample-user https://www.qikqiak.com/post/use-kubeadm-install-kubernetes-1.15.3/ https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 官方文档 https://www.jianshu.com/p/d0933d6ae162 kubeadm 1.15 安装 https://yq.aliyun.com/articles/680080/ 单独部署coredns https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/#stacked-etcd-topology etcd-stacked-cluster https://www.kubernetes.org.cn/5021.html etcd 集群运维实践 ================================================ FILE: kubeadm/Kubernetes 集群变更IP地址.md ================================================ 参考资料: https://blog.csdn.net/whywhy0716/article/details/92658111 Kubernetes 集群变更IP地址 ================================================ FILE: kubeadm/README.md ================================================ # 一、防火墙配置 ``` chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow* yum install iptables iptables-services -y cat > /etc/sysconfig/iptables << \EOF # Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] :RH-Firewall-1-INPUT - [0:0] -A INPUT -j RH-Firewall-1-INPUT -A FORWARD -j RH-Firewall-1-INPUT -A RH-Firewall-1-INPUT -i lo -j ACCEPT -A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT -A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP ### k8s ### -A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT # serviceSubnet rules -A RH-Firewall-1-INPUT -s 10.96.0.0/12 -j ACCEPT # podSubnet rules -A RH-Firewall-1-INPUT -s 10.244.0.0/16 -j ACCEPT # keepalived rules -A RH-Firewall-1-INPUT -p vrrp -j ACCEPT # port rules -A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443,30000:32767 -j ACCEPT ### k8s ### -A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A RH-Firewall-1-INPUT -j 
REJECT --reject-with icmp-host-prohibited COMMIT # Completed on Thu Aug 1 01:26:09 2019 EOF systemctl restart iptables.service systemctl enable iptables.service iptables -nvL ``` # 二、初始化 ```bash cat > /etc/hosts << \EOF 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 192.168.56.11 linux-node1 linux-node1.example.com 192.168.56.12 linux-node2 linux-node2.example.com 192.168.56.13 linux-node3 linux-node3.example.com EOF systemctl stop firewalld systemctl disable firewalld setenforce 0 sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config sed -i 's/SELINUXTYPE=.*/SELINUXTYPE=disabled/g' /etc/selinux/config # 关闭 swap swapoff -a #sed -ir 's/.*swap.*/#&/' /etc/fstab #或 yes | cp /etc/fstab /etc/fstab_bak cat /etc/fstab_bak |grep -v swap > /etc/fstab #export Time=`date "+%Y%m%d%H%M%S"` #cp /etc/fstab /etc/fstab_$Time cat > /etc/sysctl.d/k8s.conf << \EOF net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1 net.ipv4.ip_forward = 1 vm.swappiness = 0 EOF #加载 br_netfilter 模块 modprobe br_netfilter sysctl -p /etc/sysctl.d/k8s.conf #创建/etc/sysconfig/modules/ipvs.modules文件,保证在节点重启后能自动加载所需模块 cat > /etc/sysconfig/modules/ipvs.modules < /etc/docker/daemon.json << \EOF { "exec-opts": ["native.cgroupdriver=systemd"], "registry-mirrors" : [ "https://ot2k4d59.mirror.aliyuncs.com/" ] } EOF systemctl daemon-reload systemctl restart docker cat < /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64 enabled=1 gpgcheck=0 repo_gpgcheck=0 gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg EOF yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes systemctl daemon-reload systemctl restart kubelet.service kubeadm version systemctl enable kubelet.service systemctl status kubelet #查看kubelet日志 journalctl -f -u kubelet #kubelet.service服务位置 /lib/systemd/system/kubelet.service ``` # 三、初始化集群 1、命令行初始化 ```bash kubeadm init \ --apiserver-advertise-address=192.168.56.11 \ --image-repository registry.aliyuncs.com/google_containers \ --kubernetes-version v1.16.2 \ --apiserver-bind-port=6443 \ --service-cidr=10.96.0.0/12 \ --pod-network-cidr=10.244.0.0/16 #这里使用这个是因为官方flannel使用的这个段地址,不然的话,kube-flannel.yml那里需要调整 #其他节点可以先指定image源,先下载需要的镜像 kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers #获取加入集群的指令 kubeadm token create --print-join-command kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25 #集群初始化如果遇到问题,可以使用下面的命令进行清理 yes | kubeadm reset ifconfig cni0 down ip link delete cni0 ifconfig flannel.1 down ip link delete flannel.1 rm -rf /var/lib/cni/ rm -f $HOME/.kube/config systemctl restart kubelet systemctl status kubelet journalctl -f -u kubelet ``` 2、通过配置文件进行初始化 ```bash #在 master 节点配置 kubeadm 初始化文件,可以通过如下命令导出默认的初始化配置: root># kubeadm config print init-defaults > kubeadm.yaml ``` ``` #然后根据我们自己的需求修改配置,比如修改 imageRepository 的值,kube-proxy 的模式为 ipvs 如果是 flannel 网络插件的,需要将 networking.podSubnet 设置为默认的 10.244.0.0/16 如果是 Calico 网络插件的,配置成 Calico 的默认网段 podSubnet: 192.168.0.0/16,这个也可以修改Calico的配置文件调整 rm -f kubeadm.yaml cat > kubeadm.yaml << \EOF apiVersion: kubeadm.k8s.io/v1beta2 bootstrapTokens: - groups: - system:bootstrappers:kubeadm:default-node-token token: 
abcdef.0123456789abcdef ttl: 24h0m0s usages: - signing - authentication kind: InitConfiguration localAPIEndpoint: advertiseAddress: 192.168.56.11 #修改为主节点 IP bindPort: 6443 #controlPlaneEndpoint: 1.1.1.100 #如果前面配置了负载均衡,此处填写vip地址 nodeRegistration: criSocket: /var/run/dockershim.sock name: linux-node1.example.com taints: - effect: NoSchedule key: node-role.kubernetes.io/master --- apiServer: timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta2 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controllerManager: {} dns: type: CoreDNS #dns 类型 etcd: local: dataDir: /var/lib/etcd #imageRepository: k8s.gcr.io imageRepository: registry.aliyuncs.com/google_containers #国内不能访问 Google,修改为阿里云 kind: ClusterConfiguration kubernetesVersion: v1.16.2 # 修改版本号 networking: dnsDomain: cluster.local # 配置成 flannel 的默认网段 serviceSubnet: 10.96.0.0/12 podSubnet: 10.244.0.0/16 scheduler: {} --- # 开启 IPVS 模式 apiVersion: kubeproxy.config.k8s.io/v1alpha1 kind: KubeProxyConfiguration mode: ipvs # kube-proxy 模式 EOF kubeadm init --config kubeadm.yaml ``` 3、初始化进行的操作 ```bash 初始化操作主要经历了下面15个步骤,每个阶段均输出均使用[步骤名称]作为开头: 1、[init]:指定版本进行初始化操作 2、[preflight] :初始化前的检查和下载所需要的Docker镜像文件。 3、[kubelet-start] :生成kubelet的配置文件”/var/lib/kubelet/config.yaml”,没有这个文件kubelet无法启动,所以初始化之前的kubelet实际上启动失败。 4、[certificates]:生成Kubernetes使用的证书,存放在/etc/kubernetes/pki目录中。 5、[kubeconfig] :生成 KubeConfig 文件,存放在/etc/kubernetes目录中,组件之间通信需要使用对应文件。 6、[control-plane]:使用/etc/kubernetes/manifest目录下的YAML文件,安装 Master 组件。 7、[etcd]:使用/etc/kubernetes/manifest/etcd.yaml安装Etcd服务。 8、[wait-control-plane]:等待control-plan部署的Master组件启动。 9、[apiclient]:检查Master组件服务状态。 10、[uploadconfig]:更新配置 11、[kubelet]:使用configMap配置kubelet。 12、[patchnode]:更新CNI信息到Node上,通过注释的方式记录。 13、[mark-control-plane]:为当前节点打标签,打了角色Master,和不可调度标签,这样默认就不会使用Master节点来运行Pod。 14、[bootstrap-token]:生成token记录下来,后边使用kubeadm join往集群中添加节点时会用到 15、[addons]:安装附加组件CoreDNS和kube-proxy kubectl默认会在执行的用户家目录下面的.kube目录下寻找config文件。这里是将在初始化时[kubeconfig]步骤生成的admin.conf拷贝到.kube/config。 ``` 2、单独部署coredns(选择操作) ``` # 不依赖kubeadm的方式,适用于不是使用kubeadm创建的k8s集群,或者kubeadm初始化集群之后,删除了dns相关部署 # 在calico网络中也配置一个coredns # 10.96.0.10 为k8s官方指定的kube-dns地址 rm -f coredns.yaml.sed deploy.sh coredns.yml wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh chmod +x deploy.sh ./deploy.sh -i 10.10.0.10 > coredns.yml #这里从--service-cidr=10.10.0.0/16中选用10.10.0.10作为coredns地址 kubectl apply -f coredns.yml # 查看 kubectl get pods --namespace kube-system kubectl get svc --namespace kube-system #删除coredns kubectl delete deployment coredns -n kube-system kubectl delete svc kube-dns -n kube-system kubectl delete cm coredns -n kube-system ``` 3、集群移除节点 ``` 1、#移除work节点 在准备移除的 worker 节点上执行 kubeadm reset 2、在第一个 master 节点 demo-master-a-1 上执行 kubectl delete node demo-worker-x-x #worker 节点的名字可以通过在第一个 master 节点 demo-master-a-1 上执行 kubectl get nodes 命令获得 ``` 4、kube-proxy开启ipvs ``` 1、#修改ConfigMap的kube-system/kube-proxy中的config.conf,把 mode: "" 改为mode: “ipvs" 保存退出即可 root># kubectl edit cm kube-proxy -n kube-system configmap/kube-proxy edited 2、#删除之前的proxy pod root># kubectl get pod -n kube-system |grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}' 3、#查看proxy运行状态 root># kubectl get pod -n kube-system | grep kube-proxy 4、#查看日志,如果有 `Using ipvs Proxier.` 说明kube-proxy的ipvs 开启成功! root># kubectl logs kube-proxy-54qnw -n kube-system I0518 20:24:09.319160 1 server_others.go:176] Using ipvs Proxier. 
W0518 20:24:09.319751 1 proxier.go:386] IPVS scheduler not specified, use rr by default I0518 20:24:09.320035 1 server.go:562] Version: v1.14.2 I0518 20:24:09.334372 1 conntrack.go:52] Setting nf_conntrack_max to 131072 I0518 20:24:09.334853 1 config.go:102] Starting endpoints config controller I0518 20:24:09.334916 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller I0518 20:24:09.334945 1 config.go:202] Starting service config controller I0518 20:24:09.334976 1 controller_utils.go:1027] Waiting for caches to sync for service config controller I0518 20:24:09.435153 1 controller_utils.go:1034] Caches are synced for service config controller I0518 20:24:09.435271 1 controller_utils.go:1034] Caches are synced for endpoints config controller ``` # 四、Master操作 ``` #将 master 节点上面的 $HOME/.kube/config 文件拷贝到 node 节点对应的文件中 mkdir -p $HOME/.kube yes | cp -i /etc/kubernetes/admin.conf $HOME/.kube/config chown $(id -u):$(id -g) $HOME/.kube/config scp $HOME/.kube/config root@linux-node2:$HOME/.kube/config scp $HOME/.kube/config root@linux-node3:$HOME/.kube/config #指令补全 yum install bash-completion -y source <(kubectl completion bash) echo "source <(kubectl completion bash)" >> ~/.bashrc ``` # 五、Node操作 ``` #node节点操作 mkdir -p $HOME/.kube sudo chown $(id -u):$(id -g) $HOME/.kube/config #加入集群 kubeadm join 192.168.56.11:6443 --token 5avfk1.fwui1smk5utcu7m9 --discovery-token-ca-cert-hash sha256:6730e91a516d8bf3e26d8f5eddd6409a224f8703b94f6ecde2b1fd7481bbbd25 ``` # 六、集群操作 ``` #批量重启docker docker restart `docker ps -a -q` root># kubectl get nodes NAME STATUS ROLES AGE VERSION linux-node1.example.com NotReady master 11m v1.15.3 linux-node2.example.com NotReady 5m9s v1.15.3 linux-node3.example.com NotReady 4m58s v1.15.3 可以看到是 NotReady 状态,这是因为还没有安装网络插件,接下来安装网络插件,可以在文档 https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 中选择我们自己的网络插件,这里我们安装 flannel: iptables -I RH-Firewall-1-INPUT -s 10.96.0.0/16 -j ACCEPT service iptables save root># kubectl get pods -n kube-system NAME READY STATUS RESTARTS AGE coredns-5c98db65d4-mk254 1/1 Running 0 14m coredns-5c98db65d4-ntz98 1/1 Running 0 14m etcd-linux-node1.example.com 1/1 Running 0 13m kube-apiserver-linux-node1.example.com 1/1 Running 0 13m kube-controller-manager-linux-node1.example.com 1/1 Running 0 13m kube-flannel-ds-amd64-6kx7m 1/1 Running 0 11m kube-flannel-ds-amd64-cqfnb 1/1 Running 0 11m kube-flannel-ds-amd64-thxx2 1/1 Running 0 11m kube-proxy-gdtjg 1/1 Running 0 12m kube-proxy-lcscl 1/1 Running 0 14m kube-proxy-sb7d8 1/1 Running 0 12m kube-scheduler-linux-node1.example.com 1/1 Running 0 13m kubernetes-dashboard-fcfb4cbc-dqbq9 1/1 Running 0 4m43s kubectl describe pod/coredns-5c98db65d4-mk254 -n kube-system #创建Deployment kubectl run --image=nginx nginx-web-1 --image-pull-policy='IfNotPresent' --replicas=3 #以不同方式暴露出去 kubectl expose deployment nginx-web-1 --port=80 --target-port=80 kubectl expose deployment nginx-web-1 --port=80 --target-port=80 --type=NodePort root># kubectl exec -it nginx-web-1-5cc49f46bc-kn46r -- \ sh -c "echo hello>/usr/share/nginx/html/index.html" root># kubectl get svc -A default nginx-web-1 NodePort 10.10.43.53 80:30163/TCP 101s root># kubectl get endpoints nginx-web-1 10.244.154.193:80,10.244.44.193:80,10.244.89.129:80 5m27s root># curl 10.10.43.53 hello #显示iptables规则(注意这里kube-proxy需要使用ipvs模式,上面主机预设的iptables策略才生效) iptables -nvL --line-number #删除规则 iptables -D RH-Firewall-1-INPUT 4 ``` # 七、网络插件部署 1、master上部署flannel插件 ``` #插件镜像 network: flannel image(因墙的问题,需要从国内源下载) docker 
pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64 https://www.cnblogs.com/horizonli/p/10855666.html #部署flannel rm -f kube-flannel.yml wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml sed -i 's#image: quay.io/coreos/flannel:v0.11.0-amd64#image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64#g' kube-flannel.yml kubectl apply -f kube-flannel.yml #另外需要注意的是如果你的节点有多个网卡的话,需要在 kube-flannel.yml 中使用--iface参数指定集群主机内网网卡的名称,否则可能会出现 dns 无法解析。flanneld 启动参数加上--iface= args: - --ip-masq - --kube-subnet-mgr - --iface=eth0 ``` 2、master上部署calico插件 ``` export POD_SUBNET=10.244.0.0/16 rm -f calico.yaml wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml kubectl apply -f calico.yaml https://www.cnblogs.com/goldsunshine/p/10701242.html k8s网络之Calico网络 ``` 3、性能对比 ``` https://www.2cto.com/net/201701/591629.html kubernetes flannel neutron calico三种网络方案性能测试分析 ``` # 八、安装 Dashboard 使用 dashboard 最好把浏览器的默认语言设置为英文,不然在进入容器操作的时候会有bug,会出现重影 1、下载yaml文件 ``` wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml vim kubernetes-dashboard.yaml 1、# 修改镜像名称 ...... spec: containers: - name: kubernetes-dashboard #image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 #这个换成阿里云的镜像 image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1 ports: - containerPort: 8443 protocol: TCP args: - --auto-generate-certificates ...... 2、# 修改Service为NodePort类型 ...... kind: Service apiVersion: v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system spec: type: NodePort # 新增这一行,指定为NodePort方式 ports: - port: 443 targetPort: 8443 nodePort: 32370 #新增这一行,指定固定node端口 selector: k8s-app: kubernetes-dashboard ``` 2、dashboard最终文件 ``` cat > kubernetes-dashboard.yaml << \EOF # Copyright 2017 The Kubernetes Authors. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # ------------------- Dashboard Secret ------------------- # apiVersion: v1 kind: Secret metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard-certs namespace: kube-system type: Opaque --- # ------------------- Dashboard Service Account ------------------- # apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system --- # ------------------- Dashboard Role & Role Binding ------------------- # kind: Role apiVersion: rbac.authorization.k8s.io/v1 metadata: name: kubernetes-dashboard-minimal namespace: kube-system rules: # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret. - apiGroups: [""] resources: ["secrets"] verbs: ["create"] # Allow Dashboard to create 'kubernetes-dashboard-settings' config map. - apiGroups: [""] resources: ["configmaps"] verbs: ["create"] # Allow Dashboard to get, update and delete Dashboard exclusive secrets. 
- apiGroups: [""] resources: ["secrets"] resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"] verbs: ["get", "update", "delete"] # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map. - apiGroups: [""] resources: ["configmaps"] resourceNames: ["kubernetes-dashboard-settings"] verbs: ["get", "update"] # Allow Dashboard to get metrics from heapster. - apiGroups: [""] resources: ["services"] resourceNames: ["heapster"] verbs: ["proxy"] - apiGroups: [""] resources: ["services/proxy"] resourceNames: ["heapster", "http:heapster:", "https:heapster:"] verbs: ["get"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: kubernetes-dashboard-minimal namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: kubernetes-dashboard-minimal subjects: - kind: ServiceAccount name: kubernetes-dashboard namespace: kube-system --- # ------------------- Dashboard Deployment ------------------- # kind: Deployment apiVersion: apps/v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system spec: replicas: 1 revisionHistoryLimit: 10 selector: matchLabels: k8s-app: kubernetes-dashboard template: metadata: labels: k8s-app: kubernetes-dashboard spec: containers: - name: kubernetes-dashboard #image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.1 ports: - containerPort: 8443 protocol: TCP args: - --auto-generate-certificates # Uncomment the following line to manually specify Kubernetes API server Host # If not specified, Dashboard will attempt to auto discover the API server and connect # to it. Uncomment only if the default does not work. # - --apiserver-host=http://my-address:port volumeMounts: - name: kubernetes-dashboard-certs mountPath: /certs # Create on-disk volume to store exec logs - mountPath: /tmp name: tmp-volume livenessProbe: httpGet: scheme: HTTPS path: / port: 8443 initialDelaySeconds: 30 timeoutSeconds: 30 volumes: - name: kubernetes-dashboard-certs secret: secretName: kubernetes-dashboard-certs - name: tmp-volume emptyDir: {} serviceAccountName: kubernetes-dashboard # Comment the following tolerations if Dashboard must not be deployed on master tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule --- # ------------------- Dashboard Service ------------------- # kind: Service apiVersion: v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kube-system spec: type: NodePort # 新增这一行,指定为NodePort方式 ports: - port: 443 targetPort: 8443 nodePort: 32370 #新增这一行,指定固定node端口 selector: k8s-app: kubernetes-dashboard EOF kubectl apply -f kubernetes-dashboard.yaml ``` 3、查看dashboard ``` root># kubectl get pods -n kube-system -l k8s-app=kubernetes-dashboard NAME READY STATUS RESTARTS AGE kubernetes-dashboard-fcfb4cbc-dqbq9 1/1 Running 0 8m5s root># kubectl get svc -n kube-system -l k8s-app=kubernetes-dashboard NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes-dashboard NodePort 192.168.56.11 443:32730/TCP 8m25s 然后可以通过上面的 https://NodeIP:32730 端口去访问 Dashboard,要记住使用 https,Chrome不生效可以使用Firefox测试: ``` 4、然后创建一个具有全局所有权限的用户来登录Dashboard:(admin.yaml) ``` cat > admin.yaml << \EOF kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: admin annotations: rbac.authorization.kubernetes.io/autoupdate: "true" roleRef: kind: ClusterRole name: cluster-admin apiGroup: rbac.authorization.k8s.io subjects: - kind: 
ServiceAccount name: admin namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: admin namespace: kube-system labels: kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile EOF kubectl apply -f admin.yaml kubectl delete -f admin.yaml #获取token kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin | awk '{print $1}') ``` https://192.168.56.12:31513 然后用上面的base64解码后的字符串作为token登录Dashboard即可: k8s dashboard 最终我们就完成了使用 kubeadm 搭建 v1.15.3 版本的 kubernetes 集群、coredns、ipvs、flannel。 # 九、问题排查 1、coredns异常问题 ![coredns异常问题](https://github.com/Lancger/opsfull/blob/master/images/coredns-01.png) ``` E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host E1006 12:30:53.935744 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.10.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.10.0.1:443: connect: no route to host log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-bccdc95cf-vlqxk.unknownuser.log.ERROR.20191006-123053.1: no such file or directory ``` 解决办法 ``` 实际上是主机防火墙的问题,需要添加 iptables -A RH-Firewall-1-INPUT -s 10.10.0.0/16 -j ACCEPT 其他参考 https://medium.com/@cminion/quicknote-kubernetes-networking-issues-78f1e0d06e12 https://github.com/coredns/coredns/issues/2325 ``` 2、kubelet异常问题1 ``` 问题现象: kubelet fails to get cgroup stats for docker and kubelet services 解决办法: cat > /etc/sysconfig/kubelet <<\EOF KUBELET_EXTRA_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice EOF systemctl daemon-reload systemctl restart kubelet systemctl status kubelet #查看kubelet日志 journalctl -f -u kubelet https://stackoverflow.com/questions/46726216/kubelet-fails-to-get-cgroup-stats-for-docker-and-kubelet-services https://www.twblogs.net/a/5cc87d63bd9eee1ac2ed736b ``` 3、kubelet异常问题2 ``` failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd" #解决办法 添加如下内容--cgroup-driver=systemd [root@tw19336 ~]# cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf # Note: This dropin only works with kubeadm and kubelet v1.11+ [Service] Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=systemd" Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml" # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file. 
EnvironmentFile=-/etc/sysconfig/kubelet ExecStart= ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS systemctl daemon-reload systemctl restart kubelet systemctl status kubelet https://www.cnblogs.com/hongdada/p/9771857.html ``` 参考文档: https://www.cnblogs.com/liyongjian5179/p/11417794.html 使用kubeadm安装Kubernetes 1.15.3 并开启 ipvs https://www.jianshu.com/p/8bc61078bded https://www.cnblogs.com/lovesKey/p/10888006.html centos7下用kubeadm安装k8s集群并使用ipvs做高可用方案 https://github.com/kubernetes/dashboard/wiki/Creating-sample-user https://www.qikqiak.com/post/use-kubeadm-install-kubernetes-1.15.3/ https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ 官方文档 https://www.jianshu.com/p/d0933d6ae162 kubeadm 1.15 安装 https://yq.aliyun.com/articles/680080/ 单独部署coredns https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/#stacked-etcd-topology etcd-stacked-cluster https://www.kubernetes.org.cn/5021.html etcd 集群运维实践 ================================================ FILE: kubeadm/k8S-HA-V1.15.3-Calico-开启防火墙版.md ================================================ # 环境介绍: ```bash CentOS: 7.6 Docker: docker-ce-18.09.9 Kubernetes: 1.15.3 Kubeadm: 1.15.3 Kubelet: 1.15.3 Kubectl: 1.15.3 ``` # 部署介绍:  创建高可用首先先有一个 Master 节点,然后再让其他服务器加入组成三个 Master 节点高可用,然后再将工作节点 Node 加入。下面将描述每个节点要执行的步骤: ```bash Master01: 二、三、四、五、六、七、八、九、十一 Master02、Master03: 二、三、五、六、四、九 node01、node02、node03: 二、五、六、九 ``` # 防火墙配置 ```bash yum install iptables iptables-services -y cat > /etc/sysconfig/iptables << \EOF # Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] :RH-Firewall-1-INPUT - [0:0] -A INPUT -j RH-Firewall-1-INPUT -A FORWARD -j RH-Firewall-1-INPUT -A RH-Firewall-1-INPUT -i lo -j ACCEPT -A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT -A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP #k8s -A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.14/32 -j ACCEPT -A RH-Firewall-1-INPUT -p vrrp -j ACCEPT -A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443 -j ACCEPT # -A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited COMMIT # Completed on Thu Aug 1 01:26:09 2019 EOF systemctl restart iptables.service systemctl enable iptables.service iptables -nvL ``` # 集群架构: ![kubeadm高可用架构图](https://github.com/Lancger/opsfull/blob/master/images/kubeadm-ha.jpg) # 一、kuberadm 简介 ### 1、Kuberadm 作用  Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令作为快速创建 kubernetes 集群的最佳实践。  kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不是之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。  相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。 ### 2、Kuberadm 功能 ```bash kubeadm init: 启动一个 Kubernetes 主节点 kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群 kubeadm upgrade: 更新一个 Kubernetes 集群到新版本 kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令 kubeadm token: 管理 kubeadm join 使用的令牌 kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改 kubeadm version: 打印 kubeadm 版本 kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈 ``` ### 3、功能版本
| Area | Maturity Level |
| --- | --- |
| Command line UX | GA |
| Implementation | GA |
| Config file API | beta |
| CoreDNS | GA |
| kubeadm alpha subcommands | alpha |
| High availability | alpha |
| DynamicKubeletConfig | alpha |
| Self-hosting | alpha |
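As a quick, hedged sketch of how the kubeadm subcommands listed above are typically exercised (all flags below exist in kubeadm; output values are illustrative):

```bash
kubeadm version -o short                    # print the installed kubeadm version
kubeadm token list                          # show tokens usable with kubeadm join
kubeadm token create --print-join-command   # mint a fresh token and echo the full join command
kubeadm reset -f                            # destructive: undo init/join on this host
```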
# 二、前期准备

### 1、虚拟机分配说明

| 地址 | 主机名 | 内存&CPU | 角色 |
| --- | --- | --- | --- |
| 192.168.56.200 | - | - | vip |
| 192.168.56.11 | k8s-master-01 | 2C & 2G | master |
| 192.168.56.12 | k8s-master-02 | 2C & 2G | master |
| 192.168.56.13 | k8s-master-03 | 2C & 2G | master |
| 192.168.56.14 | k8s-node-01 | 4C & 8G | node |
| 192.168.56.15 | k8s-node-02 | 4C & 8G | node |
| 192.168.56.16 | k8s-node-03 | 4C & 8G | node |
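A small convenience sketch (IPs taken from the table above; assumes password-based root SSH is temporarily allowed) to make the later scp/ssh steps between nodes password-less:

```bash
# run once from the machine you administer the cluster from
for ip in 192.168.56.11 192.168.56.12 192.168.56.13 \
          192.168.56.14 192.168.56.15 192.168.56.16; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@$ip   # prompts for each node's password once
done
```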
### 2、各个节点端口占用

- Master 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| --- | --- | --- | --- | --- |
| TCP | Inbound(入口) | 6443* | Kubernetes API server | All |
| TCP | Inbound(入口) | 2379-2380 | etcd server client API | kube-apiserver, etcd |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 10251 | kube-scheduler | Self |
| TCP | Inbound(入口) | 10252 | kube-controller-manager | Self |
- node 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| --- | --- | --- | --- | --- |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 30000-32767 | NodePort Services** | All |
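Once the services are up, a quick way to spot-check the master ports above from another node is a plain bash TCP probe (a sketch; 192.168.56.11 is this doc's master01, adjust as needed):

```bash
for p in 6443 2379 2380 10250 10251 10252; do
  # /dev/tcp is a bash built-in pseudo-device; a successful redirect means the port answered
  if timeout 1 bash -c "</dev/tcp/192.168.56.11/$p" 2>/dev/null; then
    echo "port $p reachable"
  else
    echo "port $p closed/filtered"
  fi
done
```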
### 3、基础环境设置

 Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步,主机名称解析,关闭防火墙等等。

1、主机名称解析

 分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问入口,因此一般需要有专用的 DNS 服务负责解析各节点主机名。不过,考虑到此处部署的是测试集群,为了降低系统复杂度,这里将基于 hosts 文件进行主机名称解析。

2、修改hosts和免key登录

```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.200 k8s-vip master master.k8s.io
192.168.56.11 k8s-master-01 master01 master01.k8s.io
192.168.56.12 k8s-master-02 master02 master02.k8s.io
192.168.56.13 k8s-master-03 master03 master03.k8s.io
192.168.56.14 k8s-node-01 node01 node01.k8s.io
192.168.56.15 k8s-node-02 node02 node02.k8s.io
192.168.56.16 k8s-node-03 node03 node03.k8s.io
EOF

#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```

3、修改hostname

```bash
#分别进入不同的服务器修改 hostname 名称
# 修改 192.168.56.11 服务器
hostnamectl set-hostname k8s-master-01
# 修改 192.168.56.12 服务器
hostnamectl set-hostname k8s-master-02
# 修改 192.168.56.13 服务器
hostnamectl set-hostname k8s-master-03
# 修改 192.168.56.14 服务器
hostnamectl set-hostname k8s-node-01
# 修改 192.168.56.15 服务器
hostnamectl set-hostname k8s-node-02
# 修改 192.168.56.16 服务器
hostnamectl set-hostname k8s-node-03
```

4、主机时间同步

```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
systemctl start chronyd.service
systemctl enable chronyd.service
```

5、关闭防火墙服务

```bash
systemctl stop firewalld
systemctl disable firewalld
```

6、关闭并禁用SELinux

```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0

# 编辑 /etc/sysconfig/selinux 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config

# 查看selinux状态
getenforce

如果为permissive,则执行reboot重新启动即可
```

7、禁用 Swap 设备

 kubeadm 默认会预先检查当前主机是否禁用了 Swap 设备,并在未禁用时强制终止部署过程。因此,在主机内存资源充裕的条件下,需要禁用所有的 Swap 设备。

```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
cat /etc/fstab

或

# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 /     xfs  defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs  defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```

8、设置系统参数

 设置允许路由转发,不对bridge的数据进行处理

```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

#挂载br_netfilter
modprobe br_netfilter

#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf

#查看是否生成相关文件
ls /proc/sys/net/bridge
```

9、资源配置文件

`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用

```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```

10、安装依赖包以及相关工具

```bash
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```

# 三、安装Keepalived

- keepalived介绍: 是集群管理中保证集群高可用的一个服务软件,其功能类似于heartbeat,用来防止单点故障
- Keepalived作用: 
为haproxy提供vip(192.168.56.200)在三个haproxy实例之间提供主备,降低当其中一个haproxy失效的时对服务的影响。 ### 1、yum安装Keepalived ```bash # 安装keepalived chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow* yum install -y keepalived ``` ### 2、配置Keepalived ```bash cat < /etc/keepalived/keepalived.conf ! Configuration File for keepalived # 主要是配置故障发生时的通知对象以及机器标识。 global_defs { # 标识本节点的字条串,通常为 hostname,但不一定非得是 hostname。故障发生时,邮件通知会用到。 router_id LVS_K8S } # 用来做健康检查的,当时检查失败时会将 vrrp_instance 的 priority 减少相应的值。 vrrp_script check_haproxy { script "killall -0 haproxy" #根据进程名称检测进程是否存活 interval 3 weight -2 fall 10 rise 2 } # rp_instance用来定义对外提供服务的 VIP 区域及其相关属性。 vrrp_instance VI_1 { state MASTER #当前节点为MASTER,其他两个节点设置为BACKUP interface eth0 #改为自己的网卡 virtual_router_id 51 priority 250 advert_int 1 authentication { auth_type PASS auth_pass 35f18af7190d51c9f7f78f37300a0cbd } virtual_ipaddress { 192.168.56.200 #虚拟ip,即VIP } track_script { check_haproxy } } EOF ``` 当前节点的配置中 state 配置为 MASTER,其它两个节点设置为 BACKUP ```bash 配置说明: virtual_ipaddress: vip track_script: 执行上面定义好的检测的script interface: 节点固有IP(非VIP)的网卡,用来发VRRP包。 virtual_router_id: 取值在0-255之间,用来区分多个instance的VRRP组播 advert_int: 发VRRP包的时间间隔,即多久进行一次master选举(可以认为是健康查检时间间隔)。 authentication: 认证区域,认证类型有PASS和HA(IPSEC),推荐使用PASS(密码只识别前8位)。 state: 可以是MASTER或BACKUP,不过当其他节点keepalived启动时会将priority比较大的节点选举为MASTER,因此该项其实没有实质用途。 priority: 用来选举master的,要成为master,那么这个选项的值最好高于其他机器50个点,该项取值范围是1-255(在此范围之外会被识别成默认值100)。 # 1、注意防火墙需要放开vrrp协议(不然会出现脑裂现象,三台主机都存在VIP的情况) #-A INPUT -p vrrp -j ACCEPT -A RH-Firewall-1-INPUT -p vrrp -j ACCEPT #2、注意上面配置script "killall -0 haproxy" #根据进程名称检测进程是否存活,会在/var/log/messages每隔一秒执行检测的日志记录 # tail -100f /var/log/messages Sep 27 10:54:16 tw19410s1 Keepalived_vrrp[9113]: /usr/bin/killall -0 haproxy exited with status 1 ``` ### 3、启动Keepalived ```bash # 设置开机启动 systemctl enable keepalived # 启动keepalived systemctl start keepalived # 查看启动状态 systemctl status keepalived ``` ### 4、查看网络状态 kepplived 配置中 state 为 MASTER 的节点启动后,查看网络状态,可以看到虚拟IP已经加入到绑定的网卡中 ```bash [root@k8s-master-01 ~]# ip address show eth0 2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff inet 192.168.56.11/24 brd 192.168.56.255 scope global eth0 valid_lft forever preferred_lft forever inet 192.168.56.200/32 scope global eth0 valid_lft forever preferred_lft forever 当关掉当前节点的keeplived服务后将进行虚拟IP转移,将会推选state 为 BACKUP 的节点的某一节点为新的MASTER,可以在那台节点上查看网卡,将会查看到虚拟IP ``` # 四、安装haproxy  此处的haproxy为apiserver提供反向代理,haproxy将所有请求轮询转发到每个master节点上。相对于仅仅使用keepalived主备模式仅单个master节点承载流量,这种方式更加合理、健壮。 ### 1、yum安装haproxy ```bash chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow* yum install -y haproxy ``` ### 2、配置haproxy ```bash cat > /etc/haproxy/haproxy.cfg << EOF #--------------------------------------------------------------------- # Global settings #--------------------------------------------------------------------- global # to have these messages end up in /var/log/haproxy.log you will # need to: # 1) configure syslog to accept network log events. This is done # by adding the '-r' option to the SYSLOGD_OPTIONS in # /etc/sysconfig/syslog # 2) configure local2 events to go to the /var/log/haproxy.log # file. 
A line like the following can be added to # /etc/sysconfig/syslog # # local2.* /var/log/haproxy.log # log 127.0.0.1 local2 chroot /var/lib/haproxy pidfile /var/run/haproxy.pid maxconn 4000 user haproxy group haproxy daemon # turn on stats unix socket stats socket /var/lib/haproxy/stats #--------------------------------------------------------------------- # common defaults that all the 'listen' and 'backend' sections will # use if not designated in their block #--------------------------------------------------------------------- defaults mode http log global option httplog option dontlognull option http-server-close option forwardfor except 127.0.0.0/8 option redispatch retries 3 timeout http-request 10s timeout queue 1m timeout connect 10s timeout client 1m timeout server 1m timeout http-keep-alive 10s timeout check 10s maxconn 3000 #--------------------------------------------------------------------- # kubernetes apiserver frontend which proxys to the backends #--------------------------------------------------------------------- frontend kubernetes-apiserver mode tcp bind *:16443 option tcplog default_backend kubernetes-apiserver #--------------------------------------------------------------------- # round robin balancing between the various backends #--------------------------------------------------------------------- backend kubernetes-apiserver mode tcp balance roundrobin server master01.k8s.io 192.168.56.11:6443 check server master02.k8s.io 192.168.56.12:6443 check server master03.k8s.io 192.168.56.13:6443 check #--------------------------------------------------------------------- # collection haproxy statistics message #--------------------------------------------------------------------- listen stats bind *:1080 stats auth admin:awesomePassword stats refresh 5s stats realm HAProxy\ Statistics stats uri /admin?stats EOF ``` haproxy配置在其他master节点上(192.168.56.12和192.168.56.13)相同 ### 3、启动并检测haproxy ```bash # 设置开机启动 systemctl enable haproxy # 开启haproxy systemctl start haproxy # 查看启动状态 systemctl status haproxy ``` ### 4、检测haproxy端口 ```bash ss -lnt | grep -E "16443|1080" ``` # 五、安装Docker (所有节点) ### 1、移除之前安装过的Docker ```bash sudo yum remove -y docker \ docker-client \ docker-client-latest \ docker-common \ docker-latest \ docker-latest-logrotate \ docker-logrotate \ docker-selinux \ docker-engine-selinux \ docker-ce-cli \ docker-engine # 查看还有没有存在的docker组件 rpm -qa|grep docker # 有则通过命令 yum -y remove XXX 来删除,比如: yum remove docker-ce-cli ``` ### 2、配置docker的yum源 下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源 - 阿里镜像源 ```bash sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo ``` - Docker官方镜像源 ```bash sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo ``` ### 3、安装Docker: ``` # 显示docker-ce所有可安装版本: yum list docker-ce --showduplicates | sort -r # 安装指定docker版本 sudo yum install docker-ce-18.09.9-3.el7.x86_64 -y # 启动docker并设置docker开机启动 systemctl enable docker systemctl start docker # 确认一下iptables 确认一下iptables filter表中FOWARD链的默认策略(pllicy)为ACCEPT。 iptables -nvL Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 0 0 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0 0 0 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0 0 0 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED 0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0 0 0 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0 0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0 
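# Sketch (not from the original doc): the chain's default policy can also be checked directly:
iptables -nL FORWARD | head -1    # first line should read "Chain FORWARD (policy ACCEPT)"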
Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FOWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装docker 1806,发现默认策略又改回了ACCEPT,这个不知道是从哪个版本改回的,因为我们线上版本使用的1706还是需要手动调整这个策略的。 # 执行下面命令 iptables -P FORWARD ACCEPT # 修改docker的配置 vim /usr/lib/systemd/system/docker.service # 增加下面命令(ExecReload后面新增ExecStartPost=...) ... ExecReload=/bin/kill -s HUP $MAINPID ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT ... # 配置docker加速器 cat > /etc/docker/daemon.json << \EOF { "registry-mirrors": [ "https://dockerhub.azk8s.cn", "https://i37dz0y4.mirror.aliyuncs.com" ], "insecure-registries": ["reg.hub.com"] } EOF # 重启Docker systemctl daemon-reload systemctl restart docker ``` ### 4、docker最终的服务文件 ``` #注意,有变量的地方需要使用转义符号 cat > /usr/lib/systemd/system/docker.service << EOF [Unit] Description=Docker Application Container Engine Documentation=https://docs.docker.com BindsTo=containerd.service After=network-online.target firewalld.service containerd.service Wants=network-online.target Requires=docker.socket [Service] Type=notify # the default is not to use systemd for cgroups because the delegate issues still # exists and systemd currently does not support the cgroup feature set required # for containers run by docker ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd ExecReload=/bin/kill -s HUP \$MAINPID ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT TimeoutSec=0 RestartSec=2 Restart=always # Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229. # Both the old, and new location are accepted by systemd 229 and up, so using the old location # to make them work for either version of systemd. StartLimitBurst=3 # Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230. # Both the old, and new name are accepted by systemd 230 and up, so using the old name to make # this option work for either version of systemd. StartLimitInterval=60s # Having non-zero Limit*s causes performance problems due to accounting overhead # in the kernel. We recommend using cgroups to do container-local accounting. LimitNOFILE=infinity LimitNPROC=infinity LimitCORE=infinity # Comment TasksMax if your systemd version does not support it. # Only systemd 226 and above support this option. 
TasksMax=infinity # set delegate yes so that systemd does not reset the cgroups of docker containers Delegate=yes # kill only the docker process, not all processes in the cgroup KillMode=process [Install] WantedBy=multi-user.target EOF # 重启Docker systemctl daemon-reload systemctl restart docker systemctl enable docker ``` # 六、安装kubeadm、kubelet ### 1、配置可用的国内yum源用于安装: ``` cat < /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/ enabled=1 gpgcheck=0 repo_gpgcheck=0 gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg EOF ``` ### 2、安装kubelet ``` # 需要在每台机器上都安装以下的软件包: kubeadm: 用来初始化集群的指令。 kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。 kubectl: 用来与集群通信的命令行工具。 # 查看kubelet版本列表 yum list kubelet --showduplicates | sort -r # 安装kubelet yum install -y kubelet-1.15.3-0 # 启动kubelet并设置开机启动 systemctl enable kubelet systemctl start kubelet # 检查状态 检查状态,发现是failed状态,正常,kubelet会10秒重启一次,需等下面完成初始化master节点后即可正常 systemctl status kubelet # 查看kubelet日志 journalctl -u kubelet --no-pager ``` ### 3、安装kubeadm ``` # 负责初始化集群 # 1、查看kubeadm版本列表 yum list kubeadm --showduplicates | sort -r # 2、安装kubeadm yum install -y kubeadm-1.15.3-0 # 安装 kubeadm 时候会默认安装 kubectl ,所以不需要单独安装kubectl # 3、重启服务器 为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作 reboot ``` # 七、初始化第一个kubernetes master节点 ``` # 因为需要绑定虚拟IP,所以需要首先先查看虚拟IP启动这几台master机子哪台上 [root@k8s-master-01 ~]# ip address show eth0 2: eth0: mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 00:50:56:be:86:af brd ff:ff:ff:ff:ff:ff inet 192.168.56.56/22 brd 10.19.3.255 scope global eth0 valid_lft forever preferred_lft forever inet 192.168.56.200/32 scope global eth0 valid_lft forever preferred_lft forever 可以看到虚拟IP 192.168.56.200 和 服务器IP 192.168.56.11在一台机子上,所以初始化kubernetes第一个master要在master01机子上进行安装 ``` ### 1、创建kubeadm配置的yaml文件 ``` # 1、创建kubeadm配置的yaml文件 rm -f ./kubeadm-config.yaml export APISERVER_NAME=master.k8s.io export POD_SUBNET=10.20.0.0/16 export SVC_SUBNET=10.96.0.0/16 cat > kubeadm-config.yaml << EOF apiServer: certSANs: - k8s-master-01 - k8s-master-02 - k8s-master-03 - master.k8s.io - 192.168.56.11 - 192.168.56.12 - 192.168.56.13 - 192.168.56.200 - 127.0.0.1 extraArgs: authorization-mode: Node,RBAC timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta1 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controlPlaneEndpoint: "${APISERVER_NAME}:16443" controllerManager: {} dns: type: CoreDNS etcd: local: dataDir: /var/lib/etcd imageRepository: registry.aliyuncs.com/google_containers kind: ClusterConfiguration kubernetesVersion: v1.15.3 networking: dnsDomain: cluster.local podSubnet: "${POD_SUBNET}" serviceSubnet: "${SVC_SUBNET}" scheduler: {} EOF 以下两个地方设置: - certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上) - controlPlaneEndpoint: 虚拟IP:监控端口号 配置说明: imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库) podSubnet: 10.20.0.1/16 (#pod地址池) serviceSubnet: 10.96.0.1/16 (#service地址池) ``` ### 2、初始化第一个master节点 ``` kubeadm init --config=kubeadm-config.yaml --upload-certs #使用这个就不用做拷贝证书的操作 ``` 日志 ``` Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config You should now deploy a pod network to the cluster. 
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ You can now join any number of the control-plane node running the following command on each as root: kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \ --discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 \ --control-plane --certificate-key 6054323448a1aeb661b78763262db5c30e12026c54341400d48401a853194ec2 Please note that the certificate-key gives access to cluster sensitive data, keep it secret! As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use "kubeadm init phase upload-certs --upload-certs" to reload certs afterward. Then you can join any number of worker nodes by running the following on each as root: kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \ --discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 ``` ### 执行结果中 用于初始化第二、三个 master 节点 ``` kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \ --discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 \ --control-plane --certificate-key 6054323448a1aeb661b78763262db5c30e12026c54341400d48401a853194ec2 ``` 用于初始化 worker 节点 ``` kubeadm join master.k8s.io:16443 --token wf0eoe.liqcp0nhtlov4ioi \ --discovery-token-ca-cert-hash sha256:e43bbb08bb5decae1ce0001f2988ff79095e6be5a3dea77a7c6af180562c7e56 ``` ### 3、配置kubectl环境变量 ```bash # 配置环境变量 rm -rf $HOME/.kube mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config # 指令补全 yum install bash-completion -y source <(kubectl completion bash) echo "source <(kubectl completion bash)" >> ~/.bashrc ``` ### 4、查看组件状态 ```bash kubectl get cs NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy {"health": "true"} # 查看pod状态 [root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system NAME READY STATUS RESTARTS AGE coredns-78d4cf999f-5zt5z 0/1 Pending 0 7m32s ---coredns没有启动 coredns-78d4cf999f-mkgsx 0/1 Pending 0 7m32s ---coredns没有启动 etcd-k8s-master-01 1/1 Running 0 6m39s kube-apiserver-k8s-master-01 1/1 Running 0 6m43s kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s kube-proxy-88s74 1/1 Running 0 7m32s kube-scheduler-k8s-master-01 1/1 Running 0 6m45s 可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态 #检查ETCD服务 docker exec -it $(docker ps |grep etcd_etcd|awk '{print $1}') sh etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key member list etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health ``` # 八、安装网络插件 ### 1、安装 calico 网络插件 ``` # 安装 calico 网络插件 # 参考文档 https://docs.projectcalico.org/v3.8/getting-started/kubernetes/ export POD_SUBNET=10.20.0.0/16 rm -f calico.yaml wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml kubectl apply -f calico.yaml ``` ### 2、等待一会时间,再次查看各个pods的状态 ``` [root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system NAME READY STATUS RESTARTS AGE coredns-78d4cf999f-5zt5z 1/1 Running 0 12m ---coredns启动成功 coredns-78d4cf999f-mkgsx 1/1 Running 0 12m ---coredns启动成功 
etcd-k8s-master-01                      1/1     Running   0          11m
kube-apiserver-k8s-master-01            1/1     Running   0          12m
kube-controller-manager-k8s-master-01   1/1     Running   0          11m
kube-flannel-ds-amd64-7lj6m             1/1     Running   0          13s
kube-proxy-88s74                        1/1     Running   0          12m
kube-scheduler-k8s-master-01            1/1     Running   0          12m
```

# 九、加入集群

### 1、Master加入集群构成高可用
```
复制密钥到各个节点

在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03

如果第一个初始化的master节点不是master01,则将实际初始化的那个节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。
```

- 复制文件到 master02
```
ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd
```

- 复制文件到 master03
```
ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd
```

- master节点加入集群

master02 和 master03 服务器上都执行加入集群操作
```bash
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane
```

- 如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行"复制密钥"和"加入集群"这两步
- 如果是master加入,请在最后面加上 --experimental-control-plane 这个参数

```bash
# 显示安装过程:
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
```
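master02、master03 加入完成后,建议先做一次简单验证,确认三个控制平面节点和 etcd 成员均已就绪。下面是一个检查示例(命令沿用本文前面的 etcd 检查方式,etcd 端点、证书路径以实际环境为准):

```bash
# 确认三个master节点均为Ready状态
kubectl get nodes -o wide | grep k8s-master

# 确认etcd集群已有3个成员(在master01上执行,容器名/证书路径以实际环境为准)
docker exec $(docker ps | grep etcd_etcd | awk '{print $1}') \
  etcdctl --endpoints=https://192.168.56.11:2379 \
  --ca-file=/etc/kubernetes/pki/etcd/ca.crt \
  --cert-file=/etc/kubernetes/pki/etcd/server.crt \
  --key-file=/etc/kubernetes/pki/etcd/server.key member list
```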
- 配置kubectl环境变量
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

### 2、node节点加入集群

除了让master节点加入集群组成高可用外,slave节点也要加入集群中。

这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作

输入初始化k8s master时候提示的加入命令,如下:
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```

node节点加入,不需要加上 --experimental-control-plane 这个参数

### 3、如果忘记加入集群的token和sha256 (如正常则跳过)

- 显示获取token列表
```
kubeadm token list
```

默认情况下 Token 过期时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token
```
kubeadm token create
```

- 获取ca证书sha256编码hash值
```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```

拼接命令
```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a

如果是master加入,请在最后面加上 --experimental-control-plane 这个参数
```

### 4、查看各个节点加入集群情况
```
kubectl get nodes -o wide
```

# 十、从集群中删除 Node

- Master节点:
```
kubectl drain <NODE-NAME> --delete-local-data --force --ignore-daemonsets
kubectl delete node <NODE-NAME>
```

- Slave节点:
```
kubeadm reset
```

## 初始化失败
```bash
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -rf /var/lib/etcd/*
```

参考资料:

http://www.mydlq.club/article/4/

https://kuboard.cn/install/install-kubernetes.html#%E5%88%9D%E5%A7%8B%E5%8C%96%E7%AC%AC%E4%B8%80%E4%B8%AAmaster%E8%8A%82%E7%82%B9

https://blog.51cto.com/fengwan/2426528?source=dra kubeadm搭建高可用kubernetes 1.15.1

https://segmentfault.com/a/1190000018741112?utm_source=tag-newest Kubernetes的几种主流部署方式02-kubeadm部署高可用集群

================================================
FILE: kubeadm/k8S-HA-V1.15.3-Flannel-开启防火墙版.md
================================================

# 环境介绍:
```bash
CentOS: 7.6
Docker: docker-ce-18.09.9
Kubernetes: 1.15.3
Kubeadm: 1.15.3
Kubelet: 1.15.3
Kubectl: 1.15.3
```

# 部署介绍:

创建高可用集群首先要有一个 Master 节点,然后再让其他服务器加入组成三个 Master 节点的高可用,最后再将工作节点 Node 加入。下面列出每类节点要执行的章节:
```bash
Master01: 二、三、四、五、六、七、八、九、十一
Master02、Master03: 二、三、五、六、四、九
node01、node02、node03: 二、五、六、九
```

# 防火墙配置
```bash
1、防火墙策略

yum install iptables iptables-services -y

cat > /etc/sysconfig/iptables << \EOF
# Generated by iptables-save v1.4.21 on Thu Aug 1 01:26:09 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.0/24 -p tcp -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -j DROP
# k8s 服务器公网和内网IP,VIP都加上
-A RH-Firewall-1-INPUT -s 192.168.56.200/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.11/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.12/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.13/32 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.56.14/32 -j ACCEPT
# keepalived
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT
# serviceSubnet rules
-A RH-Firewall-1-INPUT -s 10.96.0.0/12 -j ACCEPT
# podSubnet rules
-A RH-Firewall-1-INPUT -s 10.244.0.0/16 -j ACCEPT
# port rules
-A RH-Firewall-1-INPUT -s 192.168.56.1/32 -p tcp -m multiport --dports 80,443,1080,6443,16443 -j ACCEPT
#
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Thu Aug 1 01:26:09 2019
EOF

systemctl restart iptables.service
systemctl enable iptables.service
iptables -nvL

2、hosts.deny配置(注意需要注释掉)
sed -ri 's/.*all:all.*/#all:all/g' /etc/hosts.deny
cat /etc/hosts.deny
```
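防火墙规则下发后,可以先用 nc 粗略确认各节点关键端口的放通情况,再继续后面的安装。下面是一个检测示例(IP、端口列表参照上面的 iptables 策略,以实际环境为准):

```bash
# 依次检测各master节点的关键端口连通性(示例)
for host in 192.168.56.11 192.168.56.12 192.168.56.13; do
  for port in 22 6443 16443; do
    nc -zv -w 2 $host $port
  done
done
```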
# 集群架构:

![kubeadm高可用架构图](https://github.com/Lancger/opsfull/blob/master/images/kubeadm-ha.jpg)

# 一、kubeadm 简介

### 1、Kubeadm 作用

Kubeadm 是一个工具,它提供了 kubeadm init 以及 kubeadm join 这两个命令作为快速创建 kubernetes 集群的最佳实践。

kubeadm 通过执行必要的操作来启动和运行一个最小可用的集群。它被故意设计为只关心启动集群,而不负责之前的节点准备工作。同样的,诸如安装各种各样值得拥有的插件,例如 Kubernetes Dashboard、监控解决方案以及特定云提供商的插件,这些都不在它负责的范围。

相反,我们期望由一个基于 kubeadm 从更高层设计的更加合适的工具来做这些事情;并且,理想情况下,使用 kubeadm 作为所有部署的基础将会使得创建一个符合期望的集群变得容易。

### 2、Kubeadm 功能
```bash
kubeadm init: 启动一个 Kubernetes 主节点
kubeadm join: 启动一个 Kubernetes 工作节点并且将其加入到集群
kubeadm upgrade: 更新一个 Kubernetes 集群到新版本
kubeadm config: 如果使用 v1.7.x 或者更低版本的 kubeadm 初始化集群,您需要对集群做一些配置以便使用 kubeadm upgrade 命令
kubeadm token: 管理 kubeadm join 使用的令牌
kubeadm reset: 还原 kubeadm init 或者 kubeadm join 对主机所做的任何更改
kubeadm version: 打印 kubeadm 版本
kubeadm alpha: 预览一组可用的新功能以便从社区搜集反馈
```

### 3、功能版本

| Area | Maturity Level |
| --- | --- |
| Command line UX | GA |
| Implementation | GA |
| Config file API | beta |
| CoreDNS | GA |
| kubeadm alpha subcommands | alpha |
| High availability | alpha |
| DynamicKubeletConfig | alpha |
| Self-hosting | alpha |
# 二、前期准备

### 1、虚拟机分配说明

| 地址 | 主机名 | 内存&CPU | 角色 |
| --- | --- | --- | --- |
| 10.199.1.200 | - | - | vip |
| 10.199.1.136 | k8s-master-01 | 2C & 2G | master |
| 10.199.1.137 | k8s-master-02 | 2C & 2G | master |
| 10.199.1.138 | k8s-master-03 | 2C & 2G | master |
| 10.199.1.139 | k8s-node-01 | 4C & 8G | node |
| 10.199.1.140 | k8s-node-02 | 4C & 8G | node |
| 10.199.1.141 | k8s-node-03 | 4C & 8G | node |
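正式安装前,可以参照上表先确认各节点网络可达,下面是一个简单的批量检测示例(IP 列表即上表中的地址):

```bash
# 按上表地址逐台ping一次,确认网络可达(示例)
for ip in 10.199.1.136 10.199.1.137 10.199.1.138 10.199.1.139 10.199.1.140 10.199.1.141; do
  ping -c 1 -W 1 $ip > /dev/null && echo "$ip ok" || echo "$ip unreachable"
done
```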
### 2、各个节点端口占用

- Master 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| --- | --- | --- | --- | --- |
| TCP | Inbound(入口) | 6443* | Kubernetes API server | All |
| TCP | Inbound(入口) | 2379-2380 | etcd server client API | kube-apiserver, etcd |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 10251 | kube-scheduler | Self |
| TCP | Inbound(入口) | 10252 | kube-controller-manager | Self |

- node 节点

| 规则 | 方向 | 端口范围 | 作用 | 使用者 |
| --- | --- | --- | --- | --- |
| TCP | Inbound(入口) | 10250 | Kubelet API | Self, Control plane |
| TCP | Inbound(入口) | 30000-32767 | NodePort Services** | All |
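集群各组件安装完成后,可以参照上表在对应节点上核对端口监听情况,示例如下(NodePort 端口只有在创建了对应类型的 Service 之后才会监听):

```bash
# master节点:核对上表中控制平面组件的端口(示例)
ss -lnt | grep -E "6443|2379|2380|10250|10251|10252"

# node节点:核对kubelet端口(示例)
ss -lnt | grep 10250
```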
### 3、基础环境设置

Kubernetes 需要一定的环境来保证正常运行,如各个节点时间同步、主机名称解析、关闭防火墙等等。

1、主机名称解析

分布式系统环境中的多主机通信通常基于主机名称进行,这在 IP 地址存在变化的可能性时为主机提供了固定的访问入口,因此一般需要有专用的 DNS 服务负责解析各节点主机名。不过,考虑到此处部署的是测试集群,为了降低系统复杂度,这里将基于 hosts 文件进行主机名称解析。

2、修改hosts和免key登录
```bash
#分别进入不同服务器,进入 /etc/hosts 进行编辑
cat > /etc/hosts << \EOF
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.199.1.200 k8s-vip       master   master.k8s.io
10.199.1.136 k8s-master-01 master01 master01.k8s.io
10.199.1.137 k8s-master-02 master02 master02.k8s.io
10.199.1.138 k8s-master-03 master03 master03.k8s.io
10.199.1.139 k8s-node-01   node01   node01.k8s.io
10.199.1.140 k8s-node-02   node02   node02.k8s.io
10.199.1.141 k8s-node-03   node03   node03.k8s.io
EOF

#root用户免密登录
mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhldudbadabdanKAAKAKKKKKKKKKKKKKKKKKKKKKKK root@k8s-master-01' > /root/.ssh/authorized_keys
chmod 400 /root/.ssh/authorized_keys
```

3、修改hostname
```bash
#分别进入不同的服务器修改 hostname 名称

# 修改 10.199.1.136 服务器
hostnamectl set-hostname k8s-master-01

# 修改 10.199.1.137 服务器
hostnamectl set-hostname k8s-master-02

# 修改 10.199.1.138 服务器
hostnamectl set-hostname k8s-master-03

# 修改 10.199.1.139 服务器
hostnamectl set-hostname k8s-node-01

# 修改 10.199.1.140 服务器
hostnamectl set-hostname k8s-node-02

# 修改 10.199.1.141 服务器
hostnamectl set-hostname k8s-node-03
```

4、主机时间同步
```bash
#将各个服务器的时间同步,并设置开机启动同步时间服务
systemctl restart chronyd.service
systemctl enable chronyd.service
```

5、关闭防火墙服务
```bash
systemctl stop firewalld
systemctl disable firewalld
```

6、关闭并禁用SELinux
```bash
# 若当前启用了 SELinux 则需要临时设置其当前状态为 permissive
setenforce 0

# 编辑/etc/sysconfig/selinux 文件,以彻底禁用 SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config

# 查看selinux状态
getenforce

如果为permissive,则执行reboot重新启动即可
```

7、禁用 Swap 设备

kubeadm 默认会预先检查当前主机是否禁用了 Swap 设备,并在未禁用时强制终止部署过程。因此,在主机内存资源充裕的条件下,需要禁用所有的 Swap 设备
```
# 关闭当前已启用的所有 Swap 设备
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
cat /etc/fstab

或

# 编辑 fstab 配置文件,注释掉标识为 Swap 设备的所有行
vi /etc/fstab
UUID=9be41058-76a6-4588-8e3f-5b44604d8de1 /     xfs  defaults,noatime 0 0
UUID=4489cc8f-1885-4e17-bfe7-8652fd1d3feb /boot xfs  defaults,noatime 0 0
#UUID=0f5ae5f1-4872-471f-9f3a-f172a43fc1ff swap swap defaults,noatime 0 0
```

8、设置系统参数

设置允许路由转发,不对bridge的数据进行处理
```bash
#创建 /etc/sysctl.d/k8s.conf 文件
cat > /etc/sysctl.d/k8s.conf << \EOF
vm.swappiness = 0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

#挂载br_netfilter
modprobe br_netfilter

#生效配置文件
sysctl -p /etc/sysctl.d/k8s.conf

#查看是否生成相关文件
ls /proc/sys/net/bridge
```

9、资源配置文件

`/etc/security/limits.conf` 是 Linux 资源使用配置文件,用来限制用户对系统资源的使用
```bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536"  >> /etc/security/limits.conf
echo "* hard nproc 65536"  >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
```

10、安装依赖包以及相关工具
```bash
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl
```
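以上基础环境设置完成后,可以用类似下面的命令在每台机器上快速自查一遍关键项(示例):

```bash
# 基础环境自查(示例)
swapon --summary             # 应无输出,表示swap已全部关闭
getenforce                   # 应为 Permissive 或 Disabled
lsmod | grep br_netfilter    # 应能看到 br_netfilter 模块已加载
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables   # 均应为 1
```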
# 三、安装Keepalived

- keepalived介绍: 是集群管理中保证集群高可用的一个服务软件,其功能类似于heartbeat,用来防止单点故障
- Keepalived作用: 为haproxy提供vip(10.199.1.200),在三个haproxy实例之间提供主备,降低当其中一个haproxy失效时对服务的影响。

### 1、yum安装Keepalived
```bash
# 安装keepalived
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y keepalived
```

### 2、配置Keepalived
```bash
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived

# 主要是配置故障发生时的通知对象以及机器标识。
global_defs {
    # 标识本节点的字符串,通常为 hostname,但不一定非得是 hostname。故障发生时,邮件通知会用到。
    router_id LVS_K8S
}

# 用来做健康检查的,当检查失败时会将 vrrp_instance 的 priority 减少相应的值。
vrrp_script check_haproxy {
    script "killall -0 haproxy"   #根据进程名称检测进程是否存活
    interval 3
    weight -2
    fall 10
    rise 2
}

# vrrp_instance用来定义对外提供服务的 VIP 区域及其相关属性。
vrrp_instance VI_1 {
    state MASTER        #当前节点为MASTER,其他两个节点设置为 BACKUP
    interface bond0     #改为自己的网卡
    virtual_router_id 51
    priority 200
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 35f18af7190d51c9f7f78f37300a0cbd
    }
    virtual_ipaddress {
        10.199.1.200/22   #虚拟IP,即VIP,注意掩码一定要写,不然会出现部分机器正常、部分机器异常的问题
    }
    track_script {
        check_haproxy
    }
}
EOF
```

当前节点的配置中 state 配置为 MASTER,其它两个节点设置为 BACKUP
```bash
配置说明:
virtual_ipaddress: vip
track_script: 执行上面定义好的检测的script
interface: 节点固有IP(非VIP)的网卡,用来发VRRP包。
virtual_router_id: 取值在0-255之间,用来区分多个instance的VRRP组播
advert_int: 发VRRP包的时间间隔,即多久进行一次master选举(可以认为是健康检查时间间隔)。
authentication: 认证区域,认证类型有PASS和HA(IPSEC),推荐使用PASS(密码只识别前8位)。
state: 可以是MASTER或BACKUP,不过当其他节点keepalived启动时会将priority比较大的节点选举为MASTER,因此该项其实没有实质用途。
priority: 用来选举master的,要成为master,那么这个选项的值最好高于其他机器50个点,该项取值范围是1-255(在此范围之外会被识别成默认值100)。

# 1、注意防火墙需要放开vrrp协议(不然会出现脑裂现象,三台主机都存在VIP的情况)
#-A INPUT -p vrrp -j ACCEPT
-A RH-Firewall-1-INPUT -p vrrp -j ACCEPT

# 2、注意上面配置 script "killall -0 haproxy" 根据进程名称检测进程是否存活,会在/var/log/messages每隔一秒记录一次检测日志
# tail -100f /var/log/messages
Sep 27 10:54:16 tw19410s1 Keepalived_vrrp[9113]: /usr/bin/killall -0 haproxy exited with status 1

# 3、"VRRP实例绑定的IP"对于所使用的网卡需要合法
比如使用网卡"bond0",该网卡的掩码为"255.255.255.0",那么所使用的"VRRP实例绑定的IP"的掩码也必须为"255.255.255.0",即具有"xxx.xxx.xxx.xxx/24"的形式。

tcpdump -ani any vrrp | grep vrid
特别需要注意的是,同一网段中的virtual_router_id(vrid)的值不能重复,否则会干扰其他Keepalived集群的正常运行。
```

### 3、启动Keepalived
```bash
# 设置开机启动
systemctl enable keepalived

# 启动keepalived
systemctl restart keepalived

# 查看启动状态
systemctl status keepalived
```

### 4、查看网络状态

keepalived 配置中 state 为 MASTER 的节点启动后,查看网络状态,可以看到虚拟IP已经加入到绑定的网卡中
```bash
[root@k8s-master-01 ~]# ip address show bond0
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 6c:92:bf:27:9e:ed brd ff:ff:ff:ff:ff:ff
    inet 10.199.1.136/22 brd 10.199.3.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet 10.199.1.200/32 scope global bond0
       valid_lft forever preferred_lft forever

当关掉当前节点的keepalived服务后将进行虚拟IP转移,将会推选state为BACKUP的节点中的某一节点为新的MASTER,可以在那台节点上查看网卡,将会查看到虚拟IP
```
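如果想进一步验证 VIP 漂移是否正常,可以在当前持有 VIP 的节点上临时停掉 keepalived 观察(示例,网卡名以实际环境为准,验证完记得恢复服务):

```bash
# 在当前MASTER节点上临时停止keepalived
systemctl stop keepalived

# 到某台BACKUP节点上确认VIP已经漂移过来
ip address show bond0 | grep 10.199.1.200

# 验证完成后恢复服务
systemctl start keepalived
```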
# 四、安装haproxy

此处的haproxy为apiserver提供反向代理,haproxy将所有请求轮询转发到每个master节点上。相对于仅仅使用keepalived主备模式、仅单个master节点承载流量的方式,这种方式更加合理、健壮。

### 1、yum安装haproxy
```bash
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*
yum install -y haproxy
```

### 2、配置haproxy
```bash
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
    mode                 tcp
    bind                 *:16443
    option               tcplog
    default_backend      kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
    mode        tcp
    balance     roundrobin
    server master01.k8s.io 10.199.1.136:6443 check
    server master02.k8s.io 10.199.1.137:6443 check
    server master03.k8s.io 10.199.1.138:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
    bind                 *:1080
    stats auth           admin:awesomePassword
    stats refresh        5s
    stats realm          HAProxy\ Statistics
    stats uri            /admin?stats
EOF
```

其他master节点(10.199.1.137和10.199.1.138)上的haproxy配置与此相同

### 3、启动并检测haproxy
```bash
# 设置开机启动
systemctl enable haproxy

# 开启haproxy
systemctl restart haproxy

# 查看启动状态
systemctl status haproxy
```

### 4、检测haproxy端口
```bash
ss -lnt | grep -E "16443|1080"
nc -zv master.k8s.io 16443
nc -zv master.k8s.io 1080
```
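除了检测端口外,也可以通过上面配置的 stats 页面确认三个 apiserver 后端的健康状态,用户名密码即 haproxy.cfg 中 stats auth 配置的值(示例):

```bash
# 访问haproxy统计页面,确认三个master后端均为UP(示例)
curl -u admin:awesomePassword 'http://master.k8s.io:1080/admin?stats'
```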
# 五、安装Docker (所有节点)

### 1、移除之前安装过的Docker
```bash
sudo yum remove -y docker \
    docker-client \
    docker-client-latest \
    docker-common \
    docker-latest \
    docker-latest-logrotate \
    docker-logrotate \
    docker-selinux \
    docker-engine-selinux \
    docker-ce-cli \
    docker-engine

# 查看还有没有存在的docker组件
rpm -qa|grep docker

# 有则通过命令 yum -y remove XXX 来删除,比如:
yum remove docker-ce-cli
```

### 2、配置docker的yum源

下面两个镜像源选择其一即可,由于官方下载速度比较慢,推荐用阿里镜像源

- 阿里镜像源
```bash
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
```

- Docker官方镜像源
```bash
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
```

### 3、安装Docker:
```
# 显示docker-ce所有可安装版本:
yum list docker-ce --showduplicates | sort -r

# 安装指定docker版本
sudo yum install docker-ce-18.09.9-3.el7.x86_64 -y

# 启动docker并设置docker开机启动
systemctl enable docker
systemctl start docker

# 确认一下iptables filter表中FORWARD链的默认策略(policy)为ACCEPT
iptables -nvL
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in      out      source      destination
    0     0 DOCKER-USER  all  --  *       *        0.0.0.0/0   0.0.0.0/0
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *       *        0.0.0.0/0   0.0.0.0/0
    0     0 ACCEPT     all  --  *       docker0  0.0.0.0/0   0.0.0.0/0   ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *       docker0  0.0.0.0/0   0.0.0.0/0
    0     0 ACCEPT     all  --  docker0 !docker0 0.0.0.0/0   0.0.0.0/0
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0   0.0.0.0/0

Docker从1.13版本开始调整了默认的防火墙规则,禁用了iptables filter表中FORWARD链,这样会引起Kubernetes集群中跨Node的Pod无法通信。但这里通过安装Docker 18.06发现默认策略又改回了ACCEPT,不清楚是从哪个版本改回的;因为我们线上使用的17.06版本还是需要手动调整这个策略的。

# 执行下面命令
iptables -P FORWARD ACCEPT

# 修改docker的配置
vim /usr/lib/systemd/system/docker.service

# 增加下面命令(ExecReload后面新增ExecStartPost=...)
...
ExecReload=/bin/kill -s HUP $MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
...

# 配置docker加速器
cat > /etc/docker/daemon.json << \EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors" : [
    "https://ot2k4d59.mirror.aliyuncs.com/"
  ]
}
EOF

# 重启Docker
systemctl daemon-reload
systemctl restart docker
docker info|grep -i cgroup
```

### 4、docker最终的服务文件
```
#注意,有变量的地方需要使用转义符号
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

# 重启Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
```

# 六、安装kubeadm、kubelet

### 1、配置可用的国内yum源用于安装:
```
cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```

### 2、安装kubelet
```
# 需要在每台机器上都安装以下的软件包:
# kubeadm: 用来初始化集群的指令。
# kubelet: 在集群中的每个节点上用来启动 pod 和 container 等。
# kubectl: 用来与集群通信的命令行工具。

# 查看kubelet版本列表
yum list kubelet --showduplicates | sort -r

# 安装kubelet
yum install -y kubelet-1.15.3-0

# 启动kubelet并设置开机启动
systemctl enable kubelet
systemctl start kubelet

# 检查状态:此时发现是failed状态,属于正常现象,kubelet会每10秒重启一次,等下面完成master节点初始化后即可恢复正常
systemctl status kubelet

# 查看kubelet日志
journalctl -u kubelet --no-pager
```

### 3、安装kubeadm
```
# 负责初始化集群
# 1、查看kubeadm版本列表
yum list kubeadm --showduplicates | sort -r

# 2、安装kubeadm
yum install -y kubeadm-1.15.3-0
# 安装 kubeadm 时会默认安装 kubectl,所以不需要单独安装kubectl

# 3、重启服务器
# 为了防止发生某些未知错误,这里我们重启下服务器,方便进行后续操作
reboot
```

# 七、初始化第一个kubernetes master节点
```
# 因为需要绑定虚拟IP,所以需要先查看虚拟IP落在这几台master机器中的哪一台上
[root@k8s-master-01 ~]# ip address show bond0
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 6c:92:bf:27:9e:ed brd ff:ff:ff:ff:ff:ff
    inet 10.199.1.136/22 brd 10.199.3.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet 10.199.1.200/32 scope global bond0
       valid_lft forever preferred_lft forever
7: bond0.101@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 6c:92:bf:27:9e:ed brd ff:ff:ff:ff:ff:ff
    inet 16.201.26.36/24 brd 16.201.26.255 scope global bond0.101
       valid_lft forever preferred_lft forever

可以看到虚拟IP 10.199.1.200 和 服务器IP 10.199.1.136 在一台机子上,所以初始化kubernetes第一个master要在master01机子上进行安装
```

### 1、创建kubeadm配置的yaml文件
```
# 1、创建kubeadm配置的yaml文件
rm -f ./kubeadm-config.yaml
export MASTER_NODE1=10.199.1.136
export APISERVER_NAME=master.k8s.io
export POD_SUBNET=10.244.0.0/16
export SVC_SUBNET=10.96.0.0/12
cat << EOF > ./kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: ${MASTER_NODE1}   #这里填写第一个初始化的master的ip
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master-01   #注意这里需要调整为自己的节点
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: kubernetes
kubernetesVersion: v1.15.3
certificatesDir: /etc/kubernetes/pki
controllerManager: {}
controlPlaneEndpoint: "${APISERVER_NAME}:16443"   # 这里写vip的地址或域名加上端口
imageRepository: registry.aliyuncs.com/google_containers   # 使用阿里云镜像
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - k8s-master-01
  - k8s-master-02
  - k8s-master-03
  - master.k8s.io
  - 10.199.1.200
  - 10.199.1.136
  - 10.199.1.137
  - 10.199.1.138
  - 127.0.0.1
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
networking:
  dnsDomain: cluster.local
  podSubnet: ${POD_SUBNET}
  serviceSubnet: ${SVC_SUBNET}
scheduler: {}
---
# 开启 IPVS 模式
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs   # kube-proxy 模式
EOF

kubeadm init --config=kubeadm-config.yaml --upload-certs

以下两个地方设置:
- certSANs: 虚拟ip地址(为了安全起见,把所有集群地址都加上)
- controlPlaneEndpoint: 虚拟IP:监听端口号

配置说明:
imageRepository: registry.aliyuncs.com/google_containers (使用阿里云镜像仓库)
podSubnet: 10.244.0.0/16 (#pod地址池)
serviceSubnet: 10.96.0.0/12 (#service地址池)
```

### 2、初始化第一个master节点
```
kubeadm init --config=kubeadm-config.yaml --upload-certs   #使用这个就不用做拷贝证书的操作
kubeadm init --config kubeadm-config.yaml                  #使用这个还需要手动做拷贝证书的操作

#验证下端口是否通
nc -zv master.k8s.io 6443
nc -zv master.k8s.io 16443
```

日志
```
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
    --control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5
```

### 执行结果中

用于初始化第二、三个 master 节点
```
#初始化第二个master节点
export MASTER_NODE2=10.199.1.137
kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE2} --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
    --control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523

#初始化第三个master节点
export MASTER_NODE3=10.199.1.138
kubeadm join master.k8s.io:16443 --apiserver-advertise-address ${MASTER_NODE3} --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5 \
    --control-plane --certificate-key 13284467f0141778898ffa33d340c0598cb757c6aa016f00da2165cd3eab4523
```

用于初始化 worker 节点
```
kubeadm join master.k8s.io:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ab6da874166785bfe75acc4d6fd622bf821a7451837332e3a21a6106e346c8d5
```

### 3、配置kubectl环境变量
```bash
# 配置环境变量
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

### 4、查看组件状态
```bash
kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}

# 查看pod状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-78d4cf999f-5zt5z                0/1     Pending   0          7m32s   ---coredns没有启动
coredns-78d4cf999f-mkgsx                0/1
Pending 0 7m32s ---coredns没有启动 etcd-k8s-master-01 1/1 Running 0 6m39s kube-apiserver-k8s-master-01 1/1 Running 0 6m43s kube-controller-manager-k8s-master-01 1/1 Running 0 6m32s kube-proxy-88s74 1/1 Running 0 7m32s kube-scheduler-k8s-master-01 1/1 Running 0 6m45s 可以看到coredns没有启动,这是由于还没有配置网络插件,接下来配置下后再重新查看启动状态 #检查ETCD服务 docker exec -it $(docker ps |grep etcd_etcd|awk '{print $1}') sh etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key member list etcdctl --endpoints=https://192.168.56.11:2379 --ca-file=/etc/kubernetes/pki/etcd/ca.crt --cert-file=/etc/kubernetes/pki/etcd/server.crt --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health ``` # 八、安装网络插件 ### 1、安装 calico 网络插件 ``` # 安装 calico 网络插件 # 参考文档 https://docs.projectcalico.org/v3.8/getting-started/kubernetes/ export POD_SUBNET=10.244.0.0/16 rm -f calico.yaml wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico.yaml kubectl apply -f calico.yaml ``` ### 2、安装 flannel 网络插件 ```bash export POD_SUBNET=10.244.0.0/16 cat > kube-flannel.yaml << EOF --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: flannel rules: - apiGroups: - "" resources: - pods verbs: - get - apiGroups: - "" resources: - nodes verbs: - list - watch - apiGroups: - "" resources: - nodes/status verbs: - patch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1beta1 metadata: name: flannel roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: flannel subjects: - kind: ServiceAccount name: flannel namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: flannel namespace: kube-system --- kind: ConfigMap apiVersion: v1 metadata: name: kube-flannel-cfg namespace: kube-system labels: tier: node app: flannel data: cni-conf.json: | { "name": "cbr0", "plugins": [ { "type": "flannel", "delegate": { "hairpinMode": true, "isDefaultGateway": true } }, { "type": "portmap", "capabilities": { "portMappings": true } } ] } net-conf.json: | { "Network": "${POD_SUBNET}", "Backend": { "Type": "vxlan" } } --- apiVersion: extensions/v1beta1 kind: DaemonSet metadata: name: kube-flannel-ds-amd64 namespace: kube-system labels: tier: node app: flannel spec: template: metadata: labels: tier: node app: flannel spec: hostNetwork: true nodeSelector: beta.kubernetes.io/arch: amd64 tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64 command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: registry.cn-shenzhen.aliyuncs.com/cp_m/flannel:v0.10.0-amd64 command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr - --iface=bond0 resources: requests: cpu: "100m" memory: "50Mi" limits: cpu: "100m" memory: "50Mi" securityContext: privileged: true env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg EOF “Network”: 
“10.244.0.0/16” 要和 kubeadm-config.yaml 配置文件中的 podSubnet: 10.244.0.0/16 相同
```

### 3、创建flannel相关role和pod
```
# 应用生效
[root@k8s-master-01 ~]# kubectl apply -f kube-flannel.yaml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created

# 等待一会时间,再次查看各个pods的状态
[root@k8s-master-01 ~]# kubectl get pods --namespace=kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-78d4cf999f-5zt5z                1/1     Running   0          12m   ---coredns启动成功
coredns-78d4cf999f-mkgsx                1/1     Running   0          12m   ---coredns启动成功
etcd-k8s-master-01                      1/1     Running   0          11m
kube-apiserver-k8s-master-01            1/1     Running   0          12m
kube-controller-manager-k8s-master-01   1/1     Running   0          11m
kube-flannel-ds-amd64-7lj6m             1/1     Running   0          13s
kube-proxy-88s74                        1/1     Running   0          12m
kube-scheduler-k8s-master-01            1/1     Running   0          12m

# 如果更换了网络插件,需要把coredns的pod重新创建,不然coredns的pod网络不通
# 查看
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system

#删除coredns
kubectl delete deployment coredns -n kube-system
kubectl delete svc kube-dns -n kube-system
kubectl delete cm coredns -n kube-system

#重新部署coredns
rm -f coredns.yaml.sed deploy.sh coredns.yml
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
chmod +x deploy.sh
./deploy.sh -i 10.96.0.10 > coredns.yml   #这里从service地址池10.96.0.0/12中选用10.96.0.10作为coredns地址
kubectl apply -f coredns.yml
```

# 九、加入集群

### 1、Master加入集群构成高可用
```
复制密钥到各个节点

在master01 服务器上执行下面命令,将kubernetes相关文件复制到 master02、master03

如果第一个初始化的master节点不是master01,则将实际初始化的那个节点的配置文件复制到其余两个主节点,例如master03为第一个master节点,则将它的k8s配置复制到master02和master01。
```

- 复制文件到 master02
```
ssh root@master02.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master02.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master02.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master02.k8s.io:/etc/kubernetes/pki/etcd
```

- 复制文件到 master03
```
ssh root@master03.k8s.io mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@master03.k8s.io:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@master03.k8s.io:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@master03.k8s.io:/etc/kubernetes/pki/etcd
```

- master节点加入集群

master02 和 master03 服务器上都执行加入集群操作
```bash
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f --experimental-control-plane
```

- 如果加入失败想重新尝试,请输入 kubeadm reset 命令清除之前的设置,重新执行"复制密钥"和"加入集群"这两步
- 如果是master加入,请在最后面加上 --experimental-control-plane 这个参数

```bash
# 显示安装过程:
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
```
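新 master 加入成功后,可以顺带验证一下集群内 DNS 解析是否正常(前面如果重建过 coredns,这一步尤其有必要)。下面是一个常用的检查方式(示例,busybox 镜像版本仅供参考):

```bash
# 起一个临时busybox容器验证集群DNS解析(示例)
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default
```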
- 配置kubectl环境变量
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 指令补全
yum install bash-completion -y
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

### 2、node节点加入集群

除了让master节点加入集群组成高可用外,slave节点也要加入集群中。

这里将k8s-node-01、k8s-node-02、k8s-node-03加入集群,进行工作

输入初始化k8s master时候提示的加入命令,如下:
```
kubeadm join master.k8s.io:16443 --token i77yg1.1eype0c53jsanoge --discovery-token-ca-cert-hash sha256:8f0a817012ab333a057b6a7410e65971be20b95c1b75fc4015f8f3b6785f626f
```

node节点加入,不需要加上 --experimental-control-plane 这个参数

### 3、如果忘记加入集群的token和sha256 (如正常则跳过)

- 显示获取token列表
```
kubeadm token list
```

默认情况下 Token 过期时间是24小时,如果 Token 过期以后,可以输入以下命令,生成新的 Token
```
kubeadm token create
```

- 获取ca证书sha256编码hash值
```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```

拼接命令
```
kubeadm join master.k8s.io:16443 --token 882ik4.9ib2kb0eftvuhb58 --discovery-token-ca-cert-hash sha256:0b1a836894d930c8558b350feeac8210c85c9d35b6d91fde202b870f3244016a

如果是master加入,请在最后面加上 --experimental-control-plane 这个参数
```

### 4、查看各个节点加入集群情况
```
kubectl get nodes -o wide
```

# 十、从集群中删除 Node

- Master节点:
```
kubectl drain <NODE-NAME> --delete-local-data --force --ignore-daemonsets
kubectl delete node <NODE-NAME>
```

- Slave节点:
```
kubeadm reset
```

## 初始化失败
```bash
yes | kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f $HOME/.kube/config
systemctl restart docker
systemctl restart kubelet
systemctl status kubelet
journalctl -f -u kubelet
```

## 问题汇总:

1、多网卡监听问题
```
k8s master组件在多网卡环境下,会出现监听到服务器外网IP的问题

#注意--hostname-override的值写kubectl get nodes显示的结果
#修改kubelet启动参数
cat > /etc/sysconfig/kubelet <<\EOF
KUBELET_EXTRA_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --hostname-override=k8s-master-01 --node-ip=10.199.1.136
EOF

#重启kubelet服务
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet

#查看kubelet日志
journalctl -f -u kubelet

https://blog.csdn.net/qianghaohao/article/details/98588427 kubeadm + vagrant 部署多节点 k8s 的一个坑(多网卡问题)
https://github.com/kubernetes/kubernetes/issues/33618
https://kubernetes.io/zh/docs/setup/independent/install-kubeadm/ kubeadm init 和 kubeadm join 用于为 kubelet 获取额外的用户参数。

#解决方案
@danielschonfeld The kubelet flag you should set is --hostname-override
```

参考资料:

https://github.com/kubernetes/kubernetes/issues/33618 Issue when using kubeadm with multiple network interfaces #33618

http://www.mydlq.club/article/4/

https://kuboard.cn/install/install-kubernetes.html#%E5%88%9D%E5%A7%8B%E5%8C%96%E7%AC%AC%E4%B8%80%E4%B8%AAmaster%E8%8A%82%E7%82%B9

https://blog.51cto.com/fengwan/2426528?source=dra kubeadm搭建高可用kubernetes 1.15.1

https://segmentfault.com/a/1190000018741112?utm_source=tag-newest Kubernetes的几种主流部署方式02-kubeadm部署高可用集群

https://www.cnblogs.com/hongdada/p/9771857.html Docker中的Cgroup Driver:Cgroupfs 与 Systemd

https://juejin.im/entry/5b0aa39551882538be0d2e21 centos7使用kubeadm配置高可用集群(多master 多网卡,需主动修改组件信息)

================================================
FILE: kubeadm/k8s清理.md
================================================

# 一、清理资源
```
systemctl stop kubelet
systemctl stop docker
kubeadm reset
#yum remove -y kubelet kubeadm kubectl --disableexcludes=kubernetes
rm -rf /etc/kubernetes/
rm -rf /root/.kube/
rm -rf $HOME/.kube/
rm -rf /var/lib/etcd/
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/
rm -rf /etc/cni/
rm -rf /opt/cni/ ifconfig cni0 down ifconfig flannel.1 down ifconfig docker0 down ip link delete cni0 ip link delete flannel.1 #docker rmi -f $(docker images -q) #docker rm -f `docker ps -a -q` #yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes kubeadm version systemctl restart kubelet.service systemctl enable kubelet.service ``` # 二、重新初始化 ``` swapoff -a modprobe br_netfilter sysctl -p /etc/sysctl.d/k8s.conf chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4 kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g' |sh -x docker images |grep google_containers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#registry.cn-hangzhou.aliyuncs.com/google_containers#k8s.gcr.io#2' |sh -x docker images |grep google_containers |awk '{print "docker rmi ", $1":"$2}' |sh -x docker pull coredns/coredns:1.3.1 docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1 docker rmi coredns/coredns:1.3.1 kubeadm init --kubernetes-version=v1.15.3 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.56.11 --apiserver-bind-port=6443 #获取加入集群的指令 kubeadm token create --print-join-command ``` # 三、Node操作 ``` mkdir -p $HOME/.kube ``` # 四、Master操作 ``` mkdir -p $HOME/.kube cp -i /etc/kubernetes/admin.conf $HOME/.kube/config chown $(id -u):$(id -g) $HOME/.kube/config scp $HOME/.kube/config root@linux-node2:$HOME/.kube/config scp $HOME/.kube/config root@linux-node3:$HOME/.kube/config scp $HOME/.kube/config root@linux-node4:$HOME/.kube/config ``` # 五、Master和Node节点 ``` chown $(id -u):$(id -g) $HOME/.kube/config ``` 参考资料: https://blog.51cto.com/wutengfei/2121202 kubernetes中网络报错问题 ================================================ FILE: kubeadm/kubeadm.yaml ================================================ apiVersion: kubeadm.k8s.io/v1beta2 bootstrapTokens: - groups: - system:bootstrappers:kubeadm:default-node-token token: abcdef.0123456789abcdef ttl: 24h0m0s usages: - signing - authentication kind: InitConfiguration localAPIEndpoint: advertiseAddress: 192.168.56.11 bindPort: 6443 nodeRegistration: criSocket: /var/run/dockershim.sock name: linux-node1.example.com taints: - effect: NoSchedule key: node-role.kubernetes.io/master --- apiServer: timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta2 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controllerManager: {} dns: type: CoreDNS etcd: local: dataDir: /var/lib/etcd imageRepository: k8s.gcr.io kind: ClusterConfiguration kubernetesVersion: v1.15.0 networking: dnsDomain: cluster.local podSubnet: 172.168.0.0/16 serviceSubnet: 10.96.0.0/12 scheduler: {} --- apiVersion: kubeproxy.config.k8s.io/v1alpha1 kind: KubeProxyConfiguration mode: ipvs # kube-proxy 模式 ================================================ FILE: kubeadm/kubeadm初始化k8s集群延长证书过期时间.md ================================================ # 一、前言 kubeadm初始化k8s集群,签发的CA证书有效期默认是10年,签发的apiserver证书有效期默认是1年,到期之后请求apiserver会报错,使用openssl命令查询相关证书是否到期。 以下延长证书过期的方法适合kubernetes1.14、1.15、1.16、1.17、1.18版本 # 二、查看证书有效时间 ```bash openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -text |grep Not 显示如下,通过下面可看到ca证书有效期是10年,从2020到2030年: Not Before: Apr 22 04:09:07 2020 GMT Not After : Apr 20 04:09:07 2030 GMT openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not 显示如下,通过下面可看到apiserver证书有效期是1年,从2020到2021年: Not Before: Apr 22 04:09:07 2020 GMT Not After : Apr 22 04:09:07 2021 GMT ``` # 三、延长证书过期时间 ```bash 
1.把update-kubeadm-cert.sh文件上传到master1、master2、master3节点 update-kubeadm-cert.sh文件所在的github地址如下: https://github.com/luckylucky421/kubernetes1.17.3 把update-kubeadm-cert.sh文件clone和下载下来,拷贝到master1,master2,master3节点上 2.在每个节点都执行如下命令 1)给update-kubeadm-cert.sh证书授权可执行权限 chmod +x update-kubeadm-cert.sh 2)执行下面命令,修改证书过期时间,把时间延长到10年 ./update-kubeadm-cert.sh all 3)在master1节点查询Pod是否正常,能查询出数据说明证书签发完成 kubectl get pods -n kube-system 显示如下,能够看到pod信息,说明证书签发正常: ...... calico-node-b5ks5 1/1 Running 0 157m calico-node-r6bfr 1/1 Running 0 155m calico-node-r8qzv 1/1 Running 0 7h1m coredns-66bff467f8-5vk2q 1/1 Running 0 7h30m ...... ``` # 四、验证证书有效时间是否延长到10年 ```bash openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -text |grep Not 显示如下,通过下面可看到ca证书有效期是10年,从2020到2030年: Not Before: Apr 22 04:09:07 2020 GMT Not After : Apr 20 04:09:07 2030 GMT openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not 显示如下,通过下面可看到apiserver证书有效期是10年,从2020到2030年: Not Before: Apr 22 11:15:53 2020 GMT Not After : Apr 20 11:15:53 2030 GMT openssl x509 -in /etc/kubernetes/pki/apiserver-etcd-client.crt -noout -text |grep Not 显示如下,通过下面可看到etcd证书有效期是10年,从2020到2030年: Not Before: Apr 22 11:32:24 2020 GMT Not After : Apr 20 11:32:24 2030 GMT openssl x509 -in /etc/kubernetes/pki/front-proxy-ca.crt -noout -text |grep Not 显示如下,通过下面可看到fron-proxy证书有效期是10年,从2020到2030年: Not Before: Apr 22 04:09:08 2020 GMT Not After : Apr 20 04:09:08 2030 GMT ``` 参考资料: https://mp.weixin.qq.com/s/N7WRT0OkyJHec35BH_X1Hg kubeadm初始化k8s集群延长证书过期时间 ================================================ FILE: kubeadm/kubeadm无法下载镜像问题.md ================================================ 0、kubeadm镜像介绍 ``` kubeadm 是kubernetes 的集群安装工具,能够快速安装kubernetes 集群。 kubeadm init 命令默认使用的docker镜像仓库为k8s.gcr.io,国内无法直接访问,于是需要变通一下。 ``` 1、首先查看需要使用哪些镜像 ``` kubeadm config images list #输出如下结果 k8s.gcr.io/kube-apiserver:v1.15.3 k8s.gcr.io/kube-controller-manager:v1.15.3 k8s.gcr.io/kube-scheduler:v1.15.3 k8s.gcr.io/kube-proxy:v1.15.3 k8s.gcr.io/pause:3.1 k8s.gcr.io/etcd:3.3.10 k8s.gcr.io/coredns:1.3.1 我们通过 docker.io/mirrorgooglecontainers 中转一下 ``` 2、批量下载及转换标签 ``` #docker.io/mirrorgooglecontainers中转镜像 kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#docker.io/mirrorgooglecontainers#g' |sh -x docker images |grep mirrorgooglecontainers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#mirrorgooglecontainers#k8s.gcr.io#2' |sh -x docker images |grep mirrorgooglecontainers |awk '{print "docker rmi ", $1":"$2}' |sh -x docker pull coredns/coredns:1.3.1 docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1 docker rmi coredns/coredns:1.3.1 注:coredns没包含在docker.io/mirrorgooglecontainers中,需要手工从coredns官方镜像转换下。 #阿里云的中转镜像 kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g' |sh -x docker images |grep google_containers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#registry.cn-hangzhou.aliyuncs.com/google_containers#k8s.gcr.io#2' |sh -x docker images |grep google_containers |awk '{print "docker rmi ", $1":"$2}' |sh -x docker pull coredns/coredns:1.3.1 docker tag coredns/coredns:1.3.1 k8s.gcr.io/coredns:1.3.1 docker rmi coredns/coredns:1.3.1 ``` 3、查看镜像列表 ``` docker images REPOSITORY TAG IMAGE ID CREATED SIZE k8s.gcr.io/kube-proxy v1.15.3 232b5c793146 2 weeks ago 82.4MB k8s.gcr.io/kube-scheduler v1.15.3 703f9c69a5d5 2 weeks ago 81.1MB k8s.gcr.io/kube-controller-manager v1.15.3 e77c31de5547 2 weeks ago 159MB k8s.gcr.io/coredns 1.3.1 eb516548c180 7 months ago 40.3MB k8s.gcr.io/etcd 3.3.10 2c4adeb21b4f 9 
months ago        258MB
k8s.gcr.io/pause                     3.1                 da86e6ba6ca1        20 months ago       742kB

docker rmi -f $(docker images -q)
docker rm -f `docker ps -a -q`
```

参考文档:

https://cloud.tencent.com/info/6db42438f5dd7842bcecb6baf61833aa.html kubeadm 无法下载镜像问题

https://juejin.im/post/5b8a4536e51d4538c545645c 使用kubeadm 部署 Kubernetes(国内环境)

================================================
FILE: manual/README.md
================================================

# 内核升级
```
# 载入公钥
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# 安装ELRepo
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

# 载入elrepo-kernel元数据
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist

# 查看可用的rpm包
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*

# 安装长期支持版本的kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64

# 删除旧版本工具包
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y

# 安装新版本工具包
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64

#查看默认启动顺序
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.183-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.10.1.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-c52097a1078c403da03b8eddeac5080b) 7 (Core)

#默认启动的顺序是从0开始,新内核是从头插入(目前位置在0,而4.4.4的是在1),所以需要选择0。
grub2-set-default 0

#重启并检查
reboot
```

参考资料

https://github.com/easzlab/kubeasz/blob/master/docs/guide/kernel_upgrade.md

================================================
FILE: manual/v1.14/README.md
================================================

================================================
FILE: manual/v1.15.3/README.md
================================================

# 一、Kubernetes 1.15 二进制集群安装

本系列文档将介绍如何使用二进制部署Kubernetes v1.15.3 集群的所有步骤,而不是使用自动化部署(kubeadm)集群。在部署过程中,将详细列出各个组件启动参数,以及相关配置说明。在学习完本文档后,将理解k8s各个组件的交互原理,并且可以快速解决实际问题。

## 1.1、组件版本
```
Kubernetes 1.15.3
Docker 18.09 (docker使用官方的脚本安装,后期可能升级为新的版本,但是不影响)
Etcd 3.3.13
Flanneld 0.11.0
```

## 1.2、组件说明

### kube-apiserver
```
使用节点本地Nginx 4层透明代理实现高可用 (也可以使用haproxy,只是起到代理apiserver的作用)
关闭非安全端口8080和匿名访问
使用安全端口6443接受https请求
严格的认证和授权策略 (x509、token、rbac)
开启bootstrap token认证,支持kubelet TLS bootstrapping;
使用https访问kubelet、etcd
```

### kube-controller-manager
```
3节点高可用 (在k8s中,有些组件需要选举,所以使用奇数为集群高可用方案)
关闭非安全端口,使用10252接受https请求
使用kubeconfig访问apiserver的安全端口
自动approve kubelet证书签名请求(CSR),证书过期后自动轮转
各controller使用自己的ServiceAccount访问apiserver
```

### kube-scheduler
```
3节点高可用;
使用kubeconfig访问apiserver安全端口
```

### kubelet
```
使用kubeadm动态创建bootstrap token
使用TLS bootstrap机制自动生成client和server证书,过期后自动轮转
在KubeletConfiguration类型的JSON文件配置主要参数
关闭只读端口,在安全端口10250接受https请求,对请求进行认证和授权,拒绝匿名访问和非授权访问
使用kubeconfig访问apiserver的安全端口
```

### kube-proxy
```
使用kubeconfig访问apiserver的安全端口
在KubeProxyConfiguration类型JSON文件配置主要参数
使用ipvs代理模式
```

### 集群插件
```
DNS 使用功能、性能更好的coredns
网络 使用Flanneld 作为集群网络插件
```

# 二、初始化环境

## 1.1、集群机器
```
#master节点
192.168.0.50 k8s-01
192.168.0.51 k8s-02
192.168.0.52 k8s-03

#node节点
192.168.0.53 k8s-04   #node节点只运行node,但是设置证书的时候要添加这个ip
```

本文档的所有etcd集群、master集群、worker节点均使用以上机器,并且初始化步骤需要在所有机器上执行命令。如果没有特殊说明,所有操作均在192.168.0.50上进行。

node节点后面会有操作,但是在初始化这步,是所有集群机器,包括node节点。

## 1.2、修改主机名

所有机器设置永久主机名
```
hostnamectl set-hostname abcdocker-k8s01   #所有机器按照要求修改
bash                                       #刷新主机名
```

接下来我们需要在所有机器上添加hosts解析
```
cat >> /etc/hosts << EOF
192.168.0.50 k8s-01
192.168.0.51 k8s-02
192.168.0.52 k8s-03
192.168.0.53 k8s-04
EOF
```

## 1.3、配置环境变量
```
[root@abcdocker-k8s01 ~]# echo 'PATH=/opt/k8s/bin:$PATH' >>/etc/profile
[root@abcdocker-k8s01 ~]# source /etc/profile
[root@abcdocker-k8s01 ~]# env|grep PATH
PATH=/opt/k8s/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
```

## 1.4、安装依赖包

在每台服务器上安装依赖包
```
yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl \
sysstat libseccomp wget ``` 关闭防火墙 Linux 以及swap分区 ``` systemctl stop firewalld systemctl disable firewalld iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat iptables -P FORWARD ACCEPT swapoff -a sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab setenforce 0 sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config #如果开启了swap分区,kubelet会启动失败(可以通过设置参数——-fail-swap-on设置为false) ``` 升级内核 ``` ``` 参考资料 https://i4t.com/4253.html Kubernetes 1.14 二进制集群安装 https://github.com/kubernetes/kubernetes/releases/tag/v1.15.3 下载链接 ================================================ FILE: mysql/README.md ================================================ ================================================ FILE: mysql/kubernetes访问外部mysql服务.md ================================================ Table of Contents ================= * [Table of Contents](#table-of-contents) * [一、创建endpoints](#一创建endpoints) * [二、创建service](#二创建service) * [三、文件合并](#三文件合并) * [四、安装centos7基础镜像](#四安装centos7基础镜像) * [五、测试数据库连接](#五测试数据库连接) `k8s访问集群外独立的服务最好的方式是采用Endpoint方式(可以看作是将k8s集群之外的服务抽象为内部服务),以mysql服务为例` # 一、创建endpoints (带注释的操作,建议分步操作,被这个坑了很久,或者可以直接使用合并文件一步执行) ```bash # 删除 mysql-endpoints kubectl delete -f mysql-endpoints.yaml -n mos-namespace # 创建 mysql-endpoints.yaml cat >mysql-endpoints.yaml<<\EOF apiVersion: v1 kind: Endpoints metadata: name: mysql-production subsets: - addresses: - ip: 10.198.1.155 #-注意目前服务器的数据库需要放开权限 ports: - port: 3306 protocol: TCP EOF # 创建 mysql-endpoints kubectl apply -f mysql-endpoints.yaml -n mos-namespace # 查看 mysql-endpoints kubectl get endpoints mysql-production -n mos-namespace # 查看 mysql-endpoints详情 kubectl describe endpoints mysql-production -n mos-namespace # 探测服务是否可达 nc -zv 10.198.1.155 3306 ``` # 二、创建service ```bash # 删除 mysql-service kubectl delete -f mysql-service.yaml -n mos-namespace # 编写 mysql-service.yaml cat >mysql-service.yaml<<\EOF apiVersion: v1 kind: Service metadata: name: mysql-production spec: ports: - port: 3306 protocol: TCP EOF # 创建 mysql-service kubectl apply -f mysql-service.yaml -n mos-namespace # 查看 mysql-service kubectl get svc mysql-production -n mos-namespace # 查看 mysql-service详情 kubectl describe svc mysql-production -n mos-namespace # 验证service ip的连通性 nc -zv `kubectl get svc mysql-production -n mos-namespace|grep mysql-production|awk '{print $3}'` 3306 ``` # 三、文件合并 `注意点: Endpoints类型,可以打标签,但是Service不可以通过标签来选择,直接不写selector: name: mysql-endpoints 不然会出现异常,找不到endpoints节点` ``` cat << EOF > mysql-service-new.yaml apiVersion: v1 kind: Service metadata: name: mysql-production spec: #selector: ---注意这里用标签选择,直接取消 # name: mysql-endpoints ports: - port: 3306 protocol: TCP EOF ``` 完整文件 ```bash kubectl delete -f mysql-endpoints-new.yaml -n mos-namespace kubectl delete -f mysql-service-new.yaml -n mos-namespace cat << EOF > mysql-endpoints-new.yaml apiVersion: v1 kind: Endpoints metadata: name: mysql-production labels: name: mysql-endpoints subsets: - addresses: - ip: 10.198.1.155 ports: - port: 3306 protocol: TCP EOF cat << EOF > mysql-service-new.yaml apiVersion: v1 kind: Service metadata: name: mysql-production spec: ports: - port: 3306 protocol: TCP EOF kubectl apply -f mysql-endpoints-new.yaml -n mos-namespace kubectl apply -f mysql-service-new.yaml -n mos-namespace nc -zv `kubectl get svc mysql-production -n mos-namespace|grep mysql-production|awk '{print $3}'` 3306 ``` # 四、安装centos7基础镜像 ```bash # 查看 mos-namespace 下的pod资源 kubectl get pods -n mos-namespace # 清理命令行创建的deployment kubectl delete deployment centos7-app -n mos-namespace # 命令行跑一个centos7的bash基础容器 #kubectl run --rm 
--image=centos:7.2.1511 centos7-app -it --port=8080 --replicas=1 -n mos-namespace kubectl run --image=centos:7.2.1511 centos7-app -it --port=8080 --replicas=1 -n mos-namespace # 安装mysql客户端 yum install vim net-tools telnet nc -y yum install -y mariadb.x86_64 mariadb-libs.x86_64 ``` # 五、测试数据库连接 ```bash # 进入到容器 kubectl exec `kubectl get pods -n mos-namespace|grep centos7-app|awk '{print $1}'` -it /bin/bash -n mos-namespace # 检查网络连通性 ping mysql-production # 测试mysql服务端口是否OK nc -zv mysql-production 3306 # 连接测试 mysql -h'mysql-production' -u'root' -p'password' ``` 参考资料: https://blog.csdn.net/hxpjava1/article/details/80040407 使用kubernetes访问外部服务mysql/redis https://blog.csdn.net/liyingke112/article/details/76204038 https://blog.csdn.net/ybt_c_index/article/details/80881157 istio 0.8 用ServiceEntry访问外部服务(如RDS) ================================================ FILE: redis/K8s上Redis集群动态扩容.md ================================================ 参考资料: http://redisdoc.com/topic/cluster-tutorial.html#id10 Redis 命令参考 https://cloud.tencent.com/developer/article/1392872 ================================================ FILE: redis/K8s上运行Redis单实例.md ================================================ Table of Contents ================= * [一、创建namespace](#一创建namespace) * [二、创建一个 configmap](#二创建一个-configmap) * [三、创建 redis 容器](#三创建-redis-容器) * [四、创建redis-service服务](#四创建redis-service服务) * [五、验证redis实例](#五验证redis实例) # 一、创建namespace ```bash # 清理 namespace kubectl delete -f mos-namespace.yaml # 创建一个专用的 namespace cat > mos-namespace.yaml <<\EOF --- apiVersion: v1 kind: Namespace metadata: name: mos-namespace EOF kubectl apply -f mos-namespace.yaml # 查看 namespace kubectl get namespace -A ``` # 二、创建一个 configmap ```bash mkdir config && cd config # 清理configmap kubectl delete configmap redis-conf -n mos-namespace # 创建redis配置文件 cat >redis.conf <<\EOF #daemonize yes pidfile /data/redis.pid port 6379 tcp-backlog 30000 timeout 0 tcp-keepalive 10 loglevel notice logfile /data/redis.log databases 16 #save 900 1 #save 300 10 #save 60 10000 stop-writes-on-bgsave-error no rdbcompression yes rdbchecksum yes dbfilename dump.rdb dir /data slave-serve-stale-data yes slave-read-only yes repl-diskless-sync no repl-diskless-sync-delay 5 repl-disable-tcp-nodelay no slave-priority 100 requirepass redispassword maxclients 30000 appendonly no appendfilename "appendonly.aof" appendfsync everysec no-appendfsync-on-rewrite no auto-aof-rewrite-percentage 100 auto-aof-rewrite-min-size 64mb aof-load-truncated yes lua-time-limit 5000 slowlog-log-slower-than 10000 slowlog-max-len 128 latency-monitor-threshold 0 notify-keyspace-events KEA hash-max-ziplist-entries 512 hash-max-ziplist-value 64 list-max-ziplist-entries 512 list-max-ziplist-value 64 set-max-intset-entries 1000 zset-max-ziplist-entries 128 zset-max-ziplist-value 64 hll-sparse-max-bytes 3000 activerehashing yes client-output-buffer-limit normal 0 0 0 client-output-buffer-limit slave 256mb 64mb 60 client-output-buffer-limit pubsub 32mb 8mb 60 hz 10 EOF # 在mos-namespace中创建 configmap kubectl create configmap redis-conf --from-file=redis.conf -n mos-namespace ``` # 三、创建 redis 容器 ```bash # 清理pod kubectl delete -f mos_redis.yaml cat > mos_redis.yaml <<\EOF apiVersion: apps/v1 kind: Deployment metadata: name: mos-redis namespace: mos-namespace spec: selector: matchLabels: name: mos-redis replicas: 1 template: metadata: labels: name: mos-redis spec: containers: - name: mos-redis image: redis volumeMounts: - name: mos mountPath: "/usr/local/etc" command: - "redis-server" args: - 
"/usr/local/etc/redis/redis.conf" volumes: - name: mos configMap: name: redis-conf items: - key: redis.conf path: redis/redis.conf EOF # 创建和查看 pod kubectl apply -f mos_redis.yaml kubectl get pods -n mos-namespace # 注意:configMap 会挂在 /usr/local/etc/redis/redis.conf 上。与 mountPath 和 configMap 下的 path 一同指定 ``` # 四、创建redis-service服务 ```bash # 删除service kubectl delete -f redis-service.yaml -n mos-namespace # 编写redis-service.yaml cat >redis-service.yaml<<\EOF apiVersion: v1 kind: Service metadata: name: redis-production namespace: mos-namespace spec: selector: name: mos-redis ports: - port: 6379 protocol: TCP EOF # 创建service kubectl apply -f redis-service.yaml -n mos-namespace # 查看service kubectl get svc redis-production -n mos-namespace # 查看service详情 kubectl describe svc redis-production -n mos-namespace ``` # 五、验证redis实例 1、普通方式验证 ```bash # 进入到容器 kubectl exec -it `kubectl get pods -n mos-namespace|grep redis|awk '{print $1}'` /bin/bash -n mos-namespace redis-cli -h 127.0.0.1 -a redispassword # 127.0.0.1:6379> set a b # 127.0.0.1:6379> get a "b" # 查看日志(因为配置文件中有配置日志写到容器里的/data/redis.log文件) kubectl exec -it `kubectl get pods -n mos-namespace|grep redis|awk '{print $1}'` /bin/bash -n mos-namespace $ tail -100f /data/redis.log 1:C 14 Nov 2019 06:46:13.476 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo 1:C 14 Nov 2019 06:46:13.476 # Redis version=5.0.6, bits=64, commit=00000000, modified=0, pid=1, just started 1:C 14 Nov 2019 06:46:13.476 # Configuration loaded 1:M 14 Nov 2019 06:46:13.478 * Running mode=standalone, port=6379. 1:M 14 Nov 2019 06:46:13.478 # WARNING: The TCP backlog setting of 30000 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128. 1:M 14 Nov 2019 06:46:13.478 # Server initialized 1:M 14 Nov 2019 06:46:13.478 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled. 
2. Verify through the exposed Service

```bash
# Start a throwaway CentOS 7 container from the command line
kubectl run --image=centos:7.2.1511 centos7-app -it --port=8080 --replicas=1 -n mos-namespace

# Verify through the Service
kubectl exec `kubectl get pods -n mos-namespace|grep centos7-app|awk '{print $1}'` -it /bin/bash -n mos-namespace
yum install -y epel-release
yum install -y redis
redis-cli -h redis-production -a redispassword
```

Reference:

https://www.cnblogs.com/klvchen/p/10862607.html


================================================
FILE: redis/K8s上运行Redis集群指南.md
================================================

Table of Contents
=================

* [1. Foreword](#1-foreword)
* [2. Preparation](#2-preparation)
* [3. A brief introduction to StatefulSet](#3-a-brief-introduction-to-statefulset)
* [4. Deployment process](#4-deployment-process)
  * [1. Create the NFS storage](#1-create-the-nfs-storage)
  * [2. Create the PVs](#2-create-the-pvs)
  * [3. Create the ConfigMap](#3-create-the-configmap)
  * [4. Create the headless Service](#4-create-the-headless-service)
  * [5. Create the Redis cluster nodes](#5-create-the-redis-cluster-nodes)
  * [6. Initialize the Redis cluster](#6-initialize-the-redis-cluster)
  * [7. Create a Service for access](#7-create-a-service-for-access)
* [5. Testing master-slave failover](#5-testing-master-slave-failover)
* [6. Open questions](#6-open-questions)

# 1. Foreword

Architecture in one sentence: `Each master can own multiple slaves. When a master goes offline, the Redis cluster elects a new master from its slaves; when the old master comes back online, it becomes a slave of the new master.`

# 2. Preparation

This deployment is mainly based on this project: `https://github.com/zuxqoj/kubernetes-redis-cluster`

It offers two ways to deploy a Redis cluster:

```bash
StatefulSet
Service & Deployment
```

Each approach has its pros and cons, but for stateful services such as Redis, MongoDB or Zookeeper, StatefulSet is the preferred choice; this document focuses on deploying the Redis cluster with a StatefulSet.

# 3. A brief introduction to StatefulSet

- 1. RC, Deployment and DaemonSet all target stateless services: the IPs, names and start/stop order of the pods they manage are random. A StatefulSet is, as the name suggests, a set with state, for managing stateful services such as MySQL or MongoDB clusters.
- 2. A StatefulSet is essentially a variant of Deployment (GA since v1.9). To support stateful services, the pods it manages have fixed names and a fixed start/stop order; in a StatefulSet the pod name serves as the network identity (hostname), and shared storage is also required.
- 3. A Deployment is fronted by a regular Service; a StatefulSet is fronted by a headless Service. A headless Service differs from a regular Service in that it has no ClusterIP: resolving its name returns the endpoint list of all pods behind it.
- 4. On top of the headless Service, the StatefulSet additionally creates a DNS name for every pod replica it controls, in the format shown below:

```bash
$(podname).$(headless service name)
FQDN: $(podname).$(headless service name).$(namespace).svc.cluster.local
```

- 5. In other words, for stateful services we should address nodes by a stable network identity (such as a DNS name). This also needs support from the application itself (Zookeeper, for example, supports writing hostnames into its configuration file).
- 6. Built on the headless Service (a Service without a ClusterIP), a StatefulSet gives each pod a stable network identity (hostname and DNS records) that survives rescheduling. Combined with PV/PVC, a StatefulSet also provides stable persistent storage: even after a pod is rescheduled, it can still reach its original data.
- 7. Below, Redis is deployed with a StatefulSet: every master and every slave is a replica of the StatefulSet, data is persisted through PVs, and the cluster is exposed as a Service that accepts client requests.

# 4. Deployment process

```bash
1. Create the NFS storage
2. Create the PVs
3. Create the PVCs
4. Create the ConfigMap
5. Create the headless Service
6. Create the Redis StatefulSet
7. Initialize the Redis cluster
```

## 1. Create the NFS storage

Creating NFS storage gives Redis a stable backend: when a Redis pod restarts or migrates, it still finds its original data. We first set up NFS, then use PVs to mount remote NFS paths for Redis.

```bash
yum -y install nfs-utils   # main package, provides the file system
yum -y install rpcbind     # provides the RPC protocol
```

Then add an /etc/exports file listing the paths to share:

```bash
$ cat /etc/exports
/data/nfs/redis/pv1 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv2 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv3 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv4 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv5 *(rw,no_root_squash,sync,insecure)
/data/nfs/redis/pv6 *(rw,no_root_squash,sync,insecure)

# Create the matching directories
mkdir -p /data/nfs/redis/pv{1..6}

# Then start NFS and rpcbind
systemctl restart rpcbind
systemctl restart nfs
systemctl enable nfs
systemctl enable rpcbind

# Check
exportfs -v

# On the clients
yum -y install nfs-utils

# Check the server's exports
showmount -e localhost
```
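Before wiring these exports into PVs it is worth a quick mount test from any client node; a minimal sketch, assuming the NFS server IP 10.198.1.155 used by the PVs in the next step:

```bash
# Smoke-test one export from a client before creating the PVs
mkdir -p /mnt/nfs-test
mount -t nfs 10.198.1.155:/data/nfs/redis/pv1 /mnt/nfs-test
touch /mnt/nfs-test/write-test && ls -l /mnt/nfs-test
umount /mnt/nfs-test
```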
## 2. Create the PVs

Every Redis pod needs its own PV to store its data, so we create a pv.yaml file containing 6 PVs:

```bash
kubectl delete -f pv.yaml

cat >pv.yaml<<\EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv1
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: 10.198.1.155
    path: "/data/nfs/redis/pv1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv2
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: 10.198.1.155
    path: "/data/nfs/redis/pv2"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv3
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: 10.198.1.155
    path: "/data/nfs/redis/pv3"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv4
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: 10.198.1.155
    path: "/data/nfs/redis/pv4"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv5
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: 10.198.1.155
    path: "/data/nfs/redis/pv5"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv6
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: 10.198.1.155
    path: "/data/nfs/redis/pv6"
EOF

kubectl apply -f pv.yaml
```
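After applying, the six volumes should sit in the Available state until the StatefulSet's claims bind them later on:

```bash
# All six PVs should report STATUS Available at this point
kubectl get pv | grep nfs-pv
```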
## 3. Create the ConfigMap

Here we simply turn the Redis configuration file into a ConfigMap, a more convenient way to read configuration. The configuration file redis.conf is as follows:

```bash
# The configuration file redis.conf
cat >redis.conf<<\EOF
appendonly yes
cluster-enabled yes
cluster-config-file /var/lib/redis/nodes.conf
cluster-node-timeout 5000
dir /var/lib/redis
port 6379
EOF

# Delete the ConfigMap named redis-conf, if any
kubectl delete configmap redis-conf

# Create the ConfigMap named redis-conf
kubectl create configmap redis-conf --from-file=redis.conf

# Inspect the created ConfigMap
$ kubectl describe cm redis-conf
Name: redis-conf
Namespace: default
Labels:
Annotations:

Data
====
redis.conf:
----
appendonly yes
cluster-enabled yes
cluster-config-file /var/lib/redis/nodes.conf
cluster-node-timeout 5000
dir /var/lib/redis
port 6379

Events:

# As above, every setting from redis.conf is now stored in the redis-conf ConfigMap.
```

## 4. Create the headless Service

The headless Service is the basis of the StatefulSet's stable network identity, so it must be created first. Prepare headless-service.yaml as follows:

```bash
# Delete the svc, if any
kubectl delete -f headless-service.yaml

# Write the svc
cat >headless-service.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  labels:
    app: redis
spec:
  ports:
  - name: redis-port
    port: 6379
  clusterIP: None
  selector:
    app: redis
    appCluster: redis-cluster
EOF

# Create the svc
kubectl create -f headless-service.yaml

# Check the service
$ kubectl get svc
NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
redis-service   ClusterIP   None         <none>        6379/TCP   0s
```

The Service is named redis-service and its CLUSTER-IP is None, i.e. this is a "headless" service.

## 5. Create the Redis cluster nodes

With the headless Service in place, the StatefulSet can create the Redis cluster nodes, which is the core of this document. First create the redis.yaml file:

```bash
# Clean up the pvc resources
kubectl delete pvc redis-data-redis-app-{0..5}

# Clean up the pod resources
kubectl delete -f redis.yaml

# Write the yaml
cat >redis.yaml<<\EOF
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: redis-app
spec:
  serviceName: "redis-service"
  replicas: 6
  template:
    metadata:
      labels:
        app: redis
        appCluster: redis-cluster
    spec:
      terminationGracePeriodSeconds: 20
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - redis
              topologyKey: kubernetes.io/hostname
      containers:
      - name: redis
        image: redis
        command:
        - "redis-server"              # redis startup command
        args:
        - "/etc/redis/redis.conf"     # arguments to redis-server; each list item is one argument
        - "--protected-mode"          # together with "no", allows access from outside
        - "no"
        resources:                    # resources
          requests:                   # requested resources
            cpu: "100m"               # m means milli-cores, so 100m is 0.1 CPU
            memory: "100Mi"           # 100Mi of memory
        ports:
        - name: redis
          containerPort: 6379
          protocol: "TCP"
        - name: cluster
          containerPort: 16379
          protocol: "TCP"
        volumeMounts:
        - name: "redis-conf"          # mount the file generated from the ConfigMap
          mountPath: "/etc/redis"     # where to mount it
        - name: "redis-data"          # mount path of the persistent volume
          mountPath: "/var/lib/redis"
      volumes:
      - name: "redis-conf"            # reference the configMap volume
        configMap:
          name: "redis-conf"
          items:
          - key: "redis.conf"         # the key created in the ConfigMap
            path: "redis.conf"        # the file name that was given to --from-file
  volumeClaimTemplates:               # PVC templates for the persistent volumes
  - metadata:
      name: redis-data
    spec:
      accessModes: [ "ReadWriteMany" ]
      storageClassName: "nfs"         # note: uses the nfs StorageClass; can be omitted if you rely on the default
      resources:
        requests:
          storage: 20Gi
EOF

# Create the resources
kubectl apply -f redis.yaml
```

podAntiAffinity: anti-affinity decides which pods this pod may not share a topology domain with. It can be used to spread one service's pods across different hosts or topology domains, improving the stability of the service itself.

matchExpressions: tells the scheduler to avoid placing a Redis pod on a node that already runs a pod labeled app=redis, i.e. nodes that already host Redis should, if possible, not receive another Redis pod.

Also, per StatefulSet rules, the six Redis pods' hostnames are numbered in order as $(statefulset name)-$(ordinal), as listed below:

```bash
# kubectl get pods -o wide
NAME          READY   STATUS              RESTARTS   AGE   IP             NODE            NOMINATED NODE
redis-app-0   1/1     Running             0          2h    172.17.24.3    192.168.0.144
redis-app-1   1/1     Running             0          2h    172.17.63.8    192.168.0.148
redis-app-2   1/1     Running             0          2h    172.17.24.8    192.168.0.144
redis-app-3   1/1     Running             0          2h    172.17.63.9    192.168.0.148
redis-app-4   1/1     Running             0          2h    172.17.24.9    192.168.0.144
redis-app-5   1/1     ContainerCreating   0          2h    172.17.63.10   192.168.0.148

As above, the pods are created in order {0..N-1}; note that redis-app-1 does not start until redis-app-0 has reached the Running state.

Each pod also gets a DNS name inside the cluster, in the form $(podname).$(service name).$(namespace).svc.cluster.local, i.e.:

redis-app-0.redis-service.default.svc.cluster.local
redis-app-1.redis-service.default.svc.cluster.local
...and so on...

We can verify this:

#kubectl run --rm curl --image=radial/busyboxplus:curl -it
kubectl run --rm -i --tty busybox --image=busybox:1.28 /bin/sh
$ nslookup redis-app-0.redis-service    # note the format: $(podname).$(service name).$(namespace)
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: redis-app-0.redis-service
Address 1: 172.17.24.3 redis-app-0.redis-service.default.svc.cluster.local

Inside the K8S cluster these pods can reach each other by these names. The same busybox nslookup check as a single command:

$ kubectl run -it --rm --image=busybox:1.28 --restart=Never busybox -- nslookup redis-app-0.redis-service
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: redis-app-0.redis-service
Address 1: 172.17.24.3 redis-app-0.redis-service.default.svc.cluster.local
pod "busybox" deleted

So redis-app-0's IP is 172.17.24.3. If a Redis pod migrates or restarts (try deleting one by hand), the IP changes, but the pod's DNS name, SRV records and A record do not.

The PVs created earlier have all been bound successfully:

$ kubectl get pv|grep nfs-pv
nfs-pv1   20Gi   RWX   Retain   Bound   default/redis-data-redis-app-1   nfs   65s
nfs-pv2   20Gi   RWX   Retain   Bound   default/redis-data-redis-app-0   nfs   65s
nfs-pv3   20Gi   RWX   Retain   Bound   default/redis-data-redis-app-2   nfs   65s
nfs-pv4   20Gi   RWX   Retain   Bound   default/redis-data-redis-app-5   nfs   65s
nfs-pv5   20Gi   RWX   Retain   Bound   default/redis-data-redis-app-3   nfs   65s
nfs-pv6   20Gi   RWX   Retain   Bound   default/redis-data-redis-app-4   nfs   65s

Check the pvc resources:

$ kubectl get pvc
NAME                     STATUS   VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
redis-data-redis-app-0   Bound    nfs-pv2   20Gi       RWX            nfs            96s
redis-data-redis-app-1   Bound    nfs-pv1   20Gi       RWX            nfs            86s
redis-data-redis-app-2   Bound    nfs-pv3   20Gi       RWX            nfs            75s
redis-data-redis-app-3   Bound    nfs-pv5   20Gi       RWX            nfs            69s
redis-data-redis-app-4   Bound    nfs-pv6   20Gi       RWX            nfs            62s
redis-data-redis-app-5   Bound    nfs-pv4   20Gi       RWX            nfs            56s
```
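The initialization in the next section needs all six pods up first; `kubectl wait` (available in recent kubectl releases) can block until that is true:

```bash
# Block until every pod carrying the app=redis label reports Ready
kubectl wait --for=condition=ready pod -l app=redis --timeout=300s
kubectl get statefulset redis-app
```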
## 6. Initialize the Redis cluster

With the six Redis pods created, the cluster still needs to be initialized with the common redis-trib tool.

Create an Ubuntu container

The Redis cluster can only be initialized after all of its nodes are up, and baking the initialization logic into the StatefulSet would be complicated and inefficient. Credit to the original project author's approach here, which is worth learning from: create an extra container on K8S, dedicated to managing and controlling certain services inside the cluster.

So we start an Ubuntu container, install redis-trib in it, and initialize the Redis cluster from there:

```bash
1. # Create an ubuntu container
kubectl run -it ubuntu --image=ubuntu --restart=Never /bin/bash

# Enter the container
kubectl exec -it ubuntu /bin/bash

2. # Use Aliyun's Ubuntu mirrors
$ cat > /etc/apt/sources.list << EOF
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
EOF

3. # Then, as the original project requires, install the basic software environment:
apt-get update
apt-get install -y vim wget python2.7 python-pip redis-tools dnsutils

4. # Initialize the cluster
# First, install redis-trib
pip install redis-trib==0.5.1

# Then create the masters-only cluster
redis-trib.py create \
  `dig +short redis-app-0.redis-service.default.svc.cluster.local`:6379 \
  `dig +short redis-app-1.redis-service.default.svc.cluster.local`:6379 \
  `dig +short redis-app-2.redis-service.default.svc.cluster.local`:6379

# Next, attach a slave to each master
redis-trib.py replicate \
  --master-addr `dig +short redis-app-0.redis-service.default.svc.cluster.local`:6379 \
  --slave-addr `dig +short redis-app-3.redis-service.default.svc.cluster.local`:6379

redis-trib.py replicate \
  --master-addr `dig +short redis-app-1.redis-service.default.svc.cluster.local`:6379 \
  --slave-addr `dig +short redis-app-4.redis-service.default.svc.cluster.local`:6379

redis-trib.py replicate \
  --master-addr `dig +short redis-app-2.redis-service.default.svc.cluster.local`:6379 \
  --slave-addr `dig +short redis-app-5.redis-service.default.svc.cluster.local`:6379

# The Redis cluster now truly exists; connect to any Redis pod to check:
$ kubectl exec -it redis-app-2 /bin/bash
root@redis-app-2:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> cluster nodes
5d3e77f6131c6f272576530b23d1cd7592942eec 172.17.24.3:6379@16379 master - 0 1559628533000 1 connected 0-5461
a4b529c40a920da314c6c93d17dc603625d6412c 172.17.63.10:6379@16379 master - 0 1559628531670 6 connected 10923-16383
368971dc8916611a86577a8726e4f1f3a69c5eb7 172.17.24.9:6379@16379 slave 0025e6140f85cb243c60c214467b7e77bf819ae3 0 1559628533672 4 connected
0025e6140f85cb243c60c214467b7e77bf819ae3 172.17.63.8:6379@16379 master - 0 1559628533000 2 connected 5462-10922
6d5ee94b78b279e7d3c77a55437695662e8c039e 172.17.24.8:6379@16379 myself,slave a4b529c40a920da314c6c93d17dc603625d6412c 0 1559628532000 5 connected
2eb3e06ce914e0e285d6284c4df32573e318bc01 172.17.63.9:6379@16379 slave 5d3e77f6131c6f272576530b23d1cd7592942eec 0 1559628533000 3 connected
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:6
cluster_stats_messages_ping_sent:14910
cluster_stats_messages_pong_sent:15139
cluster_stats_messages_sent:30049
cluster_stats_messages_ping_received:15139
cluster_stats_messages_pong_received:14910
cluster_stats_messages_received:30049
127.0.0.1:6379>

# You can also look at the Redis data mounted on the NFS server:
$ ll /data/nfs/redis/pv3
total 12
-rw-r--r-- 1 root root 92 Jun 4 11:36 appendonly.aof
-rw-r--r-- 1 root root 175 Jun 4 11:36 dump.rdb
-rw-r--r-- 1 root root 794 Jun 4 11:49 nodes.conf
```
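The same health check also works non-interactively from outside the pod, which is handy for scripting; a minimal sketch:

```bash
# Slot coverage and cluster state, checked without entering the pod
kubectl exec redis-app-0 -- redis-cli cluster info | egrep 'cluster_state|cluster_slots_assigned|cluster_known_nodes'
```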
## 7. Create a Service for access

The headless Service created earlier underpins the StatefulSet, but it has no ClusterIP and cannot serve outside access. So we create one more Service, dedicated to access and load balancing for the Redis cluster:

```bash
# Delete the service, if any
kubectl delete -f redis-access-service.yaml

# Write the yaml
cat >redis-access-service.yaml<<\EOF
apiVersion: v1
kind: Service
metadata:
  name: redis-access-service
  labels:
    app: redis
spec:
  type: NodePort
  ports:
  - name: redis-port
    protocol: "TCP"
    port: 6379
    targetPort: 6379
    nodePort: 30010
  selector:
    app: redis
    appCluster: redis-cluster
EOF

# As above, the Service is named redis-access-service, exposes port 6379 inside the K8S cluster, and load-balances over the pods labeled both app: redis and appCluster: redis-cluster.

# Create the service
kubectl apply -f redis-access-service.yaml

# Check the svc
$ kubectl get svc redis-access-service -o wide
NAME                   TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE   SELECTOR
redis-access-service   NodePort   10.111.59.191   <none>        6379:30010/TCP   83m   app=redis,appCluster=redis-cluster

# As above, inside the K8S cluster every application can reach the Redis cluster at 10.111.59.191:6379; for easier testing, the Service also maps NodePort 30010 onto the hosts.

# Check the svc details
$ kubectl describe svc redis-access-service
Name: redis-access-service
Namespace: default
Labels: app=redis
Annotations: kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"redis"},"name":"redis-access-service","namespace":"defau...
Selector: app=redis,appCluster=redis-cluster
Type: NodePort
IP: 10.111.59.191
Port: redis-port 6379/TCP
TargetPort: 6379/TCP
NodePort: redis-port 30010/TCP
Endpoints: 10.244.1.230:6379,10.244.1.231:6379,10.244.1.232:6379 + 3 more...
Session Affinity: None
External Traffic Policy: Cluster
Events:

# In-cluster test (via the service IP)
yum install redis -y
redis-cli -h 10.111.59.191 -p 6379 -c
10.111.59.191:6379> CLUSTER info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:5
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:3
cluster_stats_messages_ping_sent:766
cluster_stats_messages_pong_sent:790
cluster_stats_messages_meet_sent:2
cluster_stats_messages_sent:1558
cluster_stats_messages_ping_received:787
cluster_stats_messages_pong_received:768
cluster_stats_messages_meet_received:3
cluster_stats_messages_received:1558

# Host port test (with the cluster protocol)
redis-cli -h 10.198.1.156 -p 30010 -c
10.198.1.156:30010> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:5
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:2
cluster_stats_messages_ping_sent:907
cluster_stats_messages_pong_sent:901
cluster_stats_messages_meet_sent:3
cluster_stats_messages_sent:1811
cluster_stats_messages_ping_received:900
cluster_stats_messages_pong_received:910
cluster_stats_messages_meet_received:1
cluster_stats_messages_received:1811
```
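Beyond `cluster info`, an end-to-end write and read through the NodePort proves the slot redirections work; the node IP 10.198.1.156 and port 30010 are the example values from above:

```bash
# A write followed by a read through the NodePort; -c follows MOVED redirects
redis-cli -h 10.198.1.156 -p 30010 -c set k8s-smoke-test hello
redis-cli -h 10.198.1.156 -p 30010 -c get k8s-smoke-test
```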
# 5. Testing master-slave failover

With the Redis cluster running on K8S, what matters most is whether its native high-availability mechanism still works. We can pick any master pod to test the failover, e.g. redis-app-0:

```bash
[root@master redis]# kubectl get pods redis-app-0 -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP            NODE            NOMINATED NODE
redis-app-0   1/1     Running   0          3h    172.17.24.3   192.168.0.144

Enter redis-app-0 and look:

[root@master redis]# kubectl exec -it redis-app-0 /bin/bash
root@redis-app-0:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> role
1) "master"
2) (integer) 13370
3) 1) 1) "172.17.63.9"
      2) "6379"
      3) "13370"
127.0.0.1:6379>

As above, redis-app-0 is a master, and its slave is 172.17.63.9, i.e. redis-app-3.

Next, delete redis-app-0 by hand:

[root@master redis]# kubectl delete pod redis-app-0
pod "redis-app-0" deleted

[root@master redis]# kubectl get pod redis-app-0 -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP            NODE            NOMINATED NODE
redis-app-0   1/1     Running   0          4m    172.17.24.3   192.168.0.144

Enter redis-app-0 again and look:

[root@master redis]# kubectl exec -it redis-app-0 /bin/bash
root@redis-app-0:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> role
1) "slave"
2) "172.17.63.9"
3) (integer) 6379
4) "connected"
5) (integer) 13958

As above, redis-app-0 has become a slave, subordinate to its former slave 172.17.63.9, i.e. redis-app-3.
```
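While repeating this test it helps to watch the pod being recreated from a second terminal; the label selector matches the StatefulSet template above:

```bash
# In another terminal: watch redis-app-0 disappear and come back
kubectl get pods -l app=redis -o wide -w
```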
# 6. Open questions

1. Pods restart and their IPs change; how is cluster health maintained?

```
At this point you may wonder: little of the above actually shows off the StatefulSet. Its stable identities redis-app-* were only used while initializing the cluster; later Redis pod communication and configuration never use them. True: this StatefulSet-based Redis deployment does not demonstrate the advantage as clearly as, say, a Zookeeper cluster would, but the knowledge still carries over.

Then why does failover keep working even without the stable names? That comes from Redis's own mechanism. Every node in a Redis cluster has a NodeId (saved in the auto-generated nodes.conf), and the NodeId does not change when the IP changes; it is itself a kind of fixed network identity. Even if a Redis pod restarts, it loads its saved NodeId and keeps its identity. We can read redis-app-1's nodes.conf on the NFS server:

$ cat /usr/local/k8s/redis/pv1/nodes.conf
96689f2018089173e528d3a71c4ef10af68ee462 192.168.169.209:6379@16379 slave d884c4971de9748f99b10d14678d864187a9e5d3 0 1526460952651 4 connected
237d46046d9b75a6822f02523ab894928e2300e6 192.168.169.200:6379@16379 slave c15f378a604ee5b200f06cc23e9371cbc04f4559 0 1526460952651 1 connected
c15f378a604ee5b200f06cc23e9371cbc04f4559 192.168.169.197:6379@16379 master - 0 1526460952651 1 connected 10923-16383
d884c4971de9748f99b10d14678d864187a9e5d3 192.168.169.205:6379@16379 master - 0 1526460952651 4 connected 5462-10922
c3b4ae23c80ffe31b7b34ef29dd6f8d73beaf85f 192.168.169.198:6379@16379 myself,slave c8a8f70b4c29333de6039c47b2f3453ed11fb5c2 0 1526460952565 3 connected
c8a8f70b4c29333de6039c47b2f3453ed11fb5c2 192.168.169.201:6379@16379 master - 0 1526460952651 6 connected 0-5461
vars currentEpoch 6 lastVoteEpoch 4

As above, the first column is the NodeId, which is stable; the second column (IP and port) may change.

Two scenarios where the NodeId matters:

When a slave pod reconnects with a new IP, the master sees the same NodeId and treats it as the slave it already knew.

When a master pod goes offline, the cluster elects a new master from its slaves. When the old master comes back online, the cluster sees the same NodeId and turns the old master into a slave of the new one.
```

2. PVC fails to bind (error: storageclass.storage.k8s.io "nfs" not found)

```
$ kubectl describe pvc redis-data-redis-app-0
Warning ProvisioningFailed 14s (x2 over 24s) persistentvolume-controller storageclass.storage.k8s.io "nfs" not found

# Cause: the PVs were created without storageClassName: nfs
```

References:

https://cloud.tencent.com/developer/article/1392872 Scaling Redis dynamically
https://blog.csdn.net/zhutongcloud/article/details/90768390 Deploying a Redis cluster
https://www.jianshu.com/p/65c4baadf5d9 Why the NodeId explains Redis failover



================================================
FILE: redis/README.md
================================================

References:

https://mp.weixin.qq.com/s/noVUEO5tbdcdx8AzYNrsMw Testing a Redis Cluster on Kubernetes with a StatefulSet



================================================
FILE: rke/README.md
================================================

# 1. Basic configuration tuning

```
chattr -i /etc/passwd* && chattr -i /etc/group* && chattr -i /etc/shadow* && chattr -i /etc/gshadow*

groupadd docker
useradd -g docker docker
echo "1Qaz2Wsx3Edc" | passwd --stdin docker
usermod docker -G docker    # note: the user's supplementary group must be changed to the docker group, otherwise later steps will fail

setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config    # disable selinux

systemctl daemon-reload
systemctl stop firewalld.service && systemctl disable firewalld.service    # disable the firewall

#echo 'LANG="en_US.UTF-8"' >> /etc/profile; source /etc/profile    # change the system language
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime    # change the timezone (if needed)

# Performance tuning
cat > /etc/sysctl.d/k8s.conf <<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
sysctl --system

# Passwordless SSH login for the docker user
mkdir -p /home/docker/.ssh/
chmod 700 /home/docker/.ssh/
echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bRm20od1b3rzW3ZPLB5NZn3jQesvfiz2p0WlfcYJrFHfF5Ap0ubIBUSQpVNLn94u8ABGBLboZL8Pjo+rXQPkIcObJxoKS8gz6ZOxcxJhl11JKxTz7s49nNYaNDIwB13KaNpvBEHVoW3frUnP+RnIKIIDsr1QCr9t64D9TE99mbNkEvDXr021UQi12Bf4KP/8gfYK3hDMRuX634/K8yu7+IaO1vEPNT8HDo9XGcvrOD1QGV+is8mrU53Xa2qTsto7AOb2J8M6n1mSZxgNz2oGc6ZDuN1iMBfHm4O/s5VEgbttzB2PtI0meKeaLt8VaqwTth631EN1ryjRYUuav7bf docker@k8s-master-01' > /home/docker/.ssh/authorized_keys
chmod 400 /home/docker/.ssh/authorized_keys
```
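A quick way to confirm the kernel parameters took effect after `sysctl --system` (the net.bridge keys require the br_netfilter module to be loaded):

```bash
# Each value should match what was written to /etc/sysctl.d/k8s.conf
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables vm.swappiness
```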
# 2. Base environment preparation

```
mkdir -p /etc/yum.repos.d_bak/
mv /etc/yum.repos.d/* /etc/yum.repos.d_bak/

curl http://mirrors.aliyun.com/repo/Centos-7.repo >/etc/yum.repos.d/Centos-7.repo
curl http://mirrors.aliyun.com/repo/epel-7.repo >/etc/yum.repos.d/epel-7.repo
sed -i '/aliyuncs/d' /etc/yum.repos.d/Centos-7.repo
yum clean all && yum makecache fast

yum -y install yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y device-mapper-persistent-data lvm2
yum install docker-ce -y

# Since docker 1.13, docker automatically sets the iptables FORWARD default policy to DROP, so the startup unit /usr/lib/systemd/system/docker.service needs to be adjusted
cat > /usr/lib/systemd/system/docker.service << \EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
ExecReload=/bin/kill -s HUP \$MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

# Configure a registry mirror
curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://41935bf4.m.daocloud.io
# This script has a bug on CentOS 7: it edits /etc/docker/daemon.json but leaves an extra comma, which keeps docker from starting

# Or write the file directly
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://1z45x7d0.mirror.aliyuncs.com"],
  "insecure-registries": ["192.168.56.11:5000"],
  "storage-driver": "overlay2",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF

systemctl daemon-reload
systemctl restart docker

# Check that the mirror is in effect
root># docker info
Registry Mirrors:
 https://1z45x7d0.mirror.aliyuncs.com/    -- the setting has taken effect
Live Restore Enabled: false
```

# 3. Install RKE

Installing with RKE requires docker to be installed first and passwordless SSH for both root and the ordinary user.

1. Download RKE

```
# Packages can be downloaded from https://github.com/rancher/rke/releases; the example below uses v0.2.8. After downloading, upload the binary to any node.
wget https://github.com/rancher/rke/releases/download/v0.2.8/rke_linux-amd64
chmod 777 rke_linux-amd64
mv rke_linux-amd64 /usr/local/bin/rke
```

2. Create the cluster configuration file

```
# Write the cluster configuration to /tmp/cluster.yml; the full configuration
# used here is the rke/cluster.yml file shown below in this repository
cat >/tmp/cluster.yml <<EOF
# ... cluster.yml content (see rke/cluster.yml below) ...
EOF
```

3. Bring the cluster up

```
# Assumption: after a successful run, rke writes the kubeconfig next to the
# cluster file as kube_config_cluster.yml
rke up --config /tmp/cluster.yml
echo "export KUBECONFIG=/tmp/kube_config_cluster.yml" >> ~/.bashrc
```

# 4. Deploy rancher on the k8s cluster with helm

1. Install and configure the helm client

```
# One-step install with the official script
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh

# Or download and install manually
# Download Helm
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.9.1-linux-amd64.tar.gz
# Unpack Helm
tar -zxvf helm-v2.9.1-linux-amd64.tar.gz
# Copy the client binary into the bin directory
cp linux-amd64/helm /usr/local/bin/
```

2. Give the helm client access to the k8s cluster

```
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
```

3. Deploy the helm server (tiller) onto the k8s cluster

```
helm init --service-account tiller --tiller-image hongxiaolu/tiller:v2.12.3 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
```

4. Add the chart repository to the helm client

```
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
```

5. Check that the rancher chart repository is available

```
helm search rancher
```

```
# Install the certificate manager
helm install stable/cert-manager \
  --name cert-manager \
  --namespace kube-system

kubectl get pods --all-namespaces|grep cert-manager

helm install rancher-stable/rancher \
  --name rancher \
  --namespace cattle-system \
  --set hostname=acai.rancher.com
```
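Before installing the charts it is worth confirming that tiller actually came up; a minimal check, assuming the helm 2 client installed above and tiller's usual name=tiller label:

```bash
# The tiller pod should be Running, and both client and server versions should print
kubectl -n kube-system get pods -l name=tiller
helm version --short
```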
References:

http://www.acaiblog.cn/2019/03/15/RKE%E9%83%A8%E7%BD%B2rancher%E9%AB%98%E5%8F%AF%E7%94%A8%E9%9B%86%E7%BE%A4/
https://blog.csdn.net/login_sonata/article/details/93847888



================================================
FILE: rke/cluster.yml
================================================

# If you intend to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: 10.198.1.156
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: ""
  user: k8s
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
- address: 10.198.1.157
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: ""
  user: k8s
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
- address: 10.198.1.158
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: k8s
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
- address: 10.198.1.159
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: k8s
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
- address: 10.198.1.160
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: k8s
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    snapshot: null
    retention: ""
    creation: ""
  kube-api:
    image: ""
    extra_args:
      enable-admission-plugins: NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Initializers
      runtime-config: api/all=true,admissionregistration.k8s.io/v1alpha1=true
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.44.0.0/16
    service_node_port_range: ""
    pod_security_policy: true
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.46.0.0/16
    service_cluster_ip_range: 10.44.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args:
      enforce-node-allocatable: "pods,kube-reserved,system-reserved"
      system-reserved-cgroup: "/system.slice"
      system-reserved: "cpu=500m,memory=1Gi"
      kube-reserved-cgroup: "/system.slice/kubelet.service"
      kube-reserved: "cpu=1,memory=2Gi"
      eviction-soft: "memory.available<10%,nodefs.available<10%,imagefs.available<10%"
      eviction-soft-grace-period: "memory.available=2m,nodefs.available=2m,imagefs.available=2m"
    extra_binds: []
    extra_env: []
    cluster_domain: k8s.test.net
    infra_container_image: ""
    cluster_dns_server: 10.44.0.10
    fail_swap_on: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: calico
  options: {}
authentication:
  strategy: x509
  options: {}
  sans: []
addons: ""
addons_include: []
system_images:
  etcd: rancher/coreos-etcd:v3.2.24
  alpine: rancher/rke-tools:v0.1.25
  nginx_proxy: rancher/rke-tools:v0.1.25
  cert_downloader: rancher/rke-tools:v0.1.25
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.25
  kubedns: rancher/k8s-dns-kube-dns-amd64:1.14.13
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny-amd64:1.14.13
  kubedns_sidecar: rancher/k8s-dns-sidecar-amd64:1.14.13
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler-amd64:1.0.0
  kubernetes: rancher/hyperkube:v1.12.6-rancher1
  flannel: rancher/coreos-flannel:v0.10.0
  flannel_cni: rancher/coreos-flannel-cni:v0.3.0
  calico_node: rancher/calico-node:v3.1.3
  calico_cni: rancher/calico-cni:v3.1.3
  calico_controllers: ""
  calico_ctl: rancher/calico-ctl:v2.0.0
  canal_node: rancher/calico-node:v3.1.3
  canal_cni: rancher/calico-cni:v3.1.3
  canal_flannel: rancher/coreos-flannel:v0.10.0
  wave_node: weaveworks/weave-kube:2.1.2
  weave_cni: weaveworks/weave-npc:2.1.2
  pod_infra_container: rancher/pause-amd64:3.1
  ingress: rancher/nginx-ingress-controller:0.21.0-rancher1
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.4
  metrics_server: rancher/metrics-server-amd64:v0.3.1
ssh_key_path: ~/.ssh/id_rsa
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
monitoring:
  provider: ""
  options: {}


================================================
FILE: tools/Linux Kernel 升级.md
================================================

# Linux Kernel upgrade

Many features of k8s, docker, cilium and others need a reasonably new Linux kernel, so it is worth upgrading the kernel before deploying the cluster; on CentOS7 and Ubuntu16.04 this is straightforward.

## CentOS7

The Red Hat Enterprise Linux repository site https://www.elrepo.org mainly offers hardware drivers (graphics, network and sound cards) and kernel-upgrade resources, and is compatible with CentOS7 kernel upgrades. Following the site's instructions, import the elrepo key, install the latest elrepo release, and then upgrade step by step (using the long-term support kernel-lt as the example):

``` bash
# Import the key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# Install ELRepo
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm

# Load the elrepo-kernel metadata
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist

# List the available rpm packages
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*

# Install the long-term support kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64

# Remove the old tool packages
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y

# Install the new tool packages
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64

# Look at the default boot order
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.208-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-1062.9.1.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-957.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-292a31ba53a34a6aa077e3467b6f9541) 7 (Core)

# Boot entries are numbered from 0 and the new kernel is inserted at the top (the new 4.4 kernel sits at position 0, the old 3.10 kernel at 1), so select 0.
grub2-set-default 0

# Make the first kernel the default
sed -i 's/GRUB_DEFAULT=saved/GRUB_DEFAULT=0/g' /etc/default/grub

# Regenerate the grub config
grub2-mkconfig -o /boot/grub2/grub.cfg

# Reboot and check
reboot
```
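After the reboot, confirm the machine actually booted the new kernel:

```bash
# Should print the freshly installed kernel-lt version, e.g. 4.4.x
uname -r
```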
## Ubuntu16.04

``` bash
Open http://kernel.ubuntu.com/~kernel-ppa/mainline/ and pick the version you need from the list (4.16.3 as the example).

Then download the following .deb files for your architecture:

Build for amd64 succeeded (see BUILD.LOG.amd64):
linux-headers-4.16.3-041603_4.16.3-041603.201804190730_all.deb
linux-headers-4.16.3-041603-generic_4.16.3-041603.201804190730_amd64.deb
linux-image-4.16.3-041603-generic_4.16.3-041603.201804190730_amd64.deb

# Install them, then reboot
$ sudo dpkg -i *.deb
```

Reference:

https://github.com/easzlab/kubeasz/blob/master/docs/guide/kernel_upgrade.md


================================================
FILE: tools/README.md
================================================

# Sync tools

1. Sync the hosts file to all machines

```
[root@master01 ~]# ./ssh_copy.sh /etc/hosts
spawn scp /etc/hosts root@master01:/etc/hosts
hosts 100% 440 940.4KB/s 00:00
spawn scp /etc/hosts root@master02:/etc/hosts
hosts 100% 440 774.6KB/s 00:00
spawn scp /etc/hosts root@master03:/etc/hosts
hosts 100% 440 1.4MB/s 00:00
spawn scp /etc/hosts root@slave01:/etc/hosts
hosts 100% 440 912.6KB/s 00:00
spawn scp /etc/hosts root@slave02:/etc/hosts
hosts 100% 440 826.8KB/s 00:00
spawn scp /etc/hosts root@slave03:/etc/hosts
hosts
```

2. iptables with multiple ports

```bash
# iptables multiport rule
-A RH-Firewall-1-INPUT -s 13.138.33.20/32 -p tcp -m tcp -m multiport --dports 80,443,6443,20000:40000 -j ACCEPT

# Sync the firewall config
./ssh_copy.sh /etc/sysconfig/iptables
```


================================================
FILE: tools/k8s域名解析coredns问题排查过程.md
================================================

References:

https://segmentfault.com/a/1190000019823091?utm_source=tag-newest


================================================
FILE: tools/kubernetes-node打标签.md
================================================

```
kubectl get nodes -A --show-labels
kubectl label nodes 10.199.1.159 node=10.199.1.159
kubectl label nodes 10.199.1.160 node=10.199.1.160
```


================================================
FILE: tools/kubernetes-常用操作.md
================================================

# 1. Node scheduling

```
[root@master01 ~]# kubectl get nodes -A
NAME         STATUS                     ROLES    AGE     VERSION
10.19.2.246  Ready                      node     3h13m   v1.15.2
10.19.2.247  Ready                      node     3h13m   v1.15.2
10.19.2.248  Ready                      node     3h13m   v1.15.2
10.19.2.56   Ready,SchedulingDisabled   master   4h55m   v1.15.2
10.19.2.57   Ready,SchedulingDisabled   master   4h55m   v1.15.2
10.19.2.58   Ready,SchedulingDisabled   master   4h55m   v1.15.2

# Method 1
[root@master01 ~]# kubectl uncordon 10.19.2.56
node/10.19.2.56 uncordoned

[root@master01 ~]# kubectl get nodes -A
NAME         STATUS                     ROLES    AGE     VERSION
10.19.2.246  Ready                      node     3h13m   v1.15.2
10.19.2.247  Ready                      node     3h13m   v1.15.2
10.19.2.248  Ready                      node     3h13m   v1.15.2
10.19.2.56   Ready                      master   4h56m   v1.15.2
10.19.2.57   Ready,SchedulingDisabled   master   4h56m   v1.15.2
10.19.2.58   Ready,SchedulingDisabled   master   4h56m   v1.15.2

# Method 2
[root@master01 ~]# kubectl patch node 10.19.2.56 -p '{"spec":{"unschedulable":false}}'
node/10.19.2.56 patched

[root@master01 ~]# kubectl get nodes -A
NAME         STATUS                     ROLES    AGE     VERSION
10.19.2.246  Ready                      node     3h17m   v1.15.2
10.19.2.247  Ready                      node     3h17m   v1.15.2
10.19.2.248  Ready                      node     3h17m   v1.15.2
10.19.2.56   Ready                      master   5h      v1.15.2
10.19.2.57   Ready,SchedulingDisabled   master   5h      v1.15.2
10.19.2.58   Ready,SchedulingDisabled   master   5h      v1.15.2
```
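The reverse operation, taking a node out of scheduling (and optionally evicting its pods) before maintenance, is a sketch along these lines:

```bash
# Stop scheduling new pods onto the node
kubectl cordon 10.19.2.56

# Optionally evict the pods already running there (daemonsets stay)
kubectl drain 10.19.2.56 --ignore-daemonsets --delete-local-data

# Re-enable scheduling afterwards
kubectl uncordon 10.19.2.56
```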
# 2. Viewing labels

```
[root@master01 ~]# kubectl get nodes --show-labels
NAME         STATUS                     ROLES    AGE     VERSION   LABELS
10.19.2.246  Ready                      node     3h15m   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.246,kubernetes.io/os=linux,kubernetes.io/role=node
10.19.2.247  Ready                      node     3h15m   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.247,kubernetes.io/os=linux,kubernetes.io/role=node
10.19.2.248  Ready                      node     3h15m   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.248,kubernetes.io/os=linux,kubernetes.io/role=node
10.19.2.56   Ready                      master   4h57m   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.56,kubernetes.io/os=linux,kubernetes.io/role=master
10.19.2.57   Ready,SchedulingDisabled   master   4h57m   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.57,kubernetes.io/os=linux,kubernetes.io/role=master
10.19.2.58   Ready,SchedulingDisabled   master   4h57m   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=10.19.2.58,kubernetes.io/os=linux,kubernetes.io/role=master
```

Reference:

https://blog.csdn.net/miss1181248983/article/details/88181434 Common kubectl commands


================================================
FILE: tools/kubernetes-批量删除Pods.md
================================================

# 1. Batch-delete pods stuck in the Pending state

```
kubectl get pods | grep Pending | awk '{print $1}' | xargs kubectl delete pod
```

# 2. Batch-delete pods stuck in the Evicted state

```
kubectl get pods | grep Evicted | awk '{print $1}' | xargs kubectl delete pod
```

Reference:

https://blog.csdn.net/weixin_39686421/article/details/80574131 Batch-deleting Evicted pods in kubernetes


================================================
FILE: tools/kubernetes访问外部mysql服务.md
================================================

` The best way for k8s to reach a standalone service outside the cluster is the Endpoints approach (think of it as abstracting an out-of-cluster service into an in-cluster one); MySQL is the example here. `

# 1. Create the Endpoints

```bash
# Create mysql-endpoints.yaml
cat > mysql-endpoints.yaml <<\EOF
kind: Endpoints
apiVersion: v1
metadata:
  name: mysql-production
  namespace: default
subsets:
- addresses:
  - ip: 10.198.1.155
  ports:
  - port: 3306
EOF

kubectl apply -f mysql-endpoints.yaml
```

# 2. Create the Service

```bash
# Create mysql-service.yaml
cat > mysql-service.yaml <<\EOF
apiVersion: v1
kind: Service
metadata:
  name: mysql-production
spec:
  ports:
  - port: 3306
EOF

kubectl apply -f mysql-service.yaml
```

# 3. Test the database connection

```bash
cat > mysql-rc.yaml <<\EOF
apiVersion: v1
kind: ReplicationController
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: docker.io/mysql:5.7
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"
EOF

kubectl apply -f mysql-rc.yaml
```

References:

https://blog.csdn.net/hxpjava1/article/details/80040407 Accessing external MySQL/Redis services from Kubernetes


================================================
FILE: tools/ssh_copy.sh
================================================

#!/bin/bash
# Copy a file to the same path on every host in the list, answering the
# ssh prompts with expect (the password is hard-coded below)
for i in `echo master01 master02 master03 slave01 slave02 slave03`;do
expect -c "
spawn scp $1 root@$i:$1
expect {
  \"*yes/no*\" {send \"yes\r\"; exp_continue}
  \"*password*\" {send \"123456\r\"; exp_continue}
  \"*Password*\" {send \"123456\r\";}
}
"
done