系统环境:
Kafka 是由 Apache 软件基金会开发的一个开源流处理平台,由 Scala 和 Java 编写。它是一个分布式、支持分区的的、多副本,基于 zookeeper 协调的分布式消息系统。它最大特性是可以实时的处理大量数据以满足各种需求场景,比如基于 Hadoop 的批处理系统、低延时的实时系统、storm/Spark 流式处理引擎,web/nginx 日志,访问日志,消息服务等等。
ZooKeeper 是一种分布式协调服务,用于管理大型主机。在分布式环境中协调和管理服务是一个复杂的过程。ZooKeeper 通过其简单的架构和 API 解决了这个问题。 ZooKeeper 允许开发人员专注于核心应用程序逻辑,而不必担心应用程序的分布式特性。
Kafka 中主要利用 zookeeper 解决分布式一致性问题。Kafka 使用 Zookeeper 的分布式协调服务将生产者,消费者,消息储存结合在一起。同时借助 Zookeeper,Kafka 能够将生产者、消费者和 Broker 在内的所有组件在无状态的条件下建立起生产者和消费者的订阅关系,实现生产者的负载均衡。
Kafka Manager 是目前最受欢迎的 Kafka 集群管理工具,最早由雅虎开源,用户可以在 Web 界面执行一些简单的集群管理操作。
支持以下功能:
这个流程需要部署三个组件,分别为 Zookeeper、Kafka、Kafka Manager:
由于都是使用 StatefulSet 方式部署的有状态服务,所以 Kubernetes 集群需要提前设置一个 StorageClass 方便后续部署时指定存储分配(如果想指定为已经存在的 StorageClass 创建 PV 则跳过此步骤)。
此处用的是 NFS 存储驱动,如果是其它存储需要提前设置好相关配置
nfs-storage.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nfs-storage
provisioner: nfs-client #动态卷分配服务指定的名称
parameters:
archiveOnDelete: "true" #设置为"false"时删除PVC不会保留数据,"true"则保留数据
mountOptions:
- hard #指定为硬挂载方式
- nfsvers=4 #指定NFS版本
$ kubectl apply -f nfs-storage.yaml
zookeeper.yaml
#部署 Service Headless,用于Zookeeper间相互通信
apiVersion: v1
kind: Service
metadata:
name: zookeeper-headless
labels:
app: zookeeper
spec:
type: ClusterIP
clusterIP: None
publishNotReadyAddresses: true
ports:
- name: client
port: 2181
targetPort: client
- name: follower
port: 2888
targetPort: follower
- name: election
port: 3888
targetPort: election
selector:
app: zookeeper
---
#部署 Service,用于外部访问 Zookeeper
apiVersion: v1
kind: Service
metadata:
name: zookeeper
labels:
app: zookeeper
spec:
type: ClusterIP
ports:
- name: client
port: 2181
targetPort: client
- name: follower
port: 2888
targetPort: follower
- name: election
port: 3888
targetPort: election
selector:
app: zookeeper
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: zookeeper
labels:
app: zookeeper
spec:
serviceName: zookeeper-headless
replicas: 3
podManagementPolicy: Parallel
updateStrategy:
type: RollingUpdate
selector:
matchLabels:
app: zookeeper
template:
metadata:
name: zookeeper
labels:
app: zookeeper
spec:
securityContext:
fsGroup: 1001
containers:
- name: zookeeper
image: docker.io/bitnami/zookeeper:3.4.14-debian-9-r25
imagePullPolicy: IfNotPresent
securityContext:
runAsUser: 1001
command:
- bash
- -ec
- |
# Execute entrypoint as usual after obtaining ZOO_SERVER_ID based on POD hostname
HOSTNAME=`hostname -s`
if [[ $HOSTNAME =~ (.*)-([0-9]+)$ ]]; then
ORD=${BASH_REMATCH[2]}
export ZOO_SERVER_ID=$((ORD+1))
else
echo "Failed to get index from hostname $HOST"
exit 1
fi
. /opt/bitnami/base/functions
. /opt/bitnami/base/helpers
print_welcome_page
. /init.sh
nami_initialize zookeeper
exec tini -- /run.sh
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 250m
memory: 256Mi
env:
- name: ZOO_PORT_NUMBER
value: "2181"
- name: ZOO_TICK_TIME
value: "2000"
- name: ZOO_INIT_LIMIT
value: "10"
- name: ZOO_SYNC_LIMIT
value: "5"
- name: ZOO_MAX_CLIENT_CNXNS
value: "60"
- name: ZOO_SERVERS
value: "
zookeeper-0.zookeeper-headless:2888:3888,
zookeeper-1.zookeeper-headless:2888:3888,
zookeeper-2.zookeeper-headless:2888:3888
"
- name: ZOO_ENABLE_AUTH
value: "no"
- name: ZOO_HEAP_SIZE
value: "1024"
- name: ZOO_LOG_LEVEL
value: "ERROR"
- name: ALLOW_ANONYMOUS_LOGIN
value: "yes"
ports:
- name: client
containerPort: 2181
- name: follower
containerPort: 2888
- name: election
containerPort: 3888
livenessProbe:
tcpSocket:
port: client
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 6
readinessProbe:
tcpSocket:
port: client
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 6
volumeMounts:
- name: data
mountPath: /bitnami/zookeeper
volumeClaimTemplates:
- metadata:
name: data
annotations:
spec:
storageClassName: nfs-storage #指定为上面创建的 storageclass
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5