Preface

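For Docker container clusters, the more mature options include Swarm, Mesos, and Google's Kubernetes (k8s). Kubernetes in particular has been adopted and promoted by more vendors, but its comparatively steep learning curve puts many users off. Fortunately, on June 7 of this year Docker open-sourced SwarmKit, a native cluster management toolkit that mainly provides container clustering and orchestration. Let's try it out and see what it has to offer.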


SwarmKit Architecture




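SwarmKit has two roles, Manager and Worker. The Manager mainly manages nodes and schedules tasks. The Worker mainly runs tasks through an Executor; the current default Executor is the Docker Container Executor. SwarmKit has the following features:

(1) Built-in distributed storage, so no additional database is required
(2) Support for rolling updates
(3) Container HA, with support for zero application downtime
(4) Communication secured with TLS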


Now let's install it.

SwarmKit Environment Setup

Deployment environment

VMware Workstation 12

Ubuntu 14.04


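First build a generic Ubuntu 14.04 virtual machine, install Docker and the SwarmKit binaries on it, then clone the virtual machine to produce a three-node cluster: one Manager and two Workers.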

1. Install Ubuntu 14.04

2. Install Docker

curl -sSL https://get.docker.io | bash

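3. SwarmKit is written in Go, so a Go environment is needed. Download the Go package

Download URL: http://www.golangtc.com/download

I downloaded go1.6.2.linux-amd64.tar.gz

4. Extract it to /usr/local

Then add the following environment variables to /etc/profile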


export GOROOT=/usr/local/go
export PATH=/root/go/src/github.com/docker/swarmkit/bin:/usr/local/go/bin:$PATH
export GOPATH=/root/go
export SWARM_SOCKET=/tmp/controller/swarm.sock
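Note: the variables above include everything used in the rest of this article.

Once the environment variables take effect, typing go prints the command help.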
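
As a rough sketch of steps 3 and 4, the helper below unpacks the Go tarball into a prefix directory. The function name install_go and its optional second argument are illustrative assumptions of mine; the sketch also assumes the archive contains a top-level go/ directory, which is how the go1.6.2.linux-amd64.tar.gz package is laid out.

```shell
# Unpack a Go tarball into a prefix directory (default: /usr/local).
# install_go is a hypothetical helper name, not part of Go or SwarmKit.
install_go() {
    tarball=$1
    prefix=${2:-/usr/local}
    mkdir -p "$prefix" && tar -C "$prefix" -xzf "$tarball"
}

# Usage (as root), then reload the profile and confirm the toolchain is on PATH:
#   install_go go1.6.2.linux-amd64.tar.gz
#   . /etc/profile
#   go version
```

With the default prefix, the toolchain ends up in /usr/local/go, which matches the GOROOT value above.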


5. Install git

apt-get install git

6. Run the following command to download the swarmkit source

$ go get github.com/docker/swarmkit
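Note: the GOPATH environment variable must be set (/root/go, as above); the source code is downloaded under that directory

7. Change to the /root/go/src/github.com/docker/swarmkit directory and run make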

root@controller:~/go/src/github.com/docker/swarmkit# pwd
/root/go/src/github.com/docker/swarmkit
root@controller:~/go/src/github.com/docker/swarmkit# ls
agent  bin          ca          cmd          CONTRIBUTING.md  Godeps    ioutils  log          Makefile  NOMENCLATURE.md  protobuf   vendor
api    BUILDING.md  circle.yml  codecov.yml  doc.go           identity  LICENSE  MAINTAINERS  manager   picker           README.md  version

8. The generated binaries are in the bin folder of that directory

root@controller:~/go/src/github.com/docker/swarmkit/bin# ll
total 76092
drwxr-xr-x  2 root root     4096 Jun 30 10:00 ./
drwxr-xr-x 17 root root     4096 Jun 30 10:00 ../
-rwxr-xr-x  1 root root 12929120 Jun 30 10:00 protoc-gen-gogoswarm*
-rwxr-xr-x  1 root root 18044776 Jun 30 10:00 swarm-bench*
-rwxr-xr-x  1 root root 19219800 Jun 30 10:00 swarmctl*
-rwxr-xr-x  1 root root 27708144 Jun 30 10:00 swarmd*

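swarmd is the SwarmKit daemon, used to run managers and workers.

swarmctl is a command-line tool for talking to the swarm manager.

These four files need to be copied into /usr/bin on each manager and worker node; alternatively, they can be made available through the PATH entry shown above.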
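
Before copying, it can help to confirm that the build really produced all four files. The following is a minimal sketch; missing_binaries is a hypothetical helper name, and the list of file names is taken from the directory listing above.

```shell
# Print the name of every expected build output that is missing
# (or not executable) in the given directory.
missing_binaries() {
    dir=$1
    for name in swarmd swarmctl swarm-bench protoc-gen-gogoswarm; do
        [ -x "$dir/$name" ] || echo "$name"
    done
}

# Usage: copy the files only when nothing is missing.
#   bin=/root/go/src/github.com/docker/swarmkit/bin
#   [ -z "$(missing_binaries "$bin")" ] && cp "$bin"/* /usr/bin/
```

Because the virtual machine is cloned afterwards, copying the files once on the template machine is enough for all three nodes.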

9. It is a good idea to pull a test image on this machine now.

10. Shut down the virtual machine and clone it.


SwarmKit in Practice

I now have three virtual machines ready, all with the same environment.


Manager:192.168.14.244/controller

Worker:192.168.14.80/worker1

Worker:192.168.14.223/worker2

Make sure the manager and all workers can reach each other over SSH without a password


1. On the manager node, run the start command swarmd -d /tmp/controller --listen-control-api /tmp/controller/swarm.sock --hostname controller

root@controller:~# swarmd -d /tmp/controller --listen-control-api /tmp/controller/swarm.sock --hostname controller
Warning: Specifying a valid address with --listen-remote-api may be necessary for other managers to reach this one.
INFO[0000] Listening for connections                     addr=[::]:4242 proto=tcp
INFO[0000] Listening for local connections               addr=/tmp/controller/swarm.sock proto=unix
INFO[0000] 2dacc8ed9d604bff became follower at term 0
INFO[0000] newRaft 2dacc8ed9d604bff [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
INFO[0000] 2dacc8ed9d604bff became follower at term 1
INFO[0000] 2dacc8ed9d604bff is starting a new election at term 1
INFO[0000] 2dacc8ed9d604bff became candidate at term 2
INFO[0000] 2dacc8ed9d604bff received vote from 2dacc8ed9d604bff at term 2
INFO[0000] 2dacc8ed9d604bff became leader at term 2
INFO[0000] raft.node: 2dacc8ed9d604bff elected leader 2dacc8ed9d604bff at term 2
INFO[0000] node is ready

Note: you can use nohup to run it in the background
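
A minimal sketch of the nohup approach from the note. The function name start_manager and the log file location are my own assumptions; the swarmd flags are exactly the ones used in step 1.

```shell
# Start the manager in the background so that it keeps running after the
# terminal is closed. Arguments: state directory, hostname, optional log file.
# start_manager and the default log path are hypothetical choices.
start_manager() {
    state_dir=$1
    node_name=$2
    log_file=${3:-/tmp/swarmd.log}
    mkdir -p "$state_dir"
    nohup swarmd -d "$state_dir" \
        --listen-control-api "$state_dir/swarm.sock" \
        --hostname "$node_name" > "$log_file" 2>&1 &
}

# Usage:
#   start_manager /tmp/controller controller
#   tail -f /tmp/swarmd.log
```

The startup messages shown above then go to the log file instead of the terminal.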


Then run the join command on each worker machine

swarmd -d /tmp/work1 --hostname work1 --join-addr 192.168.13.244:4242

root@worker1:~# swarmd -d /tmp/work1 --hostname work1 --join-addr 192.168.13.244:4242
Warning: Specifying a valid address with --listen-remote-api may be necessary for other managers to reach this one.
INFO[0000] Waiting for TLS certificate to be issued...
INFO[0000] Downloaded new TLS credentials with role: swarm-worker.
INFO[0000] node is ready


swarmd -d /tmp/work2 --hostname work2 --join-addr 192.168.13.244:4242

root@worker2:~# swarmd -d /tmp/work2 --hostname work2 --join-addr 192.168.13.244:4242
Warning: Specifying a valid address with --listen-remote-api may be necessary for other managers to reach this one.
INFO[0000] Waiting for TLS certificate to be issued...
INFO[0000] Downloaded new TLS credentials with role: swarm-worker.
INFO[0000] node is ready

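All three machines have now joined the same SwarmKit cluster. From here on, the Docker cluster can be managed with the swarmctl command, which requires the SWARM_SOCKET environment variable described above.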
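
If swarmctl cannot connect, the first thing worth checking is that variable. Here is a small sketch of such a check; check_swarm_socket is a hypothetical helper, and it only verifies that the variable is set and points at a Unix socket, not that the manager behind it is healthy.

```shell
# Report whether SWARM_SOCKET is set and points at an existing Unix socket.
check_swarm_socket() {
    if [ -z "${SWARM_SOCKET:-}" ]; then
        echo "SWARM_SOCKET is not set"
        return 1
    fi
    if [ ! -S "$SWARM_SOCKET" ]; then
        echo "no socket at $SWARM_SOCKET"
        return 1
    fi
    echo "using $SWARM_SOCKET"
}

# Usage on the manager node:
#   export SWARM_SOCKET=/tmp/controller/swarm.sock
#   check_swarm_socket && swarmctl node ls
```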


2. View the current cluster nodes

root@controller:~# swarmctl node ls
ID                         Name        Membership  Status  Availability  Manager Status
--                         ----        ----------  ------  ------------  --------------
5a9hk2li71cx0fuzgi42wsrvc  work1       ACCEPTED    READY   ACTIVE
a84vwflvs2ao4awtxor4ybaa6  work2       ACCEPTED    READY   ACTIVE
azwpol92a8mrk1mssqw2ry7p7  controller  ACCEPTED    READY   ACTIVE        REACHABLE *

All nodes are currently in a normal state
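
With more nodes, reading the table by eye gets tedious. The sketch below filters the same output for nodes that are not READY. It assumes the column layout shown above (ID, Name, Membership, Status, ...) and that node names contain no spaces; not_ready_nodes is a hypothetical helper name.

```shell
# Print the name of every node whose Status column is not READY.
# Skips the two header lines and any line with fewer than four columns.
not_ready_nodes() {
    swarmctl node ls | awk 'NR > 2 && NF >= 4 && $4 != "READY" { print $2 }'
}

# Usage:
#   if [ -z "$(not_ready_nodes)" ]; then echo "all nodes ready"; fi
```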


3. Create an ubuntu service

root@controller:~# swarmctl service create --name ubuntu --image ubuntu:14.04
1601akon3w1t1m7i9el4pn9gl

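Note: if the image is already present on every node, startup is fast. If it is not, each node that has Internet access pulls the image from Docker Hub by itself, which can take a while. That is why I recommended pulling an image beforehand.


View the current services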

root@controller:~# swarmctl service ls
ID                         Name    Image         Replicas
--                         ----    -----         --------
1601akon3w1t1m7i9el4pn9gl  ubuntu  ubuntu:14.04  0/1

Use inspect to view the service details.

root@controller:~# swarmctl service inspect ubuntu
ID                : 1601akon3w1t1m7i9el4pn9gl
Name              : ubuntu
Replicas          : 0/1
Template
 Container
  Image           : ubuntu:14.04

Task ID                      Service    Slot    Image           Desired State    Last State                  Node
-------                      -------    ----    -----           -------------    ----------                  ----
9qflavdz3okfqtael5558o8ey    ubuntu     1       ubuntu:14.04    RUNNING          PREPARING 34 seconds ago    controller


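The Last State is PREPARING, which means controller has not started the ubuntu container yet: the image is not available locally and has to be pulled from the registry.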
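
Rather than re-running inspect by hand while the image downloads, the Replicas column of swarmctl service ls can be polled. This is a sketch under two assumptions: the four-column layout shown above (ID, Name, Image, Replicas), and that the column reads as running/desired. Both helper names are hypothetical.

```shell
# Print the Replicas column (for example 0/1) for one service.
service_replicas() {
    swarmctl service ls | awk -v name="$1" 'NR > 2 && $2 == name { print $4 }'
}

# Succeed once both halves of the Replicas value are equal (for example 1/1).
service_is_up() {
    replicas=$(service_replicas "$1")
    [ -n "$replicas" ] && [ "${replicas%/*}" = "${replicas#*/}" ]
}

# Usage: check every 5 seconds until the task is running.
#   until service_is_up ubuntu; do sleep 5; done
```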


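4. Update a service

The swarmctl service update command can update an existing service's settings, such as the image version, instance parameters (CPU, memory, ports, network, volumes...), replica count, labels, and environment variables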

root@controller:~# swarmctl service update ubuntu
Error: no changes detected
Usage:
  swarmctl service update <service ID> [flags]

Flags:
      --args value                  container args (default [])
      --bind value                  define a bind mount (default [])
      --command value               override entrypoint (default [])
      --constraint value            Placement constraint (node.labels.key==value) (default [])
      --cpu-limit string            CPU cores limit (e.g. 0.5)
      --cpu-reservation string      number of CPU cores reserved (e.g. 0.5)
      --env value                   container env (default [])
      --image string                container image
      --label value                 service label (key=value) (default [])
      --memory-limit string         memory limit (e.g. 512m)
      --memory-reservation string   amount of reserved memory (e.g. 512m)
      --name string                 service name
      --network string              network name
      --ports value                 ports (default [])
      --replicas uint               number of replicas for the service (only works in replicated service mode) (default 1)
      --restart-condition string    condition to restart the task (any, failure, none) (default "any")
      --restart-delay string        delay between task restarts (default "5s")
      --restart-max-attempts uint   maximum number of restart attempts (0 = unlimited)
      --restart-window string       time window to evaluate restart attempts (0 = unbound) (default "0s")
      --update-delay string         delay between task updates (0s = none) (default "0s")
      --update-parallelism uint     task update parallelism (0 = all at once)
      --volume value                define a volume mount (default [])
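
As an illustration of the flags listed above, here is a small sketch that changes the replica count. scale_service is a hypothetical wrapper of mine; the underlying call uses only the --replicas flag from the help text.

```shell
# Change the replica count of a service.
# Rejects anything that is not a non-negative whole number.
scale_service() {
    name=$1
    count=$2
    case $count in
        ''|*[!0-9]*)
            echo "replica count must be a non-negative integer" >&2
            return 1
            ;;
    esac
    swarmctl service update "$name" --replicas "$count"
}

# Usage:
#   scale_service ubuntu 3
# A rolling image update, one task at a time with a 10 second pause between
# tasks (the target tag is only an example):
#   swarmctl service update ubuntu --image ubuntu:16.04 --update-parallelism 1 --update-delay 10s
```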

5. Node management

root@controller:~# swarmctl service inspect ubuntu
ID                : 1601akon3w1t1m7i9el4pn9gl
Name              : ubuntu
Replicas          : 0/3
Template
 Container
  Image           : ubuntu:14.04

Task ID                      Service    Slot    Image           Desired State    Last State                  Node
-------                      -------    ----    -----           -------------    ----------                  ----
e4xaodj9v2sest4ix141wgu4t    ubuntu     1       ubuntu:14.04    ACCEPTED         ACCEPTED now                work2
0yfsmmpvq51clsywnfjn077va    ubuntu     2       ubuntu:14.04    RUNNING          PREPARING 17 minutes ago    work1
706418wj5je0h7jtirvqmhlfi    ubuntu     3       ubuntu:14.04    ACCEPTED         ACCEPTED now                controller

The swarmctl node drain work2 command marks work2 as unavailable

(to re-activate it, use the swarmctl node activate work2 command)

The state change is visible right away

root@controller:~# swarmctl node drain work2
root@controller:~# swarmctl service inspect ubuntu
ID                : 1601akon3w1t1m7i9el4pn9gl
Name              : ubuntu
Replicas          : 0/3
Template
 Container
  Image           : ubuntu:14.04

Task ID                      Service    Slot    Image           Desired State    Last State                  Node
-------                      -------    ----    -----           -------------    ----------                  ----
2znvxhamkbvj2eqf5q3xxin7d    ubuntu     1       ubuntu:14.04    RUNNING          PREPARING 6 seconds ago     controller
0yfsmmpvq51clsywnfjn077va    ubuntu     2       ubuntu:14.04    RUNNING          PREPARING 19 minutes ago    work1
890be0l7eapjep5v8k70b2pto    ubuntu     3       ubuntu:14.04    ACCEPTED         ACCEPTED 2 seconds ago      controller
As we can see, the tasks have been moved to controller and work1.
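
Drain and activate pair up naturally around maintenance work. The sketch below wraps an arbitrary command between the two calls; with_node_drained is a hypothetical helper name, and the two swarmctl subcommands are the ones used in this step.

```shell
# Drain a node, run a maintenance command, then re-activate the node.
# Returns the exit status of the maintenance command.
with_node_drained() {
    node=$1
    shift
    swarmctl node drain "$node" || return 1
    status=0
    "$@" || status=$?
    swarmctl node activate "$node"
    return "$status"
}

# Usage (run on the manager; the maintenance command here is only an example):
#   with_node_drained work2 ssh root@worker2 'apt-get update'
```

The node is re-activated even when the maintenance command fails, so a failed step does not leave the node drained.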


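As shown below, instances on controller are named ubuntu.3.xxxxx, instances on work1 ubuntu.2.xxxxx, and instances on work2 ubuntu.1.xxxxx. This is because I had scaled the service to three replicas (swarmctl service update ubuntu --replicas 3).


On controller, the instances that used to belong to work2 are now visible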

root@controller:~# docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS                          PORTS               NAMES
a159185201f8        ubuntu:14.04        "/bin/bash"         5 seconds ago        Exited (0) 4 seconds ago                            ubuntu.1.58zd7f87vldcici33i272v20o
0507647074e1        ubuntu:14.04        "/bin/bash"         8 seconds ago        Exited (0) 6 seconds ago                            ubuntu.3.dwbyeoyk9bgu07u2wz3utoput
50a5aea24ed2        ubuntu:14.04        "/bin/bash"         17 seconds ago       Exited (0) 16 seconds ago                           ubuntu.1.9dwu5e3tb9cte2w2fnp7lyd1r
f8244e6e4d55        ubuntu:14.04        "/bin/bash"         23 seconds ago       Exited (0) 22 seconds ago                           ubuntu.3.0dipyu39lyhvqtc68bq5tfld2
7019fbed41f1        ubuntu:14.04        "/bin/bash"         31 seconds ago       Exited (0) 29 seconds ago                           ubuntu.1.2kjo445p3l2fucgo82e06gy1e
a63eac32c424        ubuntu:14.04        "/bin/bash"         33 seconds ago       Exited (0) 31 seconds ago                           ubuntu.3.anr48rmhqyh98v2xmwvafb60x
928882228a57        ubuntu:14.04        "/bin/bash"         46 seconds ago       Exited (0) 45 seconds ago                           ubuntu.3.bymam97t2g6libynguczc6lm8
6ec7ad7a6bd2        ubuntu:14.04        "/bin/bash"         48 seconds ago       Exited (0) 47 seconds ago                           ubuntu.1.ct3fc130hki4keq6ax0uq0at6
98e9728e9000        ubuntu:14.04        "/bin/bash"         56 seconds ago       Exited (0) 55 seconds ago                           ubuntu.3.arjd0dtjefhofd5xnxw48loli
bcf0708aae66        ubuntu:14.04        "/bin/bash"         59 seconds ago       Exited (0) 58 seconds ago                           ubuntu.1.augk3ferw6tvwe8qs6fk3wtt2
57db27be5bbd        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.3.6tkx22dhta5gtmkbpk83osj4k
6fdf69e49e98        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.4qamwbaezfp3iigo0kkwzh5my
3715774ddda2        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.3.5h2ohspdob81pueb74xmrr9q9
22071586b2a0        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.f4lmuk63l20kp165qe6uovo1n
797aa652b3da        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.3.2pyygimy8ilaub70gltsa96d0
22143a2f7795        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.dvrjxce69qa382otiw225gmac
1a26f50f87fa        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.3.byb6g2iuehmvgif7ss0nwnk0d
f43232d8a05a        ubuntu:14.04        "/bin/bash"         2 minutes ago        Exited (0) About a minute ago                       ubuntu.1.8zucjezgh65dv58jckgt0ycjj

After re-activating work2, its instances are running there again

root@worker2:~# docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS                          PORTS               NAMES
9424a0b3dd72        ubuntu:14.04        "/bin/bash"         10 seconds ago       Exited (0) 9 seconds ago                            ubuntu.1.ezptv6gxoalsqdawl9uz2anwk
6dd10817d0cf        ubuntu:14.04        "/bin/bash"         20 seconds ago       Exited (0) 19 seconds ago                           ubuntu.1.8lnrs64khjyqj69cnt03o20cs
a71f20118aba        ubuntu:14.04        "/bin/bash"         32 seconds ago       Exited (0) 31 seconds ago                           ubuntu.1.2i81636zz5vpkzjxs0emswmys
ef939614adcc        ubuntu:14.04        "/bin/bash"         42 seconds ago       Exited (0) 41 seconds ago                           ubuntu.1.0b66k4exrehqc1zatoalmvfki
e06a3e77986f        ubuntu:14.04        "/bin/bash"         52 seconds ago       Exited (0) 52 seconds ago                           ubuntu.1.41gzjjru89aym5yxx56f1li77
c9a5e2ecacfd        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.5hwusgfhb20fkzo1nloragkia
0785dd33cdf1        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.71qca8ze2rpb94p3zv86y4u0e
6fc4a1a657aa        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.0vujqtkoz7bl3rtui2kpexjhm
b2c046e8515c        ubuntu:14.04        "/bin/bash"         About a minute ago   Exited (0) About a minute ago                       ubuntu.1.54elpfbr77jaq420mar5hn3vi

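Closing Thoughts

SwarmKit has only just been open-sourced, so it is bound to have plenty of issues and is not yet ready for production. Still, being integrated directly into Docker Engine is a clear advantage. I ran into quite a few problems while using it (some of which may come down to my own misunderstanding).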

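For example: when I repeatedly list the Docker instances on a machine, why does the container ID keep changing? Once everything had started the container IDs settled down, but they changed again after a migration like the one above.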


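Also, my understanding of cluster management is that when I create a service from an image, the internal scheduling algorithm should place each container instance on some node in the cluster. (The many instances seen above are there because I set the service's replica count to 3 with swarmctl service update ubuntu --replicas 3.)

But why does every cluster node end up with 9 container instances? Is it simply filling up the resources of my virtual machines (2 vCPU + 2 GB each)?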
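
One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: every container in the docker ps -a listings runs /bin/bash and shows Exited (0) within a second or two of being created, and the help text in step 4 lists a default --restart-condition of any with a --restart-delay of 5s. A shell with no terminal attached exits right away, so each task may simply be finishing and getting replaced every few seconds, which would also explain the changing container IDs. A minimal sketch of giving the service a long-running command instead; keep_service_alive is a hypothetical helper, and the way values are passed to --command and --args is an assumption I have not verified.

```shell
# Replace the image's default /bin/bash with a command that stays alive.
# --command and --args appear in the help text in step 4; passing "sleep" and
# a number of seconds this way is an assumption, not a verified invocation.
keep_service_alive() {
    name=$1
    seconds=${2:-3600}
    swarmctl service update "$name" --command sleep --args "$seconds"
}

# Usage:
#   keep_service_alive ubuntu 3600
#   swarmctl service inspect ubuntu
```

If the hypothesis holds, the Last State of each task should settle instead of cycling, and docker ps -a should stop accumulating exited containers.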


This tool needs further study, but I am optimistic about its prospects for container clustering.












