SQLFlow: Installing and Using It from Scratch
Installing SQLFlow Playground on a Kubernetes Cluster
Official reference: https://sql-machine-learning.github.io/sqlflow/doc/run/kubernetes/
I. Install Docker. Reference: https://docs.docker.com/engine/install/centos/
0. Uninstall any previous Docker installation
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
1. Install the utility package
yum install -y yum-utils
2. Configure the package repository (the official repository is slow from mainland China, so use the Aliyun mirror)
yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
3. Update the yum package index
yum makecache fast
4. Install the latest Docker Engine and containerd
yum install docker-ce docker-ce-cli containerd.io
5. Start Docker
systemctl start docker
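6. Configure a registry mirror to speed up image pulls by editing /etc/docker/daemon.json (create the file if it does not exist).
Find your mirror address in the Aliyun web console: Menu -- Products and Services -- Container Registry -- Image Accelerator
Note: the xxxxx part is different for every account; register, log in, and copy your own.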
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://xxxxx.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
7. Verify that Docker Engine is installed correctly by running the hello-world image
docker run hello-world
When the container runs, it prints an informational message and exits. The downloaded image can be listed with docker images.
Docker is now installed!
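The tee command in step 6 overwrites any existing /etc/docker/daemon.json. A minimal sketch of a more cautious variant follows; write_daemon_json is a hypothetical helper name, not part of Docker, and it assumes the file only needs the registry-mirrors key.

```shell
# Write a registry-mirror config to daemon.json, keeping a backup of any
# existing file. write_daemon_json is a hypothetical helper name.
# Usage: write_daemon_json <mirror-url> [path]
write_daemon_json() {
    mirror_url=$1
    target=${2:-/etc/docker/daemon.json}

    mkdir -p "$(dirname "$target")"
    # Keep the previous configuration so it can be restored or merged by hand.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi
    cat > "$target" <<EOF
{
  "registry-mirrors": ["$mirror_url"]
}
EOF
}

# Example (as root), followed by the restart from step 6:
#   write_daemon_json "https://xxxxx.mirror.aliyuncs.com"
#   systemctl daemon-reload && systemctl restart docker
```

If the existing file already holds other settings, merge them back in from the .bak copy by hand before restarting Docker.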
Directories used during the SQLFlow installation:
/
  sqlflow
    config  -- all configuration files
    out     -- all log output
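The later steps redirect logs into /sqlflow/out and keep YAML files in /sqlflow/config, so both directories need to exist first. A small sketch, with make_sqlflow_dirs as a hypothetical helper name:

```shell
# Create the config/ and out/ directories under a root that defaults to
# /sqlflow. make_sqlflow_dirs is a hypothetical helper name.
make_sqlflow_dirs() {
    root=${1:-/sqlflow}
    mkdir -p "$root/config" "$root/out"
}

# Example (as root):
#   make_sqlflow_dirs
```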
II. Install Minikube (a mini k8s cluster). Reference: https://minikube.sigs.k8s.io/docs/start/
0. Delete the cluster before reinstalling (if there were too many errors or the installation failed)
minikube delete --all
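1. Install and configure kubectl. Official reference: https://kubernetes.io/docs/tasks/tools/install-kubectl/
2. Install and configure Minikube. Official reference: https://minikube.sigs.k8s.io/docs/start/
Problem: every download method offered on the official site failed from networks in mainland China.
Workaround: find the release on GitHub, download it to a local Windows machine, upload it to the Linux server, add execute permission, and move it to /usr/local/bin/
You can also use the file I prepared:
Link: https://pan.baidu.com/s/1yH3HzsVMgZjflhCHrul1sg Password: u2kp
3. Start Minikube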
[root@ecs-yw-smbs-1-0001 sqlflow]# minikube start --vm-driver=none --kubernetes-version=v1.17.0 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers
* minikube v1.14.0 on Centos 7.6.1810
* Kubernetes 1.19.2 is now available. If you would like to upgrade, specify: --kubernetes-version=v1.19.2
* Using the none driver based on existing profile
* Starting control plane node minikube in cluster minikube
* Restarting existing none bare metal machine for "minikube" ...
* OS release is CentOS Linux 7 (Core)
* Preparing Kubernetes v1.17.0 on Docker 17.12.1-ce ...
* minikube 1.14.1 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.14.1
* To disable this notice, run: 'minikube config set WantUpdateNotification false'
> kubectl.sha256: 65 B / 65 B [--------------------------] 100.00% ? p/s 0s
> kubelet.sha256: 65 B / 65 B [--------------------------] 100.00% ? p/s 0s
> kubeadm.sha256: 65 B / 65 B [--------------------------] 100.00% ? p/s 0s
> kubeadm: 37.52 MiB / 37.52 MiB [----------------] 100.00% 7.95 MiB p/s 5s
> kubectl: 41.48 MiB / 41.48 MiB [---------------] 100.00% 3.04 MiB p/s 14s
> kubelet: 106.39 MiB / 106.39 MiB [-------------] 100.00% 7.22 MiB p/s 15s
* Configuring local host environment ...
*
! The 'none' driver is designed for experts who need to integrate with an existing VM
* Most users should use the newer 'docker' driver instead, which does not require root!
* For more information, see: https://minikube.sigs.k8s.io/docs/reference/drivers/none/
*
! kubectl and minikube configuration will be stored in /root
! To use kubectl or minikube commands as your own user, you may need to relocate them. For example, to overwrite your own settings, run:
*
- sudo mv /root/.kube /root/.minikube $HOME
- sudo chown -R $USER $HOME/.kube $HOME/.minikube
*
* This can also be done automatically by setting the env var CHANGE_MINIKUBE_NONE_USER=true
* Verifying Kubernetes components...
* Enabled addons: storage-provisioner, default-storageclass
* Done! kubectl is now configured to use "minikube" by default
4. Check the K8S system pods
[root@ecs-yw-smbs-1-0001 sqlflow]# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7f9c544f75-gpwqh 1/1 Running 0 26s
kube-system etcd-ecs-yw-smbs-1-0001 1/1 Running 0 26s
kube-system kube-apiserver-ecs-yw-smbs-1-0001 1/1 Running 0 26s
kube-system kube-controller-manager-ecs-yw-smbs-1-0001 1/1 Running 0 26s
kube-system kube-proxy-262w8 1/1 Running 0 26s
kube-system kube-scheduler-ecs-yw-smbs-1-0001 1/1 Running 0 25s
kube-system storage-provisioner 1/1 Running 0 31s
5. Minikube bundles the Kubernetes Dashboard, so you can inspect the cluster state from a web page
nohup minikube dashboard > /sqlflow/out/dashboard.out 2>&1 &
nohup kubectl proxy --address=192.168.229.150 --disable-filter=true > /sqlflow/out/proxy.out 2>&1 &
Open in a browser:
http://192.168.229.150:8001/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/#/overview?namespace=default
Minikube is now installed!
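Step 2 describes placing the uploaded binary in /usr/local/bin/ without showing the commands. A minimal sketch follows; install_binary is a hypothetical helper, and the file name minikube-linux-amd64 in the example is an assumption about what was downloaded.

```shell
# Copy a downloaded executable into a bin directory with mode 0755.
# install_binary is a hypothetical helper name.
# Usage: install_binary <downloaded-file> [name] [bin-dir]
install_binary() {
    src=$1
    name=${2:-$(basename "$src")}
    bin_dir=${3:-/usr/local/bin}
    # install(1) copies the file and sets its permissions in one step.
    install -m 0755 "$src" "$bin_dir/$name"
}

# Example (as root), assuming the upload is named minikube-linux-amd64:
#   install_binary ./minikube-linux-amd64 minikube
#   minikube version
```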
III. Install Argo
1. Install the socat package first; otherwise the connection is refused when you open the page at the end
yum install socat
2. Start the Argo service
Create the Argo namespace
kubectl create namespace argo
Create the role binding
kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default
Start the Argo service
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/v2.7.7/manifests/install.yaml
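This step often fails because of network problems.
In that case, download the file locally with wget first and then apply it:
1. wget https://raw.githubusercontent.com/argoproj/argo/v2.7.7/manifests/install.yaml
Move and rename it to: /sqlflow/config/argo.yaml
2. kubectl apply -n argo -f /sqlflow/config/argo.yaml
* If the download or a later step went wrong, delete the resources and start again
kubectl delete -n argo -f /sqlflow/config/argo.yaml
3. Start Argo, check all pods, and open the Argo UI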
[root@ecs-yw-smbs-1-0001 config]# kubectl get pods -nargo --watch
NAME READY STATUS RESTARTS AGE
argo-server-67566f6964-h84hw 1/1 Running 0 49s
workflow-controller-968885fcc-2wzcs 1/1 Running 0 49s
Wait until every component is Running and shows READY 1/1 before moving on
nohup kubectl -n argo port-forward deployment/argo-server 2746:2746 --address=0.0.0.0 > /sqlflow/out/argo.out 2>&1 &
Check the log for errors: cat /sqlflow/out/argo.out
Open in a browser: http://192.168.229.150:2746
Argo is now installed!
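The "wait until every pod is Running and 1/1" check in step 3 can be scripted instead of watched by eye. The sketch below parses the table that kubectl get pods prints for a single namespace; pods_all_ready and wait_for_pods are hypothetical helper names, and the 5-second polling interval is an arbitrary choice.

```shell
# Succeed only if every pod in the `kubectl get pods` table on stdin is
# Running with READY n/n. Expects single-namespace output (NAME in column 1).
pods_all_ready() {
    awk '
        NR == 1 { next }                 # skip the header row
        {
            seen = 1
            split($2, r, "/")            # READY column, e.g. "1/1"
            if ($3 != "Running" || r[1] != r[2]) bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Poll a namespace every 5 seconds until pods_all_ready succeeds or the
# number of tries runs out.
# Usage: wait_for_pods <namespace> [tries]
wait_for_pods() {
    ns=$1
    tries=${2:-60}
    while [ "$tries" -gt 0 ]; do
        if kubectl get pods -n "$ns" 2>/dev/null | pods_all_ready; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}

# Example:
#   wait_for_pods argo && echo "Argo is ready"
```

kubectl also has a built-in wait subcommand that may serve the same purpose, for example kubectl wait --for=condition=Ready pod --all -n argo --timeout=300s.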
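IV. Switch the data source
1. Download the k8s configuration file
wget https://raw.githubusercontent.com/sql-machine-learning/sqlflow/develop/doc/run/k8s/install-sqlflow-multi-users.yaml
2. Put it in /sqlflow/config and split it into two files, one per component, for easier management; edit each file so that it keeps only its own part
cp install-sqlflow-multi-users.yaml /sqlflow/config/sqlflow-server.yaml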
[root@ecs-yw-smbs-1-0001 config]# cat sqlflow-server.yaml
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: sqlflow-server
spec:
  selector:
    matchLabels:
      app: sqlflow-server
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: sqlflow-server
    spec:
      containers:
      - image: sqlflow/sqlflow:server
        name: server
        imagePullPolicy: IfNotPresent
        command: ["sqlflowserver", "--argo-mode"]
        env:
        - name: SQLFLOW_WORKFLOW_LOGVIEW_ENDPOINT
          value: "http://192.168.229.150:2746"
        - name: SQLFLOW_WORKFLOW_STEP_IMAGE
          value: sqlflow/sqlflow:step
        - name: SQLFLOW_WORKFLOW_TTL
          value: "600"
        ports:
        - containerPort: 50051
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: sqlflow-server
  labels:
    app: sqlflow-server
spec:
  ports:
  - port: 50051
    protocol: TCP
  selector:
    app: sqlflow-server
---
mv install-sqlflow-multi-users.yaml /sqlflow/config/sqlflow-jupyter.yaml
[root@ecs-yw-smbs-1-0001 config]# cat sqlflow-jupyter.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sqlflow-jupyter
  labels:
    app: sqlflow-jupyter
spec:
  selector:
    matchLabels:
      app: sqlflow-jupyter
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: sqlflow-jupyter
    spec:
      containers:
      - image: sqlflow/sqlflow:jupyter
        name: notebook
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8888
          name: notebook
        # Replace USER, PASSWORD, DB_HOST and PORT with the values for your own database.
        command:
        - sh
        - -c
        - 'export SQLFLOW_DATASOURCE=mysql://USER:PASSWORD@tcp\(DB_HOST:PORT\)/?maxAllowedPacket=0 && jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root --NotebookApp.token=""'
        env:
        - name: SQLFLOW_SERVER
          value: "192.168.229.150:50051"
      volumes:
      - name: docker-socket-volume
        hostPath:
          path: /var/run/docker.sock
          type: File
3. Start the two deployments one after the other, checking the pods and forwarding the ports for each
[root@ecs-yw-smbs-1-0001 sqlflow]# kubectl apply -f config/sqlflow-server.yaml
deployment.apps/sqlflow-server created
service/sqlflow-server created
[root@ecs-yw-smbs-1-0001 sqlflow]# kubectl get pods
NAME READY STATUS RESTARTS AGE
sqlflow-server-5489b4b6d9-zbc7f 1/1 Running 0 28s
[root@ecs-yw-smbs-1-0001 sqlflow]# nohup kubectl port-forward deployment/sqlflow-server 50051:50051 --address=0.0.0.0 >/sqlflow/out/server.out 2>&1 &
[root@ecs-yw-smbs-1-0001 sqlflow]# kubectl apply -f config/sqlflow-jupyter.yaml
deployment.apps/sqlflow-jupyter created
[root@ecs-yw-smbs-1-0001 sqlflow]# kubectl get pods
NAME READY STATUS RESTARTS AGE
sqlflow-jupyter-77cb766dfd-rjmph 1/1 Running 0 4s
sqlflow-server-5489b4b6d9-zbc7f 1/1 Running 0 97s
[root@ecs-yw-smbs-1-0001 sqlflow]# nohup kubectl port-forward deployment/sqlflow-jupyter 8000:8888 --address=0.0.0.0 >/sqlflow/out/jupyter.out 2>&1 &
4. Open in a browser
http://192.168.229.150:8000
SQLFlow is now installed!
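The SQLFLOW_DATASOURCE value in sqlflow-jupyter.yaml follows a fixed pattern: mysql://, credentials, tcp(host:port), then options. Assembling it from parts can reduce typing mistakes. This is only a sketch: sqlflow_datasource is a hypothetical helper name, and the default port 3306 is an assumption based on the MySQL address used elsewhere in this article.

```shell
# Print a MySQL data source string in the form used by SQLFLOW_DATASOURCE.
# sqlflow_datasource is a hypothetical helper name.
# Usage: sqlflow_datasource <user> <password> <host> [port]
sqlflow_datasource() {
    printf 'mysql://%s:%s@tcp(%s:%s)/?maxAllowedPacket=0\n' \
        "$1" "$2" "$3" "${4:-3306}"
}

# Example:
#   sqlflow_datasource root root 192.168.229.150
```

In the YAML the parentheses are written as \( and \) because the string is passed unquoted to sh -c, where bare parentheses would be a syntax error. Inside double quotes on a command line the backslashes are not needed.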
V. Install and use the SQLFlow client
Reference: https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/run/cli.md
1. Change to the /usr/local/bin/ directory, download the file, and make it executable
wget http://cdn.sqlflow.tech/latest/linux/sqlflow
chmod +x sqlflow
2. Connect to SQLFlow
sqlflow --sqlflow-server=localhost:50051 --data-source="mysql://root:root@tcp(192.168.229.150:3306)/"
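--sqlflow-server=localhost:50051 specifies the address of the server where SQLFlow is installed and the port forwarded to the local machine
--data-source="mysql://root:root@tcp(192.168.229.150:3306)/" specifies the data source
3. Output after a successful connection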
[root@ecs-yw-smbs-1-0001 local]# ls
bin etc games include lib lib64 libexec sbin share sqlflow src
[root@ecs-yw-smbs-1-0001 local]# pwd
/usr/local
[root@ecs-yw-smbs-1-0001 local]# ./sqlflow --sqlflow-server=localhost:50051 --data-source="mysql://root:root@tcp(192.168.229.150:3306)/"
Welcome to SQLFlow. Commands end with ;
sqlflow> show databases;
SQLFlow Step: [1/1] Execute Code: bash -c step -e "show databases;"
SQLFlow Step: [1/1] Log: http://localhost:9001/workflows/default/sqlflow-ws5dw?tab=workflow&nodeId=sqlflow-ws5dw-2995552722&sidePanel=logs:sqlflow-ws5dw-2995552722:main
SQLFlow Step: [1/1] Status: Pending
SQLFlow Step: [1/1] Status: Running
SQLFlow Step: [1/1] Status: Succeeded
get un-recognized message type: <nil>
get un-recognized message type: <nil>
get un-recognized message type: <nil>
get un-recognized message type: <nil>
get un-recognized message type: <nil>
get un-recognized message type: <nil>
+--------------------+
| DATABASE |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sqlflow |
| sys |
+--------------------+
sqlflow>
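You are now logged in to the custom data source!
Note: to run a job and get prediction results, MySQL must contain a database named sqlflow_models, which SQLFlow uses for prediction results; without it the output cannot be found and an error is reported.
So remember to create this database in your custom data source: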
create database sqlflow_models;
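The same statement can be run from the shell with the mysql command-line client, if it is installed. In the sketch below, ensure_models_db and the MYSQL_BIN override are hypothetical names introduced for illustration. IF NOT EXISTS makes the command safe to repeat.

```shell
# Create the sqlflow_models database if it is missing. The client prompts
# for the password because -p is given without a value.
# ensure_models_db and MYSQL_BIN are hypothetical names; MYSQL_BIN lets a
# stub stand in for the real mysql client when previewing the call.
# Usage: ensure_models_db <host> [port] [user]
ensure_models_db() {
    host=$1
    port=${2:-3306}
    user=${3:-root}
    "${MYSQL_BIN:-mysql}" -h "$host" -P "$port" -u "$user" -p \
        -e 'CREATE DATABASE IF NOT EXISTS sqlflow_models;'
}

# Example:
#   ensure_models_db 192.168.229.150
```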