TFX Airflow Tutorial

Tutorial

环境要求

  • Linux/MacOS
  • Python 3.9 及更高版本
  • Virtualenv
  • Git

环境配置

    cd
    virtualenv -p python3 tfx-env
    source ~/tfx-env/bin/activate
    mkdir tfx; cd tfx



    git clone https://github.com/tensorflow/tfx.git
    cd ~/tfx/tfx/tfx/examples/workshop/setup
    ./setup_demo.sh

创建流水线框架

1. 打开一个新的终端窗口,在该窗口中执行下面的命令,启动 airflow 管理服务

    # Open a new terminal window, and in that window ...
    source ~/tfx-env/bin/activate
    cd
    airflow users  create --role Admin --username admin --email admin --firstname admin --lastname admin --password admin    
    airflow webserver -p 8080

2. 打开一个新的终端窗口,在该窗口中执行下面的命令,启动 airflow 调度服务

    # Open another new terminal window, and in that window ...
    source ~/tfx-env/bin/activate
    airflow scheduler

在浏览器中:

打开浏览器,然后转到 http://127.0.0.1:8080

DAG 操作

登陆管理平台,用户名:admin, 密码:admin login page 运行 taxi 流水线 DAG

TensorFlow

dag-button-refresh

3. 流水线执行完成后,打开一个新的终端窗口,在该窗口中执行下面的命令,打开对每个组件进行性能分析的 notebook

    # Open yet another new terminal window, and in that window ...
    # Assuming that you've cloned the TFX repo into ~/tfx
    source ~/tfx-env/bin/activate
    cd ~/tfx/tfx/tfx/examples/workshop/notebooks
    jupyter notebook

login page

NEXTK8s minikube operation 1