[
  {
    "path": ".github/CODEOWNERS",
    "content": "* @mlops-for-all/maintainers\n"
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE.md",
    "content": "## Changes?\n<!-- 이 pr 로 인해서 무엇이 변경되었는지 작성해주세요 -->\n\n## Why we need?\n<!-- 이 pr 이 왜 필요한지 작성해주세요 -->\n\n## Test?\n\n- [ ] `npm run start` 를 수행하여 로컬에서 렌더링된 페이지를 확인하셨나요?\n- [ ] `npm test` 를 통과하였나요?\n\n## Anything Else? (Optional)\n<!-- 스크린샷, 환경 정보, 주의사항 등 필요한 추가정보가 있다면 작성해주세요. -->\n"
  },
  {
    "path": ".github/workflows/deploy.yml",
    "content": "name: Deploy to GitHub Pages\n\non:\n  push:\n    branches:\n      - main\n    # Review gh actions docs if you want to further define triggers, paths, etc\n    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#on\n\njobs:\n  deploy:\n    name: Deploy to GitHub Pages\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v2\n      - uses: actions/setup-node@v3\n        with:\n          node-version: 18\n          cache: npm\n\n      - name: Install dependencies\n        run: npm ci\n      - name: Build website\n        run: npm run build\n\n      # Popular action to deploy to GitHub Pages:\n      # Docs: https://github.com/peaceiris/actions-gh-pages#%EF%B8%8F-docusaurus\n      - name: Deploy to GitHub Pages\n        uses: peaceiris/actions-gh-pages@v3\n        with:\n          github_token: ${{ secrets.ORG_PAT }}\n          # Build output to publish to the `gh-pages` branch:\n          publish_dir: ./build\n          # The following lines assign commit authorship to the official\n          # GH-Actions bot for deploys to `gh-pages` branch:\n          # https://github.com/actions/checkout/issues/13#issuecomment-724415212\n          # The GH actions bot is used by default if you didn't specify the two fields.\n          # You can swap them out with your own user credentials.\n          user_name: github-actions[bot]\n          user_email: 41898282+github-actions[bot]@users.noreply.github.com\n"
  },
  {
    "path": ".github/workflows/pull-request.yml",
    "content": "name: \"Pull Request\"\non:\n  pull_request:\n    types: [opened, synchronize, edited, reopened, closed]\n\njobs:\n  label:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: anencore94/labeler@v1.1.0\n"
  },
  {
    "path": ".gitignore",
    "content": "# Dependencies\n/node_modules\n\n# Production\n/build\n\n# Generated files\n.docusaurus\n.cache-loader\n\n# Misc\n.DS_Store\n.env.local\n.env.development.local\n.env.test.local\n.env.production.local\n\nnpm-debug.log*\nyarn-debug.log*\nyarn-error.log*\n\nv1/\n.vscode\n\nopenai.env\n.envrc\n__pycache__\n"
  },
  {
    "path": "README.md",
    "content": "## 모두의 MLOps\n\n모두의 MLOps 프로젝트입니다.\n\n프로젝트에 누구던 자유롭게 기여할 수 있습니다.\n\n자세한 내용은 [How to Contribute](https://mlops-for-all.github.io/community/how-to-contribute/)를 참조하세요.\n"
  },
  {
    "path": "babel.config.js",
    "content": "module.exports = {\n  presets: [require.resolve('@docusaurus/core/lib/babel/preset')],\n};\n"
  },
  {
    "path": "community/community.md",
    "content": "---\ntitle: \"Community\"\nsidebar_position: 1\n---\n\n### *모두의 MLOps* 릴리즈 소식\n\n새로운 포스트나 수정사항은 [Announcements](https://github.com/mlops-for-all/mlops-for-all.github.io/discussions/categories/announcements)에서 확인할 수 있습니다.\n\n### Question\n\n프로젝트 내용과 관련된 궁금점은 [Q&A](https://github.com/mlops-for-all/mlops-for-all.github.io/discussions/categories/q-a)를 통해 질문할 수 있습니다.\n\n### Suggestion\n\n제안점은 [Ideas](https://github.com/mlops-for-all/mlops-for-all.github.io/discussions/categories/ideas)를 통해 제안해 주시면 됩니다.\n\n### Copyright\n\n1. 본 문서를 “비상업적 목적” 사용 시 하기와 같이 출처를 반드시 표시해주세요.\n    - MLOps for ALL, https://mlops-for-all.github.io/\n2. 상업적 용도로 인용/사용/차용하고자 하는 경우 마키나락스(contact@makinarocks.ai)로 사전에 문의주시기 바랍니다.\n"
  },
  {
    "path": "community/contributors.md",
    "content": "---\nsidebar_position: 3\n---\n\n# Contributors\n\n## Main Authors\n\nimport {\n  MainAuthorRow,\n} from '@site/src/components/TeamProfileCards';\n\n<MainAuthorRow />\n\n\n## Contributors\nThank you for contributing our tutorials!\n\nimport {\n  ContributorsRow,\n} from '@site/src/components/TeamProfileCards';\n\n<ContributorsRow />\n"
  },
  {
    "path": "community/how-to-contribute.md",
    "content": "---\ntitle: \"How to Contribute\"\nsidebar_position: 2\n---\n\n## How to Start\n\n### Git Repo 준비\n\n1. [*모두의 MLOps* GitHub Repository](https://github.com/mlops-for-all/mlops-for-all.github.io)에 접속합니다.\n\n2. 여러분의 개인 Repository로 `Fork`합니다.\n\n3. Forked Repository를 여러분의 작업 환경으로 `git clone`합니다.\n\n### 환경 설정\n\n1. 모두의 MLOps는 Hugo 와 Node를 이용하고 있습니다.  \n  다음 명령어를 통해 필요한 패키지가 설치되어 있는지 확인합니다.\n\n- node & npm\n\n    ```bash\n    npm --version\n    ```\n\n- hugo\n\n    ```bash\n    hugo version\n    ```\n\n1. 필요한 node module을 설치합니다.\n\n    ```bash\n    npm install\n    ```\n\n2. 프로젝트에서는 각 글의 일관성을 위해서 여러 markdown lint를 적용하고 있습니다.  \n  다음 명령어를 실행해 test를 진행한 후 커밋합니다.내용 수정 및 추가 후 lint가 맞는지 확인합니다.\n\n    ```bash\n    npm test\n    ```\n\n4. lint 확인 완료 후 ci 를 실행합니다.\n\n    ```bash\n    npm ci\n    ```\n\n4. 로컬에서 실행 후 수정한 글이 정상적으로 나오는지 확인합니다.\n\n    ```bash\n    npm run start\n    ```\n\n## How to Contribute\n\n### 1. 새로운 포스트를 작성할 때\n\n새로운 포스트는 각 챕터와 포스트의 위치에 맞는 weight를 설정합니다.\n\n- Introduction: 1xx\n- Setup: 2xx\n- Kubeflow: 3xx\n- API Deployment: 4xx\n- Help: 10xx\n\n### 2. 기존의 포스트를 수정할 때\n\n기존의 포스트를 수정할 때 Contributor에 본인의 이름을 입력합니다.\n\n```markdown\ncontributors: [\"John Doe\", \"Adam Smith\"]\n```\n\n### 3. 프로젝트에 처음 기여할 때\n\n만약 프로젝트에 처음 기여 할 때 `content/kor/contributors`에 본인의 이름으로 폴더를 생성한 후, `_index.md`라는 파일을 작성합니다.\n\n예를 들어, `minsoo kim`이 본인의 영어 이름이라면, 폴더명은 `minsoo-kim`으로 하여 해당 폴더 내부의 `_index.md`파일에 다음의 내용을 작성합니다.\n폴더명은 하이픈(-)으로 연결한 소문자로, title은 띄어쓰기를 포함한 CamelCase로 작성합니다.\n\n```markdown\n---\ntitle: \"John Doe\"\ndraft: false\n---\n```\n\n## After Pull Request\n\nPull Request를 생성하면 프로젝트에서는 자동으로 *모두의 MLOps* 운영진에게 리뷰 요청이 전해집니다. 최대 일주일 이내로 확인 후 Comment를 드릴 예정입니다.\n"
  },
  {
    "path": "docs/api-deployment/_category_.json",
    "content": "{\n  \"label\": \"API Deployment\",\n  \"position\": 7,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/api-deployment/seldon-children.md",
    "content": "---\ntitle : \"6. Multi Models\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Multi Models\n\n앞서 설명했던 방법들은 모두 단일 모델을 대상으로 했습니다.  \n이번 페이지에서는 여러 개의 모델을 연결하는 방법에 대해서 알아봅니다.\n\n## Pipeline\n\n우선 모델을 2개를 생성하는 파이프라인을 작성하겠습니다.\n\n모델은 앞서 사용한 SVC 모델에 StandardScaler를 추가하고 저장하도록 하겠습니다.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_scaler_from_csv(\n    data_path: InputPath(\"csv\"),\n    scaled_data_path: OutputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n):\n    import dill\n    import pandas as pd\n    from sklearn.preprocessing import StandardScaler\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    data = pd.read_csv(data_path)\n\n    scaler = StandardScaler()\n    scaled_data = scaler.fit_transform(data)\n    scaled_data = pd.DataFrame(scaled_data, columns=data.columns, index=data.index)\n\n    scaled_data.to_csv(scaled_data_path, index=False)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(scaler, file_writer)\n\n    
input_example = data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(data, scaler.transform(data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_svc_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    
create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"multi_model_pipeline\")\ndef multi_model_pipeline(kernel: str = \"rbf\"):\n    iris_data = load_iris_data()\n    scaled_data = train_scaler_from_csv(data=iris_data.outputs[\"data\"])\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"scaler\",\n        model=scaled_data.outputs[\"model\"],\n        input_example=scaled_data.outputs[\"input_example\"],\n        signature=scaled_data.outputs[\"signature\"],\n        
conda_env=scaled_data.outputs[\"conda_env\"],\n    )\n    model = train_svc_from_csv(\n        train_data=scaled_data.outputs[\"scaled_data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"svc\",\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(multi_model_pipeline, \"multi_model_pipeline.yaml\")\n\n```\n\n파이프라인을 업로드하면 다음과 같이 나옵니다.\n\n![children-kubeflow.png](./img/children-kubeflow.png)\n\nMLflow 대시보드를 확인하면 다음과 같이 두 개의 모델이 생성됩니다.\n\n![children-mlflow.png](./img/children-mlflow.png)\n\n각각의 run_id를 확인 후 다음과 같이 SeldonDeployment 스펙을 정의합니다.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: 
model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\n```\n\n모델이 두 개가 되었으므로 각 모델의 initContainer와 container를 정의해주어야 합니다.\n이 필드는 입력값을 array로 받으며 순서는 관계없습니다.\n\n모델이 실행하는 순서는 graph에서 정의됩니다.\n\n```bash\ngraph:\n  name: scaler\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  - name: predict_method\n    type: STRING\n    value: \"transform\"\n  children:\n  - name: svc\n    type: MODEL\n    parameters:\n    - name: model_uri\n      type: STRING\n      value: \"/mnt/models\"\n```\n\ngraph의 동작 방식은 처음 받은 값을 정해진 predict_method로 변환한 뒤 children으로 정의된 모델에 전달하는 방식입니다.\n이 경우 scaler -> svc 로 데이터가 전달됩니다.\n\n이제 위의 스펙을 yaml파일로 생성해 보겠습니다.\n\n```bash\ncat <<EOF > multi-model.yaml\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: 
model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: 
svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\nEOF\n```\n\n다음 명령어를 통해 API를 생성합니다.\n\n```bash\nkubectl apply -f multi-model.yaml\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nseldondeployment.machinelearning.seldon.io/multi-model-example created\n```\n\n정상적으로 생성됐는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow-user-example-com | grep multi-model-example\n```\n\n정상적으로 생성되면 다음과 비슷한 pod이 생성됩니다.\n\n```bash\nmulti-model-example-model-0-scaler-svc-9955fb795-n9ffw   4/4     Running     0          2m30s\n```\n"
  },
  {
    "path": "docs/api-deployment/seldon-fields.md",
    "content": "---\ntitle : \"4. Seldon Fields\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## How Seldon Core works?\n\nSeldon Core가 API 서버를 생성하는 과정을 요약하면 다음과 같습니다.\n\n![seldon-fields-0.png](./img/seldon-fields-0.png)\n\n1. initContainer는 모델 저장소에서 필요한 모델을 다운로드 받습니다.\n2. 다운로드받은 모델을 container로 전달합니다.\n3. container는 전달받은 모델을 감싼 API 서버를 실행합니다.\n4. 생성된 API 서버 주소로 API를 요청하여 모델의 추론 값을 받을 수 있습니다.\n\n## SeldonDeployment Spec\n\nSeldon Core를 사용할 때, 주로 사용하게 되는 커스텀 리소스인 SeldonDeployment를 정의하는 yaml 파일은 다음과 같습니다.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n\n        containers:\n        - name: model\n          image: seldonio/sklearnserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n\n```\n\nSeldonDeployment spe 중 `name` 과 `predictors` 필드는 required 필드입니다.  \n`name`은 쿠버네티스 상에서 pod의 구분을 위한 이름으로 크게 영향을 미치지 않습니다.  \n`predictors`는 한 개로 구성된 array로 `name`, `componentSpecs` 와 `graph` 가 정의되어야 합니다.  \n여기서도 `name`은 pod의 구분을 위한 이름으로 크게 영향을 미치지 않습니다.  
\n\n이제 `componentSpecs` 와 `graph`에서 정의해야 할 필드들에 대해서 알아보겠습니다.\n\n## componentSpecs\n\n`componentSpecs` 는 하나로 구성된 array로 `spec` 키값이 정의되어야 합니다.  \n`spec` 에는 `volumes`, `initContainers`, `containers` 의 필드가 정의되어야 합니다.\n\n### volumes\n\n```bash\nvolumes:\n- name: model-provision-location\n  emptyDir: {}\n```\n\n`volumes`은 initContainer에서 다운로드받는 모델을 저장하기 위한 공간을 의미합니다.  \narray로 입력을 받으며 array의 구성 요소는 `name`과 `emptyDir` 입니다.  \n이 값들은 모델을 다운로드받고 옮길 때 한번 사용되므로 크게 수정하지 않아도 됩니다.\n\n### initContainer\n\n```bash\n- name: model-initializer\n  image: gcr.io/kfserving/storage-initializer:v0.4.0\n  args:\n    - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n    - \"/mnt/models\"\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\n\ninitContainer는 API에서 사용할 모델을 다운로드받는 역할을 합니다.  \n그래서 사용되는 필드들은 모델 저장소(Model Registry)로부터 데이터를 다운로드받을 때 필요한 정보들을 정해줍니다.\n\ninitContainer의 값은 n개의 array로 구성되어 있으며 사용하는 모델마다 각각 지정해주어야 합니다.\n\n#### name\n\n`name`은 쿠버네티스 상의 pod의 이름입니다.  \n디버깅을 위해 `{model_name}-initializer` 로 사용하길 권장합니다.\n\n#### image\n\n`image` 는 모델을 다운로드 받기 위해 사용할 이미지 이름입니다.  \nseldon core에서 권장하는 이미지는 크게 두 가지입니다.\n\n- gcr.io/kfserving/storage-initializer:v0.4.0\n- seldonio/rclone-storage-initializer:1.13.0-dev\n\n각각의 자세한 내용은 다음을 참고 바랍니다.\n\n- [kfserving](https://docs.seldon.io/projects/seldon-core/en/latest/servers/kfserving-storage-initializer.html)\n- [rclone](https://github.com/SeldonIO/seldon-core/tree/master/components/rclone-storage-initializer)\n\n*모두의 MLOps* 에서는 kfserving을 사용합니다.\n\n#### args\n\n```bash\nargs:\n  - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n  - \"/mnt/models\"\n```\n\ngcr.io/kfserving/storage-initializer:v0.4.0 도커 이미지가 실행(`run`)될 때 입력받는 argument를 입력합니다.  \narray로 구성되며 첫 번째 array의 값은 다운로드받을 모델의 주소를 적습니다.  \n두 번째 array의 값은 다운로드받은 모델을 저장할 주소를 적습니다. 
(seldon core에서는 주로 `/mnt/models`에 저장합니다.)\n\n#### volumeMounts\n\n```bash\nvolumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\n\n`volumeMounts`는 volumes에서 설명한 것과 같이 `/mnt/models`를 쿠버네티스 상에서 공유할 수 있도록 볼륨을 붙여주는 필드입니다.  \n자세한 내용은 [쿠버네티스 Volume](https://kubernetes.io/docs/concepts/storage/volumes/)을 참조 바랍니다.\n\n### container\n\n```bash\ncontainers:\n- name: model\n  image: seldonio/sklearnserver:1.8.0-dev\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n    readOnly: true\n  securityContext:\n    privileged: true\n    runAsUser: 0\n    runAsGroup: 0\n```\n\ncontainer는 실제로 모델이 API 형식으로 실행될 때의 설정을 정의하는 필드입니다.  \n\n#### name\n\n`name`은 쿠버네티스 상의 pod의 이름입니다. 사용하는 모델의 이름을 적습니다.\n\n#### image\n\n`image` 는 모델을 API로 만드는 데 사용할 이미지입니다.  \n이미지에는 모델이 로드될 때 필요한 패키지들이 모두 설치되어 있어야 합니다.\n\nSeldon Core에서 지원하는 공식 이미지는 다음과 같습니다.\n\n- seldonio/sklearnserver\n- seldonio/mlflowserver\n- seldonio/xgboostserver\n- seldonio/tfserving\n\n#### volumeMounts\n\n```bash\nvolumeMounts:\n- mountPath: /mnt/models\n  name: model-provision-location\n  readOnly: true\n```\n\ninitContainer에서 다운로드받은 데이터가 있는 경로를 알려주는 필드입니다.  \n이때 모델이 수정되는 것을 방지하기 위해 `readOnly: true`도 같이 주겠습니다.\n\n#### securityContext\n\n```bash\nsecurityContext:\n  privileged: true\n  runAsUser: 0\n  runAsGroup: 0\n```\n\n필요한 패키지를 설치할 때 pod에 권한이 없어서 패키지 설치를 수행하지 못할 수 있습니다.  \n이를 위해서 root 권한을 부여합니다. (다만 이 작업은 실제 서빙 시 보안 문제가 생길 수 있습니다.)\n\n## graph\n\n```bash\ngraph:\n  name: model\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  children: []\n```\n\n모델이 동작하는 순서를 정의한 필드입니다.\n\n### name\n\n모델 그래프의 이름입니다. container에서 정의된 이름을 사용합니다.\n\n### type\n\ntype은 크게 4가지가 있습니다.\n\n1. TRANSFORMER\n2. MODEL\n3. OUTPUT_TRANSFORMER\n4. ROUTER\n\n각 type에 대한 자세한 설명은 [Seldon Core Complex Graphs Metadata Example](https://docs.seldon.io/projects/seldon-core/en/latest/examples/graph-metadata.html)을 참조 바랍니다.\n\n### parameters\n\n모델 클래스의 `__init__` 에서 사용되는 값들입니다.
 \nsklearnserver에서 필요한 값은 [다음 파일](https://github.com/SeldonIO/seldon-core/blob/master/servers/sklearnserver/sklearnserver/SKLearnServer.py)에서 확인할 수 있습니다.\n\n```python\nclass SKLearnServer(SeldonComponent):\n    def __init__(self, model_uri: str = None, method: str = \"predict_proba\"):\n```\n\n코드를 보면 `model_uri`와 `method`를 정의할 수 있습니다.\n\n### children\n\n순서도를 작성할 때 사용됩니다. 자세한 내용은 다음 페이지에서 설명합니다.\n"
  },
  {
    "path": "docs/api-deployment/seldon-iris.md",
    "content": "---\ntitle : \"2. Deploy SeldonDeployment\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\", \"SeungTae Kim\"]\n---\n\n## SeldonDeployment를 통해 배포하기\n\n이번에는 학습된 모델이 있을 때 SeldonDeployment를 통해 API Deployment를 해보겠습니다.\nSeldonDeployment는 쿠버네티스(Kubernetes)에 모델을 REST/gRPC 서버의 형태로 배포하기 위해 정의된 CRD(CustomResourceDefinition)입니다.\n\n### 1. Prerequisites\n\nSeldonDeployment 관련된 실습은 seldon-deploy라는 새로운 네임스페이스(namespace)에서 진행하도록 하겠습니다.\n네임스페이스를 생성한 뒤, seldon-deploy를 현재 네임스페이스로 설정합니다.\n\n```bash\nkubectl create namespace seldon-deploy\nkubectl config set-context --current --namespace=seldon-deploy\n```\n\n### 2. 스펙 정의\n\nSeldonDeployment를 배포하기 위한 yaml 파일을 생성합니다.\n이번 페이지에서는 공개된 iris model을 사용하도록 하겠습니다.\n이 iris model은 sklearn 프레임워크를 통해 학습되었기 때문에 SKLEARN_SERVER를 사용합니다.\n\n```bash\ncat <<EOF > iris-sdep.yaml\napiVersion: machinelearning.seldon.io/v1alpha2\nkind: SeldonDeployment\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\nspec:\n  name: iris\n  predictors:\n  - graph:\n      children: []\n      implementation: SKLEARN_SERVER\n      modelUri: gs://seldon-models/v1.12.0-dev/sklearn/iris\n      name: classifier\n    name: default\n    replicas: 1\nEOF\n```\n\nyaml 파일을 배포합니다.\n\n```bash\nkubectl apply -f iris-sdep.yaml\n```\n\n다음 명령어를 통해 정상적으로 배포가 되었는지 확인합니다.\n\n```bash\nkubectl get pods --selector seldon-app=sklearn-default -n seldon-deploy\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                                            READY   STATUS    RESTARTS   AGE\nsklearn-default-0-classifier-5fdfd7bb77-ls9tr   2/2     Running   0          5m\n```\n\n## Ingress URL\n\n이제 배포된 모델에 추론 요청(predict request)를 보내서 추론 결괏값을 받아옵니다.\n배포된 API는 다음과 같은 규칙으로 생성됩니다.\n`http://{NODE_IP}:{NODE_PORT}/seldon/{namespace}/{seldon-deployment-name}/api/v1.0/{method-name}/`\n\n### NODE_IP / NODE_PORT\n\n[Seldon Core 설치 시, Ambassador를 Ingress Controller로 
설정하였으므로](../setup-components/install-components-seldon.md), SeldonDeployment로 생성된 API 서버는 모두 Ambassador의 Ingress gateway를 통해 요청할 수 있습니다.\n\n따라서 우선 Ambassador Ingress Gateway의 url을 환경 변수로 설정합니다.\n\n```bash\nexport NODE_IP=$(kubectl get nodes -o jsonpath='{ $.items[*].status.addresses[?(@.type==\"InternalIP\")].address }')\nexport NODE_PORT=$(kubectl get service ambassador -n seldon-system -o jsonpath=\"{.spec.ports[0].nodePort}\")\n```\n\n설정된 url을 확인합니다.\n\n```bash\necho \"NODE_IP\"=$NODE_IP\necho \"NODE_PORT\"=$NODE_PORT\n```\n\n다음과 비슷하게 출력되어야 하며, 클라우드 등을 통해 설정할 경우, internal ip 주소가 설정되는 것을 확인할 수 있습니다.\n\n```bash\nNODE_IP=192.168.0.19\nNODE_PORT=30486\n```\n\n### namespace / seldon-deployment-name\n\nSeldonDeployment가 배포된 `namespace`와 `seldon-deployment-name`를 의미합니다.\n이는 스펙을 정의할 때 metadata에 정의된 값을 사용합니다.\n\n```bash\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\n```\n\n위의 예시에서는 `namespace`는 seldon-deploy, `seldon-deployment-name`은 sklearn 입니다.\n\n### method-name\n\nSeldonDeployment에서 주로 사용하는 `method-name`은 두 가지가 있습니다.\n\n1. doc\n2. predictions\n\n각각의 method의 자세한 사용 방법은 아래에서 설명합니다.\n\n## Using Swagger\n\n우선 doc method를 사용하는 방법입니다. doc method를 이용하면 seldon에서 생성한 swagger에 접속할 수 있습니다.\n\n### 1. Swagger 접속\n\n위에서 설명한 ingress url 규칙에 따라 아래 주소를 통해 swagger에 접근할 수 있습니다.  \n`http://192.168.0.19:30486/seldon/seldon-deploy/sklearn/api/v1.0/doc/`\n\n![iris-swagger1.png](./img/iris-swagger1.png)\n\n### 2. Swagger Predictions 메뉴 선택\n\nUI에서 `/seldon/seldon-deploy/sklearn/api/v1.0/predictions` 메뉴를 선택합니다.\n\n![iris-swagger2.png](./img/iris-swagger2.png)\n\n### 3. *Try it out* 선택\n\n![iris-swagger3.png](./img/iris-swagger3.png)\n\n### 4. Request body에 data 입력\n\n![iris-swagger4.png](./img/iris-swagger4.png)\n\n다음 데이터를 입력합니다.\n\n```bash\n{\n  \"data\": {\n    \"ndarray\":[[1.0, 2.0, 5.0, 6.0]]\n  }\n}\n```\n\n### 5. 
추론 결과 확인\n\n`Execute` 버튼을 눌러서 추론 결과를 확인할 수 있습니다.\n\n![iris-swagger5.png](./img/iris-swagger5.png)\n\n정상적으로 수행되면 다음과 같은 추론 결과를 얻습니다.\n\n```bash\n{\n  \"data\": {\n    \"names\": [\n      \"t:0\",\n      \"t:1\",\n      \"t:2\"\n    ],\n    \"ndarray\": [\n      [\n        9.912315378486697e-7,\n        0.0007015931307746079,\n        0.9992974156376876\n      ]\n    ]\n  },\n  \"meta\": {\n    \"requestPath\": {\n      \"classifier\": \"seldonio/sklearnserver:1.11.2\"\n    }\n  }\n}\n```\n\n## Using CLI\n\n또한, curl과 같은 http client CLI 도구를 활용해서도 API 요청을 수행할 수 있습니다.\n\n예를 들어, 다음과 같이 `/predictions`를 요청하면\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\n아래와 같은 응답이 정상적으로 출력되는 것을 확인할 수 있습니다.\n\n```bash\n{\"data\":{\"names\":[\"t:0\",\"t:1\",\"t:2\"],\"ndarray\":[[0.0006985194531162835,0.00366803903943666,0.995633441507447]]},\"meta\":{\"requestPath\":{\"classifier\":\"seldonio/sklearnserver:1.11.2\"}}}\n```\n"
  },
  {
    "path": "docs/api-deployment/seldon-mlflow.md",
    "content": "---\ntitle : \"5. Model from MLflow\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Model from MLflow\n\n이번 페이지에서는 [MLflow Component](../kubeflow/advanced-mlflow.md)에서 저장된 모델을 이용해 API를 생성하는 방법에 대해서 알아보겠습니다.\n\n## Secret\n\ninitContainer가 minio에 접근해서 모델을 다운로드받으려면 credentials가 필요합니다.\nminio에 접근하기 위한 credentials는 다음과 같습니다.\n\n```bash\napiVersion: v1\ntype: Opaque\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8K=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLm1ha2luYXJvY2tzLmFp\n  USE_SSL: ZmFsc2U=\n```\n\n`AWS_ACCESS_KEY_ID` 의 입력값은 `minio`입니다. 다만 secret의 입력값은 인코딩된 값이여야 되기 때문에 실제로 입력되는 값은 다음을 수행후 나오는 값이어야 합니다.\n\ndata에 입력되어야 하는 값들은 다음과 같습니다.\n\n- AWS_ACCESS_KEY_ID: minio\n- AWS_SECRET_ACCESS_KEY: minio123\n- AWS_ENDPOINT_URL: http://minio-service.kubeflow.svc:9000\n- USE_SSL: false\n\n인코딩은 다음 명령어를 통해서 할 수 있습니다.\n\n```bash\necho -n minio | base64\n```\n\n그러면 다음과 같은 값이 출력됩니다.\n\n```bash\nbWluaW8=\n```\n\n인코딩을 전체 값에 대해서 진행하면 다음과 같이 됩니다.\n\n- AWS_ACCESS_KEY_ID: bWluaW8=\n- AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n- AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n- USE_SSL: ZmFsc2U=\n\n다음 명령어를 통해 secret을 생성할 수 있는 yaml파일을 생성합니다.\n\n```bash\ncat <<EOF > seldon-init-container-secret.yaml\napiVersion: v1\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ntype: Opaque\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n  USE_SSL: ZmFsc2U=\nEOF\n```\n\n다음 명령어를 통해 secret을 생성합니다.\n\n```bash\nkubectl apply -f seldon-init-container-secret.yaml\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nsecret/seldon-init-container-secret created\n```\n\n## Seldon Core yaml\n\n이제 Seldon Core를 생성하는 yaml파일을 작성합니다.\n\n```bash\napiVersion: 
machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n```\n\n이 전에 작성한 [Seldon Fields](../api-deployment/seldon-fields.md)와 달라진 점은 크게 두 부분입니다.\ninitContainer에 `envFrom` 필드가 추가되었으며 args의 주소가 `s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc` 로 바뀌었습니다.\n\n### args\n\n앞서 args의 첫번째 array는 우리가 다운로드받을 모델의 경로라고 했습니다.  
\n그럼 mlflow에 저장된 모델의 경로는 어떻게 알 수 있을까요?\n\n다시 mlflow에 들어가서 run을 클릭하고 모델을 누르면 다음과 같이 확인할 수 있습니다.\n\n![seldon-mlflow-0.png](./img/seldon-mlflow-0.png)\n\n이렇게 확인된 경로를 입력하면 됩니다.\n\n### envFrom\n\nminio에 접근해서 모델을 다운로드 받는 데 필요한 환경변수를 입력해주는 과정입니다.\n앞서 만든 `seldon-init-container-secret`을 이용합니다.\n\n## API 생성\n\n우선 위에서 정의한 스펙을 yaml 파일로 생성하겠습니다. (추론 입력의 형태를 지정하는 `xtype` 파라미터가 추가되어 있습니다.)\n\n```bash\ncat <<EOF > seldon-mlflow.yaml\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: xtype\n        type: STRING\n        value: \"dataframe\"\n      children: []\nEOF\n```\n\nSeldonDeployment를 생성합니다.\n\n```bash\nkubectl apply -f seldon-mlflow.yaml\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nseldondeployment.machinelearning.seldon.io/seldon-example created\n```\n\n이제 pod이 정상적으로 뜰 때까지 기다립니다.\n\n```bash\nkubectl get po -n kubeflow-user-example-com 
| grep seldon\n```\n\n다음과 비슷하게 출력되면 정상적으로 API를 생성했습니다.\n\n```bash\nseldon-example-model-0-model-5c949bd894-c5f28      3/3     Running     0          69s\n```\n\n생성된 API는 CLI에서 다음과 같은 request를 보내 동작을 확인할 수 있습니다.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/kubeflow-user-example-com/seldon-example/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{\n    \"data\": {\n        \"ndarray\": [\n            [\n                143.0,\n                0.0,\n                30.0,\n                30.0\n            ]\n        ],\n        \"names\": [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n            \"petal width (cm)\"\n        ]\n    }\n}'\n```\n\n정상적으로 실행될 경우 다음과 같은 결과를 받을 수 있습니다.\n\n```bash\n{\"data\":{\"names\":[],\"ndarray\":[\"Virginica\"]},\"meta\":{\"requestPath\":{\"model\":\"ghcr.io/mlops-for-all/mlflowserver:e141f57\"}}}\n```\n"
  },
  {
    "path": "docs/api-deployment/seldon-pg.md",
    "content": "---\ntitle : \"3. Seldon Monitoring\"\ndescription: \"Prometheus & Grafana 확인하기\"\nsidebar_position: 3\ndate: 2021-12-24\nlastmod: 2021-12-24\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Grafana & Prometheus\n\n이제, [지난 페이지](../api-deployment/seldon-iris.md)에서 생성했던 SeldonDeployment 로 API Request 를 반복적으로 수행해보고, 대시보드에 변화가 일어나는지 확인해봅니다.\n\n### 대시보드\n\n[앞서 생성한 대시보드](../setup-components/install-components-pg.md)를 포트 포워딩합니다.\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\n### API 요청\n\n[앞서 생성한 Seldon Deployment](../api-deployment/seldon-iris.md#using-cli)에 요청을 **반복해서** 보냅니다.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\n그리고 그라파나 대시보드를 확인하면 다음과 같이 Global Request Rate 이 `0 ops` 에서 순간적으로 상승하는 것을 확인할 수 있습니다.\n\n![repeat-raise.png](./img/repeat-raise.png)\n\n이렇게 프로메테우스와 그라파나가 정상적으로 설치된 것을 확인할 수 있습니다.\n"
  },
  {
    "path": "docs/api-deployment/what-is-api-deployment.md",
    "content": "---\ntitle : \"1. What is API Deployment?\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## API Deployment란?\n\n머신러닝 모델을 학습한 뒤에는 어떻게 사용해야 할까요?  \n머신러닝을 학습할 때는 더 높은 성능의 모델이 나오기를 기대하지만, 학습된 모델을 사용하여 추론을 할 때는 빠르고 쉽게 추론 결과를 받아보고 싶을 것입니다.\n\n모델의 추론 결과를 확인하고자 할 때 주피터 노트북이나 파이썬 스크립트를 통해 학습된 모델을 로드한 뒤 추론할 수 있습니다.  \n그렇지만 이런 방법은 모델이 클수록 모델을 불러오는 데 많은 시간을 소요하게 되어서 비효율적입니다. 또한 이렇게 이용하면 많은 사람이 모델을 이용할 수 없고 학습된 모델이 있는 환경에서밖에 사용할 수 없습니다.\n\n그래서 실제 서비스에서 머신러닝이 사용될 때는 API를 이용해서 학습된 모델을 사용합니다. 모델은 API 서버가 구동되는 환경에서 한 번만 로드가 되며, DNS를 활용하여 외부에서도 쉽게 추론 결과를 받을 수 있고 다른 서비스와 연동할 수 있습니다.\n\n하지만 모델을 API로 만드는 작업에는 생각보다 많은 부수적인 작업이 필요합니다.  \n그래서 API로 만드는 작업을 더 쉽게 하기 위해서 Tensorflow와 같은 머신러닝 프레임워크 진영에서는 추론 엔진(Inference engine)을 개발하였습니다.\n\n추론 엔진들을 이용하면 해당 머신러닝 프레임워크로 개발되고 학습된 모델을 불러와 추론이 가능한 API(REST 또는 gRPC)를 생성할 수 있습니다.  \n이러한 추론 엔진을 활용하여 구축한 API 서버로 추론하고자 하는 데이터를 담아 요청을 보내면, 추론 엔진이 추론 결과를 응답에 담아 전송하는 것입니다.\n\n대표적으로 다음과 같은 오픈소스 추론 엔진들이 개발되었습니다.\n\n- [Tensorflow : Tensorflow Serving](https://github.com/tensorflow/serving)\n- [PyTorch : TorchServe](https://github.com/pytorch/serve)\n- [ONNX : ONNX Runtime](https://github.com/microsoft/onnxruntime)\n\n오픈소스에서 공식적으로 지원하지는 않지만, 많이 쓰이는 sklearn, xgboost 프레임워크를 위한 추론 엔진도 개발되어 있습니다.\n\n이처럼 모델의 추론 결과를 API의 형태로 받아볼 수 있도록 배포하는 것을 **API Deployment**라고 합니다.\n\n## Serving Framework\n\n위에서 다양한 추론 엔진들이 개발되었다는 사실을 소개해 드렸습니다.\n쿠버네티스 환경에서 이러한 추론 엔진들을 사용하여 API Deployment를 한다면 어떤 작업이 필요할까요?\n추론 엔진을 배포하기 위한 Deployment, 추론 요청을 보낼 Endpoint를 생성하기 위한 Service,\n외부에서의 추론 요청을 추론 엔진으로 보내기 위한 Ingress 등 많은 쿠버네티스 리소스를 배포해 주어야 합니다.\n이것 이외에도, 많은 추론 요청이 들어왔을 경우의 스케일 아웃(scale-out), 추론 엔진 상태에 대한 모니터링, 개선된 모델이 나왔을 경우 버전 업데이트 등 추론 엔진을 운영할 때의 요구사항은 한두 가지가 아닙니다.\n\n이러한 많은 요구사항을 처리하기 위해 추론 엔진들을 쿠버네티스 환경 위에서 한 번 더 추상화한 **Serving Framework**들이 개발되었습니다.\n\n개발된 Serving Framework로는 다음과 같은 오픈소스들이 있습니다.\n\n- [Seldon Core](https://github.com/SeldonIO/seldon-core)\n- [KServe](https://github.com/kserve)\n- 
[BentoML](https://github.com/bentoml/BentoML)\n\n*모두의 MLOps*에서는 Seldon Core를 사용하여 API Deployment를 하는 과정을 다루어 보도록 하겠습니다.\n"
  },
  {
    "path": "docs/appendix/_category_.json",
    "content": "{\n  \"label\": \"Appendix\",\n  \"position\": 9,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/appendix/metallb.md",
    "content": "---\ntitle: \"2. Bare Metal 클러스터용 load balancer metallb 설치\"\nsidebar_position: 2\n---\n\n## MetalLB란?\n\nKubernetes 사용 시 AWS, GCP, Azure 와 같은 클라우드 플랫폼에서는 자체적으로 로드 벨런서(Load Balancer)를 제공해 주지만, 온프레미스 클러스터에서는 로드 벨런싱 기능을 제공하는 모듈을 추가적으로 설치해야 합니다.  \n[MetalLB](https://metallb.universe.tf/)는 베어메탈 환경에서 사용할 수 있는 로드 벨런서를 제공하는 오픈소스 프로젝트입니다.\n\n## 요구사항\n\n| 요구 사항                                                    | 버전 및 내용                                                 |\n| ------------------------------------------------------------ | ------------------------------------------------------------ |\n| Kubernetes                                                   | 로드 벨런싱 기능이 없는 >= v1.13.0                           |\n| [호환 가능한 네트워크 CNI](https://metallb.universe.tf/installation/network-addons/) | Calico, Canal, Cilium, Flannel, Kube-ovn, Kube-router, Weave Net |\n| IPv4 주소                                                    | MetalLB 배포에 사용                                          |\n| BGP 모드를 사용할 경우                                       | BGP 기능을 지원하는 하나 이상의 라우터                       |\n| 노드 간 포트 TCP/UDP 7946 오픈                               | memberlist 요구 사항                                         |\n\n## MetalLB 설치\n\n### Preparation\n\nkube-proxy를 IPVS 모드에서 사용하는 경우 Kubernetes v1.14.2 이후부터는 엄격한 ARP(strictARP) 모드를 사용하도록 설정해야 합니다.  \nKube-router는 기본적으로 엄격한 ARP를 활성화하므로 서비스 프록시로 사용할 경우에는 이 기능이 필요하지 않습니다.  
\n엄격한 ARP 모드를 적용하기에 앞서, 현재 모드를 확인합니다.\n\n```bash\n# see what changes would be made, returns nonzero returncode if different\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\ngrep strictARP\n```\n\n```bash\nstrictARP: false\n```\n\nstrictARP: false 가 출력되는 경우 다음을 실행하여 strictARP: true로 변경합니다.\n(strictARP: true가 이미 출력된다면 다음 커맨드를 수행하지 않으셔도 됩니다.)\n\n```bash\n# actually apply the changes, returns nonzero returncode on errors only\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\nsed -e \"s/strictARP: false/strictARP: true/\" | \\\nkubectl apply -f - -n kube-system\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nWarning: resource configmaps/kube-proxy is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.\nconfigmap/kube-proxy configured\n```\n\n### 설치 - Manifest\n\n#### 1. MetalLB 를 설치합니다.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/namespace.yaml\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/metallb.yaml\n```\n\n#### 2. 
정상 설치 확인\n\nmetallb-system namespace 의 pod 2개가 모두 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n metallb-system\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                          READY   STATUS    RESTARTS   AGE\ncontroller-7dcc8764f4-8n92q   1/1     Running   1          1m\nspeaker-fnf8l                 1/1     Running   1          1m\n```\n\n매니페스트의 구성 요소는 다음과 같습니다.\n\n- metallb-system/controller\n  - deployment 로 배포되며, 로드 벨런싱을 수행할 external IP 주소의 할당을 처리하는 역할을 담당합니다.\n- metallb-system/speaker\n  - daemonset 형태로 배포되며, 외부 트래픽과 서비스를 연결해 네트워크 통신이 가능하도록 구성하는 역할을 담당합니다.\n\n또한 매니페스트에는 컨트롤러와 스피커 구성 요소가 작동하는 데 필요한 RBAC 권한이 포함되어 있습니다.\n\n## Configuration\n\nMetalLB 의 로드 벨런싱 정책은 관련 설정 정보를 담은 configmap 을 배포하여 설정할 수 있습니다.\n\nMetalLB 에서 구성할 수 있는 모드로는 다음과 같이 2가지가 있습니다.\n\n1. [Layer 2 모드](https://metallb.universe.tf/concepts/layer2/)\n2. [BGP 모드](https://metallb.universe.tf/concepts/bgp/)\n\n여기에서는 Layer 2 모드로 진행하겠습니다.\n\n### Layer 2 Configuration\n\nLayer 2 모드는 간단하게 사용할 IP 주소의 대역만 설정하면 됩니다.  
\nLayer 2 모드를 사용할 경우 워커 노드의 네트워크 인터페이스에 IP를 바인딩하지 않아도 되는데, 로컬 네트워크의 ARP 요청에 직접 응답하여 해당 노드의 MAC 주소를 클라이언트에 제공하는 방식으로 작동하기 때문입니다.\n\n다음 `metallb_config.yaml` 파일은 MetalLB 가 192.168.35.100 ~ 192.168.35.110 대역의 IP에 대한 제어 권한을 가지고 Layer 2 모드로 동작하도록 하는 설정입니다.\n\n클러스터 노드와 클라이언트 노드가 분리된 경우, 192.168.35.100 ~ 192.168.35.110 대역이 클라이언트 노드와 클러스터 노드 모두 접근 가능한 대역이어야 합니다.\n\n#### metallb_config.yaml\n\n```bash\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  namespace: metallb-system\n  name: config\ndata:\n  config: |\n    address-pools:\n    - name: default\n      protocol: layer2\n      addresses:\n      - 192.168.35.100-192.168.35.110  # 할당할 IP 대역\n```\n\n위의 설정을 적용합니다.\n\n```bash\nkubectl apply -f metallb_config.yaml\n```\n\n정상적으로 배포하면 다음과 같이 출력됩니다.\n\n```bash\nconfigmap/config created\n```\n\n## MetalLB 사용\n\n### Kubeflow Dashboard\n\n먼저 kubeflow의 Dashboard 를 제공하는 istio-system 네임스페이스의 istio-ingressgateway 서비스의 타입을 `LoadBalancer`로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 있습니다.\n\n```bash\nNAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                                        AGE\nistio-ingressgateway   ClusterIP   10.103.72.5   <none>        15021/TCP,80/TCP,443/TCP,31400/TCP,15443/TCP   4h21m\n```\n\ntype 을 LoadBalancer 로 변경하고, 원하는 IP 주소를 지정하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  
\n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nspec:\n  clusterIP: 10.103.72.5\n  clusterIPs:\n  - 10.103.72.5\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: status-port\n    port: 15021\n    protocol: TCP\n    targetPort: 15021\n  - name: http2\n    port: 80\n    protocol: TCP\n    targetPort: 8080\n  - name: https\n    port: 443\n    protocol: TCP\n    targetPort: 8443\n  - name: tcp\n    port: 31400\n    protocol: TCP\n    targetPort: 31400\n  - name: tls\n    port: 15443\n    protocol: TCP\n    targetPort: 15443\n  selector:\n    app: istio-ingressgateway\n    istio: ingressgateway\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.100   # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.100` 인 것을 확인합니다.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nNAME                   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)                                                                      AGE\nistio-ingressgateway   LoadBalancer   10.103.72.5   192.168.35.100   15021:31054/TCP,80:30853/TCP,443:30443/TCP,31400:30012/TCP,15443:31650/TCP   5h1m\n```\n\nWeb Browser 를 열어 [http://192.168.35.100](http://192.168.35.100) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-after-istio-ingressgateway-setting.png](./img/login-after-istio-ingressgateway-setting.png)\n\n### minio Dashboard\n\n먼저 minio 의 Dashboard 를 제공하는 kubeflow 네임스페이스의 minio-service 서비스의 타입을 LoadBalancer로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 있습니다.\n\n```bash\nNAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE\nminio-service   ClusterIP   10.109.209.87   <none>        9000/TCP   5h14m\n```\n\ntype 을 LoadBalancer 로 
변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  \n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/minio-service -n kubeflow\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    kubectl.kubernetes.io/last-applied-configuration: |\n      {\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"labels\":{\"application-crd-id\":\"kubeflow-pipelines\"},\"name\":\"minio-ser>\n  creationTimestamp: \"2022-01-05T08:44:23Z\"\n  labels:\n    application-crd-id: kubeflow-pipelines\n  name: minio-service\n  namespace: kubeflow\n  resourceVersion: \"21120\"\n  uid: 0053ee28-4f87-47bb-ad6b-7ad68aa29a48\nspec:\n  clusterIP: 10.109.209.87\n  clusterIPs:\n  - 10.109.209.87\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: http\n    port: 9000\n    protocol: TCP\n    targetPort: 9000\n  selector:\n    app: minio\n    application-crd-id: kubeflow-pipelines\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.101 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.101` 인 것을 확인할 수 있습니다.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\n```bash\nNAME            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE\nminio-service   LoadBalancer   10.109.209.87   192.168.35.101   9000:31371/TCP   5h21m\n```\n\nWeb Browser 를 열어 [http://192.168.35.101:9000](http://192.168.35.101:9000) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-after-minio-setting.png](./img/login-after-minio-setting.png)\n\n### mlflow Dashboard\n\n먼저 mlflow 의 Dashboard 를 제공하는 mlflow-system 네임스페이스의 mlflow-server-service 서비스의 타입을 LoadBalancer로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 있습니다.\n\n```bash\nNAME                    TYPE        CLUSTER-IP       
EXTERNAL-IP   PORT(S)    AGE\nmlflow-server-service   ClusterIP   10.111.173.209   <none>        5000/TCP   4m50s\n```\n\ntype 을 LoadBalancer 로 변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  \n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: mlflow-server\n    meta.helm.sh/release-namespace: mlflow-system\n  creationTimestamp: \"2022-01-07T04:00:19Z\"\n  labels:\n    app.kubernetes.io/managed-by: Helm\n  name: mlflow-server-service\n  namespace: mlflow-system\n  resourceVersion: \"276246\"\n  uid: e5d39fb7-ad98-47e7-b512-f9c673055356\nspec:\n  clusterIP: 10.111.173.209\n  clusterIPs:\n  - 10.111.173.209\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - port: 5000\n    protocol: TCP\n    targetPort: 5000\n  selector:\n    app.kubernetes.io/name: mlflow-server\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.102 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.102` 인 것을 확인할 수 있습니다.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\nNAME                    TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE\nmlflow-server-service   LoadBalancer   10.111.173.209   192.168.35.102   5000:32287/TCP   6m11s\n```\n\nWeb Browser 를 열어 [http://192.168.35.102:5000](http://192.168.35.102:5000) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-after-mlflow-setting.png](./img/login-after-mlflow-setting.png)\n\n### Grafana Dashboard\n\n먼저 Grafana 의 Dashboard 를 제공하는 seldon-system 네임스페이스의 seldon-core-analytics-grafana 서비스의 타입을 LoadBalancer로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 
있습니다.\n\n```bash\nNAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE\nseldon-core-analytics-grafana   ClusterIP   10.109.20.161   <none>        80/TCP    94s\n```\n\ntype 을 LoadBalancer 로 변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  \n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: seldon-core-analytics\n    meta.helm.sh/release-namespace: seldon-system\n  creationTimestamp: \"2022-01-07T04:16:47Z\"\n  labels:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/managed-by: Helm\n    app.kubernetes.io/name: grafana\n    app.kubernetes.io/version: 7.0.3\n    helm.sh/chart: grafana-5.1.4\n  name: seldon-core-analytics-grafana\n  namespace: seldon-system\n  resourceVersion: \"280605\"\n  uid: 75073b78-92ec-472c-b0d5-240038ea8fa5\nspec:\n  clusterIP: 10.109.20.161\n  clusterIPs:\n  - 10.109.20.161\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: service\n    port: 80\n    protocol: TCP\n    targetPort: 3000\n  selector:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/name: grafana\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.103 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.103` 인 것을 확인할 수 있습니다.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\nNAME                            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE\nseldon-core-analytics-grafana   LoadBalancer   10.109.20.161   192.168.35.103   80:31191/TCP   5m14s\n```\n\nWeb Browser 를 열어 [http://192.168.35.103:80](http://192.168.35.103:80) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 
확인합니다.\n\n![login-after-grafana-setting.png](./img/login-after-grafana-setting.png)\n"
  },
  {
    "path": "docs/appendix/pyenv.md",
    "content": "---\ntitle: \"1. Python 가상환경 설치\"\nsidebar_position: 1\n---\n\n## 파이썬 가상환경\n\nPython 환경을 사용하다 보면 여러 버전의 Python 환경을 사용하고 싶은 경우나, 여러 프로젝트별 패키지 버전을 따로 관리하고 싶은 경우가 발생합니다.\n\n이처럼 Python 환경 혹은 Python Package 환경을 가상화하여 관리하는 것을 쉽게 도와주는 도구로는 pyenv, conda, virtualenv, venv 등이 존재합니다.\n\n이 중 *모두의 MLOps*에서는 [pyenv](https://github.com/pyenv/pyenv)와 [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv)를 설치하는 방법을 다룹니다.  \npyenv는 Python 버전을 관리하는 것을 도와주며, pyenv-virtualenv는 pyenv의 plugin으로써 파이썬 패키지 환경을 관리하는 것을 도와줍니다.\n\n## pyenv 설치\n\n### Prerequisites\n\n운영 체제별로 Prerequisites가 다릅니다. [다음 페이지](https://github.com/pyenv/pyenv/wiki#suggested-build-environment)를 참고하여 필수 패키지들을 설치해주시기 바랍니다.\n\n### 설치 - macOS\n\n1. pyenv, pyenv-virtualenv 설치\n\n```bash\nbrew update\nbrew install pyenv\nbrew install pyenv-virtualenv\n```\n\n2. pyenv 설정\n\nmacOS의 경우 카탈리나 버전 이후 기본 shell이 zsh로 변경되었기 때문에 zsh을 사용하는 경우를 가정하였습니다.\n\n```bash\necho 'eval \"$(pyenv init -)\"' >> ~/.zshrc\necho 'eval \"$(pyenv virtualenv-init -)\"' >> ~/.zshrc\nsource ~/.zshrc\n```\n\npyenv 명령이 정상적으로 수행되는지 확인합니다.\n\n```bash\npyenv --help\n```\n\n```bash\n$ pyenv --help\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where 
versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n### 설치 - Ubuntu\n\n1. 
pyenv, pyenv-virtualenv 설치\n\n```bash\ncurl https://pyenv.run | bash\n```\n\n다음과 같은 내용이 출력되면 정상적으로 설치된 것을 의미합니다.\n\n```bash\n  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n  0     0    0     0    0     0      0      0 --:--:-- --:--:--   0     0    0     0    0     0      0      0 --:--:-- --:--:-- 100   270  100   270    0     0    239      0  0:00:01  0:00:01 --:--:--   239\nCloning into '/home/mlops/.pyenv'...\nr\n...\n중략...\n...\nremote: Enumerating objects: 10, done.\nremote: Counting objects: 100% (10/10), done.\nremote: Compressing objects: 100% (6/6), done.\nremote: Total 10 (delta 1), reused 6 (delta 0), pack-reused 0\nUnpacking objects: 100% (10/10), 2.92 KiB | 2.92 MiB/s, done.\n\nWARNING: seems you still have not added 'pyenv' to the load path.\n\n\n# See the README for instructions on how to set up\n# your shell environment for Pyenv.\n\n# Load pyenv-virtualenv automatically by adding\n# the following to ~/.bashrc:\n\neval \"$(pyenv virtualenv-init -)\"\n\n```\n\n2. 
pyenv 설정\n\n기본 shell로 bash shell을 사용하는 경우를 가정하였습니다.\nbash에서 pyenv와 pyenv-virtualenv 를 사용할 수 있도록 설정합니다.\n\n```bash\nsudo vi ~/.bashrc\n```\n\n다음 문자열을 입력한 후 저장합니다.\n\n```bash\nexport PATH=\"$HOME/.pyenv/bin:$PATH\"\neval \"$(pyenv init -)\"\neval \"$(pyenv virtualenv-init -)\"\n```\n\nshell을 restart 합니다.\n\n```bash\nexec $SHELL\n```\n\npyenv 명령이 정상적으로 수행되는지 확인합니다.\n\n```bash\npyenv --help\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 설정된 것을 의미합니다.\n\n```bash\n$ pyenv\npyenv 2.2.2\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   doctor      Verify pyenv installation and development tools to build pythons.\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   
virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n## pyenv 사용\n\n### Python 버전 설치\n\n`pyenv install <Python-Version>` 명령을 통해 원하는 파이썬 버전을 설치할 수 있습니다.\n이번 페이지에서는 예시로 kubeflow에서 기본으로 사용하는 파이썬 3.7.12 버전을 설치하겠습니다.\n\n```bash\npyenv install 3.7.12\n```\n\n정상적으로 설치되면 다음과 같은 메시지가 출력됩니다.\n\n```bash\n$ pyenv install 3.7.12\nDownloading Python-3.7.12.tar.xz...\n-> https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tar.xz\nInstalling Python-3.7.12...\npatching file Doc/library/ctypes.rst\npatching file Lib/test/test_unicode.py\npatching file Modules/_ctypes/_ctypes.c\npatching file Modules/_ctypes/callproc.c\npatching file Modules/_ctypes/ctypes.h\npatching file setup.py\npatching file 'Misc/NEWS.d/next/Core and Builtins/2020-06-30-04-44-29.bpo-41100.PJwA6F.rst'\npatching file Modules/_decimal/libmpdec/mpdecimal.h\nInstalled Python-3.7.12 to /home/mlops/.pyenv/versions/3.7.12\n```\n\n### Python 가상환경 생성\n\n`pyenv virtualenv <Installed-Python-Version> <가상환경-이름>` 명령을 통해 원하는 파이썬 버전의 파이썬 가상환경을 생성할 수 있습니다.\n\n예시로 Python 3.7.12 버전의 `demo`라는 이름의 Python 가상환경을 생성하겠습니다.\n\n```bash\npyenv virtualenv 3.7.12 demo\n```\n\n```bash\n$ pyenv virtualenv 3.7.12 demo\nLooking in links: /tmp/tmpffqys0gv\nRequirement already satisfied: setuptools in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (47.1.0)\nRequirement already satisfied: pip in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (20.1.1)\n```\n\n### Python 가상환경 사용\n\n`pyenv 
activate <가상환경 이름>` 명령을 통해 위와 같은 방식으로 생성한 가상환경을 사용할 수 있습니다.\n\n예시로는 `demo`라는 이름의 Python 가상환경을 사용하겠습니다.\n\n```bash\npyenv activate demo\n```\n\n다음과 같이 현재 가상환경의 정보가 shell의 맨 앞에 출력되는 것을 확인할 수 있습니다.\n\n  Before\n\n  ```bash\n  mlops@ubuntu:~$ pyenv activate demo\n  ```\n\n  After\n\n  ```bash\n  pyenv-virtualenv: prompt changing will be removed from future release. configure `export PYENV_VIRTUALENV_DISABLE_PROMPT=1' to simulate the behavior.\n  (demo) mlops@ubuntu:~$ \n  ```\n\n### Python 가상환경 비활성화\n\n`source deactivate` 명령을 통해 현재 사용 중인 가상환경을 비활성화할 수 있습니다.\n\n```bash\nsource deactivate\n```\n\n  Before\n\n  ```bash\n  (demo) mlops@ubuntu:~$ source deactivate\n  ```\n\n  After\n\n  ```bash\n  mlops@ubuntu:~$ \n  ```\n"
  },
  {
    "path": "docs/further-readings/_category_.json",
    "content": "{\n  \"label\": \"Further Readings\",\n  \"position\": 8,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/further-readings/info.md",
    "content": "---\ntitle: \"다루지 못한 것들\"\ndate: 2021-12-21\nlastmod: 2021-12-21\n---\n\n## MLOps Component\n\n[MLOps Concepts](../introduction/component.md)에서 다루었던 컴포넌트를 도식화하면 다음과 같습니다.\n\n![open-stacks-0.png](./img/open-stacks-0.png)\n\n이 중 *모두의 MLOps* 에서 다룬 기술 스택들은 다음과 같습니다.\n\n![open-stacks-1.png](./img/open-stacks-1.png)\n\n보시는 것처럼 아직 우리가 다루지 못한 많은 MLOps 컴포넌트들이 있습니다.\n\n시간 관계상 이번에 모두 다루지는 못했지만, 만약 필요하다면 다음과 같은 오픈소스들을 먼저 참고해보면 좋을 것 같습니다.\n\n![open-stacks-2.png](./img/open-stacks-2.png)\n\n세부 내용은 다음과 같습니다.\n\n| Mgmt.                      | Component                   | Open Source                           |\n| -------------------------- | --------------------------- | ------------------------------------- |\n| Data Mgmt.                 | Collection                  | [Kafka](https://kafka.apache.org/)                                 |\n|                            | Validation                  | [Beam](https://beam.apache.org/)                                  |\n|                            | Feature Store               | [Flink](https://flink.apache.org/)                                 |\n| ML Model Dev. & Experiment | Modeling                    | [Jupyter](https://jupyter.org/)                               |\n|                            | Analysis & Experiment Mgmt. | [MLflow](https://mlflow.org/)                                |\n|                            | HPO Tuning & AutoML         | [Katib](https://github.com/kubeflow/katib)                                 |\n| Deploy Mgmt.               | Serving Framework           | [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/index.html)                           |\n|                            | A/B Test                    | [Iter8](https://iter8.tools/)                                 |\n|                            | Monitoring                  | [Grafana](https://grafana.com/oss/grafana/), [Prometheus](https://prometheus.io/)                   |\n| Process Mgmt.              
| pipeline                    | [Kubeflow](https://www.kubeflow.org/)                              |\n|                            | CI/CD                       | [Github Action](https://docs.github.com/en/actions)                         |\n|                            | Continuous Training         | [Argo Events](https://argoproj.github.io/events/)                           |\n| Platform Mgmt.             | Configuration Mgmt.         | [Consul](https://www.consul.io/)                                |\n|                            | Code Version Mgmt.          | [Github](https://github.com/), [Minio](https://min.io/)                         |\n|                            | Logging                     | (EFK) [Elastic Search](https://www.elastic.co/kr/elasticsearch/), [Fluentd](https://www.fluentd.org/), [Kibana](https://www.elastic.co/kr/kibana/) |\n|                            | Resource Mgmt.              | [Kubernetes](https://kubernetes.io/)                            |\n"
  },
  {
    "path": "docs/introduction/_category_.json",
    "content": "{\n  \"label\": \"Introduction\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/introduction/component.md",
    "content": "---\ntitle : \"3. Components of MLOps\"\ndescription: \"Describe MLOps Components\"\nsidebar_position: 3\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## Practitioners guide to MLOps\n\n 2021년 5월에 발표된 구글의 [white paper : Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning](https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf)에서는 MLOps의 핵심 기능들로 다음과 같은 것들을 언급하였습니다.\n\n\n![mlops-component](./img/mlops-component.png)\n\n\n 각 기능이 어떤 역할을 하는지 살펴보겠습니다.\n\n### 1. Experimentation\n\n 실험(Experimentation)은 머신러닝 엔지니어들이 데이터를 분석하고, 프로토타입 모델을 만들며 학습 기능을 구현할 수 있도록 하는 다음과 같은 기능을 제공합니다.\n\n- 깃(Git)과 같은 버전 컨트롤 도구와 통합된 노트북(Jupyter Notebook) 환경 제공\n- 사용한 데이터, 하이퍼 파라미터, 평가 지표를 포함한 실험 추적 기능 제공\n- 데이터와 모델에 대한 분석 및 시각화 기능 제공\n\n### 2. Data Processing\n\n 데이터 처리(Data Processing)는 머신러닝 모델 개발 단계, 지속적인 학습(Continuous Training) 단계, 그리고 API 배포(API Deployment) 단계에서 많은 양의 데이터를 사용할 수 있게 해 주는 다음과 같은 기능을 제공합니다.\n\n- 다양한 데이터 소스와 서비스에 호환되는 데이터 커넥터(connector) 기능 제공\n- 다양한 형태의 데이터와 호환되는 데이터 인코더(encoder) & 디코더(decoder) 기능 제공\n- 다양한 형태의 데이터에 대한 데이터 변환과 피처 엔지니어링(feature engineering) 기능 제공\n- 학습과 서빙을 위한 확장 가능한 배치, 스트림 데이터 처리 기능 제공\n\n### 3. Model training\n\n 모델 학습(Model training)은 모델 학습을 위한 알고리즘을 효율적으로 실행시켜주는 다음과 같은 기능을 제공합니다.\n\n- ML 프레임워크의 실행을 위한 환경 제공\n- 다수의 GPU / 분산 학습 사용을 위한 분산 학습 환경 제공\n- 하이퍼 파라미터 튜닝과 최적화 기능 제공\n\n### 4. Model evaluation\n\n 모델 평가(Model evaluation)는 실험 환경과 상용 환경에서 동작하는 모델의 성능을 관찰할 수 있는 다음과 같은 기능을 제공합니다.\n\n- 평가 데이터에 대한 모델 성능 평가 기능\n- 서로 다른 지속 학습 실행 결과에 대한 예측 성능 추적\n- 서로 다른 모델의 성능 비교와 시각화\n- 해석할 수 있는 AI 기술을 이용한 모델 출력 해석 기능 제공\n\n### 5. Model serving\n\n 모델 서빙(Model serving)은 상용 환경에 모델을 배포하고 서빙하기 위한 다음과 같은 기능들을 제공합니다.\n\n- 저 지연 추론과 고가용성 추론 기능 제공\n- 다양한 ML 모델 서빙 프레임워크 지원(Tensorflow Serving, TorchServe, NVIDIA Triton, Scikit-learn, XGBoost. 
etc)\n- 복잡한 형태의 추론 루틴 기능 제공, 예를 들어 전처리(preprocess) 또는 후처리(postprocess) 기능과 최종 결과를 위해 다수의 모델이 사용되는 경우를 말합니다.\n- 순간적으로 치솟는 추론 요청을 처리하기 위한 오토 스케일링(autoscaling) 기능 제공\n- 추론 요청과 추론 결과에 대한 로깅 기능 제공\n\n### 6. Online experimentation\n\n 온라인 실험(Online experimentation)은 새로운 모델이 생성되었을 때, 이 모델을 배포하면 어느 정도의 성능을 보일 것인지 검증하는 기능을 제공합니다. 이 기능은 새 모델을 배포하는 것까지 연동하기 위해 모델 저장소(Model Registry)와 연동되어야 합니다.\n\n- 카나리(canary) & 섀도(shadow) 배포 기능 제공\n- A/B 테스트 기능 제공\n- 멀티 암드 밴딧(Multi-armed bandit) 테스트 기능 제공\n\n### 7. Model Monitoring\n\n모델 모니터링(Model Monitoring)은 상용 환경에 배포된 모델이 정상적으로 동작하고 있는지를 모니터링하는 기능을 제공합니다. 예를 들어 모델의 성능이 떨어져 업데이트가 필요한지에 대한 정보 등을 제공합니다.\n\n### 8. ML Pipeline\n\n머신러닝 파이프라인(ML Pipeline)은 상용 환경에서 복잡한 ML 학습과 추론 작업을 구성하고 제어하고 자동화하기 위한 다음과 같은 기능을 제공합니다.\n\n- 다양한 이벤트를 소스를 통한 파이프라인 실행 기능\n- 파이프라인 파라미터와 생성되는 산출물 관리를 위한 머신러닝 메타데이터 추적과 연동 기능\n- 일반적인 머신러닝 작업을 위한 내장 컴포넌트 지원과 사용자가 직접 구현한 컴포넌트에 대한 지원 기능\n- 서로 다른 실행 환경 제공 기능\n\n### 9. Model Registry\n\n 모델 저장소(Model Registry)는 머신러닝 모델의 생명 주기(Lifecycle)을 중앙 저장소에서 관리할 수 있게 해 주는 기능을 제공합니다.\n\n- 학습된 모델 그리고 배포된 모델에 대한 등록, 추적, 버저닝 기능 제공\n- 배포를 위해 필요한 데이터와 런타임 패키지들에 대한 정보 저장 기능\n\n### 10. Dataset and Feature Repository\n\n- 데이터에 대한 공유, 검색, 재사용 그리고 버전 관리 기능\n- 이벤트 스트리밍 및 온라인 추론 작업에 대한 실시간 처리 및 저 지연 서빙 기능\n- 사진, 텍스트, 테이블 형태의 데이터와 같은 다양한 형태의 데이터 지원 기능\n\n### 11. ML Metadata and Artifact Tracking\n\n MLOps의 각 단계에서는 다양한 형태의 산출물들이 생성됩니다. ML 메타데이터는 이런 산출물들에 대한 정보를 의미합니다.\n ML 메타데이터와 산출물 관리는 산출물의 위치, 타입, 속성, 그리고 관련된 실험(experiment)에 대한 정보를 관리하기 위해 다음과 같은 기능들을 제공합니다.\n\n- ML 산출물에 대한 히스토리 관리 기능\n- 실험과 파이프라인 파라미터 설정에 대한 추적, 공유 기능\n- ML 산출물에 대한 저장, 접근, 시각화, 다운로드 기능 제공\n- 다른 MLOps 기능과의 통합 기능 제공\n"
  },
  {
    "path": "docs/introduction/intro.md",
    "content": "---\ntitle : \"1. What is MLOps?\"\ndescription: \"Introduction to MLOps\"\nsidebar_position: 1\ndate: 2021-1./img to MLOps\"\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Machine Learning Project\n\n2012년 Alexnet 이후 CV, NLP를 비롯하여 데이터가 존재하는 도메인이라면 어디서든 머신러닝과 딥러닝을 도입하고자 하였습니다.  \n딥러닝과 머신러닝은 AI라는 단어로 묶이며 불렸고 많은 매체에서 AI의 필요성을 외쳤습니다. 그리고 무수히 많은 기업에서 머신러닝과 딥러닝을 이용한 수많은 프로젝트를 진행하였습니다. 하지만 그 결과는 어떻게 되었을까요?  \n엘리먼트 AI의 음병찬 동북아 지역 총괄책임자는 [*\"10개 기업에 AI 프로젝트를 시작한다면 그중 9개는 컨셉검증(POC)만 하다 끝난다\"*](https://zdnet.co.kr/view/?no=20200611062002)고 말했습니다.\n\n이처럼 많은 프로젝트에서 머신러닝과 딥러닝은 이 문제를 풀 수 있을 것 같다는 가능성만을 보여주고 사라졌습니다. 그리고 이 시기쯤에 [AI에 다시 겨울](https://www.aifutures.org/2021/ai-winter-is-coming/)이 다가오고 있다는 전망도 나오기 시작했습니다.\n\n왜 프로젝트 대부분이 컨셉검증(POC) 단계에서 끝났을까요?  \n머신러닝과 딥러닝 코드만으로는 실제 서비스를 운영할 수 없기 때문입니다.\n\n실제 서비스 단계에서 머신러닝과 딥러닝의 코드가 차지하는 부분은 생각보다 크지 않기 때문에, 단순히 모델의 성능만이 아닌 다른 많은 부분을 고려해야 합니다.  \n구글은 이런 문제를 2015년 [Hidden Technical Debt in Machine Learning Systems](https://proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf)에서 지적한 바 있습니다.  \n하지만 이 논문이 나올 당시에는 아직 많은 머신러닝 엔지니어들이 딥러닝과 머신러닝의 가능성을 입증하기 바쁜 시기였기 때문에, 논문이 지적하는 바에 많은 주의를 기울이지는 않았습니다.\n\n그리고 몇 년이 지난 후 머신러닝과 딥러닝은 가능성을 입증해내어, 이제 사람들은 실제 서비스에 적용하고자 했습니다.  \n하지만 곧 많은 사람이 실제 서비스는 쉽지 않다는 것을 깨달았습니다.\n\n## Devops\n\nMLOps는 이전에 없던 새로운 개념이 아니라 DevOps라고 불리는 개발 방법론에서 파생된 단어입니다. 그렇기에 DevOps를 이해한다면 MLOps를 이해하는 데 도움이 됩니다.\n\n### DevOps\n\nDevOps는 Development(개발)와 Operations(운영)의 합성어로 소프트웨어의 개발(Development)과 운영(Operations)의 합성어로서 소프트웨어 개발자와 정보기술 전문가 간의 소통, 협업 및 통합을 강조하는 개발 환경이나 문화를 말합니다.\nDevOps의 목적은 소프트웨어 개발 조직과 운영 조직간의 상호 의존적 대응이며 조직이 소프트웨어 제품과 서비스를 빠른 시간에 개발 및 배포하는 것을 목적으로 합니다.\n\n### Silo Effect\n\n그럼 간단한 상황 설명을 통해 DevOps가 왜 필요한지 알아보도록 하겠습니다.\n\n서비스 초기에는 지원하는 기능이 많지 않으며 팀 또는 회사의 규모가 작습니다. 이때에는 개발팀과 운영팀의 구분이 없거나 작은 규모의 팀으로 구분되어 있습니다. 핵심은 규모가 작다는 것에 있습니다. 
이때는 서로 소통할 수 있는 접점이 많고, 집중해야 하는 서비스가 적기 때문에 빠르게 서비스를 개선해 나갈 수 있습니다.\n\n하지만 서비스의 규모가 커질수록 개발팀과 운영팀은 분리되고 서로 소통할 수 있는 채널에 물리적인 한계가 오게 됩니다. 예를 들어 다른 팀과 함께하는 미팅에 팀원 전체가 참석하는 것이 아니라 각 팀의 팀장 혹은 소수의 시니어만 참석하여 미팅을 진행하게 됩니다. 이런 소통 채널의 한계는 필연적으로 소통의 부재로 이어지게 됩니다. 그러다 보면 개발팀은 새로운 기능들을 계속해서 개발하는데, 운영팀 입장에서는 개발팀에서 개발한 기능이 배포 시 장애를 일으키는 등 여러 문제가 생기게 됩니다.\n\n위와 같은 상황이 반복되면 조직 이기주의라고 불리는 사일로 현상이 생길 수 있습니다.\n\n![silo](./img/silo.png)\n\n> 사일로(silo)는 곡식이나 사료를 저장하는 굴뚝 모양의 창고를 의미한다. 사일로는 독립적으로 존재하며 저장되는 물품이 서로 섞이지 않도록 철저히 관리할 수 있도록 도와준다.  \n> 사일로 효과(Organizational Silos Effect)는 조직 부서 간에 서로 협력하지 않고 내부 이익만을 추구하는 현상을 의미한다. 조직 내에서 개별 부서끼리 서로 담을 쌓고 각자의 이익에만 몰두하는 부서 이기주의를 일컫는다.\n\n사일로 현상은 서비스 품질의 저하로 이어지게 됩니다. 이러한 사일로 현상을 해결하기 위해 나온 것이 바로 DevOps입니다.\n\n### CI/CD\n\nContinuous Integration(CI)과 Continuous Delivery(CD)는 개발팀과 운영팀 사이의 장벽을 허물기 위한 구체적인 방법입니다.\n\n![cicd](./img/cicd.png)\n\n이 방법을 통해서 개발팀에서는 운영팀의 환경을 이해하고 개발팀에서 개발 중인 기능이 정상적으로 배포까지 이어질 수 있는지 확인합니다. 운영팀은 검증된 기능 또는 개선된 제품을 더 자주 배포해 고객의 제품 경험을 향상시킵니다.  \n앞에서 설명한 내용을 종합하자면 DevOps는 개발팀과 운영팀 간의 문제를 해결하기 위한 방법론입니다.\n\n## MLOps\n\n### 1) ML+Ops\n\nMLOps는 Machine Learning과 Operations의 합성어로, DevOps에서 Dev가 ML로 바뀐 것입니다. 이제 앞에서 살펴본 DevOps를 통해 MLOps가 무엇인지 짐작해 볼 수 있습니다.\n“MLOps는 머신러닝팀과 운영팀의 문제를 해결하기 위한 방법입니다.”\n이 말은 머신러닝팀과 운영팀 사이에 문제가 발생했다는 의미입니다. 그럼 왜 머신러닝팀과 운영팀에는 문제가 발생했을까요? 두 팀 간의 문제를 알아보기 위해서 추천시스템을 예시로 알아보겠습니다.\n\n#### Rule Based\n\n처음 추천시스템을 만드는 경우 간단한 규칙을 기반으로 아이템을 추천합니다. 예를 들어 1주일간 판매량이 가장 많은 순서대로 보여주는 방식을 이용합니다. 이 방식으로 모델을 정한다면 특별한 이유가 없는 이상 모델의 수정이 필요 없습니다.\n\n#### Machine Learning\n\n서비스의 규모가 조금 커지고 로그 데이터가 많이 쌓인다면 이를 이용해 아이템 기반 혹은 유저 기반의 머신러닝 모델을 생성합니다. 이때 모델은 정해진 주기에 따라 재학습 후 재배포합니다.\n\n#### Deep Learning\n\n개인화 추천에 대한 요구가 더 커지고 더 좋은 성능을 내는 모델이 필요해질 경우 딥러닝을 이용한 모델을 개발하기 시작합니다. 이때 만드는 모델도 머신러닝과 같이 정해진 주기에 따라 재학습 후 재배포합니다.\n\n![graph](./img/graph.png)\n\n위에서 설명한 것을 x축을 모델의 복잡도, y축을 모델의 성능으로 두고 그래프로 표현한다면, 다음과 같이 복잡도가 올라갈 때 모델의 성능이 올라가는 상승 관계를 갖습니다. 
머신러닝에서 딥러닝으로 넘어갈 때쯤 머신러닝 팀이 새로 생기게 됩니다.\n\n만약 관리해야 할 모델이 적다면 서로 협업을 통해서 충분히 해결할 수 있지만, 개발해야 할 모델이 많아진다면 DevOps의 경우와 같이 사일로 현상이 나타나게 됩니다.\n\nDevOps의 목표에 비추어 생각해 보면, 개발팀에서 개발한 기능이 정상적으로 배포될 수 있는지 확인하는 것이 DevOps의 목표였듯이, MLOps의 목표는 머신러닝 팀에서 개발한 모델이 정상적으로 배포될 수 있는지 확인하는 것입니다.\n\n### 2) ML -> Ops\n\n하지만 최근 나오고 있는 MLOps 관련 제품과 설명을 보면 꼭 앞에서 설명한 목표만을 대상으로 하고 있지 않습니다.\n어떤 경우에는 머신러닝 팀이 만든 모델을 이용해 직접 운영까지 할 수 있도록 도와주려고 합니다. 이러한 니즈는 최근 머신러닝 프로젝트가 진행되는 과정에서 알 수 있습니다.\n\n추천시스템의 경우 간단한 모델부터 시작해 운영할 수 있었습니다. 하지만 자연어, 이미지와 같은 곳에서는 규칙 기반의 모델보다는 딥러닝을 이용해 주어진 태스크를 해결할 수 있는지 검증(POC)을 선행하는 경우가 많습니다. 검증이 끝난 프로젝트는 이제 서비스를 위한 운영 환경을 개발하기 시작합니다. 하지만 머신러닝 팀 내의 자체 역량으로는 이 문제를 해결하기 쉽지 않습니다. 이를 해결하기 위해서 MLOps가 필요한 경우도 있습니다.\n\n### 3) 결론\n\n요약하자면 MLOps에는 두 가지 목표가 있습니다.\n앞에서 설명한 MLOps는 ML+Ops로, 두 팀의 생산성 향상을 위한 것이었습니다.\n반면, 뒤에서 설명한 것은 ML->Ops로, 머신러닝 팀에서 직접 운영을 할 수 있도록 도와주는 것을 말합니다.\n"
  },
  {
    "path": "docs/introduction/levels.md",
    "content": "---\ntitle : \"2. Levels of MLOps\"\ndescription: \"Levels of MLOps\"\nsidebar_position: 2\ndate: 2021-12-03\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\", \"Chanmin Cho\"]\n\n---\n\n이번 페이지에서는 구글에서 발표한 MLOps의 단계를 보며 MLOps의 핵심 기능은 무엇인지 알아 보겠습니다.\n\n## Hidden Technical Debt in ML System\n\n구글은 무려 2015년부터 MLOps의 필요성을 말했습니다. Hidden Technical Debt in Machine Learning Systems 은 그런 구글의 생각을 담은 논문입니다.\n\n![paper](./img/paper.png)\n\n이 논문의 핵심은 바로 머신러닝을 이용한 제품을 만드는데 있어서 머신러닝 코드는 전체 시스템을 구성하는데 있어서 아주 일부일 뿐이라는 것입니다.\n\n![paper-2](./img/paper-2.png)\n\n구글은 이 논문을 더 발전시켜서 MLOps라는 용어를 만들어 확장시켰습니다. 더 자세한 내용은 [구글 클라우드 홈페이지](https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning)에서 더 자세한 내용을 확인할 수 있습니다. 이번 포스트에서는 구글에서 말하는 MLOps란 어떤 것인지에 대해서 설명해보고자 합니다.\n\n구글에서는 MLOps의 발전 단계를 총 3(0~2)단계로 나누었습니다. 각 단계들에 대해 설명하기 앞서 이전 포스트에서 설명했던 개념 중 필요한 부분을 다시 한번 보겠습니다.\n\n머신러닝 모델을 운영하기 위해서는 모델을 개발하는 머신러닝 팀과 배포 및 운영을 담당하는 운영팀이 있습니다. 이 두 팀의 원할한 협업을 위해서 MLOps가 필요하게 되었습니다. 이전에는 간단히 Continuous Integration(CI)/Continuous Deployment(CD)를 통해서 할 수 있다고 하였는데, 어떻게 CI/CD를 하는지에 대해서 알아 보겠습니다.\n\n## 0단계: 수동 프로세스\n\n![level-0](./img/level-0.png)\n\n0단계에서 두 팀은 “모델”을 통해 소통합니다. 머신 러닝팀은 쌓여있는 데이터로 모델을 학습시키고 학습된 모델을 운영팀에게 전달 합니다. 운영팀은 이렇게 전달받은 모델을 배포합니다.\n\n![toon](./img/toon.png)\n\n초기의 머신 러닝 모델들은 이 “모델” 중심의 소통을 통해 배포합니다. 그런데 이런 배포 방식은 여러 문제가 있습니다.  \n예를 들어서 어떤 기능에서는 파이썬 3.7을 쓰고 어떤 기능에서는 파이썬 3.8을 쓴다면 다음과 같은 상황을 자주 목격할 수 있습니다.\n\n이러한 상황이 일어나는 이유는 머신러닝 모델의 특성에 있습니다. 학습된 머신러닝 모델이 동작하기 위해서는 3가지가 필요합니다.\n\n1. 파이썬 코드\n2. 학습된 가중치\n3. 환경 (패키지, 버전 등)\n\n만약 이 3가지 중 한 가지라도 전달이 잘못 된다면 모델이 동작하지 않거나 예상하지 못한 예측을 할수 있습니다. 그런데 많은 경우 환경이 일치하지 않아서 동작하지 않는 경우가 많습니다. 머신러닝은 다양한 오픈소스를 사용하는데 오픈소스는 특성상 어떤 버전을 쓰는지에 따라서 같은 함수라도 결과가 다를 수 있습니다.\n\n이러한 문제는 서비스 초기에는 관리할 모델이 많지 않기 때문에 금방 해결할 수 있습니다. 
하지만 관리하는 기능들이 많아지고 서로 소통에 어려움을 겪게 된다면 성능이 더 좋은 모델을 빠르게 배포할 수 없게 됩니다.\n\n## 1단계: ML 파이프라인 자동화\n\n### Pipeline\n\n![level-1-pipeline](./img/level-1-pipeline.png)\n\n그래서 MLOps에서는 “파이프라인(Pipeline)”을 이용해 이러한 문제를 방지하고자 했습니다. MLOps의 파이프라인은 도커와 같은 컨테이너를 이용해 머신러닝 엔지니어가 모델 개발에 사용한 것과 동일한 환경에서 동작하는 것을 보장합니다. 이를 통해서 환경이 달라서 모델이 동작하지 않는 상황을 방지합니다.\n\n그런데 파이프라인은 범용적인 용어로 여러 다양한 태스크에서 사용됩니다. 머신러닝 엔지니어가 작성하는 파이프라인의 역할은 무엇일까요?  \n머신러닝 엔지니어가 작성하는 파이프라인은 학습된 모델을 생산합니다. 그래서 파이프라인 대신 학습 파이프라인(Training Pipeline)이 더 정확한 표현이라고 볼 수 있습니다.\n\n### Continuous Training\n\n![level-1-ct.png](./img/level-1-ct.png)\n\n그리고 Continuous Training(CT) 개념이 추가됩니다. 그렇다면 CT는 왜 필요할까요?\n\n#### Auto Retrain\n\nReal World의 데이터에는 분포가 계속해서 변하는 Data Shift라는 특징이 있습니다. 그래서 과거에 학습한 모델은 시간이 지남에 따라 성능이 저하되는 문제가 있습니다. 이 문제를 해결하는 가장 간단하고 효과적인 해결책은 바로 최근 데이터를 이용해 모델을 재학습하는 것입니다. 변화된 데이터 분포에 맞춰서 모델을 재학습하면 다시 준수한 성능을 낼 수 있습니다.\n\n#### Auto Deploy\n\n하지만 제조업과 같이 한 공장에서 여러 레시피를 처리하는 경우 무조건 재학습을 하는 것이 좋지 않을 수도 있습니다. Blind Spot이 대표적인 예입니다.\n\n예를 들어 자동차 생산 라인에서 모델 A에 대한 머신러닝 모델을 만들고 이를 이용해 예측을 진행하고 있었습니다. 만약 전혀 다른 모델 B가 들어오면 이전에 보지 못한 데이터 패턴이기 때문에 모델 B에 대해서 새로운 모델을 학습합니다.\n\n이제 모델 B에 대한 모델을 만들었기 때문에 예측을 진행할 것입니다. 그런데 만약 데이터가 다시 모델 A로 바뀐다면 어떻게 할까요?  \n만약 Retraining 규칙만 있다면 다시 모델 A에 대해서 새로운 모델을 학습하게 됩니다. 그런데 머신러닝 모델이 충분한 성능을 보이기 위해서는 충분한 양의 데이터가 모여야 합니다. Blind Spot이란 이렇게 데이터를 모으는 동안 모델이 동작하지 않는 구간을 말합니다.\n\n이러한 Blind Spot을 해결하는 방법은 간단합니다. 모델 A에 대한 모델이 과거에 있었는지 확인하고, 있었다면 새로운 모델을 바로 학습하기보다는 이전 모델을 이용해 다시 예측을 진행하는 것입니다. 이렇게 모델 이력과 같은 메타데이터를 이용해 배포할 모델을 자동으로 전환해 주는 것을 Auto Deploy라고 합니다.\n\n정리하자면 CT를 위해서는 Auto Retraining과 Auto Deploy 두 가지 기능이 필요합니다. 둘은 서로의 단점을 보완해 계속해서 모델의 성능을 유지할 수 있게 합니다.\n\n### Model Serving\n\n![level-1-modelserving](./img/level-1-modelserving.png)\n\n프로덕션 환경에서의 머신러닝 파이프라인은 새로운 데이터에 기반한 최신 모델을 예측 서비스에 지속적으로 배포합니다. 이 과정에는 훈련되고 검증된 모델을 온라인 예측 서비스에 자동으로 배포하는 작업이 포함됩니다.\n\n\n## 2단계: CI/CD 파이프라인의 자동화\n\n![level-2](./img/level-2.png)\n\n2단계의 제목은 CI와 CD의 자동화입니다. 
DevOps에서의 CI/CD의 대상은 소스 코드입니다. 그렇다면 MLOps는 어떤 것이 CI/CD의 대상일까요?\n\nMLOps의 CI/CD 대상 또한 소스 코드인 것은 맞지만 조금 더 엄밀히 정의하자면 학습 파이프라인이라고 볼 수 있습니다.\n\n그래서 모델을 학습하는데 있어서 영향이 있는 변화에 대해서 실제로 모델이 정상적으로 학습이 되는지 (CI), 학습된 모델이 정상적으로 동작하는지 (CD)를 확인해야 합니다. 그래서 학습을 하는 코드에 직접적인 수정이 있는 경우에는 CI/CD를 진행해야 합니다.\n\n코드 외에도 사용하는 패키지의 버전, 파이썬의 버전 변경도 CI/CD의 대상입니다. 많은 경우 머신 러닝은 오픈 소스를 이용합니다. 하지만 오픈 소스는 그 특성상 버전이 바뀌었을 때 함수의 내부 로직이 변하는 경우도 있습니다. 물론 어느 정도 버전이 올라 갈 때 이와 관련된 알림을 주지만 한 번에 버전이 크게 바뀐다면 이러한 변화를 모를 수도 있습니다.  \n그래서 사용하는 패키지의 버전이 변하는 경우에도 CI/CD를 통해 정상적으로 모델이 학습, 동작하는지 확인을 해야 합니다.\n"
  },
  {
    "path": "docs/introduction/why_kubernetes.md",
    "content": "---\ntitle : \"4. Why Kubernetes?\"\ndescription: \"Reason for using k8s in MLOps\"\nsidebar_position: 4\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## MLOps & Kubernetes\n\n그렇다면 MLOps를 이야기할 때, 쿠버네티스(Kubernetes)라는 단어가 항상 함께 들리는 이유가 무엇일까요?\n\n성공적인 MLOps 시스템을 구축하기 위해서는 [MLOps의 구성요소](../introduction/component.md) 에서 설명한 것처럼 다양한 구성 요소들이 필요하지만, 각각의 구성 요소들이 유기적으로 운영되기 위해서는 인프라 레벨에서 수많은 이슈를 해결해야 합니다.  \n간단하게는 수많은 머신러닝 모델의 학습 요청을 차례대로 실행하는 것, 다른 작업 공간에서도 같은 실행 환경을 보장해야 하는 것, 배포된 서비스에 장애가 생겼을 때 빠르게 대응해야 하는 것 등의 이슈 등을 생각해볼 수 있습니다.  \n여기서 컨테이너(Container)와 컨테이너 오케스트레이션 시스템(Container Orchestration System)의 필요성이 등장합니다.\n\n쿠버네티스와 같은 컨테이너 오케스트레이션 시스템을 도입하면 실행 환경의 격리와 관리를 효율적으로 수행할 수 있습니다. 컨테이너 오케스트레이션 시스템을 도입한다면, 머신러닝 모델을 개발하고 배포하는 과정에서 다수의 개발자가 소수의 클러스터를 공유하면서 *'1번 클러스터 사용 중이신가요?', 'GPU 사용 중이던 제 프로세스 누가 죽였나요?', '누가 클러스터에 x 패키지 업데이트했나요?'* 와 같은 상황을 방지할 수 있습니다.\n\n## Container\n\n그렇다면 컨테이너란 무엇일까요? 마이크로소프트에서는 컨테이너를 [다음](https://azure.microsoft.com/ko-kr/overview/what-is-a-container/)과 같이 정의하고 있습니다.\n\n> 컨테이너란 : 애플리케이션의 표준화된 이식 가능한 패키징\n\n그런데 왜 머신러닝에서 컨테이너가 필요할까요? 머신러닝 모델들은 운영체제나 Python 실행 환경, 패키지 버전 등에 따라 다르게 동작할 수 있습니다.  \n이를 방지하기 위해서 머신러닝에 사용된 소스 코드와 함께 종속적인 실행 환경 전체를 **하나로 묶어서(패키징해서)** 공유하고 실행하는 데 활용할 수 있는 기술이 컨테이너라이제이션(Containerization) 기술입니다.\n이렇게 패키징된 형태를 컨테이너 이미지라고 부르며, 컨테이너 이미지를 공유함으로써 사용자들은 어떤 시스템에서든 같은 실행 결과를 보장할 수 있게 됩니다.  \n즉, 단순히 Jupyter Notebook 파일이나, 모델의 소스 코드와 requirements.txt 파일을 공유하는 것이 아닌, 모든 실행 환경이 담긴 컨테이너 이미지를 공유한다면 *\"제 노트북에서는 잘 되는데요?\"* 와 같은 상황을 피할 수 있습니다.\n\n컨테이너를 처음 접하시는 분들이 흔히 하시는 오해 중 하나는 \"**컨테이너 == 도커**\"라고 받아들이는 것입니다.  \n도커는 컨테이너와 같은 의미를 지니는 개념이 아니라, 컨테이너를 띄우거나, 컨테이너 이미지를 만들고 공유하는 것과 같이 컨테이너를 더욱더 쉽고 유연하게 사용할 수 있는 기능을 제공해주는 도구입니다. 정리하자면 컨테이너는 가상화 기술이고, 도커는 가상화 기술의 구현체라고 말할 수 있습니다.\n\n다만, 도커는 여러 컨테이너 가상화 도구 중에서 쉬운 사용성과 높은 효율성을 바탕으로 가장 빠르게 성장하여 대세가 되었기에 컨테이너하면 도커라는 이미지가 자동으로 떠오르게 되었습니다. 
이렇게 컨테이너와 도커 생태계가 대세가 되기까지는 다양한 이유가 있지만, 기술적으로 자세한 이야기는 *모두의 MLOps*의 범위를 넘어서기 때문에 다루지는 않겠습니다.\n\n컨테이너 혹은 도커를 처음 들어보시는 분들에게는 *모두의 MLOps*의 내용이 다소 어렵게 느껴질 수 있으므로, [생활코딩](https://opentutorials.org/course/4781), [subicura 님의 개인 블로그 글](https://subicura.com/2017/01/19/docker-guide-for-beginners-1.html) 등의 자료를 먼저 살펴보는 것을 권장합니다.\n\n## Container Orchestration System\n\n그렇다면 컨테이너 오케스트레이션 시스템은 무엇일까요? **오케스트레이션**이라는 단어에서 추측해 볼 수 있듯이, 수많은 컨테이너가 있을 때 컨테이너들이 서로 조화롭게 구동될 수 있도록 지휘하는 시스템에 비유할 수 있습니다.\n\n컨테이너 기반의 시스템에서 서비스는 컨테이너의 형태로 사용자들에게 제공됩니다. 이때 관리해야 할 컨테이너의 수가 적다면 운영 담당자 한 명으로도 충분히 모든 상황에 대응할 수 있습니다.  \n하지만 수백 개 이상의 컨테이너가 수십 대 이상의 클러스터에서 구동되고 있고 장애를 일으키지 않고 항상 정상 동작해야 한다면, 모든 서비스의 정상 동작 여부를 담당자 한 명이 파악하고 이슈에 대응하는 것은 불가능에 가깝습니다.\n\n예를 들면, 모든 서비스가 정상적으로 동작하고 있는지를 계속해서 모니터링(Monitoring)해야 합니다.  \n만약 특정 서비스가 장애를 일으켰다면 여러 컨테이너의 로그를 확인해가며 문제를 파악해야 합니다.  \n또한, 특정 클러스터나 특정 컨테이너에 작업이 몰리지 않도록 스케줄링(Scheduling)하고 로드 밸런싱(Load Balancing)하며, 스케일링(Scaling)하는 등의 수많은 작업을 담당해야 합니다.\n이렇게 수많은 컨테이너의 상태를 지속해서 관리하고 운영하는 과정을 조금이나마 쉽게, 자동으로 할 수 있는 기능을 제공해주는 소프트웨어가 바로 컨테이너 오케스트레이션 시스템입니다.  \n\n머신러닝에서는 어떻게 쓰일 수 있을까요?  \n예를 들어 GPU가 있어야 하는 딥러닝 학습 코드가 패키징된 컨테이너는 사용 가능한 GPU가 있는 클러스터에서 수행하고, 많은 메모리를 필요로 하는 데이터 전처리 코드가 패키징된 컨테이너는 메모리의 여유가 많은 클러스터에서 수행하며, 학습 중에 클러스터에 문제가 생기면 자동으로 같은 컨테이너를 다른 클러스터로 이동시키고 다시 학습을 진행하는 등의 작업을 사람이 일일이 수행하지 않고, 자동으로 관리하는 시스템을 개발한 뒤 맡기는 것입니다.\n\n집필 시점인 2022년을 기준으로 쿠버네티스는 컨테이너 오케스트레이션 시스템의 사실상의 표준(De facto standard)입니다.\n\nCNCF에서 2018년 발표한 [Survey](https://www.cncf.io/blog/2018/08/29/cncf-survey-use-of-cloud-native-technologies-in-production-has-grown-over-200-percent/) 에 따르면 다음 그림과 같이 이미 두각을 나타내고 있었으며, 2019년 발표한 [Survey](https://www.cncf.io/wp-content/uploads/2020/08/CNCF_Survey_Report.pdf)에 따르면 그중 78%가 상용 수준(Production Level)에서 사용하고 있다는 것을 알 수 있습니다.\n\n![k8s-graph](./img/k8s-graph.png)\n\n쿠버네티스 생태계가 이처럼 커지게 된 데에는 여러 가지 이유가 있습니다. 
하지만 도커와 마찬가지로 쿠버네티스 역시 머신러닝 기반의 서비스에서만 사용하는 기술이 아니기에, 자세히 다루기에는 상당히 많은 양의 기술적인 내용을 다루어야 하므로 이번 *모두의 MLOps*에서는 자세한 내용은 생략할 예정입니다.\n\n다만, *모두의 MLOps*에서 앞으로 다룰 내용은 도커와 쿠버네티스에 대한 내용을 어느 정도 알고 계신 분들을 대상으로 작성하였습니다. 따라서 쿠버네티스에 대해 익숙하지 않으신 분들은 다음 [쿠버네티스 공식 문서](https://kubernetes.io/ko/docs/concepts/overview/what-is-kubernetes/), [subicura 님의 개인 블로그 글](https://subicura.com/k8s/) 등의 쉽고 자세한 자료들을 먼저 참고해주시는 것을 권장합니다.\n"
  },
  {
    "path": "docs/kubeflow/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow\",\n  \"position\": 6,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/kubeflow/advanced-component.md",
    "content": "---\ntitle : \"8. Component - InputPath/OutputPath\"\ndescription: \"\"\nsidebar_position: 8\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n\n## Complex Outputs\n\n이번 페이지에서는 [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents) 예시로 나왔던 코드를 컴포넌트로 작성해 보겠습니다.\n\n## Component Contents\n\n아래 코드는 [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents)에서 사용했던 컴포넌트 콘텐츠입니다.\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target = pd.read_csv(train_target_path)\n\nclf = SVC(kernel=kernel)\nclf.fit(train_data, train_target)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n    dill.dump(clf, file_writer)\n```\n\n## Component Wrapper\n\n### Define a standalone Python function\n\n컴포넌트 래퍼에 필요한 Config들과 함께 작성하면 다음과 같이 됩니다.\n\n```python\ndef train_from_csv(\n    train_data_path: str,\n    train_target_path: str,\n    model_path: str,\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\n[Basic Usage Component](../kubeflow/basic-component)에서 설명할 때 입력과 출력에 대한 타입 힌트를 적어야 한다고 설명 했었습니다. 그런데 만약 json에서 사용할 수 있는 기본 타입이 아닌 dataframe, model와 같이 복잡한 객체들은 어떻게 할까요?\n\n파이썬에서 함수간에 값을 전달할 때, 객체를 반환해도 그 값이 호스트의 메모리에 저장되어 있으므로 다음 함수에서도 같은 객체를 사용할 수 있습니다. 하지만 kubeflow에서 컴포넌트들은 각각 컨테이너 위에서 서로 독립적으로 실행됩니다. 즉, 같은 메모리를 공유하고 있지 않기 때문에, 보통의 파이썬 함수에서 사용하는 방식과 같이 객체를 전달할 수 없습니다. 컴포넌트 간에 넘겨 줄 수 있는 정보는 `json` 으로만 가능합니다. 따라서 Model이나 DataFrame과 같이 json 형식으로 변환할 수 없는 타입의 객체는 다른 방법을 통해야 합니다.\n\nKubeflow에서는 이를 해결하기 위해 json-serializable 하지 않은 타입의 객체는 메모리 대신 파일에 데이터를 저장한 뒤, 그 파일을 이용해 정보를 전달합니다. 저장된 파일의 경로는 str이기 때문에 컴포넌트 간에 전달할 수 있기 때문입니다. 
그런데 kubeflow에서는 minio를 이용해 파일을 저장하는데 유저는 실행을 하기 전에는 각 파일의 경로를 알 수 없습니다. 이를 위해서 kubeflow에서는 입력과 출력의 경로와 관련된 매직을 제공하는데 바로 `InputPath`와 `OutputPath` 입니다.\n\n`InputPath`는 단어 그대로 입력 경로를 `OutputPath` 는 단어 그대로 출력 경로를 의미합니다.\n\n예를 들어서 데이터를 생성하고 반환하는 컴포넌트에서는 `data_path: OutputPath()`를 argument로 만듭니다.\n그리고 데이터를 받는 컴포넌트에서는 `data_path: InputPath()`을 argument로 생성합니다.\n\n이렇게 만든 후 파이프라인에서 서로 연결을 하면 kubeflow에서 필요한 경로를 자동으로 생성후 입력해 주기 때문에 더 이상 유저는 경로를 신경쓰지 않고 컴포넌트간의 관계만 신경쓰면 됩니다.\n\n이제 이 내용을 바탕으로 다시 컴포넌트 래퍼를 작성하면 다음과 같이 됩니다.\n\n```python\nfrom kfp.components import InputPath, OutputPath\n\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\nInputPath나 OutputPath는 string을 입력할 수 있습니다. 이 string은 입력 또는 출력하려고 하는 파일의 포맷입니다.  \n그렇다고 꼭 이 포맷으로 파일 형태로 저장이 강제되는 것은 아닙니다.  \n다만 파이프라인을 컴파일할 때 최소한의 타입 체크를 위한 도우미 역할을 합니다.  
\n만약 파일 포맷이 고정되지 않는다면 입력하지 않으면 됩니다 (타입 힌트 에서 `Any` 와 같은 역할을 합니다).\n\n### Convert to Kubeflow Format\n\n작성한 컴포넌트를 kubeflow에서 사용할 수 있는 포맷으로 변환합니다.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\n## Rule to use InputPath/OutputPath\n\nInputPath나 OutputPath argument는 파이프라인으로 작성할 때 지켜야하는 규칙이 있습니다.\n\n### Load Data Component\n\n위에서 작성한 컴포넌트를 실행하기 위해서는 데이터가 필요하므로 데이터를 생성하는 컴포넌트를 작성합니다.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n```\n\n### Write Pipeline\n\n이제 파이프라인을 작성해 보도록 하겠습니다.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"complex_pipeline\")\ndef complex_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n```\n\n한 가지 이상한 점을 확인하셨나요?  
\n바로 입력과 출력에서 받는 argument중 경로와 관련된 것들에 `_path` 접미사가 모두 사라졌습니다.  \n`iris_data.outputs[\"data_path\"]` 가 아닌 `iris_data.outputs[\"data\"]` 으로 접근하는 것을 확인할 수 있습니다.  \n이는 kubeflow에서 정한 법칙으로 `InputPath` 와 `OutputPath` 으로 생성된 경로들은 파이프라인에서 접근할 때는 `_path` 접미사를 생략하여 접근합니다.\n\n다만 방금 작성한 파이프라인을 업로드할 경우 실행이 되지 않습니다.\n이유는 다음 페이지에서 설명합니다.\n"
  },
  {
    "path": "docs/kubeflow/advanced-environment.md",
    "content": "---\ntitle : \"9. Component - Environment\"\ndescription: \"\"\nsidebar_position: 9\ncontributors: [\"Jongseob Jeon\"]\n---\n\n\n## Component Environment\n\n앞서  [8. Component - InputPath/OutputPath](../kubeflow/advanced-component.md)에서 작성한 파이프라인을 실행하면 실패하게 됩니다. 왜 실패하는지 알아보고 정상적으로 실행될 수 있도록 수정합니다.\n\n### Convert to Kubeflow Format\n\n[앞에서 작성한 컴포넌트](../kubeflow/advanced-component.md#convert-to-kubeflow-format)를 yaml파일로 변환하도록 하겠습니다.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\n위의 스크립트를 실행하면 다음과 같은 `train_from_csv.yaml` 파일을 얻을 수 있습니다.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: model, type: dill}\n- {name: kernel, type: String}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n    
      clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --model\n    - {inputPath: model}\n    - --kernel\n    - {inputValue: kernel}\n```\n\n앞서 [Basic Usage Component](../kubeflow/basic-component.md#convert-to-kubeflow-format)에서 설명한 내용에 따르면 이 컴포넌트는 다음과 같이 실행됩니다.\n\n1. `docker pull python:3.7`\n2. run `command`\n\n하지만 위에서 생성된 컴포넌트를 실행하면 오류가 발생하게 됩니다.  \n그 이유는 컴포넌트 래퍼가 실행되는 방식에 있습니다.  \nKubeflow는 쿠버네티스를 이용하기 때문에 컴포넌트 래퍼는 각각 독립된 컨테이너 위에서 컴포넌트 콘텐츠를 실행합니다.\n\n자세히 보면 생성된 `train_from_csv.yaml` 에서 정해진 이미지는 `image: python:3.7` 입니다.\n\n이제 어떤 이유 때문에 실행이 안 되는지 눈치채신 분들도 있을 것입니다.\n\n`python:3.7` 이미지에는 우리가 사용하고자 하는 `dill`, `pandas`, `sklearn` 이 설치되어 있지 않습니다.  \n그러므로 실행할 때 해당 패키지가 존재하지 않는다는 에러와 함께 실행이 안 됩니다.\n\n그럼 어떻게 패키지를 추가할 수 있을까요?\n\n## 패키지 추가 방법\n\nKubeflow 포맷으로 변환하는 과정에서 두 가지 방법을 통해 패키지를 추가할 수 있습니다.\n\n1. `base_image` 사용\n2. 
`packages_to_install` 사용\n\n컴포넌트를 컴파일할 때 사용했던 함수 `create_component_from_func` 가 어떤 argument들을 받을 수 있는지 확인해 보겠습니다.\n\n```python\ndef create_component_from_func(\n    func: Callable,\n    output_component_file: Optional[str] = None,\n    base_image: Optional[str] = None,\n    packages_to_install: List[str] = None,\n    annotations: Optional[Mapping[str, str]] = None,\n):\n```\n\n- `func`: 컴포넌트로 만들 컴포넌트 래퍼 함수\n- `base_image`: 컴포넌트 래퍼가 실행될 이미지\n- `packages_to_install`: 컴포넌트에서 사용하기 위해 추가로 설치해야 하는 패키지\n\n### 1. base_image\n\n컴포넌트가 실행되는 순서를 좀 더 자세히 들여다보면 다음과 같습니다.\n\n1. `docker pull base_image`\n2. `pip install packages_to_install`\n3. run `command`\n\n만약 컴포넌트가 사용하는 base_image에 패키지들이 전부 설치되어 있다면 추가적인 패키지 설치 없이 바로 사용할 수 있습니다.\n\n예를 들어, 이번 페이지에서는 다음과 같은 Dockerfile을 작성하겠습니다.\n\n```dockerfile\nFROM python:3.7\n\nRUN pip install dill pandas scikit-learn\n```\n\n위의 Dockerfile을 이용해 이미지를 빌드해 보겠습니다. 실습에서 사용할 컨테이너 레지스트리는 GitHub Container Registry(ghcr.io)입니다.  \n각자 환경에 맞는 컨테이너 레지스트리를 선택한 후 업로드하면 됩니다.\n\n```bash\ndocker build . -f Dockerfile -t ghcr.io/mlops-for-all/base-image\ndocker push ghcr.io/mlops-for-all/base-image\n```\n\n이제 base_image를 입력해 보겠습니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    base_image=\"ghcr.io/mlops-for-all/base-image:latest\",\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\n이제 생성된 컴포넌트를 컴파일하면 다음과 같이 
나옵니다.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: ghcr.io/mlops-for-all/base-image:latest\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def _make_parent_dirs_and_return_path(file_path: str):\n          import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - 
--kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\nbase_image가 우리가 설정한 값으로 바뀐 것을 확인할 수 있습니다.\n\n### 2. packages_to_install\n\n하지만 패키지가 추가될 때마다 docker 이미지를 계속해서 새로 생성하는 작업은 많은 시간이 소요됩니다.\n이 때, `packages_to_install` argument 를 사용하면 패키지를 컨테이너에 쉽게 추가할 수 있습니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill==0.3.4\", \"pandas==1.3.4\", \"scikit-learn==1.0.1\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\n스크립트를 실행하면 다음과 같은 `train_from_csv.yaml` 파일이 생성됩니다.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def 
_make_parent_dirs_and_return_path(file_path: str):\n          import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\n위에 작성한 컴포넌트가 실행되는 순서를 좀 더 자세히 들여다보면 다음과 같습니다.\n\n1. `docker pull python:3.7`\n2. `pip install dill==0.3.4 pandas==1.3.4 scikit-learn==1.0.1`\n3. 
run `command`\n\n생성된 yaml 파일을 자세히 보면, 다음과 같은 줄이 자동으로 추가되어 필요한 패키지가 설치되기 때문에 오류 없이 정상적으로 실행됩니다.\n\n```bash\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n```\n"
  },
  {
    "path": "docs/kubeflow/advanced-mlflow.md",
    "content": "---\ntitle : \"12. Component - MLFlow\"\ndescription: \"\"\nsidebar_position: 12\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## MLFlow Component\n\n[Advanced Usage Component](../kubeflow/advanced-component.md) 에서 학습한 모델이 API Deployment까지 이어지기 위해서는 MLFlow에 모델을 저장해야 합니다.\n\n이번 페이지에서는 MLFlow에 모델을 저장할 수 있는 컴포넌트를 작성하는 과정을 설명합니다.\n\n## MLFlow in Local\n\nMLFlow에서 모델을 저장하고 서빙에서 사용하기 위해서는 다음의 항목들이 필요합니다.\n\n- model\n- signature\n- input_example\n- conda_env\n\n파이썬 코드를 통해서 MLFLow에 모델을 저장하는 과정에 대해서 알아보겠습니다.\n\n### 1. 모델 학습\n\n아래 과정은 iris 데이터를 이용해 SVC 모델을 학습하는 과정입니다.\n\n```python\nimport pandas as pd\nfrom sklearn.datasets import load_iris\nfrom sklearn.svm import SVC\n\niris = load_iris()\n\ndata = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\ntarget = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\nclf = SVC(kernel=\"rbf\")\nclf.fit(data, target)\n\n```\n\n### 2. MLFLow Infos\n\nmlflow에 필요한 정보들을 만드는 과정입니다.\n\n```python\nfrom mlflow.models.signature import infer_signature\nfrom mlflow.utils.environment import _mlflow_conda_env\n\ninput_example = data.sample(1)\nsignature = infer_signature(data, clf.predict(data))\nconda_env = _mlflow_conda_env(additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"])\n```\n\n각 변수의 내용을 확인하면 다음과 같습니다.\n\n- `input_example`\n\n    | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |\n    | --- | --- | --- | --- |\n    | 6.5 | 6.7 | 3.1 | 4.4 |\n\n- `signature`\n\n    ```python\n    inputs:\n      ['sepal length (cm)': double, 'sepal width (cm)': double, 'petal length (cm)': double, 'petal width (cm)': double]\n    outputs:\n      [Tensor('int64', (-1,))]\n    ```\n\n- `conda_env`\n\n    ```python\n    {'name': 'mlflow-env',\n     'channels': ['conda-forge'],\n     'dependencies': ['python=3.8.10',\n      'pip',\n      {'pip': ['mlflow', 'dill', 'pandas', 'scikit-learn']}]}\n    ```\n\n### 3. 
Save MLFlow Infos\n\n다음으로 학습한 정보들과 모델을 저장합니다.\n학습한 모델이 sklearn 패키지를 이용하기 때문에 `mlflow.sklearn` 을 이용하면 쉽게 모델을 저장할 수 있습니다.\n\n```python\nfrom mlflow.sklearn import save_model\n\nsave_model(\n    sk_model=clf,\n    path=\"svc\",\n    serialization_format=\"cloudpickle\",\n    conda_env=conda_env,\n    signature=signature,\n    input_example=input_example,\n)\n```\n\n로컬에서 실행하면 svc 폴더가 생성되며, 그 안에 다음과 같은 파일들이 만들어집니다.\n\n```bash\nls svc\n```\n\n위의 명령어를 실행하면 다음의 출력값을 확인할 수 있습니다.\n\n```bash\nMLmodel            conda.yaml         input_example.json model.pkl          requirements.txt\n```\n\n각 파일을 확인하면 다음과 같습니다.\n\n- MLmodel\n\n    ```bash\n    flavors:\n      python_function:\n        env: conda.yaml\n        loader_module: mlflow.sklearn\n        model_path: model.pkl\n        python_version: 3.8.10\n      sklearn:\n        pickled_model: model.pkl\n        serialization_format: cloudpickle\n        sklearn_version: 1.0.1\n    saved_input_example_info:\n      artifact_path: input_example.json\n      pandas_orient: split\n      type: dataframe\n    signature:\n      inputs: '[{\"name\": \"sepal length (cm)\", \"type\": \"double\"}, {\"name\": \"sepal width\n        (cm)\", \"type\": \"double\"}, {\"name\": \"petal length (cm)\", \"type\": \"double\"}, {\"name\":\n        \"petal width (cm)\", \"type\": \"double\"}]'\n      outputs: '[{\"type\": \"tensor\", \"tensor-spec\": {\"dtype\": \"int64\", \"shape\": [-1]}}]'\n    utc_time_created: '2021-12-06 06:52:30.612810'\n    ```\n\n- conda.yaml\n\n    ```bash\n    channels:\n    - conda-forge\n    dependencies:\n    - python=3.8.10\n    - pip\n    - pip:\n      - mlflow\n      - dill\n      - pandas\n      - scikit-learn\n    name: mlflow-env\n    ```\n\n- input_example.json\n\n    ```bash\n    {\n        \"columns\": \n        [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n            \"petal width (cm)\"\n        ],\n        \"data\": \n        [\n            [6.7, 
3.1, 4.4, 1.4]\n        ]\n    }\n    ```\n\n- requirements.txt\n\n    ```bash\n    mlflow\n    dill\n    pandas\n    scikit-learn\n    ```\n\n- model.pkl\n\n## MLFlow on Server\n\n이제 저장된 모델을 mlflow 서버에 올리는 작업을 해보겠습니다.\n\n```python\nimport mlflow\n\nwith mlflow.start_run():\n    mlflow.log_artifact(\"svc/\")\n```\n\n저장하고 `mlruns` 가 생성된 경로에서 `mlflow ui` 명령어를 이용해 mlflow 서버와 대시보드를 띄웁니다.\nmlflow 대시보드에 접속하여 생성된 run을 클릭하면 다음과 같이 보입니다.\n\n![mlflow-0.png](./img/mlflow-0.png)\n(해당 화면은 mlflow 버전에 따라 다를 수 있습니다.)\n\n## MLFlow Component\n\n이제 Kubeflow에서 재사용할 수 있는 컴포넌트를 작성해 보겠습니다.\n\n재사용할 수 있는 컴포넌트를 작성하는 방법은 크게 3가지가 있습니다.\n\n1. 모델을 학습하는 컴포넌트에서 필요한 환경을 저장 후 MLFlow 컴포넌트는 업로드만 담당\n\n    ![mlflow-1.png](./img/mlflow-1.png)\n\n2. 학습된 모델과 데이터를 MLFlow 컴포넌트에 전달 후 컴포넌트에서 저장과 업로드 담당\n\n    ![mlflow-2.png](./img/mlflow-2.png)\n\n3. 모델을 학습하는 컴포넌트에서 저장과 업로드를 담당\n\n    ![mlflow-3.png](./img/mlflow-3.png)\n\n저희는 이 중 1번의 접근 방법을 통해 모델을 관리하려고 합니다.\n이유는 MLFlow 모델을 업로드하는 코드는 바뀌지 않기 때문에 매번 3번처럼 컴포넌트 작성마다 작성할 필요는 없기 때문입니다.\n\n컴포넌트를 재활용하는 방법은 1번과 2번의 방법으로 가능합니다.\n다만 2번의 경우 모델이 학습된 이미지와 패키지들을 전달해야 하므로 결국 컴포넌트에 대한 추가 정보를 전달해야 합니다.\n\n1번의 방법으로 진행하기 위해서는 학습하는 컴포넌트 또한 변경되어야 합니다.\n모델을 저장하는데 필요한 환경들을 저장해주는 코드가 추가되어야 합니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    
train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n```\n\n그리고 MLFlow에 업로드하는 컴포넌트를 작성합니다.\n이때 업로드 대상이 되는 MLFlow의 endpoint는 우리가 설치한 [mlflow service](../setup-components/install-components-mlflow.md) 로 이어지도록 설정해야 합니다.  \nS3 Endpoint의 주소로는 MLFlow Server 설치 당시 함께 설치한 minio의 [쿠버네티스 서비스 DNS 네임을 활용](https://kubernetes.io/ko/docs/concepts/services-networking/dns-pod-service/)합니다. 해당 service는 kubeflow namespace에 minio-service라는 이름으로 생성되었으므로, `http://minio-service.kubeflow.svc:9000` 로 설정합니다.  
\n이와 비슷하게 tracking_uri의 주소는 mlflow server의 쿠버네티스 서비스 DNS 네임을 활용하여, `http://mlflow-server-service.mlflow-system.svc:5000` 로 설정합니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n```\n\n## MLFlow Pipeline\n\n이제 작성한 컴포넌트들을 연결해서 파이프라인으로 만들어 보겠습니다.\n\n### Data Component\n\n모델을 학습할 때 쓸 데이터는 sklearn의 iris 입니다.\n데이터를 생성하는 컴포넌트를 작성합니다.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, 
OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n```\n\n### Pipeline\n\n파이프라인 코드는 다음과 같이 작성할 수 있습니다.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n```\n\n### Run\n\n위에서 작성된 컴포넌트와 파이프라인을 하나의 파이썬 파일에 정리하면 다음과 같습니다.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, 
index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = 
\"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(mlflow_pipeline, \"mlflow_pipeline.yaml\")\n```\n\n<p>\n  <details>\n    <summary>mlflow_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: mlflow-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-19T14:14:11.999807',\n    
pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"kernel\", \"type\":\n      \"String\"}, {\"name\": \"model_name\", \"type\": \"String\"}], \"name\": \"mlflow_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10}\nspec:\n  entrypoint: mlflow-pipeline\n  templates:\n  - name: load-iris-data\n    container:\n      args: [--data, /tmp/outputs/data/data, --target, /tmp/outputs/target/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'pandas' 'scikit-learn' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n        install --quiet --no-warn-script-location 'pandas' 'scikit-learn' --user)\n        && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def load_iris_data(\n            data_path,\n            target_path,\n        ):\n            import pandas as pd\n            from sklearn.datasets import load_iris\n\n            iris = load_iris()\n\n            data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n            target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n            data.to_csv(data_path, index=False)\n            target.to_csv(target_path, index=False)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Load iris data', description='')\n        _parser.add_argument(\"--data\", dest=\"data_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--target\", dest=\"target_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n       
 _parsed_args = vars(_parser.parse_args())\n\n        _outputs = load_iris_data(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/outputs/data/data}\n      - {name: load-iris-data-target, path: /tmp/outputs/target/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--data\", {\"outputPath\": \"data\"}, \"--target\", {\"outputPath\": \"target\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''pandas'' ''scikit-learn'' ||\n          PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n          ''pandas'' ''scikit-learn'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef load_iris_data(\\n    data_path,\\n    target_path,\\n):\\n    import\n          pandas as pd\\n    from sklearn.datasets import load_iris\\n\\n    iris = load_iris()\\n\\n    data\n          = pd.DataFrame(iris[\\\"data\\\"], columns=iris[\\\"feature_names\\\"])\\n    target\n          = pd.DataFrame(iris[\\\"target\\\"], columns=[\\\"target\\\"])\\n\\n    data.to_csv(data_path,\n          index=False)\\n    target.to_csv(target_path, index=False)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Load iris data'', description='''')\\n_parser.add_argument(\\\"--data\\\",\n   
       dest=\\\"data_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--target\\\", dest=\\\"target_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = load_iris_data(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"name\": \"Load iris data\", \"outputs\": [{\"name\":\n          \"data\", \"type\": \"csv\"}, {\"name\": \"target\", \"type\": \"csv\"}]}', pipelines.kubeflow.org/component_ref: '{}'}\n  - name: mlflow-pipeline\n    inputs:\n      parameters:\n      - {name: kernel}\n      - {name: model_name}\n    dag:\n      tasks:\n      - {name: load-iris-data, template: load-iris-data}\n      - name: train-from-csv\n        template: train-from-csv\n        dependencies: [load-iris-data]\n        arguments:\n          parameters:\n          - {name: kernel, value: '{{inputs.parameters.kernel}}'}\n          artifacts:\n          - {name: load-iris-data-data, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-data}}'}\n          - {name: load-iris-data-target, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-target}}'}\n      - name: upload-sklearn-model-to-mlflow\n        template: upload-sklearn-model-to-mlflow\n        dependencies: [train-from-csv]\n        arguments:\n          parameters:\n          - {name: model_name, value: '{{inputs.parameters.model_name}}'}\n          artifacts:\n          - {name: train-from-csv-conda_env, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-conda_env}}'}\n          - {name: train-from-csv-input_example, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-input_example}}'}\n          - {name: train-from-csv-model, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-model}}'}\n          - {name: train-from-csv-signature, from: 
'{{tasks.train-from-csv.outputs.artifacts.train-from-csv-signature}}'}\n  - name: train-from-csv\n    container:\n      args: [--train-data, /tmp/inputs/train_data/data, --train-target, /tmp/inputs/train_target/data,\n        --kernel, '{{inputs.parameters.kernel}}', --model, /tmp/outputs/model/data,\n        --input-example, /tmp/outputs/input_example/data, --signature, /tmp/outputs/signature/data,\n        --conda-env, /tmp/outputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def train_from_csv(\n            train_data_path,\n            train_target_path,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n            kernel,\n        ):\n            import dill\n            import pandas as pd\n            from sklearn.svm import SVC\n\n            from mlflow.models.signature import infer_signature\n            from mlflow.utils.environment import _mlflow_conda_env\n\n            train_data = pd.read_csv(train_data_path)\n            train_target = pd.read_csv(train_target_path)\n\n            clf = SVC(kernel=kernel)\n            clf.fit(train_data, train_target)\n\n            with open(model_path, mode=\"wb\") as file_writer:\n                dill.dump(clf, file_writer)\n\n            input_example = 
train_data.sample(1)\n            with open(input_example_path, \"wb\") as file_writer:\n                dill.dump(input_example, file_writer)\n\n            signature = infer_signature(train_data, clf.predict(train_data))\n            with open(signature_path, \"wb\") as file_writer:\n                dill.dump(signature, file_writer)\n\n            conda_env = _mlflow_conda_env(\n                additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n            )\n            with open(conda_env_path, \"wb\") as file_writer:\n                dill.dump(conda_env, file_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n        _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", dest=\"input_example_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = train_from_csv(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: kernel}\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/inputs/train_data/data}\n      - {name: load-iris-data-target, 
path: /tmp/inputs/train_target/data}\n    outputs:\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/outputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/outputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/outputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/outputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--train-data\", {\"inputPath\": \"train_data\"}, \"--train-target\",\n          {\"inputPath\": \"train_target\"}, \"--kernel\", {\"inputValue\": \"kernel\"}, \"--model\",\n          {\"outputPath\": \"model\"}, \"--input-example\", {\"outputPath\": \"input_example\"},\n          \"--signature\", {\"outputPath\": \"signature\"}, \"--conda-env\", {\"outputPath\":\n          \"conda_env\"}], \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''dill'' ''pandas''\n          ''scikit-learn'' ''mlflow'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m\n          pip install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef train_from_csv(\\n    train_data_path,\\n    train_target_path,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n    
kernel,\\n):\\n    import\n          dill\\n    import pandas as pd\\n    from sklearn.svm import SVC\\n\\n    from\n          mlflow.models.signature import infer_signature\\n    from mlflow.utils.environment\n          import _mlflow_conda_env\\n\\n    train_data = pd.read_csv(train_data_path)\\n    train_target\n          = pd.read_csv(train_target_path)\\n\\n    clf = SVC(kernel=kernel)\\n    clf.fit(train_data,\n          train_target)\\n\\n    with open(model_path, mode=\\\"wb\\\") as file_writer:\\n        dill.dump(clf,\n          file_writer)\\n\\n    input_example = train_data.sample(1)\\n    with open(input_example_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(input_example, file_writer)\\n\\n    signature\n          = infer_signature(train_data, clf.predict(train_data))\\n    with open(signature_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(signature, file_writer)\\n\\n    conda_env\n          = _mlflow_conda_env(\\n        additional_pip_deps=[\\\"dill\\\", \\\"pandas\\\",\n          \\\"scikit-learn\\\"]\\n    )\\n    with open(conda_env_path, \\\"wb\\\") as file_writer:\\n        dill.dump(conda_env,\n          file_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Train\n          from csv'', description='''')\\n_parser.add_argument(\\\"--train-data\\\", dest=\\\"train_data_path\\\",\n          type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--train-target\\\",\n          dest=\\\"train_target_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--kernel\\\",\n          dest=\\\"kernel\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\", dest=\\\"input_example_path\\\",\n          
type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\", dest=\\\"conda_env_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = train_from_csv(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"train_data\", \"type\": \"csv\"},\n          {\"name\": \"train_target\", \"type\": \"csv\"}, {\"name\": \"kernel\", \"type\": \"String\"}],\n          \"name\": \"Train from csv\", \"outputs\": [{\"name\": \"model\", \"type\": \"dill\"},\n          {\"name\": \"input_example\", \"type\": \"dill\"}, {\"name\": \"signature\", \"type\":\n          \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}]}', pipelines.kubeflow.org/component_ref: '{}',\n        pipelines.kubeflow.org/arguments.parameters: '{\"kernel\": \"{{inputs.parameters.kernel}}\"}'}\n  - name: upload-sklearn-model-to-mlflow\n    container:\n      args: [--model-name, '{{inputs.parameters.model_name}}', --model, /tmp/inputs/model/data,\n        --input-example, /tmp/inputs/input_example/data, --signature, /tmp/inputs/signature/data,\n        --conda-env, /tmp/inputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' 'boto3' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' 'boto3' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" 
\"$@\"\n      - |\n        def upload_sklearn_model_to_mlflow(\n            model_name,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n        ):\n            import os\n            import dill\n            from mlflow.sklearn import save_model\n\n            from mlflow.tracking.client import MlflowClient\n\n            os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n            os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n            os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n            client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n            with open(model_path, mode=\"rb\") as file_reader:\n                clf = dill.load(file_reader)\n\n            with open(input_example_path, \"rb\") as file_reader:\n                input_example = dill.load(file_reader)\n\n            with open(signature_path, \"rb\") as file_reader:\n                signature = dill.load(file_reader)\n\n            with open(conda_env_path, \"rb\") as file_reader:\n                conda_env = dill.load(file_reader)\n\n            save_model(\n                sk_model=clf,\n                path=model_name,\n                serialization_format=\"cloudpickle\",\n                conda_env=conda_env,\n                signature=signature,\n                input_example=input_example,\n            )\n            run = client.create_run(experiment_id=\"0\")\n            client.log_artifact(run.info.run_id, model_name)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Upload sklearn model to mlflow', description='')\n        _parser.add_argument(\"--model-name\", dest=\"model_name\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", 
dest=\"input_example_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: model_name}\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/inputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/inputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/inputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/inputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--model-name\", {\"inputValue\": \"model_name\"}, \"--model\", {\"inputPath\":\n          \"model\"}, \"--input-example\", {\"inputPath\": \"input_example\"}, \"--signature\",\n          {\"inputPath\": \"signature\"}, \"--conda-env\", {\"inputPath\": \"conda_env\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' ''boto3'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install\n          --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn'' ''mlflow''\n          ''boto3'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u 
\\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def upload_sklearn_model_to_mlflow(\\n    model_name,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n):\\n    import\n          os\\n    import dill\\n    from mlflow.sklearn import save_model\\n\\n    from\n          mlflow.tracking.client import MlflowClient\\n\\n    os.environ[\\\"MLFLOW_S3_ENDPOINT_URL\\\"]\n          = \\\"http://minio-service.kubeflow.svc:9000\\\"\\n    os.environ[\\\"AWS_ACCESS_KEY_ID\\\"]\n          = \\\"minio\\\"\\n    os.environ[\\\"AWS_SECRET_ACCESS_KEY\\\"] = \\\"minio123\\\"\\n\\n    client\n          = MlflowClient(\\\"http://mlflow-server-service.mlflow-system.svc:5000\\\")\\n\\n    with\n          open(model_path, mode=\\\"rb\\\") as file_reader:\\n        clf = dill.load(file_reader)\\n\\n    with\n          open(input_example_path, \\\"rb\\\") as file_reader:\\n        input_example\n          = dill.load(file_reader)\\n\\n    with open(signature_path, \\\"rb\\\") as file_reader:\\n        signature\n          = dill.load(file_reader)\\n\\n    with open(conda_env_path, \\\"rb\\\") as file_reader:\\n        conda_env\n          = dill.load(file_reader)\\n\\n    save_model(\\n        sk_model=clf,\\n        path=model_name,\\n        serialization_format=\\\"cloudpickle\\\",\\n        conda_env=conda_env,\\n        signature=signature,\\n        input_example=input_example,\\n    )\\n    run\n          = client.create_run(experiment_id=\\\"0\\\")\\n    client.log_artifact(run.info.run_id,\n          model_name)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Upload\n          sklearn model to mlflow'', description='''')\\n_parser.add_argument(\\\"--model-name\\\",\n          dest=\\\"model_name\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=str, required=True, 
default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\",
          dest=\\\"input_example_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",
          dest=\\\"signature_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\",
          dest=\\\"conda_env_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parsed_args
          = vars(_parser.parse_args())\\n\\n_outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\\n\"],
          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"model_name\", \"type\": \"String\"},
          {\"name\": \"model\", \"type\": \"dill\"}, {\"name\": \"input_example\", \"type\": \"dill\"},
          {\"name\": \"signature\", \"type\": \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}],
          \"name\": \"Upload sklearn model to mlflow\"}', pipelines.kubeflow.org/component_ref: '{}',
        pipelines.kubeflow.org/arguments.parameters: '{\"model_name\": \"{{inputs.parameters.model_name}}\"}'}\n  arguments:\n    parameters:\n    - {name: kernel}\n    - {name: model_name}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter the script runs, upload the generated mlflow_pipeline.yaml file as a pipeline, execute it, and check the run's results.\n\n![mlflow-svc-0](./img/mlflow-svc-0.png)\n\nPort-forward the MLflow service to access the MLflow UI.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\nOpen a web browser and go to localhost:5000; you can see that a run has been created, as shown below.\n\n![mlflow-svc-1](./img/mlflow-svc-1.png)\n\nClicking into the run, you can see the trained model file.\n\n![mlflow-svc-2](./img/mlflow-svc-2.png)\n"
  },
  {
    "path": "docs/kubeflow/advanced-pipeline.md",
    "content": "---\ntitle : \"10. Pipeline - Setting\"\ndescription: \"\"\nsidebar_position: 10\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline Setting\n\nThis page covers the settings you can configure in a pipeline.\n\n## Display Name\n\nWithin a generated pipeline, each component has two names.\n\n- task_name: the name of the function used to write the component\n- display_name: the name shown in the Kubeflow UI\n\nFor example, in the following case both components are labeled Print and return number, so it is hard to tell which component is number 1 and which is number 2.\n\n![run-7](./img/run-7.png)\n\n### set_display_name\n\nThis is exactly what display_name is for.  \nTo set it, call the `set_display_name` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ContainerOp.set_display_name) on a component in the pipeline, as shown below.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nRunning this script produces `example_pipeline.yaml`, which looks like this:\n\n<p>\n  <details>\n    <summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: 
'2021-12-09T18:11:43.193190',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n       
         raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is number 1, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return 
number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', 
/tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: 
{pipelines.kubeflow.org/task_display_name: This is number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": 
\"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is sum of number\n          1 and number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\",\n          {\"inputValue\": \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" 
\\\"$@\\\"\\n\",\n          \"def sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport
          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',
          description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",
          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",
          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args
          = vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],
          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},
          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',
        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":
          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nCompared with the previous file, a new `pipelines.kubeflow.org/task_display_name` key has been added.\n\n### UI in Kubeflow\n\nUsing the file created above, upload a new version of the [pipeline](../kubeflow/basic-pipeline-upload.md#upload-pipeline-version) created earlier.\n\n![adv-pipeline-0.png](./img/adv-pipeline-0.png)\n\nYou can then see the configured names displayed, as shown above.\n\n## Resources\n\n### GPU\n\nWithout any special configuration, the pipeline runs each component as a Kubernetes pod with the default resource spec.  \nIf you need a GPU to train a model, training will not work properly because no GPU is allocated on Kubernetes.  
\nYou can configure this with the `set_gpu_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.UserContainer.set_gpu_limit).\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1)\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nIf you run the script above and look closely at `sum-and-print-numbers` in the generated file, you can see that `{nvidia.com/gpu: 1}` has been added under resources.\nThis is how the component is allocated a GPU.\n\n```bash\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, 
default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n      resources:\n        limits: {nvidia.com/gpu: 1}\n```\n\n### CPU\n\nYou can set the number of CPUs with the `.set_cpu_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_cpu_limit).  \nUnlike the GPU setting, the value must be passed as a string rather than an int.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_cpu_limit(\"16\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nOnly the changed part is shown below:\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, cpu: '16'}\n```\n\n### Memory\n\nMemory can be set with the `.set_memory_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_memory_limit).\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import 
pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_memory_limit(\"1G\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nOnly the changed part is shown below:\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, memory: 1G}\n```\n"
  },
  {
    "path": "docs/kubeflow/advanced-run.md",
    "content": "---\ntitle : \"11. Pipeline - Run Result\"\ndescription: \"\"\nsidebar_position: 11\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## Run Result\n\nClicking a run's result shows three tabs: Graph, Run output, and Config.\n\n![advanced-run-0.png](./img/advanced-run-0.png)\n\n## Graph\n\n![advanced-run-1.png](./img/advanced-run-1.png)\n\nIn the graph, click an executed component to see its execution details.\n\n### Input/Output\n\nOn the Input/Output tab, you can view and download the configs the component used, along with its input and output artifacts.\n\n### Logs\n\nThe Logs tab shows all stdout produced while the Python code runs.\nHowever, because the pod is deleted after a certain period, the logs are no longer available on this tab once that happens.\nIn that case, you can find them in the main-logs output artifact.\n\n### Visualizations\n\nThe Visualizations tab shows plots generated by a component.\n\nTo create a plot, save the value you want to display through the `mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")` argument. The plot must be in HTML format.\nThe conversion works as follows.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import create_component_from_func, OutputPath\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(\n    mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")\n):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n```\n\nWritten as a pipeline, it looks like this:\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    
packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n\n\n@pipeline(name=\"plot_pipeline\")\ndef plot_pipeline():\n    plot_linear()\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(plot_pipeline, \"plot_pipeline.yaml\")\n```\n\nRunning this script produces `plot_pipeline.yaml`, which looks like this:\n\n<p>\n  <details>\n    <summary>plot_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: plot-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-17T13:31:32.963214',\n    pipelines.kubeflow.org/pipeline_spec: '{\"name\": \"plot_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: plot-pipeline\n  templates:\n  - name: plot-linear\n    container:\n      args: [--mlpipeline-ui-metadata, /tmp/outputs/mlpipeline_ui_metadata/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'matplotlib' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet\n        --no-warn-script-location 'matplotlib' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        
program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n        def plot_linear(mlpipeline_ui_metadata):\n            import base64\n            import json\n            from io import BytesIO\n            import matplotlib.pyplot as plt\n            plt.plot([1, 2, 3], [1, 2, 3])\n            tmpfile = BytesIO()\n            plt.savefig(tmpfile, format=\"png\")\n            encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n            html = f\"<img src='data:image/png;base64,{encoded}'>\"\n            metadata = {\n                \"outputs\": [\n                    {\n                        \"type\": \"web-app\",\n                        \"storage\": \"inline\",\n                        \"source\": html,\n                    },\n                ],\n            }\n            with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n                json.dump(metadata, html_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Plot linear', description='')\n        _parser.add_argument(\"--mlpipeline-ui-metadata\", dest=\"mlpipeline_ui_metadata\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n        _outputs = plot_linear(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: mlpipeline-ui-metadata, path: /tmp/outputs/mlpipeline_ui_metadata/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n       
   {\"args\": [\"--mlpipeline-ui-metadata\", {\"outputPath\": \"mlpipeline_ui_metadata\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''matplotlib'' || PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''matplotlib''\n          --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef plot_linear(mlpipeline_ui_metadata):\\n    import\n          base64\\n    import json\\n    from io import BytesIO\\n\\n    import matplotlib.pyplot\n          as plt\\n\\n    plt.plot([1, 2, 3], [1, 2, 3])\\n\\n    tmpfile = BytesIO()\\n    plt.savefig(tmpfile,\n          format=\\\"png\\\")\\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\\\"utf-8\\\")\\n\\n    html\n          = f\\\"<img src=''data:image/png;base64,{encoded}''>\\\"\\n    metadata = {\\n        \\\"outputs\\\":\n          [\\n            {\\n                \\\"type\\\": \\\"web-app\\\",\\n                \\\"storage\\\":\n          \\\"inline\\\",\\n                \\\"source\\\": html,\\n            },\\n        ],\\n    }\\n    with\n          open(mlpipeline_ui_metadata, \\\"w\\\") as html_writer:\\n        json.dump(metadata,\n          html_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Plot\n          linear'', description='''')\\n_parser.add_argument(\\\"--mlpipeline-ui-metadata\\\",\n          dest=\\\"mlpipeline_ui_metadata\\\", type=_make_parent_dirs_and_return_path,\n          required=True, default=argparse.SUPPRESS)\\n_parsed_args = vars(_parser.parse_args())\\n\\n_outputs\n          = 
plot_linear(**_parsed_args)\\n\"], \"image\": \"python:3.7\"}}, \"name\": \"Plot\n          linear\", \"outputs\": [{\"name\": \"mlpipeline_ui_metadata\", \"type\": \"UI_Metadata\"}]}',\n        pipelines.kubeflow.org/component_ref: '{}'}\n  - name: plot-pipeline\n    dag:\n      tasks:\n      - {name: plot-linear, template: plot-linear}\n  arguments:\n    parameters: []\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter the run finishes, click Visualizations.\n\n![advanced-run-5.png](./img/advanced-run-5.png)\n\n## Run output\n\n![advanced-run-2.png](./img/advanced-run-2.png)\n\nRun output collects and displays the Artifacts that follow the formats Kubeflow defines, and is where evaluation metrics are shown.\n\nTo display a metric, save the name and value you want to show in JSON form through the `mlpipeline_metrics_path: OutputPath(\"Metrics\")` argument.\nFor example, it can be written as follows:\n\n```python\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n                \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n```\n\nLet's add this metric component to the pipeline created in [Pipeline](../kubeflow/basic-pipeline.md) and run it.\nThe full pipeline looks like this:\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int) -> int:\n    sum_number = number_1 + number_2\n    print(sum_number)\n    return sum_number\n\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n         
       \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n    show_metric_of_sum(sum_result.output)\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nAfter running, clicking Run Output shows the following:\n\n![advanced-run-4.png](./img/advanced-run-4.png)\n\n## Config\n\n![advanced-run-3.png](./img/advanced-run-3.png)\n\nThe Config tab shows every value that was provided as pipeline Config.\n"
  },
  {
    "path": "docs/kubeflow/basic-component.md",
    "content": "---\ntitle : \"4. Component - Write\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\n\n## Component\n\nWriting a component involves two parts:\n\n1. Writing the Component Contents\n2. Writing the Component Wrapper\n\nLet's go through each step.\n\n## Component Contents\n\nComponent contents are no different from the Python code we usually write.  \nFor example, let's write a component that takes a number as input, prints it, and returns it.  \nIn plain Python it can be written as:\n\n```python\nprint(number)\n```\n\nRunning this code, however, raises an error because the `number` to print is not defined.\n\nIn [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md) we said that values the component contents need, such as `number`, are defined as **Config**. The Configs required to run the component contents must be passed in by the component wrapper.\n\n## Component Wrapper\n\n### Define a standalone Python function\n\nNow we need a component wrapper that can pass in the required Configs.\n\nWrapping the contents without any Config looks like this:\n\n```python\ndef print_and_return_number():\n    print(number)\n    return number\n```\n\nNext, add the Configs the contents need as arguments of the wrapper. Don't just write the argument names; also write their type hints. When converting a pipeline to the Kubeflow format, Kubeflow checks that the declared input and output types match wherever components are connected. If the input a component expects does not match the output it receives from another component, the pipeline cannot be created.\n\nComplete the component wrapper by writing the arguments, their types, and the return type:\n\n```python\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\nOnly types that can be represented in JSON can be used as return values in Kubeflow. The most commonly used and recommended types are:\n\n- int\n- float\n- str\n\nTo return multiple values instead of a single one, use `collections.namedtuple`.  \nFor details, see the [official Kubeflow documentation](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#passing-parameters-by-value).  
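As a quick refresher on the mechanism itself, `collections.namedtuple` builds a tuple subclass whose fields can be read by name; a minimal sketch, independent of Kubeflow:

```python
from collections import namedtuple

# A named tuple type with two fields, accessible by name or by position
Point = namedtuple("Point", ["x", "y"])

p = Point(1, 2)
assert p.x == 1     # access by field name
assert p == (1, 2)  # still compares equal to a plain tuple
```

This is what lets a component hand back several named values while the result still behaves like an ordinary tuple.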
\nFor example, a component that returns the quotient and remainder of dividing the input number by 2 should be written like this:\n\n```python\nfrom typing import NamedTuple\n\n\ndef divide_and_return_number(\n    number: int,\n) -> NamedTuple(\"DivideOutputs\", [(\"quotient\", int), (\"remainder\", int)]):\n    from collections import namedtuple\n\n    quotient, remainder = divmod(number, 2)\n    print(\"quotient is\", quotient)\n    print(\"remainder is\", remainder)\n\n    divide_outputs = namedtuple(\n        \"DivideOutputs\",\n        [\n            \"quotient\",\n            \"remainder\",\n        ],\n    )\n    return divide_outputs(quotient, remainder)\n```\n\n### Convert to Kubeflow Format\n\nNow the component needs to be converted into a format Kubeflow can use. The conversion is done with `kfp.components.create_component_from_func`.  \nThe converted component can then be imported as a Python function and used in a pipeline.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\n### Share component with yaml file\n\nIf a component cannot be shared as Python code, it can be shared and used as a YAML file instead.\nTo do this, first convert the component to a YAML file; it can then be loaded in a pipeline through `kfp.components.load_component_from_file`.\n\nFirst, here is how to convert the component to a YAML file.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\nif __name__ == \"__main__\":\n    print_and_return_number.component_spec.save(\"print_and_return_number.yaml\")\n```\n\nRunning this Python code generates `print_and_return_number.yaml`. 
The file looks like this:\n\n```bash\nname: Print and return number\ninputs:\n- {name: number, type: Integer}\noutputs:\n- {name: Output, type: Integer}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def print_and_return_number(number):\n          print(number)\n          return number\n\n      def _serialize_int(int_value: int) -> str:\n          if isinstance(int_value, str):\n              return int_value\n          if not isinstance(int_value, int):\n              raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n          return str(int_value)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n      _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n      _parsed_args = vars(_parser.parse_args())\n      _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n      _outputs = print_and_return_number(**_parsed_args)\n\n      _outputs = [_outputs]\n\n      _output_serializers = [\n          _serialize_int,\n\n      ]\n\n      import os\n      for idx, output_file in enumerate(_output_files):\n          try:\n              os.makedirs(os.path.dirname(output_file))\n          except OSError:\n              pass\n          with open(output_file, 'w') as f:\n              f.write(_output_serializers[idx](_outputs[idx]))\n    args:\n    - --number\n    - {inputValue: number}\n    - '----output-paths'\n    - {outputPath: Output}\n```\n\nNow the generated file can be shared and used in a pipeline as follows:\n\n```python\nfrom kfp.components import load_component_from_file\n\nprint_and_return_number = 
load_component_from_file(\"print_and_return_number.yaml\")\n```\n\n## How Kubeflow executes component\n\nKubeflow executes a component in the following order:\n\n1. `docker pull <image>`: pull the image that holds the component's execution environment\n2. run `command`: execute the component contents in the pulled image  \n\nTaking `print_and_return_number.yaml` as an example, the default image of `@create_component_from_func` is python:3.7, so the component contents are executed on top of that image.  \n\n1. `docker pull python:3.7`\n2. `print(number)`\n\n## References:\n\n- [Getting Started With Python function based components](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#getting-started-with-python-function-based-components)\n"
  },
  {
    "path": "docs/kubeflow/basic-pipeline-upload.md",
    "content": "---\ntitle : \"6. Pipeline - Upload\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Upload Pipeline\n\nNow let's upload the pipeline we made to Kubeflow.  \nPipelines can be uploaded through the Kubeflow dashboard UI.\nSet up port forwarding using the same method as in [Install Kubeflow](../setup-components/install-components-kf.md#정상-설치-확인).\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nOpen the dashboard at [http://localhost:8080](http://localhost:8080).\n\n### 1. Select the Pipelines tab\n\n![pipeline-gui-0.png](./img/pipeline-gui-0.png)\n\n### 2. Select Upload Pipeline\n\n![pipeline-gui-1.png](./img/pipeline-gui-1.png)\n\n### 3. Select Choose file\n\n![pipeline-gui-2.png](./img/pipeline-gui-2.png)\n\n### 4. Upload the generated yaml file\n\n![pipeline-gui-3.png](./img/pipeline-gui-3.png)\n\n### 5. Create\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\n## Upload Pipeline Version\n\nUploaded pipelines can be versioned through further uploads. This is not code-level version control like GitHub, however; it simply groups pipelines with the same name together.\nAfter uploading the pipeline in the example above, you can see that example_pipeline has been created.\n\n![pipeline-gui-5.png](./img/pipeline-gui-5.png)\n\nClicking it shows the following screen.\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\nClicking Upload Version opens a screen where you can upload a pipeline.\n\n![pipeline-gui-6.png](./img/pipeline-gui-6.png)\n\nUpload the pipeline.\n\n![pipeline-gui-7.png](./img/pipeline-gui-7.png)\n\nOnce uploaded, you can check the pipeline versions as shown below.\n\n![pipeline-gui-8.png](./img/pipeline-gui-8.png)\n"
  },
  {
    "path": "docs/kubeflow/basic-pipeline.md",
    "content": "---\ntitle : \"5. Pipeline - Write\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline\n\nComponents are not executed on their own; they run as building blocks of a pipeline. So to try out a component, we need to write a pipeline.\nAnd writing a pipeline requires a set of components and the order in which they run.\n\nOn this page we will build a pipeline containing a component that takes a number and prints it, and a component that receives the numbers from two such components and prints their sum.\n\n## Component Set\n\nFirst, write the components the pipeline will use.\n\n1. `print_and_return_number`\n\n  A component that prints and returns the input number.  \n  Since the component returns the value it receives, int is given as the return type hint.\n\n  ```python\n  @create_component_from_func\n  def print_and_return_number(number: int) -> int:\n      print(number)\n      return number\n  ```\n\n2. `sum_and_print_numbers`\n\n  A component that prints the sum of the two input numbers.  \n  This component also returns the sum of the two numbers, so int is given as the return type hint.\n\n  ```python\n  @create_component_from_func\n  def sum_and_print_numbers(number_1: int, number_2: int) -> int:\n      sum_num = number_1 + number_2\n      print(sum_num)\n      return sum_num\n  ```\n\n## Component Order\n\n### Define Order\n\nWith the set of required components in place, the next step is to define their order.  \nThe order of the pipeline built on this page can be drawn as follows:\n\n![pipeline-0.png](./img/pipeline-0.png)\n\n### Single Output\n\nNow let's translate this order into code.  \n\nWriting `print_and_return_number_1` and `print_and_return_number_2` from the figure above gives:\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n```\n\nEach component is executed and its return value is stored in `number_1_result` and `number_2_result`, respectively.  \nThe stored return value of `number_1_result` can then be used through `number_1_result.output`.\n\n### Multi Output\n\nIn the example above the component returns a single value, so it can be used directly through `output`.  \nIf there are multiple return values, they are stored in `outputs`, which is a dict, so the desired value can be accessed by key.\nFor example, consider the multi-value [component](../kubeflow/basic-component.md#define-a-standalone-python-function) written earlier.\n`divide_and_return_number` has `quotient` and `remainder` as return values. 
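Because the wrapper is still a plain Python function before it is converted, its multi-value return can be sanity-checked locally; a minimal sketch, repeating the definition from the component page so it runs standalone:

```python
from typing import NamedTuple


def divide_and_return_number(
    number: int,
) -> NamedTuple("DivideOutputs", [("quotient", int), ("remainder", int)]):
    # imports stay inside the function so the component is self-contained
    from collections import namedtuple

    quotient, remainder = divmod(number, 2)
    divide_outputs = namedtuple("DivideOutputs", ["quotient", "remainder"])
    return divide_outputs(quotient, remainder)


# Called locally, it behaves like any Python function: divmod(7, 2) == (3, 1)
outputs = divide_and_return_number(7)
assert outputs.quotient == 3
assert outputs.remainder == 1
```

Inside a pipeline, the same two values are instead reached through the task's `outputs` dict, as shown next.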
An example of passing these two values to `print_and_return_number` looks like this:\n\n```python\ndef multi_pipeline():\n    divided_result = divide_and_return_number(number)\n    num_1_result = print_and_return_number(divided_result.outputs[\"quotient\"])\n    num_2_result = print_and_return_number(divided_result.outputs[\"remainder\"])\n```\n\nThe result of `divide_and_return_number` is stored in `divided_result`, and the individual values can be retrieved with `divided_result.outputs[\"quotient\"]` and `divided_result.outputs[\"remainder\"]`.\n\n### Write to python code\n\nBack to the main example: pass the two results to `sum_and_print_numbers`.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nNext, gather the Configs each component needs and define them as the pipeline Config.\n\n```python\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\n## Convert to Kubeflow Format\n\nFinally, convert the pipeline into a format Kubeflow can use. 
The conversion is done with the `kfp.dsl.pipeline` decorator.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nKubeflow can only run pipelines in yaml format, so the pipeline must be compiled into the prescribed yaml format.\nCompile it with the following code:\n\n```python\nif __name__ == \"__main__\":\n    import kfp\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\n## Conclusion\n\nPutting everything above into a single Python script gives:\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nThe compiled result looks like this:\n\n<details>\n  <summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-05T13:38:51.566777',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, 
{\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return 
str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n     
     int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def 
_serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf 
\\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n       
 '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\", {\"inputValue\":\n          \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = 
vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n</details>\n"
  },
  {
    "path": "docs/kubeflow/basic-requirements.md",
    "content": "---\ntitle : \"3. Install Requirements\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\"]\n---\n\n실습을 위해 권장하는 파이썬 버전은 python>=3.7입니다. 파이썬 환경에 익숙하지 않은 분들은 다음 [Appendix 1. 파이썬 가상환경](../appendix/pyenv)을 참고하여 **클라이언트 노드**에 설치해주신 뒤 패키지 설치를 진행해주시기를 바랍니다.\n\n실습을 진행하기에서 필요한 패키지들과 버전은 다음과 같습니다.\n\n- requirements.txt\n\n  ```bash\n  kfp==1.8.9\n  scikit-learn==1.0.1\n  mlflow==1.21.0\n  pandas==1.3.4\n  dill==0.3.4\n  ```\n\n[앞에서 만든 파이썬 가상환경](../appendix/pyenv.md#python-가상환경-생성)을 활성화합니다.\n\n```bash\npyenv activate demo\n```\n\n패키지 설치를 진행합니다.\n\n```bash\npip3 install -U pip\npip3 install kfp==1.8.9 scikit-learn==1.0.1 mlflow==1.21.0 pandas==1.3.4 dill==0.3.4\n```\n"
  },
  {
    "path": "docs/kubeflow/basic-run.md",
    "content": "---\ntitle : \"7. Pipeline - Run\"\ndescription: \"\"\nsidebar_position: 7\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Run Pipeline\n\n이제 업로드한 파이프라인을 실행시켜 보겠습니다.\n\n## Before Run\n\n### 1. Create Experiment\n\nExperiment란 Kubeflow 에서 실행되는 Run을 논리적으로 관리하는 단위입니다.  \n\nKubeflow에서 namespace를 처음 들어오면 생성되어 있는 Experiment가 없습니다. 따라서 파이프라인을 실행하기 전에 미리 Experiment를 생성해두어야 합니다. Experiment이 있다면 [Run Pipeline](../kubeflow/basic-run.md#run-pipeline-1)으로 넘어가도 무방합니다.\n\nExperiment는 Create Experiment 버튼을 통해 생성할 수 있습니다.\n\n![run-0.png](./img/run-0.png)\n\n### 2. Name 입력\n\nExperiment로 사용할 이름을 입력합니다.\n![run-1.png](./img/run-1.png)\n\n## Run Pipeline\n\n### 1. Create Run 선택\n\n![run-2.png](./img/run-2.png)\n\n### 2. Experiment 선택\n\n![run-9.png](./img/run-9.png)\n\n![run-10.png](./img/run-10.png)\n\n### 3. Pipeline Config 입력\n\n파이프라인을 생성할 때 입력한 Config 값들을 채워 넣습니다.\n업로드한 파이프라인은 number_1과 number_2를 입력해야 합니다.\n\n![run-3.png](./img/run-3.png)\n\n### 4. Start\n\n입력 후 Start 버튼을 누르면 파이프라인이 실행됩니다.\n\n![run-4.png](./img/run-4.png)\n\n## Run Result\n\n실행된 파이프라인들은 Runs 탭에서 확인할 수 있습니다.\nRun을 클릭하면 실행된 파이프라인과 관련된 자세한 내용을 확인해 볼 수 있습니다.\n\n![run-5.png](./img/run-5.png)\n\n클릭하면 다음과 같은 화면이 나옵니다. 아직 실행되지 않은 컴포넌트는 회색 표시로 나옵니다.\n\n![run-6.png](./img/run-6.png)\n\n컴포넌트가 실행이 완료되면 초록색 체크 표시가 나옵니다.\n\n![run-7.png](./img/run-7.png)\n\n가장 마지막 컴포넌트를 보면 입력한 Config인 3과 5의 합인 8이 출력된 것을 확인할 수 있습니다.\n\n![run-8.png](./img/run-8.png)\n"
  },
  {
    "path": "docs/kubeflow/how-to-debug.md",
    "content": "---\ntitle : \"13. Component - Debugging\"\ndescription: \"\"\nsidebar_position: 13\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Debugging Pipeline\n\n이번 페이지에서는 Kubeflow 컴포넌트를 디버깅하는 방법에 대해서 알아봅니다.\n\n## Failed Component\n\n이번 페이지에서는 [Component - MLFlow](../kubeflow/advanced-mlflow.md#mlflow-pipeline) 에서 이용한 파이프라인을 조금 수정해서 사용합니다.\n\n우선 컴포넌트가 실패하도록 파이프라인을 변경하도록 하겠습니다.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n    \n    data[\"sepal length (cm)\"] = None\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna()\n    data.to_csv(output_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature 
import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n\n@pipeline(name=\"debugging_pipeline\")\ndef debugging_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    drop_data = drop_na_from_csv(data=iris_data.outputs[\"data\"])\n    model = train_from_csv(\n        train_data=drop_data.outputs[\"output\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(debugging_pipeline, \"debugging_pipeline.yaml\")\n\n```\n\n수정한 점은 다음과 같습니다.\n\n1. 데이터를 불러오는 `load_iris_data` 컴포넌트에서 `sepal length (cm)` 피처에 `None` 값을 주입\n2. `drop_na_from_csv` 컴포넌트에서 `dropna()` 함수를 이용해 na 값이 포함된 `row`를 제거\n\n이제 파이프라인을 업로드하고 실행해 보겠습니다.  \n실행 후 Run을 눌러서 확인해보면 `Train from csv` 컴포넌트에서 실패했다고 나옵니다.\n\n![debug-0.png](./img/debug-0.png)\n\n실패한 컴포넌트를 클릭하고 로그를 확인해서 실패한 이유를 확인해 보겠습니다.\n\n![debug-2.png](./img/debug-2.png)\n\n로그를 확인하면 데이터의 개수가 0이어서 실행되지 않았다고 나옵니다.  \n분명 정상적으로 데이터를 전달했는데 왜 데이터의 개수가 0개일까요?  \n\n이제 입력받은 데이터에 어떤 문제가 있었는지 확인해 보겠습니다.  \n우선 컴포넌트를 클릭하고 Input/Output 탭에서 입력값으로 들어간 데이터들을 다운로드 받습니다.  
\n다운로드는 빨간색 네모로 표시된 곳의 링크를 클릭하면 됩니다.\n\n![debug-5.png](./img/debug-5.png)\n\n두 개의 파일을 같은 경로에 다운로드합니다.  \n그리고 해당 경로로 이동해서 파일을 확인합니다.\n\n```bash\nls\n```\n\n다음과 같이 두 개의 파일이 있습니다.\n\n```bash\ndrop-na-from-csv-output.tgz load-iris-data-target.tgz\n```\n\n압축을 풀어보겠습니다.\n\n```bash\ntar -xzvf load-iris-data-target.tgz ; mv data target.csv\ntar -xzvf drop-na-from-csv-output.tgz ; mv data data.csv\n```\n\n그리고 이를 주피터 노트북을 이용해 컴포넌트 코드를 실행합니다.\n\n![debug-3.png](./img/debug-3.png)\n\n디버깅을 해본 결과 dropna 할 때 column을 기준으로 drop을 해야 하는데 row를 기준으로 drop을 해서 데이터가 모두 사라졌습니다.\n이제 문제의 원인을 알아냈으니 column을 기준으로 drop이 되게 컴포넌트를 수정합니다.\n\n```python\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna(axis=\"columns\")\n    data.to_csv(output_path, index=False)\n```\n\n수정 후 파이프라인을 다시 업로드하고 실행하면 다음과 같이 정상적으로 수행하는 것을 확인할 수 있습니다.\n\n![debug-6.png](./img/debug-6.png)\n"
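이번 페이지에서 겪은 문제, 즉 row 기준 drop과 column 기준 drop의 차이는 pandas 없이도 다음과 같이 흉내 내어 확인해볼 수 있습니다. `dropna`의 동작을 단순화한 개념적 예시이며, 실제 디버깅에서는 위처럼 다운로드한 csv를 pandas로 직접 확인하는 것을 권장합니다.

```python
# dropna 의 axis 에 따라 결과가 어떻게 달라지는지를 단순화해 흉내 낸 예시입니다.
def drop_rows_with_na(rows):
    """None 이 하나라도 포함된 row 를 제거합니다 (dropna 의 기본 동작에 대응)."""
    return [row for row in rows if all(v is not None for v in row.values())]

def drop_columns_with_na(rows):
    """None 이 하나라도 포함된 column 을 제거합니다 (dropna(axis="columns") 에 대응)."""
    if not rows:
        return rows
    na_columns = {k for row in rows for k, v in row.items() if v is None}
    return [{k: v for k, v in row.items() if k not in na_columns} for row in rows]

# "sepal length (cm)" 컬럼 전체가 None 으로 채워진 상황
data = [
    {"sepal length (cm)": None, "sepal width (cm)": 3.5},
    {"sepal length (cm)": None, "sepal width (cm)": 3.0},
]

print(len(drop_rows_with_na(data)))     # 0 -> 모든 row 가 사라집니다
print(len(drop_columns_with_na(data)))  # 2 -> row 는 유지되고 문제 컬럼만 제거됩니다
```

한 컬럼 전체가 None 이면 row 기준 drop은 모든 row를 지워버리므로, 이번 경우에는 column 기준으로 drop 해야 데이터가 보존됩니다.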
  },
  {
    "path": "docs/kubeflow/kubeflow-concepts.md",
    "content": "---\ntitle : \"2. Kubeflow Concepts\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Component\n\n컴포넌트(Component)는 컴포넌트 콘텐츠(Component contents)와 컴포넌트 래퍼(Component wrapper)로 구성되어 있습니다.\n하나의 컴포넌트는 컴포넌트 래퍼를 통해 kubeflow에 전달되며 전달된 컴포넌트는 정의된 컴포넌트 콘텐츠를 실행(execute)하고 아티팩트(artifacts)들을 생산합니다.\n\n![concept-0.png](./img/concept-0.png)\n\n### Component Contents\n\n컴포넌트 콘텐츠를 구성하는 것은 총 3가지가 있습니다.\n\n![concept-1.png](./img/concept-1.png)\n\n1. Environemnt\n2. Python code w\\ Config\n3. Generates Artifacts\n\n예시와 함께 각 구성 요소가 어떤 것인지 알아보도록 하겠습니다.\n다음과 같이 데이터를 불러와 SVC(Support Vector Classifier)를 학습한 후 SVC 모델을 저장하는 과정을 적은 파이썬 코드가 있습니다.\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target= pd.read_csv(train_target_path)\n\nclf= SVC(\n    kernel=kernel\n)\nclf.fit(train_data)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n     dill.dump(clf, file_writer)\n```\n\n위의 파이썬 코드는 다음과 같이 컴포넌트 콘텐츠로 나눌 수 있습니다.\n\n![concept-2.png](./img/concept-2.png)\n\nEnvironment는 파이썬 코드에서 사용하는 패키지들을 import하는 부분입니다.  \n다음으로 Python Code w\\ Config 에서는 주어진 Config를 이용해 실제로 학습을 수행합니다.  \n마지막으로 아티팩트를 저장하는 과정이 있습니다.\n\n### Component Wrapper\n\n컴포넌트 래퍼는 컴포넌트 콘텐츠에 필요한 Config를 전달하고 실행시키는 작업을 합니다.\n\n![concept-3.png](./img/concept-3.png)\n\nKubeflow에서는 컴포넌트 래퍼를 위의 `train_svc_from_csv`와 같이 함수의 형태로 정의합니다.\n컴포넌트 래퍼가 콘텐츠를 감싸면 다음과 같이 됩니다.\n\n![concept-4.png](./img/concept-4.png)\n\n### Artifacts\n\n위의 설명에서 컴포넌트는 아티팩트(Artifacts)를 생성한다고 했습니다. 
아티팩트란 evaluation result, log 등 어떤 형태로든 파일로 생성되는 것을 통틀어서 칭하는 용어입니다.\n그중 우리가 관심을 두는 유의미한 것들은 다음과 같은 것들이 있습니다.\n\n![concept-5.png](./img/concept-5.png)\n\n- Model\n- Data\n- Metric\n- etc\n\n#### Model\n\n저희는 모델을 다음과 같이 정의 했습니다.\n\n> 모델이란 파이썬 코드와 학습된 Weights와 Network 구조 그리고 이를 실행시키기 위한 환경이 모두 포함된 형태\n\n#### Data\n\n데이터는 전 처리된 피처, 모델의 예측 값 등을 포함합니다.\n\n#### Metric\n\nMetric은 동적 지표와 정적 지표 두 가지로 나누었습니다.\n\n- 동적 지표란 train loss와 같이 학습이 진행되는 중 에폭(Epoch)마다 계속해서 변화하는 값을 의미합니다.\n- 정적 지표란 학습이 끝난 후 최종적으로 모델을 평가하는 정확도 등을 의미합니다.\n\n## Pipeline\n\n파이프라인은 컴포넌트의 집합과 컴포넌트를 실행시키는 순서도로 구성되어 있습니다. 이 때, 순서도는 방향 순환이 없는 그래프로 이루어져 있으며, 간단한 조건문을 포함할 수 있습니다.\n\n![concept-6.png](./img/concept-6.png)\n\n### Pipeline Config\n\n앞서 컴포넌트를 실행시키기 위해서는 Config가 필요하다고 설명했습니다. 파이프라인을 구성하는 컴포넌트의 Config 들을 모아 둔 것이 파이프라인 Config입니다.\n\n![concept-7.png](./img/concept-7.png)\n\n## Run\n\n파이프라인이 필요로 하는 파이프라인 Config가 주어져야지만 파이프라인을 실행할 수 있습니다.  \nKubeflow에서는 실행된 파이프라인을 Run 이라고 부릅니다.\n\n![concept-8.png](./img/concept-8.png)\n\n파이프라인이 실행되면 각 컴포넌트가 아티팩트들을 생성합니다.\nKubeflow pipeline에서는 Run 하나당 고유한 ID 를 생성하고, Run에서 생성되는 모든 아티팩트들을 저장합니다.\n\n![concept-9.png](./img/concept-9.png)\n\n그러면 이제 직접 컴포넌트와 파이프라인을 작성하는 방법에 대해서 알아보도록 하겠습니다.\n"
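"파이프라인 = 컴포넌트의 집합 + 방향 순환이 없는 순서도, 그리고 파이프라인 Config 는 컴포넌트 Config 의 모음" 이라는 개념은 다음과 같이 단순화해볼 수 있습니다. 실제 Kubeflow 구현이 아니라, 가상의 컴포넌트 두 개로 개념만 흉내 낸 스케치이며 파이썬 3.9 이상의 표준 라이브러리 `graphlib`을 가정합니다.

```python
# "컴포넌트 집합 + DAG 순서도 + 파이프라인 Config" 개념을 단순화한 가상의 예시입니다.
from graphlib import TopologicalSorter

# 컴포넌트: Config 와 이전 아티팩트를 받아 새 아티팩트를 생성하는 함수
components = {
    "load_data": lambda cfg, artifacts: {"data": [1, 2, 3]},
    "train": lambda cfg, artifacts: {"model": sum(artifacts["data"]) * cfg["lr"]},
}

# 순서도: train 은 load_data 가 끝난 뒤에 실행되어야 합니다 (방향 순환 없음)
dependencies = {"train": {"load_data"}, "load_data": set()}

def run_pipeline(pipeline_config):
    """DAG 순서대로 컴포넌트를 실행하고, 생성된 아티팩트를 모아 반환합니다."""
    artifacts = {}
    for name in TopologicalSorter(dependencies).static_order():
        component_config = pipeline_config.get(name, {})
        artifacts.update(components[name](component_config, artifacts))
    return artifacts

if __name__ == "__main__":
    # 파이프라인 Config: 각 컴포넌트의 Config 를 모아 둔 것
    print(run_pipeline({"train": {"lr": 0.1}}))
```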
  },
  {
    "path": "docs/kubeflow/kubeflow-intro.md",
    "content": "---\ntitle : \"1. Kubeflow Introduction\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\"]\n---\n\nKubeflow를 사용하기 위해서는 컴포넌트(Component)와 파이프라인(Pipeline)을 작성해야 합니다.\n\n*모두의 MLOps*에서 설명하는 방식은 [Kubeflow Pipeline 공식 홈페이지](https://www.kubeflow.org/docs/components/pipelines/overview/quickstart/)에서 설명하는 방식과는 다소 차이가 있습니다. 여기에서는 Kubeflow Pipeline을 워크플로(Workflow)가 아닌 앞서 설명한 [MLOps를 구성하는 요소](../kubeflow/kubeflow-concepts.md#component-contents) 중 하나의 컴포넌트로 사용하기 때문입니다.\n\n그럼 이제 컴포넌트와 파이프라인은 무엇이며 어떻게 작성할 수 있는지 알아보도록 하겠습니다.\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow UI Guide\",\n  \"position\": 5,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/experiments-and-others.md",
    "content": "---\ntitle : \"6. Kubeflow Pipeline 관련\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nCentral Dashboard의 왼쪽 탭의 Experiments(KFP), Pipelines, Runs, Recurring Runs, Artifacts, Executions 페이지들에서는 Kubeflow Pipeline과 Pipeline의 실행 그리고 Pipeline Run의 결과를 관리합니다.\n\n![left-tabs](./img/left-tabs.png)\n\nKubeflow Pipeline이 *모두의 MLOps*에서 Kubeflow를 사용하는 주된 이유이며, Kubeflow Pipeline을 만드는 방법, 실행하는 방법, 결과를 확인하는 방법 등 자세한 내용은 [3.Kubeflow](../kubeflow/kubeflow-intro)에서 다룹니다.\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/experiments.md",
    "content": "---\ntitle : \"5. Experiments(AutoML)\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Experiments(AutoML)을 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n![automl](./img/automl.png)\n\nExperiments(AutoML) 페이지는 Kubeflow에서 Hyperparameter Tuning과 Neural Architecture Search를 통한 AutoML을 담당하는 [Katib](https://www.kubeflow.org/docs/components/katib/overview/)를 관리할 수 있는 페이지입니다.\n\nKatib와 Experiments(AutoML)에 대한 사용법은 *모두의 MLOps* v1.0에서는 다루지 않으며, v2.0에 추가될 예정입니다.\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/intro.md",
    "content": "---\ntitle : \"1. Central Dashboard\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\n[Kubeflow 설치](../setup-components/install-components-kf.md)를 완료하면, 다음 커맨드를 통해 대시보드에 접속할 수 있습니다.\n\n```bash\nkubectl port-forward --address 0.0.0.0 svc/istio-ingressgateway -n istio-system 8080:80\n```\n\n![after-login](./img/after-login.png)\n\nCentral Dashboard는 Kubeflow에서 제공하는 모든 기능을 통합하여 제공하는 UI입니다. Central Dashboard에서 제공하는 기능은 크게 왼쪽의 탭을 기준으로 구분할 수 있습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n- Home\n- Notebooks\n- Tensorboards\n- Volumes\n- Models\n- Experiments(AutoML)\n- Experiments(KFP)\n- Pipelines\n- Runs\n- Recurring Runs\n- Artifacts\n- Executions\n\n그럼 이제 기능별 간단한 사용법을 알아보겠습니다.\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/notebooks.md",
    "content": "---\ntitle : \"2. Notebooks\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## 노트북 서버(Notebook Server) 생성하기\n\n다음 Central Dashboard의 왼쪽 탭의 Notebooks를 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n다음과 같은 화면을 볼 수 있습니다.\n\nNotebooks 탭은 JupyterHub와 비슷하게 유저별로 jupyter notebook 및 code server 환경(이하 노트북 서버)을 독립적으로 생성하고 접속할 수 있는 페이지입니다.\n\n![notebook-home](./img/notebook-home.png)\n\n오른쪽 위의 `+ NEW NOTEBOOK` 버튼을 클릭합니다.\n\n![new-notebook](./img/new-notebook.png)\n\n아래와 같은 화면이 나타나면, 이제 생성할 노트북 서버의 스펙(Spec)을 명시하여 생성합니다.\n\n![create](./img/create.png)\n\n<details>\n<summary>각 스펙에 대한 자세한 내용은 아래와 같습니다.</summary>\n\n- **name**:\n  - 노트북 서버를 구분할 수 있는 이름으로 생성합니다.\n- **namespace** :\n  - 따로 변경할 수 없습니다. (현재 로그인한 user 계정의 namespace이 자동으로 지정되어 있습니다.)\n- **Image**:\n  - sklearn, pytorch, tensorflow 등의 파이썬 패키지가 미리 설치된 jupyter lab 이미지 중 사용할 이미지를 선택합니다.\n    - 노트북 서버 내에서 GPU를 사용하여 tensorflow-cuda, pytorch-cuda 등의 이미지를 사용하는 경우, **하단의 GPUs** 부분을 확인하시기 바랍니다.\n  - 추가적인 패키지나 소스코드 등을 포함한 커스텀(Custom) 노트북 서버를 사용하고 싶은 경우에는 커스텀 이미지(Custom Image)를 만들고 배포 후 사용할 수도 있습니다.\n- **CPU / RAM**\n  - 필요한 자원 사용량을 입력합니다.\n    - cpu : core 단위\n      - 가상 core 개수 단위를 의미하며, int 형식이 아닌  `1.5`, `2.7` 등의 float 형식도 입력할 수 있습니다.\n    - memory : Gi 단위\n- **GPUs**\n  - 주피터 노트북에 할당할 GPU 개수를 입력합니다.\n    - `None`\n      - GPU 자원이 필요하지 않은 상황\n    - 1, 2, 4\n      - GPU 1, 2, 4 개 할당\n  - GPU Vendor\n    - 앞의 [(Optional) Setup GPU](../setup-kubernetes/setup-nvidia-gpu.md) 를 따라 nvidia gpu plugin을 설치하였다면 NVIDIA를 선택합니다.\n- **Workspace Volume**\n  - 노트북 서버 내에서 필요한 만큼의 디스크 용량을 입력합니다.\n  - Type 과 Name 은 변경하지 않고, **디스크 용량을 늘리고 싶거나** **AccessMode 를 변경하고 싶을** 때에만 변경해서 사용하시면 됩니다.\n    - **\"Don't use Persistent Storage for User's home\"** 체크박스는 노트북 서버의 작업 내용을 저장하지 않아도 상관없을 때에만 클릭합니다. 
**일반적으로는 누르지 않는 것을 권장합니다.**\n    - 기존에 미리 생성해두었던 PVC를 사용하고 싶을 때에는, Type을 \"Existing\" 으로 입력하여 해당 PVC의 이름을 입력하여 사용하시면 됩니다.\n- **Data Volumes**\n  - 추가적인 스토리지 자원이 필요하다면 **\"+ ADD VOLUME\"** 버튼을 클릭하여 생성할 수 있습니다.\n- ~~Configurations, Affinity/Tolerations, Miscellaneous Settings~~\n  - 일반적으로는 필요하지 않으므로 *모두의 MLOps*에서는 자세한 설명을 생략합니다.\n\n</details>\n\n모두 정상적으로 입력하였다면 하단의 **LAUNCH** 버튼이 활성화되며, 버튼을 클릭하면 노트북 서버 생성이 시작됩니다.\n\n![creating](./img/creating.png)\n\n생성 후 아래와 같이 **Status** 가 초록색 체크 표시 아이콘으로 변하며, **CONNECT 버튼**이 활성화됩니다.\n\n![created](./img/created.png)\n\n---\n\n## 노트북 서버 접속하기\n\n**CONNECT 버튼**을 클릭하면 브라우저에 새 창이 열리며, 다음과 같은 화면이 보입니다.\n\n![notebook-access](./img/notebook-access.png)\n\n**Launcher**의 Notebook, Console, Terminal 아이콘을 클릭하여 사용할 수 있습니다.\n\n  생성된 Notebook 화면\n\n![notebook-console](./img/notebook-console.png)\n\n  생성된 Terminal 화면\n\n![terminal-console](./img/terminal-console.png)\n\n---\n\n## 노트북 서버 중단하기\n\n노트북 서버를 오랜 시간 사용하지 않는 경우, 쿠버네티스 클러스터의 효율적인 리소스 사용을 위해서 노트북 서버를 중단(Stop)할 수 있습니다. **단, 이 경우 노트북 서버 생성 시 Workspace Volume 또는 Data Volume으로 지정해놓은 경로 외에 저장된 데이터는 모두 초기화되는 것에 주의하시기 바랍니다.**  \n노트북 서버 생성 당시 경로를 변경하지 않았다면, 디폴트(Default) Workspace Volume의 경로는 노트북 서버 내의 `/home/jovyan` 이므로, `/home/jovyan` 의 하위 경로 이외의 경로에 저장된 데이터는 모두 사라집니다.\n\n다음과 같이 `STOP` 버튼을 클릭하면 노트북 서버가 중단됩니다.\n\n![notebook-stop](./img/notebook-stop.png)\n\n중단이 완료되면 다음과 같이 `CONNECT` 버튼이 비활성화되며, `PLAY` 버튼을 클릭하면 다시 정상적으로 사용할 수 있습니다.\n\n![notebook-restart](./img/notebook-restart.png)\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/tensorboards.md",
    "content": "---\ntitle : \"3. Tensorboards\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Tensorboards를 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n다음과 같은 화면을 볼 수 있습니다.\n\n![tensorboard](./img/tensorboard.png)\n\nTensorboards 탭은 Tensorflow, PyTorch 등의 프레임워크에서 제공하는 Tensorboard 유틸이 생성한 ML 학습 관련 데이터를 시각화하는 텐서보드 서버(Tensorboard Server)를 쿠버네티스 클러스터에 생성하는 기능을 제공합니다.\n\n이렇게 생성한 텐서보드 서버는, 일반적인 원격 텐서보드 서버의 사용법과 같이 사용할 수도 있으며, [Kubeflow 파이프라인 런에서 바로 텐서보드 서버에 데이터를 저장하는 용도](https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#tensorboard)로 활용할 수 있습니다.\n\nKubeflow 파이프라인 런의 결과를 시각화하는 방법에는 [다양한 방식](https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/)이 있으며, *모두의 MLOps*에서는 더 일반적으로 활용할 수 있도록 Kubeflow 컴포넌트의 Visualization 기능과 MLflow의 시각화 기능을 활용할 예정이므로, Tensorboards 페이지에 대한 자세한 설명은 생략하겠습니다.\n"
  },
  {
    "path": "docs/kubeflow-dashboard-guide/volumes.md",
    "content": "---\ntitle : \"4. Volumes\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Volumes\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Volumes를 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n다음과 같은 화면을 볼 수 있습니다.\n\n![volumes](./img/volumes.png)\n\nVolumes 탭은 [Kubernetes의 볼륨(Volume)](https://kubernetes.io/ko/docs/concepts/storage/volumes/), 정확히는 [퍼시스턴트 볼륨 클레임(Persistent Volume Claim, 이하 pvc)](https://kubernetes.io/ko/docs/concepts/storage/persistent-volumes/) 중 현재 user의 namespace에 속한 pvc를 관리하는 기능을 제공합니다.\n\n위 스크린샷을 보면, [1. Notebooks](../kubeflow-dashboard-guide/notebooks) 페이지에서 생성한 Volume의 정보를 확인할 수 있습니다. 해당 Volume의 Storage Class는 쿠버네티스 클러스터 설치 당시 설치한 Default Storage Class인 local-path로 설정되어있음을 확인할 수 있습니다.\n\n이외에도 user namespace에 새로운 볼륨을 생성하거나, 조회하거나, 삭제하고 싶은 경우에 Volumes 페이지를 활용할 수 있습니다.\n\n---\n\n## 볼륨 생성하기\n\n오른쪽 위의 `+ NEW VOLUME` 버튼을 클릭하면 다음과 같은 화면을 볼 수 있습니다.\n\n![new-volume](./img/new-volume.png)\n\nname, size, storage class, access mode를 지정하여 생성할 수 있습니다.\n\n원하는 리소스 스펙을 지정하여 생성하면 다음과 같이 볼륨의 Status가 `Pending`으로 조회됩니다. `Status` 아이콘에 마우스 커서를 가져다 대면 *해당 볼륨은 mount하여 사용하는 first consumer가 나타날 때 실제로 생성을 진행한다(This volume will be bound when its first consumer is created.)*는 메시지를 확인할 수 있습니다.  \n이는 실습을 진행하는 [StorageClass](https://kubernetes.io/ko/docs/concepts/storage/storage-classes/)인 `local-path`의 볼륨 생성 정책에 해당하며, **문제 상황이 아닙니다.**  \n해당 페이지에서 Status가 `Pending` 으로 보이더라도 해당 볼륨을 사용하길 원하는 노트북 서버 혹은 파드(Pod)에서는 해당 볼륨의 이름을 지정하여 사용할 수 있으며, 그때 실제로 볼륨 생성이 진행됩니다.\n\n![creating-volume](./img/creating-volume.png)\n"
  },
  {
    "path": "docs/prerequisites/_category_.json",
    "content": "{\n  \"label\": \"Prerequisites\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/prerequisites/docker/_category_.json",
    "content": "{\n  \"label\": \"Docker\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/prerequisites/docker/advanced.md",
    "content": "---\ntitle : \"[Practice] Docker Advanced\"\ndescription: \"Practice to use docker more advanced way.\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 도커 이미지 잘 만들기\n\n### 도커 이미지를 만들 때 고려해야 될 점\n\nDockerfile 을 활용하여 도커 이미지를 만들 때는 명령어의 **순서**가 중요합니다.  \n그 이유는 도커 이미지는 여러 개의 Read-Only Layer 로 구성되어있고, 이미지를 빌드할 때 이미 존재하는 레이어는 **캐시되어** 재사용되기 때문에, 이를 생각해서 Dockerfile 을 구성한다면 **빌드 시간을 줄일 수 있습니다.**\n\nDockerfile에서 `RUN`, `ADD`, `COPY` 명령어 하나가 하나의 레이어로 저장됩니다.\n\n예를 들어서 다음과 같은 `Dockerfile`이 있습니다.\n\n```docker\n# Layer 1\nFROM ubuntu:latest\n\n# Layer 2\nRUN apt-get update && apt-get install python3 pip3 -y\n\n# Layer 3\nRUN pip3 install -U pip && pip3 install torch\n\n# Layer 4\nCOPY src/ src/\n\n# Layer 5\nCMD python src/app.py\n```\n\n위의 `Dockerfile`로 빌드된 이미지를 `docker run -it app:latest /bin/bash` 명령어로 실행하면 다음과 같은 레이어로 표현할 수 있습니다.\n\n![layers.png](./img/layers.png)\n\n최상단의 R/W Layer 는 이미지에 영향을 주지 않습니다. 즉, 컨테이너 내부에서 작업한 내역은 모두 휘발성입니다.\n\n하단의 레이어가 변경되면, 그 위의 레이어는 모두 새로 빌드됩니다. 그래서 Dockerfile 내장 명령어의 순서가 중요합니다.  \n예를 들면, **자주 변경**되는 부분은 **최대한 뒤쪽으로** 정렬하는 것을 추천합니다. (ex. `COPY src/ app/src/`)\n\n그렇기 때문에 반대로 변경되지 않는 부분은 최대한 앞쪽으로 정렬하는게 좋습니다.\n\n만약 거의 **변경되지 않지만**, 여러 곳에서 **자주** 쓰이는 부분을 공통화할 수도 있습니다.\n해당 공통부분만 묶어서 별도의 이미지는 미리 만들어둔 다음, **베이스 이미지** 로 활용하는 것이 좋습니다.\n\n예를 들어, 다른 건 거의 똑같은데, tensorflow-cpu 를 사용하는 이미지와, tensorflow-gpu 를 사용하는 환경을 분리해서 이미지로 만들고 싶은 경우에는 다음과 같이 할 수 있습니다.  \npython 과 기타 기본적인 패키지가 설치된 [`ghcr.io/makinarocks/python:3.8-base`](http://ghcr.io/makinarocks/python:3.8-base-cpu) 를 만들어두고, **tensorflow cpu 버전과 gpu 버전이** 설치된 이미지 새로 만들때는, 위의 이미지를 `FROM` 으로 불러온 다음, tensorflow install 하는 부분만 별도로 작성해서 Dockerfile 을 2 개로 관리한다면 가독성도 좋고 빌드 시간도 줄일 수 있습니다.\n\n**합칠 수 있는 Layer 는 합치는 것**이 Old version 의 도커에서는 성능 향상 효과를 이끌었습니다. 
여러분의 도커 컨테이너가 어떤 도커 버전에서 실행될 것인지 보장할 수 없으며, **가독성**을 위해서도 합칠 수 있는 Layer 는 적절히 합치는 것이 좋습니다.\n\n예를 들면, 다음과 같이 작성된 `Dockerfile`이 있습니다.\n\n```docker\n# Bad Case\nRUN apt-get update\nRUN apt-get install build-essential -y\nRUN apt-get install curl -y\nRUN apt-get install jq -y\nRUN apt-get install git -y\n```\n\n이를 아래와 같이 합쳐서 적을 수 있습니다.\n\n```docker\n# Better Case\nRUN apt-get update && \\\n    apt-get install -y \\\n    build-essential \\\n    curl \\\n    jq \\\n    git\n```\n\n편의를 위해서는 `.dockerignore` 도 사용하는게 좋습니다.\n`.dockerignore`는 `.gitignore` 와 비슷한 역할을 한다고 이해하면 됩니다. (git add 할 때 제외할 수 있듯이, docker build 할 때 자동으로 제외)\n\n더 많은 정보는 [Docker 공식 문서](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)에서 확인하실 수 있습니다.\n\n### ENTRYPOINT vs CMD\n\n`ENTRYPOINT` 와 `CMD` 는 모두 컨테이너의 실행 시점에서 어떤 명령어를 실행시키고 싶을 때 사용합니다.\n그리고 이 둘 중 하나는 반드시 존재해야 합니다.\n\n- **차이점**\n  - `CMD`: docker run 을 수행할 때, 쉽게 변경하여 사용할 수 있음\n  - `ENTRYPOINT`: `--entrypoint`  를 사용해야 변경할 수 있음\n\n`ENTRYPOINT` 와 `CMD` 가 함께 쓰일 때는 보통 `CMD`는 `ENTRYPOINT` 에서 적은 명령의 arguments(parameters) 를 의미합니다.\n\n예를 들어서 다음과 같은 `Dockerfile` 이 있습니다.\n\n```docker\nFROM ubuntu:latest\n\n# 아래 4 가지 option 을 바꿔가며 직접 테스트해보시면 이해하기 편합니다.\n# 단, NO ENTRYPOINT 옵션은 base image 인 ubuntu:latest 에 이미 있어서 테스트해볼 수는 없고 나머지 v2, 3, 5, 6, 8, 9, 11, 12 를 테스트해볼 수 있습니다.\n# ENTRYPOINT echo \"Hello ENTRYPOINT\"\n# ENTRYPOINT [\"echo\", \"Hello ENTRYPOINT\"]\n# CMD echo \"Hello CMD\"\n# CMD [\"echo\", \"Hello CMD\"]\n```\n\n위의 `Dockerfile`에서 주석으로 표시된 부분들을 해제하며 빌드하고 실행하면 다음과 같은 결과를 얻을 수 있습니다.\n\n|                    | No ENTRYPOINT  | ENTRYPOINT a b | ENTRYPOINT [\"a\", \"b\"] |\n| ------------------ | -------------- | -------------- | --------------------- |\n| **NO CMD**         | Error!         
| /bin/sh -c a b | a b                   |\n| **CMD [\"x\", \"y\"]** | x y            | /bin/sh -c a b | a b x y               |\n| **CMD x y**        | /bin/sh -c x y | /bin/sh -c a b | a b /bin/sh -c x y    |\n\n- In Kubernetes pod\n  - `ENTRYPOINT` → command\n  - `CMD` → args\n\n### Docker tag 이름 짓기\n\n도커 이미지의 tag 로 **latest 는 사용하지 않는 것을 권장**합니다.  \n이유는 latest 는 default tag name 이므로 **의도치 않게 overwritten** 되는 경우가 너무 많이 발생하기 때문입니다.\n\n하나의 이미지는 하나의 태그를 가짐(**uniqueness**)을 보장해야 추후 Production 단계에서 **협업/디버깅**에 용이합니다.  \n내용은 다르지만, 동일한 tag 를 사용하게 되면 추후 dangling image 로 취급되어 관리하기 어려워집니다.  \ndangling image는 `docker images`에는 나오지 않지만 계속해서 저장소를 차지하고 있습니다.\n\n### ETC\n\n1. log 등의 정보는 container 내부가 아닌 곳에 따로 저장합니다.\n    container 내부에서 write 한 data 는 언제든지 사라질 수 있기 때문입니다.\n2. secret 한 정보, 환경(dev/prod) dependent 한 정보 등은 Dockerfile 에 직접 적는 게 아니라, env var 또는 .env config file 을 사용합니다.\n3. Dockerfile **linter** 도 존재하므로, 협업 시에는 활용하면 좋습니다.\n    [https://github.com/hadolint/hadolint](https://github.com/hadolint/hadolint)\n\n## docker run 의 다양한 옵션\n\n### docker run with volume\n\nDocker container 사용 시 불편한 점이 있습니다.\n바로 Docker는 기본적으로 Docker **container 내부에서 작업한 모든 사항은 저장되지 않습니다.**\n이유는 Docker container 는 각각 격리된 파일시스템을 사용합니다. 따라서, **여러 docker container 끼리 데이터를 공유하기 어렵습니다.**\n\n이 문제를 해결하기 위해서 Docker에서 제공하는 방식은 **2 가지**가 있습니다.\n\n![storage.png](./img/storage.png)\n\n#### Docker volume\n\n- docker cli 를 사용해 `volume` 이라는 리소스를 직접 관리\n- host 에서 Docker area(`/var/lib/docker`) 아래에 특정 디렉토리를 생성한 다음, 해당 경로를 docker container 에 mount\n\n#### Bind mount\n\n- host 의 특정 경로를 docker container 에 mount\n\n#### How to use?\n\n사용 방식은 **동일한 인터페이스**로 `-v` 옵션을 통해 사용할 수 있습니다.  
\n다만, volume 을 사용할 때에는 `docker volume create`, `docker volume ls`, `docker volume rm` 등을 수행하여 직접 관리해주어야 합니다.\n\n- Docker volume\n\n    ```bash\n    docker run \\\n        -v my_volume:/app \\\n        nginx:latest\n    ```\n\n- Bind mount\n\n    ```bash\n    docker run \\\n        -v /home/user/some/path:/app \\\n        nginx:latest\n    ```\n\n로컬에서 개발할 때는 bind mount 가 편하긴 하지만, 환경을 깔끔하게 유지하고 싶다면 docker volume 을 사용하여 create, rm 을 명시적으로 수행하는 것도 하나의 방법입니다.\n\n쿠버네티스에서 스토리지를 제공하는 방식도 결국 docker 의 bind mount 를 활용하여 제공합니다.\n\n### docker run with resource limit\n\n기본적으로 docker container 는 **host OS 의 cpu, memory 자원을 fully 사용**할 수 있습니다. 하지만 이렇게 사용하게 되면 host OS 의 자원 상황에 따라서 **OOM** 등의 이슈로 docker container 가 비정상적으로 종료되는 상황이 발생할 수 있습니다.  \n이런 문제를 다루기 위해 **docker container 실행 시, cpu 와 memory 의 사용량 제한**을 걸 수 있는 `-m` [옵션](https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory)을 제공합니다.\n\n```bash\ndocker run -d -m 512m --memory-reservation=256m --name 512-limit ubuntu sleep 3600\ndocker run -d -m 1g --memory-reservation=256m --name 1g-limit ubuntu sleep 3600\n```\n\n위의 도커를 실행 후 `docker stats` 커맨드를 통해 사용량을 확인할 수 있습니다.\n\n```bash\nCONTAINER ID   NAME        CPU %     MEM USAGE / LIMIT   MEM %     NET I/O       BLOCK I/O   PIDS\n4ea1258e2e09   1g-limit    0.00%     300KiB / 1GiB       0.03%     1kB / 0B      0B / 0B     1\n4edf94b9a3e5   512-limit   0.00%     296KiB / 512MiB     0.06%     1.11kB / 0B   0B / 0B     1\n```\n\n쿠버네티스에서 pod 라는 리소스에 cpu, memory 제한을 줄 때, 이 방식을 활용하여 제공합니다.\n\n### docker run with restart policy\n\n특정 컨테이너가 계속해서 running 상태를 유지시켜야 하는 경우가 존재합니다. 
이런 경우를 위해서 해당 컨테이너가 종료되자마자 바로 재생성을 시도할 수 있는 `--restart=always` 옵션을 제공하고 있습니다.\n\n옵션 입력 후 도커를 실행합니다.\n\n```bash\ndocker run --restart=always ubuntu\n```\n\n`watch -n1 docker ps`를 통해 재실행이 되고 있는지 확인합니다.\n정상적으로 수행되고 있다면 다음과 같이 STATUS에 `Restarting (0)` 이 출력됩니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED          STATUS                         PORTS     NAMES\na911850276e8   ubuntu    \"bash\"    35 seconds ago   Restarting (0) 6 seconds ago             hungry_vaughan\n```\n\n- [https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart](https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart)\n  - on-failure with max retries\n  - always 등의 선택지 제공\n\n쿠버네티스에서 job 이라는 resource 의 restart 옵션을 줄 때, 이 방식을 활용하여 제공합니다.\n\n### docker run as a background process\n\n도커 컨테이너를 실행할 때는 기본적으로 foreground process 로 실행됩니다. 즉, 컨테이너를 실행한 터미널이 해당 컨테이너에 자동으로 attach 되어 있어, 다른 명령을 실행할 수 없습니다.\n\n다음과 같은 예시를 수행해봅니다.  \n우선 터미널 2 개를 열어, 하나의 터미널에서는 `docker ps` 를 지켜보고, 다른 하나의 터미널에서는 다음과 같은 명령을 차례로 실행해보며 동작을 지켜봅니다.\n\n#### First Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\n10 초동안 멈춰 있어야 하고, 해당 컨테이너에서 다른 명령을 수행할 수 없습니다. 10초 뒤에는 docker ps 에서 container 가 종료되는 것을 확인할 수 있습니다.\n\n#### Second Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\n이후, `ctrl + p` -> `ctrl + q`\n\n해당 터미널에서 이제 다른 명령을 수행할 수 있게 되었으며, docker ps 로도 10초까지는 해당 컨테이너가 살아있는 것을 확인할 수 있습니다.\n이렇게 docker container 내부에서 빠져나온 상황을 detached 라고 부릅니다.\n도커에서는 run 을 실행함과 동시에 detached mode 로 실행시킬 수 있는 옵션을 제공합니다.\n\n#### Third Practice\n\n```bash\ndocker run -d ubuntu sleep 10\n```\n\ndetached mode 이므로 해당 명령을 실행시킨 터미널에서 다른 액션을 수행시킬 수 있습니다.\n\n상황에 따라 detached mode 를 적절히 활용하면 좋습니다.  \n예를 들어, DB 와 통신하는 Backend API server 를 개발할 때 Backend API server 는 source code 를 변경시켜가면서 hot-loading 으로 계속해서 로그를 확인해봐야 하지만, DB 는 로그를 지켜볼 필요는 없는 경우라면 다음과 같이 실행할 수 있습니다.  
\nDB 는 docker container 를 detached mode 로 실행시키고, Backend API server 는 attached mode 로 log 를 following 하면서 실행시키면 효율적입니다.\n\n## References\n\n- [https://towardsdatascience.com/docker-storage-598e385f4efe](https://towardsdatascience.com/docker-storage-598e385f4efe)\n- [https://vsupalov.com/docker-latest-tag/](https://vsupalov.com/docker-latest-tag/)\n- [https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version](https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version)\n- [https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/](https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/)\n"
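위에서 설명한 "하단 레이어가 변경되면 그 위의 레이어는 모두 새로 빌드된다"는 레이어 캐시 동작은 다음과 같이 단순한 모델로 생각해볼 수 있습니다. 실제 도커 구현이 아니라, 변경 지점 이후의 레이어만 다시 빌드된다는 규칙을 흉내 낸 가상의 파이썬 스케치입니다.

```python
# 도커 레이어 캐시 무효화 규칙을 단순화한 가상의 모델입니다.
def layers_to_rebuild(instructions, changed_index):
    """changed_index 번째 명령어가 변경되었을 때 다시 빌드해야 하는 레이어 목록을 반환합니다."""
    # 변경된 레이어를 포함해 그 위(뒤)의 레이어는 캐시를 쓰지 못하고 모두 다시 빌드됩니다.
    return instructions[changed_index:]

dockerfile = [
    "FROM ubuntu:latest",
    "RUN apt-get update && apt-get install -y python3",
    "RUN pip3 install torch",
    "COPY src/ src/",  # 자주 변경되는 부분은 뒤쪽에 배치
]

if __name__ == "__main__":
    # 소스 코드만 변경된 경우: COPY 레이어 하나만 다시 빌드됩니다.
    print(layers_to_rebuild(dockerfile, 3))
    # 패키지 설치 명령이 변경된 경우: 그 이후의 레이어가 모두 다시 빌드됩니다.
    print(layers_to_rebuild(dockerfile, 2))
```

자주 변경되는 명령어를 뒤쪽에 둘수록 다시 빌드되는 레이어 수가 줄어든다는 점을 확인할 수 있습니다.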
  },
  {
    "path": "docs/prerequisites/docker/command.md",
    "content": "---\ntitle : \"[Practice] Docker command\"\ndescription: \"Practice to use docker command.\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 1. 정상 설치 확인\n\n```bash\ndocker run hello-world\n```\n\n정상적으로 설치된 경우 다음과 같은 메시지를 확인할 수 있습니다.\n\n```bash\nHello from Docker!\nThis message shows that your installation appears to be working correctly.\n....\n```\n\n**(For ubuntu)** sudo 없이 사용하고 싶다면 아래 사이트를 참고합니다.\n\n- [https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)\n\n## 2. Docker Pull\n\ndocker image registry(도커 이미지를 저장하고 공유할 수 있는 저장소)로부터 Docker image 를 로컬에 다운로드 받는 커맨드입니다.\n\n아래 커맨드를 통해 docker pull에서 사용 가능한 argument들을 확인할 수 있습니다.\n\n```bash\ndocker pull --help\n```\n\n정상적으로 수행되면 아래와 같이 출력됩니다.\n\n```bash\nUsage:  docker pull [OPTIONS] NAME[:TAG|@DIGEST]\n\nPull an image or a repository from a registry\n\nOptions:\n  -a, --all-tags                Download all tagged images in the repository\n      --disable-content-trust   Skip image verification (default true)\n      --platform string         Set platform if server is multi-platform capable\n  -q, --quiet                   Suppress verbose output\n```\n\n여기서 알 수 있는 것은 바로 docker pull은 두 개 타입의 argument를 받는다는 것을 알 수 있습니다.\n\n1. `[OPTIONS]`\n2. 
`NAME[:TAG|@DIGEST]`\n\nhelp에서 나온 `-a`, `-q` 옵션을 사용하기 위해서는 NAME 앞에서 사용해야 합니다.\n\n직접 `ubuntu:18.04` 이미지를 pull 해보겠습니다.\n\n```bash\ndocker pull ubuntu:18.04\n```\n\n위 명령어를 해석하면 `ubuntu` 라는 이름을 가진 이미지 중 `18.04` 태그가 달려있는 이미지를 가져오라는 뜻입니다.\n\n만약, 정상적으로 수행된다면 다음과 비슷하게 출력됩니다.\n\n```bash\n18.04: Pulling from library/ubuntu\n20d796c36622: Pull complete \nDigest: sha256:42cd9143b6060261187a72716906187294b8b66653b50d70bc7a90ccade5c984\nStatus: Downloaded newer image for ubuntu:18.04\ndocker.io/library/ubuntu:18.04\n```\n\n위의 명령어를 수행하면 [docker.io/library](http://docker.io/library/) 라는 이름의 registry 에서 ubuntu:18.04 라는 image 를 여러분의 노트북에 다운로드 받게 됩니다.\n\n- 참고사항\n  - 추후 [docker.io](http://docker.io) 나 public 한 docker hub 와 같은 registry 대신에, 특정 **private** 한 registry 에서 docker image 를 가져와야 하는 경우에는, [`docker login`](https://docs.docker.com/engine/reference/commandline/login/) 을 통해서 특정 registry 를 바라보도록 한 뒤, docker pull 을 수행하는 형태로 사용합니다. 혹은 insecure registry 를 설정하는 [방안](https://stackoverflow.com/questions/42211380/add-insecure-registry-to-docker)도 활용할 수 있습니다.\n  - 폐쇄망에서 docker image 를 `.tar` 파일과 같은 형태로 저장하고 공유할 수 있도록 [`docker save`](https://docs.docker.com/engine/reference/commandline/save/), [`docker load`](https://docs.docker.com/engine/reference/commandline/load/) 와 같은 명령어도 존재합니다.\n\n## 3. 
Docker images\n\n로컬에 존재하는 docker image 리스트를 출력하는 커맨드입니다.\n\n```bash\ndocker images --help\n```\n\ndocker images에서 사용할 수 있는 argument는 다음과 같습니다.\n\n```bash\nUsage:  docker images [OPTIONS] [REPOSITORY[:TAG]]\n\nList images\n\nOptions:\n  -a, --all             Show all images (default hides intermediate images)\n      --digests         Show digests\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print images using a Go template\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only show image IDs\n```\n\n아래 명령어를 이용해 직접 실행해 보겠습니다.\n\n```bash\ndocker images\n```\n\n만약 도커를 최초 설치 후 이 실습을 진행한다면 다음과 비슷하게 출력됩니다.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED      SIZE\nubuntu       18.04     29e70752d7b2   2 days ago   56.7MB\n```\n\n줄 수 있는 argument중 `-q`를 사용하면 `IMAGE ID` 만 출력됩니다.\n\n```bash\ndocker images -q\n```\n\n```bash\n29e70752d7b2\n```\n\n## 4. Docker ps\n\n현재 실행 중인 도커 컨테이너 리스트를 출력하는 커맨드입니다.\n\n```bash\ndocker ps --help\n```\n\ndocker ps에서 사용할 수 있는 argument는 다음과 같습니다.\n\n```bash\nUsage:  docker ps [OPTIONS]\n\nList containers\n\nOptions:\n  -a, --all             Show all containers (default shows just running)\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print containers using a Go template\n  -n, --last int        Show n last created containers (includes all states) (default -1)\n  -l, --latest          Show the latest created container (includes all states)\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only display container IDs\n  -s, --size            Display total file sizes\n```\n\n아래 명령어를 이용해 직접 실행해 보겠습니다.\n\n```bash\ndocker ps\n```\n\n현재 실행 중인 컨테이너가 없다면 다음과 같이 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\n만약 실행되는 컨테이너가 있다면 다음과 비슷하게 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND        CREATED          STATUS          PORTS     
NAMES\nc1e8f5e89d8d   ubuntu    \"sleep 3600\"   13 seconds ago   Up 12 seconds             trusting_newton\n```\n\n## 5. Docker run\n\n도커 컨테이너를 실행시키는 커맨드입니다.\n\n```bash\ndocker run --help\n```\n\ndocker run 의 사용법은 다음과 같습니다.\n\n```bash\nUsage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\n\nRun a command in a new container\n```\n\n여기서 확인해야 할 점은 docker run 이 세 가지 타입의 argument 를 받는다는 것입니다.\n\n1. `[OPTIONS]`\n2. `[COMMAND]`\n3. `[ARG...]`\n\n직접 도커 컨테이너를 실행해 보겠습니다.\n\n```bash\n## Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\ndocker run -it --name demo1 ubuntu:18.04 /bin/bash\n```\n\n- `-it` : `-i` 옵션 + `-t` 옵션\n  - container 를 실행시킴과 동시에 interactive 한 terminal 로 접속시켜주는 옵션\n- `--name` : name\n  - 컨테이너 id 대신, 구분하기 쉽도록 지정해주는 이름\n- `/bin/bash`\n  - 컨테이너를 실행시킴과 동시에 실행할 커맨드로, `/bin/bash` 는 bash 쉘을 여는 것을 의미합니다.\n\n실행 후 `exit` 명령어를 통해 컨테이너를 종료합니다.\n\n이제 앞서 배웠던 `docker ps` 명령어를 치면 다음과 같이 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\n실행 중인 컨테이너가 나온다고 했지만 어째서인지 방금 실행한 컨테이너가 보이지 않습니다.\n그 이유는 `docker ps`는 기본값으로 현재 실행 중인 컨테이너만 보여주기 때문입니다.\n\n만약 종료된 컨테이너들도 보고 싶다면 `-a` 옵션을 주어야 합니다.\n\n```bash\ndocker ps -a\n```\n\n그러면 다음과 같이 종료된 컨테이너 목록도 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND       CREATED         STATUS                     PORTS     NAMES\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"   2 minutes ago   Exited (0) 2 minutes ago             demo1\n```\n\n## 6. 
Docker exec\n\nDocker 컨테이너 내부에서 명령을 내리거나, 내부로 접속하는 커맨드입니다.\n\n```bash\ndocker exec --help\n```\n\n예를 들어서 다음과 같은 명령어를 실행해 보겠습니다.\n\n```bash\ndocker run -d --name demo2 ubuntu:18.04 sleep 3600\n```\n\n여기서 `-d` 옵션은 도커 컨테이너를 백그라운드에서 실행시켜서, 접속을 종료하더라도 계속 실행 중이 되도록 하는 옵션입니다.\n\n`docker ps`를 통해 현재 실행중인지 확인합니다.\n\n다음과 같이 실행 중임을 확인할 수 있습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED         STATUS         PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   4 seconds ago   Up 3 seconds             demo2\n```\n\n이제 `docker exec` 명령어를 통해서 실행중인 도커 컨테이너에 접속해 보겠습니다.\n\n```bash\ndocker exec -it demo2 /bin/bash\n```\n\n이전의 `docker run` 때와 동일하게 container 내부에 접속할 수 있습니다.\n\n`exit`을 통해 종료합니다.\n\n## 7. Docker logs\n\n도커 컨테이너의 log를 확인하는 커맨드입니다.\n\n```bash\ndocker logs --help\n```\n\n다음과 같은 컨테이너를 실행시키도록 하겠습니다.\n\n```bash\ndocker run --name demo3 -d busybox sh -c \"while true; do date; sleep 1; done\"\n```\n\n위 명령어를 통해서 demo3 라는 이름의 busybox 컨테이너를 백그라운드에서 실행하여, 1초에 한 번씩 현재 시간을 출력하도록 했습니다.\n\n이제 아래 명령어를 통해 log를 확인해 보겠습니다.\n\n```bash\ndocker logs demo3\n```\n\n정상적으로 수행되면 아래와 비슷하게 나옵니다.\n\n```bash\nSun Mar  6 11:06:49 UTC 2022\nSun Mar  6 11:06:50 UTC 2022\nSun Mar  6 11:06:51 UTC 2022\nSun Mar  6 11:06:52 UTC 2022\nSun Mar  6 11:06:53 UTC 2022\nSun Mar  6 11:06:54 UTC 2022\n```\n\n그런데 이렇게 사용할 경우 여태까지 찍힌 log 밖에 확인할 수 없습니다.  \n이때 `-f` 옵션을 이용해 계속 watch 하며 출력할 수 있습니다.\n\n```bash\ndocker logs demo3 -f\n```\n\n## 8. 
Docker stop\n\n실행 중인 도커 컨테이너를 중단시키는 커맨드입니다.\n\n```bash\ndocker stop --help\n```\n\n`docker ps`를 통해 현재 실행 중인 컨테이너를 확인하면 다음과 같습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED              STATUS              PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   About a minute ago   Up About a minute             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             4 minutes ago        Up 4 minutes                  demo2\n```\n\n이제 `docker stop` 을 통해 컨테이너를 정지해 보겠습니다.\n\n```bash\ndocker stop demo2\n```\n\n실행 후 `docker ps`를 다시 입력합니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS     NAMES\n730391669c39   busybox   \"sh -c 'while true; …\"   2 minutes ago   Up 2 minutes             demo3\n```\n\n위의 결과와 비교했을 때 demo2 컨테이너가 현재 실행 중인 컨테이너 목록에서 사라진 것을 확인할 수 있습니다.\n\n나머지 컨테이너도 정지합니다.\n\n```bash\ndocker stop demo3\n```\n\n## 9. Docker rm\n\n도커 컨테이너를 삭제하는 커맨드입니다.\n\n```bash\ndocker rm --help\n```\n\n도커 컨테이너는 종료되어도 기본적으로 삭제되지 않고 남아 있습니다. 그래서 `docker ps -a`를 통해서 종료된 컨테이너도 볼 수 있습니다.\n그런데 종료된 컨테이너는 왜 지워야 할까요?  
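
그 답을 확인해 보기 전에, 종료된 컨테이너만 골라서 조회해 볼 수도 있습니다. 아래는 앞서 help 에서 본 `docker ps` 의 `-f`(filter) 옵션과 `-s`(size) 옵션을 활용한 예시이며, 출력 내용은 각자의 환경에 따라 다를 수 있습니다.

```bash
# Exited(종료된) 상태의 컨테이너만 필터링해서 조회합니다.
docker ps -a -f status=exited

# -s 옵션을 함께 주면 각 컨테이너가 차지하는 용량(SIZE)도 확인할 수 있습니다.
docker ps -a -s
```

종료된 컨테이너도 이렇게 목록과 용량이 계속 잡히는 것을 확인할 수 있습니다.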
\n종료된 컨테이너에는 이전에 사용한 데이터가 아직 내부에 남아 있습니다.\n그래서 restart 등을 통해서 컨테이너를 재시작할 수 있습니다.\n그런데 그만큼 disk 공간을 계속 차지하게 됩니다.\n\n그래서 완전히 사용하지 않는 컨테이너를 지우기 위해서는 `docker rm` 명령어를 사용해야 합니다.\n\n우선 현재 컨테이너들을 확인합니다.\n\n```bash\ndocker ps -a\n```\n\n다음과 같이 3개의 컨테이너가 있습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED          STATUS                            PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   4 minutes ago    Exited (137) About a minute ago             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             7 minutes ago    Exited (137) 2 minutes ago                  demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"              10 minutes ago   Exited (0) 10 minutes ago                   demo1\n```\n\n아래 명령어를 통해 `demo3` 컨테이너를 삭제해 보겠습니다.\n\n```bash\ndocker rm demo3\n```\n\n`docker ps -a` 명령어를 치면 다음과 같이 2개로 줄었습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED          STATUS                       PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   13 minutes ago   Exited (137) 8 minutes ago             demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"    16 minutes ago   Exited (0) 16 minutes ago              demo1\n```\n\n나머지 컨테이너들도 삭제합니다.\n\n```bash\ndocker rm demo2\ndocker rm demo1\n```\n\n## 10. 
Docker rmi\n\n도커 이미지를 삭제하는 커맨드입니다.\n\n```bash\ndocker rmi --help\n```\n\n아래 명령어를 통해 현재 어떤 이미지들이 로컬에 있는지 확인합니다.\n\n```bash\ndocker images\n```\n\n다음과 같이 출력됩니다.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nbusybox      latest    a8440bba1bc0   32 hours ago   1.41MB\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\n`busybox` 이미지를 삭제해 보겠습니다.\n\n```bash\ndocker rmi busybox\n```\n\n다시 `docker images`를 칠 경우 다음과 같이 나옵니다.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\n## References\n\n- [https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry)\n"
  },
  {
    "path": "docs/prerequisites/docker/docker.md",
    "content": "---\ntitle : \"What is Docker?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n\n## 컨테이너\n\n- 컨테이너 가상화\n  - 어플리케이션을 어디에서나 동일하게 실행하는 기술\n- 컨테이너 이미지\n  - 어플리케이션을 실행시키기 위해 필요한 모든 파일들의 집합\n  - → 붕어빵 틀\n- 컨테이너란?\n  - 컨테이너 이미지를 기반으로 실행된 한 개의 프로세스\n  - → 붕어빵 틀로 찍어낸 붕어빵\n\n## 도커\n\n도커는 **컨테이너를 관리**하고 사용할 수 있게 해주는 플랫폼입니다.  \n이러한 도커의 슬로건은 바로 **Build Once, Run Anywhere** 로 어디에서나 동일한 실행 결과를 보장합니다.\n\n도커 내부에서 동작하는 과정을 보자면 실제로 container 를 위한 리소스를 분리하고, lifecycle 을 제어하는 기능은 linux kernel 의 cgroup 등이 수행합니다.\n하지만 이러한 인터페이스를 바로 사용하는 것은 **너무 어렵기 때문에** 다음과 같은 추상화 layer를 만들게 됩니다.\n\n![docker-layer.png](./img/docker-layer.png)\n\n이를 통해 사용자는 사용자 친화적인 API 인 **Docker CLI** 만으로 쉽게 컨테이너를 제어할 수 있습니다.\n\n## Layer 해석\n\n위에서 나온 layer들의 역할은 다음과 같습니다.\n\n1. runC: linux kernel 의 기능을 직접 사용해서, container 라는 하나의 프로세스가 사용할 네임스페이스와 cpu, memory, filesystem 등을 격리시켜주는 기능을 수행합니다.\n2. containerd: runC(OCI layer) 에게 명령을 내리기 위한 추상화 단계이며, 표준화된 인터페이스(OCI)를 사용합니다.\n3. dockerd: containerd 에게 명령을 내리는 역할만 합니다.\n4. docker cli: 사용자는 docker cli 로 dockerd (Docker daemon)에게 명령을 내리기만 하면 됩니다.\n    - 이 통신 과정에서 unix socket 을 사용하기 때문에 가끔 도커 관련 에러가 나면 `/var/run/docker.sock` 가 사용 중이다, 권한이 없다 등등의 에러 메시지가 나오는 것입니다.\n\n이처럼 도커는 많은 단계를 감싸고 있지만, 흔히 도커라는 용어를 사용할 때는 Docker CLI 를 말할 때도 있고, Dockerd 를 말할 때도 있고 Docker Container 하나를 말할 때도 있어서 혼란이 생길 수 있습니다.  \n앞으로 나오는 글에서도 도커가 여러가지 의미로 쓰일 수 있습니다.\n\n## For ML Engineer\n\n머신러닝 엔지니어가 도커를 사용하는 이유는 다음과 같습니다.\n\n1. 나의 ML 학습/추론 코드를 OS, python version, python 환경, 특정 python package 버전에 independent 하도록 해야 한다.\n2. 그래서 코드 뿐만이 아닌 **해당 코드가 실행되기 위해 필요한 모든 종속적인 패키지, 환경 변수, 폴더명 등등을 하나의 패키지로** 묶을 수 있는 기술이 컨테이너화 기술이다.\n3. 이 기술을 쉽게 사용하고 관리할 수 있는 소프트웨어 중 하나가 도커이며, 패키지를 도커 이미지라고 부른다.\n"
  },
  {
    "path": "docs/prerequisites/docker/images.md",
    "content": "---\ntitle : \"[Practice] Docker images\"\ndescription: \"Practice to use docker image.\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 1. Dockerfile 만들기\n\n도커 이미지를 만드는 가장 쉬운 방법은 도커에서 제공하는 템플릿인 Dockerfile을 사용하는 것입니다.  \n이외에는 running container 를 docker image 로 만드는 `docker commit` 등을 활용하는 방법이 있습니다.\n\n- `Dockerfile`\n  - 사용자가 도커 이미지를 쉽게 만들 수 있도록, 제공하는 템플릿\n  - 파일명은 꼭 `Dockerfile` 이 아니어도 상관없지만, `docker build` 수행 시, default 로 사용하는 파일명이 `Dockerfile` 입니다.\n  - 도커 이미지를 만드는 `docker build` 를 수행할 때, `-f` 옵션을 주면 다른 파일명으로도 사용 가능합니다.\n    - ex) `docker build -f dockerfile-asdf .` 도 가능\n\n1. 실습을 위해서 편한 디렉토리로 이동합니다.\n\n    ```bash\n    cd <SOME-DIRECTORY>\n    ```\n\n2. docker-practice 라는 이름의 폴더를 생성합니다.\n\n    ```bash\n    mkdir docker-practice\n    ```\n\n3. docker-practice 폴더로 이동합니다.\n\n    ```bash\n    cd docker-practice\n    ```\n\n4. Dockerfile 이라는 빈 파일을 생성합니다.\n\n    ```bash\n    touch Dockerfile\n    ```\n\n5. 정상적으로 생성되었는지 확인합니다.\n\n    ```bash\n    ls\n    ```\n\n## 2. Dockerfile 내장 명령어\n\nDockerfile 에서 사용할 수 있는 기본적인 명령어에 대해서 하나씩 알아보겠습니다.\n\n### FROM\n\nDockerfile 이 base image 로 어떠한 이미지를 사용할 것인지를 명시하는 명령어입니다.  \n도커 이미지를 만들 때, 아무것도 없는 빈 환경에서부터 하나하나씩 의도한 환경을 만들어가는 게 아니라, python 3.9 버전이 설치된 환경을 베이스로 해두고, 그 위에 pytorch 를 설치하고 소스코드만 넣어두는 형태로 활용할 수가 있습니다.  \n이러한 경우에는 `python:3.9`, `python:3.9-alpine`, ... 등의 잘 만들어진 이미지를 베이스로 활용합니다.\n\n```docker\nFROM <image>[:<tag>] [AS <name>]\n\n# 예시\nFROM ubuntu\nFROM ubuntu:18.04\nFROM nginx:latest AS ngx\n```\n\n### COPY\n\n**host(로컬)에서의 `<src>`** 경로의 파일 혹은 디렉토리를 **container 내부에서의 `<dest>`** 경로에 복사하는 명령어입니다.\n\n```docker\nCOPY <src>... 
<dest>\n\n# 예시\nCOPY a.txt /some-directory/b.txt\nCOPY my-directory /some-directory-2\n```\n\n`ADD` 는 `COPY` 와 비슷하지만 추가적인 기능을 품고 있습니다.\n\n```docker\n# 1 - 호스트에 압축되어있는 파일을 풀면서 컨테이너 내부로 copy 할 수 있음\nADD scripts.tar.gz /tmp\n# 2 - Remote URLs 에 있는 파일을 소스 경로로 지정할 수 있음\nADD http://www.example.com/script.sh /tmp\n\n# 위 두 가지 기능을 사용하고 싶을 경우에만 COPY 대신 ADD 를 사용하는 것을 권장\n```\n\n### RUN\n\n명시한 커맨드를 도커 컨테이너 내부에서 실행하는 명령어입니다.  \n도커 이미지는 해당 커맨드들이 실행된 상태를 유지합니다.\n\n```docker\nRUN <command>\nRUN [\"executable-command\", \"parameter1\", \"parameter2\"]\n\n# 예시\nRUN pip install torch\nRUN pip install -r requirements.txt\n```\n\n### CMD\n\n명시한 커맨드를 도커 컨테이너가 **시작될 때**, 실행하는 것을 명시하는 명령어입니다.  \n비슷한 역할을 하는 명령어로 **ENTRYPOINT** 가 있습니다. 이 둘의 차이에 대해서는 **뒤에서** 다룹니다.  \n하나의 도커 이미지에서는 하나의 **CMD** 만 실행할 수 있다는 점에서 **RUN** 명령어와 다릅니다.\n\n```docker\nCMD <command>\nCMD [\"executable-command\", \"parameter1\", \"parameter2\"]\nCMD [\"parameter1\", \"parameter2\"] # ENTRYPOINT 와 함께 사용될 때\n\n# 예시\nCMD python main.py\n```\n\n### WORKDIR\n\n이후 추가될 명령어를 컨테이너 내의 어떤 디렉토리에서 수행할 것인지를 명시하는 명령어입니다.  \n만약, 해당 디렉토리가 없다면 생성합니다.\n\n```docker\nWORKDIR /path/to/workdir\n\n# 예시\nWORKDIR /home/demo\nRUN pwd # /home/demo 가 출력됨\n```\n\n### ENV\n\n컨테이너 내부에서 지속적으로 사용될 environment variable 의 값을 설정하는 명령어입니다.\n\n```docker\nENV <KEY> <VALUE>\nENV <KEY>=<VALUE>\n\n# 예시\n# default 언어 설정\nRUN locale-gen ko_KR.UTF-8\nENV LANG ko_KR.UTF-8\nENV LANGUAGE ko_KR.UTF-8\nENV LC_ALL ko_KR.UTF-8\n```\n\n### EXPOSE\n\n컨테이너에서 뚫어줄 포트/프로토콜을 지정할 수 있습니다.  \n`<protocol>` 을 지정하지 않으면 TCP 가 디폴트로 설정됩니다.\n\n```docker\nEXPOSE <port>\nEXPOSE <port>/<protocol>\n\n# 예시\nEXPOSE 8080\n```\n\n## 3. 간단한 Dockerfile 작성해보기\n\n`vim Dockerfile` 혹은 vscode 등 본인이 사용하는 편집기로 `Dockerfile` 을 열어 다음과 같이 작성해줍니다.\n\n```docker\n# base image 를 ubuntu 18.04 로 설정합니다.\nFROM ubuntu:18.04\n\n# apt-get update 명령을 실행합니다.\nRUN apt-get update\n\n# TEST env var의 값을 hello 로 지정합니다.\nENV TEST hello\n\n# DOCKER CONTAINER 가 시작될 때, 환경변수 TEST 의 값을 출력합니다.\nCMD echo $TEST\n```\n\n## 4. 
Docker build from Dockerfile\n\n`docker build` 명령어로 Dockerfile 로부터 Docker Image 를 만들어봅니다.\n\n```bash\ndocker build --help\n```\n\nDockerfile 이 있는 경로에서 다음 명령을 실행합니다.\n\n```bash\ndocker build -t my-image:v1.0.0 .\n```\n\n위 커맨드를 설명하면 다음과 같습니다.\n\n- `.` : **현재 경로**에 있는 Dockerfile 로부터\n- `-t` : my-image 라는 **이름**과 v1.0.0 이라는 **태그**로 **이미지**를\n- 빌드하겠다는 명령어\n\n정상적으로 이미지가 빌드되었는지 확인해 보겠습니다.\n\n```bash\n# grep : 출력 중 my-image 가 포함된 줄만 걸러내는 명령어\ndocker images | grep my-image\n```\n\n정상적으로 수행된다면 다음과 같이 출력됩니다.\n\n```bash\nmy-image     v1.0.0    143114710b2d   3 seconds ago   87.9MB\n```\n\n## 5. Docker run from Dockerfile\n\n그럼 이제 방금 빌드한 `my-image:v1.0.0` 이미지로 docker 컨테이너를 **run** 해보겠습니다.\n\n```bash\ndocker run my-image:v1.0.0\n```\n\n정상적으로 수행된다면 다음과 같이 나옵니다.\n\n```bash\nhello\n```\n\n## 6. Docker run with env\n\n이번에는 방금 빌드한 `my-image:v1.0.0` 이미지를 실행하는 시점에, `TEST` env var 의 값을 변경하여 docker 컨테이너를 run 해보겠습니다.\n\n```bash\ndocker run -e TEST=bye my-image:v1.0.0\n```\n\n정상적으로 수행된다면 다음과 같이 나옵니다.\n\n```bash\nbye\n```\n"
  },
  {
    "path": "docs/prerequisites/docker/install.md",
    "content": "---\ntitle : \"Install Docker\"\ndescription: \"Install docker to start.\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Docker\n\n도커 실습을 위해 도커를 설치해야 합니다.  \n도커 설치는 어떤 OS를 사용하는지에 따라 달라집니다.  \n각 환경에 맞는 도커 설치는 공식 홈페이지를 참고해주세요.\n\n- [ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [mac](https://docs.docker.com/desktop/mac/install/)\n- [windows](https://docs.docker.com/desktop/windows/install/)\n\n## 설치 확인\n\n`docker run hello-world` 가 정상적으로 수행되는 OS, 터미널 환경이 필요합니다.\n\n| OS      | Docker Engine  | Terminal           |\n| ------- | -------------- | ------------------ |\n| MacOS   | Docker Desktop | zsh                |\n| Windows | Docker Desktop | Powershell         |\n| Windows | Docker Desktop | WSL2               |\n| Ubuntu  | Docker Engine  | bash               |\n\n## 들어가기 앞서서..\n\nMLOps를 사용하기 위해 필요한 도커 사용법을 설명하니 많은 비유와 예시가 MLOps 쪽으로 치중되어 있을 수 있습니다.\n"
  },
  {
    "path": "docs/prerequisites/docker/introduction.md",
    "content": "---\ntitle : \"Why Docker & Kubernetes ?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Why Kubernetes ?\n\n머신러닝 모델을 서비스화하기 위해서는 모델 개발 외에도 많은 **부가적인** 기능들이 필요합니다.\n\n1. 학습 단계\n    - 모델 학습 명령의 스케줄 관리\n    - 학습된 모델의 Reproducibility 보장\n2. 배포 단계\n    - 트래픽 분산\n    - 서비스 장애 모니터링\n    - 장애 시 트러블슈팅\n\n다행히도 이런 기능들에 대한 needs는 소프트웨어 개발 쪽에서 이미 많은 고민을 거쳐 발전되어 왔습니다.  \n따라서 머신러닝 모델을 배포할 때도 이런 고민의 결과물들을 활용하면 큰 도움을 받을 수 있습니다.\nMLOps에서 대표적으로 활용하는 소프트웨어 제품이 바로 도커와 쿠버네티스입니다.\n\n## 도커와 쿠버네티스\n\n### 기술 이름이 아니라 제품 이름\n\n도커와 쿠버네티스는 각각 컨테이너라이제이션(Containerization) 기능과 컨테이너 오케스트레이션(Container Orchestration) 기능을 제공하는 대표 소프트웨어(제품)입니다.\n\n#### 도커\n\n도커는 과거에 대세였지만 유료화 관련 정책들을 하나씩 추가하면서 점점 사용 빈도가 하락세입니다.\n하지만 2022년 3월 기준으로 아직까지도 가장 일반적으로 사용되는 컨테이너 가상화 소프트웨어입니다.\n\n![sysdig-2019.png](./img/sysdig-2019.png)\n\n<center> [from sysdig 2019] </center>\n\n![sysdig-2021.png](./img/sysdig-2021.png)\n\n<center> [from sysdig 2021]  </center>\n\n#### 쿠버네티스\n\n쿠버네티스는 지금까지는 비교 대상조차 거의 없는 제품입니다.\n\n![cncf-survey.png](./img/cncf-survey.png)\n\n<center> [from cncf survey] </center>\n\n![t4-ai.png](./img/t4-ai.png)\n\n<center> [from t4.ai]  </center>\n\n### **재미있는 오픈소스 역사 이야기**\n\n#### 초기 도커 & 쿠버네티스\n\n초기 도커 개발시에는 Docker Engine이라는 **하나의 패키지**에 API, CLI, 네트워크, 스토리지 등 여러 기능들을 모두 포함했으나, **MSA** 의 철학을 담아 **하나씩 분리**하기 시작했습니다.  \n하지만 초기의 쿠버네티스는 컨테이너 가상화를 위해 Docker Engine을 내장하고 있었습니다.  \n따라서 도커 버전이 업데이트될 때마다 Docker Engine 의 인터페이스가 변경되어 쿠버네티스에서 크게 영향을 받는 일이 계속해서 발생하였습니다.\n\n#### Open Container Initiative\n\n그래서 **이런 불편함을 해소**하고자, 도커를 중심으로 구글 등 컨테이너 기술에 관심있는 **여러 집단**들이 한데 모여 **Open Container Initiative,** 이하 **OCI**라는 프로젝트를 시작하여 컨테이너에 관한 **표준**을 정하는 일들을 시작하였습니다.  
\n도커에서도 인터페이스를 **한 번 더 분리**해서, OCI 표준을 준수하는 **containerd**라는 Container Runtime 을 개발하고, **dockerd** 가 containerd 의 API 를 호출하도록 추상화 레이어를 추가하였습니다.\n\n이러한 흐름에 맞추어서 쿠버네티스에서도 이제부터는 도커만을 지원하지 않고, **OCI 표준을** 준수하고, 정해진 스펙을 지키는 컨테이너 런타임은 무엇이든 쿠버네티스에서 사용할 수 있도록, Container Runtime Interface, 이하 **CRI 스펙**을 버전 1.5부터 제공하기 시작했습니다.\n\n#### CRI-O\n\nRed Hat, Intel, SUSE, IBM에서 **OCI 표준+CRI 스펙을** 따라 Kubernetes 전용 Container Runtime 을 목적으로 개발한 컨테이너 런타임입니다.\n\n#### 지금의 도커 & 쿠버네티스\n\n쿠버네티스는 Docker Engine 을 디폴트 컨테이너 런타임으로 사용해왔지만, 도커의 API 가 **CRI** 스펙에 맞지 않아(*OCI 는 따름*) 도커의 API를 **CRI**와 호환되게 바꿔주는 **dockershim**을 쿠버네티스 자체적으로 개발 및 지원해왔는데,(*도커 측이 아니라 쿠버네티스 측에서 지원했다는 점이 굉장히 큰 짐이었습니다.*) 이걸 쿠버네티스 **v1.20 부터는 Deprecated하고,** **v1.23 부터는 지원을 포기**하기로 결정하였습니다.\n\n- v1.23 은 2021 년 12월 릴리즈\n\n그래서 쿠버네티스 v1.23 부터는 도커를 native 하게 쓸 수 없습니다.  \n그렇지만 **사용자들은 이런 변화에 크게 영향을 받지 않습니다.**\n왜냐하면 Docker Engine을 통해 만들어진 도커 이미지는 OCI 표준을 준수하기 때문에, 쿠버네티스가 어떤 컨테이너 런타임으로 이루어져있든 사용 가능하기 때문입니다.\n\n### References\n\n- [*https://www.linkedin.com/pulse/containerd는-무엇이고-왜-중요할까-sean-lee/?originalSubdomain=kr*](https://www.linkedin.com/pulse/containerd%EB%8A%94-%EB%AC%B4%EC%97%87%EC%9D%B4%EA%B3%A0-%EC%99%9C-%EC%A4%91%EC%9A%94%ED%95%A0%EA%B9%8C-sean-lee/?originalSubdomain=kr)\n- [https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/](https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/)\n- [https://kubernetes.io/blog/2020/12/02/dockershim-faq/](https://kubernetes.io/blog/2020/12/02/dockershim-faq/)\n- [https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n- [https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n"
  },
  {
    "path": "docs/setup-components/_category_.json",
    "content": "{\n  \"label\": \"Setup Components\",\n  \"position\": 3,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/setup-components/install-components-kf.md",
    "content": "---\ntitle : \"1. Kubeflow\"\ndescription: \"구성요소 설치 - Kubeflow\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\n## 설치 파일 준비\n\nKubeflow **v1.4.0** 버전을 설치하기 위해서, 설치에 필요한 manifests 파일들을 준비합니다.\n\n[kubeflow/manifests Repository](https://github.com/kubeflow/manifests) 를 **v1.4.0** 태그로 깃 클론한 뒤, 해당 폴더로 이동합니다.\n\n```bash\ngit clone -b v1.4.0 https://github.com/kubeflow/manifests.git\ncd manifests\n```\n\n## 각 구성 요소별 설치\n\nkubeflow/manifests Repository 에 각 구성 요소별 설치 커맨드가 적혀 있지만, 설치하며 발생할 수 있는 이슈 혹은 정상적으로 설치되었는지 확인하는 방법이 적혀 있지 않아 처음 설치하는 경우 어려움을 겪는 경우가 많습니다.  \n따라서, 각 구성 요소별로 정상적으로 설치되었는지 확인하는 방법을 함께 작성합니다.  \n\n또한, 본 문서에서는 **모두의 MLOps** 에서 다루지 않는 구성요소인 Knative, KFServing, MPI Operator 는 리소스의 효율적 사용을 위해 따로 설치하지 않습니다.\n\n### Cert-manager\n\n1. cert-manager 를 설치합니다.\n\n  ```bash\n  kustomize build common/cert-manager/cert-manager/base | kubectl apply -f -\n  ```\n\n  정상적으로 설치되면 다음과 같이 출력됩니다.\n\n  ```bash\n  namespace/cert-manager created\n  customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created\n  serviceaccount/cert-manager created\n  serviceaccount/cert-manager-cainjector created\n  serviceaccount/cert-manager-webhook created\n  role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  role.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created\n  
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-edit created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-view created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  service/cert-manager created\n  service/cert-manager-webhook created\n  
deployment.apps/cert-manager created\n  deployment.apps/cert-manager-cainjector created\n  deployment.apps/cert-manager-webhook created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  ```\n\n  cert-manager namespace 의 3 개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  kubectl get pod -n cert-manager\n  ```\n\n  모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n  ```bash\n  NAME                                       READY   STATUS    RESTARTS   AGE\n  cert-manager-7dd5854bb4-7nmpd              1/1     Running   0          2m10s\n  cert-manager-cainjector-64c949654c-2scxr   1/1     Running   0          2m10s\n  cert-manager-webhook-6b57b9b886-7q6g2      1/1     Running   0          2m10s\n  ```\n\n2. kubeflow-issuer 를 설치합니다.\n\n  ```bash\n  kustomize build common/cert-manager/kubeflow-issuer/base | kubectl apply -f -\n  ```\n\n  정상적으로 설치되면 다음과 같이 출력됩니다.\n\n  ```bash\n  clusterissuer.cert-manager.io/kubeflow-self-signing-issuer created\n  ```\n\n- cert-manager-webhook 이슈\n\n  cert-manager-webhook deployment 가 Running 이 아닌 경우, 다음과 비슷한 에러가 발생하며 kubeflow-issuer가 설치되지 않을 수 있음에 주의하시기 바랍니다.  \n  해당 에러가 발생한 경우, cert-manager 의 3개의 pod 가 모두 Running 이 되는 것을 확인한 이후 다시 명령어를 수행하시기 바랍니다.\n\n  ```bash\n  Error from server: error when retrieving current configuration of:\n  Resource: \"cert-manager.io/v1alpha2, Resource=clusterissuers\", GroupVersionKind: \"cert-manager.io/v1alpha2, Kind=ClusterIssuer\"\n  Name: \"kubeflow-self-signing-issuer\", Namespace: \"\"\n  from server for: \"STDIN\": conversion webhook for cert-manager.io/v1, Kind=ClusterIssuer failed: Post \"https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s\": dial tcp 10.101.177.157:443: connect: connection refused\n  ```\n\n### Istio\n\n1. 
istio 관련 Custom Resource Definition(CRD) 를 설치합니다.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-crds/base | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/authorizationpolicies.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/istiooperators.install.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/peerauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/requestauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/sidecars.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadgroups.networking.istio.io created\n  ```\n\n2. istio namespace 를 설치합니다.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-namespace/base | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  namespace/istio-system created\n  ```\n\n3. 
istio 를 설치합니다.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-install/base | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  serviceaccount/istio-ingressgateway-service-account created\n  serviceaccount/istio-reader-service-account created\n  serviceaccount/istiod-service-account created\n  role.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  role.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istiod-istio-system created\n  rolebinding.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  rolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  configmap/istio created\n  configmap/istio-sidecar-injector created\n  service/istio-ingressgateway created\n  service/istiod created\n  deployment.apps/istio-ingressgateway created\n  deployment.apps/istiod created\n  envoyfilter.networking.istio.io/metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/stats-filter-1.8 created\n  envoyfilter.networking.istio.io/stats-filter-1.9 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.8 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.9 created\n  envoyfilter.networking.istio.io/x-forwarded-host created\n  gateway.networking.istio.io/istio-ingressgateway created\n  authorizationpolicy.security.istio.io/global-deny-all created\n  authorizationpolicy.security.istio.io/istio-ingressgateway created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/istio-sidecar-injector created\n  
validatingwebhookconfiguration.admissionregistration.k8s.io/istiod-istio-system created\n  ```\n\n  istio-system namespace 의 2 개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  kubectl get po -n istio-system\n  ```\n\n  모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n  ```bash\n  NAME                                   READY   STATUS    RESTARTS   AGE\n  istio-ingressgateway-79b665c95-xm22l   1/1     Running   0          16s\n  istiod-86457659bb-5h58w                1/1     Running   0          16s\n  ```\n\n### Dex\n\ndex 를 설치합니다.\n\n```bash\nkustomize build common/dex/overlays/istio | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nnamespace/auth created\ncustomresourcedefinition.apiextensions.k8s.io/authcodes.dex.coreos.com created\nserviceaccount/dex created\nclusterrole.rbac.authorization.k8s.io/dex created\nclusterrolebinding.rbac.authorization.k8s.io/dex created\nconfigmap/dex created\nsecret/dex-oidc-client created\nservice/dex created\ndeployment.apps/dex created\nvirtualservice.networking.istio.io/dex created\n```\n\nauth namespace 의 1 개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get po -n auth\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                   READY   STATUS    RESTARTS   AGE\ndex-5ddf47d88d-458cs   1/1     Running   1          12s\n```\n\n### OIDC AuthService\n\nOIDC AuthService 를 설치합니다.\n\n```bash\nkustomize build common/oidc-authservice/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nconfigmap/oidc-authservice-parameters created\nsecret/oidc-authservice-client created\nservice/authservice created\npersistentvolumeclaim/authservice-pvc created\nstatefulset.apps/authservice created\nenvoyfilter.networking.istio.io/authn-filter created\n```\n\nistio-system namespace 에 authservice-0 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get po -n istio-system -w\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                                   READY   STATUS    RESTARTS   
AGE\nauthservice-0                          1/1     Running   0          14s\nistio-ingressgateway-79b665c95-xm22l   1/1     Running   0          2m37s\nistiod-86457659bb-5h58w                1/1     Running   0          2m37s\n```\n\n### Kubeflow Namespace\n\nkubeflow namespace 를 생성합니다.\n\n```bash\nkustomize build common/kubeflow-namespace/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nnamespace/kubeflow created\n```\n\nkubeflow namespace 를 조회합니다.\n\n```bash\nkubectl get ns kubeflow\n```\n\n정상적으로 생성되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME       STATUS   AGE\nkubeflow   Active   8s\n```\n\n### Kubeflow Roles\n\nkubeflow-roles 를 설치합니다.\n\n```bash\nkustomize build common/kubeflow-roles/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-view created\nclusterrole.rbac.authorization.k8s.io/kubeflow-view created\n```\n\n방금 생성한 kubeflow roles 를 조회합니다.\n\n```bash\nkubectl get clusterrole | grep kubeflow\n```\n\n다음과 같이 총 6개의 clusterrole 이 출력됩니다.\n\n```bash\nkubeflow-admin                                                         2021-12-03T08:51:36Z\nkubeflow-edit                                                          2021-12-03T08:51:36Z\nkubeflow-kubernetes-admin                                              2021-12-03T08:51:36Z\nkubeflow-kubernetes-edit                                               2021-12-03T08:51:36Z\nkubeflow-kubernetes-view                                               2021-12-03T08:51:36Z\nkubeflow-view                                                          2021-12-03T08:51:36Z\n```\n\n### Kubeflow Istio Resources\n\nkubeflow-istio-resources 를 설치합니다.\n\n```bash\nkustomize build 
common/istio-1-9/kubeflow-istio-resources/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-view created\ngateway.networking.istio.io/kubeflow-gateway created\n```\n\n방금 생성한 kubeflow roles 를 조회합니다.\n\n```bash\nkubectl get clusterrole | grep kubeflow-istio\n```\n\n다음과 같이 총 3개의 clusterrole 이 출력됩니다.\n\n```bash\nkubeflow-istio-admin                                                   2021-12-03T08:53:17Z\nkubeflow-istio-edit                                                    2021-12-03T08:53:17Z\nkubeflow-istio-view                                                    2021-12-03T08:53:17Z\n```\n\nKubeflow namespace 에 gateway 가 정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get gateway -n kubeflow\n```\n\n정상적으로 생성되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME               AGE\nkubeflow-gateway   31s\n```\n\n### Kubeflow Pipelines\n\nkubeflow pipelines 를 설치합니다.\n\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/clusterworkflowtemplates.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/cronworkflows.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/workfloweventbindings.argoproj.io created\n...(생략)\nauthorizationpolicy.security.istio.io/ml-pipeline-visualizationserver created\nauthorizationpolicy.security.istio.io/mysql created\nauthorizationpolicy.security.istio.io/service-cache-server created\n```\n\n위 명령어는 여러 resources 를 한 번에 설치하고 있지만, 설치 순서의 의존성이 있는 리소스가 존재합니다.  
\n따라서 때에 따라 다음과 비슷한 에러가 발생할 수 있습니다.\n\n```bash\nerror: unable to recognize \"STDIN\": no matches for kind \"CompositeController\" in version \"metacontroller.k8s.io/v1alpha1\"\n```\n\n위와 비슷한 에러가 발생한다면, 10 초 정도 기다린 뒤 다시 위의 명령을 수행합니다.\n\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow\n```\n\n다음과 같이 총 16개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n```bash\nNAME                                                     READY   STATUS    RESTARTS   AGE\ncache-deployer-deployment-79fdf9c5c9-bjnbg               2/2     Running   1          5m3s\ncache-server-5bdf4f4457-48gbp                            2/2     Running   0          5m3s\nkubeflow-pipelines-profile-controller-7b947f4748-8d26b   1/1     Running   0          5m3s\nmetacontroller-0                                         1/1     Running   0          5m3s\nmetadata-envoy-deployment-5b4856dd5-xtlkd                1/1     Running   0          5m3s\nmetadata-grpc-deployment-6b5685488-kwvv7                 2/2     Running   3          5m3s\nmetadata-writer-548bd879bb-zjkcn                         2/2     Running   1          5m3s\nminio-5b65df66c9-k5gzg                                   2/2     Running   0          5m3s\nml-pipeline-8c4b99589-85jw6                              2/2     Running   1          5m3s\nml-pipeline-persistenceagent-d6bdc77bd-ssxrv             2/2     Running   0          5m3s\nml-pipeline-scheduledworkflow-5db54d75c5-zk2cw           2/2     Running   0          5m2s\nml-pipeline-ui-5bd8d6dc84-j7wqr                          2/2     Running   0          5m2s\nml-pipeline-viewer-crd-68fb5f4d58-mbcbg                  2/2     Running   1          5m2s\nml-pipeline-visualizationserver-8476b5c645-wljfm         2/2     Running   0          5m2s\nmysql-f7b9b7dd4-xfnw4                                    2/2     Running   0          5m2s\nworkflow-controller-5cbbb49bd8-5zrwx                
     2/2     Running   1          5m2s\n```\n\n추가로 ml-pipeline UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward svc/ml-pipeline-ui -n kubeflow 8888:80\n```\n\n웹 브라우저를 열어 [http://localhost:8888/#/pipelines/](http://localhost:8888/#/pipelines/) 경로에 접속합니다.\n\n다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![pipeline-ui](./img/pipeline-ui.png)\n\n- localhost 연결 거부 이슈\n\n![localhost-reject](./img/localhost-reject.png)\n\n만약 위와 같이 `localhost에서 연결을 거부했습니다` 라는 에러가 출력될 경우, port-forward 커맨드에 `--address` 옵션을 지정하여 접근할 수 있습니다.\n\n**보안상 문제가 되지 않는다면,** 아래와 같이 `0.0.0.0` 으로 모든 주소에 bind 하여 ml-pipeline UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward --address 0.0.0.0 svc/ml-pipeline-ui -n kubeflow 8888:80\n```\n\n- 위의 옵션으로 실행했음에도 여전히 연결 거부 이슈가 발생할 경우\n\n방화벽 설정에서 모든 tcp 포트에 대한 접속을 허용하거나, 8888번 포트에 대한 접속 허용 규칙을 추가합니다.\n\n웹 브라우저를 열어 `http://<당신의 가상 인스턴스 공인 ip 주소>:8888/#/pipelines/` 경로에 접속하면, ml-pipeline UI 화면이 출력되는 것을 확인할 수 있습니다.\n\n이후 다른 포트의 경로에 접속할 때도 위와 동일하게 `--address` 옵션을 지정하여 커맨드를 실행하고, 방화벽에 해당 포트 번호를 추가하면 접속할 수 있습니다.\n\n### Katib\n\nKatib 를 설치합니다.\n\n```bash\nkustomize build apps/katib/upstream/installs/katib-with-kubeflow | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/experiments.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/suggestions.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/trials.kubeflow.org created\nserviceaccount/katib-controller created\nserviceaccount/katib-ui created\nclusterrole.rbac.authorization.k8s.io/katib-controller created\nclusterrole.rbac.authorization.k8s.io/katib-ui created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-view created\nclusterrolebinding.rbac.authorization.k8s.io/katib-controller created\nclusterrolebinding.rbac.authorization.k8s.io/katib-ui created\nconfigmap/katib-config 
created\nconfigmap/trial-templates created\nsecret/katib-mysql-secrets created\nservice/katib-controller created\nservice/katib-db-manager created\nservice/katib-mysql created\nservice/katib-ui created\npersistentvolumeclaim/katib-mysql created\ndeployment.apps/katib-controller created\ndeployment.apps/katib-db-manager created\ndeployment.apps/katib-mysql created\ndeployment.apps/katib-ui created\ncertificate.cert-manager.io/katib-webhook-cert created\nissuer.cert-manager.io/katib-selfsigned-issuer created\nvirtualservice.networking.istio.io/katib-ui created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\nvalidatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep katib\n```\n\n다음과 같이 총 4 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkatib-controller-68c47fbf8b-b985z                        1/1     Running   0          82s\nkatib-db-manager-6c948b6b76-2d9gr                        1/1     Running   0          82s\nkatib-mysql-7894994f88-scs62                             1/1     Running   0          82s\nkatib-ui-64bb96d5bf-d89kp                                1/1     Running   0          82s\n```\n\n추가로 katib UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward svc/katib-ui -n kubeflow 8081:80\n```\n\n웹 브라우저를 열어 [http://localhost:8081/katib/](http://localhost:8081/katib/) 경로에 접속합니다.\n\n다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![katib-ui](./img/katib-ui.png)\n\n### Central Dashboard\n\nDashboard 를 설치합니다.\n\n```bash\nkustomize build apps/centraldashboard/upstream/overlays/istio | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nserviceaccount/centraldashboard created\nrole.rbac.authorization.k8s.io/centraldashboard created\nclusterrole.rbac.authorization.k8s.io/centraldashboard created\nrolebinding.rbac.authorization.k8s.io/centraldashboard created\nclusterrolebinding.rbac.authorization.k8s.io/centraldashboard 
created\nconfigmap/centraldashboard-config created\nconfigmap/centraldashboard-parameters created\nservice/centraldashboard created\ndeployment.apps/centraldashboard created\nvirtualservice.networking.istio.io/centraldashboard created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep centraldashboard\n```\n\nkubeflow namespace 에 centraldashboard 관련 1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\ncentraldashboard-8fc7d8cc-xl7ts                          1/1     Running   0          52s\n```\n\n추가로 Central Dashboard UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward svc/centraldashboard -n kubeflow 8082:80\n```\n\n웹 브라우저를 열어 [http://localhost:8082/](http://localhost:8082/) 경로에 접속합니다.\n\n다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![central-dashboard](./img/central-dashboard.png)\n\n### Admission Webhook\n\n```bash\nkustomize build apps/admission-webhook/upstream/overlays/cert-manager | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/poddefaults.kubeflow.org created\nserviceaccount/admission-webhook-service-account created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-cluster-role created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-admin created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-edit created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-view created\nclusterrolebinding.rbac.authorization.k8s.io/admission-webhook-cluster-role-binding created\nservice/admission-webhook-service created\ndeployment.apps/admission-webhook-deployment created\ncertificate.cert-manager.io/admission-webhook-cert created\nissuer.cert-manager.io/admission-webhook-selfsigned-issuer created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/admission-webhook-mutating-webhook-configuration created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep admission-webhook\n```\n\n1 개의 
pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nadmission-webhook-deployment-667bd68d94-2hhrx            1/1     Running   0          11s\n```\n\n### Notebooks & Jupyter Web App\n\n1. Notebook controller 를 설치합니다.\n\n  ```bash\n  kustomize build apps/jupyter/notebook-controller/upstream/overlays/kubeflow | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/notebooks.kubeflow.org created\n  serviceaccount/notebook-controller-service-account created\n  role.rbac.authorization.k8s.io/notebook-controller-leader-election-role created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-admin created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-edit created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-view created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-role created\n  rolebinding.rbac.authorization.k8s.io/notebook-controller-leader-election-rolebinding created\n  clusterrolebinding.rbac.authorization.k8s.io/notebook-controller-role-binding created\n  configmap/notebook-controller-config-m44cmb547t created\n  service/notebook-controller-service created\n  deployment.apps/notebook-controller-deployment created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep notebook-controller\n  ```\n\n  1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  notebook-controller-deployment-75b4f7b578-w4d4l          1/1     Running   0          105s\n  ```\n\n2. 
Jupyter Web App 을 설치합니다.\n\n  ```bash\n  kustomize build apps/jupyter/jupyter-web-app/upstream/overlays/istio | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  serviceaccount/jupyter-web-app-service-account created\n  role.rbac.authorization.k8s.io/jupyter-web-app-jupyter-notebook-role created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-cluster-role created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-admin created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-edit created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-view created\n  rolebinding.rbac.authorization.k8s.io/jupyter-web-app-jupyter-notebook-role-binding created\n  clusterrolebinding.rbac.authorization.k8s.io/jupyter-web-app-cluster-role-binding created\n  configmap/jupyter-web-app-config-76844k4cd7 created\n  configmap/jupyter-web-app-logos created\n  configmap/jupyter-web-app-parameters-chmg88cm48 created\n  service/jupyter-web-app-service created\n  deployment.apps/jupyter-web-app-deployment created\n  virtualservice.networking.istio.io/jupyter-web-app-jupyter-web-app created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep jupyter-web-app\n  ```\n\n  1개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  jupyter-web-app-deployment-6f744fbc54-p27ts              1/1     Running   0          2m\n  ```\n\n### Profiles + KFAM\n\nProfile Controller를 설치합니다.\n\n```bash\nkustomize build apps/profiles/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/profiles.kubeflow.org created\nserviceaccount/profiles-controller-service-account created\nrole.rbac.authorization.k8s.io/profiles-leader-election-role created\nrolebinding.rbac.authorization.k8s.io/profiles-leader-election-rolebinding created\nclusterrolebinding.rbac.authorization.k8s.io/profiles-cluster-role-binding 
created\nconfigmap/namespace-labels-data-48h7kd55mc created\nconfigmap/profiles-config-46c7tgh6fd created\nservice/profiles-kfam created\ndeployment.apps/profiles-deployment created\nvirtualservice.networking.istio.io/profiles-kfam created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep profiles-deployment\n```\n\n1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nprofiles-deployment-89f7d88b-qsnrd                       2/2     Running   0          42s\n```\n\n### Volumes Web App\n\nVolumes Web App 을 설치합니다.\n\n```bash\nkustomize build apps/volumes-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nserviceaccount/volumes-web-app-service-account created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-cluster-role created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-admin created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-edit created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-view created\nclusterrolebinding.rbac.authorization.k8s.io/volumes-web-app-cluster-role-binding created\nconfigmap/volumes-web-app-parameters-4gg8cm2gmk created\nservice/volumes-web-app-service created\ndeployment.apps/volumes-web-app-deployment created\nvirtualservice.networking.istio.io/volumes-web-app-volumes-web-app created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep volumes-web-app\n```\n\n1개의 pod가 Running 이 될 때까지 기다립니다.\n\n```bash\nvolumes-web-app-deployment-8589d664cc-62svl              1/1     Running   0          27s\n```\n\n### Tensorboard & Tensorboard Web App\n\n1. 
Tensorboard Web App 를 설치합니다.\n\n  ```bash\n  kustomize build apps/tensorboard/tensorboards-web-app/upstream/overlays/istio | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  serviceaccount/tensorboards-web-app-service-account created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-admin created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-edit created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-view created\n  clusterrolebinding.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role-binding created\n  configmap/tensorboards-web-app-parameters-g28fbd6cch created\n  service/tensorboards-web-app-service created\n  deployment.apps/tensorboards-web-app-deployment created\n  virtualservice.networking.istio.io/tensorboards-web-app-tensorboards-web-app created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep tensorboards-web-app\n  ```\n\n  1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  tensorboards-web-app-deployment-6ff79b7f44-qbzmw            1/1     Running             0          22s\n  ```\n\n2. 
Tensorboard Controller 를 설치합니다.\n\n  ```bash\n  kustomize build apps/tensorboard/tensorboard-controller/upstream/overlays/kubeflow | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/tensorboards.tensorboard.kubeflow.org created\n  serviceaccount/tensorboard-controller created\n  role.rbac.authorization.k8s.io/tensorboard-controller-leader-election-role created\n  clusterrole.rbac.authorization.k8s.io/tensorboard-controller-manager-role created\n  clusterrole.rbac.authorization.k8s.io/tensorboard-controller-proxy-role created\n  rolebinding.rbac.authorization.k8s.io/tensorboard-controller-leader-election-rolebinding created\n  clusterrolebinding.rbac.authorization.k8s.io/tensorboard-controller-manager-rolebinding created\n  clusterrolebinding.rbac.authorization.k8s.io/tensorboard-controller-proxy-rolebinding created\n  configmap/tensorboard-controller-config-bf88mm96c8 created\n  service/tensorboard-controller-controller-manager-metrics-service created\n  deployment.apps/tensorboard-controller-controller-manager created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep tensorboard-controller\n  ```\n\n  1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  tensorboard-controller-controller-manager-954b7c544-vjpzj   3/3     Running   1          73s\n  ```\n\n### Training Operator\n\nTraining Operator 를 설치합니다.\n\n```bash\nkustomize build apps/training-operator/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/mxjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/pytorchjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/tfjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/xgboostjobs.kubeflow.org created\nserviceaccount/training-operator created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-admin 
created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-view created\nclusterrole.rbac.authorization.k8s.io/training-operator created\nclusterrolebinding.rbac.authorization.k8s.io/training-operator created\nservice/training-operator created\ndeployment.apps/training-operator created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep training-operator\n```\n\n1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\ntraining-operator-7d98f9dd88-6887f                          1/1     Running   0          28s\n```\n\n### User Namespace\n\nKubeflow 사용을 위해, 사용할 User의 Kubeflow Profile 을 생성합니다.\n\n```bash\nkustomize build common/user-namespace/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nconfigmap/default-install-config-9h2h2b6hbk created\nprofile.kubeflow.org/kubeflow-user-example-com created\n```\n\nkubeflow-user-example-com profile 이 생성된 것을 확인합니다.\n\n```bash\nkubectl get profile\n```\n\n```bash\nkubeflow-user-example-com   37s\n```\n\n## 정상 설치 확인\n\nKubeflow central dashboard에 web browser로 접속하기 위해 포트 포워딩합니다.\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nWeb Browser 를 열어 [http://localhost:8080](http://localhost:8080) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-ui](./img/login-after-install.png)\n\n다음 접속 정보를 입력하여 접속합니다.\n\n- Email Address: `user@example.com`\n- Password: `12341234`\n\n![central-dashboard](./img/after-login.png)\n\n"
  },
  {
    "path": "docs/setup-components/install-components-mlflow.md",
    "content": "---\ntitle : \"2. MLflow Tracking Server\"\ndescription: \"구성요소 설치 - MLflow\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Install MLflow Tracking Server\n\nMLflow는 대표적인 오픈소스 ML 실험 관리 도구입니다. MLflow는 [실험 관리 용도](https://mlflow.org/docs/latest/tracking.html#tracking) 외에도 [ML Model 패키징](https://mlflow.org/docs/latest/projects.html#projects), [ML 모델 배포 관리](https://mlflow.org/docs/latest/models.html#models), [ML 모델 저장](https://mlflow.org/docs/latest/model-registry.html#registry)과 같은 기능도 제공하고 있습니다.\n\n*모두의 MLOps*에서는 MLflow를 실험 관리 용도로 사용합니다.  \n그래서 MLflow에서 관리하는 데이터를 저장하고 UI를 제공하는 MLflow Tracking Server를 쿠버네티스 클러스터에 배포하여 사용할 예정입니다.\n\n## Before Install MLflow Tracking Server\n\n### PostgreSQL DB 설치\n\nMLflow Tracking Server가 Backend Store로 사용할 용도의 PostgreSQL DB를 쿠버네티스 클러스터에 배포합니다.\n\n먼저 `mlflow-system`이라는 namespace 를 생성합니다.\n\n```bash\nkubectl create ns mlflow-system\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 생성된 것을 의미합니다.\n\n```bash\nnamespace/mlflow-system created\n```\n\npostgresql DB를 `mlflow-system` namespace 에 생성합니다.\n\n```bash\nkubectl -n mlflow-system apply -f https://raw.githubusercontent.com/mlops-for-all/helm-charts/b94b5fe4133f769c04b25068b98ccfa7a505aa60/mlflow/manifests/postgres.yaml \n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nservice/postgresql-mlflow-service created\ndeployment.apps/postgresql-mlflow created\npersistentvolumeclaim/postgresql-mlflow-pvc created\n```\n\nmlflow-system namespace 에 1개의 postgresql 관련 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n mlflow-system | grep postgresql\n```\n\n다음과 비슷하게 출력되면 정상적으로 실행된 것입니다.\n\n```bash\npostgresql-mlflow-7b9bc8c79f-srkh7   1/1     Running   0          38s\n```\n\n### Minio 설정\n\nMLflow Tracking Server가 Artifacts Store로 사용할 용도의 Minio는 이전 Kubeflow 설치 단계에서 설치한 Minio를 활용합니다.  \n단, kubeflow 용도와 mlflow 용도를 분리하기 위해, mlflow 전용 버킷(bucket)을 생성하겠습니다.  
\nminio 에 접속하여 버킷을 생성하기 위해, 우선 minio-service 를 포트포워딩합니다.\n\n```bash\nkubectl port-forward svc/minio-service -n kubeflow 9000:9000\n```\n\n웹 브라우저를 열어 [localhost:9000](http://localhost:9000)으로 접속하면 다음과 같은 화면이 출력됩니다.\n\n![minio-install](./img/minio-install.png)\n\n다음과 같은 접속 정보를 입력하여 로그인합니다.\n\n- Username: `minio`\n- Password: `minio123`\n\n우측 하단의 **`+`** 버튼을 클릭한 뒤, `Create Bucket`을 클릭합니다.\n\n![create-bucket](./img/create-bucket.png)\n\n`Bucket Name`에 `mlflow`를 입력하여 버킷을 생성합니다.\n\n정상적으로 생성되면 다음과 같이 왼쪽에 `mlflow`라는 이름의 버킷이 생성됩니다.\n\n![mlflow-bucket](./img/mlflow-bucket.png)\n\n---\n\n## Let's Install MLflow Tracking Server\n\n### Helm Repository 추가\n\n```bash\nhelm repo add mlops-for-all https://mlops-for-all.github.io/helm-charts\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 추가된 것을 의미합니다.\n\n```bash\n\"mlops-for-all\" has been added to your repositories\n```\n\n### Helm Repository 업데이트\n\n```bash\nhelm repo update\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 업데이트된 것을 의미합니다.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"mlops-for-all\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nmlflow-server Helm Chart 0.2.0 버전을 설치합니다.\n\n```bash\nhelm install mlflow-server mlops-for-all/mlflow-server \\\n  --namespace mlflow-system \\\n  --version 0.2.0\n```\n\n- **주의**: 위의 helm chart는 MLflow 의 backend store 와 artifacts store 의 접속 정보를 kubeflow 설치 과정에서 생성한 minio와 위의 [PostgreSQL DB 설치](#postgresql-db-설치)에서 생성한 postgresql 정보를 default로 하여 설치합니다.\n  - 별개로 생성한 DB 혹은 Object storage를 활용하고 싶은 경우, [Helm Chart Repo](https://github.com/mlops-for-all/helm-charts/tree/main/mlflow/chart)를 참고하여 helm install 시 value를 따로 설정하여 설치하시기 바랍니다.\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\nNAME: mlflow-server\nLAST DEPLOYED: Sat Dec 18 22:02:13 2021\nNAMESPACE: mlflow-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get pod -n mlflow-system | grep mlflow-server\n```\n\nmlflow-system namespace 에 1 개의 mlflow-server 관련 pod 가 Running 이 될 때까지 기다립니다.  \n다음과 비슷하게 출력되면 정상적으로 실행된 것입니다.\n\n```bash\nmlflow-server-ffd66d858-6hm62        1/1     Running   0          74s\n```\n\n### 정상 설치 확인\n\n그럼 이제 MLflow Server에 정상적으로 접속되는지 확인해보겠습니다.\n\n우선 클라이언트 노드에서 접속하기 위해, 포트포워딩을 수행합니다.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\n웹 브라우저를 열어 [localhost:5000](http://localhost:5000)으로 접속하면 다음과 같은 화면이 출력됩니다.\n\n![mlflow-install](./img/mlflow-install.png)\n\n"
  },
  {
    "path": "docs/setup-components/install-components-pg.md",
    "content": "---\ntitle : \"4. Prometheus & Grafana\"\ndescription: \"구성요소 설치 - Prometheus & Grafana\"\nsidebar_position: 4\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Prometheus & Grafana\n\n프로메테우스(Prometheus) 와 그라파나(Grafana) 는 모니터링을 위한 도구입니다.  \n안정적인 서비스 운영을 위해서는 서비스와 서비스가 운영되고 있는 인프라의 상태를 지속해서 관찰하고, 관찰한 메트릭을 바탕으로 문제가 생길 때 빠르게 대응해야 합니다.  \n이러한 모니터링을 효율적으로 수행하기 위한 많은 도구 중 *모두의 MLOps*에서는 오픈소스인 프로메테우스와 그라파나를 사용할 예정입니다.\n\n더 자세한 내용은 [Prometheus 공식 문서](https://prometheus.io/docs/introduction/overview/), [Grafana 공식 문서](https://grafana.com/docs/)를 확인해주시기를 바랍니다.\n\n프로메테우스는 다양한 대상으로부터 Metric을 수집하는 도구이며, 그라파나는 모인 데이터를 시각화하는 것을 도와주는 도구입니다. 서로 간의 종속성은 없지만 상호 보완적으로 사용할 수 있어 함께 사용되는 경우가 많습니다.\n\n이번 페이지에서는 쿠버네티스 클러스터에 프로메테우스와 그라파나를 설치한 뒤, Seldon-Core 로 생성한 SeldonDeployment 로 API 요청을 보내, 정상적으로 Metrics 이 수집되는지 확인해보겠습니다.\n\n본 글에서는 seldonio/seldon-core-analytics Helm Chart 1.12.0 버전을 활용해 쿠버네티스 클러스터에 프로메테우스와 그라파나를 설치하고, Seldon-Core 에서 생성한 SeldonDeployment의 Metrics 을 효율적으로 확인하기 위한 대시보드도 함께 설치합니다.\n\n### Helm Repository 추가\n\n```bash\nhelm repo add seldonio https://storage.googleapis.com/seldon-charts\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 추가된 것을 의미합니다.\n\n```bash\n\"seldonio\" has been added to your repositories\n```\n\n### Helm Repository 업데이트\n\n```bash\nhelm repo update\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 업데이트된 것을 의미합니다.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"seldonio\" chart repository\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nseldon-core-analytics Helm Chart 1.12.0 버전을 설치합니다.\n\n```bash\nhelm install seldon-core-analytics seldonio/seldon-core-analytics \\\n  --namespace seldon-system \\\n  --version 1.12.0\n```\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\n생략...\nNAME: seldon-core-analytics\nLAST DEPLOYED: Tue Dec 14 18:29:38 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-core-analytics\n```\n\nseldon-system namespace 에 6개의 seldon-core-analytics 관련 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nseldon-core-analytics-grafana-657c956c88-ng8wn                  2/2     Running   0          114s\nseldon-core-analytics-kube-state-metrics-94bb6cb9-svs82         1/1     Running   0          114s\nseldon-core-analytics-prometheus-alertmanager-64cf7b8f5-nxbl8   2/2     Running   0          114s\nseldon-core-analytics-prometheus-node-exporter-5rrj5            1/1     Running   0          114s\nseldon-core-analytics-prometheus-pushgateway-8476474cff-sr4n6   1/1     Running   0          114s\nseldon-core-analytics-prometheus-seldon-685c664894-7cr45        2/2     Running   0          114s\n```\n\n### 정상 설치 확인\n\n그럼 이제 그라파나에 정상적으로 접속되는지 확인해보겠습니다.\n\n우선 클라이언트 노드에서 접속하기 위해, 포트포워딩을 수행합니다.\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\n웹 브라우저를 열어 [localhost:8090](http://localhost:8090)으로 접속하면 다음과 같은 화면이 출력됩니다.\n\n![grafana-install](./img/grafana-install.png)\n\n다음과 같은 접속정보를 입력하여 접속합니다.\n\n- Email or username : `admin`\n- Password : `password`\n\n로그인하면 다음과 같은 화면이 출력됩니다.\n\n![grafana-login](./img/grafana-login.png)\n\n좌측의 대시보드 아이콘을 클릭하여, `Manage` 버튼을 클릭합니다.\n\n![dashboard-click](./img/dashboard-click.png)\n\n기본적인 그라파나 대시보드가 포함되어있는 것을 확인할 수 있습니다. 
이 중 `Prediction Analytics` 대시보드를 클릭합니다.\n\n![dashboard](./img/dashboard.png)\n\nSeldon Core API Dashboard 가 보이고, 다음과 같이 출력되는 것을 확인할 수 있습니다.\n\n![seldon-dashboard](./img/seldon-dashboard.png)\n\n## References\n\n- [Seldon-Core-Analytics Helm Chart](https://github.com/SeldonIO/seldon-core/tree/master/helm-charts/seldon-core-analytics)\n"
  },
  {
    "path": "docs/setup-components/install-components-seldon.md",
"content": "---\ntitle : \"3. Seldon-Core\"\ndescription: \"구성요소 설치 - Seldon-Core\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Seldon-Core\n\nSeldon-Core는 쿠버네티스 환경에 수많은 머신러닝 모델을 배포하고 관리할 수 있는 오픈소스 프레임워크 중 하나입니다.  \n더 자세한 내용은 Seldon-Core 의 공식 [제품 설명 페이지](https://www.seldon.io/tech/products/core/) 와 [깃헙](https://github.com/SeldonIO/seldon-core) 그리고 API Deployment 파트를 참고해주시기를 바랍니다.\n\n## Seldon-Core 설치\n\nSeldon-Core를 사용하기 위해서는 쿠버네티스의 인그레스(Ingress)를 담당하는 Ambassador 와 Istio 와 같은 [모듈이 필요합니다](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).  \nSeldon-Core 에서는 Ambassador 와 Istio 만을 공식적으로 지원하며, *모두의 MLOps*에서는 인그레스로 Ambassador 를 사용하므로 Ambassador 를 설치하겠습니다.\n\n### Ambassador - Helm Repository 추가\n\n```bash\nhelm repo add datawire https://www.getambassador.io\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 추가된 것을 의미합니다.\n\n```bash\n\"datawire\" has been added to your repositories\n```\n\n### Ambassador - Helm Repository 업데이트\n\n```bash\nhelm repo update\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 업데이트된 것을 의미합니다.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Ambassador - Helm Install\n\nambassador Chart 6.9.3 버전을 설치합니다.\n\n```bash\nhelm install ambassador datawire/ambassador \\\n  --namespace seldon-system \\\n  --create-namespace \\\n  --set image.repository=quay.io/datawire/ambassador \\\n  --set enableAES=false \\\n  --set crds.keep=false \\\n  --version 6.9.3\n```\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\n생략...\n\nW1206 17:01:36.026326   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role\nW1206 17:01:36.029764   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding\nNAME: ambassador\nLAST DEPLOYED: Mon Dec  6 17:01:34 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\nNOTES:\n-------------------------------------------------------------------------------\n  Congratulations! You've successfully installed Ambassador!\n\n-------------------------------------------------------------------------------\nTo get the IP address of Ambassador, run the following commands:\nNOTE: It may take a few minutes for the LoadBalancer IP to be available.\n     You can watch the status of by running 'kubectl get svc -w  --namespace seldon-system ambassador'\n\n  On GKE/Azure:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')\n\n  On AWS:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')\n\n  echo http://$SERVICE_IP:\n\nFor help, visit our Slack at http://a8r.io/Slack or view the documentation online at https://www.getambassador.io.\n```\n\nseldon-system 에 4 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n seldon-system\n```\n\n```bash\nambassador-7f596c8b57-4s9xh                  1/1     Running   0          
7m15s\nambassador-7f596c8b57-dt6lr                  1/1     Running   0          7m15s\nambassador-7f596c8b57-h5l6f                  1/1     Running   0          7m15s\nambassador-agent-77bccdfcd5-d5jxj            1/1     Running   0          7m15s\n```\n\n### Seldon-Core - Helm Install\n\nseldon-core-operator Chart 1.11.2 버전을 설치합니다.\n\n```bash\nhelm install seldon-core seldon-core-operator \\\n    --repo https://storage.googleapis.com/seldon-charts \\\n    --namespace seldon-system \\\n    --set usageMetrics.enabled=true \\\n    --set ambassador.enabled=true \\\n    --version 1.11.2\n```\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\n생략...\n\nW1206 17:05:38.336391   28181 warnings.go:70] admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration\nNAME: seldon-core\nLAST DEPLOYED: Mon Dec  6 17:05:34 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\nseldon-system namespace 에 1 개의 seldon-controller-manager pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-controller\n```\n\n```bash\nseldon-controller-manager-8457b8b5c7-r2frm   1/1     Running   0          2m22s\n```\n\n## References\n\n- [Example Model Servers with Seldon](https://docs.seldon.io/projects/seldon-core/en/latest/examples/server_examples.html#examples-server-examples--page-root)\n"
  },
  {
    "path": "docs/setup-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"Setup Kubernetes\",\n  \"position\": 2,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/setup-kubernetes/install-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"4. Install Kubernetes\",\n  \"position\": 4,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "docs/setup-kubernetes/install-kubernetes/kubernetes-with-k3s.md",
"content": "---\ntitle: \"4.1. K3s\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ndraft: false\nweight: 221\ncontributors: [\"Jongseob Jeon\"]\nmenu:\n  docs:\n    parent: \"../setup-kubernetes\"\nimages: []\n---\n\n## 1. Prerequisite\n\n쿠버네티스 클러스터를 구축하기에 앞서, 필요한 구성 요소들을 **클러스터에** 설치합니다.\n\n[Install Prerequisite](../../setup-kubernetes/install-prerequisite.md)을 참고하여 Kubernetes를 설치하기 전에 필요한 요소들을 **클러스터에** 설치해 주시기 바랍니다.\n\nk3s 는 기본값으로 containerd 를 컨테이너 런타임 백엔드로 사용합니다.\n하지만 저희는 GPU를 사용하기 위해 docker를 백엔드로 사용해야 하므로, `--docker` 옵션을 통해 백엔드를 docker로 설정하여 설치하겠습니다.\n\n```bash\ncurl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.21.7+k3s1 sh -s - server --disable traefik --disable servicelb --disable local-storage --docker\n```\n\nk3s 설치 후 k3s config를 확인합니다.\n\n```bash\nsudo cat /etc/rancher/k3s/k3s.yaml\n```\n\n정상적으로 설치되면 다음과 같은 항목이 출력됩니다.  \n(보안 문제와 관련된 키들은 <...>로 가렸습니다.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://127.0.0.1:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 2. 쿠버네티스 클러스터 셋업\n\nk3s config를 클러스터의 kubeconfig로 사용하기 위해서 복사합니다.\n\n```bash\nmkdir -p ~/.kube\nsudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config\n```\n\n복사된 config 파일에 user가 접근할 수 있는 권한을 줍니다.\n\n```bash\nsudo chown $USER:$USER ~/.kube/config\n```\n\n## 3. 쿠버네티스 클라이언트 셋업\n\n이제 클러스터에서 설정한 kubeconfig를 로컬로 이동합니다.\n로컬에서는 경로를 `~/.kube/config`로 설정합니다.\n\n처음 복사한 config 파일에는 server ip가 `https://127.0.0.1:6443` 으로 되어 있습니다.  \n이 값을 클러스터의 ip에 맞게 수정합니다.  
\n(이번 페이지에서 사용하는 클러스터의 ip에 맞춰서 `https://192.168.0.19:6443` 으로 수정했습니다.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://192.168.0.19:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 4. 쿠버네티스 기본 모듈 설치\n\n[Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md)을 참고하여 다음 컴포넌트들을 설치해 주시기 바랍니다.\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 정상 설치 확인\n\n최종적으로 node가 Ready 인지, OS, Docker, Kubernetes 버전을 확인합니다.\n\n```bash\nkubectl get nodes -o wide\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nNAME    STATUS   ROLES                  AGE   VERSION        INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   11m   v1.21.7+k3s1   192.168.0.19   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n\n## 6. References\n\n- [https://rancher.com/docs/k3s/latest/en/installation/install-options/](https://rancher.com/docs/k3s/latest/en/installation/install-options/)\n"
  },
  {
    "path": "docs/setup-kubernetes/install-kubernetes/kubernetes-with-kubeadm.md",
    "content": "---\ntitle: \"4.3. Kubeadm\"\ndescription: \"\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## 1. Prerequisite\n\n쿠버네티스 클러스터를 구축하기에 앞서, 필요한 구성 요소들을 **클러스터에** 설치합니다.\n\n[Install Prerequisite](../../setup-kubernetes/install-prerequisite.md)을 참고하여 Kubernetes를 설치하기 전에 필요한 요소들을 **클러스터에** 설치해 주시기 바랍니다.\n\n쿠버네티스를 위한 네트워크의 설정을 변경합니다.\n\n```bash\nsudo modprobe br_netfilter\n\ncat <<EOF | sudo tee /etc/modules-load.d/k8s.conf\nbr_netfilter\nEOF\n\ncat <<EOF | sudo tee /etc/sysctl.d/k8s.conf\nnet.bridge.bridge-nf-call-ip6tables = 1\nnet.bridge.bridge-nf-call-iptables = 1\nEOF\nsudo sysctl --system\n```\n\n## 2. 쿠버네티스 클러스터 셋업\n\n- kubeadm : kubelet을 서비스에 등록하고, 클러스터 컴포넌트들 사이의 통신을 위한 인증서 발급 등 설치 과정 자동화\n- kubelet : container 리소스를 실행, 종료를 해 주는 컨테이너 핸들러\n- kubectl : 쿠버네티스 클러스터를 터미널에서 확인, 조작하기 위한 CLI 도구\n\n다음 명령어를 통해 kubeadm, kubelet, kubectl을 설치합니다.\n실수로 이 컴포넌트들의 버전이 변경하면, 예기치 않은 장애를 낳을 수 있으므로 컴포넌트들이 변경되지 않도록 설정합니다.\n\n```bash\nsudo apt-get update\nsudo apt-get install -y apt-transport-https ca-certificates curl &&\nsudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg &&\necho \"deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main\" | sudo tee /etc/apt/sources.list.d/kubernetes.list &&\nsudo apt-get update\nsudo apt-get install -y kubelet=1.21.7-00 kubeadm=1.21.7-00 kubectl=1.21.7-00 &&\nsudo apt-mark hold kubelet kubeadm kubectl\n```\n\nkubeadm, kubelet, kubectl 이 잘 설치되었는지 확인합니다.\n\n```bash\nmlops@ubuntu:~$ kubeadm version\nkubeadm version: &version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:40:08Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\n```bash\nmlops@ubuntu:~$ kubelet --version\nKubernetes 
v1.21.7\n```\n\n```bash\nmlops@ubuntu:~$ kubectl version --client\nClient Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\n이제 kubeadm을 사용하여 쿠버네티스를 설치합니다.\n\n```bash\nkubeadm config images list\nkubeadm config images pull\n\nsudo kubeadm init --pod-network-cidr=10.244.0.0/16\n```\n\nkubectl을 통해서 쿠버네티스 클러스터를 제어할 수 있도록 admin 인증서를 $HOME/.kube/config 경로에 복사합니다.\n\n```bash\nmkdir -p $HOME/.kube\nsudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config\nsudo chown $(id -u):$(id -g) $HOME/.kube/config\n```\n\nCNI를 설치합니다.\n쿠버네티스 내부의 네트워크 설정을 전담하는 CNI는 여러 종류가 있으며, *모두의 MLOps*에서는 flannel을 사용합니다.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.13.0/Documentation/kube-flannel.yml\n```\n\n쿠버네티스 노드의 종류에는 크게 `마스터 노드`와 `워커 노드`가 있습니다.\n안정성을 위하여 `마스터 노드`에는 쿠버네티스 클러스터를 제어하는 작업만 실행되도록 하는 것이 일반적이지만,\n이 매뉴얼에서는 싱글 노드 클러스터를 가정하고 있으므로 마스터 노드에 모든 종류의 작업이 실행될 수 있도록 설정합니다.\n\n```bash\nkubectl taint nodes --all node-role.kubernetes.io/master-\n```\n\n## 3. 쿠버네티스 클라이언트 셋업\n\n클러스터에 생성된 kubeconfig 파일을 **클라이언트**에 복사하여 kubectl을 통해 클러스터를 제어할 수 있도록 합니다.\n\n```bash\nmkdir -p $HOME/.kube\nscp -p {CLUSTER_USER_ID}@{CLUSTER_IP}:~/.kube/config ~/.kube/config\n```\n\n## 4. 쿠버네티스 기본 모듈 설치\n\n[Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md)을 참고하여 다음 컴포넌트들을 설치해 주시기 바랍니다.\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 정상 설치 확인\n\n다음 명령어를 통해 노드의 STATUS가 Ready 상태가 되었는지 확인합니다.\n\n```bash\nkubectl get nodes\n```\n\nReady가 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION\nubuntu   Ready    control-plane,master   2m55s   v1.21.7\n```\n\n## 6. 
References\n\n- [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm)\n"
  },
  {
    "path": "docs/setup-kubernetes/install-kubernetes/kubernetes-with-minikube.md",
    "content": "---\ntitle: \"4.2. Minikube\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## 1. Prerequisite\n\n쿠버네티스 클러스터를 구축하기에 앞서, 필요한 구성 요소들을 **클러스터에** 설치합니다.\n\n[Install Prerequisite](../../setup-kubernetes/install-prerequisite.md)을 참고하여 Kubernetes를 설치하기 전에 필요한 요소들을 **클러스터에** 설치해 주시기 바랍니다.\n\n### Minikube binary\n\nMinikube를 사용하기 위해, v1.24.0 버전의 Minikube 바이너리를 설치합니다.\n\n```bash\nwget https://github.com/kubernetes/minikube/releases/download/v1.24.0/minikube-linux-amd64\nsudo install minikube-linux-amd64 /usr/local/bin/minikube\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nminikube version\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nmlops@ubuntu:~$ minikube version\nminikube version: v1.24.0\ncommit: 76b94fb3c4e8ac5062daf70d60cf03ddcc0a741b\n```\n\n## 2. 쿠버네티스 클러스터 셋업\n\n이제 Minikube를 활용해 쿠버네티스 클러스터를 **클러스터에** 구축합니다.\nGPU 의 원활한 사용과 클러스터-클라이언트 간 통신을 간편하게 수행하기 위해, Minikube 는 `driver=none` 옵션을 활용하여 실행합니다. `driver=none` 옵션은 root user 로 실행해야 함에 주의 바랍니다.\n\nroot user로 전환합니다.\n\n```bash\nsudo su\n```\n\n`minikube start`를 수행하여 쿠버네티스 클러스터 구축을 진행합니다. Kubeflow의 원활한 사용을 위해, 쿠버네티스 버전은 v1.21.7로 지정하여 구축하며 `--extra-config`를 추가합니다.\n\n```bash\nminikube start --driver=none \\\n  --kubernetes-version=v1.21.7 \\\n  --extra-config=apiserver.service-account-signing-key-file=/var/lib/minikube/certs/sa.key \\\n  --extra-config=apiserver.service-account-issuer=kubernetes.default.svc\n```\n\n### Disable default addons\n\nMinikube를 설치하면 Default로 설치되는 addon이 존재합니다. 
이 중 저희가 사용하지 않을 addon을 비활성화합니다.\n\n```bash\nminikube addons disable storage-provisioner\nminikube addons disable default-storageclass\n```\n\n모든 addon이 비활성화된 것을 확인합니다.\n\n```bash\nminikube addons list\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nroot@ubuntu:/home/mlops# minikube addons list\n|-----------------------------|----------|--------------|-----------------------|\n|         ADDON NAME          | PROFILE  |    STATUS    |      MAINTAINER       |\n|-----------------------------|----------|--------------|-----------------------|\n| ambassador                  | minikube | disabled     | unknown (third-party) |\n| auto-pause                  | minikube | disabled     | google                |\n| csi-hostpath-driver         | minikube | disabled     | kubernetes            |\n| dashboard                   | minikube | disabled     | kubernetes            |\n| default-storageclass        | minikube | disabled     | kubernetes            |\n| efk                         | minikube | disabled     | unknown (third-party) |\n| freshpod                    | minikube | disabled     | google                |\n| gcp-auth                    | minikube | disabled     | google                |\n| gvisor                      | minikube | disabled     | google                |\n| helm-tiller                 | minikube | disabled     | unknown (third-party) |\n| ingress                     | minikube | disabled     | unknown (third-party) |\n| ingress-dns                 | minikube | disabled     | unknown (third-party) |\n| istio                       | minikube | disabled     | unknown (third-party) |\n| istio-provisioner           | minikube | disabled     | unknown (third-party) |\n| kubevirt                    | minikube | disabled     | unknown (third-party) |\n| logviewer                   | minikube | disabled     | google                |\n| metallb                     | minikube | disabled     | unknown (third-party) |\n| metrics-server              | 
minikube | disabled     | kubernetes            |\n| nvidia-driver-installer     | minikube | disabled     | google                |\n| nvidia-gpu-device-plugin    | minikube | disabled     | unknown (third-party) |\n| olm                         | minikube | disabled     | unknown (third-party) |\n| pod-security-policy         | minikube | disabled     | unknown (third-party) |\n| portainer                   | minikube | disabled     | portainer.io          |\n| registry                    | minikube | disabled     | google                |\n| registry-aliases            | minikube | disabled     | unknown (third-party) |\n| registry-creds              | minikube | disabled     | unknown (third-party) |\n| storage-provisioner         | minikube | disabled     | kubernetes            |\n| storage-provisioner-gluster | minikube | disabled     | unknown (third-party) |\n| volumesnapshots             | minikube | disabled     | kubernetes            |\n|-----------------------------|----------|--------------|-----------------------|\n```\n\n## 3. 쿠버네티스 클라이언트 셋업\n\n이번에는 **클라이언트**에 쿠버네티스의 원활한 사용을 위한 도구를 설치합니다.\n**클라이언트**와 **클러스터** 노드가 분리되지 않은 경우에는 root user로 모든 작업을 진행해야 함에 주의바랍니다.\n\n**클라이언트**와 **클러스터** 노드가 분리된 경우, 우선 kubernetes의 관리자 인증 정보를 **클라이언트**로 가져옵니다.\n\n1. **클러스터**에서 config를 확인합니다.\n\n  ```bash\n  # 클러스터 노드\n  minikube kubectl -- config view --flatten\n  ```\n\n2. 
다음과 같은 정보가 출력됩니다.\n\n  ```bash\n  apiVersion: v1\n  clusters:\n  - cluster:\n      certificate-authority-data: LS0tLS1CRUd....\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: cluster_info\n      server: https://192.168.0.62:8443\n    name: minikube\n  contexts:\n  - context:\n      cluster: minikube\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: context_info\n      namespace: default\n      user: minikube\n    name: minikube\n  current-context: minikube\n  kind: Config\n  preferences: {}\n  users:\n  - name: minikube\n    user:\n      client-certificate-data: LS0tLS1CRUdJTi....\n      client-key-data: LS0tLS1CRUdJTiBSU0....\n  ```\n\n3. **클라이언트** 노드에서 `.kube` 폴더를 생성합니다.\n\n  ```bash\n  # 클라이언트 노드\n  mkdir -p /home/$USER/.kube\n  ```\n\n4. 해당 파일에 2. 에서 출력된 정보를 붙여넣은 뒤 저장합니다.\n  \n  ```bash\n  vi /home/$USER/.kube/config\n  ```\n\n## 4. 쿠버네티스 기본 모듈 설치\n\n[Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md)을 참고하여 다음 컴포넌트들을 설치해 주시기 바랍니다.\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 정상 설치 확인\n\n최종적으로 node가 Ready 인지, OS, Docker, Kubernetes 버전을 확인합니다.\n\n```bash\nkubectl get nodes -o wide\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   2d23h   v1.21.7   192.168.0.75   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n"
  },
  {
    "path": "docs/setup-kubernetes/install-kubernetes-module.md",
    "content": "---\ntitle: \"5. Install Kubernetes Modules\"\ndescription: \"Install Helm, Kustomize\"\nsidebar_position: 5\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Modules\n\n이번 페이지에서는 클러스터에서 사용할 모듈을 클라이언트 노드에서 설치하는 과정에 관해서 설명합니다.  \n앞으로 소개되는 과정은 모두 **클라이언트 노드**에서 진행됩니다.\n\n## Helm\n\nHelm은 쿠버네티스 패키지와 관련된 자원을 한 번에 배포하고 관리할 수 있게 도와주는 패키지 매니징 도구 중 하나입니다.\n\n1. 현재 폴더에 Helm v3.7.1 버전을 내려받습니다.\n\n- For Linux amd64\n\n  ```bash\n  wget https://get.helm.sh/helm-v3.7.1-linux-amd64.tar.gz\n  ```\n\n- 다른 OS는 [공식 홈페이지](https://github.com/helm/helm/releases/tag/v3.7.1)를 참고하시어, 클라이언트 노드의 OS와 CPU에 맞는 바이너리의 다운 경로를 확인하시기 바랍니다.\n\n2. helm을 사용할 수 있도록 압축을 풀고, 파일의 위치를 변경합니다.\n\n  ```bash\n  tar -zxvf helm-v3.7.1-linux-amd64.tar.gz\n  sudo mv linux-amd64/helm /usr/local/bin/helm\n  ```\n\n3. 정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  helm help\n  ```\n\n  다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  The Kubernetes package manager\n\n  Common actions for Helm:\n\n  - helm search:    search for charts\n  - helm pull:      download a chart to your local directory to view\n  - helm install:   upload the chart to Kubernetes\n  - helm list:      list releases of charts\n\n  Environment variables:\n\n  | Name                     | Description                                                         |\n  |--------------------------|---------------------------------------------------------------------|\n  | $HELM_CACHE_HOME         | set an alternative location for storing cached files.               |\n  | $HELM_CONFIG_HOME        | set an alternative location for storing Helm configuration.         |\n  | $HELM_DATA_HOME          | set an alternative location for storing Helm data.                  |\n\n  ...\n  ```\n\n## Kustomize\n\nkustomize 또한 여러 쿠버네티스 리소스를 한 번에 배포하고 관리할 수 있게 도와주는 패키지 매니징 도구 중 하나입니다.\n\n1. 
현재 폴더에 kustomize v3.10.0 버전의 바이너리를 다운받습니다.\n\n- For Linux amd64\n\n  ```bash\n  wget https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv3.10.0/kustomize_v3.10.0_linux_amd64.tar.gz\n  ```\n\n- 다른 OS는 [kustomize/v3.10.0](https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv3.10.0)에서 확인 후 다운로드 받습니다.\n\n2. kustomize 를 사용할 수 있도록 압축을 풀고, 파일의 위치를 변경합니다.\n\n  ```bash\n  tar -zxvf kustomize_v3.10.0_linux_amd64.tar.gz\n  sudo mv kustomize /usr/local/bin/kustomize\n  ```\n\n3. 정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kustomize help\n  ```\n\n  다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  Manages declarative configuration of Kubernetes.\n  See https://sigs.k8s.io/kustomize\n\n  Usage:\n    kustomize [command]\n\n  Available Commands:\n    build                     Print configuration per contents of kustomization.yaml\n    cfg                       Commands for reading and writing configuration.\n    completion                Generate shell completion script\n    create                    Create a new kustomization in the current directory\n    edit                      Edits a kustomization file\n    fn                        Commands for running functions against configuration.\n  ...\n  ```\n\n## CSI Plugin : Local Path Provisioner\n\n1. CSI Plugin은 kubernetes 내의 스토리지를 담당하는 모듈입니다. 
단일 노드 클러스터에서 쉽게 사용할 수 있는 CSI Plugin인 Local Path Provisioner를 설치합니다.\n\n  ```bash\n  kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.20/deploy/local-path-storage.yaml\n  ```\n\n  다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  namespace/local-path-storage created\n  serviceaccount/local-path-provisioner-service-account created\n  clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created\n  clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created\n  deployment.apps/local-path-provisioner created\n  storageclass.storage.k8s.io/local-path created\n  configmap/local-path-config created\n  ```\n\n2. 또한, 다음과 같이 local-path-storage namespace 에 provisioner pod이 Running 인지 확인합니다.\n\n  ```bash\n  kubectl -n local-path-storage get pod\n  ```\n\n  정상적으로 수행되면 아래와 같이 출력됩니다.\n\n  ```bash\n  NAME                                     READY     STATUS    RESTARTS   AGE\n  local-path-provisioner-d744ccf98-xfcbk   1/1       Running   0          7m\n  ```\n\n3. 다음을 수행하여 default storage class로 변경합니다.\n\n  ```bash\n  kubectl patch storageclass local-path -p '{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"}}}'\n  ```\n\n  정상적으로 수행되면 아래와 같이 출력됩니다.\n\n  ```bash\n  storageclass.storage.k8s.io/local-path patched\n  ```\n\n4. default storage class로 설정되었는지 확인합니다.\n\n  ```bash\n  kubectl get sc\n  ```\n\n  다음과 같이 NAME에 `local-path (default)` 인 storage class가 존재하는 것을 확인합니다.\n\n  ```bash\n  NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE\n  local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  2h\n  ```\n"
  },
  {
    "path": "docs/setup-kubernetes/install-prerequisite.md",
    "content": "---\ntitle: \"3. Install Prerequisite\"\ndescription: \"Install docker\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2023-09-29\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Sangwoo Shim\", \"Minwook Je\"]\n---\n\n\n이 페이지에서는 쿠버네티스를 설치하기에 앞서, **클러스터**와 **클라이언트**에 설치 혹은 설정해두어야 하는 컴포넌트들에 대한 매뉴얼을 설명합니다.\n\n## Install apt packages\n\n추후 클라이언트와 클러스터의 원활한 통신을 위해서는 Port-Forwarding을 수행해야 할 일이 있습니다.\nPort-Forwarding을 위해서는 **클러스터**에 다음 패키지를 설치해 주어야 합니다.\n\n```bash\nsudo apt-get update\nsudo apt-get install -y socat\n```\n\n## Install Docker\n\n1. 도커 설치에 필요한 APT 패키지들을 설치합니다.\n\n   ```bash\n   sudo apt-get update && sudo apt-get install -y ca-certificates curl gnupg lsb-release\n   ```\n\n2. 도커의 공식 GPG key를 추가합니다.\n\n   ```bash\n   curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg\n   ```\n\n3. apt 패키지 매니저로 도커를 설치할 때, stable Repository에서 받아오도록 설정합니다.\n\n   ```bash\n   echo \\\n   \"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \\\n   $(lsb_release -cs) stable\" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null\n   ```\n\n4. 현재 설치할 수 있는 도커 버전을 확인합니다.\n\n   ```bash\n   sudo apt-get update && apt-cache madison docker-ce\n   ```\n\n   출력되는 버전 중 `5:20.10.11~3-0~ubuntu-focal` 버전이 있는지 확인합니다.\n\n   ```bash\n   apt-cache madison docker-ce | grep 5:20.10.11~3-0~ubuntu-focal\n   ```\n\n   정상적으로 추가가 된 경우 다음과 같이 출력됩니다.\n\n   ```bash\n   docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages\n   ```\n\n5. `5:20.10.11~3-0~ubuntu-focal` 버전의 도커를 설치합니다.\n\n   ```bash\n   sudo apt-get install -y containerd.io docker-ce=5:20.10.11~3-0~ubuntu-focal docker-ce-cli=5:20.10.11~3-0~ubuntu-focal\n   ```\n\n6. 
도커가 정상적으로 설치된 것을 확인합니다.\n\n   ```bash\n   sudo docker run hello-world\n   ```\n\n   명령어 실행 후 다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n   ```bash\n   mlops@ubuntu:~$ sudo docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n7. docker 관련 command를 sudo 키워드 없이 사용할 수 있게 하도록 다음 명령어를 통해 권한을 추가합니다.\n\n   ```bash\n   sudo groupadd docker\n   sudo usermod -aG docker $USER\n   newgrp docker\n   ```\n\n8. sudo 키워드 없이 docker command를 사용할 수 있게 된 것을 확인하기 위해, 다시 한번 docker run을 실행합니다.\n\n   ```bash\n   docker run hello-world\n   ```\n\n   명령어 실행 후 다음과 같은 메시지가 보이면 정상적으로 권한이 추가된 것을 의미합니다.\n\n   ```bash\n   mlops@ubuntu:~$ docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. 
The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n## Turn off Swap Memory\n\nkubelet 이 정상적으로 동작하게 하기 위해서는 **클러스터** 노드에서 swap이라고 불리는 가상메모리를 꺼 두어야 합니다. 다음 명령어를 통해 swap을 꺼 둡니다.  \n**(클러스터와 클라이언트를 같은 데스크톱에서 사용할 때 swap 메모리를 종료하면 속도의 저하가 있을 수 있습니다)**  \n\n```bash\nsudo sed -i '/ swap / s/^\\(.*\\)$/#\\1/g' /etc/fstab\nsudo swapoff -a\n```\n\n## Install Kubectl\n\nkubectl 은 쿠버네티스 클러스터에 API를 요청할 때 사용하는 클라이언트 툴입니다. **클라이언트** 노드에 설치해두어야 합니다.\n\n1. 현재 폴더에 kubectl v1.21.7 버전을 다운받습니다.\n\n   ```bash\n   curl -LO https://dl.k8s.io/release/v1.21.7/bin/linux/amd64/kubectl\n\n   # Or if you use arm64\n   curl -LO https://dl.k8s.io/release/v1.21.7/bin/linux/arm64/kubectl\n   ```\n\n\n2. kubectl 을 사용할 수 있도록 파일의 권한과 위치를 변경합니다.\n\n   ```bash\n   sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl\n   ```\n\n3. 정상적으로 설치되었는지 확인합니다.\n\n   ```bash\n   kubectl version --client\n   ```\n\n   다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n   ```bash\n   Client Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n   ```\n\n4. 여러 개의 쿠버네티스 클러스터를 사용하는 경우, 여러 개의 kubeconfig 파일을 관리해야 하는 경우가 있습니다.  
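
이럴 때 `KUBECONFIG` 환경 변수에 여러 파일의 경로를 `:` 로 이어서 지정하면, kubectl이 해당 파일들을 병합한 설정을 사용합니다. 아래는 간단한 스케치입니다. (파일 이름 `config-dev` 와 context 이름 `dev-cluster` 는 예시입니다.)

```bash
# 두 개의 kubeconfig 파일을 병합하여 사용하는 예시입니다. 경로는 예시입니다.
export KUBECONFIG=$HOME/.kube/config:$HOME/.kube/config-dev

# 병합된 전체 context 목록을 확인합니다.
kubectl config get-contexts

# 사용할 context를 선택합니다. context 이름은 예시입니다.
kubectl config use-context dev-cluster
```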
\n여러 개의 kubeconfig 파일 혹은 여러 개의 kube-context를 효율적으로 관리하는 방법은 다음과 같은 문서를 참고하시기 바랍니다.\n\n   - [https://dev.to/aabiseverywhere/configuring-multiple-kubeconfig-on-your-machine-59eo](https://dev.to/aabiseverywhere/configuring-multiple-kubeconfig-on-your-machine-59eo)\n   - [https://github.com/ahmetb/kubectx](https://github.com/ahmetb/kubectx)\n\n## References\n\n- [Install Docker Engine on Ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [리눅스에 kubectl 설치 및 설정](https://kubernetes.io/ko/docs/tasks/tools/install-kubectl-linux/)\n"
  },
  {
    "path": "docs/setup-kubernetes/intro.md",
    "content": "---\ntitle: \"1. Introduction\"\ndescription: \"Setup Introduction\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Youngdon Tae\", \"SeungTae Kim\"]\n---\n\n## MLOps 시스템 구축해보기\n\nMLOps를 공부하는 데 있어서 가장 큰 장벽은 MLOps 시스템을 구성해보고 사용해보기가 어렵다는 점입니다. AWS, GCP 등의 퍼블릭 클라우드 혹은 Weight & Bias, neptune.ai 등의 상용 툴을 사용해보기에는 과금에 대한 부담이 존재하고, 처음부터 모든 환경을 혼자서 구성하기에는 어디서부터 시작해야 할지 막막하게 느껴질 수밖에 없습니다.\n\n이런 이유들로 MLOps를 선뜻 시작해보지 못하시는 분들을 위해, *모두의 MLOps*에서는 우분투가 설치되는 데스크톱 하나만 준비되어 있다면 MLOps 시스템을 밑바닥부터 구축하고 사용해 볼 수 있는 방법을 다룰 예정입니다.\n\n우분투 데스크탑 환경을 준비할 수 없는 경우, 가상머신을 활용하여 환경을 구성하기\n\n>Windows 혹은 Intel Mac을 사용해 `모두의 MLops` 실습을 진행 중인 분들은 `Virtual Box`, `VMware` 등의 가상머신 소프트웨어를 이용하여 우분투 데스크탑 환경을 준비할 수 있습니다. 이 때, 권장 사양을 맞춰 가상 머신을 생성해주시기 바랍니다.\n>또한, M1 Mac을 사용하시는 분들은 작성일(2022년 2월) 기준으로는 Virtual Box, VMware 는 이용할 수 없습니다. ([M1 Apple Silicone Mac에 최적화된 macOS 앱 지원 확인하기](https://isapplesiliconready.com/kr))\n>따라서, 클라우드 환경을 이용해 실습하는 것이 아니라면, [UTM , Virtual machines for Mac](https://mac.getutm.app/)을 설치하여 가상 머신을 이용해주세요.\n>(앱스토어에서 구매하여 다운로드 받는 소프트웨어는 일종의 Donation 개념의 비용 지불입니다. 무료 버전과 자동 업데이트 정도의 차이가 있어, 무료버전을 사용해도 무방합니다.)\n>해당 가상머신 소프트웨어는 `Ubuntu 20.04.3 LTS` 실습 운영체제를 지원하고 있어, M1 Mac에서 실습을 수행하는 것을 가능하게 합니다.\n\n\n하지만 [MLOps의 구성요소](../introduction/component.md)에서 설명하는 요소들을 모두 사용해볼 수는 없기에, *모두의 MLOps*에서는 대표적인 오픈소스만을 설치한 뒤, 서로 연동하여 사용하는 부분을 주로 다룰 예정입니다.\n\n*모두의 MLOps*에서 설치하는 오픈소스가 표준을 의미하는 것은 아니며, 여러분의 상황에 맞게 적절한 툴을 취사선택하는 것을 권장합니다.\n\n## 구성 요소\n\n이 글에서 만들어 볼 MLOps 시스템의 구성 요소들과 각 버전은 아래와 같은 환경에서 검증되었습니다.\n\n원활한 환경에서 테스트하기 위해 **싱글 노드 클러스터 (혹은 클러스터)** 와 **클라이언트**를 분리하여 설명해 드릴 예정입니다.  \n**클러스터** 는 우분투가 설치되어 있는 데스크톱 하나를 의미합니다.  \n**클라이언트** 는 노트북 혹은 클러스터가 설치되어 있는 데스크톱 외의 클라이언트로 사용할 수 있는 다른 데스크톱을 사용하는 것을 권장합니다.  \n하지만 두 대의 머신을 준비할 수 없다면 데스크톱 하나를 동시에 클러스터와 클라이언트 용도로 사용하셔도 괜찮습니다.\n\n### 클러스터\n\n#### 1. 
Software\n\n아래는 클러스터에 설치해야 할 소프트웨어 목록입니다.\n\n| Software        | Version     |\n| --------------- | ----------- |\n| Ubuntu          | 20.04.3 LTS |\n| Docker (Server) | 20.10.11    |\n| NVIDIA-Driver   | 470.86      |\n| Kubernetes      | v1.21.7     |\n| Kubeflow        | v1.4.0      |\n| MLflow          | v1.21.0     |\n\n#### 2. Helm Chart\n\n아래는 Helm을 이용해 설치되어야 할 써드파티 소프트웨어 목록입니다.\n\n| Helm Chart Repo Name          | Version |\n| ----------------------------- | ------- |\n| datawire/ambassador           | 6.9.3   |\n| seldonio/seldon-core-operator | 1.11.2  |\n\n### 클라이언트\n\n클라이언트는 macOS (Intel CPU), Ubuntu 20.04 에서 검증되었습니다.\n\n| Software        | Version     |\n| --------------- | ----------- |\n| kubectl         | v1.21.7     |\n| helm            | v3.7.1      |\n| kustomize       | v3.10.0     |\n\n### Minimum System Requirements\n\n모두의 MLOps를 설치할 클러스터는 다음과 같은 사양을 만족시키는 것을 권장합니다.  \n이는 Kubernetes 및 Kubeflow의 권장 사양을 따릅니다.\n\n- CPU : 6 core\n- RAM : 12GB\n- DISK : 50GB\n- GPU : NVIDIA GPU (Optional)\n"
  },
  {
    "path": "docs/setup-kubernetes/kubernetes.md",
    "content": "---\ntitle : \"2. Setup Kubernetes\"\ndescription: \"Setup Kubernetes\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Cluster\n\n쿠버네티스를 처음 배우시는 분들에게 첫 진입 장벽은 쿠버네티스 실습 환경을 구축하는 것입니다.\n\n프로덕션 레벨의 쿠버네티스 클러스터를 구축할 수 있게 공식적으로 지원하는 도구는 kubeadm 이지만, 사용자들이 조금 더 쉽게 구축할 수 있도록 도와주는 kubespray, kops 등의 도구도 존재하며, 학습 목적을 위해서 컴팩트한 쿠버네티스 클러스터를 정말 쉽게 구축할 수 있도록 도와주는 k3s, minikube, microk8s, kind 등의 도구도 존재합니다.\n\n각각의 도구는 장단점이 다르기에 사용자마다 선호하는 도구가 다른 점을 고려하여, 본 글에서는 kubeadm, k3s, minikube의 3가지 도구를 활용하여 쿠버네티스 클러스터를 구축하는 방법을 다룹니다.\n각 도구에 대한 자세한 비교는 다음 쿠버네티스 [공식 문서](https://kubernetes.io/ko/docs/tasks/tools/)를 확인해주시기를 바랍니다.\n\n*모두의 MLOps*에서 권장하는 툴은 **k3s**로 쿠버네티스 클러스터를 구축할 때 쉽게 할 수 있다는 장점이 있습니다.  \n만약 쿠버네티스의 모든 기능을 사용하고 노드 구성까지 활용하고 싶다면 **kubeadm**을 권장해 드립니다.  \n**minikube** 는 저희가 설명하는 컴포넌트 외에도 다른 쿠버네티스를 add-on 형식으로 쉽게 설치할 수 있다는 장점이 있습니다.\n\n본 *모두의 MLOps*에서는 구축하게 될 MLOps 구성 요소들을 원활히 사용하기 위해, 각각의 도구를 활용해 쿠버네티스 클러스터를 구축할 때, 추가로 설정해 주어야 하는 부분이 추가되어 있습니다.\n\nUbuntu OS까지는 설치되어 있는 데스크탑을 k8s cluster로 구축한 뒤, 외부 클라이언트 노드에서 쿠버네티스 클러스터에 접근하는 것을 확인하는 것까지가 본 **Setup Kubernetes**단원의 범위입니다.\n\n자세한 구축 방법은 3가지 도구마다 다르기에 다음과 같은 흐름으로 구성되어 있습니다.\n\n```bash\n3. Setup Prerequisite\n4. Setup Kubernetes\n  4.1. with k3s\n  4.2. with minikube\n  4.3. with kubeadm\n5. Setup Kubernetes Modules\n```\n\n그럼 이제 각각의 도구를 활용해 쿠버네티스 클러스터를 구축해보겠습니다. 반드시 모든 도구를 사용해 볼 필요는 없으며, 이 중 여러분이 익숙하신 도구를 활용해주시면 충분합니다.\n"
  },
  {
    "path": "docs/setup-kubernetes/setup-nvidia-gpu.md",
    "content": "---\ntitle: \"6. (Optional) Setup GPU\"\ndescription: \"Install nvidia docker, nvidia device plugin\"\nsidebar_position: 6\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n쿠버네티스 및 Kubeflow 등에서 GP 를 사용하기 위해서는 다음 작업이 필요합니다.\n\n## 1. Install NVIDIA Driver\n\n`nvidia-smi` 수행 시 다음과 같은 화면이 출력된다면 이 단계는 생략해 주시기 바랍니다.\n\n  ```bash\n  mlops@ubuntu:~$ nvidia-smi \n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     7W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  |    0   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                198MiB |\n  |    0   N/A  N/A      1893      G   /usr/bin/gnome-shell               10MiB |\n  |    1   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                  4MiB |\n  +-----------------------------------------------------------------------------+\n  ```\n\n`nvidia-smi`의 출력 결과가 위와 같지 않다면 장착된 GPU에 맞는 nvidia driver를 설치해 주시기 바랍니다.\n\n만약 nvidia driver의 설치에 익숙하지 않다면 아래 명령어를 통해 설치하시기 바랍니다.\n\n  ```bash\n  sudo add-apt-repository ppa:graphics-drivers/ppa\n  sudo apt update && sudo apt install -y ubuntu-drivers-common\n  sudo ubuntu-drivers autoinstall\n  sudo reboot\n  ```\n\n## 2. NVIDIA-Docker 설치\n\nNVIDIA-Docker를 설치합니다.\n\n```bash\ncurl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \\\n  sudo apt-key add -\ndistribution=$(. 
/etc/os-release;echo $ID$VERSION_ID)\ncurl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list\nsudo apt-get update\nsudo apt-get install -y nvidia-docker2 &&\nsudo systemctl restart docker\n```\n\n정상적으로 설치되었는지 확인하기 위해, GPU를 사용하는 도커 컨테이너를 실행해봅니다.\n\n```bash\nsudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  mlops@ubuntu:~$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     6W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  +-----------------------------------------------------------------------------+\n  ```\n\n## 3. Set NVIDIA-Docker as the Default Container Runtime\n\nBy default, Kubernetes uses Docker-CE as its container runtime.\nTherefore, to use NVIDIA GPUs inside Docker containers, you need to change the default runtime so that pods can be created with NVIDIA-Docker as the container runtime.\n\n1. Open the `/etc/docker/daemon.json` file and edit it as follows.\n\n  ```bash\n  sudo vi /etc/docker/daemon.json\n\n  {\n    \"default-runtime\": \"nvidia\",\n    \"runtimes\": {\n      \"nvidia\": {\n        \"path\": \"nvidia-container-runtime\",\n        \"runtimeArgs\": []\n      }\n    }\n  }\n  ```\n\n2. After confirming that the file has been changed, restart Docker.\n\n  ```bash\n  sudo systemctl daemon-reload\n  sudo service docker restart\n  ```\n\n3. Check that the change has been applied.\n\n  ```bash\n  sudo docker info | grep nvidia\n  ```\n\n  If you see a message like the following, the change was applied successfully.\n\n  ```bash\n  mlops@ubuntu:~$ docker info | grep nvidia\n  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc\n  Default Runtime: nvidia\n  ```\n\n## 4. Nvidia-Device-Plugin\n\n1. 
Create the nvidia-device-plugin daemonset.\n\n  ```bash\n  kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.10.0/nvidia-device-plugin.yml\n  ```\n\n2. Check that the nvidia-device-plugin pod has been created and is in the RUNNING state.\n\n  ```bash\n  kubectl get pod -n kube-system | grep nvidia\n  ```\n\n  You should see output like the following.\n\n  ```bash\n  kube-system       nvidia-device-plugin-daemonset-nlqh2         1/1     Running   0      1h\n  ```\n\n3. Check that the node reports the GPU as an allocatable resource.\n\n  ```bash\n  kubectl get nodes \"-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\\.com/gpu\"\n  ```\n\n  If you see output like the following, the setup is complete.  \n  (The cluster used for the *MLOps for ALL* hands-on exercises has two GPUs, so 2 is printed.\n  Any number that matches the number of GPUs in your own cluster is fine.)\n\n  ```bash\n  NAME       GPU\n  ubuntu     2\n  ```\n\nIf the GPU has not been set up, its value is shown as `<none>`.\n"
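The allocatable-GPU check above can also be scripted. Below is a minimal Python sketch, assuming you feed it the parsed output of `kubectl get nodes -o json`; the `gpu_allocatable` helper and the sample data are illustrative, not part of the original guide.

```python
def gpu_allocatable(nodes_json):
    """Map node name -> allocatable GPU count, given parsed
    `kubectl get nodes -o json` output."""
    counts = {}
    for node in nodes_json.get("items", []):
        name = node["metadata"]["name"]
        # Nodes without the device plugin expose no nvidia.com/gpu resource.
        gpus = node.get("status", {}).get("allocatable", {}).get("nvidia.com/gpu", "0")
        counts[name] = int(gpus)
    return counts

# Illustrative shape of the kubectl JSON output.
sample = {
    "items": [
        {"metadata": {"name": "ubuntu"},
         "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "2"}}},
    ]
}
print(gpu_allocatable(sample))  # {'ubuntu': 2}
```

A node that never registered the device plugin simply lacks the `nvidia.com/gpu` key, which is why the helper defaults the count to 0 instead of failing.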
  },
  {
    "path": "docusaurus.config.js",
    "content": "// @ts-check\n// Note: type annotations allow type checking and IDE autocompletion\n\nconst lightCodeTheme = require(\"prism-react-renderer/themes/github\");\nconst darkCodeTheme = require(\"prism-react-renderer/themes/dracula\");\n\n/** @type {import('@docusaurus/types').Config} */\nconst config = {\n  title: \"MLOps for ALL\",\n  tagline: \"모두를 위한 MLOps\",\n  favicon: \"img/favicon.ico\",\n\n  // Set the production url of your site here\n  url: \"https://mlops-for-all.github.io\",\n  // Set the /<baseUrl>/ pathname under which your site is served\n  // For GitHub pages deployment, it is often '/<projectName>/'\n  baseUrl: \"/\",\n\n  // GitHub pages deployment config.\n  // If you aren't using GitHub pages, you don't need these.\n  organizationName: \"mlops-for-all\", // Usually your GitHub org/user name.\n  projectName: \"mlops-for-all.github.io\", // Usually your repo name.\n\n  onBrokenLinks: \"throw\",\n  onBrokenMarkdownLinks: \"warn\",\n\n  // Even if you don't use internationalization, you can use this field to set useful\n  // metadata like html lang. 
For example, if your site is Chinese, you may want\n  // to replace \"en\" with \"zh-Hans\".\n  i18n: {\n    defaultLocale: \"ko\",\n    locales: [\"en\", \"ko\"],\n    path: \"i18n\",\n  },\n  plugins: [\n    [\n      \"content-docs\",\n      /** @type {import('@docusaurus/plugin-content-docs').Options} */\n      ({\n        id: \"community\",\n        path: \"community\",\n        routeBasePath: \"community\",\n        editUrl:\n          \"https://github.com/mlops-for-all/mlops-for-all.github.io/tree/main/\",\n        editCurrentVersion: true,\n        sidebarPath: require.resolve(\"./sidebarsCommunity.js\"),\n        showLastUpdateAuthor: true,\n        showLastUpdateTime: true,\n      }),\n    ],\n  ],\n  presets: [\n    [\n      \"classic\",\n      /** @type {import('@docusaurus/preset-classic').Options} */\n      ({\n        docs: {\n          sidebarPath: require.resolve(\"./sidebars.js\"),\n          // Please change this to your repo.\n          // Remove this to remove the \"edit this page\" links.\n          editUrl:\n            \"https://github.com/mlops-for-all/mlops-for-all.github.io/tree/main/\",\n          showLastUpdateAuthor: true,\n          showLastUpdateTime: true,\n          lastVersion: \"current\",\n          versions: {\n            current: {\n              label: \"1.0\",\n            },\n          },\n        },\n        // blog: {\n        //   showReadingTime: true,\n        //   // Please change this to your repo.\n        //   // Remove this to remove the \"edit this page\" links.\n        //   editUrl:\n        //     'https://github.com/facebook/docusaurus/tree/main/packages/create-docusaurus/templates/shared/',\n        // },\n        theme: {\n          customCss: require.resolve(\"./src/css/custom.css\"),\n        },\n        gtag: {\n          trackingID: \"G-097K82469K\",\n          anonymizeIP: true,\n        },\n      }),\n    ],\n  ],\n  themeConfig:\n    /** @type {import('@docusaurus/preset-classic').ThemeConfig} */\n   
 ({\n      // Replace with your project's social card\n      image: \"img/logo-mlops-for-all.png\",\n      navbar: {\n        title: \"MLOps for ALL\",\n        logo: {\n          alt: \"My Site Logo\",\n          src: \"img/logo-mlops-for-all.png\",\n        },\n        items: [\n          {\n            type: \"docSidebar\",\n            sidebarId: \"tutorialSidebar\",\n            position: \"left\",\n            label: \"Tutorial\",\n          },\n          {\n            type: \"docSidebar\",\n            sidebarId: \"preSidebar\",\n            position: \"left\",\n            label: \"Prerequisites\",\n          },\n          {\n            to: \"/community/contributors\",\n            position: \"left\",\n            label: \"Community\",\n          },\n          // {to: '/blog', label: 'Blog', position: 'left'},\n          {\n            type: \"docsVersionDropdown\",\n            position: \"right\",\n          },\n          {\n            type: \"localeDropdown\",\n            position: \"right\",\n          },\n          {\n            href: \"https://github.com/mlops-for-all/mlops-for-all.github.io/tree/main/\",\n            label: \"GitHub\",\n            position: \"right\",\n          },\n        ],\n      },\n      footer: {\n        style: \"dark\",\n        logo: {\n          alt: \"MakinaRocks\",\n          src: \"/img/makinarocks.png\",\n          href: \"https://makinarocks.ai\",\n        },\n        copyright: `Copyright © 2021-${new Date().getFullYear()} MakinaRocks. Built with Docusaurus.`,\n      },\n      prism: {\n        theme: lightCodeTheme,\n        darkTheme: darkCodeTheme,\n      },\n    }),\n};\n\nmodule.exports = config;\n"
  },
  {
    "path": "i18n/en/code.json",
    "content": "{\n  \"team.profile.Jongseob Jeon.body\": {\n    \"message\": \"I work as a machine learning engineer at MakinaRocks. Just as many people were able to approach deep learning easily through Deep Learning for Everyone, I hope many people can approach MLOps easily through MLOps for ALL.\"\n  },\n  \"team.profile.Jaeyeon Kim.body\": {\n    \"message\": \"I am interested in automating inefficient tasks.\"\n  },\n  \"team.profile.Youngchel Jang.body\": {\n    \"message\": \"I work as an MLOps Engineer at MakinaRocks. I try to keep my thinking simple.\"\n  },\n  \"team.profile.Jongsun Shinn.body\": {\n    \"message\": \"I work as an ML Engineer at MakinaRocks.\"\n  },\n  \"team.profile.Sangwoo Shim.body\": {\n    \"message\": \"I work as the CTO of MakinaRocks. MakinaRocks is a startup that develops machine-learning-based industrial AI solutions. Solving problems on industrial sites so that people can focus on the work that matters most is what we do.\"\n  },\n  \"team.profile.Seunghyun Ko.body\": {\n    \"message\": \"I work as an MLOps Engineer at 3i. I am deeply interested in kubeflow.\"\n  },\n  \"team.profile.SeungTae Kim.body\": {\n    \"message\": \"I work as an Applied AI Engineer intern at a startup called Genesis Lab. I believe the machine learning ecosystem will bring great changes across our industries, and I am moving forward one step at a time.\"\n  },\n  \"team.profile.Youngdon Tae.body\": {\n    \"message\": \"I work as an ML engineer at Backpackr. I am interested in natural language processing, recommender systems, and MLOps.\"\n  },\n  \"theme.ErrorPageContent.title\": {\n    \"message\": \"This page crashed.\",\n    \"description\": \"The title of the fallback page when the page crashed\"\n  },\n  \"theme.NotFound.title\": {\n    \"message\": \"Page Not Found\",\n    \"description\": \"The title of the 404 page\"\n  },\n  \"theme.NotFound.p1\": {\n    \"message\": \"We could not find what you were looking for.\",\n    \"description\": \"The first paragraph of the 404 page\"\n  },\n  \"theme.NotFound.p2\": {\n    \"message\": \"Please contact the owner of the site that linked you to the original URL and let them know their link is broken.\",\n    \"description\": \"The 2nd paragraph of the 404 page\"\n  },\n  \"theme.admonition.note\": {\n    \"message\": \"note\",\n    \"description\": \"The default label used for the Note admonition (:::note)\"\n  },\n  \"theme.admonition.tip\": {\n    \"message\": \"tip\",\n    \"description\": \"The default label used for the Tip admonition (:::tip)\"\n  },\n  \"theme.admonition.danger\": {\n    \"message\": \"danger\",\n    \"description\": \"The default label used for the Danger admonition (:::danger)\"\n  },\n  \"theme.admonition.info\": {\n    \"message\": \"info\",\n    \"description\": \"The default label used for the Info admonition (:::info)\"\n  },\n  \"theme.admonition.caution\": {\n    \"message\": \"caution\",\n    \"description\": \"The default label used for the Caution admonition (:::caution)\"\n  },\n  \"theme.BackToTopButton.buttonAriaLabel\": {\n    \"message\": \"Scroll back to top\",\n    \"description\": \"The ARIA label for the back to top button\"\n  },\n  \"theme.blog.archive.title\": {\n    \"message\": \"Archive\",\n    \"description\": \"The page & hero title of the blog archive page\"\n  },\n  \"theme.blog.archive.description\": {\n    \"message\": \"Archive\",\n    \"description\": \"The page & hero description of the blog archive page\"\n  },\n  \"theme.blog.paginator.navAriaLabel\": {\n  
  \"message\": \"Blog list page navigation\",\n    \"description\": \"The ARIA label for the blog pagination\"\n  },\n  \"theme.blog.paginator.newerEntries\": {\n    \"message\": \"Newer Entries\",\n    \"description\": \"The label used to navigate to the newer blog posts page (previous page)\"\n  },\n  \"theme.blog.paginator.olderEntries\": {\n    \"message\": \"Older Entries\",\n    \"description\": \"The label used to navigate to the older blog posts page (next page)\"\n  },\n  \"theme.blog.post.paginator.navAriaLabel\": {\n    \"message\": \"Blog post page navigation\",\n    \"description\": \"The ARIA label for the blog posts pagination\"\n  },\n  \"theme.blog.post.paginator.newerPost\": {\n    \"message\": \"Newer Post\",\n    \"description\": \"The blog post button label to navigate to the newer/previous post\"\n  },\n  \"theme.blog.post.paginator.olderPost\": {\n    \"message\": \"Older Post\",\n    \"description\": \"The blog post button label to navigate to the older/next post\"\n  },\n  \"theme.blog.post.plurals\": {\n    \"message\": \"One post|{count} posts\",\n    \"description\": \"Pluralized label for \\\"{count} posts\\\". 
Use as much plural forms (separated by \\\"|\\\") as your language support (see https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html)\"\n  },\n  \"theme.blog.tagTitle\": {\n    \"message\": \"{nPosts} tagged with \\\"{tagName}\\\"\",\n    \"description\": \"The title of the page for a blog tag\"\n  },\n  \"theme.tags.tagsPageLink\": {\n    \"message\": \"View All Tags\",\n    \"description\": \"The label of the link targeting the tag list page\"\n  },\n  \"theme.colorToggle.ariaLabel\": {\n    \"message\": \"Switch between dark and light mode (currently {mode})\",\n    \"description\": \"The ARIA label for the navbar color mode toggle\"\n  },\n  \"theme.colorToggle.ariaLabel.mode.dark\": {\n    \"message\": \"dark mode\",\n    \"description\": \"The name for the dark color mode\"\n  },\n  \"theme.colorToggle.ariaLabel.mode.light\": {\n    \"message\": \"light mode\",\n    \"description\": \"The name for the light color mode\"\n  },\n  \"theme.docs.breadcrumbs.navAriaLabel\": {\n    \"message\": \"Breadcrumbs\",\n    \"description\": \"The ARIA label for the breadcrumbs\"\n  },\n  \"theme.docs.DocCard.categoryDescription\": {\n    \"message\": \"{count} items\",\n    \"description\": \"The default description for a category card in the generated index about how many items this category includes\"\n  },\n  \"theme.docs.paginator.navAriaLabel\": {\n    \"message\": \"Docs pages\",\n    \"description\": \"The ARIA label for the docs pagination\"\n  },\n  \"theme.docs.paginator.previous\": {\n    \"message\": \"Previous\",\n    \"description\": \"The label used to navigate to the previous doc\"\n  },\n  \"theme.docs.paginator.next\": {\n    \"message\": \"Next\",\n    \"description\": \"The label used to navigate to the next doc\"\n  },\n  \"theme.docs.tagDocListPageTitle.nDocsTagged\": {\n    \"message\": \"One doc tagged|{count} docs tagged\",\n    \"description\": \"Pluralized label for \\\"{count} docs tagged\\\". 
Use as much plural forms (separated by \\\"|\\\") as your language support (see https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html)\"\n  },\n  \"theme.docs.tagDocListPageTitle\": {\n    \"message\": \"{nDocsTagged} with \\\"{tagName}\\\"\",\n    \"description\": \"The title of the page for a docs tag\"\n  },\n  \"theme.docs.versionBadge.label\": {\n    \"message\": \"Version: {versionLabel}\"\n  },\n  \"theme.docs.versions.unreleasedVersionLabel\": {\n    \"message\": \"This is unreleased documentation for {siteTitle} {versionLabel} version.\",\n    \"description\": \"The label used to tell the user that he's browsing an unreleased doc version\"\n  },\n  \"theme.docs.versions.unmaintainedVersionLabel\": {\n    \"message\": \"This is documentation for {siteTitle} {versionLabel}, which is no longer actively maintained.\",\n    \"description\": \"The label used to tell the user that he's browsing an unmaintained doc version\"\n  },\n  \"theme.docs.versions.latestVersionSuggestionLabel\": {\n    \"message\": \"For up-to-date documentation, see the {latestVersionLink} ({versionLabel}).\",\n    \"description\": \"The label used to tell the user to check the latest version\"\n  },\n  \"theme.docs.versions.latestVersionLinkLabel\": {\n    \"message\": \"latest version\",\n    \"description\": \"The label used for the latest version suggestion link label\"\n  },\n  \"theme.common.editThisPage\": {\n    \"message\": \"Edit this page\",\n    \"description\": \"The link label to edit the current page\"\n  },\n  \"theme.common.headingLinkTitle\": {\n    \"message\": \"Direct link to {heading}\",\n    \"description\": \"Title for link to heading\"\n  },\n  \"theme.lastUpdated.atDate\": {\n    \"message\": \" on {date}\",\n    \"description\": \"The words used to describe on which date a page has been last updated\"\n  },\n  \"theme.lastUpdated.byUser\": {\n    \"message\": \" by {user}\",\n    \"description\": \"The words used to describe by 
who the page has been last updated\"\n  },\n  \"theme.lastUpdated.lastUpdatedAtBy\": {\n    \"message\": \"Last updated{atDate}{byUser}\",\n    \"description\": \"The sentence used to display when a page has been last updated, and by who\"\n  },\n  \"theme.navbar.mobileVersionsDropdown.label\": {\n    \"message\": \"Versions\",\n    \"description\": \"The label for the navbar versions dropdown on mobile view\"\n  },\n  \"theme.tags.tagsListLabel\": {\n    \"message\": \"Tags:\",\n    \"description\": \"The label alongside a tag list\"\n  },\n  \"theme.AnnouncementBar.closeButtonAriaLabel\": {\n    \"message\": \"Close\",\n    \"description\": \"The ARIA label for close button of announcement bar\"\n  },\n  \"theme.blog.sidebar.navAriaLabel\": {\n    \"message\": \"Blog recent posts navigation\",\n    \"description\": \"The ARIA label for recent posts in the blog sidebar\"\n  },\n  \"theme.CodeBlock.copied\": {\n    \"message\": \"Copied\",\n    \"description\": \"The copied button label on code blocks\"\n  },\n  \"theme.CodeBlock.copyButtonAriaLabel\": {\n    \"message\": \"Copy code to clipboard\",\n    \"description\": \"The ARIA label for copy code blocks button\"\n  },\n  \"theme.CodeBlock.copy\": {\n    \"message\": \"Copy\",\n    \"description\": \"The copy button label on code blocks\"\n  },\n  \"theme.CodeBlock.wordWrapToggle\": {\n    \"message\": \"Toggle word wrap\",\n    \"description\": \"The title attribute for toggle word wrapping button of code block lines\"\n  },\n  \"theme.DocSidebarItem.toggleCollapsedCategoryAriaLabel\": {\n    \"message\": \"Toggle the collapsible sidebar category '{label}'\",\n    \"description\": \"The ARIA label to toggle the collapsible sidebar category\"\n  },\n  \"theme.NavBar.navAriaLabel\": {\n    \"message\": \"Main\",\n    \"description\": \"The ARIA label for the main navigation\"\n  },\n  \"theme.navbar.mobileLanguageDropdown.label\": {\n    \"message\": \"Languages\",\n    \"description\": \"The label for the 
mobile language switcher dropdown\"\n  },\n  \"theme.TOCCollapsible.toggleButtonLabel\": {\n    \"message\": \"On this page\",\n    \"description\": \"The label used by the button on the collapsible TOC component\"\n  },\n  \"theme.blog.post.readingTime.plurals\": {\n    \"message\": \"One min read|{readingTime} min read\",\n    \"description\": \"Pluralized label for \\\"{readingTime} min read\\\". Use as much plural forms (separated by \\\"|\\\") as your language support (see https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html)\"\n  },\n  \"theme.blog.post.readMore\": {\n    \"message\": \"Read More\",\n    \"description\": \"The label used in blog post item excerpts to link to full blog posts\"\n  },\n  \"theme.blog.post.readMoreLabel\": {\n    \"message\": \"Read more about {title}\",\n    \"description\": \"The ARIA label for the link to full blog posts from excerpts\"\n  },\n  \"theme.docs.breadcrumbs.home\": {\n    \"message\": \"Home page\",\n    \"description\": \"The ARIA label for the home page in the breadcrumbs\"\n  },\n  \"theme.docs.sidebar.collapseButtonTitle\": {\n    \"message\": \"Collapse sidebar\",\n    \"description\": \"The title attribute for collapse button of doc sidebar\"\n  },\n  \"theme.docs.sidebar.collapseButtonAriaLabel\": {\n    \"message\": \"Collapse sidebar\",\n    \"description\": \"The title attribute for collapse button of doc sidebar\"\n  },\n  \"theme.docs.sidebar.navAriaLabel\": {\n    \"message\": \"Docs sidebar\",\n    \"description\": \"The ARIA label for the sidebar navigation\"\n  },\n  \"theme.docs.sidebar.closeSidebarButtonAriaLabel\": {\n    \"message\": \"Close navigation bar\",\n    \"description\": \"The ARIA label for close button of mobile sidebar\"\n  },\n  \"theme.navbar.mobileSidebarSecondaryMenu.backButtonLabel\": {\n    \"message\": \"← Back to main menu\",\n    \"description\": \"The label of the back button to return to main menu, inside the mobile navbar sidebar 
secondary menu (notably used to display the docs sidebar)\"\n  },\n  \"theme.docs.sidebar.toggleSidebarButtonAriaLabel\": {\n    \"message\": \"Toggle navigation bar\",\n    \"description\": \"The ARIA label for hamburger menu button of mobile navigation\"\n  },\n  \"theme.docs.sidebar.expandButtonTitle\": {\n    \"message\": \"Expand sidebar\",\n    \"description\": \"The ARIA label and title attribute for expand button of doc sidebar\"\n  },\n  \"theme.docs.sidebar.expandButtonAriaLabel\": {\n    \"message\": \"Expand sidebar\",\n    \"description\": \"The ARIA label and title attribute for expand button of doc sidebar\"\n  },\n  \"theme.ErrorPageContent.tryAgain\": {\n    \"message\": \"Try again\",\n    \"description\": \"The label of the button to try again rendering when the React error boundary captures an error\"\n  },\n  \"theme.common.skipToMainContent\": {\n    \"message\": \"Skip to main content\",\n    \"description\": \"The skip to content label used for accessibility, allowing to rapidly navigate to main content with keyboard tab/enter navigation\"\n  },\n  \"theme.tags.tagsPageTitle\": {\n    \"message\": \"Tags\",\n    \"description\": \"The title of the tag list page\"\n  }\n}\n"
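Several of the messages above (for example `theme.blog.post.plurals`) pack one variant per CLDR plural category into a single `|`-separated string. The following is a rough Python sketch of how such a message can be resolved, assuming English's two plural categories; Docusaurus itself picks the form index via `Intl.PluralRules`, so this helper is illustrative only.

```python
def localize_plural(message, count):
    """Resolve a Docusaurus-style pluralized message such as
    "One post|{count} posts" using English-like rules:
    the first form for exactly one item, the last form otherwise."""
    forms = message.split("|")
    form = forms[0] if count == 1 and len(forms) > 1 else forms[-1]
    return form.replace("{count}", str(count))

print(localize_plural("One post|{count} posts", 1))   # One post
print(localize_plural("One post|{count} posts", 42))  # 42 posts
```

Languages with more plural categories simply supply more `|`-separated forms in their own `code.json`, which is why the descriptions point translators at the CLDR plural-rules chart.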
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-blog/options.json",
    "content": "{\n  \"title\": {\n    \"message\": \"Blog\",\n    \"description\": \"The title for the blog used in SEO\"\n  },\n  \"description\": {\n    \"message\": \"Blog\",\n    \"description\": \"The description for the blog used in SEO\"\n  },\n  \"sidebar.title\": {\n    \"message\": \"Recent posts\",\n    \"description\": \"The label for the left sidebar\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/_category_.json",
    "content": "{\n  \"label\": \"API Deployment\",\n  \"position\": 7,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/seldon-children.md",
    "content": "---\ntitle : \"6. Multi Models\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\nPreviously, the methods explained were all targeted at a single model. On this page, we will look at how to connect multiple models. \n\nFirst, we will create a pipeline that creates two models. We will add a StandardScaler to the SVC model we used before and store it.\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_scaler_from_csv(\n    data_path: InputPath(\"csv\"),\n    scaled_data_path: OutputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n):\n    import dill\n    import pandas as pd\n    from sklearn.preprocessing import StandardScaler\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    data = pd.read_csv(data_path)\n\n    scaler = StandardScaler()\n    scaled_data = scaler.fit_transform(data)\n    scaled_data = pd.DataFrame(scaled_data, columns=data.columns, index=data.index)\n\n    scaled_data.to_csv(scaled_data_path, index=False)\n\n    with open(model_path, mode=\"wb\") 
as file_writer:\n        dill.dump(scaler, file_writer)\n\n    input_example = data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(data, scaler.transform(data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_svc_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        
dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"multi_model_pipeline\")\ndef multi_model_pipeline(kernel: str = \"rbf\"):\n    iris_data = load_iris_data()\n    scaled_data = train_scaler_from_csv(data=iris_data.outputs[\"data\"])\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"scaler\",\n        model=scaled_data.outputs[\"model\"],\n        input_example=scaled_data.outputs[\"input_example\"],\n        
signature=scaled_data.outputs[\"signature\"],\n        conda_env=scaled_data.outputs[\"conda_env\"],\n    )\n    model = train_svc_from_csv(\n        train_data=scaled_data.outputs[\"scaled_data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"svc\",\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(multi_model_pipeline, \"multi_model_pipeline.yaml\")\n\n```\n\nIf you upload the pipeline, it will look like this.\n![children-kubeflow.png](./img/children-kubeflow.png)\n\nWhen you check the MLflow dashboard, two models will be generated, as shown below. \n\n![children-mlflow.png](./img/children-mlflow.png)\n\nAfter checking the run_id of each one, define the SeldonDeployment spec as follows.\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - 
\"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\n```\nTwo models have been created so each model's initContainer and container must be defined. This field takes input as an array and the order does not matter. 
The order in which the models are executed is defined in the graph.\n```bash\ngraph:\n  name: scaler\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  - name: predict_method\n    type: STRING\n    value: \"transform\"\n  children:\n  - name: svc\n    type: MODEL\n    parameters:\n    - name: model_uri\n      type: STRING\n      value: \"/mnt/models\"\n```\n
\nThe graph applies the predefined predict_method to the input it receives and then passes the result to the models defined as children. In this case, the data flows from scaler -> svc.\n
\nNow let's create the above specifications in a yaml file.\n
\n```bash\ncat <<EOF > multi-model.yaml\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n
\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n
\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n
\n        containers:\n        - name: scaler\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n
\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\nEOF\n```\n
\nCreate the API with the following command.\n```bash\nkubectl apply -f multi-model.yaml\n```\n
\nIf it is applied properly, it will be output as follows.\n```bash\nseldondeployment.machinelearning.seldon.io/multi-model-example created\n```\n
\nCheck whether the pod has been created properly.\n```bash\nkubectl get po -n kubeflow-user-example-com | grep multi-model-example\n```\n
\nIf it was created properly, you will see a pod similar to the following.\n```bash\nmulti-model-example-model-0-scaler-svc-9955fb795-n9ffw   4/4     Running     0          2m30s\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/seldon-fields.md",
    "content": "---\ntitle : \"4. Seldon Fields\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\nSummary of how Seldon Core creates an API server:\n\n1. initContainer downloads the required model from the model repository.\n2. The downloaded model is passed to the container.\n3. The container runs an API server enclosing the model.\n4. The API can be requested at the generated API server address to receive the inference values from the model.\n\nThe yaml file defining the custom resource, SeldonDeployment, which is most commonly used when using Seldon Core is as follows:\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n\n        containers:\n        - name: model\n          image: seldonio/sklearnserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n\n```\n\nThe `name` and `predictors` fields of SeldonDeployment are required fields. `name` is mainly used as a name to differentiate pods in Kubernetes and does not have a major effect. 
`predictors` must be an array with `name`, `componentSpecs` and `graph` defined. Here also, `name` is mainly used as a name to differentiate pods in Kubernetes and does not have a major effect.\n
\nNow let's take a look at the fields that need to be defined in `componentSpecs` and `graph`.\n
\n## componentSpecs\n
\n`componentSpecs` must be an array consisting of the `spec` key. The `spec` must have the fields `volumes`, `initContainers` and `containers` defined.\n
\n### volumes\n
\n```bash\nvolumes:\n- name: model-provision-location\n  emptyDir: {}\n```\n`volumes` defines the space used to store the models downloaded by the initContainer; it is an array whose entries consist of `name` and `emptyDir`. These values are used only once when downloading and moving the models, so they do not need to be modified significantly.\n
\n### initContainers\n
\n```bash\n- name: model-initializer\n  image: gcr.io/kfserving/storage-initializer:v0.4.0\n  args:\n    - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n    - \"/mnt/models\"\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\nThe `args` field contains the system arguments necessary to download the model from the model repository and move it to the specified model path. It provides the required parameters for the initContainer to perform the downloading and storage operations.\n
\nThe initContainer is responsible for downloading the model that the API will use, so its fields specify the information needed to download the model from the model registry.\n
\n`initContainers` is an array, and each model to be downloaded must be specified as a separate entry.\n
\n#### name\n`name` is the name of the init container in Kubernetes, and it is recommended to use `{model_name}-initializer` to make debugging easier.
\n\n#### image\n
\n`image` is the name of the image used to download the model, and there are two images recommended by Seldon Core:\n- gcr.io/kfserving/storage-initializer:v0.4.0\n- seldonio/rclone-storage-initializer:1.13.0-dev\n
\nFor more detailed information, please refer to the following resources:\n
\n- [kfserving](https://docs.seldon.io/projects/seldon-core/en/latest/servers/kfserving-storage-initializer.html)\n- [rclone](https://github.com/SeldonIO/seldon-core/tree/master/components/rclone-storage-initializer)\n
\nIn MLOps for ALL, we use kfserving for downloading and storing models.\n
\n#### args\n
\n```bash\nargs:\n  - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n  - \"/mnt/models\"\n```\n
\nWhen the gcr.io/kfserving/storage-initializer:v0.4.0 Docker image is run, it takes its arguments as an array. The first array value is the address of the model to be downloaded. The second array value is the path where the downloaded model will be stored (Seldon Core usually stores it in `/mnt/models`).\n
\n#### volumeMounts\n
\n```bash\nvolumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\n
\n`volumeMounts` mounts the volume defined in `volumes` into the container so that `/mnt/models` can be shared. For more information, refer to [Kubernetes Volume](https://kubernetes.io/docs/concepts/storage/volumes/).\n
\n### containers\n
\n```bash\ncontainers:\n- name: model\n  image: seldonio/sklearnserver:1.8.0-dev\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n    readOnly: true\n  securityContext:\n    privileged: true\n    runAsUser: 0\n    runAsGroup: 0\n```\n
\n`containers` defines the fields that determine the configuration used when the model is run as an API.\n
\n#### name\n
\nThe `name` field is the name of the container in Kubernetes. It should be the name of the model being served.\n
\n#### image\n
\nThe `image` field represents the image used to convert the model into an API.\n
The image should have all the packages needed to load and run the model installed.\n
\nSeldon Core provides official images for different types of models, including:\n
\n- seldonio/sklearnserver\n- seldonio/mlflowserver\n- seldonio/xgboostserver\n- seldonio/tfserving\n
\nYou can choose the appropriate image based on the type of model you are using.\n
\n#### volumeMounts\n
\n```bash\nvolumeMounts:\n- mountPath: /mnt/models\n  name: model-provision-location\n  readOnly: true\n```\n
\nThis field specifies the path where the model downloaded by the initContainer is located. To prevent the model from being modified, `readOnly: true` is also set.\n
\n#### securityContext\n
\n```bash\nsecurityContext:\n  privileged: true\n  runAsUser: 0\n  runAsGroup: 0\n```\n
\nWhen installing necessary packages, the pod may not be able to perform the installation due to a lack of permissions. To address this, root permission is granted (note that this can cause security issues in an actual service).\n
\n## graph\n
\n```bash\ngraph:\n  name: model\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  children: []\n```\n
\nThis field defines the order in which the models are executed.\n
\n### name\n
\nThe `name` field refers to the name of the model graph. It should match the name defined in the container.\n
\n### type\n
\nThe `type` field can have four different values:\n
\n1. TRANSFORMER\n2. MODEL\n3. OUTPUT_TRANSFORMER\n4. ROUTER\n
\nFor detailed explanations of each type, you can refer to the [Seldon Core Complex Graphs Metadata Example](https://docs.seldon.io/projects/seldon-core/en/latest/examples/graph-metadata.html).\n
\n### parameters\n
\nThe `parameters` field contains the values passed when the server class is initialized.\n
For the sklearnserver, you can find the required values in the [following file](https://github.com/SeldonIO/seldon-core/blob/master/servers/sklearnserver/sklearnserver/SKLearnServer.py).\n```python\nclass SKLearnServer(SeldonComponent):\n    def __init__(self, model_uri: str = None, method: str = \"predict_proba\"):\n```\n
\nAs the code shows, you can set `model_uri` and `method` through `parameters`.\n
\n### children\n
\nThe `children` field is used to chain models together in the inference graph. More details about this field will be explained on the following page.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/seldon-iris.md",
    "content": "---\ntitle : \"2. Deploy SeldonDeployment\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\", \"SeungTae Kim\"]\n---\n\n## Deploy with SeldonDeployment\n\nLet's deploy our trained model as an API using SeldonDeployment. SeldonDeployment is a custom resource definition (CRD) defined to deploy models as REST/gRPC servers on Kubernetes.\n\n#### 1. Prerequisites\n\nWe will conduct the SeldonDeployment related practice in a new namespace called seldon-deploy. After creating the namespace, set seldon-deploy as the current namespace.\n\n```bash\nkubectl create namespace seldon-deploy\nkubectl config set-context --current --namespace=seldon-deploy\n```\n\n### 2. Define Spec\n\nGenerate a yaml file to deploy SeldonDeployment. \nIn this page, we will use a publicly available iris model.\nBecause this iris model is trained through the sklearn framework, we use SKLEARN_SERVER.\n\n```bash\ncat <<EOF > iris-sdep.yaml\napiVersion: machinelearning.seldon.io/v1alpha2\nkind: SeldonDeployment\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\nspec:\n  name: iris\n  predictors:\n  - graph:\n      children: []\n      implementation: SKLEARN_SERVER\n      modelUri: gs://seldon-models/v1.12.0-dev/sklearn/iris\n      name: classifier\n    name: default\n    replicas: 1\nEOF\n```\n\nDeploy yaml file.\n\n```bash\nkubectl apply -f iris-sdep.yaml\n```\n\nCheck if the deployment was successful through the following command.\n\n```bash\nkubectl get pods --selector seldon-app=sklearn-default -n seldon-deploy\n```\n\nIf everyone runs, similar results will be printed.\n\n```bash\nNAME                                            READY   STATUS    RESTARTS   AGE\nsklearn-default-0-classifier-5fdfd7bb77-ls9tr   2/2     Running   0          5m\n```\n\n## Ingress URL\n\nNow, send a inference request to the deployed model to get the inference result. 
The API created by the SeldonDeployment follows this rule:\n`http://{NODE_IP}:{NODE_PORT}/seldon/{namespace}/{seldon-deployment-name}/api/v1.0/{method-name}/`\n
\n### NODE_IP / NODE_PORT\n
\n[Since Seldon Core was installed with Ambassador as the Ingress Controller](../setup-components/install-components-seldon.md), all APIs created by SeldonDeployment can be requested through the Ambassador Ingress gateway.\n
\nTherefore, first set the url of the Ambassador Ingress Gateway as an environment variable.\n
\n```bash\nexport NODE_IP=$(kubectl get nodes -o jsonpath='{ $.items[*].status.addresses[?(@.type==\"InternalIP\")].address }')\nexport NODE_PORT=$(kubectl get service ambassador -n seldon-system -o jsonpath=\"{.spec.ports[0].nodePort}\")\n```\n
\nCheck the set url.\n
\n```bash\necho \"NODE_IP\"=$NODE_IP\necho \"NODE_PORT\"=$NODE_PORT\n```\n
\nThe output should look similar to the following; if the cluster was set up in the cloud, you can check that the internal IP address is set.\n```bash\nNODE_IP=192.168.0.19\nNODE_PORT=30486\n```\n
\n### namespace / seldon-deployment-name\n
\nThese are the namespace in which the SeldonDeployment is deployed and its name; they correspond to the values defined in the metadata when defining the spec.\n```bash\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\n```\n
\nIn the example above, `namespace` is seldon-deploy and `seldon-deployment-name` is sklearn.\n
\n### method-name\n
\nIn SeldonDeployment, the commonly used `method-name` has two options:\n
\n1. doc\n2. predictions\n
\nThe detailed usage of each method is explained below.\n
\n## Using Swagger\n
\nFirst, let's explore how to use the doc method, which allows access to the Swagger generated by Seldon.\n
\n### 1. Accessing Swagger\n
\nAccording to the ingress URL rules above, you can access the Swagger documentation using the following URL:\n`http://192.168.0.19:30486/seldon/seldon-deploy/sklearn/api/v1.0/doc/`\n
\n![iris-swagger1.png](./img/iris-swagger1.png)\n
\n### 2. Selecting Swagger Predictions\n
\nIn the Swagger UI, select the `/seldon/seldon-deploy/sklearn/api/v1.0/predictions` endpoint.\n
\n![iris-swagger2.png](./img/iris-swagger2.png)\n
\n### 3. Choosing *Try it out*\n
\n![iris-swagger3.png](./img/iris-swagger3.png)\n
\n### 4. Inputting data in the Request body\n
\n![iris-swagger4.png](./img/iris-swagger4.png)\n
\nEnter the following data into the Request body.\n
\n```bash\n{\n  \"data\": {\n    \"ndarray\":[[1.0, 2.0, 5.0, 6.0]]\n  }\n}\n```\n
\n### 5. Check the inference results\n
\nYou can click the `Execute` button to obtain the inference result.\n
\n![iris-swagger5.png](./img/iris-swagger5.png)\n
\nIf everything is executed successfully, you will obtain the following inference result.\n
\n```bash\n{\n  \"data\": {\n    \"names\": [\n      \"t:0\",\n      \"t:1\",\n      \"t:2\"\n    ],\n    \"ndarray\": [\n      [\n        9.912315378486697e-7,\n        0.0007015931307746079,\n        0.9992974156376876\n      ]\n    ]\n  },\n  \"meta\": {\n    \"requestPath\": {\n      \"classifier\": \"seldonio/sklearnserver:1.11.2\"\n    }\n  }\n}\n```\n
\n## Using CLI\n
\nYou can also use HTTP client CLI tools such as curl to make API requests.\nFor example, request `/predictions` as follows:\n
\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n
\nYou can confirm that the following response is output normally.\n```bash\n{\"data\":{\"names\":[\"t:0\",\"t:1\",\"t:2\"],\"ndarray\":[[0.0006985194531162835,0.00366803903943666,0.995633441507447]]},\"meta\":{\"requestPath\":{\"classifier\":\"seldonio/sklearnserver:1.11.2\"}}}\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/seldon-mlflow.md",
    "content": "---\ntitle : \"5. Model from MLflow\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Model from MLflow\n\nOn this page, we will learn how to create an API using a model saved in the [MLflow Component](../kubeflow/advanced-mlflow.md).\n\n## Secret\n\nThe initContainer needs credentials to access minio and download the model. The credentials for access to minio are as follows.\n\n```bash\napiVersion: v1\ntype: Opaque\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8K=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLm1ha2luYXJvY2tzLmFp\n  USE_SSL: ZmFsc2U=\n```\n\nThe input value for `AWS_ACCESS_KEY_ID` is `minio`. However, since the input value for the secret must be an encoded value, the value that is actually entered must be the value that comes out after performing the following. \n\nThe values that need to be entered in data are as follows.\n\n- AWS_ACCESS_KEY_ID: minio\n- AWS_SECRET_ACCESS_KEY: minio123\n- AWS_ENDPOINT_URL: http://minio-service.kubeflow.svc:9000\n- USE_SSL: false\n\nThe encoding can be done using the following command.\n\n```bash\necho -n minio | base64\n```\n\nThen the following values will be output.\n\n```bash\nbWluaW8=\n```\n\nIf you do the encoding for the entire value, it will look like this:\n\n- AWS_ACCESS_KEY_ID: minio=\n- AWS_SECRET_ACCESS_KEY: minio123=\n- AWS_ENDPOINT_URL: http://minio-service.kubeflow.svc:9000=\n- USE_SSL: false=\n\nYou can generate a yaml file through the following command to create the secret.\n\n```bash\ncat <<EOF > seldon-init-container-secret.yaml\napiVersion: v1\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ntype: Opaque\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n  USE_SSL: 
ZmFsc2U=\nEOF\n```\n\nCreate the secret through the following command.\n\n```bash\nkubectl apply -f seldon-init-container-secret.yaml\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\nsecret/seldon-init-container-secret created\n```\n\n## Seldon Core yaml\n\nNow let's write the yaml file to create Seldon Core.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n```\n\nThere are two major changes compared to the previously created [Seldon Fields](../api-deployment/seldon-fields.md):\n\n1. The `envFrom` field is added to the initContainer.\n2. 
The address in the args has been changed to `s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc`.\n\n### args\n\nPreviously, we mentioned that the first element of the args array is the path to the model we want to download. So, how can we determine the path of the model stored in MLflow?\n\nTo find the path, go back to MLflow and click on the run, then click on the model, as shown below:\n\n![seldon-mlflow-0.png](./img/seldon-mlflow-0.png)\n\nYou can use the path obtained from there.\n\n### envFrom\n\nThis process involves providing the environment variables required to access MinIO and download the model. We will use the `seldon-init-container-secret` created earlier.\n\n## API Creation\n\nFirst, let's generate the YAML file based on the specification defined above.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n    
  parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: xtype\n        type: STRING\n        value: \"dataframe\"\n      children: []\nEOF\n```\n\nCreate a seldon pod.\n\n```bash\nkubectl apply -f seldon-mlflow.yaml\n\n```\n\nIf it is performed normally, it will be outputted as follows.\n\n```bash\nseldondeployment.machinelearning.seldon.io/seldon-example created\n```\n\nNow we wait until the pod is up and running properly.\n\n```bash\nkubectl get po -n kubeflow-user-example-com | grep seldon\n```\n\nIf it is outputted similarly to the following, the API has been created normally.\n\n```bash\nseldon-example-model-0-model-5c949bd894-c5f28      3/3     Running     0          69s\n```\n\nYou can confirm the execution through the following request on the API created through the CLI.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{\n    \"data\": {\n        \"ndarray\": [\n            [\n                143.0,\n                0.0,\n                30.0,\n                30.0\n            ]\n        ],\n        \"names\": [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n            \"petal width (cm)\"\n        ]\n    }\n}'\n```\n\nIf executed normally, you can get the following results.\n\n```bash\n{\"data\":{\"names\":[],\"ndarray\":[\"Virginica\"]},\"meta\":{\"requestPath\":{\"model\":\"ghcr.io/mlops-for-all/mlflowserver:e141f57\"}}}\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/seldon-pg.md",
    "content": "---\ntitle : \"3. Seldon Monitoring\"\ndescription: \"Prometheus & Grafana 확인하기\"\nsidebar_position: 3\ndate: 2021-12-24\nlastmod: 2021-12-24\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Grafana & Prometheus\n\nNow, let's perform repeated API requests with the SeldonDeployment we created on the [previous page](../api-deployment/seldon-iris.md) and check if the dashboard changes.\n\n### Dashboard\n\n[Forward the dashboard created earlier](../setup-components/install-components-pg.md).\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\n### Request API\n\nRequest **repeated** to the [previously created Seldon Deployment](../api-deployment/seldon-iris.md#using-cli).\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\nFurthermore, when checking the Grafana dashboard, you can observe that the Global Request Rate increases momentarily from `0 ops`.\n\n![repeat-raise.png](./img/repeat-raise.png)\n\nThis confirms that Prometheus and Grafana have been successfully installed and configured.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/api-deployment/what-is-api-deployment.md",
    "content": "---\ntitle : \"1. What is API Deployment?\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## What is API Deployment?\n\nAfter training a machine learning model, how should it be used? When training a machine learning model, you expect a model with higher performance to come out, but when you infer with the trained model, you want to get the inference results quickly and easily.\n\nWhen you want to check the inference results of the model, you can load the trained model and infer through a Jupyter notebook or a Python script. However, this method becomes inefficient as the model gets bigger, and you can only use the model in the environment where the trained model exists and cannot be used by many people.\n\nTherefore, when machine learning is used in actual services, it uses an API to use the trained model. The model is loaded only once in the environment where the API server is running, and you can easily get the inference results using DNS, and you can also link it with other services.\n\nHowever, there is a lot of ancillary work necessary to make the model into an API. In order to make it easier to make an API, machine learning frameworks such as Tensorflow have developed inference engines.\n\nUsing inference engines, we can create APIs (REST or gRPC) that can load and infer from machine learning models developed and trained in the corresponding frameworks. 
When we send a request with the data we want to infer to an API server built using these inference engines, the engine performs the inference and sends back the results in the response.\n
\nSome well-known open-source inference engines include:\n
\n- [Tensorflow: Tensorflow Serving](https://github.com/tensorflow/serving)\n- [PyTorch: Torchserve](https://github.com/pytorch/serve)\n- [ONNX: ONNX Runtime](https://github.com/microsoft/onnxruntime)\n
\nAlthough not officially provided by the frameworks themselves, there are also open-source inference engines developed for popular frameworks like sklearn and XGBoost.\n
\nDeploying and serving the model's inference results through an API is called **API deployment**.\n
\n## Serving Framework\n
\nWe have seen that various inference engines have been developed. Now, if we want to deploy these inference engines in a Kubernetes environment for API deployment, what steps are involved? We need to deploy various Kubernetes resources such as Deployments for the inference engines, Services to create endpoints for sending inference requests, and Ingress to forward external inference requests to the inference engines. Additionally, we may need to handle requirements such as scaling out when there is a high volume of inference requests, monitoring the status of the inference engines, and updating the version when an improved model is available. There are many considerations when operating an inference engine, and it goes beyond just a few tasks.\n
\nTo address these requirements, serving frameworks have been developed to further abstract the deployment of inference engines in a Kubernetes environment.\n
\nSome popular serving frameworks include:\n
\n- [Seldon Core](https://github.com/SeldonIO/seldon-core)\n- [Kserve](https://github.com/kserve)\n- [BentoML](https://github.com/bentoml/BentoML)\n
\nIn *MLOps for ALL*, we use Seldon Core to demonstrate the process of API deployment.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/appendix/_category_.json",
    "content": "{\n  \"label\": \"Appendix\",\n  \"position\": 9,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/appendix/metallb.md",
    "content": "---\ntitle: \"2. Install load balancer metallb for Bare Metal Cluster\"\nsidebar_position: 2\n---\n\n## What is MetalLB?\n\n## Installing MetalLB\n\nWhen using Kubernetes on cloud platforms such as AWS, GCP, and Azure, they provide their own load balancers. However, for on-premises clusters, an additional module needs to be installed to enable load balancing. [MetalLB](https://metallb.universe.tf/) is an open-source project that provides a load balancer for bare metal environments.\n\n## Requirements\n\n| Requirement                                                 | Version and Details                                          |\n| ----------------------------------------------------------- | ------------------------------------------------------------ |\n| Kubernetes                                                  | Version >= v1.13.0 without built-in load balancing            |\n| [Compatible Network CNI](https://metallb.universe.tf/installation/network-addons/) | Calico, Canal, Cilium, Flannel, Kube-ovn, Kube-router, Weave Net |\n| IPv4 addresses                                              | Used for MetalLB deployment                                  |\n| BGP mode                                                    | One or more routers that support BGP functionality           |\n| TCP/UDP port 7946 open between nodes                         | Memberlist requirement                                      |\n\n### MetalLB Installation\n\n#### Preparation\n\nIf you are using kube-proxy in IPVS mode, starting from Kubernetes v1.14.2, you need to enable strict ARP mode.  \nBy default, Kube-router enables strict ARP, so this feature is not required if you are using Kube-router as a service proxy.  
\nBefore applying strict ARP mode, check the current mode.\n\n```bash\n# see what changes would be made, returns nonzero returncode if different\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\ngrep strictARP\n```\n\n```bash\nstrictARP: false\n```\n\nIf strictARP: false is outputted, run the following to change it to strictARP: true.\n(If strictARP: true is already outputted, you do not need to execute the following command).\n\n```bash\n# actually apply the changes, returns nonzero returncode on errors only\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\nsed -e \"s/strictARP: false/strictARP: true/\" | \\\nkubectl apply -f - -n kube-system\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\nWarning: resource configmaps/kube-proxy is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.\nconfigmap/kube-proxy configured\n```\n\n### Installation - Manifest\n\n#### 1. Install MetalLB.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/namespace.yaml\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/metallb.yaml\n```\n\n#### 2. 
Check installation.\n
\nWait until both pods in the metallb-system namespace are Running.\n
\n```bash\nkubectl get pod -n metallb-system\n```\n
\nWhen everything is Running, similar results will be output.\n
\n```bash\nNAME                          READY   STATUS    RESTARTS   AGE\ncontroller-7dcc8764f4-8n92q   1/1     Running   1          1m\nspeaker-fnf8l                 1/1     Running   1          1m\n```\n
\nThe components of the manifest are as follows:\n
\n- metallb-system/controller\n  - Deployed as a deployment, responsible for assigning external IP addresses for load balancing.\n- metallb-system/speaker\n  - Deployed as a daemonset, responsible for configuring network communication to connect external traffic and services.\n
\nThe manifest also includes the RBAC permissions necessary for the controller and speaker components to operate.\n
\n## Configuration\n
\nSetting up the load balancing policy of MetalLB can be done by deploying a configmap containing the related configuration information.\n
\nThere are two modes that can be configured in MetalLB:\n
\n1. [Layer 2 Mode](https://metallb.universe.tf/concepts/layer2/)\n2. [BGP Mode](https://metallb.universe.tf/concepts/bgp/)\n
\nHere we will proceed with Layer 2 mode.\n
\n### Layer 2 Configuration\n
\nIn Layer 2 mode, it is enough to simply set the range of IP addresses to be used.
\nWhen using Layer 2 mode, there is no need to bind an IP to the network interface of the worker node; MetalLB responds directly to ARP requests on the local network and provides the machine's MAC address to the client.\n\nThe following `metallb_config.yaml` file configures MetalLB to manage the IP range 192.168.35.100 ~ 192.168.35.110 and to use Layer 2 mode.\n\nIf the cluster node and the client node are separate machines, the range 192.168.35.100 ~ 192.168.35.110 must be accessible from both the client node and the cluster node.\n\n#### metallb_config.yaml\n\n```yaml\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  namespace: metallb-system\n  name: config\ndata:\n  config: |\n    address-pools:\n    - name: default\n      protocol: layer2\n      addresses:\n      - 192.168.35.100-192.168.35.110  # IP address range\n```\n\nApply the above settings.\n\n```bash\nkubectl apply -f metallb_config.yaml\n```\n\nIf deployed normally, it will output as follows.\n\n```bash\nconfigmap/config created\n```\n\n## Using MetalLB\n\n### Kubeflow Dashboard\n\nTo receive the load-balancing feature from MetalLB, we will change the type of the istio-ingressgateway service in the istio-system namespace, which serves the Kubeflow Dashboard, to `LoadBalancer`. First, check the current status.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\nThe type of this service is ClusterIP and you can see that the External-IP value is `none`.\n\n```bash\nNAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                                        AGE\nistio-ingressgateway   ClusterIP   10.103.72.5   <none>        15021/TCP,80/TCP,443/TCP,31400/TCP,15443/TCP   4h21m\n```\n\nChange the type to LoadBalancer and, if you want to specify a desired IP address, add the loadBalancerIP item.
\nIf you do not add it, IP addresses will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/istio-ingressgateway -n istio-system\n```\n\n```yaml\nspec:\n  clusterIP: 10.103.72.5\n  clusterIPs:\n  - 10.103.72.5\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: status-port\n    port: 15021\n    protocol: TCP\n    targetPort: 15021\n  - name: http2\n    port: 80\n    protocol: TCP\n    targetPort: 8080\n  - name: https\n    port: 443\n    protocol: TCP\n    targetPort: 8443\n  - name: tcp\n    port: 31400\n    protocol: TCP\n    targetPort: 31400\n  - name: tls\n    port: 15443\n    protocol: TCP\n    targetPort: 15443\n  selector:\n    app: istio-ingressgateway\n    istio: ingressgateway\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.100   # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf you check again, you will see that the External-IP value is `192.168.35.100`.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nNAME                   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)                                                                      AGE\nistio-ingressgateway   LoadBalancer   10.103.72.5   192.168.35.100   15021:31054/TCP,80:30853/TCP,443:30443/TCP,31400:30012/TCP,15443:31650/TCP   5h1m\n```\n\nOpen a web browser and connect to [http://192.168.35.100](http://192.168.35.100) to confirm that the following screen is displayed.\n\n![login-after-istio-ingressgateway-setting.png](./img/login-after-istio-ingressgateway-setting.png)\n\n### minio Dashboard\n\nTo receive the load-balancing feature from MetalLB, we will change the type of the minio-service service in the kubeflow namespace, which serves the minio Dashboard, to LoadBalancer. First, check the current status.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\nThe type of this service is ClusterIP and you can confirm that the External-IP value is `none`.\n\n```bash\nNAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE\nminio-service   ClusterIP   10.109.209.87   <none>        9000/TCP   5h14m\n```\n\nChange the type to LoadBalancer and, if you want to specify a desired IP address, add the loadBalancerIP item. If you do not add it, the IP address will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/minio-service -n kubeflow\n```\n\n```yaml\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    kubectl.kubernetes.io/last-applied-configuration: |\n      {\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"labels\":{\"application-crd-id\":\"kubeflow-pipelines\"},\"name\":\"minio-ser>\n  creationTimestamp: \"2022-01-05T08:44:23Z\"\n  labels:\n    application-crd-id: kubeflow-pipelines\n  name: minio-service\n  namespace: kubeflow\n  resourceVersion: \"21120\"\n  uid: 0053ee28-4f87-47bb-ad6b-7ad68aa29a48\nspec:\n  clusterIP: 10.109.209.87\n  clusterIPs:\n  - 10.109.209.87\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: http\n    port: 9000\n    protocol: TCP\n    targetPort: 9000\n  selector:\n    app: minio\n    application-crd-id: kubeflow-pipelines\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.101 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf we check again, we can see that the External-IP value is `192.168.35.101`.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\n```bash\nNAME            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE\nminio-service   LoadBalancer   10.109.209.87   192.168.35.101   9000:31371/TCP   5h21m\n```\n\nOpen a web browser and connect to [http://192.168.35.101:9000](http://192.168.35.101:9000) to confirm that the following screen is displayed.
\n\n![login-after-minio-setting.png](./img/login-after-minio-setting.png)\n\n### mlflow Dashboard\n\nTo receive the load-balancing feature from MetalLB, we will change the type of the mlflow-server-service service in the mlflow-system namespace, which serves the mlflow Dashboard, to LoadBalancer. First, check the current status.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\nThe type of this service is ClusterIP and you can confirm that the External-IP value is `none`.\n\n```bash\nNAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE\nmlflow-server-service   ClusterIP   10.111.173.209   <none>        5000/TCP   4m50s\n```\n\nChange the type to LoadBalancer and, if you want to specify a desired IP address, add the loadBalancerIP item.  \nIf you do not add it, the IP address will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/mlflow-server-service -n mlflow-system\n```\n\n```yaml\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: mlflow-server\n    meta.helm.sh/release-namespace: mlflow-system\n  creationTimestamp: \"2022-01-07T04:00:19Z\"\n  labels:\n    app.kubernetes.io/managed-by: Helm\n  name: mlflow-server-service\n  namespace: mlflow-system\n  resourceVersion: \"276246\"\n  uid: e5d39fb7-ad98-47e7-b512-f9c673055356\nspec:\n  clusterIP: 10.111.173.209\n  clusterIPs:\n  - 10.111.173.209\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - port: 5000\n    protocol: TCP\n    targetPort: 5000\n  selector:\n    app.kubernetes.io/name: mlflow-server\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.102 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf we check again, we can see that the External-IP value is `192.168.35.102`.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\nNAME                    TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE\nmlflow-server-service   LoadBalancer   10.111.173.209   192.168.35.102   5000:32287/TCP   6m11s\n```\n\nOpen a web browser and connect to [http://192.168.35.102:5000](http://192.168.35.102:5000) to confirm that the following screen is displayed.\n\n![login-after-mlflow-setting.png](./img/login-after-mlflow-setting.png)\n\n### Grafana Dashboard\n\nTo receive the load-balancing feature from MetalLB, we will change the type of the seldon-core-analytics-grafana service in the seldon-system namespace, which serves Grafana's Dashboard, to LoadBalancer. First, check the current status.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\nThe type of this service is ClusterIP, and you can see that the External-IP value is `none`.\n\n```bash\nNAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE\nseldon-core-analytics-grafana   ClusterIP   10.109.20.161   <none>        80/TCP    94s\n```\n\nChange the type to LoadBalancer and, if you want to specify a desired IP address, add the loadBalancerIP item.
\nIf you do not add it, an IP address will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```yaml\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: seldon-core-analytics\n    meta.helm.sh/release-namespace: seldon-system\n  creationTimestamp: \"2022-01-07T04:16:47Z\"\n  labels:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/managed-by: Helm\n    app.kubernetes.io/name: grafana\n    app.kubernetes.io/version: 7.0.3\n    helm.sh/chart: grafana-5.1.4\n  name: seldon-core-analytics-grafana\n  namespace: seldon-system\n  resourceVersion: \"280605\"\n  uid: 75073b78-92ec-472c-b0d5-240038ea8fa5\nspec:\n  clusterIP: 10.109.20.161\n  clusterIPs:\n  - 10.109.20.161\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: service\n    port: 80\n    protocol: TCP\n    targetPort: 3000\n  selector:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/name: grafana\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.103 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf you check again, you can see that the External-IP value is `192.168.35.103`.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\nNAME                            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE\nseldon-core-analytics-grafana   LoadBalancer   10.109.20.161   192.168.35.103   80:31191/TCP   5m14s\n```\n\nOpen a web browser and connect to [http://192.168.35.103:80](http://192.168.35.103:80) to confirm that the following screen is displayed.\n\n![login-after-grafana-setting.png](./img/login-after-grafana-setting.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/appendix/pyenv.md",
    "content": "---\ntitle: \"1. Install Python virtual environment\"\nsidebar_position: 1\n---\n\n## Python virtual environment\n\nWhen working with Python, there may be cases where you want to use multiple versions of Python environments or manage package versions separately for different projects.\n\nTo easily manage Python environments or Python package environments in a virtualized manner, there are tools available such as pyenv, conda, virtualenv, and venv.\n\nAmong these, *MLOps for ALL* covers the installation of [pyenv](https://github.com/pyenv/pyenv) and [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv).  \npyenv helps manage Python versions, while pyenv-virtualenv is a plugin for pyenv that helps manage Python package environments.\n\n## Installing pyenv\n\n### Prerequisites\n\nPrerequisites vary depending on the operating system. Please refer to the [following page](https://github.com/pyenv/pyenv/wiki#suggested-build-environment) and install the required packages accordingly.\n\n### Installation - macOS\n\n1. Install pyenv, pyenv-virtualenv\n\n```bash\nbrew update\nbrew install pyenv\nbrew install pyenv-virtualenv\n```\n\n2. 
Set pyenv\n\nOn macOS, zsh has been the default shell since Catalina, so we assume zsh when setting up pyenv.\n\n```bash\necho 'eval \"$(pyenv init -)\"' >> ~/.zshrc\necho 'eval \"$(pyenv virtualenv-init -)\"' >> ~/.zshrc\nsource ~/.zshrc\n```\n\nCheck if the pyenv command is executed properly.\n\n```bash\npyenv --help\n```\n\n```bash\n$ pyenv --help\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n### Installation - Ubuntu\n\n1. Install pyenv and pyenv-virtualenv\n\n```bash\ncurl https://pyenv.run | bash\n```\n\nIf the following content is output, it means that the installation is successful.\n\n```bash\n  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n  0     0    0     0    0     0      0      0 --:--:-- --:--:--   0     0    0     0    0     0      0      0 --:--:-- --:--:-- 100   270  100   270    0     0    239      0  0:00:01  0:00:01 --:--:--   239\nCloning into '/home/mlops/.pyenv'...\n...\nSkip...\n...\nremote: Enumerating objects: 10, done.\nremote: Counting objects: 100% (10/10), done.\nremote: Compressing objects: 100% (6/6), done.\nremote: Total 10 (delta 1), reused 6 (delta 0), pack-reused 0\nUnpacking objects: 100% (10/10), 2.92 KiB | 2.92 MiB/s, done.\n\nWARNING: seems you still have not added 'pyenv' to the load path.\n\n\n# See the README for instructions on how to set up\n# your shell environment for Pyenv.\n\n# Load pyenv-virtualenv automatically by adding\n# the following to ~/.bashrc:\n\neval \"$(pyenv virtualenv-init -)\"\n\n```\n\n2. Set pyenv\n\nAssuming bash is the default shell, configure pyenv and pyenv-virtualenv to be used in bash.\n\n```bash\nvi ~/.bashrc\n```\n\nAdd the following lines and save the file.\n\n```bash\nexport PATH=\"$HOME/.pyenv/bin:$PATH\"\neval \"$(pyenv init -)\"\neval \"$(pyenv virtualenv-init -)\"\n```\n\nRestart the shell.\n\n```bash\nexec $SHELL\n```\n\nCheck if the pyenv command is executed properly.\n\n```bash\npyenv --help\n```\n\nIf the following message is displayed, it means that the settings have been configured correctly.\n\n```bash\n$ pyenv\npyenv 2.2.2\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   doctor      Verify pyenv installation and development tools to build pythons.\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n## Using pyenv\n\n### Install python version\n\nUsing the `pyenv install <Python-Version>` command, you can install the desired Python version.  \nOn this page, as an example, we will install Python 3.7.12, the version used by Kubeflow by default.\n\n```bash\npyenv install 3.7.12\n```\n\nIf installed normally, the following message will be printed.\n\n```bash\n$ pyenv install 3.7.12\nDownloading Python-3.7.12.tar.xz...\n-> https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tar.xz\nInstalling Python-3.7.12...\npatching file Doc/library/ctypes.rst\npatching file Lib/test/test_unicode.py\npatching file Modules/_ctypes/_ctypes.c\npatching file Modules/_ctypes/callproc.c\npatching file Modules/_ctypes/ctypes.h\npatching file setup.py\npatching file 'Misc/NEWS.d/next/Core and Builtins/2020-06-30-04-44-29.bpo-41100.PJwA6F.rst'\npatching file Modules/_decimal/libmpdec/mpdecimal.h\nInstalled Python-3.7.12 to /home/mlops/.pyenv/versions/3.7.12\n```\n\n### Create python virtual environment\n\nUse the `pyenv virtualenv <Installed-Python-Version> <Virtual-Environment-Name>` command to create a Python virtual environment with the desired Python version.\n\nFor example, let's create a Python virtual environment called `demo` with Python 3.7.12.\n\n```bash\npyenv virtualenv 3.7.12 demo\n```\n\n```bash\n$ pyenv virtualenv 3.7.12 demo\nLooking in links: /tmp/tmpffqys0gv\nRequirement already satisfied: setuptools in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (47.1.0)\nRequirement already satisfied: pip in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (20.1.1)\n```\n\n### Activating python virtual environment\n\nUse the `pyenv activate <environment name>` command to activate the virtual environment you created.\n\nFor example, let's activate the Python virtual environment called `demo`.\n\n```bash\npyenv activate demo\n```\n\nYou can see the name of the active virtual environment printed at the front of the shell prompt.\n\n  Before\n\n  ```bash\n  mlops@ubuntu:~$ pyenv activate demo\n  ```\n\n  After\n\n  ```bash\n  pyenv-virtualenv: prompt changing will be removed from future release. configure `export PYENV_VIRTUALENV_DISABLE_PROMPT=1' to simulate the behavior.\n  (demo) mlops@ubuntu:~$ \n  ```\n\n### Deactivating python virtual environment\n\nYou can deactivate the currently active virtual environment with the `source deactivate` command.\n\n```bash\nsource deactivate\n```\n\n  Before\n\n  ```bash\n  (demo) mlops@ubuntu:~$ source deactivate\n  ```\n\n  After\n\n  ```bash\n  mlops@ubuntu:~$ \n  ```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/further-readings/_category_.json",
    "content": "{\n  \"label\": \"Further Readings\",\n  \"position\": 8,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/further-readings/info.md",
    "content": "---\ntitle: \"Further Readings\"\ndate: 2021-12-21\nlastmod: 2021-12-21\n---\n\n## MLOps Component\n\nThe following diagram illustrates the components covered in [MLOps Concepts](../introduction/component.md).\n\n![open-stacks-0.png](./img/open-stacks-0.png)\n\nThe technology stacks covered in *MLOps for ALL* are as follows.\n\n![open-stacks-1.png](./img/open-stacks-1.png)\n\n| Mgmt.                      | Component                   | Open Source                           |\n| -------------------------- | --------------------------- | ------------------------------------- |\n|                            | Storage                     | [Minio](https://min.io/)              |\n|                            | Data Processing             | [Apache Spark](https://spark.apache.org/) |\n|                            | Data Visualization          | [Tableau](https://www.tableau.com/)   |\n| Workflow Mgmt.             | Orchestration               | [Airflow](https://airflow.apache.org/) |\n|                            | Scheduling                  | [Kubernetes](https://kubernetes.io/)  |\n| Security & Compliance      | Authentication & Authorization | [Ldap](https://www.openldap.org/)  |\n|                            | Data Encryption & Tokenization | [Vault](https://www.vaultproject.io/) |\n|                            | Governance & Auditing       | [Open Policy Agent](https://www.openpolicyagent.org/) |\n\nAs you can see, there are still many MLOps components that we have not covered yet. We could not cover them all due to time constraints, but if you need them, the following open source projects are a good place to start.\n\n![open-stacks-2.png](./img/open-stacks-2.png)\n\nFor details:\n\n| Mgmt.                      | Component                   | Open Source                           |\n| -------------------------- | --------------------------- | ------------------------------------- |\n| Data Mgmt.                 | Collection                  | [Kafka](https://kafka.apache.org/)    |\n|                            | Validation                  | [Beam](https://beam.apache.org/)      |\n|                            | Feature Store               | [Flink](https://flink.apache.org/)    |\n| ML Model Dev. & Experiment | Modeling                    | [Jupyter](https://jupyter.org/)       |\n|                            | Analysis & Experiment Mgmt. | [MLflow](https://mlflow.org/)         |\n|                            | HPO Tuning & AutoML         | [Katib](https://github.com/kubeflow/katib) |\n| Deploy Mgmt.               | Serving Framework           | [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/index.html) |\n|                            | A/B Test                    | [Iter8](https://iter8.tools/)         |\n|                            | Monitoring                  | [Grafana](https://grafana.com/oss/grafana/), [Prometheus](https://prometheus.io/) |\n| Process Mgmt.              | Pipeline                    | [Kubeflow](https://www.kubeflow.org/) |\n|                            | CI/CD                       | [Github Action](https://docs.github.com/en/actions) |\n|                            | Continuous Training         | [Argo Events](https://argoproj.github.io/events/) |\n| Platform Mgmt.             | Configuration Mgmt.         | [Consul](https://www.consul.io/)      |\n|                            | Code Version Mgmt.          | [Github](https://github.com/), [Minio](https://min.io/) |\n|                            | Logging                     | (EFK) [Elastic Search](https://www.elastic.co/kr/elasticsearch/), [Fluentd](https://www.fluentd.org/), [Kibana](https://www.elastic.co/kr/kibana/) |\n|                            | Resource Mgmt.              | [Kubernetes](https://kubernetes.io/)  |\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/introduction/_category_.json",
    "content": "{\n  \"label\": \"Introduction\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/introduction/component.md",
    "content": "---\ntitle : \"3. Components of MLOps\"\ndescription: \"Describe MLOps Components\"\nsidebar_position: 3\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## Practitioners guide to MLOps\n\nGoogle's white paper [Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning] published in May 2021 mentions the following core functionalities of MLOps: \n\n![mlops-component](./img/mlops-component.png)\n\nLet's look at what each feature does.\n\n### 1. Experimentation\n\nExperimentation provides machine learning engineers with the following capabilities for data analysis, prototyping model development, and implementing training functionality:\n\n- Integration with version control tools like Git and a notebook (Jupyter Notebook) environment\n- Experiment tracking capabilities including data used, hyperparameters, and evaluation metrics\n- Data and model analysis and visualization capabilities\n\n### 2. Data Processing\n\nData Processing enables working with large volumes of data during the stages of model development, continuous training, and API deployment by providing the following functionalities:\n\n- Data connectors compatible with various data sources and services\n- Data encoders and decoders compatible with different data formats\n- Data transformation and feature engineering capabilities for different data types\n- Scalable batch and streaming data processing capabilities for training and serving\n\n### 3. Model Training\n\nModel Training offers functionalities to efficiently execute algorithms for model training:\n\n- Environment provisioning for ML framework execution\n- Distributed training environment for multiple GPUs and distributed training\n- Hyperparameter tuning and optimization capabilities\n\n### 4. 
Model Evaluation\n\nModel evaluation provides the following capabilities to observe the performance of models in both experimental and production environments:\n\n- Model performance evaluation on evaluation datasets\n- Tracking prediction performance across different continuous training runs\n- Comparison and visualization of performance between different models\n- Model output interpretation using interpretable AI techniques\n\n### 5. Model Serving\n\nModel serving offers functionalities to deploy and serve models in production environments:\n\n- Low-latency and high-availability inference capabilities\n- Support for various ML model serving frameworks (TensorFlow Serving, TorchServe, NVIDIA Triton, Scikit-learn, XGBoost, etc.)\n- Advanced inference routines, such as preprocessing or postprocessing, and multi-model ensembling for final results\n- Autoscaling capabilities to handle spiking inference requests\n- Logging of inference requests and results\n\n### 6. Online Experimentation\n\nOnline experimentation provides capabilities to validate the performance of newly generated models when deployed. This functionality should be integrated with a Model Registry to coordinate the deployment of new models.\n\n- Canary and shadow deployment features\n- A/B testing capabilities\n- Multi-armed bandit testing functionality\n\n### 7. Model Monitoring\n\nModel monitoring enables the monitoring of deployed models in production environments to ensure proper functioning and provides information on model performance degradation and the need for updates.\n\n### 8. 
ML Pipeline\n\nML Pipeline offers the following functionalities to configure, control, and automate complex ML training and inference workflows in production environments:\n\n- Pipeline execution through various event sources\n- ML metadata tracking and integration for pipeline parameter and artifact management\n- Support for built-in components for common ML tasks and user-defined components\n- Provisioning of different execution environments\n\n### 9. Model Registry\n\nThe Model Registry provides the capability to manage the lifecycle of machine learning models in a centralized repository.\n\n- Registration, tracking, and versioning of trained and deployed models\n- Storage of information about the required data and runtime packages for deployment\n\n### 10. Dataset and Feature Repository\n\n- Sharing, search, reuse, and versioning capabilities for datasets\n- Real-time processing and low-latency serving capabilities for event streaming and online inference tasks\n- Support for various types of data, such as images, text, and tabular data\n\n### 11. ML Metadata and Artifact Tracking\n\nIn each stage of MLOps, various artifacts are generated. ML metadata refers to the information about these artifacts. ML metadata and artifact management provide the following functionalities to manage the location, type, attributes, and associations with experiments:\n\n- History management for ML artifacts\n- Tracking and sharing of experiments and pipeline parameter configurations\n- Storage, access, visualization, and download capabilities for ML artifacts\n- Integration with other MLOps functionalities"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/introduction/intro.md",
    "content": "---\ntitle : \"1. What is MLOps?\"\ndescription: \"Introduction to MLOps\"\nsidebar_position: 1\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Machine Learning Project\n\nSince 2012, when AlexNet was introduced, Machine Learning and Deep Learning have been applied in virtually every domain where data exists, such as Computer Vision and Natural Language Processing. Deep Learning and Machine Learning were referred to collectively as AI, and many media outlets proclaimed the need for AI. Many companies went on to conduct numerous projects using Machine Learning and Deep Learning. But what was the result? Byungchan Eum, the Head of North East Asia at Element AI, said “If 10 companies start an AI project, 9 of them will only be able to do a proof of concept (POC)”.\n\nIn this way, many projects merely showed the possibility that Machine Learning and Deep Learning could solve a problem, and then disappeared. Around this time, the outlook that [AI Winter was coming again](https://www.aifutures.org/2021/ai-winter-is-coming/) also began to emerge.\n\nWhy did most projects end at the proof of concept (POC) stage? Because an actual service cannot be operated with Machine Learning and Deep Learning code alone.\n\nAt the actual service stage, the portion taken up by machine learning and deep learning code is not as large as one would think, so one must consider many other aspects besides simply the performance of the model. Google pointed out this problem in its 2015 paper [Hidden Technical Debt in Machine Learning Systems](https://proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf). However, at the time this paper was released, many ML engineers were busy proving the potential of deep learning and machine learning, so the points made in the paper were not given much attention.
\n\nAnd after a few years, machine learning and deep learning had proven their potential and people were now looking to apply it to actual services. However, soon many people realized that actual services were not as easy as they thought.\n\n## Devops\n\nMLOps is not a new concept, but rather a term derived from the development methodology called DevOps. Therefore, understanding DevOps can help in understanding MLOps.\n\n### DevOps\n\nDevOps is a portmanteau of \"Development\" and \"Operations,\" referring to a development and operations methodology that emphasizes communication, collaboration, and integration between software developers and IT professionals. It encompasses both the development and operation phases of software, aiming to achieve a symbiotic relationship between the two. The primary goal of DevOps is to enable organizations to develop and deploy software products and services rapidly by fostering close collaboration and interdependence between development and operations teams.\n\n### Silo Effect\nLet's explore why DevOps is necessary through a simple scenario.\n\nIn the early stages of a service, there are fewer supported features, and the team or company is relatively small. At this point, there may not be a clear distinction between development and operations, or the teams may be small. The key point here is the small scale. In such cases, there are many points of contact for effective communication, and with a limited number of services to focus on, it is possible to rapidly improve the service.\n\nHowever, as the service scales up, the development and operations teams tend to separate, and the physical limitations of communication channels become apparent. For example, in meetings involving multiple teams, only team leaders or a small number of seniors may attend, rather than the entire team. These limitations in communication channels inevitably lead to a lack of communication. 
Consequently, the development team continues to develop new features, while the operations team struggles with deployment issues caused by those very features.\n\nWhen such situations are repeated, organizational silos can form, a phenomenon known as silo mentality.\n\n![silo](./img/silo.png)\n\n> Indeed, the term \"silo\" originally refers to a tall, cylindrical structure used for storing grain or livestock feed. Silos are designed to keep the stored materials separate and prevent them from mixing. \n> In the context of organizations, the \"silo effect\" or \"organizational silos effect\" refers to a phenomenon where departments or teams operate independently and prioritize their own interests without effective collaboration. It reflects a mentality where individual departments focus on building their own \"silos\" and solely pursue their own interests.\n\nThe silo effect can lead to a decline in service quality and hinder organizational performance. DevOps emerged as a solution to this issue. DevOps emphasizes collaboration, communication, and integration between development and operations teams, breaking down barriers and fostering a culture of shared responsibility. By promoting cross-functional teamwork and streamlining processes, DevOps aims to overcome silos and improve the efficiency and effectiveness of software development and operations.\n\n### CI/CD\n\nContinuous Integration (CI) and Continuous Delivery (CD) are concrete methods for breaking down the barriers between development and operations teams.\n\n![cicd](./img/cicd.png)\n\nThrough these practices, the development team can understand the operational environment and check whether the features being developed can be deployed seamlessly. The operations team can deploy validated features or improved products more frequently, improving the customer experience. 
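\n\nAs a minimal, purely illustrative sketch, CI boils down to running automated checks on every proposed change before it can reach production (the check names below are hypothetical, not taken from any specific tool):\n\n```python\n# Hedged illustration of CI: every change must pass automated checks\n# before it is allowed to be deployed.\ndef run_ci_checks(change: dict) -> bool:\n    checks = [\n        lambda c: c[\"tests_passed\"],    # the test suite succeeds\n        lambda c: c[\"builds_cleanly\"],  # a deployable artifact can be built\n    ]\n    return all(check(change) for check in checks)\n\n\nprint(run_ci_checks({\"tests_passed\": True, \"builds_cleanly\": True}))   # True\nprint(run_ci_checks({\"tests_passed\": False, \"builds_cleanly\": True}))  # False\n```\n\n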
In summary, DevOps is a methodology to solve the problem between development teams and operations teams.\n\n## MLOps\n\n### 1) ML + Ops\n\nDevOps is a methodology that addresses the challenges between development and operations teams, promoting collaboration and effective communication. By applying DevOps principles, development teams gain a better understanding of the operational environment, and the developed features can be seamlessly integrated and deployed. On the other hand, operations teams can deploy validated features or improved products more frequently, enhancing the overall customer experience.\n\nMLOps, which stands for Machine Learning Operations, extends the DevOps principles and practices specifically to the field of machine learning. In MLOps, the \"Dev\" in DevOps is replaced with \"ML\" to emphasize the unique challenges and considerations related to machine learning.\n\nMLOps aims to address the issues that arise between machine learning teams and operations teams. To understand these issues, let's consider an example using a recommendation system.\n\n#### Rule-Based Approach\n\nIn the initial stages of building a recommendation system, a simple rule-based approach may be used. For example, items could be recommended based on the highest sales volume in the past week. With this approach, there is no need for model updates unless there are specific reasons for modification.\n\n#### Machine Learning Approach\n\nAs the scale of the service grows and more log data accumulates, machine learning models can be developed based on item-based or user-based recommendations. In this case, the models are periodically retrained and redeployed.\n\n#### Deep Learning Approach\n\nWhen there is a greater demand for personalized recommendations and a need for models that deliver higher performance, deep learning models are developed. 
Similar to machine learning, these models are periodically retrained and redeployed.\n\nFrom these examples, it becomes evident that challenges can arise between the machine learning team and the operations team. MLOps aims to address these challenges, providing a methodology and set of practices that make developing, deploying, and operating machine learning models collaborative and efficient.\n\n![graph](./img/graph.png)\n\nIf we plot the concepts explained earlier on a graph, with model complexity on the x-axis and model performance on the y-axis, we observe an upward trend: model performance improves as complexity increases. This often leads to the emergence of separate machine learning teams specializing in the transition from traditional machine learning to deep learning.\n\nIf there are only a few models to manage, collaboration between teams can be sufficient to address the challenges. However, as the number of models to develop increases, silos like those seen in software development can emerge.\n\nConsidering the goals of DevOps, we can understand the goal of MLOps as ensuring that the developed models can be deployed successfully. While DevOps focuses on verifying that the features developed by the development team can be deployed correctly, MLOps focuses on verifying that the models developed by the machine learning team can be deployed effectively.\n\n### 2) ML -> Ops\n\nHowever, recent MLOps products and explanations show that the goals are not limited to what was described above. In some cases, the goal is to enable the machine learning team to directly operate and manage the models they develop. This need arises from how machine learning projects actually proceed.\n\nIn the case of recommendation systems, it was possible to start with simple models in operations. 
However, in domains such as natural language processing and image analysis, it is common to first run a proof of concept (PoC) to determine whether deep learning models can solve the given tasks. Once the verification is complete, the focus shifts to building the operational environment for serving the models. However, it may not be easy for the machine learning team to handle this challenge with its internal capabilities alone. This is where MLOps becomes necessary.\n\n### 3) Conclusion\n\nIn summary, MLOps has two main goals. The first explanation of MLOps focused on ML + Ops, aiming to enhance productivity and collaboration between the two teams. The second focused on ML -> Ops, aiming to enable the machine learning team to directly operate and manage the models it develops.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/introduction/levels.md",
    "content": "---\ntitle : \"2. Levels of MLOps\"\ndescription: \"Levels of MLOps\"\nsidebar_position: 2\ndate: 2021-12-03\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\", \"Chanmin Cho\"]\n\n---\n\nThis page will look at the steps of MLOps outlined by Google and explore what the core features of MLOps are.\n\n## Hidden Technical Debt in ML System\n\nGoogle has been talking about the need for MLOps since as far back as 2015.  The paper Hidden Technical Debt in Machine Learning Systems encapsulates this idea from Google.  \n\n![paper](./img/paper.png)\n\nThe key takeaway from this paper is that the machine learning code is only a small part of the entire system when it comes to building products with machine learning.\n\n\nGoogle developed MLOps by evolving this paper and expanding the term. More details can be found on the [Google Cloud homepage](https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning). In this post, we will try to explain what Google means by MLOps.\n\nGoogle divided the evolution of MLOps into three (0-2) stages. Before explaining each stage, let's review some of the concepts described in the previous post.\n\nIn order to operate a machine learning model, there is a machine learning team responsible for developing the model and an operations team responsible for deployment and operations. MLOps is needed for the successful collaboration of these two teams. We have previously said that it can be done simply through Continuous Integration (CI) / Continuous Deployment (CD), so let us see how to do CI / CD.\n\n## Level 0: Manual Process\n![level-0](./img/level-0.png)\n\nAt the 0th stage, two teams communicate through a \"model\". The machine learning team trains the model with accumulated data and delivers the trained model to the operation team. 
The operation team then deploys the model delivered in this way.\n\n![toon](./img/toon.png)\n\nInitial machine learning models are deployed through this \"model\"-centered communication. However, this deployment method has several problems. For example, if some functions use Python 3.7 and some use Python 3.8, we often see situations like the one in the cartoon above.\n\nThe reason for this lies in the characteristics of a machine learning model. Three things are needed for a trained machine learning model to work:\n\n1. Python code\n2. Trained weights\n3. Environment (packages, versions)\n\nIf any of these three is communicated incorrectly, the model may fail to function or may make unexpected predictions. In many cases, models fail to work due to environmental mismatches. Machine learning relies on various open-source libraries, and due to the nature of open source, even the same function can produce different results depending on the version used.\n\nIn the early stages of a service, when there are not many models to manage, these issues can be resolved quickly. However, as the number of models to manage increases and communication becomes more challenging, it becomes difficult to deploy better-performing models quickly.\n\n## Level 1: Automated ML Pipeline\n### Pipeline\n\n![level-1-pipeline](./img/level-1-pipeline.png)\n\nIn MLOps, a \"pipeline\" is used to prevent such problems. The MLOps pipeline uses containers, such as Docker, to ensure that the model operates in the same environment the machine learning engineer used during development. This prevents situations where the model doesn't work because of environmental differences.\n\nHowever, the term \"pipeline\" is used broadly across many kinds of tasks. What is the role of the pipeline that machine learning engineers create? The pipeline created by machine learning engineers produces trained models. 
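\n\nFor illustration, such a training pipeline can be sketched as a single function that loads data, trains a model, and writes the trained model to disk (a hedged sketch using scikit-learn and the standard library; the function name and path are illustrative, not from any specific MLOps tool):\n\n```python\nimport pickle\n\nfrom sklearn.datasets import load_iris\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.svm import SVC\n\n\ndef training_pipeline(model_path: str) -> float:\n    # 1. Load and split the data\n    data, target = load_iris(return_X_y=True)\n    x_train, x_test, y_train, y_test = train_test_split(data, target, random_state=0)\n\n    # 2. Train the model\n    clf = SVC(kernel=\"rbf\")\n    clf.fit(x_train, y_train)\n\n    # 3. Persist the trained model -- the pipeline's real product\n    with open(model_path, mode=\"wb\") as file_writer:\n        pickle.dump(clf, file_writer)\n\n    return clf.score(x_test, y_test)\n```\n\n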
Therefore, it would be more accurate to refer to it as a training pipeline rather than just a pipeline.\n\n### Continuous Training\n\n![level-1-ct](./img/level-1-ct.png)\n\nOn top of this, the concept of Continuous Training (CT) is added. So why is CT necessary?\n\n#### Auto Retrain\n\nIn the real world, data exhibits a characteristic called \"data shift,\" where the data distribution keeps changing over time. As a result, models trained in the past may degrade in performance over time. The simplest and most effective solution is to retrain the model on recent data: by retraining according to the changed data distribution, the model can regain its performance.\n\n#### Auto Deploy\n\nHowever, in industries such as manufacturing, where multiple recipes are processed in a single factory, it may not always be desirable to retrain unconditionally. One common example is the blind spot.\n\nFor example, suppose an ML model was trained to make predictions for car model A on an automotive production line. If an entirely different car model B is introduced, its data patterns are unseen, so a new ML model is trained for car model B.\n\nNow the system makes predictions for car model B. However, what should be done if the line switches back to car model A? \nIf there is only a retraining rule, a new model for car model A will be trained again. However, machine learning models require a sufficient amount of data to reach satisfactory performance. The term \"blind spot\" refers to the period during which no working model is available while enough data is being gathered.\n\nThere is a simple solution to this blind spot: check whether a previous model for car model A exists and, if so, use it for prediction instead of immediately training a new one. Switching models automatically in this way, based on the metadata associated with each model, is known as Auto Deploy.\n\nTo summarize, for Continuous Training (CT), both Auto Retrain and Auto Deploy are necessary. 
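\n\nThe switching logic above can be sketched in a few lines (a hedged illustration; `registry` is a hypothetical metadata store mapping each car model to its trained ML model, and `train_new_model` stands in for an actual training job):\n\n```python\ndef train_new_model(recipe: str) -> str:\n    # Stand-in for an actual training job; returns a model identifier.\n    return f\"model-for-{recipe}\"\n\n\ndef on_recipe_switch(recipe: str, registry: dict) -> str:\n    if recipe in registry:\n        # Auto Deploy: a model for this recipe already exists, so switch\n        # to it instead of retraining -- avoiding the blind spot.\n        return registry[recipe]\n    # Auto Retrain: no previous model exists, so train a new one.\n    registry[recipe] = train_new_model(recipe)\n    return registry[recipe]\n```\n\nOn the first switch to car model B a new model is trained; when production returns to car model A, the previously trained model is redeployed immediately.\n\n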
They complement each other's weaknesses and enable the model's performance to be maintained continuously.\n\n### Model Serving\n\n![level-1-modelserving](./img/level-1-modelserving.png)\n\nMachine learning pipelines in production continuously deploy the latest models, trained on new data, to your prediction service. This process involves automatically deploying trained and validated models to online prediction services.\n\n## Level 2: Automating the CI/CD Pipeline\n\n![level-2](./img/level-2.png)\n\nLevel 2 is about automating CI and CD. In DevOps, the focus of CI/CD is on source code. So what is the focus of CI/CD in MLOps?\n\nIn MLOps, the focus of CI/CD is also on source code, but more specifically, it is the training pipeline.\n\nTherefore, it is important to verify whether the model is trained correctly (CI) and whether the trained model functions properly (CD) whenever there are changes that can impact the training process. Hence, CI/CD should be performed when there are direct modifications to the code used for training.\n\nIn addition to code, changes in the versions of the packages used and in the Python version are also subject to CI/CD. Machine learning usually relies on open-source packages, and the internal logic of their functions can change between versions. Although notices may accompany certain version updates, significant changes can still go unnoticed. Therefore, when the versions of the packages used change, it is important to perform CI/CD to ensure that the model is trained and functions correctly.\n\nIn summary, in MLOps, CI/CD focuses on the source code, particularly the training pipeline, to verify that the model is trained correctly and functions properly. 
This includes checking for direct code modifications and changes in package versions or Python versions to ensure the integrity of the training and functioning processes of the model.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/introduction/why_kubernetes.md",
    "content": "---\ntitle : \"4. Why Kubernetes?\"\ndescription: \"Reason for using k8s in MLOps\"\nsidebar_position: 4\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## MLOps & Kubernetes\n\nWhen talking about MLOps, why is the word Kubernetes always heard together?\n\nTo build a successful MLOps system, various components are needed as described in [Components of MLOps](../introduction/component.md), but to operate them organically at the infrastructure level, there are many issues to be solved. For example, simply running a large number of machine learning model requests in order, ensuring the same execution environment in other workspaces, and responding quickly when a deployed service has a failure.\n\nThe need for containers and container orchestration systems appears here. With the introduction of container orchestration systems such as Kubernetes, efficient isolation and management of execution environments can be achieved. By introducing a container orchestration system, it is possible to prevent situations such as *'Is anyone using cluster 1?', 'Who killed my process that was using GPU?', 'Who updated the x package on the cluster?* when developing and deploying machine learning models while a few developers share a small number of clusters.\n\n## Container\n\nMicrosoft defines a container as follows: What is a container then? In Microsoft, a container is defined as [follows](https://azure.microsoft.com/en-us/overview/what-is-a-container/).\n\n> Container: Standardized, portable packaging of an application's code, libraries, and configuration files\n\nBut why is a container needed for machine learning? Machine learning models can behave differently depending on the operating system, Python execution environment, package version, etc. To prevent this, the technology used to share and execute the entire dependent execution environment with the source code used in machine learning is called containerization technology. 
This packaged form is called a container image, and by sharing the container image, users can ensure the same execution results on any system. In other words, by sharing not just a Jupyter Notebook file or the model's source code and requirements.txt, but the entire container image including the execution environment, you can avoid situations like *\"It works on my notebook, why not on yours?\"*.\n\nOne common misunderstanding among people new to containers is to assume that \"container == Docker\". Docker is not synonymous with containers; rather, it is a tool that makes containers easier and more flexible to use, providing features such as launching containers and creating and sharing container images. In summary, a container is a virtualization technology, and Docker is one implementation of that technology.\n\nThat said, Docker quickly became mainstream among container virtualization tools thanks to its ease of use and efficiency, so when people think of containers, they often think of Docker automatically. There are various reasons the container and Docker ecosystem became mainstream, but since the technical details are outside the scope of MLOps for ALL, we will not go into them here.\n\n## Container Orchestration System\n\nThen what is a container orchestration system? As the word \"orchestration\" suggests, it can be compared to a conductor coordinating numerous containers so that they work together harmoniously.\n\nIn container-based systems, services are provided to users in the form of containers. If the number of containers to be managed is small, a single operator can sufficiently handle all situations. 
However, if there are hundreds of containers running in dozens of clusters and they need to function continuously without causing any failures, it becomes nearly impossible for a single operator to monitor the proper functioning of all services and respond to issues.\n\nFor example, continuous monitoring is required to ensure that all services are functioning properly. If a specific service experiences a failure, the operator needs to investigate the problem by examining the logs of multiple containers. Additionally, they need to handle various tasks such as scheduling and load balancing to prevent work overload on specific clusters or containers, as well as scaling operations.\n\nA container orchestration system is software that provides functionality to manage and operate the states of numerous containers continuously and automatically, making the process of managing and operating a large number of containers somewhat easier.\n\n\nHow can it be used in machine learning? For example, a container that packages deep learning training code that requires a GPU can be executed on a cluster with available GPUs. A container that packages data preprocessing code requiring a large amount of memory can be executed on a cluster with ample memory. If there is an issue with the cluster during training, the system can automatically move the same container to a different cluster and continue the training, eliminating the need for manual intervention. Developing such a system that automates management without requiring manual intervention is the goal.\n\nAs of the writing of this text in 2022, Kubernetes is considered the de facto standard for container orchestration systems.\n\nAccording to the [survey](https://www.cncf.io/blog/2018/08/29/cncf-survey-use-of-cloud-native-technologies-in-production-has-grown-over-200-percent/) released by CNCF in 2018, Kubernetes was already showing its prominence. 
The [survey](https://www.cncf.io/wp-content/uploads/2020/08/CNCF_Survey_Report.pdf) published in 2019 indicates that 78% of respondents were using Kubernetes at a production level.\n\n![k8s-graph](./img/k8s-graph.png)\n\nThe growth of the Kubernetes ecosystem can be attributed to various reasons. However, similar to Docker, Kubernetes is not exclusively limited to machine learning-based services. Since delving into detailed technical content would require a substantial amount of discussion, this edition of \"MLOps for ALL\" will omit the detailed explanation of Kubernetes.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow\",\n  \"position\": 6,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/advanced-component.md",
    "content": "---\ntitle : \"8. Component - InputPath/OutputPath\"\ndescription: \"\"\nsidebar_position: 8\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## Complex Outputs\n\nOn this page, we will write the code example from [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents) as a component.\n\n## Component Contents\n\nBelow is the component content used in [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents).\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target = pd.read_csv(train_target_path)\n\nclf = SVC(kernel=kernel)\nclf.fit(train_data, train_target)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n    dill.dump(clf, file_writer)\n```\n\n## Component Wrapper\n\n### Define a standalone Python function\n\nWith the necessary Configs for the Component Wrapper, it will look like this.\n\n```python\ndef train_from_csv(\n    train_data_path: str,\n    train_target_path: str,\n    model_path: str,\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\nIn the [Basic Usage Component]](../kubeflow/basic-component), we explained that you should provide type hints for input and output when describing. But what about complex objects such as dataframes, models, that cannot be used in json?\n\nWhen passing values between functions in Python, objects can be returned and their value will be stored in the host's memory, so the same object can be used in the next function. 
However, in Kubeflow, components run independently, each in its own container; they do not share memory, so you cannot pass objects the way you would between ordinary Python functions. The only information that can be passed between components is in `json` format. Therefore, objects of types that cannot be converted to JSON, such as models or DataFrames, must be passed in some other way.\n\nKubeflow solves this by storing the data in a file instead of in memory, and then using that file to pass information. Since the path of the stored file is a string, it can be passed between components. However, the user does not know the file's path before execution. For this, Kubeflow provides a bit of magic for input and output paths: `InputPath` and `OutputPath`.\n\n`InputPath` literally means the input path, and `OutputPath` literally means the output path.\n\nFor example, a component that generates and returns data takes `data_path: OutputPath()` as an argument, and a component that receives the data takes `data_path: InputPath()` as an argument.\n\nOnce these are declared, Kubeflow automatically generates and supplies the necessary paths when the components are connected in a pipeline. 
Therefore, users no longer need to worry about the paths and only need to consider the relationships between components.\n\nBased on this information, when rewriting the component wrapper, it would look like the following.\n\n```python\nfrom kfp.components import InputPath, OutputPath\n\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\nInputPath or OutputPath can accept a string. This string is the format of the file to be input or output.  \nHowever, it does not necessarily mean that the file has to be stored in this format.  \nIt just serves as a helper for type checking when compiling the pipeline.  
\nIf the file format is not fixed, then no input is needed (it serves the role of something like `Any` in type hints).\n\n### Convert to Kubeflow Format\n\nConvert the written component into a format that can be used in Kubeflow.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\n## Rule for using InputPath/OutputPath\n\nThere are rules to follow when using InputPath or OutputPath arguments in pipeline.\n\n### Load Data Component\n\nTo execute the previously written component, a component that generates data is created since data is required.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n```\n\n### Write Pipeline\n\nNow let's write the pipeline.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"complex_pipeline\")\ndef complex_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        
train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n```\n\nHave you noticed something strange?  \nAll the `_path` suffixes have disappeared from the arguments received in the input and output.  \nWe can see that instead of accessing `iris_data.outputs[\"data_path\"]`, we are accessing `iris_data.outputs[\"data\"]`.  \nThis happens because Kubeflow has a rule that paths created with `InputPath` and `OutputPath` can be accessed without the `_path` suffix when accessed from the pipeline.\n\nHowever, if you upload the pipeline just written, it will not run.  \nThe reason is explained on the next page.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/advanced-environment.md",
    "content": "---\ntitle : \"9. Component - Environment\"\ndescription: \"\"\nsidebar_position: 9\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Component Environment\n\nWhen we run the pipeline written in [8. Component - InputPath/OutputPath](../kubeflow/advanced-component.md), it fails. Let's find out why it fails and modify it so that it can run properly. \n\n### Convert to Kubeflow Format\n\nLet's convert the component written [earlier](../kubeflow/advanced-component.md#convert-to-kubeflow-format) into a yaml file.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\nIf you run the script above, you will get a `train_from_csv.yaml` file like the one below.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: model, type: dill}\n- {name: kernel, type: String}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n       
   train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --model\n    - {inputPath: model}\n    - --kernel\n    - {inputValue: kernel}\n```\n\nAccording to the content explained in the [Basic Usage Component](../kubeflow/basic-component.md#convert-to-kubeflow-format) previously mentioned, this component will be executed as follows:\n\n1. `docker pull python:3.7`\n2. run `command`\n\nHowever, when running the component created above, an error will occur.  \nThe reason is in the way the component wrapper is executed.  \nKubeflow uses Kubernetes, so the component wrapper runs the component content on its own separate container.\n\nIn detail, the image specified in the generated `train_from_csv.yaml` is `image: python:3.7`.\n\nThere may be some people who notice why it is not running for some reason.\n\nThe `python:3.7` image does not have the packages we want to use, such as `dill`, `pandas`, and `sklearn`, installed.  
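\n\nA quick way to see the problem is to check the component's imports against a bare interpreter. The sketch below is illustrative only (the `find_missing` helper is hypothetical, not part of kfp); run inside a stock `python:3.7` container, it would report all three packages as missing:\n\n```python\nimport importlib.util\n\n\ndef find_missing(packages):\n    \"\"\"Return the package names that cannot be imported in this interpreter.\"\"\"\n    return [name for name in packages if importlib.util.find_spec(name) is None]\n\n\n# The component wrapper imports these at runtime inside the container.\nprint(find_missing([\"dill\", \"pandas\", \"sklearn\"]))\n```\n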
\nTherefore, when the component is executed, it fails with an error indicating that the packages cannot be found.\n\nSo, how can we add the packages?\n\n## Adding packages\n\nDuring the process of converting to the Kubeflow format, there are two ways to add packages:\n\n1. Using `base_image`\n2. Using `packages_to_install`\n\nLet's check which arguments `create_component_from_func`, the function used to compile components, can receive.\n\n```python\ndef create_component_from_func(\n    func: Callable,\n    output_component_file: Optional[str] = None,\n    base_image: Optional[str] = None,\n    packages_to_install: List[str] = None,\n    annotations: Optional[Mapping[str, str]] = None,\n):\n```\n\n- `func`: The function from which the component wrapper is created.\n- `base_image`: Image that the component wrapper will run on.\n- `packages_to_install`: Additional packages that need to be installed for the component to run.\n\n### 1. base_image\n\nTaking a closer look, the sequence in which the component is executed is as follows:\n\n1. `docker pull base_image`\n2. `pip install packages_to_install`\n3. run `command`\n\nIf the `base_image` used by the component already has all the packages installed, you can use it without installing additional packages.\n\nFor example, on this page we are going to write a Dockerfile like this:\n\n```dockerfile\nFROM python:3.7\n\nRUN pip install dill pandas scikit-learn\n```\n\nLet's build the image using the Dockerfile above. The container registry we will use for this practice is ghcr.  \nYou can choose a registry according to your environment and push the image to it.\n\n```bash\ndocker build . 
-f Dockerfile -t ghcr.io/mlops-for-all/base-image\ndocker push ghcr.io/mlops-for-all/base-image\n```\n\nNow let's try inputting the base image.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    base_image=\"ghcr.io/mlops-for-all/base-image:latest\",\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\nIf you compile the generated component, it will appear as follows.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: ghcr.io/mlops-for-all/base-image:latest\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def _make_parent_dirs_and_return_path(file_path: str):\n          import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = 
pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\nWe can confirm that the base_image has been changed to the value we have set.\n\n### 2. 
packages_to_install\n\nHowever, building a new Docker image every time a package is added takes a lot of time.\nIn this case, we can use the `packages_to_install` argument to easily add packages to the container.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill==0.3.4\", \"pandas==1.3.4\", \"scikit-learn==1.0.1\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\nIf you execute the script, the `train_from_csv.yaml` file will be generated.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def _make_parent_dirs_and_return_path(file_path: str):\n       
   import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\nIf we take a closer look at the order in which the components written above are executed, it looks like this:\n\n1. `docker pull python:3.7`\n2. `pip install dill==0.3.4 pandas==1.3.4 scikit-learn==1.0.1`\n3. 
run `command`\n\nExamining the generated yaml file closely, we can see that the following lines are added automatically, so the necessary packages are installed and the program runs without errors.\n\n```bash\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/advanced-mlflow.md",
    "content": "---\ntitle : \"12. Component - MLFlow\"\ndescription: \"\"\nsidebar_position: 12\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## MLFlow Component\n\nIn this page, we will explain the process of writing a component to store the model in MLFlow so that the model trained in [Advanced Usage Component](../kubeflow/advanced-component.md) can be linked to API deployment.\n\n## MLFlow in Local\n\nIn order to store the model in MLFlow and use it in serving, the following items are needed.\n\n- model\n- signature\n- input_example\n- conda_env\n\nWe will look into the process of saving a model to MLFlow through Python code.\n\n### 1. Train model\n\nThe following steps involve training an SVC model using the iris dataset.\n\n```python\nimport pandas as pd\nfrom sklearn.datasets import load_iris\nfrom sklearn.svm import SVC\n\niris = load_iris()\n\ndata = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\ntarget = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\nclf = SVC(kernel=\"rbf\")\nclf.fit(data, target)\n\n```\n\n### 2. 
MLFlow Infos\n\nThis process creates the necessary information for MLFlow.\n\n```python\nfrom mlflow.models.signature import infer_signature\nfrom mlflow.utils.environment import _mlflow_conda_env\n\ninput_example = data.sample(1)\nsignature = infer_signature(data, clf.predict(data))\nconda_env = _mlflow_conda_env(additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"])\n```\n\nEach variable's content is as follows.\n\n- `input_example`\n\n    | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |\n    | --- | --- | --- | --- |\n    | 6.5 | 6.7 | 3.1 | 4.4 |\n\n- `signature`\n\n    ```python\n    inputs:\n      ['sepal length (cm)': double, 'sepal width (cm)': double, 'petal length (cm)': double, 'petal width (cm)': double]\n    outputs:\n      [Tensor('int64', (-1,))]\n    ```\n\n- `conda_env`\n\n    ```python\n    {'name': 'mlflow-env',\n     'channels': ['conda-forge'],\n     'dependencies': ['python=3.8.10',\n      'pip',\n      {'pip': ['mlflow', 'dill', 'pandas', 'scikit-learn']}]}\n    ```\n\n### 3. Save MLFlow Infos\n\nNext, we save the model along with the information generated above. 
Since the trained model uses the sklearn package, we can easily save the model using `mlflow.sklearn`.\n\n```python\nfrom mlflow.sklearn import save_model\n\nsave_model(\n    sk_model=clf,\n    path=\"svc\",\n    serialization_format=\"cloudpickle\",\n    conda_env=conda_env,\n    signature=signature,\n    input_example=input_example,\n)\n```\n\nIf you run this locally, an `svc` folder will be created with the following files.\n\n```bash\nls svc\n```\n\nExecuting the command above shows the following output.\n\n```bash\nMLmodel            conda.yaml         input_example.json model.pkl          requirements.txt\n```\n\nThe contents of each file are as follows.\n\n- MLmodel\n\n    ```bash\n    flavors:\n      python_function:\n        env: conda.yaml\n        loader_module: mlflow.sklearn\n        model_path: model.pkl\n        python_version: 3.8.10\n      sklearn:\n        pickled_model: model.pkl\n        serialization_format: cloudpickle\n        sklearn_version: 1.0.1\n    saved_input_example_info:\n      artifact_path: input_example.json\n      pandas_orient: split\n      type: dataframe\n    signature:\n      inputs: '[{\"name\": \"sepal length (cm)\", \"type\": \"double\"}, {\"name\": \"sepal width\n        (cm)\", \"type\": \"double\"}, {\"name\": \"petal length (cm)\", \"type\": \"double\"}, {\"name\":\n        \"petal width (cm)\", \"type\": \"double\"}]'\n      outputs: '[{\"type\": \"tensor\", \"tensor-spec\": {\"dtype\": \"int64\", \"shape\": [-1]}}]'\n    utc_time_created: '2021-12-06 06:52:30.612810'\n    ```\n\n- conda.yaml\n\n    ```bash\n    channels:\n    - conda-forge\n    dependencies:\n    - python=3.8.10\n    - pip\n    - pip:\n      - mlflow\n      - dill\n      - pandas\n      - scikit-learn\n    name: mlflow-env\n    ```\n\n- input_example.json\n\n    ```bash\n    {\n        \"columns\": \n        [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n      
      \"petal width (cm)\"\n        ],\n        \"data\": \n        [\n            [6.7, 3.1, 4.4, 1.4]\n        ]\n    }\n    ```\n\n- requirements.txt\n\n    ```bash\n    mlflow\n    dill\n    pandas\n    scikit-learn\n    ```\n\n- model.pkl\n\n## MLFlow on Server\n\nNow, let's proceed with the task of uploading the saved model to the MLflow server.\n\n```python\nimport mlflow\n\nwith mlflow.start_run():\n    mlflow.log_artifact(\"svc/\")\n```\n\nSave and open the `mlruns` directory generated path with `mlflow ui` command to launch mlflow server and dashboard.\nAccess the mlflow dashboard, click the generated run to view it as below.\n\n![mlflow-0.png](./img/mlflow-0.png)\n(This screen may vary depending on the version of mlflow.)\n\n## MLFlow Component\n\nNow, let's write a reusable component in Kubeflow.\n\nThe ways of writing components that can be reused are broadly divided into three categories.\n\n1. After saving the necessary environment in the component responsible for model training, the MLflow component is only responsible for the upload.\n\n    ![mlflow-1.png](./img/mlflow-1.png)\n\n2. Pass the trained model and data to the MLflow component, which is responsible for saving and uploading.\n\n    ![mlflow-2.png](./img/mlflow-2.png)\n\n3. 
The component responsible for model training handles both saving and uploading.\n\n    ![mlflow-3.png](./img/mlflow-3.png)\n\nWe will manage the model using the first approach.\nThe reason is that, unlike approach 3, we don't need to write the MLFlow upload code again for every training component we create.\n\nMethods 1 and 2 both allow the upload component to be reused.\nHowever, in the case of method 2, the image and packages used for training must be passed to the MLFlow component, so additional information about the training component ultimately has to be delivered.\n\nIn order to proceed with method 1, the training component must also be changed: code that stores the environment needed to save the model must be added.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as 
file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n```\n\nWrite a component to upload to MLFlow.\nAt this time, configure the uploaded MLFlow endpoint to be connected to the [mlflow service](../setup-components/install-components-mlflow.md) that we installed.  \nIn this case, use the Kubernetes Service DNS Name of the Minio installed at the time of MLFlow Server installation. As this service is created in the Kubeflow namespace with the name minio-service, set it to `http://minio-service.kubeflow.svc:9000`.  \nSimilarly, for the tracking_uri address, use the Kubernetes Service DNS Name of the MLFlow server and set it to `http://mlflow-server-service.mlflow-system.svc:5000`.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = 
dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n```\n\n## MLFlow Pipeline\n\nNow let's connect the components we have written and create a pipeline. \n\n### Data Component\n\nThe data we will use to train the model is sklearn's iris.\nWe will write a component to generate the data.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n```\n\n### Pipeline\n\nThe pipeline code can be written as follows.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        
input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n```\n\n### Run\n\nIf you organize the components and pipelines written above into a single Python file, it would look like this.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        
dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef 
mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(mlflow_pipeline, \"mlflow_pipeline.yaml\")\n```\n\n<p>\n  <details>\n    <summary>mlflow_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: mlflow-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-19T14:14:11.999807',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"kernel\", \"type\":\n      \"String\"}, {\"name\": \"model_name\", \"type\": \"String\"}], \"name\": \"mlflow_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10}\nspec:\n  entrypoint: mlflow-pipeline\n  templates:\n  - name: load-iris-data\n    container:\n      args: [--data, /tmp/outputs/data/data, --target, /tmp/outputs/target/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'pandas' 'scikit-learn' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n        install --quiet --no-warn-script-location 'pandas' 'scikit-learn' --user)\n        && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            
os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def load_iris_data(\n            data_path,\n            target_path,\n        ):\n            import pandas as pd\n            from sklearn.datasets import load_iris\n\n            iris = load_iris()\n\n            data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n            target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n            data.to_csv(data_path, index=False)\n            target.to_csv(target_path, index=False)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Load iris data', description='')\n        _parser.add_argument(\"--data\", dest=\"data_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--target\", dest=\"target_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = load_iris_data(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/outputs/data/data}\n      - {name: load-iris-data-target, path: /tmp/outputs/target/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--data\", {\"outputPath\": \"data\"}, \"--target\", {\"outputPath\": \"target\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''pandas'' ''scikit-learn'' ||\n          PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n          ''pandas'' ''scikit-learn'' --user) && \\\"$0\\\" 
\\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef load_iris_data(\\n    data_path,\\n    target_path,\\n):\\n    import\n          pandas as pd\\n    from sklearn.datasets import load_iris\\n\\n    iris = load_iris()\\n\\n    data\n          = pd.DataFrame(iris[\\\"data\\\"], columns=iris[\\\"feature_names\\\"])\\n    target\n          = pd.DataFrame(iris[\\\"target\\\"], columns=[\\\"target\\\"])\\n\\n    data.to_csv(data_path,\n          index=False)\\n    target.to_csv(target_path, index=False)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Load iris data'', description='''')\\n_parser.add_argument(\\\"--data\\\",\n          dest=\\\"data_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--target\\\", dest=\\\"target_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = load_iris_data(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"name\": \"Load iris data\", \"outputs\": [{\"name\":\n          \"data\", \"type\": \"csv\"}, {\"name\": \"target\", \"type\": \"csv\"}]}', pipelines.kubeflow.org/component_ref: '{}'}\n  - name: mlflow-pipeline\n    inputs:\n      parameters:\n      - {name: kernel}\n      - {name: model_name}\n    dag:\n      tasks:\n      - {name: load-iris-data, template: load-iris-data}\n      - name: train-from-csv\n        template: train-from-csv\n        dependencies: [load-iris-data]\n        arguments:\n          parameters:\n          - {name: kernel, value: '{{inputs.parameters.kernel}}'}\n    
      artifacts:\n          - {name: load-iris-data-data, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-data}}'}\n          - {name: load-iris-data-target, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-target}}'}\n      - name: upload-sklearn-model-to-mlflow\n        template: upload-sklearn-model-to-mlflow\n        dependencies: [train-from-csv]\n        arguments:\n          parameters:\n          - {name: model_name, value: '{{inputs.parameters.model_name}}'}\n          artifacts:\n          - {name: train-from-csv-conda_env, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-conda_env}}'}\n          - {name: train-from-csv-input_example, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-input_example}}'}\n          - {name: train-from-csv-model, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-model}}'}\n          - {name: train-from-csv-signature, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-signature}}'}\n  - name: train-from-csv\n    container:\n      args: [--train-data, /tmp/inputs/train_data/data, --train-target, /tmp/inputs/train_target/data,\n        --kernel, '{{inputs.parameters.kernel}}', --model, /tmp/outputs/model/data,\n        --input-example, /tmp/outputs/input_example/data, --signature, /tmp/outputs/signature/data,\n        --conda-env, /tmp/outputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: 
str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def train_from_csv(\n            train_data_path,\n            train_target_path,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n            kernel,\n        ):\n            import dill\n            import pandas as pd\n            from sklearn.svm import SVC\n\n            from mlflow.models.signature import infer_signature\n            from mlflow.utils.environment import _mlflow_conda_env\n\n            train_data = pd.read_csv(train_data_path)\n            train_target = pd.read_csv(train_target_path)\n\n            clf = SVC(kernel=kernel)\n            clf.fit(train_data, train_target)\n\n            with open(model_path, mode=\"wb\") as file_writer:\n                dill.dump(clf, file_writer)\n\n            input_example = train_data.sample(1)\n            with open(input_example_path, \"wb\") as file_writer:\n                dill.dump(input_example, file_writer)\n\n            signature = infer_signature(train_data, clf.predict(train_data))\n            with open(signature_path, \"wb\") as file_writer:\n                dill.dump(signature, file_writer)\n\n            conda_env = _mlflow_conda_env(\n                additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n            )\n            with open(conda_env_path, \"wb\") as file_writer:\n                dill.dump(conda_env, file_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n        _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, 
default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", dest=\"input_example_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = train_from_csv(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: kernel}\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/inputs/train_data/data}\n      - {name: load-iris-data-target, path: /tmp/inputs/train_target/data}\n    outputs:\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/outputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/outputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/outputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/outputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--train-data\", {\"inputPath\": \"train_data\"}, \"--train-target\",\n          {\"inputPath\": \"train_target\"}, \"--kernel\", {\"inputValue\": \"kernel\"}, \"--model\",\n          {\"outputPath\": \"model\"}, \"--input-example\", {\"outputPath\": \"input_example\"},\n          \"--signature\", {\"outputPath\": \"signature\"}, 
\"--conda-env\", {\"outputPath\":\n          \"conda_env\"}], \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''dill'' ''pandas''\n          ''scikit-learn'' ''mlflow'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m\n          pip install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef train_from_csv(\\n    train_data_path,\\n    train_target_path,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n    kernel,\\n):\\n    import\n          dill\\n    import pandas as pd\\n    from sklearn.svm import SVC\\n\\n    from\n          mlflow.models.signature import infer_signature\\n    from mlflow.utils.environment\n          import _mlflow_conda_env\\n\\n    train_data = pd.read_csv(train_data_path)\\n    train_target\n          = pd.read_csv(train_target_path)\\n\\n    clf = SVC(kernel=kernel)\\n    clf.fit(train_data,\n          train_target)\\n\\n    with open(model_path, mode=\\\"wb\\\") as file_writer:\\n        dill.dump(clf,\n          file_writer)\\n\\n    input_example = train_data.sample(1)\\n    with open(input_example_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(input_example, file_writer)\\n\\n    signature\n          = infer_signature(train_data, clf.predict(train_data))\\n    with open(signature_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(signature, file_writer)\\n\\n    conda_env\n          = _mlflow_conda_env(\\n        additional_pip_deps=[\\\"dill\\\", \\\"pandas\\\",\n  
        \\\"scikit-learn\\\"]\\n    )\\n    with open(conda_env_path, \\\"wb\\\") as file_writer:\\n        dill.dump(conda_env,\n          file_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Train\n          from csv'', description='''')\\n_parser.add_argument(\\\"--train-data\\\", dest=\\\"train_data_path\\\",\n          type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--train-target\\\",\n          dest=\\\"train_target_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--kernel\\\",\n          dest=\\\"kernel\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\", dest=\\\"input_example_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\", dest=\\\"conda_env_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = train_from_csv(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"train_data\", \"type\": \"csv\"},\n          {\"name\": \"train_target\", \"type\": \"csv\"}, {\"name\": \"kernel\", \"type\": \"String\"}],\n          \"name\": \"Train from csv\", \"outputs\": [{\"name\": \"model\", \"type\": \"dill\"},\n          {\"name\": \"input_example\", \"type\": \"dill\"}, {\"name\": \"signature\", \"type\":\n          \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}]}', pipelines.kubeflow.org/component_ref: '{}',\n        
pipelines.kubeflow.org/arguments.parameters: '{\"kernel\": \"{{inputs.parameters.kernel}}\"}'}\n  - name: upload-sklearn-model-to-mlflow\n    container:\n      args: [--model-name, '{{inputs.parameters.model_name}}', --model, /tmp/inputs/model/data,\n        --input-example, /tmp/inputs/input_example/data, --signature, /tmp/inputs/signature/data,\n        --conda-env, /tmp/inputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' 'boto3' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' 'boto3' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def upload_sklearn_model_to_mlflow(\n            model_name,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n        ):\n            import os\n            import dill\n            from mlflow.sklearn import save_model\n\n            from mlflow.tracking.client import MlflowClient\n\n            os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n            os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n            os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n            client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n            with open(model_path, mode=\"rb\") as file_reader:\n                clf = dill.load(file_reader)\n\n            with open(input_example_path, \"rb\") as file_reader:\n                input_example = dill.load(file_reader)\n\n            with open(signature_path, \"rb\") as file_reader:\n                signature = dill.load(file_reader)\n\n            
with open(conda_env_path, \"rb\") as file_reader:\n                conda_env = dill.load(file_reader)\n\n            save_model(\n                sk_model=clf,\n                path=model_name,\n                serialization_format=\"cloudpickle\",\n                conda_env=conda_env,\n                signature=signature,\n                input_example=input_example,\n            )\n            run = client.create_run(experiment_id=\"0\")\n            client.log_artifact(run.info.run_id, model_name)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Upload sklearn model to mlflow', description='')\n        _parser.add_argument(\"--model-name\", dest=\"model_name\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", dest=\"input_example_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: model_name}\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/inputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/inputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/inputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/inputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: 
{pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--model-name\", {\"inputValue\": \"model_name\"}, \"--model\", {\"inputPath\":\n          \"model\"}, \"--input-example\", {\"inputPath\": \"input_example\"}, \"--signature\",\n          {\"inputPath\": \"signature\"}, \"--conda-env\", {\"inputPath\": \"conda_env\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' ''boto3'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install\n          --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn'' ''mlflow''\n          ''boto3'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def upload_sklearn_model_to_mlflow(\\n    model_name,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n):\\n    import\n          os\\n    import dill\\n    from mlflow.sklearn import save_model\\n\\n    from\n          mlflow.tracking.client import MlflowClient\\n\\n    os.environ[\\\"MLFLOW_S3_ENDPOINT_URL\\\"]\n          = \\\"http://minio-service.kubeflow.svc:9000\\\"\\n    os.environ[\\\"AWS_ACCESS_KEY_ID\\\"]\n          = \\\"minio\\\"\\n    os.environ[\\\"AWS_SECRET_ACCESS_KEY\\\"] = \\\"minio123\\\"\\n\\n    client\n          = MlflowClient(\\\"http://mlflow-server-service.mlflow-system.svc:5000\\\")\\n\\n    with\n          open(model_path, mode=\\\"rb\\\") as file_reader:\\n        clf = dill.load(file_reader)\\n\\n    with\n          open(input_example_path, \\\"rb\\\") as file_reader:\\n        input_example\n          = dill.load(file_reader)\\n\\n    with open(signature_path, \\\"rb\\\") as file_reader:\\n        signature\n          = dill.load(file_reader)\\n\\n 
   with open(conda_env_path, \\\"rb\\\") as file_reader:\\n        conda_env\n          = dill.load(file_reader)\\n\\n    save_model(\\n        sk_model=clf,\\n        path=model_name,\\n        serialization_format=\\\"cloudpickle\\\",\\n        conda_env=conda_env,\\n        signature=signature,\\n        input_example=input_example,\\n    )\\n    run\n          = client.create_run(experiment_id=\\\"0\\\")\\n    client.log_artifact(run.info.run_id,\n          model_name)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Upload\n          sklearn model to mlflow'', description='''')\\n_parser.add_argument(\\\"--model-name\\\",\n          dest=\\\"model_name\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\",\n          dest=\\\"input_example_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\",\n          dest=\\\"conda_env_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"model_name\", \"type\": \"String\"},\n          {\"name\": \"model\", \"type\": \"dill\"}, {\"name\": \"input_example\", \"type\": \"dill\"},\n          {\"name\": \"signature\", \"type\": \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}],\n          \"name\": \"Upload sklearn model to mlflow\"}', pipelines.kubeflow.org/component_ref: '{}',\n        pipelines.kubeflow.org/arguments.parameters: '{\"model_name\": \"{{inputs.parameters.model_name}}\"}'}\n  arguments:\n    parameters:\n    - {name: kernel}\n    - 
{name: model_name}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter running the script to generate the mlflow_pipeline.yaml file, upload the pipeline and create a run to check the results.\n\n![mlflow-svc-0](./img/mlflow-svc-0.png)\n\nPort-forward the MLflow service to access the MLflow UI.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\nOpen a web browser and go to localhost:5000. You will see that a run has been created, as shown below.\n\n![mlflow-svc-1](./img/mlflow-svc-1.png)\n\nClick on the run to verify that the trained model file is present.\n\n![mlflow-svc-2](./img/mlflow-svc-2.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/advanced-pipeline.md",
    "content": "---\ntitle : \"10. Pipeline - Setting\"\ndescription: \"\"\nsidebar_position: 10\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline Setting\n\nOn this page, we will look at the values that can be set in a pipeline.\n\n## Display Name\n\nComponents created within a pipeline have two names:\n\n- task_name: the name of the function used to write the component\n- display_name: the name that appears in the Kubeflow UI\n\nFor example, when two components are both named Print and return number, it is difficult to tell which component is which.\n\n![run-7](./img/run-7.png)\n\n### set_display_name\n\nThe solution to this is the display_name.  \nWe can set the display_name in the pipeline by using the set_display_name [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ContainerOp.set_display_name) of the component.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nIf you run this script and check the resulting `example_pipeline.yaml`, it looks like this.\n\n<p>\n  <details>\n    
<summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-09T18:11:43.193190',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def 
print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is number 1, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": 
\"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n      labels:\n        
pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    
outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with 
open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is sum of number\n          1 and number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": 
[\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\",\n          {\"inputValue\": \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nIf compared with the previous file, the **`pipelines.kubeflow.org/task_display_name`** key has been newly created.\n\n### UI in Kubeflow\n\n\nWe will upload the version of the previously created [pipeline](../kubeflow/basic-pipeline-upload.md#upload-pipeline-version) using the files we created 
earlier.\n\n![adv-pipeline-0.png](./img/adv-pipeline-0.png)\n\nAs you can see, the configured name is displayed as shown above.\n\n## Resources\n\n### GPU\n\nBy default, when the pipeline runs components as Kubernetes pods, it uses the default resource specifications.  \nIf you need to train a model using a GPU and the Kubernetes environment doesn't allocate a GPU, the training may not be performed correctly.  \nTo address this, you can use the `set_gpu_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.UserContainer.set_gpu_limit) to set the GPU limit.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1)\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nIf you run the script above and look closely at `sum-and-print-numbers` in the generated file, you can see that a `resources` field with `{nvidia.com/gpu: 1}` has been added.\nThis is how the component is allocated a GPU.\n\n```bash\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        
'{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n      resources:\n        limits: {nvidia.com/gpu: 1}\n```\n\n### CPU\n\nThe CPU limit can be set using the `.set_cpu_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_cpu_limit).  
\nThe difference from GPUs is that the input must be a string, not an int.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_cpu_limit(\"16\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nOnly the changed part is shown below.\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, cpu: '16'}\n```\n\n### Memory\n\nMemory can be set using the `.set_memory_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_memory_limit).\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = 
print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_memory_limit(\"1G\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n\n```\n\nLikewise, only the changed part is shown below.\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, memory: 1G}\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/advanced-run.md",
    "content": "---\ntitle : \"11. Pipeline - Run Result\"\ndescription: \"\"\nsidebar_position: 11\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n\n## Run Result\n\nClick Run Result and you will see three tabs:\nGraph, Run Output, and Config.\n\n![advanced-run-0.png](./img/advanced-run-0.png)\n\n## Graph\n\n![advanced-run-1.png](./img/advanced-run-1.png)\n\nIn the graph, clicking a component shows its execution details.\n\n### Input/Output\n\nThe Input/Output tab allows you to view and download the Configurations, Input, and Output Artifacts used in the components.\n\n### Logs\n\nIn the Logs tab, you can view all the stdout output generated during the execution of the Python code.\nHowever, because pods are deleted after a certain period of time, the logs may no longer be viewable in this tab later on.\nIn that case, you can check them in the main-logs section of the Output artifacts.\n\n### Visualizations\n\nThe Visualizations tab displays plots generated by the components.\n\nTo generate a plot, you can save the desired values as an argument using `mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")`. 
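Before looking at the matplotlib part, note that the metadata file the Kubeflow Pipelines UI reads is plain JSON with an `outputs` list. A minimal stdlib-only sketch of its shape (the HTML string and the temporary file path are placeholders, not what kfp generates):

```python
import json
import tempfile

# Shape of the metadata the Kubeflow Pipelines UI consumes:
# each entry in 'outputs' describes one visualization.
metadata = {
    'outputs': [
        {
            'type': 'web-app',    # render as an inline web page
            'storage': 'inline',  # the HTML itself is embedded in 'source'
            'source': '<h1>placeholder plot</h1>',
        },
    ],
}

# In a real component the target path comes from OutputPath('UI_Metadata');
# a temporary file stands in for it here.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump(metadata, f)
    path = f.name

with open(path) as f:
    print(json.load(f)['outputs'][0]['type'])
```
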
The plot should be in HTML format.\nThe conversion process is as follows.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import create_component_from_func, OutputPath\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(\n    mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")\n):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n```\n\nWritten into a full pipeline, it looks like this.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n\n\n@pipeline(name=\"plot_pipeline\")\ndef 
plot_pipeline():\n    plot_linear()\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(plot_pipeline, \"plot_pipeline.yaml\")\n```\n\nIf you run this script and check the resulting `plot_pipeline.yaml`, you will see the following.\n\n<p>\n  <details>\n    <summary>plot_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: plot-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: '2\n022-01-17T13:31:32.963214',\n    pipelines.kubeflow.org/pipeline_spec: '{\"name\": \"plot_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: plot-pipeline\n  templates:\n  - name: plot-linear\n    container:\n      args: [--mlpipeline-ui-metadata, /tmp/outputs/mlpipeline_ui_metadata/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'matplotlib' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet\n        --no-warn-script-location 'matplotlib' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n        def plot_linear(mlpipeline_ui_metadata):\n            import base64\n            import json\n            from io import BytesIO\n            import matplotlib.pyplot as plt\n            plt.plot([1, 2, 3], [1, 2, 3])\n            tmpfile = BytesIO()\n            plt.savefig(tmpfile, format=\"png\")\n            encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n            html = f\"<img src='data:image/png;base64,{encoded}'>\"\n            
metadata = {\n                \"outputs\": [\n                    {\n                        \"type\": \"web-app\",\n                        \"storage\": \"inline\",\n                        \"source\": html,\n                    },\n                ],\n            }\n            with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n                json.dump(metadata, html_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Plot linear', description='')\n        _parser.add_argument(\"--mlpipeline-ui-metadata\", dest=\"mlpipeline_ui_metadata\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n        _outputs = plot_linear(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: mlpipeline-ui-metadata, path: /tmp/outputs/mlpipeline_ui_metadata/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--mlpipeline-ui-metadata\", {\"outputPath\": \"mlpipeline_ui_metadata\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''matplotlib'' || PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''matplotlib''\n          --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef 
plot_linear(mlpipeline_ui_metadata):\\n    import\n          base64\\n    import json\\n    from io import BytesIO\\n\\n    import matplotlib.pyplot\n          as plt\\n\\n    plt.plot([1, 2, 3], [1, 2, 3])\\n\\n    tmpfile = BytesIO()\\n    plt.savefig(tmpfile,\n          format=\\\"png\\\")\\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\\\"utf-8\\\")\\n\\n    html\n          = f\\\"<img src=''data:image/png;base64,{encoded}''>\\\"\\n    metadata = {\\n        \\\"outputs\\\":\n          [\\n            {\\n                \\\"type\\\": \\\"web-app\\\",\\n                \\\"storage\\\":\n          \\\"inline\\\",\\n                \\\"source\\\": html,\\n            },\\n        ],\\n    }\\n    with\n          open(mlpipeline_ui_metadata, \\\"w\\\") as html_writer:\\n        json.dump(metadata,\n          html_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Plot\n          linear'', description='''')\\n_parser.add_argument(\\\"--mlpipeline-ui-metadata\\\",\n          dest=\\\"mlpipeline_ui_metadata\\\", type=_make_parent_dirs_and_return_path,\n          required=True, default=argparse.SUPPRESS)\\n_parsed_args = vars(_parser.parse_args())\\n\\n_outputs\n          = plot_linear(**_parsed_args)\\n\"], \"image\": \"python:3.7\"}}, \"name\": \"Plot\n          linear\", \"outputs\": [{\"name\": \"mlpipeline_ui_metadata\", \"type\": \"UI_Metadata\"}]}',\n        pipelines.kubeflow.org/component_ref: '{}'}\n  - name: plot-pipeline\n    dag:\n      tasks:\n      - {name: plot-linear, template: plot-linear}\n  arguments:\n    parameters: []\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter running, click Visualization.\n\n![advanced-run-5.png](./img/advanced-run-5.png)\n\n## Run output\n\n![advanced-run-2.png](./img/advanced-run-2.png)\n\nRun output is where Kubeflow gathers the Artifacts generated in the specified form and shows the evaluation index (Metric).\n\nTo show the evaluation index (Metric), you can 
save the name and value you want to show in the `mlpipeline_metrics_path: OutputPath(\"Metrics\")` argument in json format. For example, you can write it like this.\n\n```python\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n                \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n```\n\nWe will add a component to generate evaluation metrics to the pipeline created in the [Pipeline](../kubeflow/basic-pipeline.md) and execute it. The whole pipeline is as follows.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int) -> int:\n    sum_number = number_1 + number_2\n    print(sum_number)\n    return sum_number\n\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n                \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n    show_metric_of_sum(sum_result.output)\n\n\nif __name__ == 
\"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nAfter execution, click Run Output and you will see the following.\n\n![advanced-run-4.png](./img/advanced-run-4.png)\n\n## Config\n\n![advanced-run-3.png](./img/advanced-run-3.png)\n\nIn the Config tab, you can view all the values received as pipeline configurations.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/basic-component.md",
    "content": "---\ntitle : \"4. Component - Write\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\n\n## Component\n\nWriting a component involves two steps:\n\n1. Writing the Component Contents\n2. Writing the Component Wrapper\n\nNow, let's look at each process.\n\n## Component Contents\n\nComponent Contents are no different from the Python code we commonly write.  \nFor example, let's try writing a component that takes a number as input, prints it, and then returns it.\nWe can write it in Python code like this.\n\n```python\nprint(number)\n```\n\nHowever, running this code raises an error because the `number` to be printed is not defined.\n\nAs we saw in [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md), values like `number` that are required in component content are defined in **Config**. In order to execute component content, the necessary Configs must be passed from the component wrapper.\n\n## Component Wrapper\n\n### Define a standalone Python function\n\nNow we need to create a component wrapper so that the required Configs can be passed in.\n\nWrapped in a component wrapper without any Config, it looks like this.\n\n```python\ndef print_and_return_number():\n    print(number)\n    return number\n```\n\nNow we add the required Config for the content as an argument to the wrapper. Note that we write not only the argument but also its type hint. When Kubeflow converts the pipeline into the Kubeflow format, it checks whether the specified input and output types match at each connection between components. 
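The flavor of this check can be illustrated with plain Python type hints. This is only an illustration of the idea, not kfp's actual implementation; `producer` and `consumer` are made-up stand-ins for two connected components:

```python
import typing

def producer() -> str:
    return '42'

def consumer(number: int) -> int:
    return number + 1

# Compare the producer's return annotation with the consumer's
# parameter annotation, the way a pipeline compiler might.
out_type = typing.get_type_hints(producer)['return']
in_type = typing.get_type_hints(consumer)['number']

if out_type is not in_type:
    print(f'type mismatch: {out_type.__name__} -> {in_type.__name__}')
```
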
If the format of the input required by a component does not match the output received from another component, the pipeline cannot be created.\n\nNow we complete the component wrapper by writing the argument, its type, and the return type as follows.\n\n```python\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\nIn Kubeflow, you can only use types that can be expressed in JSON as return values. The most commonly used and recommended types are as follows:\n\n- int\n- float\n- str\n\nIf you want to return multiple values instead of a single value, you must use `collections.namedtuple`.  \nFor more details, please refer to the [Kubeflow official documentation](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#passing-parameters-by-value).  \nFor example, if you want to write a component that returns the quotient and remainder of a number when divided by 2, it should be written as follows.\n\n```python\nfrom typing import NamedTuple\n\n\ndef divide_and_return_number(\n    number: int,\n) -> NamedTuple(\"DivideOutputs\", [(\"quotient\", int), (\"remainder\", int)]):\n    from collections import namedtuple\n\n    quotient, remainder = divmod(number, 2)\n    print(\"quotient is\", quotient)\n    print(\"remainder is\", remainder)\n\n    divide_outputs = namedtuple(\n        \"DivideOutputs\",\n        [\n            \"quotient\",\n            \"remainder\",\n        ],\n    )\n    return divide_outputs(quotient, remainder)\n```\n\n### Convert to Kubeflow Format\n\nNow you have to convert the written component into a format that can be used in Kubeflow. The conversion can be done through `kfp.components.create_component_from_func`. 
This converted form can be imported as a function in Python and used in the pipeline.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\n### Share component with yaml file\n\nIf it is not possible to share with Python code, you can share components with a YAML file and use them.\nTo do this, first convert the component to a YAML file and then use it in the pipeline with `kfp.components.load_component_from_file`.\n\nFirst, let's explain the process of converting the written component to a YAML file.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\nif __name__ == \"__main__\":\n    print_and_return_number.component_spec.save(\"print_and_return_number.yaml\")\n```\n\nIf you run the Python code you wrote, a file called `print_and_return_number.yaml` will be created. 
When you check the file, it will be as follows.\n\n```bash\nname: Print and return number\ninputs:\n- {name: number, type: Integer}\noutputs:\n- {name: Output, type: Integer}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def print_and_return_number(number):\n          print(number)\n          return number\n\n      def _serialize_int(int_value: int) -> str:\n          if isinstance(int_value, str):\n              return int_value\n          if not isinstance(int_value, int):\n              raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n          return str(int_value)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n      _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n      _parsed_args = vars(_parser.parse_args())\n      _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n      _outputs = print_and_return_number(**_parsed_args)\n\n      _outputs = [_outputs]\n\n      _output_serializers = [\n          _serialize_int,\n\n      ]\n\n      import os\n      for idx, output_file in enumerate(_output_files):\n          try:\n              os.makedirs(os.path.dirname(output_file))\n          except OSError:\n              pass\n          with open(output_file, 'w') as f:\n              f.write(_output_serializers[idx](_outputs[idx]))\n    args:\n    - --number\n    - {inputValue: number}\n    - '----output-paths'\n    - {outputPath: Output}\n```\n\nNow the generated file can be shared and used in the pipeline as follows.\n\n```python\nfrom kfp.components import 
load_component_from_file\n\nprint_and_return_number = load_component_from_file(\"print_and_return_number.yaml\")\n```\n\n## How Kubeflow executes component\n\nIn Kubeflow, the execution order of components is as follows:\n\n1. `docker pull <image>`: Pull the image containing the execution environment information of the defined component.\n2. Run `command`: Execute the component's content within the pulled image.\n\nTaking `print_and_return_number.yaml` as an example, the default image in `@create_component_from_func` is `python:3.7`, so the component's content will be executed based on that image.\n\n1. `docker pull python:3.7`\n2. `print(number)`\n\n## References:\n- [Getting Started With Python function based components](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#getting-started-with-python-function-based-components)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/basic-pipeline-upload.md",
    "content": "---\ntitle : \"6. Pipeline - Upload\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Upload Pipeline\n\nNow, let's upload the pipeline we created directly to Kubeflow.  \nPipeline uploads can be done through the Kubeflow dashboard UI.\nSet up port forwarding as described in [Install Kubeflow](../setup-components/install-components-kf.md).\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nAccess [http://localhost:8080](http://localhost:8080) to open the dashboard.\n\n### 1. Click Pipelines Tab\n\n![pipeline-gui-0.png](./img/pipeline-gui-0.png)\n\n### 2. Click Upload Pipeline\n\n![pipeline-gui-1.png](./img/pipeline-gui-1.png)\n\n### 3. Click Choose file\n\n![pipeline-gui-2.png](./img/pipeline-gui-2.png)\n\n### 4. Upload created yaml file\n\n![pipeline-gui-3.png](./img/pipeline-gui-3.png)\n\n### 5. Create\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\n## Upload Pipeline Version\n\n\nUploading again lets you manage versions of a pipeline. However, this groups pipelines under the same name rather than providing code-level version control like GitHub.\nIn the example above, clicking on `example_pipeline` will bring up the following screen.\n\n![pipeline-gui-5.png](./img/pipeline-gui-5.png)\n\nIf you click it, the following screen shows.\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\nIf you click Upload Version, a screen appears where you can upload the pipeline.\n\n![pipeline-gui-6.png](./img/pipeline-gui-6.png)\n\nNow, upload your pipeline.\n\n![pipeline-gui-7.png](./img/pipeline-gui-7.png)\n\nOnce uploaded, you can check the pipeline version as follows.\n\n![pipeline-gui-8.png](./img/pipeline-gui-8.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/basic-pipeline.md",
    "content": "---\ntitle : \"5. Pipeline - Write\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline\n\nComponents do not run on their own but as parts of a pipeline. Therefore, in order to run a component, a pipeline must be written.\nWriting a pipeline, in turn, requires a set of components and the order in which they run.\n\nOn this page, we will create a pipeline with a component that takes a number as input and outputs it, and a component that takes the two numbers from those components and outputs their sum.\n\n## Component Set\n\nFirst, let's create the components that will be used in the pipeline.\n\n1. `print_and_return_number`\n\n   This component prints and returns the input number.  \n   Since the component returns the input value, we specify `int` as the return type hint.\n\n   ```python\n   @create_component_from_func\n   def print_and_return_number(number: int) -> int:\n       print(number)\n       return number\n   ```\n\n2. `sum_and_print_numbers`\n\n   This component calculates the sum of two input numbers and prints it.  \n   Similarly, since the component returns the sum, we specify `int` as the return type hint.\n\n   ```python\n   @create_component_from_func\n   def sum_and_print_numbers(number_1: int, number_2: int) -> int:\n       sum_num = number_1 + number_2\n       print(sum_num)\n       return sum_num\n   ```\n\n## Component Order\n\n### Define Order\n\nIf you have created the necessary set of components, the next step is to define their sequence.  
\nThe diagram below represents the order of the pipeline components to be created on this page.\n\n![pipeline-0.png](./img/pipeline-0.png)\n\n### Single Output\n\nNow let's translate this sequence into code.\n\nFirst, writing `print_and_return_number_1` and `print_and_return_number_2` from the picture above would look like this.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n```\n\nEach component runs and its return value is stored in `number_1_result` and `number_2_result`, respectively.  \nThe stored return value can then be used through `number_1_result.output`.\n\n### Multi Output\n\nIn the example above, the components return a single value, so it can be used directly with `output`.  \nHowever, if there are multiple return values, they are stored in `outputs` as a dictionary. You can use the keys to access the desired return values.\nLet's consider an example with a component that returns multiple values, like the one mentioned in the [component](../kubeflow/basic-component.md#define-a-standalone-python-function) definition. The `divide_and_return_number` component returns `quotient` and `remainder`. 
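Outside of kfp, the `namedtuple` mechanics behind such a multi-output component can be checked with plain Python; dividing 7 by 2 here is just an arbitrary example:

```python
from collections import namedtuple

DivideOutputs = namedtuple('DivideOutputs', ['quotient', 'remainder'])

# divmod(7, 2) returns the quotient and remainder at once.
quotient, remainder = divmod(7, 2)
result = DivideOutputs(quotient, remainder)

# Each value is reachable by name, which is what lets the pipeline
# refer to outputs['quotient'] and outputs['remainder'] by key.
print(result.quotient)   # → 3
print(result.remainder)  # → 1
```
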
Here's an example of passing these two values to `print_and_return_number`:\n\n```python\ndef multi_pipeline():\n    divided_result = divide_and_return_number(number)\n    num_1_result = print_and_return_number(divided_result.outputs[\"quotient\"])\n    num_2_result = print_and_return_number(divided_result.outputs[\"remainder\"])\n```\n\nStore the result of `divide_and_return_number` in `divided_result`, and you can get each value through `divided_result.outputs[\"quotient\"]` and `divided_result.outputs[\"remainder\"]`.\n\n### Write to Python code\n\nNow, let's get back to the main topic and pass these two results to `sum_and_print_numbers`.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nNext, gather the necessary Configs for each component and define them as pipeline Configs.\n\n```python\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\n## Convert to Kubeflow Format\n\nFinally, convert it into a format that can be used in Kubeflow. 
The conversion can be done using the `kfp.dsl.pipeline` decorator.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nKubeflow can only run pipelines in its designated YAML format, so the created pipeline needs to be compiled into YAML.\nCompilation can be done with the following code.\n\n```python\nif __name__ == \"__main__\":\n    import kfp\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\n## Conclusion\n\nGathering everything explained so far into a single Python file looks like this.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nThe compiled result is as follows.\n\n<details>\n  <summary>example_pipeline.yaml</summary>\n\n```yaml\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: 
{pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-05T13:38:51.566777',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if 
isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > 
\\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - 
sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      
annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', 
pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\", {\"inputValue\":\n          \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          
description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n</details>\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/basic-requirements.md",
    "content": "---\ntitle : \"3. Install Requirements\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\"]\n---\n\nThe recommended Python version for the practice is `python>=3.7`. For those unfamiliar with the Python environment, please refer to [Appendix 1. Python Virtual Environment](../appendix/pyenv) and install the packages on the **client node**.\n\nThe packages and versions required for the practice are as follows:\n\n- requirements.txt\n\n  ```bash\n  kfp==1.8.9\n  scikit-learn==1.0.1\n  mlflow==1.21.0\n  pandas==1.3.4\n  dill==0.3.4\n  ```\n\nActivate the [Python virtual environment](../appendix/pyenv.md#python-가상환경-생성) created in the previous section.\n\n```bash\npyenv activate demo\n```\n\nThen install the packages.\n\n```bash\npip3 install -U pip\npip3 install kfp==1.8.9 scikit-learn==1.0.1 mlflow==1.21.0 pandas==1.3.4 dill==0.3.4\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/basic-run.md",
    "content": "---\ntitle : \"7. Pipeline - Run\"\ndescription: \"\"\nsidebar_position: 7\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Run Pipeline\n\nNow we will run the uploaded pipeline.\n\n## Before Run\n\n### 1. Create Experiment\n\nExperiments in Kubeflow are units that logically manage the runs executed within them.\n\nWhen you first enter the namespace in Kubeflow, there are no Experiments created. Therefore, you must create an Experiment beforehand in order to run the pipeline. If an Experiment already exists, you can skip to [Run Pipeline](../kubeflow/basic-run.md#run-pipeline-1).\n\nExperiments can be created via the Create Experiment button.\n\n![run-0.png](./img/run-0.png)\n\n### 2. Enter Name\n\n![run-1.png](./img/run-1.png)\n\n## Run Pipeline\n\n### 1. Select Create Run\n\n![run-2.png](./img/run-2.png)\n\n### 2. Select Experiment\n\n![run-9.png](./img/run-9.png)\n\n![run-10.png](./img/run-10.png)\n\n### 3. Enter Pipeline Config\n\nFill in the values of the Config provided when creating the pipeline. The uploaded pipeline requires input values for `number_1` and `number_2`.\n\n![run-3.png](./img/run-3.png)\n\n### 4. Start\n\nClick the Start button after entering the values. The pipeline will start running.\n\n![run-4.png](./img/run-4.png)\n\n## Run Result\n\nThe executed pipelines can be viewed in the Runs tab.\nClicking on a run provides detailed information related to the executed pipeline.\n\n![run-5.png](./img/run-5.png)\n\nUpon clicking, the following screen appears. Components that have not yet executed are displayed in gray.\n\n![run-6.png](./img/run-6.png)\n\nWhen a component has completed execution, it is marked with a green checkmark.\n\n![run-7.png](./img/run-7.png)\n\nIf we look at the last component, we can see that it has printed the sum of the input values, which in this case is 8 (the sum of 3 and 5).\n\n![run-8.png](./img/run-8.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/how-to-debug.md",
    "content": "---\ntitle : \"13. Component - Debugging\"\ndescription: \"\"\nsidebar_position: 13\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Debugging Pipeline\n\nThis page covers how to debug Kubeflow components.\n\n## Failed Component\n\nWe will modify a pipeline used in [Component - MLFlow](../kubeflow/advanced-mlflow.md#mlflow-pipeline) in this page.\n\nFirst, let's modify the pipeline so that the component fails.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n    \n    data[\"sepal length (cm)\"] = None\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna()\n    data.to_csv(output_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import 
SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n\n@pipeline(name=\"debugging_pipeline\")\ndef debugging_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    drop_data = drop_na_from_csv(data=iris_data.outputs[\"data\"])\n    model = train_from_csv(\n        train_data=drop_data.outputs[\"output\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(debugging_pipeline, \"debugging_pipeline.yaml\")\n\n```\n\nThe modifications are as follows:\n\n1. In the `load_iris_data` component for loading data, `None` was injected into the `sepal length (cm)` feature.\n2. In the `drop_na_from_csv` component, the `dropna()` function is used to remove rows containing NA values.\n\nNow let's upload and run the pipeline.  
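\n\nIncidentally, the effect of these two modifications can be previewed locally with plain pandas; the toy DataFrame below is an illustrative stand-in, not the actual iris data.\n\n```python\nimport pandas as pd\n\n# Mimic the modified load_iris_data output: one feature column is entirely None.\ndata = pd.DataFrame(\n    {\"sepal length (cm)\": [None, None, None], \"petal width (cm)\": [0.2, 0.4, 1.3]}\n)\n\n# Row-wise dropna (the behavior of drop_na_from_csv above): every row now\n# contains an NA value, so the frame that reaches training becomes empty.\nrows_dropped = data.dropna()\nprint(len(rows_dropped))  # 0\n\n# Column-wise dropna would remove only the all-missing column instead.\ncols_dropped = data.dropna(axis=\"columns\")\nprint(list(cols_dropped.columns))  # ['petal width (cm)']\n```\n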
\nAfter starting it, open the Run and you will see that it failed at the `Train from csv` component.\n\n![debug-0.png](./img/debug-0.png)\n\nClick on the failed component and check the log to see the reason for the failure.\n\n![debug-2.png](./img/debug-2.png)\n\nThe log shows that the data count is 0 and the component could not train, which suggests an issue with the input data.  \nLet's investigate what might be the problem.\n\nFirst, click on the component and go to the Input/Output tab to download the input data.  \nYou can click on the link indicated by the red square to download the data.\n\n\n![debug-5.png](./img/debug-5.png)\n\nDownload both files to the same location. Then navigate to the specified path and check the downloaded files.\n\n\n```bash\nls\n```\n\nYou will find the following two files.\n\n```bash\ndrop-na-from-csv-output.tgz load-iris-data-target.tgz\n```\n\nUnzip the archives and rename the extracted files.\n\n```bash\ntar -xzvf load-iris-data-target.tgz ; mv data target.csv\ntar -xzvf drop-na-from-csv-output.tgz ; mv data data.csv\n```\n\nThen run the component code in a Jupyter notebook.\n![debug-3.png](./img/debug-3.png)\n\nDebugging revealed that dropping the data was based on rows instead of columns, resulting in all the data being removed.\nNow that we know the cause of the problem, we can modify the component to drop based on columns.\n\n```python\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna(axis=\"columns\")\n    data.to_csv(output_path, index=False)\n```\n\nAfter the modification, upload the pipeline again and run it to confirm that it completes normally, as shown below.\n\n![debug-6.png](./img/debug-6.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/kubeflow-concepts.md",
    "content": "---\ntitle : \"2. Kubeflow Concepts\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Component\n\nA component is composed of Component contents and a Component wrapper.\nA single component is delivered to Kubeflow through a Component wrapper, and the delivered component executes the defined Component contents and produces artifacts.\n\n![concept-0.png](./img/concept-0.png)\n\n### Component Contents\n\nComponent contents are made up of three parts:\n\n![concept-1.png](./img/concept-1.png)\n\n1. Environment\n2. Python code w/ Config\n3. Generates Artifacts\n\nLet's explore each part with an example.\nHere is Python code that loads data, trains an SVC (Support Vector Classifier) model, and saves the SVC model.\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target = pd.read_csv(train_target_path)\n\nclf = SVC(kernel=kernel)\nclf.fit(train_data, train_target)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n    dill.dump(clf, file_writer)\n```\n\nThe above Python code can be divided into component contents as follows.\n\n![concept-2.png](./img/concept-2.png)\n\nEnvironment is the part of the Python code where the packages used in the code are imported.  \nNext, Python code w/ Config is where the given Config is used to actually perform the training.  \nFinally, there is a process to save the artifacts.  \n\n### Component Wrapper\n\nA component wrapper delivers the necessary Config to the component contents and executes them.\n\n![concept-3.png](./img/concept-3.png)\n\nIn Kubeflow, component wrappers are defined as functions, similar to the `train_svc_from_csv` example above.\nWhen a component wrapper wraps the contents, it looks like the following:\n\n![concept-4.png](./img/concept-4.png)\n\n### Artifacts\n\nIn the explanation above, it was mentioned that the component creates Artifacts. 
An artifact is any file generated by a component, such as evaluation results, logs, etc.\nAmong these, the ones we are interested in are Models, Data, and Metrics.\n\n![concept-5.png](./img/concept-5.png)\n\n- Model\n- Data\n- Metric\n- etc\n\n#### Model\n\nWe defined the model as follows:\n\n> A model is a form that includes Python code, trained weights and network architecture, and an environment to run it.\n\n#### Data\n\nData includes preprocessed features, model predictions, etc.\n\n#### Metric\n\nMetric is divided into two categories: dynamic metrics and static metrics.\n\n- Dynamic metrics refer to values that continuously change during the training process, such as train loss per epoch.\n- Static metrics refer to evaluation metrics, such as accuracy, that are calculated after the training is completed.\n\n## Pipeline\n\nA pipeline consists of a collection of components and the order in which they are executed. The order forms a directed acyclic graph (DAG), which can include simple conditional statements.\n\n![concept-6.png](./img/concept-6.png)\n\n### Pipeline Config\n\nAs mentioned earlier, components require config to be executed. The pipeline config contains the configs for all the components in the pipeline.\n\n![concept-7.png](./img/concept-7.png)\n\n## Run\n\nTo execute a pipeline, the pipeline config specific to that pipeline is required. In Kubeflow, an executed pipeline is called a \"Run.\"\n\n![concept-8.png](./img/concept-8.png)\n\nWhen a pipeline is executed, each component generates artifacts. Kubeflow pipeline assigns a unique ID to each Run, and all artifacts generated during the Run are stored.\n\n![concept-9.png](./img/concept-9.png)\n\nNow, let's learn how to write components and pipelines.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow/kubeflow-intro.md",
    "content": "---\ntitle : \"1. Kubeflow Introduction\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\"]\n---\n\nTo use Kubeflow, you need to write components and pipelines.\n\nThe approach described in *MLOps for ALL* differs slightly from the method described on the [Kubeflow Pipeline official website](https://www.kubeflow.org/docs/components/pipelines/overview/quickstart/). Here, Kubeflow Pipeline is used as one of the components in the [elements that make up MLOps](../kubeflow/kubeflow-concepts.md#component-contents) rather than a standalone workflow.\n\nNow, let's understand what components and pipelines are and how to write them.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow UI Guide\",\n  \"position\": 5,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/experiments-and-others.md",
    "content": "---\ntitle : \"6. Kubeflow Pipeline Related\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nIn the left-hand tabs of the Central Dashboard (KFP Experiments, Pipelines, Runs, Recurring Runs, Artifacts, Executions), you can manage Kubeflow Pipelines, their Runs, and the results of those Runs.\n\n![left-tabs](./img/left-tabs.png)\n\nKubeflow Pipelines are the main reason for using Kubeflow in *MLOps for ALL*, and details on how to create, execute, and check the results of Kubeflow Pipelines can be found in [3. Kubeflow](../kubeflow/kubeflow-intro).\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/experiments.md",
    "content": "---\ntitle : \"5. Experiments(AutoML)\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nNext, we will click the Experiments(AutoML) tab on the left of the Central Dashboard.\n\n![left-tabs](./img/left-tabs.png)\n\n![automl](./img/automl.png)\n\nThe Experiments(AutoML) page is where you can manage [Katib](https://www.kubeflow.org/docs/components/katib/overview/), which is responsible for AutoML through Hyperparameter Tuning and Neural Architecture Search in Kubeflow.\n\nThe usage of Katib and Experiments(AutoML) is not covered in *MLOps for ALL* v1.0, and will be added in v2.0.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/intro.md",
    "content": "---\ntitle : \"1. Central Dashboard\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\nOnce you have completed [Kubeflow installation](../setup-components/install-components-kf.md), you can access the dashboard through the following command.\n\n```bash\nkubectl port-forward --address 0.0.0.0 svc/istio-ingressgateway -n istio-system 8080:80\n```\n\n![after-login](./img/after-login.png)\n\nThe Central Dashboard is a UI that integrates all the features provided by Kubeflow. Its features can be divided based on the tabs on the left side:\n\n![left-tabs](./img/left-tabs.png)\n\n- Home\n- Notebooks\n- Tensorboards\n- Volumes\n- Models\n- Experiments(AutoML)\n- Experiments(KFP)\n- Pipelines\n- Runs\n- Recurring Runs\n- Artifacts\n- Executions\n\nNow let's take a quick look at how to use each feature.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/notebooks.md",
    "content": "---\ntitle : \"2. Notebooks\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Launch Notebook Server\n\nClick on the Notebooks tab on the left side of the Central Dashboard.\n\n![left-tabs](./img/left-tabs.png)\n\nYou will see a screen similar to the one below.\n\nThe Notebooks tab is a page where users can independently create and access Jupyter Notebook and code-server environments (hereinafter referred to as a notebook server).\n\n![notebook-home](./img/notebook-home.png)\n\nClick the \"+ NEW NOTEBOOK\" button at the top right.\n\n![new-notebook](./img/new-notebook.png)\n\nWhen the screen shown below appears, specify the spec of the notebook server to be created.\n\n![create](./img/create.png)\n\n\n<details>\n<summary>Details for each spec field:</summary>\n\n- **name**:\n  - Specifies a name to identify the notebook server.\n- **namespace**:\n  - Cannot be changed. (It is automatically set to the namespace of the currently logged-in user account.)\n- **Image**:\n  - Selects the image to use from pre-installed JupyterLab images with Python packages like sklearn, pytorch, tensorflow, etc.\n    - If you want to use an image that utilizes GPU within the notebook server, refer to the **GPUs** section below.\n  - If you want to use a custom notebook server that includes additional packages or source code, you can create a custom image and deploy it for use.\n- **CPU / RAM**:\n  - Specifies the amount of resources required.\n    - cpu: in core units\n      - Represents the number of virtual cores, and can also be specified as a float value such as `1.5`, `2.7`, etc.\n    - memory: in Gi units\n- **GPUs**:\n  - Specifies the number of GPUs to allocate to the Jupyter notebook.\n    - `None`\n      - When GPU resources are not required.\n    - 1, 2, 4\n      - Allocates 1, 2, or 4 GPUs.\n  - GPU Vendor:\n    - If you have followed the [(Optional) Setup GPU](../setup-kubernetes/setup-nvidia-gpu.md) guide and installed the NVIDIA GPU 
plugin, select NVIDIA.\n- **Workspace Volume**:\n  - Specifies the amount of disk space required within the notebook server.\n  - Do not change the Type and Name fields unless you want to increase the disk space or change the AccessMode.\n    - Check the **\"Don't use Persistent Storage for User's home\"** checkbox only if it is not necessary to save the notebook server's work. **It is generally recommended not to check this option.**\n    - If you want to use a pre-existing Persistent Volume Claim (PVC), select Type as \"Existing\" and enter the name of the PVC to use.\n- **Data Volumes**:\n  - If additional storage resources are required, click the **\"+ ADD VOLUME\"** button to create them.\n- ~~Configurations, Affinity/Tolerations, Miscellaneous Settings~~\n  - These are generally not needed, so detailed explanations are omitted in *MLOps for All*.\n\n</details>\n\nAfter specifying the spec, create the notebook server.\n\n![creating](./img/creating.png)\n\nAfter creation, the **Status** will change to a green check mark icon, and the **CONNECT button** will be activated.\n![created](./img/created.png)\n\n---\n## Accessing the Notebook Server\n\nClicking the **CONNECT button** will open a new browser window, where you will see the following screen:\n\n![notebook-access](./img/notebook-access.png)\n\nYou can start a new session by clicking the Notebook, Console, and Terminal icons in the **Launcher**.\n\n  Notebook Interface\n\n![notebook-console](./img/notebook-console.png)\n\n  Terminal Interface\n\n![terminal-console](./img/terminal-console.png)\n\n---\n\n## Stopping the Notebook Server\n\nIf you haven't used the notebook server for an extended period of time, you can stop it to optimize resource usage in the Kubernetes cluster. 
**Note that stopping the notebook server will result in the deletion of all data stored outside the Workspace Volume or Data Volume specified when creating the notebook server.**  \nIf you haven't changed the path during notebook server creation, the default Workspace Volume path is `/home/jovyan` inside the notebook server, so any data stored outside the `/home/jovyan` directory will be deleted.\n\nClicking the `STOP` button as shown below will stop the notebook server:\n\n![notebook-stop](./img/notebook-stop.png)\n\nOnce the server is stopped, the `CONNECT` button will be disabled. To restart the notebook server and use it again, click the `PLAY` button.\n\n![notebook-restart](./img/notebook-restart.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/tensorboards.md",
    "content": "---\ntitle : \"3. Tensorboards\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nLet's click on the Tensorboards tab of the left tabs of the Central Dashboard next.\n\n![left-tabs](./img/left-tabs.png)\n\nWe can see the following screen. \n\n![tensorboard](./img/tensorboard.png)\n\nThe TensorBoard server created in this way can be used just like a regular remote TensorBoard server, or it can be used for the purpose of storing data directly from a Kubeflow Pipeline run for visualization purposes.\n\nYou can refer to the [TensorBoard documentation](https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#tensorboard) for more information on using TensorBoard with Kubeflow Pipeline runs.\n\nThere are various ways to visualize the results of Kubeflow Pipeline runs, and in *MLOps for ALL*, we will utilize the Visualization feature of Kubeflow components and the visualization capabilities of MLflow to enable more general use cases. Therefore, detailed explanations of the TensorBoards page will be omitted in this context.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/kubeflow-dashboard-guide/volumes.md",
    "content": "---\ntitle : \"4. Volumes\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Volumes\n\nNext, let's click on the Volumes tab in the left of the Central Dashboard.\n\n![left-tabs](./img/left-tabs.png)\n\nYou will see the following screen.\n\n![volumes](./img/volumes.png)\n\n\nVolumes tab provides the functionality to manage the Persistent Volume Claims (PVC) belonging to the current user's namespace in Kubernetes' Volume (Volume).\n\nBy looking at the screenshot, you can see the information of the Volume created on the [1. Notebooks](../kubeflow-dashboard-guide/notebooks) page. It can be seen that the Storage Class of the Volume is set to local-path, which is the Default Storage Class installed at the time of Kubernetes cluster installation.\n\nIn addition, the Volumes page can be used if you want to create, view, or delete a new Volume in the user namespace.\n\n---\n\n## Creating a Volume\n\nBy clicking the `+ NEW VOLUME` button at the top right, you can see the following screen.\n\n![new-volume](./img/new-volume.png)\n\n\nYou can create a volume by specifying its name, size, storage class, and access mode.\n\nWhen you specify the desired resource specs to create a volume, its Status will be shown as Pending on this page. When you hover over the Status icon, you will see a message that this *(This volume will be bound when its first consumer is created.)*  \nThis is according to the volume creation policy of the [StorageClass](https://kubernetes.io/ko/docs/concepts/storage/storage-classes/) used in the lab, which is local-path. **This is not a problem situation.**  \nWhen the Status is shown as Pending on this page, you can still specify the name of the volume in the notebook server or pod that you want to use the volume and the volume creation will be triggered at that time.\n\n![creating-volume](./img/creating-volume.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/_category_.json",
    "content": "{\n  \"label\": \"Prerequisites\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/_category_.json",
    "content": "{\n  \"label\": \"Docker\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/advanced.md",
    "content": "---\ntitle : \"[Practice] Docker Advanced\"\ndescription: \"Practice to use docker more advanced way.\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Making a good Docker image\n\n### Considerations to make Docker image:\n\nWhen creating a Docker image using a Dockerfile, the **order** of the commands is important.  \nThis is because Docker images are composed of many Read-Only layers and when building the image, existing layers are **cached** and reused, so if you structure your Dockerfile with this in mind, you can **reduce the build time**.\n\nEach of the `RUN`, `ADD`, `COPY` commands in a Dockerfile are stored as one layer.\n\nFor example, if we have the following `Dockerfile`:\n\n```docker\n# Layer 1\nFROM ubuntu:latest\n\n# Layer 2\nRUN apt-get update && apt-get install python3 pip3 -y\n\n# Layer 3\nRUN pip3 install -U pip && pip3 install torch\n\n# Layer 4\nCOPY src/ src/\n\n# Layer 5\nCMD python src/app.py\n```\n\nIf you run the image built with the above `Dockerfile` with the command `docker run -it app:latest /bin/bash`, it can be represented in the following layers. \n\n![layers.png](./img/layers.png)\n\nThe topmost R/W layer does not affect the image. In other words, any changes made inside the container are volatile.\n\nWhen a lower layer is changed, all the layers above it need to be rebuilt. Therefore, the order of Dockerfile instructions is important. It is recommended to place the parts that are frequently changed towards the end. (e.g., `COPY src/ app/src/`)\n\nConversely, parts that are unlikely to change should be placed towards the beginning.\n\nIf there are parts that are rarely changed but used in multiple places, they can be consolidated. 
It is advisable to create a separate image for those common parts in advance and use it as a base image.\n\nFor example, if you want to create separate images for an environment that uses `tensorflow-cpu` and another environment that uses `tensorflow-gpu`, you can do the following:\nCreate a base image [`ghcr.io/makinarocks/python:3.8-base`](http://ghcr.io/makinarocks/python:3.8-base-cpu) that includes Python and other basic packages installed. Then, when creating the images with the CPU and GPU versions of TensorFlow, you can use the base image in the `FROM` instruction and write the separate instructions for installing TensorFlow in each Dockerfile. Managing two Dockerfiles in this way improves readability and reduces build time.\n\nCombining layers had performance benefits in older versions of Docker. Since you cannot guarantee which Docker version will build and run your images, it is still recommended to combine related layers, which also improves readability. It is best to combine layers where it makes sense to do so.\n\nHere is an example of a Dockerfile:\n\n```docker\n# Bad Case\nRUN apt-get update\nRUN apt-get install build-essential -y\nRUN apt-get install curl -y\nRUN apt-get install jq -y\nRUN apt-get install git -y\n```\n\nThis can be rewritten by combining the commands as follows.\n\n```docker\n# Better Case\nRUN apt-get update && \\\n    apt-get install -y \\\n    build-essential \\\n    curl \\\n    jq \\\n    git\n```\n\nIt is also convenient to use a `.dockerignore` file.  \n`.dockerignore` plays a similar role to `.gitignore`: paths listed in it are excluded from the build context during `docker build`, just as `.gitignore` entries are excluded from `git add`. \n\nMore information can be found in the [Docker Official Documentation](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/).\n\n### ENTRYPOINT vs CMD\n\n`ENTRYPOINT` and `CMD` are both used when you want to execute a command at the runtime of the container. 
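\nAs a quick sketch of how the two combine, consider a minimal, hypothetical image (the `python:3.8-slim` base and `app.py` script are illustrative placeholders, not part of this guide):\n
\n```docker
FROM python:3.8-slim\n
COPY app.py /app.py\n
# ENTRYPOINT fixes the executable; CMD supplies its default arguments.\n
ENTRYPOINT [\"python\"]\n
CMD [\"/app.py\"]\n
```\n
\nRunning `docker run my-image` executes `python /app.py`, while `docker run my-image other.py` replaces only the `CMD` part and executes `python other.py`.\n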
One of them must be present in the Dockerfile.\n\n- **Difference**\n  - `CMD`: Easily overridden when running the `docker run` command\n  - `ENTRYPOINT`: Requires the `--entrypoint` flag to override\n\nWhen `ENTRYPOINT` and `CMD` are used together, `CMD` typically represents the arguments (parameters) for the command specified in `ENTRYPOINT`.\n\nFor example, consider the following Dockerfile:\n\n```docker\nFROM ubuntu:latest\n\n# It is easier to understand if you test the four options below yourself, switching between them.\n# Note: the NO ENTRYPOINT case cannot be tested as-is because the base image ubuntu:latest already includes a default; you can test the remaining combinations (v2, 3, 5, 6, 8, 9, 11, 12).\n# ENTRYPOINT echo \"Hello ENTRYPOINT\"\n# ENTRYPOINT [\"echo\", \"Hello ENTRYPOINT\"]\n# CMD echo \"Hello CMD\"\n# CMD [\"echo\", \"Hello CMD\"]\n```\n\n\nIf you build and run the above `Dockerfile` after uncommenting each combination, you can get the following results: \n\n|                    | No ENTRYPOINT  | ENTRYPOINT a b | ENTRYPOINT [\"a\", \"b\"] |\n| ------------------ | -------------- | -------------- | --------------------- |\n| **NO CMD**         | Error!         | /bin/sh -c a b | a b                   |\n| **CMD [\"x\", \"y\"]** | x y            | /bin/sh -c a b | a b x y               |\n| **CMD x y**        | /bin/sh -c x y | /bin/sh -c a b | a b /bin/sh -c x y    |\n\n- In a Kubernetes pod, \n    - `ENTRYPOINT` corresponds to the command\n    - `CMD` corresponds to the arguments\n\n### Naming docker tags\n\nIt is recommended not to use \"latest\" as the tag for a Docker image, as it is the default tag name and can easily be overwritten unintentionally.\n\nIt is important to ensure the uniqueness of one image with one tag for the sake of collaboration and debugging in the production stage.  \nUsing the same tag for different contents can lead to dangling images, which show up as `<none>` in `docker images` and still take up storage space.\n\n### ETC\n\n1. 
Logs and other information should be stored separately from the container, not inside it.\n    This is because data written from within the container can be lost at any time.\n2. Secrets and environment-dependent information should not be written directly into the Dockerfile but should be passed in via environment variables or a `.env` config file.\n3. There is a **linter** for Dockerfiles, so it is useful to use it when collaborating.\n    [https://github.com/hadolint/hadolint](https://github.com/hadolint/hadolint)\n\n## Several options for docker run\n\nWhen using Docker containers, there are some inconveniences.\nSpecifically, Docker does not persist any of the work done within a container by default: once the container is removed, its data is gone.\nThis is because Docker containers use isolated file systems, which also makes it difficult to share data between multiple Docker containers.\n\nTo solve this problem, there are two approaches offered by Docker.\n\n![storage.png](./img/storage.png)\n\n#### Docker volume\n\n- Use the Docker CLI to directly manage a resource called `volume`.\n- Create a specific directory under the Docker area (`/var/lib/docker`) on the host and mount that path to a Docker container.\n\n#### Bind mount\n\n- Mount a specific path on the host to a Docker container.\n\n#### How to use?\n\nBoth are used through the same interface, the `-v` option.  
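\nAs a minimal sketch of managing a named volume yourself (`my_volume` is an arbitrary example name; the `docker info` guard simply skips the commands on machines without a running Docker daemon):\n
\n```shell
# Create, list, and remove a named volume through the Docker CLI.\n
if docker info >/dev/null 2>&1; then\n
  docker volume create my_volume\n
  docker volume ls\n
  docker volume rm my_volume\n
else\n
  echo 'no Docker daemon available; skipping'\n
fi\n
```\n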
\nHowever, when using volumes, you need to manage them yourself with commands like `docker volume create`, `docker volume ls`, `docker volume rm`, etc.\n\n- Docker volume\n\n    ```bash\n    docker run \\\n        -v my_volume:/app \\\n        nginx:latest\n    ```\n\n- Bind mount\n\n    ```bash\n    docker run \\\n        -v /home/user/some/path:/app \\\n        nginx:latest\n    ```\n\nWhen developing locally, bind mount can be convenient, but if you want to maintain a clean environment, using Docker volumes and explicitly performing create and rm operations can be another approach.\n\nThe way storage is provided in Kubernetes ultimately relies on Docker's bind mount as well.\n\n### Docker run with resource limit\n\nBy default, docker containers can **fully utilize the CPU and memory resources of the host OS**. However, depending on the resource situation of the host OS, docker containers may then terminate abnormally due to issues such as **OOM**.\nTo address this problem, docker provides the `-m` [option](https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory) which allows you to **limit the amount of memory** the container can use when running it.\n\n```bash\ndocker run -d -m 512m --memory-reservation=256m --name 512-limit ubuntu sleep 3600\ndocker run -d -m 1g --memory-reservation=256m --name 1g-limit ubuntu sleep 3600\n```\n\nAfter running the containers above, you can check their usage through the `docker stats` command.\n\n```bash\nCONTAINER ID   NAME        CPU %     MEM USAGE / LIMIT   MEM %     NET I/O       BLOCK I/O   PIDS\n4ea1258e2e09   1g-limit    0.00%     300KiB / 1GiB       0.03%     1kB / 0B      0B / 0B     1\n4edf94b9a3e5   512-limit   0.00%     296KiB / 512MiB     0.06%     1.11kB / 0B   0B / 0B     1\n```\n\nIn Kubernetes, when you limit the CPU and memory resources of a pod, it is provided using this technique.\n\n### docker run with restart policy\n\nIf there is a 
need to keep a particular container running continuously, the `--restart=always` option is provided to restart the container immediately whenever it terminates.\n\nRun a container with this option:\n\n```bash\ndocker run --restart=always ubuntu\n```\n\nRun `watch -n1 docker ps` to check that it keeps restarting. (Since `bash` exits immediately when no terminal is attached, Docker restarts the container over and over.)\nIf the restart policy is working, `Restarting (0)` will be shown in STATUS.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED          STATUS                         PORTS     NAMES\na911850276e8   ubuntu    \"bash\"    35 seconds ago   Restarting (0) 6 seconds ago             hungry_vaughan\n```\n\n- [https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart](https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart)\n  - Provides options such as \"on-failure with max retries\" and \"always\"\n\nWhen specifying the restart option for a job resource in Kubernetes, this approach is used.\n\n### Running docker run as a background process\n\nBy default, when running a Docker container, it is executed as a foreground process. This means that the terminal that launched the container is automatically attached to it, preventing you from running other commands.\n\nLet's try an example. Open two terminals; in one terminal, continuously monitor `docker ps`, while in the other terminal, execute the following commands one by one and observe the behavior.\n\n#### First Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\nThe terminal is blocked for 10 seconds and you cannot run any other commands in it. After 10 seconds, you can check with `docker ps` that the container has terminated.\n\n#### Second Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\nAfter that, press `ctrl + p` -> `ctrl + q`.\n\nNow you can perform other commands in that terminal, and you can also see that the container is still alive for up to 10 seconds with `docker ps`. 
This situation, where you have exited the container while it keeps running, is called \"detached\". Docker provides an option to run containers in detached mode from the start, which allows you to run the container in the background while executing the `run` command.\n\n#### Third Practice\n\n```bash\ndocker run -d ubuntu sleep 10\n```\n\nIn detached mode, you can perform other actions in the terminal that executed the command.\n\nIt is good to use detached mode appropriately according to the situation.  \nFor example, when developing a backend API server that communicates with a DB, the backend API server needs to be constantly checked with hot reloading while changing the source code, but the DB does not need to be monitored, so it can be executed as follows.  \nRun the DB container in detached mode, and run the backend API server in attached mode to follow the logs.\n\n\n## References\n\n- [https://towardsdatascience.com/docker-storage-598e385f4efe](https://towardsdatascience.com/docker-storage-598e385f4efe)\n- [https://vsupalov.com/docker-latest-tag/](https://vsupalov.com/docker-latest-tag/)\n- [https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version](https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version)\n- [https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/](https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/command.md",
    "content": "---\ntitle : \"[Practice] Docker command\"\ndescription: \"Practice to use docker command.\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 1. Normal installation confirmation\n\n```bash\ndocker run hello-world\n```\n\nIf installed correctly, you should be able to see the following message.\n\n```bash\nHello from Docker!\nThis message shows that your installation appears to be working correctly.\n....\n```\n\n\n**(For ubuntu)** If you want to use without sudo, please refer to the following site.\n\n- [https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)\n\n## 2. Docker Pull\n\nDocker pull is a command to download Docker images from a Docker image registry (a repository where Docker images are stored and shared).\n\nYou can check the arguments available in docker pull using the command below.\n\n```bash\ndocker pull --help\n```\n\nIf performed normally, it prints out as follows.\n\n```bash\nUsage:  docker pull [OPTIONS] NAME[:TAG|@DIGEST]\n\nPull an image or a repository from a registry\n\nOptions:\n  -a, --all-tags                Download all tagged images in the repository\n      --disable-content-trust   Skip image verification (default true)\n      --platform string         Set platform if server is multi-platform capable\n  -q, --quiet                   Suppress verbose output\n```\n\nIt can be seen here that docker pull takes two types of arguments. \n\n1. `[OPTIONS]`\n2. `NAME[:TAG|@DIGEST]`\n\nIn order to use the `-a` and `-q` options from help, they must be used before the NAME. 
\nLet's try and pull the `ubuntu:18.04` image directly.\n\n```bash\ndocker pull ubuntu:18.04\n```\n\nIf interpreted correctly, the command means to pull an image with the tag `18.04` from an image named `ubuntu`.\n\nIf performed successfully, it will produce an output similar to the following.\n\n```bash\n18.04: Pulling from library/ubuntu\n20d796c36622: Pull complete \nDigest: sha256:42cd9143b6060261187a72716906187294b8b66653b50d70bc7a90ccade5c984\nStatus: Downloaded newer image for ubuntu:18.04\ndocker.io/library/ubuntu:18.04\n```\n\nIf you perform the above command, you will download the image called 'ubuntu:18.04' from a registry named [docker.io/library](http://docker.io/library/) to your laptop.\n\n- Note that \n  - in the future, if you need to get a docker image from a certain **private** registry instead of docker.io or public docker hub, you can use [`docker login`](https://docs.docker.com/engine/reference/commandline/login/) to point to the certain registry, then use `docker pull`. Alternatively, you can set up an [insecure registry](https://stackoverflow.com/questions/42211380/add-insecure-registry-to-docker). \n  - Also note that [`docker save`](https://docs.docker.com/engine/reference/commandline/save/) and [`docker load`](https://docs.docker.com/engine/reference/commandline/load/) commands are available to store and share docker images in the form of `.tar` file in an intranet.\n\n\n## 3. 
Docker images\n\nThis is the command to list the Docker images that exist locally.\n\n```bash\ndocker images --help\n```\n\nThe arguments available for use in docker images are as follows.\n\n```bash\nUsage:  docker images [OPTIONS] [REPOSITORY[:TAG]]\n\nList images\n\nOptions:\n  -a, --all             Show all images (default hides intermediate images)\n      --digests         Show digests\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print images using a Go template\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only show image IDs\n```\n\nLet's try executing the command below directly.\n\n```bash\ndocker images\n```\n\nIf you install Docker and proceed with this practice, it will output something similar to this.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED      SIZE\nubuntu       18.04     29e70752d7b2   2 days ago   56.7MB\n```\n\nIf you use the `-q` argument among the possible arguments, only the `IMAGE ID` will be printed.\n\n```bash\ndocker images -q\n```\n\n```bash\n29e70752d7b2\n```\n\n## 4. 
Docker ps\n\nCommand to output the list of currently running Docker containers.\n\n```bash\ndocker ps --help\n```\n\nThe following arguments can be used with `docker ps`:\n\n```bash\nUsage:  docker ps [OPTIONS]\n\nList containers\n\nOptions:\n  -a, --all             Show all containers (default shows just running)\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print containers using a Go template\n  -n, --last int        Show n last created containers (includes all states) (default -1)\n  -l, --latest          Show the latest created container (includes all states)\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only display container IDs\n  -s, --size            Display total file sizes\n```\n\nLet's try running the command below directly.\n\n```bash\ndocker ps\n```\n\nIf there are no currently running containers, it will be as follows.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\nIf there is a container running, it will look similar to this.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND        CREATED          STATUS          PORTS     NAMES\nc1e8f5e89d8d   ubuntu    \"sleep 3600\"   13 seconds ago   Up 12 seconds             trusting_newton\n```\n\n## 5. Docker run\n\nCommand to run a Docker container.\n\n```bash\ndocker run --help\n```\n\nThe usage of `docker run` is as follows.\n\n```bash\nUsage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\n\nRun a command in a new container\n```\n\nWhat we need to confirm here is that the `docker run` command takes three types of arguments. \n\n1. `[OPTIONS]`\n2. `[COMMAND]`\n3. 
`[ARG...]`\n\nLet's try running a docker container ourselves.\n\n```bash\n## Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\ndocker run -it --name demo1 ubuntu:18.04 /bin/bash\n```\n\n- `-it`: Combination of the `-i` and `-t` options\n  - Runs the container and connects it to an interactive terminal\n- `--name`: Assigns a name to the container for easier identification instead of using the container ID\n- `/bin/bash`: Specifies the command to be executed in the container upon startup, where `/bin/bash` opens a bash shell.\n\nAfter running the command, you can exit the container by using the `exit` command.\n\nWhen you enter the previously learned `docker ps` command, the following output will be displayed.\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\nWe said earlier that `docker ps` lists running containers, yet the container we just ran does not appear. The reason is that `docker ps` only shows currently running containers by default. If you want to see stopped containers too, you must pass the `-a` option.\n```bash\ndocker ps -a\n```\n\nThen the list of terminated containers will also be displayed.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND       CREATED         STATUS                     PORTS     NAMES\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"   2 minutes ago   Exited (0) 2 minutes ago             demo1\n```\n\n## 6. 
Docker exec\n\nDocker exec is a command used to run commands inside a running Docker container or to access its shell.\n\n```bash\ndocker exec --help\n```\nFor example, let's try running the following command.\n\n```bash\ndocker run -d --name demo2 ubuntu:18.04 sleep 3600\n```\n\nHere, the `-d` option runs the Docker container in the background, so it continues to run even after the terminal disconnects from it.\n\nUse `docker ps` to check if it is currently running.\n\nIt can be confirmed that it is running as follows.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED         STATUS         PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   4 seconds ago   Up 3 seconds             demo2\n```\n\nNow let's connect to the running docker container through the `docker exec` command.\n\n```bash\ndocker exec -it demo2 /bin/bash\n```\n\nAs with the previous `docker run -it` command, this gives you access to the inside of the container.\n\nYou can exit using `exit`.\n## 7. Docker logs\n\n```bash\ndocker logs --help\n```\n\nLet's run the following container.\n\n```bash\ndocker run --name demo3 -d busybox sh -c \"while true; do $(echo date); sleep 1; done\"\n```\n\nWith the above command, we have run a busybox container named \"demo3\" in the background, printing the current time once every second.\n\nNow let's check the log with the command below.\n\n```bash\ndocker logs demo3\n```\n\nIf performed normally, it will be similar to below.\n\n```bash\nSun Mar  6 11:06:49 UTC 2022\nSun Mar  6 11:06:50 UTC 2022\nSun Mar  6 11:06:51 UTC 2022\nSun Mar  6 11:06:52 UTC 2022\nSun Mar  6 11:06:53 UTC 2022\nSun Mar  6 11:06:54 UTC 2022\n```\nHowever, this only shows the logs collected so far.  \nIn this case, you can use the `-f` option to keep following the output.\n\n```bash\ndocker logs demo3 -f    \n```\n\n## 8. 
Docker stop\n\nCommand to stop a running Docker container.\n\n```bash\ndocker stop --help\n```\n\nThrough `docker ps`, you can check the containers currently running, as follows.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED              STATUS              PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   About a minute ago   Up About a minute             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             4 minutes ago        Up 4 minutes                  demo2\n```\nNow let's stop a container with `docker stop`.\n\n```bash\ndocker stop demo2\n```\n\nAfter executing, type `docker ps` again.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS     NAMES\n730391669c39   busybox   \"sh -c 'while true; …\"   2 minutes ago   Up 2 minutes             demo3\n```\n\nComparing with the earlier result, you can see that the demo2 container has disappeared from the list of currently running containers.\nLet's stop the remaining container as well.\n\n```bash\ndocker stop demo3\n```\n\n## 9. Docker rm\n\nCommand to delete a Docker container.\n\n```bash\ndocker rm --help\n```\n\nStopped Docker containers are not deleted by default. That's why you can see stopped containers using `docker ps -a`.\nBut why should we delete stopped containers?  \nEven when stopped, the data used by the container remains on disk.\nSo you can restart a stopped container later. 
But keeping unused containers continues to consume disk space.\nSo, to delete containers that are not used at all, we should use the `docker rm` command.\n\nFirst, let's check the current containers.\n\n```bash\ndocker ps -a\n```\n\nThere are three containers as follows.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED          STATUS                            PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   4 minutes ago    Exited (137) About a minute ago             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             7 minutes ago    Exited (137) 2 minutes ago                  demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"              10 minutes ago   Exited (0) 10 minutes ago                   demo1\n```\n\nLet's delete the `demo3` container with the following command.\n\n```bash\ndocker rm demo3\n```\n\nRunning `docker ps -a` again now shows only two containers.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED          STATUS                       PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   13 minutes ago   Exited (137) 8 minutes ago             demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"    16 minutes ago   Exited (0) 16 minutes ago              demo1\n```\n\nDelete the remaining containers as well.\n\n```bash\ndocker rm demo2\ndocker rm demo1\n```\n\n## 10. 
Docker rmi\n\nCommand to delete a Docker image.\n\n```bash\ndocker rmi --help\n```\n\nUse the following command to check which images are currently stored locally.\n\n```bash\ndocker images\n```\n\nThe following is printed.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nbusybox      latest    a8440bba1bc0   32 hours ago   1.41MB\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\nLet's delete the `busybox` image.\n\n```bash\ndocker rmi busybox\n```\n\nIf you type `docker images` again, the following will appear.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\n## References\n\n- [https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/docker.md",
    "content": "---\ntitle : \"What is Docker?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n\n## Container\n\n- Containerization:\n  - A technology that allows applications to be executed uniformly anywhere.\n- Container Image:\n  - A collection of all the files required to run an application.\n  - → Similar to a mold for making fish-shaped bread (Bungeoppang).\n- Container:\n  - A single process that is executed based on a container image.\n  - → A fish-shaped bread (Bungeoppang) produced using a mold.\n\n## Docker\n\nDocker is a platform that allows you to manage and use containers.  \nIts slogan is \"Build Once, Run Anywhere,\" guaranteeing the same execution results anywhere.\n\nIn the Docker, the resources for the container are separated and the lifecycle is controlled by Linux kernel's cgroups, etc.  \nHowever, it is too difficult to use these interfaces directly, so an abstraction layer is created.\n\n![docker-layer.png](./img/docker-layer.png)\n\nThrough this, users can easily control containers with just the user-friendly API **Docker CLI**.\n- Users can easily control containers using the user-friendly API called **Docker CLI**.\n\n## Interpretation of Layer\n\nThe roles of the layers mentioned above are as follows:\n\n1. runC: Utilizes the functionality of the Linux kernel to isolate namespaces, CPUs, memory, filesystems, etc., for a container, which is a single process.\n2. containerd: Acts as an abstraction layer to communicate with runC (OCI layer) and uses the standardized interface (OCI).\n3. dockerd: Solely responsible for issuing commands to containerd.\n4. 
Docker CLI: Users only need to issue commands to dockerd (the Docker daemon) using the Docker CLI.\n   - During this communication, a Unix socket is used, so Docker-related error messages such as \"the /var/run/docker.sock is in use\" or \"insufficient permissions\" sometimes occur.\n\nAlthough Docker encompasses many layers, the term \"Docker\" can refer to the Docker CLI, dockerd (the Docker daemon), or even a single Docker container, which can lead to confusion.  \nIn the upcoming text, the term \"Docker\" may be used in various contexts.\n\n## For ML Engineer\n\nML engineers use Docker for the following reasons:\n\n1. ML training/inference code needs to be independent of the underlying operating system, Python version, Python environment, and specific versions of Python packages.\n2. Therefore, the goal is to bundle not only the code but also all the dependent packages, environment variables, folder names, etc., into a single package. Containerization technology enables this.\n3. Docker is one of the software tools that makes it easy to use and manage this technology, and the packaged units are referred to as Docker images.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/images.md",
    "content": "---\ntitle : \"[Practice] Docker images\"\ndescription: \"Practice to use docker image.\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n- `docker commit`\n  - running container 를 docker image 로 만드는 방법\n  - `docker commit -m \"message\" -a \"author\" <container-id> <image-name>`\n  - `docker commit` 을 사용하면, 수동으로 Dockerfile 을 만들지 않고도 도커 이미지를 만들 수 있습니다.\n    ```\n    touch Dockerfile\n    ```\n\n3. Move to the docker-practice folder.\n\n4. Create an empty file called Dockerfile.\n\n1. 이미지에 특정 패키지를 설치하는 명령어는 무엇입니까?\n\nAnswer: `RUN`\n\nTranslation: Let's look at the basic commands that can be used in Dockerfile one by one. FROM is a command that specifies which image to use as a base image for Dockerfile. When creating a Docker image, instead of creating the environment I intend from scratch, I can use a pre-made image such as `python:3.9`, `python-3.9-alpine`, etc. as the base and install pytorch and add my source code.\n```docker\nFROM <image>[:<tag>] [AS <name>]\n\n# 예시\nFROM ubuntu\nFROM ubuntu:18.04\nFROM nginx:latest AS ngx\n```\n\nThe command to copy files or directories from the `<src>` path on the host (local) to the `<dest>` path inside the container.\n```docker\nCOPY <src>... <dest>\n\n# 예시\nCOPY a.txt /some-directory/b.txt\nCOPY my-directory /some-directory-2\n```\n\nADD is similar to COPY but it has additional features.\n```docker\n# 1 - 호스트에 압축되어있는 파일을 풀면서 컨테이너 내부로 copy 할 수 있음\nADD scripts.tar.gz /tmp\n# 2 - Remote URLs 에 있는 파일을 소스 경로로 지정할 수 있음\nADD http://www.example.com/script.sh /tmp\n\n# 위 두 가지 기능을 사용하고 싶을 경우에만 COPY 대신 ADD 를 사용하는 것을 권장\n```\n\nThe command to run the specified command inside a Docker container. 
\nDocker images preserve the state in which each command has been executed.\n```docker\nRUN <command>\nRUN [\"executable-command\", \"parameter1\", \"parameter2\"]\n\n# examples\nRUN pip install torch\nRUN pip install -r requirements.txt\n```\n\nCMD specifies a command that the Docker container will **run when it starts**. There is a similar command called **ENTRYPOINT**. The difference between them will be discussed **later**. Note that only one **CMD** can be run in one Docker image, which is different from the **RUN** command.\n```docker\nCMD <command>\nCMD [\"executable-command\", \"parameter1\", \"parameter2\"]\nCMD [\"parameter1\", \"parameter2\"] # when used together with ENTRYPOINT\n\n# example\nCMD python main.py\n```\n\nWORKDIR is a command that specifies the directory inside the container where subsequent commands will run. If the directory does not exist, it will be created.\n```docker\nWORKDIR /path/to/workdir\n\n# example\nWORKDIR /home/demo\nRUN pwd # prints /home/demo\n```\n\nENV is a command to set the value of environment variables that will be used persistently inside the container.\n```docker\nENV <KEY> <VALUE>\nENV <KEY>=<VALUE>\n\n# example\n# set the default locale\nRUN locale-gen ko_KR.UTF-8\nENV LANG ko_KR.UTF-8\nENV LANGUAGE ko_KR.UTF-8\nENV LC_ALL ko_KR.UTF-8\n```\n\nEXPOSE lets you specify the port/protocol to be opened from the container. 
If `<protocol>` is not specified, TCP is set as the default.\n```docker\nEXPOSE <port>\nEXPOSE <port>/<protocol>\n\n# example\nEXPOSE 8080\n```\nOpen the Dockerfile with `vim Dockerfile` or an editor like vscode and write the following:\n```docker\n# set the base image to ubuntu 18.04\nFROM ubuntu:18.04\n\n# run the apt-get update command\nRUN apt-get update\n\n# set the value of the TEST env var to hello\nENV TEST hello\n\n# print the value of the TEST env var when the container starts\nCMD echo $TEST\n```\n\nUse the `docker build` command to create a Docker image from a Dockerfile.\n```bash\ndocker build --help\n```\n\nRun the following command from the path where the Dockerfile is located.\n```bash\ndocker build -t my-image:v1.0.0 .\n```\n\nThe command above means to build an image with the name \"my-image\" and the tag \"v1.0.0\" from the Dockerfile in the current path. Let's check if the image was built successfully.\n```bash\n# grep: checks whether my-image is present in the output\ndocker images | grep my-image\n```\nIf performed normally, it will output as follows.\n```bash\nmy-image     v1.0.0    143114710b2d   3 seconds ago   87.9MB\n```\n\nLet's now **run** a docker container with the `my-image:v1.0.0` image that we just built.\n```bash\ndocker run my-image:v1.0.0\n```\n\nIf performed normally, it will result in the following.\n```bash\nhello\n```\n\nLet's run a docker container, changing the value of the `TEST` env var at run time for the `my-image:v1.0.0` image we just built.\n```bash\ndocker run -e TEST=bye my-image:v1.0.0\n```\nIf performed normally, it will be as follows.\n```bash\nbye\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/install.md",
    "content": "---\ntitle : \"Install Docker\"\ndescription: \"Install docker to start.\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Docker\n\nTo practice Docker, you need to install Docker.  \nThe Docker installation varies depending on which OS you are using.  \nPlease refer to the official website for the Docker installation that fits your environment: \n\n- [ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [mac](https://docs.docker.com/desktop/mac/install/)\n- [windows](https://docs.docker.com/desktop/windows/install/)\n\n## Check Installation\n\nCheck installation requires an OS, terminal environment where `docker run hello-world` runs correctly.\n\n| OS      | Docker Engine  | Terminal           |\n| ------- | -------------- | ------------------ |\n| MacOS   | Docker Desktop | zsh                |\n| Windows | Docker Desktop | Powershell         |\n| Windows | Docker Desktop | WSL2               |\n| Ubuntu  | Docker Engine  | bash               |\n\n## Before diving in..\n\nIt is possible that many metaphors and examples will be focused towards MLOps as they explain the necessary Docker usage to use MLOps.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/prerequisites/docker/introduction.md",
    "content": "---\ntitle : \"Why Docker & Kubernetes ?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Why Kubernetes ?\n\nTo operationalize machine learning models, additional functionalities beyond model development are required.\n\n1. Training Phase\n   - Schedule management for model training commands\n   - Ensuring reproducibility of trained models\n2. Deployment Phase\n   - Traffic distribution\n   - Monitoring service failures\n   - Troubleshooting in case of failures\n\nFortunately, the software development field has already put a lot of thought and effort into addressing these needs. Therefore, when deploying machine learning models, leveraging the outcomes of these considerations can be highly beneficial. Docker and Kubernetes are two prominent software products widely used in MLOps to address these needs.\n\n## Docker & Kubernetes\n\n### Not a software but  a product\n\nDocker and Kubernetes are representative software (products) that provide containerization and container orchestration functions respectively.\n\n#### Docker\n\nDocker was the mainstream in the past, but its usage has been decreasing gradually with the addition of various paid policy.  
\nHowever, as of March 2022, it is still the most commonly used container virtualization software.\n\n![sysdig-2019.png](./img/sysdig-2019.png)\n\n<center> [from sysdig 2019] </center>\n\n![sysdig-2021.png](./img/sysdig-2021.png)\n\n<center> [from sysdig 2021]  </center>\n\n#### Kubernetes\n\nKubernetes is a product that has had almost no real competition so far.\n\n![cncf-survey.png](./img/cncf-survey.png)\n\n<center> [from cncf survey] </center>\n\n![t4-ai.png](./img/t4-ai.png)\n\n<center> [from t4.ai]  </center>\n\n### History of Open source\n\n#### Initial Docker & Kubernetes\n\nAt the beginning of Docker development, **one package** called Docker Engine contained multiple features such as API, CLI, networking, storage, etc., but it began to be **divided one by one** according to the philosophy of **MSA**.  \nHowever, the initial Kubernetes included Docker Engine for container virtualization.  \nTherefore, whenever the Docker version was updated, the interface of Docker Engine changed and Kubernetes was greatly affected.\n\n#### Open Container Initiative\n\nIn order to alleviate such inconveniences, many groups interested in container technology such as Google have come together to start the Open Container Initiative (OCI) project to set standards for containers.  \nDocker further separated its interface and developed Containerd, a Container Runtime that adheres to the OCI standard, and added an abstraction layer so that dockerd calls the API of Containerd.\n\nFollowing this trend, starting from version 1.5 Kubernetes supports not only Docker but any Container Runtime that adheres to the OCI standard and implements the Container Runtime Interface (CRI) specification. 
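A quick way to see which CRI runtime a cluster actually uses is the `CONTAINER-RUNTIME` column of `kubectl get nodes -o wide`. The sketch below parses a hypothetical, simplified sample of that output (the node name, versions, and runtime string are made up for illustration); on a live cluster you would pipe the real command instead.

```shell
# On a live cluster you would run:  kubectl get nodes -o wide
# Here we parse a hypothetical, simplified sample of that output.
sample='NAME     STATUS   ROLES    AGE   VERSION   CONTAINER-RUNTIME
node-1   Ready    master   10d   v1.21.7   containerd://1.4.12'

# The runtime is the last column of the data row.
runtime=$(echo "$sample" | awk 'NR == 2 { print $NF }')
echo "$runtime"   # containerd://1.4.12
```

Whatever this column reports (`docker://…`, `containerd://…`, `cri-o://…`), OCI-compliant images run the same way on top of it.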
\n\n#### CRI-O\n\nCRI-O is a container runtime developed by Red Hat, Intel, SUSE, and IBM, which adheres to the OCI standard + CRI specifications, specifically for Kubernetes.\n\n#### Current docker & kubernetes\n\nKubernetes had long used Docker Engine as its default container runtime, but since Docker's API follows OCI rather than the CRI specification, Kubernetes developed and maintained **dockershim** to make Docker's API compatible with CRI (*a burden carried by Kubernetes, not by Docker*). dockershim was **deprecated in Kubernetes v1.20 and removed in v1.23**.\n\n- v1.23 was released in December 2021\n\nSo from Kubernetes v1.23, you can no longer use Docker Engine directly as the container runtime. \nHowever, **users are not much affected by this change**, because Docker images created through Docker Engine comply with the OCI standard and can therefore be used regardless of which container runtime Kubernetes is built on.\n\n### References\n\n- [*https://www.linkedin.com/pulse/containerd는-무엇이고-왜-중요할까-sean-lee/?originalSubdomain=kr*](https://www.linkedin.com/pulse/containerd%EB%8A%94-%EB%AC%B4%EC%97%87%EC%9D%B4%EA%B3%A0-%EC%99%9C-%EC%A4%91%EC%9A%94%ED%95%A0%EA%B9%8C-sean-lee/?originalSubdomain=kr)\n- [https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/](https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/)\n- [https://kubernetes.io/blog/2020/12/02/dockershim-faq/](https://kubernetes.io/blog/2020/12/02/dockershim-faq/)\n- [https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n- [https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-components/_category_.json",
    "content": "{\n  \"label\": \"Setup Components\",\n  \"position\": 3,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-components/install-components-kf.md",
    "content": "---\ntitle : \"1. Kubeflow\"\ndescription: \"구성요소 설치 - Kubeflow\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\n## Prepare the installation file\n\nPrepare the installation files for installing Kubeflow **v1.4.0**\n\nClone the [kubeflow/manifests Repository](https://github.com/kubeflow/manifests) with the **v1.4.0** tag, and move to the corresponding folder.\n\n```bash\ngit clone -b v1.4.0 https://github.com/kubeflow/manifests.git\ncd manifests\n```\n\n## Install each components\n\nThe kubeflow/manifests repository provides installation commands for each component, but it often lacks information on potential issues that may arise during installation or how to verify if the installation was successful. This can make it challenging for first-time users.  \nTherefore, in this document, we will provide instructions on how to verify the successful installation of each component.\n\nPlease note that this document will not cover the installation of components that are not covered in *MLOps for ALL*, such as Knative, KFServing, and MPI Operator, as we prioritize efficient resource usage.\n\n### Cert-manager\n\n1. 
Install cert-manager.\n\n  ```bash\n  kustomize build common/cert-manager/cert-manager/base | kubectl apply -f -\n  ```\n\n  If the installation is successful, you should see output similar to the following:\n\n  ```bash\n  namespace/cert-manager created\n  customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created\n  serviceaccount/cert-manager created\n  serviceaccount/cert-manager-cainjector created\n  serviceaccount/cert-manager-webhook created\n  role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  role.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-edit created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-view created\n  
clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  service/cert-manager created\n  service/cert-manager-webhook created\n  deployment.apps/cert-manager created\n  deployment.apps/cert-manager-cainjector created\n  deployment.apps/cert-manager-webhook created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  ```\n\n  Wait for all 3 pods in the cert-manager namespace to become Running:\n\n  ```bash\n  kubectl get pod -n cert-manager\n  ```\n\n  Once all the pods are Running, you should see output similar to the following:\n\n  ```bash\n  NAME                                       READY   STATUS    RESTARTS   AGE\n  cert-manager-7dd5854bb4-7nmpd              1/1     Running   0          2m10s\n  cert-manager-cainjector-64c949654c-2scxr   
1/1     Running   0          2m10s\n  cert-manager-webhook-6b57b9b886-7q6g2      1/1     Running   0          2m10s\n  ```\n\n2. To install `kubeflow-issuer`, run the following command:\n\n  ```bash\n  kustomize build common/cert-manager/kubeflow-issuer/base | kubectl apply -f -\n  ```\n\n  If the installation is successful, you should see the following output:\n\n  ```bash\n  clusterissuer.cert-manager.io/kubeflow-self-signing-issuer created\n  ```\n\n  Note: If the `cert-manager-webhook` deployment is not in the Running state, you may encounter an error similar to the one below, and the `kubeflow-issuer` may not be installed. In this case, please ensure that all 3 pods of cert-manager are Running before retrying the command.\n\n  ```bash\n  Error from server: error when retrieving current configuration of:\n  Resource: \"cert-manager.io/v1alpha2, Resource=clusterissuers\", GroupVersionKind: \"cert-manager.io/v1alpha2, Kind=ClusterIssuer\"\n  Name: \"kubeflow-self-signing-issuer\", Namespace: \"\"\n  from server for: \"STDIN\": conversion webhook for cert-manager.io/v1, Kind=ClusterIssuer failed: Post \"https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s\": dial tcp 10.101.177.157:443: connect: connection refused\n  ```\n\n### Istio\n\n1. 
Install the Custom Resource Definitions (CRDs) for istio.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-crds/base | kubectl apply -f -\n  ```\n\n  If run properly, you should see the following output:\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/authorizationpolicies.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/istiooperators.install.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/peerauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/requestauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/sidecars.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadgroups.networking.istio.io created\n  ```\n\n2. Install the istio namespace.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-namespace/base | kubectl apply -f -\n  ```\n\n  If run properly, you should see the following output:\n\n  ```bash\n  namespace/istio-system created\n  ```\n\n3. 
Install istio.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-install/base | kubectl apply -f -\n  ```\n\n  If run properly, you should see the following output:\n\n  ```bash\n  serviceaccount/istio-ingressgateway-service-account created\n  serviceaccount/istio-reader-service-account created\n  serviceaccount/istiod-service-account created\n  role.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  role.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istiod-istio-system created\n  rolebinding.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  rolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  configmap/istio created\n  configmap/istio-sidecar-injector created\n  service/istio-ingressgateway created\n  service/istiod created\n  deployment.apps/istio-ingressgateway created\n  deployment.apps/istiod created\n  envoyfilter.networking.istio.io/metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/stats-filter-1.8 created\n  envoyfilter.networking.istio.io/stats-filter-1.9 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.8 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.9 created\n  envoyfilter.networking.istio.io/x-forwarded-host created\n  gateway.networking.istio.io/istio-ingressgateway created\n  authorizationpolicy.security.istio.io/global-deny-all created\n  authorizationpolicy.security.istio.io/istio-ingressgateway created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/istio-sidecar-injector 
created\n  validatingwebhookconfiguration.admissionregistration.k8s.io/istiod-istio-system created\n  ```\n\n  Wait for both pods in the istio-system namespace to become Running:\n\n  ```bash\n  kubectl get po -n istio-system\n  ```\n\n  Once all the pods are Running, you should see output similar to the following:\n\n  ```bash\n  NAME                                   READY   STATUS    RESTARTS   AGE\n  istio-ingressgateway-79b665c95-xm22l   1/1     Running   0          16s\n  istiod-86457659bb-5h58w                1/1     Running   0          16s\n  ```\n\n### Dex\n\nNow, let's install dex.\n\n```bash\nkustomize build common/dex/overlays/istio | kubectl apply -f -\n```\n\nIf performed normally, it will be printed as follows:\n\n```bash\nnamespace/auth created\ncustomresourcedefinition.apiextensions.k8s.io/authcodes.dex.coreos.com created\nserviceaccount/dex created\nclusterrole.rbac.authorization.k8s.io/dex created\nclusterrolebinding.rbac.authorization.k8s.io/dex created\nconfigmap/dex created\nsecret/dex-oidc-client created\nservice/dex created\ndeployment.apps/dex created\nvirtualservice.networking.istio.io/dex created\n```\n\nWait until the single pod in the auth namespace is Running.\n```bash\nkubectl get po -n auth\n```\n\nWhen the pod is Running, similar output will be printed.\n```bash\nNAME                   READY   STATUS    RESTARTS   AGE\ndex-5ddf47d88d-458cs   1/1     Running   1          12s\n```\n\nInstall OIDC AuthService.\n```bash\nkustomize build common/oidc-authservice/base | kubectl apply -f -\n```\n\nIf performed normally, it will be printed as follows.\n```bash\nconfigmap/oidc-authservice-parameters created\nsecret/oidc-authservice-client created\nservice/authservice created\npersistentvolumeclaim/authservice-pvc created\nstatefulset.apps/authservice created\nenvoyfilter.networking.istio.io/authn-filter created\n```\n\nWait until the authservice-0 pod in the istio-system namespace is Running.\n```bash\nkubectl get po -n istio-system 
-w\n```\n\nWhen all pods are Running, similar output will be printed.\n```bash\nNAME                                   READY   STATUS    RESTARTS   AGE\nauthservice-0                          1/1     Running   0          14s\nistio-ingressgateway-79b665c95-xm22l   1/1     Running   0          2m37s\nistiod-86457659bb-5h58w                1/1     Running   0          2m37s\n```\n\nCreate the Kubeflow namespace.\n```bash\nkustomize build common/kubeflow-namespace/base | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\nnamespace/kubeflow created\n```\n\nRetrieve the Kubeflow namespace.\n```bash\nkubectl get ns kubeflow\n```\n\nIf created normally, similar output will appear.\n```bash\nNAME       STATUS   AGE\nkubeflow   Active   8s\n```\n\nInstall kubeflow-roles.\n```bash\nkustomize build common/kubeflow-roles/base | kubectl apply -f -\n```\n\nIf properly performed, it will output as follows.\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-view created\nclusterrole.rbac.authorization.k8s.io/kubeflow-view created\n```\n\nRetrieve the kubeflow roles just created.\n```bash\nkubectl get clusterrole | grep kubeflow\n```\n\nThe following 6 clusterroles will be output.\n```bash\nkubeflow-admin                                                         2021-12-03T08:51:36Z\nkubeflow-edit                                                          2021-12-03T08:51:36Z\nkubeflow-kubernetes-admin                                              2021-12-03T08:51:36Z\nkubeflow-kubernetes-edit                                               2021-12-03T08:51:36Z\nkubeflow-kubernetes-view                                               2021-12-03T08:51:36Z\nkubeflow-view    
                                                       2021-12-03T08:51:36Z\n```\n\nInstall Kubeflow Istio Resources.\n```bash\nkustomize build common/istio-1-9/kubeflow-istio-resources/base | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-view created\ngateway.networking.istio.io/kubeflow-gateway created\n```\n\nRetrieve the Kubeflow roles just created.\n```bash\nkubectl get clusterrole | grep kubeflow-istio\n```\nThe following three clusterroles are output.\n```bash\nkubeflow-istio-admin                                                   2021-12-03T08:53:17Z\nkubeflow-istio-edit                                                    2021-12-03T08:53:17Z\nkubeflow-istio-view                                                    2021-12-03T08:53:17Z\n```\n\nCheck if the gateway is properly installed in the Kubeflow namespace.\n```bash\nkubectl get gateway -n kubeflow\n```\n\nIf created normally, a result similar to the following will be output.\n```bash\nNAME               AGE\nkubeflow-gateway   31s\n```\n\nInstall Kubeflow Pipelines.\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\nIf performed normally, it will be output as follows.\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/clusterworkflowtemplates.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/cronworkflows.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/workfloweventbindings.argoproj.io created\n...(omitted)\nauthorizationpolicy.security.istio.io/ml-pipeline-visualizationserver created\nauthorizationpolicy.security.istio.io/mysql created\nauthorizationpolicy.security.istio.io/service-cache-server created\n```\n\nThis command installs multiple resources at once, but there are 
resources with dependencies on the installation order. Therefore, depending on timing, a similar error may occur.\n```bash\nerror: unable to recognize \"STDIN\": no matches for kind \"CompositeController\" in version \"metacontroller.k8s.io/v1alpha1\"\n```\n\nIf a similar error occurs, wait about 10 seconds and then try the command above again.\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\n\nCheck to see if it has been installed correctly.\n```bash\nkubectl get po -n kubeflow\n```\n\nWait until all 16 pods are Running as follows.\n```bash\nNAME                                                     READY   STATUS    RESTARTS   AGE\ncache-deployer-deployment-79fdf9c5c9-bjnbg               2/2     Running   1          5m3s\ncache-server-5bdf4f4457-48gbp                            2/2     Running   0          5m3s\nkubeflow-pipelines-profile-controller-7b947f4748-8d26b   1/1     Running   0          5m3s\nmetacontroller-0                                         1/1     Running   0          5m3s\nmetadata-envoy-deployment-5b4856dd5-xtlkd                1/1     Running   0          5m3s\nmetadata-grpc-deployment-6b5685488-kwvv7                 2/2     Running   3          5m3s\nmetadata-writer-548bd879bb-zjkcn                         2/2     Running   1          5m3s\nminio-5b65df66c9-k5gzg                                   2/2     Running   0          5m3s\nml-pipeline-8c4b99589-85jw6                              2/2     Running   1          5m3s\nml-pipeline-persistenceagent-d6bdc77bd-ssxrv             2/2     Running   0          5m3s\nml-pipeline-scheduledworkflow-5db54d75c5-zk2cw           2/2     Running   0          5m2s\nml-pipeline-ui-5bd8d6dc84-j7wqr                          2/2     Running   0          5m2s\nml-pipeline-viewer-crd-68fb5f4d58-mbcbg                  2/2     Running   1          5m2s\nml-pipeline-visualizationserver-8476b5c645-wljfm         2/2     Running   0          
5m2s\nmysql-f7b9b7dd4-xfnw4                                    2/2     Running   0          5m2s\nworkflow-controller-5cbbb49bd8-5zrwx                     2/2     Running   1          5m2s\n```\n\nAdditionally, please check if the ml-pipeline UI is connected properly.\n```bash\nkubectl port-forward svc/ml-pipeline-ui -n kubeflow 8888:80\n```\n\nOpen the web browser and connect to the path [http://localhost:8888/#/pipelines/](http://localhost:8888/#/pipelines/). Confirm that the following screen is displayed.\n\nIf you get a \"connection refused\" error on localhost and there are no security concerns, you can bind the port-forward to all addresses so that it can be reached from outside the instance. To check whether the ml-pipeline UI connects normally, bind to all addresses with 0.0.0.0.\n```bash\nkubectl port-forward --address 0.0.0.0 svc/ml-pipeline-ui -n kubeflow 8888:80\n```\nIf the connection is still refused despite the option above, allow access to port 8888 (or to the required TCP ports) in your firewall settings.\n\nWhen you open the web browser and access the path `http://<your virtual instance public IP>:8888/#/pipelines/`, you can see the ml-pipeline UI screen.\n\nFor the other port-forwarded UIs below, run the commands in the same way and open the corresponding port number in the firewall.\n\nNext, we will install Katib.\n```bash\nkustomize build apps/katib/upstream/installs/katib-with-kubeflow | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/experiments.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/suggestions.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/trials.kubeflow.org created\nserviceaccount/katib-controller created\nserviceaccount/katib-ui created\nclusterrole.rbac.authorization.k8s.io/katib-controller 
created\nclusterrole.rbac.authorization.k8s.io/katib-ui created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-view created\nclusterrolebinding.rbac.authorization.k8s.io/katib-controller created\nclusterrolebinding.rbac.authorization.k8s.io/katib-ui created\nconfigmap/katib-config created\nconfigmap/trial-templates created\nsecret/katib-mysql-secrets created\nservice/katib-controller created\nservice/katib-db-manager created\nservice/katib-mysql created\nservice/katib-ui created\npersistentvolumeclaim/katib-mysql created\ndeployment.apps/katib-controller created\ndeployment.apps/katib-db-manager created\ndeployment.apps/katib-mysql created\ndeployment.apps/katib-ui created\ncertificate.cert-manager.io/katib-webhook-cert created\nissuer.cert-manager.io/katib-selfsigned-issuer created\nvirtualservice.networking.istio.io/katib-ui created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\nvalidatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\n```\n\nConfirm if it has been installed properly.\n```bash\nkubectl get po -n kubeflow | grep katib\n```\nWait until four pods are Running, like this.\n```bash\nkatib-controller-68c47fbf8b-b985z                        1/1     Running   0          82s\nkatib-db-manager-6c948b6b76-2d9gr                        1/1     Running   0          82s\nkatib-mysql-7894994f88-scs62                             1/1     Running   0          82s\nkatib-ui-64bb96d5bf-d89kp                                1/1     Running   0          82s\n```\n\nAdditionally, we will confirm that the Katib UI is connected normally.\n```bash\nkubectl port-forward svc/katib-ui -n kubeflow 8081:80\n```\n\nOpen the web browser and access the path [http://localhost:8081/katib/](http://localhost:8081/katib/) to confirm the following screen is 
displayed.\n\nInstall the Central Dashboard.\n```bash\nkustomize build apps/centraldashboard/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\nserviceaccount/centraldashboard created\nrole.rbac.authorization.k8s.io/centraldashboard created\nclusterrole.rbac.authorization.k8s.io/centraldashboard created\nrolebinding.rbac.authorization.k8s.io/centraldashboard created\nclusterrolebinding.rbac.authorization.k8s.io/centraldashboard created\nconfigmap/centraldashboard-config created\nconfigmap/centraldashboard-parameters created\nservice/centraldashboard created\ndeployment.apps/centraldashboard created\nvirtualservice.networking.istio.io/centraldashboard created\n```\n\nCheck to see if it has been installed normally.\n```bash\nkubectl get po -n kubeflow | grep centraldashboard\n```\n\nWait until one pod related to centraldashboard in the kubeflow namespace becomes Running.\n```bash\ncentraldashboard-8fc7d8cc-xl7ts                          1/1     Running   0          52s\n```\n\nAdditionally, we will check if the Central Dashboard UI is connected properly.\n```bash\nkubectl port-forward svc/centraldashboard -n kubeflow 8082:80\n```\nOpen the web browser, connect to the path [http://localhost:8082/](http://localhost:8082/), and check that the following screen is displayed.\n\nInstall the Admission Webhook.\n```bash\nkustomize build apps/admission-webhook/upstream/overlays/cert-manager | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/poddefaults.kubeflow.org created\nserviceaccount/admission-webhook-service-account created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-cluster-role created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-admin created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-edit created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-view 
created\nclusterrolebinding.rbac.authorization.k8s.io/admission-webhook-cluster-role-binding created\nservice/admission-webhook-service created\ndeployment.apps/admission-webhook-deployment created\ncertificate.cert-manager.io/admission-webhook-cert created\nissuer.cert-manager.io/admission-webhook-selfsigned-issuer created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/admission-webhook-mutating-webhook-configuration created\n```\n\nCheck if it is installed normally.\n```bash\nkubectl get po -n kubeflow | grep admission-webhook\n```\n\nWait until one pod is running.\n```bash\nadmission-webhook-deployment-667bd68d94-2hhrx            1/1     Running   0          11s\n```\n\nInstall the Notebook Controller.\n```bash\nkustomize build apps/jupyter/notebook-controller/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/notebooks.kubeflow.org created\nserviceaccount/notebook-controller-service-account created\nrole.rbac.authorization.k8s.io/notebook-controller-leader-election-role created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-admin created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-edit created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-view created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-role created\nrolebinding.rbac.authorization.k8s.io/notebook-controller-leader-election-rolebinding created\nclusterrolebinding.rbac.authorization.k8s.io/notebook-controller-role-binding created\nconfigmap/notebook-controller-config-m... created\ndeployment.apps/notebook-controller created\n```\n\nCheck if the installation was successful. 
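\n\nFor reference, the controller reconciles Notebook custom resources into the Deployments and Services that back each notebook server. A minimal Notebook object looks roughly like this (the name, namespace, and image below are only illustrative, not taken from this guide):\n\n```yaml\napiVersion: kubeflow.org/v1\nkind: Notebook\nmetadata:\n  name: my-notebook                      # hypothetical name\n  namespace: kubeflow-user-example-com   # a profile namespace\nspec:\n  template:\n    spec:\n      containers:\n        - name: my-notebook\n          # illustrative image; use one provided by your Kubeflow distribution\n          image: public.ecr.aws/j1r0q0g6/notebooks/notebook-servers/jupyter-scipy:v1.4\n```\n\n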
Wait until one pod is running.\n```bash\nkubectl get po -n kubeflow | grep notebook-controller\n```\n\nInstall the Jupyter Web App.\n```bash\nkustomize build apps/jupyter/jupyter-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf performed correctly, the following will be output.\n```bash\nconfigmap/jupyter-web-app-config-76844k4cd7 created\nconfigmap/jupyter-web-app-logos created\nconfigmap/jupyter-web-app-parameters-chmg88cm48 created\nservice/jupyter-web-app-service created\ndeployment.apps/jupyter-web-app-deployment created\nvirtualservice.networking.istio.io/jupyter-web-app-jupyter-web-app created\n```\n\nCheck that the installation was successful.\n```bash\nkubectl get po -n kubeflow | grep jupyter-web-app\n```\n\nWait until one pod is Running.\n\nInstall the Profile Controller.\n```bash\nkustomize build apps/profiles/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/profiles.kubeflow.org created\nserviceaccount/profiles-controller-service-account created\nrole.rbac.authorization.k8s.io/profiles-leader-election-role created\nrolebinding.rbac.authorization.k8s.io/profiles-leader-election-rolebinding created\nclusterrolebinding.rbac.authorization.k8s.io/profiles-cluster-role-binding created\nconfigmap/namespace-labels-data-48h7kd55mc created\nconfigmap/profiles-config-46c7tgh6fd created\nservice/profiles-kfam created\ndeployment.apps/profiles-deployment created\nvirtualservice.networking.istio.io/profiles-kfam created\n```\n\nCheck to see if it is installed normally.\n```bash\nkubectl get po -n kubeflow | grep profiles-deployment\n```\n\nWait until one pod is running.\n```bash\nprofiles-deployment-89f7d88b-qsnrd                       2/2     Running   0          42s\n```\n\nInstall the Volumes Web App.\n```bash\nkustomize build apps/volumes-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\nserviceaccount/volumes-web-app-service-account 
created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-cluster-role created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-admin created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-edit created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-view created\nclusterrolebinding.rbac.authorization.k8s.io/volumes-web-app-cluster-role-binding created\nconfigmap/volumes-web-app-parameters-4gg8cm2gmk created\nservice/volumes-web-app-service created\ndeployment.apps/volumes-web-app-deployment created\nvirtualservice.networking.istio.io/volumes-web-app-volumes-web-app created\n```\n\nCheck if it is installed normally.\n```bash\nkubectl get po -n kubeflow | grep volumes-web-app\n```\n\nWait until one pod is running.\n```bash\nvolumes-web-app-deployment-8589d664cc-62svl              1/1     Running   0          27s\n```\n\nInstall the Tensorboards Web App.\n```bash\nkustomize build apps/tensorboard/tensorboards-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n```bash\nserviceaccount/tensorboards-web-app-service-account created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-admin created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-edit created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-view created\nclusterrolebinding.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role-binding created\nconfigmap/tensorboards-web-app-parameters-g28fbd6cch created\nservice/tensorboards-web-app-service created\ndeployment.apps/tensorboards-web-app-deployment created\nvirtualservice.networking.istio.io/tensorboards-web-app-tensorboards-web-app created\n```\n\nCheck if it is installed correctly.\n```bash\nkubectl get po -n kubeflow | grep tensorboards-web-app\n```\n\nWait until one pod (tensorboards-web-app-deployment-6ff79b7f44-qbzmw in this example) is Running.\n\nInstall the Tensorboard Controller.\n```bash\nkustomize build apps/tensorboard/tensorboard-controller/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf performed normally, a CustomResourceDefinition for tensorboards.tensorboard.kubeflow.org is created, along with a 
service account, roles, role bindings, a config map, a metrics service, and a deployment for the controller manager.\n\nCheck that deployment.apps/tensorboard-controller-controller-manager was installed correctly, and wait until one pod is Running.\n\nInstall the Training Operator.\n```bash\nkustomize build apps/training-operator/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/mxjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/pytorchjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/tfjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/xgboostjobs.kubeflow.org created\nserviceaccount/training-operator created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-view created\nclusterrole.rbac.authorization.k8s.io/training-operator created\nclusterrolebinding.rbac.authorization.k8s.io/training-operator created\nservice/training-operator created\ndeployment.apps/training-operator created\n```\n\nCheck to see if it has been installed normally.\n\n```bash\nkubectl get po -n kubeflow | grep training-operator\n```\n\nWait until one pod is up and running.\n\n```bash\ntraining-operator-7d98f9dd88-6887f                          1/1     Running   0          28s\n```\n\n### User Namespace\n\nTo use Kubeflow, create a Kubeflow Profile for the user.\n\n```bash\nkustomize build common/user-namespace/base | kubectl apply -f -\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\nconfigmap/default-install-config-9h2h2b6hbk created\nprofile.kubeflow.org/kubeflow-user-example-com created\n```\n\nConfirm that the kubeflow-user-example-com profile has been created.\n\n```bash\nkubectl get 
profile\n```\n\n```bash\nNAME                        AGE\nkubeflow-user-example-com   37s\n```\n\n## Check installation\n\nConfirm that the installation succeeded by port-forwarding and accessing the Kubeflow Central Dashboard in a web browser.\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nOpen a web browser and connect to [http://localhost:8080](http://localhost:8080) to confirm that the following screen is displayed.\n\n![login-ui](./img/login-after-install.png)\n\nEnter the following connection information to log in.\n\n- Email Address: `user@example.com`\n- Password: `12341234`\n\n![central-dashboard](./img/after-login.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-components/install-components-mlflow.md",
    "content": "---\ntitle : \"2. MLflow Tracking Server\"\ndescription: \"Setup Components - MLflow\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Install MLflow Tracking Server\n\nMLflow is a popular open-source ML experiment management tool. In addition to [experiment management](https://mlflow.org/docs/latest/tracking.html#tracking), MLflow provides functionalities for ML [model packaging](https://mlflow.org/docs/latest/projects.html#projects), [deployment management](https://mlflow.org/docs/latest/models.html#models), and [model storage](https://mlflow.org/docs/latest/model-registry.html#registry).\n\nIn *MLOps for ALL*, we will be using MLflow for experiment management purposes.  \nTo store the data managed by MLflow and provide a user interface, we will deploy the MLflow Tracking Server on the Kubernetes cluster.\n\n## Before Install MLflow Tracking Server\n\n### Install PostgreSQL DB\n\nDeploy a PostgreSQL DB to the Kubernetes cluster for the MLflow Tracking Server to use as its backend store.\n\nFirst, create a namespace called `mlflow-system`.\n\n```bash\nkubectl create ns mlflow-system\n```\n\nIf the following message is output, it means that it has been generated normally.\n\n```bash\nnamespace/mlflow-system created\n```\n\nCreate a PostgreSQL DB in the `mlflow-system` namespace.\n\n```bash\nkubectl -n mlflow-system apply -f https://raw.githubusercontent.com/mlops-for-all/helm-charts/b94b5fe4133f769c04b25068b98ccfa7a505aa60/mlflow/manifests/postgres.yaml\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\nservice/postgresql-mlflow-service created\ndeployment.apps/postgresql-mlflow created\npersistentvolumeclaim/postgresql-mlflow-pvc created\n```\n\nWait until one postgresql related pod is running in the mlflow-system namespace.\n\n```bash\nkubectl get pod -n mlflow-system | grep postgresql\n```\n\nIf the output is similar to the following, it has executed 
normally.\n\n```bash\npostgresql-mlflow-7b9bc8c79f-srkh7   1/1     Running   0          38s\n```\n\n### Setup Minio\n\nWe will utilize the Minio that was installed in the previous Kubeflow installation step. \nHowever, in order to separate it for kubeflow and mlflow purposes, we will create a mlflow-specific bucket.  \nFirst, port-forward the minio-service to access Minio and create the bucket.\n\n```bash\nkubectl port-forward svc/minio-service -n kubeflow 9000:9000\n```\n\nOpen a web browser and connect to [localhost:9000](http://localhost:9000) to display the following screen.\n\n![minio-install](./img/minio-install.png)\n\n\nEnter the following credentials to log in: \n\n- Username: `minio`\n- Password: `minio123`\n\nClick the **`+`** button on the right side bottom, then click `Create Bucket`. \n\n![create-bucket](./img/create-bucket.png)\n\n\nEnter `mlflow` in `Bucket Name` to create the bucket.\n\nIf successfully created, you will see a bucket named `mlflow` on the left.\n![mlflow-bucket](./img/mlflow-bucket.png)\n\n\n---\n\n## Let's Install MLflow Tracking Server\n\n### Add Helm Repository\n\n```bash\nhelm repo add mlops-for-all https://mlops-for-all.github.io/helm-charts\n```\n\nIf the following message is displayed, it means it has been added successfully.\n```bash\n\"mlops-for-all\" has been added to your repositories\n```\n\n### Update Helm Repository\n\n```bash\nhelm repo update\n```\n\nIf the following message is displayed, it means that the update has been successfully completed.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"mlops-for-all\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nInstall mlflow-server Helm Chart version 0.2.0.\n\n```bash\nhelm install mlflow-server mlops-for-all/mlflow-server \\\n  --namespace mlflow-system \\\n  --version 0.2.0\n```\n\n- The above Helm chart installs MLflow with the connection information for its backend store and artifacts store set to the default minio created during the Kubeflow installation process and the postgresql information created from the [PostgreSQL DB installation](#postgresql-db-installation) above.\n  - If you want to use a separate DB or object storage, please refer to the [Helm Chart Repo](https://github.com/mlops-for-all/helm-charts/tree/main/mlflow/chart) and set the values separately during helm install.\n\nThe following message should be displayed:\n\n```bash\nNAME: mlflow-server\nLAST DEPLOYED: Sat Dec 18 22:02:13 2021\nNAMESPACE: mlflow-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\nCheck to see if it was installed normally.\n\n```bash\nkubectl get pod -n mlflow-system | grep mlflow-server\n```\n\nWait until one mlflow-server related pod is running in the mlflow-system namespace.  \nIf it is output similar to the following, then it has been successfully executed.\n\n```bash\nmlflow-server-ffd66d858-6hm62        1/1     Running   0          74s\n```\n\n### Check installation\n\nLet's now check if we can successfully connect to the MLflow Server.\n\nFirst, we will perform port forwarding in order to connect from the client node.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\nOpen a web browser and connect to [localhost:5000](http://localhost:5000) and the following screen will be output.\n\n![mlflow-install](./img/mlflow-install.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-components/install-components-pg.md",
    "content": "---\ntitle : \"4. Prometheus & Grafana\"\ndescription: \"Setup Components - Prometheus & Grafana\"\nsidebar_position: 4\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Prometheus & Grafana\n\nPrometheus and Grafana are tools for monitoring.  \nFor stable service operation, it is necessary to continuously observe the status of the service and of the infrastructure it runs on, and to respond quickly based on the observed metrics when a problem arises.  \nAmong the many tools for performing such monitoring efficiently, *MLOps for ALL* uses the open-source Prometheus and Grafana.\n\nFor more information, please refer to the [Prometheus Official Documentation](https://prometheus.io/docs/introduction/overview/) and [Grafana Official Documentation](https://grafana.com/docs/).\n\nPrometheus is a tool to collect metrics from various targets, and Grafana is a tool that helps visualize the gathered data. Although there is no dependency between them, they are often used together, complementing each other.\n\nOn this page, we will install Prometheus and Grafana on a Kubernetes cluster, then send API requests to a SeldonDeployment created with Seldon-Core and check whether metrics are collected successfully.\n\nWe also install a dashboard to efficiently monitor the metrics of SeldonDeployments created with Seldon-Core, using Helm Chart version 1.12.0 from the seldonio/seldon-core-analytics Helm Repository.\n\n### Add Helm Repository\n\n```bash\nhelm repo add seldonio https://storage.googleapis.com/seldon-charts\n```\n\nIf the following message is output, it means that it has been added successfully.\n\n```bash\n\"seldonio\" has been added to your repositories\n```\n\n### Update Helm Repository\n\n```bash\nhelm repo update\n```\n\nIf the following message is displayed, it means that the update was successful.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the 
\"seldonio\" chart repository\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. ⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nInstall version 1.12.0 of the seldon-core-analytics Helm Chart.\n\n```bash\nhelm install seldon-core-analytics seldonio/seldon-core-analytics \\\n  --namespace seldon-system \\\n  --version 1.12.0\n```\n\nThe following message should be output.\n\n```bash\nSkip...\nNAME: seldon-core-analytics\nLAST DEPLOYED: Tue Dec 14 18:29:38 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\n```\n\nCheck to see if it was installed normally.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-core-analytics\n```\n\n\nWait until 6 seldon-core-analytics related pods are Running in the seldon-system namespace.\n```bash\nseldon-core-analytics-grafana-657c956c88-ng8wn                  2/2     Running   0          114s\nseldon-core-analytics-kube-state-metrics-94bb6cb9-svs82         1/1     Running   0          114s\nseldon-core-analytics-prometheus-alertmanager-64cf7b8f5-nxbl8   2/2     Running   0          114s\nseldon-core-analytics-prometheus-node-exporter-5rrj5            1/1     Running   0          114s\nseldon-core-analytics-prometheus-pushgateway-8476474cff-sr4n6   1/1     Running   0          114s\nseldon-core-analytics-prometheus-seldon-685c664894-7cr45        2/2     Running   0          114s\n```\n\n### Check installation\n\nLet's now check if we can connect to Grafana normally. 
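\n\nBefore opening Grafana, it helps to have some traffic for the dashboards to display. As a sketch (assuming a SeldonDeployment named `sample` already exists in the `seldon` namespace and that Ambassador is port-forwarded to local port 8003; both names and the port are illustrative), a prediction request in the Seldon v1 protocol looks like this:\n\n```bash\n# Terminal 1: forward the Ambassador service locally (hypothetical port choice)\nkubectl port-forward svc/ambassador -n seldon-system 8003:80\n\n# Terminal 2: send a prediction request through Ambassador\ncurl -X POST http://localhost:8003/seldon/seldon/sample/api/v1.0/predictions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"data\": {\"ndarray\": [[1.0, 2.0, 3.0, 4.0]]}}'\n```\n\nEach request is then counted in the Prediction Analytics dashboard shown later on this page.\n\n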
First, we will port forward to connect to the client node.\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\nOpen the web browser and connect to [localhost:8090](http://localhost:8090), then the following screen will be displayed.\n\n![grafana-install](./img/grafana-install.png)\n\nEnter the following connection information to connect.\n\n- Email or username: `admin`\n- Password: `password`\n\nWhen you log in, the following screen will be displayed.\n\n![grafana-login](./img/grafana-login.png)\n\nClick the dashboard icon on the left and click the `Manage` button.\n\n![dashboard-click](./img/dashboard-click.png)\n\nYou can see that the basic Grafana dashboard is included. Click the `Prediction Analytics` dashboard among them.\n\n![dashboard](./img/dashboard.png)\n\n The Seldon Core API Dashboard is visible and can be confirmed with the following output.\n\n![seldon-dashboard](./img/seldon-dashboard.png)\n\n## References\n\n- [Seldon-Core-Analytics Helm Chart](https://github.com/SeldonIO/seldon-core/tree/master/helm-charts/seldon-core-analytics)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-components/install-components-seldon.md",
    "content": "---\ntitle : \"3. Seldon-Core\"\ndescription: \"Setup Components - Seldon-Core\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Seldon-Core\n\nSeldon-Core is an open-source framework for deploying and managing numerous machine learning models in Kubernetes environments.  \nFor more details, please refer to Seldon-Core's official [product description page](https://www.seldon.io/tech/products/core/) and [GitHub repository](https://github.com/SeldonIO/seldon-core), as well as the API Deployment part of this guide.\n\n## Installing Seldon-Core\n\nTo use Seldon-Core, an ingress module for Kubernetes such as Ambassador or Istio is required; see the [installation guide](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).  \nSeldon-Core officially supports only Ambassador and Istio. *MLOps for ALL* uses Ambassador with Seldon-Core, so we will install Ambassador.\n\n### Adding Ambassador to the Helm Repository\n\n```bash\nhelm repo add datawire https://www.getambassador.io\n```\n\nIf the following message is displayed, it means it has been added normally.\n\n```bash\n\"datawire\" has been added to your repositories\n```\n\n### Update Ambassador - Helm Repository\n\n```bash\nhelm repo update\n```\n\nIf the following message is output, it means that the update has been completed normally.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Ambassador - Helm Install\n\nInstall version 6.9.3 of the Ambassador Chart.\n\n```bash\nhelm install ambassador datawire/ambassador \\\n  --namespace seldon-system \\\n  --create-namespace \\\n  --set image.repository=quay.io/datawire/ambassador \\\n  --set enableAES=false \\\n  --set crds.keep=false \\\n  --version 6.9.3\n```\n\nThe following message should be displayed.\n\n```bash\nSkip...\n\nW1206 17:01:36.026326   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role\nW1206 17:01:36.029764   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding\nNAME: ambassador\nLAST DEPLOYED: Mon Dec  6 17:01:34 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\nNOTES:\n-------------------------------------------------------------------------------\n  Congratulations! 
You've successfully installed Ambassador!\n\n-------------------------------------------------------------------------------\nTo get the IP address of Ambassador, run the following commands:\nNOTE: It may take a few minutes for the LoadBalancer IP to be available.\n     You can watch the status of by running 'kubectl get svc -w  --namespace seldon-system ambassador'\n\n  On GKE/Azure:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')\n\n  On AWS:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')\n\n  echo http://$SERVICE_IP:\n\nFor help, visit our Slack at http://a8r.io/Slack or view the documentation online at https://www.getambassador.io.\n```\n\nWait until four pods become running in the seldon-system.\n\n```bash\nkubectl get pod -n seldon-system\n```\n\n```bash\nambassador-7f596c8b57-4s9xh                  1/1     Running   0          7m15s\nambassador-7f596c8b57-dt6lr                  1/1     Running   0          7m15s\nambassador-7f596c8b57-h5l6f                  1/1     Running   0          7m15s\nambassador-agent-77bccdfcd5-d5jxj            1/1     Running   0          7m15s\n```\n\n### Seldon-Core - Helm Install\n\nInstall version 1.11.2 of the seldon-core-operator Chart.\n\n```bash\nhelm install seldon-core seldon-core-operator \\\n    --repo https://storage.googleapis.com/seldon-charts \\\n    --namespace seldon-system \\\n    --set usageMetrics.enabled=true \\\n    --set ambassador.enabled=true \\\n    --version 1.11.2\n```\nThe following message should be displayed.\n\n```bash\nSkip...\n\nW1206 17:05:38.336391   28181 warnings.go:70] admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration\nNAME: seldon-core\nLAST DEPLOYED: Mon Dec  6 17:05:34 2021\nNAMESPACE: 
seldon-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\nWait until one seldon-controller-manager pod is Running in the seldon-system namespace.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-controller\n```\n\n```bash\nseldon-controller-manager-8457b8b5c7-r2frm   1/1     Running   0          2m22s\n```\n\n## References\n\n- [Example Model Servers with Seldon](https://docs.seldon.io/projects/seldon-core/en/latest/examples/server_examples.html#examples-server-examples--page-root)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"Setup Kubernetes\",\n  \"position\": 2,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/install-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"4. Install Kubernetes\",\n  \"position\": 4,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/install-kubernetes/kubernetes-with-k3s.md",
    "content": "---\ntitle: \"4.1. K3s\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## 1. Prerequisite\n\nBefore setting up a Kubernetes cluster, please refer to [Install Prerequisite](../../setup-kubernetes/install-prerequisite.md) and install the necessary components on the **cluster**.\n\nk3s uses containerd as its backend by default.\nHowever, we need Docker as the backend in order to use GPUs, so we will install k3s with the `--docker` option.\n\n```bash\ncurl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.21.7+k3s1 sh -s - server --disable traefik --disable servicelb --disable local-storage --docker\n```\n\nAfter installing k3s, check the k3s config.\n\n```bash\nsudo cat /etc/rancher/k3s/k3s.yaml\n```\n\nIf installed correctly, the following items will be output. (Security related keys are hidden with <...>.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://127.0.0.1:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 2. Setup Kubernetes Cluster\n\nSet up the Kubernetes cluster by copying the k3s config to be used as the cluster’s kubeconfig.\n\n```bash\nmkdir .kube\nsudo cp /etc/rancher/k3s/k3s.yaml .kube/config\n```\n\nGrant user access permission to the copied config file.\n\n```bash\nsudo chown $USER:$USER .kube/config\n```\n\n## 3. 
Setup Kubernetes Client\n\nNow move the kubeconfig configured on the cluster to the local machine.\nPlace it at the path `~/.kube/config` on the local machine.\n\nThe copied config file initially has the server IP set to `https://127.0.0.1:6443`.  \nModify this value to match the IP of the cluster.  \n(We modified it to `https://192.168.0.19:6443` to match the IP of the cluster used in this page.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://192.168.0.19:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 4. Install Kubernetes Default Modules\n\nPlease refer to [Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md) to install the following components:\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. Verify Successful Installation\n\nFinally, check if the nodes are Ready and verify the OS, Docker, and Kubernetes versions.\n\n```bash\nkubectl get nodes -o wide\n```\n\nIf you see the following message, it means that the installation was successful.\n\n```bash\nNAME    STATUS   ROLES                  AGE   VERSION        INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   11m   v1.21.7+k3s1   192.168.0.19   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n\n## 6. References\n\n- [https://rancher.com/docs/k3s/latest/en/installation/install-options/](https://rancher.com/docs/k3s/latest/en/installation/install-options/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/install-kubernetes/kubernetes-with-kubeadm.md",
    "content": "---\ntitle: \"4.3. Kubeadm\"\ndescription: \"\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## 1. Prerequisite\n\nBefore building a Kubernetes cluster, install the necessary components to the **cluster**.\n\nPlease refer to [Install Prerequisite](../../setup-kubernetes/install-prerequisite.md) and install the necessary components to the **cluster**.\n\nChange the configuration of the network for Kubernetes.\n\n```bash\nsudo modprobe br_netfilter\n\ncat <<EOF | sudo tee /etc/modules-load.d/k8s.conf\nbr_netfilter\nEOF\n\ncat <<EOF | sudo tee /etc/sysctl.d/k8s.conf\nnet.bridge.bridge-nf-call-ip6tables = 1\nnet.bridge.bridge-nf-call-iptables = 1\nEOF\nsudo sysctl --system\n```\n\n## 2. Setup Kubernetes Cluster\n\n- kubeadm : Automates the installation process by registering kubelet as a service and issuing certificates for communication between cluster components.\n- kubelet : Container handler responsible for starting and stopping container resources.\n- kubectl : CLI tool used to interact with and manage Kubernetes clusters from the terminal.\n\nInstall kubeadm, kubelet, and kubectl using the following commands. 
It is important to pin the versions of these components, as accidental upgrades can lead to unexpected issues; the final `apt-mark hold` command below takes care of this.\n\n```bash\nsudo apt-get update\nsudo apt-get install -y apt-transport-https ca-certificates curl &&\nsudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg &&\necho \"deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main\" | sudo tee /etc/apt/sources.list.d/kubernetes.list &&\nsudo apt-get update\nsudo apt-get install -y kubelet=1.21.7-00 kubeadm=1.21.7-00 kubectl=1.21.7-00 &&\nsudo apt-mark hold kubelet kubeadm kubectl\n```\n\nCheck if kubeadm, kubelet, and kubectl are installed correctly.\n\n```bash\nmlops@ubuntu:~$ kubeadm version\nkubeadm version: &version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:40:08Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\n```bash\nmlops@ubuntu:~$ kubelet --version\nKubernetes v1.21.7\n```\n\n```bash\nmlops@ubuntu:~$ kubectl version --client\nClient Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\nNow we will use kubeadm to install Kubernetes.\n\n```bash\nsudo kubeadm config images list\nsudo kubeadm config images pull\n\nsudo kubeadm init --pod-network-cidr=10.244.0.0/16\n```\n\nTo control the Kubernetes cluster with kubectl, copy the admin kubeconfig to `$HOME/.kube/config`.\n\n```bash\nmkdir -p $HOME/.kube\nsudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config\nsudo chown $(id -u):$(id -g) $HOME/.kube/config\n```\n\nInstall CNI. 
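\n\nBefore applying the CNI manifest below, you can sanity-check that the control plane brought up by `kubeadm init` is responding. This is a minimal sketch; note that until a CNI is installed, the node reporting `NotReady` and the CoreDNS pods staying `Pending` is expected.\n\n```bash\n# The API server should answer even before a CNI is installed.\nkubectl get nodes\n# CoreDNS pods remain Pending until a CNI provides pod networking.\nkubectl -n kube-system get pods\n```\n\n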
The CNI plugin is responsible for setting up the network inside Kubernetes. There are various kinds of CNI; *MLOps for All* uses flannel.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.13.0/Documentation/kube-flannel.yml\n```\n\nThere are two types of Kubernetes nodes: `Master Node` and `Worker Node`. For stability, it is generally recommended to run only cluster-control tasks on the `Master Node`. However, this manual assumes a single-node cluster, so we remove the master taint to allow all types of tasks to run on the `Master Node`.\n\n```bash\nkubectl taint nodes --all node-role.kubernetes.io/master-\n```\n\n## 3. Setup Kubernetes Client\n\nCopy the kubeconfig file created on the cluster to the **client** to control the cluster through kubectl.\n\n```bash\nmkdir -p $HOME/.kube\nscp -p {CLUSTER_USER_ID}@{CLUSTER_IP}:~/.kube/config ~/.kube/config\n```\n\n## 4. Install Kubernetes Default Modules\n\nPlease refer to [Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md) to install the following components:\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. Verify Successful Installation\n\nFinally, check that the nodes are in the Ready state.\n\n```bash\nkubectl get nodes\n```\n\nWhen the node is in the \"Ready\" state, the output will be similar to the following:\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION\nubuntu   Ready    control-plane,master   2m55s   v1.21.7\n```\n\n## 6. References\n\n- [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/install-kubernetes/kubernetes-with-minikube.md",
    "content": "---\ntitle: \"4.2. Minikube\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## 1. Prerequisite\n\nBefore setting up a Kubernetes cluster, install the necessary components on the **cluster**.\n\nPlease refer to [Install Prerequisite](../../setup-kubernetes/install-prerequisite.md) to install the necessary components on the **cluster** before installing Kubernetes.\n\n### Minikube binary\n\nInstall the v1.24.0 version of the Minikube binary to use Minikube.\n\n```bash\nwget https://github.com/kubernetes/minikube/releases/download/v1.24.0/minikube-linux-amd64\nsudo install minikube-linux-amd64 /usr/local/bin/minikube\n```\n\nCheck if it is installed properly.\n\n```bash\nminikube version\n```\n\nIf this message appears, it means the installation was successful.\n\n```bash\nmlops@ubuntu:~$ minikube version\nminikube version: v1.24.0\ncommit: 76b94fb3c4e8ac5062daf70d60cf03ddcc0a741b\n```\n\n## 2. Setup Kubernetes Cluster\n\nNow let's build the Kubernetes cluster using Minikube.\nTo facilitate the smooth use of GPUs and communication between cluster and client, Minikube is run using the `driver=none` option. Please note that this option must be run as root user. \n\nSwitch to root user.\n\n```bash\nsudo su\n```\n\nRun `minikube start` to build the Kubernetes cluster for Kubeflow's smooth operation, specifying the Kubernetes version as v1.21.7 and adding `--extra-config`.\n\n```bash\nminikube start --driver=none \\\n  --kubernetes-version=v1.21.7 \\\n  --extra-config=apiserver.service-account-signing-key-file=/var/lib/minikube/certs/sa.key \\\n  --extra-config=apiserver.service-account-issuer=kubernetes.default.svc\n```\n\n### Disable default addons\n\nWhen installing Minikube, there are default addons that are installed. 
We will disable any addons that we do not intend to use.\n\n```bash\nminikube addons disable storage-provisioner\nminikube addons disable default-storageclass\n```\n\nConfirm that all addons are disabled.\n\n```bash\nminikube addons list\n```\n\nIf the following message appears, it means that the installation was successful.\n\n```bash\nroot@ubuntu:/home/mlops# minikube addons list\n|-----------------------------|----------|--------------|-----------------------|\n|         ADDON NAME          | PROFILE  |    STATUS    |      MAINTAINER       |\n|-----------------------------|----------|--------------|-----------------------|\n| ambassador                  | minikube | disabled     | unknown (third-party) |\n| auto-pause                  | minikube | disabled     | google                |\n| csi-hostpath-driver         | minikube | disabled     | kubernetes            |\n| dashboard                   | minikube | disabled     | kubernetes            |\n| default-storageclass        | minikube | disabled     | kubernetes            |\n| efk                         | minikube | disabled     | unknown (third-party) |\n| freshpod                    | minikube | disabled     | google                |\n| gcp-auth                    | minikube | disabled     | google                |\n| gvisor                      | minikube | disabled     | google                |\n| helm-tiller                 | minikube | disabled     | unknown (third-party) |\n| ingress                     | minikube | disabled     | unknown (third-party) |\n| ingress-dns                 | minikube | disabled     | unknown (third-party) |\n| istio                       | minikube | disabled     | unknown (third-party) |\n| istio-provisioner           | minikube | disabled     | unknown (third-party) |\n| kubevirt                    | minikube | disabled     | unknown (third-party) |\n| logviewer                   | minikube | disabled     | google                |\n| metallb                     | 
minikube | disabled     | unknown (third-party) |\n| metrics-server              | minikube | disabled     | kubernetes            |\n| nvidia-driver-installer     | minikube | disabled     | google                |\n| nvidia-gpu-device-plugin    | minikube | disabled     | unknown (third-party) |\n| olm                         | minikube | disabled     | unknown (third-party) |\n| pod-security-policy         | minikube | disabled     | unknown (third-party) |\n| portainer                   | minikube | disabled     | portainer.io          |\n| registry                    | minikube | disabled     | google                |\n| registry-aliases            | minikube | disabled     | unknown (third-party) |\n| registry-creds              | minikube | disabled     | unknown (third-party) |\n| storage-provisioner         | minikube | disabled     | kubernetes            |\n| storage-provisioner-gluster | minikube | disabled     | unknown (third-party) |\n| volumesnapshots             | minikube | disabled     | kubernetes            |\n|-----------------------------|----------|--------------|-----------------------|\n```\n\n## 3. Setup Kubernetes Client\n\nNow let's install the tools needed to use Kubernetes from the **client** machine. If the **client** and **cluster** are not separate nodes, please note that you need to perform all the operations as the root user.\n\nIf the **client** and **cluster** are separate nodes, first retrieve the Kubernetes administrator credentials from the **cluster** to the **client**.\n\n1. Check the config on the **cluster**:\n\n  ```bash\n  # Cluster node\n  minikube kubectl -- config view --flatten\n  ```\n\n2. 
The following information will be displayed:\n\n  ```bash\n  apiVersion: v1\n  clusters:\n  - cluster:\n      certificate-authority-data: LS0tLS1CRUd....\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: cluster_info\n      server: https://192.168.0.62:8443\n    name: minikube\n  contexts:\n  - context:\n      cluster: minikube\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: context_info\n      namespace: default\n      user: minikube\n    name: minikube\n  current-context: minikube\n  kind: Config\n  preferences: {}\n  users:\n  - name: minikube\n    user:\n      client-certificate-data: LS0tLS1CRUdJTi....\n      client-key-data: LS0tLS1CRUdJTiBSU0....\n  ```\n\n3. Create the `.kube` folder on the **client** node:\n\n  ```bash\n  # Client node\n  mkdir -p /home/$USER/.kube\n  ```\n\n4. Paste the information obtained from Step 2 into the file and save it:\n\n  ```bash\n  vi /home/$USER/.kube/config\n  ```\n\n## 4. Install Kubernetes Default Modules\n\nPlease refer to [Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md) to install the following components:\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 
Verify Successful Installation\n\nFinally, check that the node is Ready, and check the OS, Docker, and Kubernetes versions.\n\n```bash\nkubectl get nodes -o wide\n```\n\nIf this message appears, it means that the installation has completed normally.\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   2d23h   v1.21.7   192.168.0.75   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/install-kubernetes-module.md",
    "content": "---\ntitle: \"5. Install Kubernetes Modules\"\ndescription: \"Install Helm, Kustomize\"\nsidebar_position: 5\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Modules\n\n\nOn this page, we will explain how to install the modules that will be used on the cluster from the client nodes.  \nAll the processes introduced here will be done on the **client nodes**.\n\n## Helm\n\nHelm is one of the package management tools that helps to deploy and manage resources related to Kubernetes packages at once.\n\n1. Download Helm version 3.7.1 into the current folder.\n\n- For Linux amd64\n\n  ```bash\n  wget https://get.helm.sh/helm-v3.7.1-linux-amd64.tar.gz\n  ```\n\n- Other OS refer to the [official website](https://github.com/helm/helm/releases/tag/v3.7.1) for the download path of the binary that matches the OS and CPU of your client node.\n\n2. Unzip the file to use helm and move the file to its desired location.\n\n  ```bash\n  tar -zxvf helm-v3.7.1-linux-amd64.tar.gz\n  sudo mv linux-amd64/helm /usr/local/bin/helm\n  ```\n\n3. Check to see if the installation was successful:\n  ```bash\n  helm help\n  ```\n\n  If you see the following message, it means that it has been installed normally. \n\n  ```bash\n  The Kubernetes package manager\n\n  Common actions for Helm:\n\n  - helm search:    search for charts\n  - helm pull:      download a chart to your local directory to view\n  - helm install:   upload the chart to Kubernetes\n  - helm list:      list releases of charts\n\n  Environment variables:\n\n  | Name                     | Description                                                         |\n  |--------------------------|---------------------------------------------------------------------|\n  | $HELM_CACHE_HOME         | set an alternative location for storing cached files.               |\n  | $HELM_CONFIG_HOME        | set an alternative location for storing Helm configuration.         
|\n  | $HELM_DATA_HOME          | set an alternative location for storing Helm data.                  |\n\n  ...\n  ```\n\n## Kustomize\n\nKustomize is a configuration management tool that helps you customize, deploy, and manage multiple Kubernetes resources at once.\n\n1. Download the binary version of kustomize v3.10.0 into the current folder.\n\n- For Linux amd64\n\n  ```bash\n  wget https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv3.10.0/kustomize_v3.10.0_linux_amd64.tar.gz\n  ```\n\n- For other operating systems, download the matching binary from [kustomize/v3.10.0](https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv3.10.0).\n\n2. Extract the archive and move the `kustomize` binary to its destination.\n\n  ```bash\n  tar -zxvf kustomize_v3.10.0_linux_amd64.tar.gz\n  sudo mv kustomize /usr/local/bin/kustomize\n  ```\n\n3. Check if it is installed correctly.\n\n  ```bash\n  kustomize help\n  ```\n\n  If you see the following message, it means that it has been installed normally.\n\n  ```bash\n  Manages declarative configuration of Kubernetes.\n  See https://sigs.k8s.io/kustomize\n\n  Usage:\n    kustomize [command]\n\n  Available Commands:\n    build                     Print configuration per contents of kustomization.yaml\n    cfg                       Commands for reading and writing configuration.\n    completion                Generate shell completion script\n    create                    Create a new kustomization in the current directory\n    edit                      Edits a kustomization file\n    fn                        Commands for running functions against configuration.\n  ...\n  ```\n\n## CSI Plugin : Local Path Provisioner\n\n1. The CSI Plugin is a module that is responsible for storage within Kubernetes. 
Install the CSI Plugin, Local Path Provisioner, which is easy to use in single-node clusters.\n\n  ```bash\n  kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.20/deploy/local-path-storage.yaml\n  ```\n\n  If you see the following messages, it means that the installation was successful: \n\n  ```bash\n  namespace/local-path-storage created\n  serviceaccount/local-path-provisioner-service-account created\n  clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created\n  clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created\n  deployment.apps/local-path-provisioner created\n  storageclass.storage.k8s.io/local-path created\n  configmap/local-path-config created\n  ```\n\n2. Also, check if the provisioner pod in the local-path-storage namespace is Running by executing the following command:\n\n  ```bash\n  kubectl -n local-path-storage get pod\n  ```\n\n  If successful, it will display the following output:\n\n  ```bash\n  NAME                                     READY     STATUS    RESTARTS   AGE\n  local-path-provisioner-d744ccf98-xfcbk   1/1       Running   0          7m\n  ```\n\n3. Execute the following command to change the default storage class:\n\n  ```bash\n  kubectl patch storageclass local-path -p '{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"}}}'\n  ```\n\n  If the command is successful, the following output will be displayed:\n\n  ```bash\n  storageclass.storage.k8s.io/local-path patched\n  ```\n\n4. Verify that the default storage class has been set:\n\n  ```bash\n  kubectl get sc\n  ```\n\n  Check if there is a storage class with the name `local-path (default)` in the NAME column:\n\n  ```bash\n  NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE\n  local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  2h\n  ```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/install-prerequisite.md",
    "content": "---\ntitle: \"3. Install Prerequisite\"\ndescription: \"Install docker\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Sangwoo Shim\"]\n---\n\nOn this page, we describe the components that need to be installed or configured on the **Cluster** and **Client** prior to installing Kubernetes.\n\n## Install apt packages\n\nIn order to enable smooth communication between the Client and the Cluster, Port-Forwarding needs to be performed. To enable Port-Forwarding, the following packages need to be installed on the **Cluster**.\n```bash\nsudo apt-get update\nsudo apt-get install -y socat\n```\n\n## Install Docker\n\n1. Install apt packages for docker.\n\n   ```bash\n   sudo apt-get update && sudo apt-get install -y ca-certificates curl gnupg lsb-release\n   ```\n\n2. add docker official GPG key.\n\n   ```bash\n   curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg\n   ```\n\n3. When installing Docker using the apt package manager, configure it to retrieve from the stable repository:\n\n   ```bash\n   echo \\\n   \"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \\\n   $(lsb_release -cs) stable\" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null\n   ```\n\n4. Check the currently available Docker versions for installation:\n\n   ```bash\n   sudo apt-get update && apt-cache madison docker-ce\n   ```\n\n   Verify if the version `5:20.10.11~3-0~ubuntu-focal` is listed among the output:\n\n   ```bash\n   apt-cache madison docker-ce | grep 5:20.10.11~3-0~ubuntu-focal\n   ```\n\n   If the addition was successful, the following output will be displayed:\n\n   ```bash\n   docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages\n   ```\n\n5. 
Install Docker version `5:20.10.11~3-0~ubuntu-focal`:\n\n   ```bash\n   sudo apt-get install -y containerd.io docker-ce=5:20.10.11~3-0~ubuntu-focal docker-ce-cli=5:20.10.11~3-0~ubuntu-focal\n   ```\n\n6. Check that Docker is installed.\n\n   ```bash\n   sudo docker run hello-world\n   ```\n\n   If the installation was successful, it will output as follows:\n\n   ```bash\n   mlops@ubuntu:~$ sudo docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n7. Add permissions to use Docker commands without the `sudo` keyword by executing the following commands:\n\n   ```bash\n   sudo groupadd docker\n   sudo usermod -aG docker $USER\n   newgrp docker\n   ```\n\n8. 
To verify that you can now use Docker commands without `sudo`, run the `docker run` command again:\n\n   ```bash\n   docker run hello-world\n   ```\n\n   If you see the following message after executing the command, it means that the permissions have been successfully added:\n\n   ```bash\n   mlops@ubuntu:~$ docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n## Turn off Swap Memory\n\nFor kubelet to work properly, swap (virtual memory) must be turned off on the **cluster** nodes. The following commands turn off swap.  \n**(When using the cluster and client on the same desktop, turning off swap memory may slow the machine down.)**\n\n```bash\nsudo sed -i '/ swap / s/^\\(.*\\)$/#\\1/g' /etc/fstab\nsudo swapoff -a\n```\n\n## Install Kubectl\n\nkubectl is a client tool used to make API requests to a Kubernetes cluster. It needs to be installed on the client node.\n\n1. Download kubectl version v1.21.7 to the current folder:\n\n   ```bash\n   curl -LO https://dl.k8s.io/release/v1.21.7/bin/linux/amd64/kubectl\n   ```\n\n2. 
Change the file permissions and move it to the appropriate location to make kubectl executable:\n\n   ```bash\n   sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl\n   ```\n\n3. Verify that kubectl is installed correctly:\n\n   ```bash\n   kubectl version --client\n   ```\n\n   If you see the following message, it means that kubectl is installed successfully:\n\n   ```bash\n   Client Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n   ```\n\n4. If you work with multiple Kubernetes clusters and need to manage multiple kubeconfig files or kube-contexts efficiently, you can refer to the following resources:\n\n   - [Configuring Multiple kubeconfig on Your Machine](https://dev.to/aabiseverywhere/configuring-multiple-kubeconfig-on-your-machine-59eo)\n   - [kubectx - Switch between Kubernetes contexts easily](https://github.com/ahmetb/kubectx)\n\n## References\n\n- [Install Docker Engine on Ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [Install and Set Up kubectl on Linux](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/intro.md",
    "content": "---\ntitle: \"1. Introduction\"\ndescription: \"Setup Introduction\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Youngdon Tae\", \"SeungTae Kim\"]\n---\n\n## Build MLOps System\n\nThe biggest barrier when studying MLOps is the difficulty of setting up and using an MLOps system. Using public cloud platforms like AWS or GCP, or commercial tools like Weights & Biases or neptune.ai, can be costly, and starting from scratch to build the entire environment can be overwhelming and confusing.\n\nTo address these challenges and help those who haven't been able to start with MLOps, *MLOps for ALL* will guide you on how to build and use an MLOps system from scratch, requiring only a desktop with Ubuntu installed.\n\nFor those who cannot prepare a Ubuntu desktop environment, use virtual machines to set up the environment.\n\n> If you are using Windows or an Intel-based Mac for the *MLOps for ALL* practical exercises, you can prepare an Ubuntu desktop environment using virtual machine software such as VirtualBox or VMware. Please make sure to meet the recommended specifications when creating the virtual machine.\n> However, for those using an M1 Mac, as of the date of writing (February 2022), VirtualBox and VMware are not available. ([Check if macOS apps are optimized for M1 Apple Silicon Mac](https://isapplesiliconready.com/kr))\n> Therefore, if you are not using a cloud environment, you can install UTM, Virtual machines for Mac, to use virtual machines. \n> (Purchasing and downloading software from the App Store is a form of donation-based payment. 
The free version is sufficient as it only differs in automatic updates.)\n> This virtual machine software supports the *Ubuntu 20.04.3 LTS* practice operating system, enabling you to perform the exercises on an M1 Mac.\n\nHowever, since it is not possible to use all the elements described in the [Components of MLOps](../introduction/component.md), *MLOps for ALL* will mainly focus on installing the representative open source software and connecting them to each other.\n\nThe open source tools installed in *MLOps for ALL* are not meant to be a standard; we recommend choosing the tools that fit your situation.\n\n## Components\n\nThe components of the MLOps system built in this guide, and their versions, have been verified in the following environment.\n\nTo facilitate smooth testing, we will explain the setup of the **Cluster** and **Client** as separate entities.\n\nThe **Cluster** refers to a single desktop with Ubuntu installed.  \nThe **Client** is recommended to be a different machine, such as a laptop or another desktop, that has network access to the Cluster. However, if you only have one machine available, you can use the same desktop for both Cluster and Client purposes.\n\n### Cluster\n\n#### 1. Software\n\nBelow is the list of software that needs to be installed on the Cluster:\n\n| Software        | Version     |\n| --------------- | ----------- |\n| Ubuntu          | 20.04.3 LTS |\n| Docker (Server) | 20.10.11    |\n| NVIDIA Driver   | 470.86      |\n| Kubernetes      | v1.21.7     |\n| Kubeflow        | v1.4.0      |\n| MLFlow          | v1.21.0     |\n\n#### 2. 
Helm Chart\n\nBelow is the list of third-party software that needs to be installed using Helm:\n\n| Helm Chart Repo Name          | Version |\n| ----------------------------- | ------- |\n| datawire/ambassador           | 6.9.3   |\n| seldonio/seldon-core-operator | 1.11.2  |\n\n### Client\n\nThe Client has been validated on macOS (Intel CPU) and Ubuntu 20.04.\n\n| Software        | Version   |\n| --------------- | --------- |\n| kubectl         | v1.21.7   |\n| helm            | v3.7.1    |\n| kustomize       | v3.10.0   |\n\n### Minimum System Requirements\n\nIt is recommended that the Cluster meet the following specifications, which are dependent on the recommended specifications for Kubernetes and Kubeflow:\n\n- CPU: 6 cores\n- RAM: 12GB\n- DISK: 50GB\n- GPU: NVIDIA GPU (optional)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/kubernetes.md",
    "content": "---\ntitle : \"2. Setup Kubernetes\"\ndescription: \"Setup Kubernetes\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Cluster\n\nFor those learning Kubernetes for the first time, the first barrier to entry is setting up a Kubernetes practice environment.\n\nThe official tool that supports building a production-level Kubernetes cluster is kubeadm, but there are also tools such as kubespray and kops that help users set up more easily, and tools such as k3s, minikube, microk8s, and kind that help you set up a compact Kubernetes cluster easily for learning purposes.\n\nEach tool has its own advantages and disadvantages, so considering the preferences of each user, this article will use three tools: kubeadm, k3s, and minikube to set up a Kubernetes cluster.\nFor detailed comparisons of each tool, please refer to the official Kubernetes [documentation](https://kubernetes.io/ko/docs/tasks/tools/).\n\n*MLOps for ALL* recommends **k3s** as a tool that is easy to use when setting up a Kubernetes cluster.\n\nIf you want to use all the features of Kubernetes and configure the nodes, we recommend **kubeadm**.  \n**minikube** has the advantage of being able to easily install other Kubernetes in an add-on format, in addition to the components we describe.\n\nIn this *MLOps for ALL*, in order to use the components that will be built for MLOps smoothly, there are additional settings that must be configured when building the Kubernetes cluster using each of the tools.\n\nThe scope of this **Setup Kubernetes** section is to build a k8s cluster on a desktop that already has Ubuntu OS installed and to confirm that external client nodes can access the Kubernetes cluster.\n\nThe detailed setup procedure is composed of the following flow, as each of the three tools has its own setup procedure.\n```bash\n3. Setup Prerequisite\n4. Setup Kubernetes\n  4.1. with k3s\n  4.2. with minikube\n  4.3. 
with kubeadm\n5. Setup Kubernetes Modules\n```\n\nLet's now build a Kubernetes cluster using one of these tools. You don't have to use all of them; pick the tool you are most familiar with.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current/setup-kubernetes/setup-nvidia-gpu.md",
    "content": "---\ntitle: \"6. (Optional) Setup GPU\"\ndescription: \"Install nvidia docker, nvidia device plugin\"\nsidebar_position: 6\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nFor using GPU in Kubernetes and Kubeflow, the following tasks are required.\n\n## 1. Install NVIDIA Driver\n\nIf the following screen is output when executing `nvidia-smi`, please omit this step.\n\n  ```bash\n  mlops@ubuntu:~$ nvidia-smi \n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     7W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  |    0   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                198MiB |\n  |    0   N/A  N/A      1893      G   /usr/bin/gnome-shell               10MiB |\n  |    1   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                  4MiB |\n  +-----------------------------------------------------------------------------+\n  ```\n\nIf the output of nvidia-smi is not as above, please install the nvidia driver that fits your installed GPU.\n\nIf you are not familiar with the installation of nvidia drivers, please install it through the following command.\n\n  ```bash\n  sudo add-apt-repository ppa:graphics-drivers/ppa\n  sudo apt update && sudo apt install -y ubuntu-drivers-common\n  sudo ubuntu-drivers autoinstall\n  sudo reboot\n  ```\n\n## 2. Install NVIDIA-Docker.\n\nLet's install NVIDIA-Docker.\n\n```bash\ncurl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \\\n  sudo apt-key add -\ndistribution=$(. 
/etc/os-release;echo $ID$VERSION_ID)\ncurl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list\nsudo apt-get update\nsudo apt-get install -y nvidia-docker2 &&\nsudo systemctl restart docker\n```\n\nTo check if it is installed correctly, we will run the docker container using the GPU.\n\n```bash\nsudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n```\n\nIf the following message appears, it means that the installation was successful: \n\n  ```bash\n  mlops@ubuntu:~$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     6W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  +-----------------------------------------------------------------------------+\n  ```\n\n## 3. Setting NVIDIA-Docker as the Default Container Runtime\n\nBy default, Kubernetes uses Docker-CE as the default container runtime. To use NVIDIA GPU within Docker containers, you need to configure NVIDIA-Docker as the container runtime and modify the default runtime for creating pods.\n\n1. Open the `/etc/docker/daemon.json` file and make the following modifications:\n\n  ```bash\n  sudo vi /etc/docker/daemon.json\n\n  {\n    \"default-runtime\": \"nvidia\",\n    \"runtimes\": {\n        \"nvidia\": {\n            \"path\": \"nvidia-container-runtime\",\n            \"runtimeArgs\": []\n    }\n    }\n  }\n  ```\n\n2. After confirming the file changes, restart Docker.\n\n  ```bash\n  sudo systemctl daemon-reload\n  sudo service docker restart\n  ```\n\n3. 
Verify that the changes have been applied.\n\n  ```bash\n  sudo docker info | grep nvidia\n  ```\n\n  If you see the following message, it means that the installation was successful.\n\n  ```bash\n  mlops@ubuntu:~$ docker info | grep nvidia\n  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc\n  Default Runtime: nvidia\n  ```\n\n## 4. NVIDIA-Device-Plugin\n\n1. Create the nvidia-device-plugin daemonset.\n\n  ```bash\n  kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.10.0/nvidia-device-plugin.yml\n  ```\n\n2. Verify that the nvidia-device-plugin pod is in the RUNNING state.\n\n  ```bash\n  kubectl get pod -n kube-system | grep nvidia\n  ```\n\nYou should see the following output:\n\n  ```bash\n  kube-system   nvidia-device-plugin-daemonset-nlqh2   1/1     Running   0    1h\n  ```\n\n3. Verify that the nodes have GPUs available.\n\n  ```bash\n  kubectl get nodes \"-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\\\.com/gpu\"\n  ```\n\n  If you see the following message, it means that the configuration was successful.  \n  (In the *MLOps for ALL* tutorial cluster, there are two GPUs, so the output is 2.\n  If the output shows the correct number of GPUs for your cluster, it is fine.)\n\n  ```bash\n  NAME       GPU\n  ubuntu     2\n  ```\n\n  If it is not configured, the GPU value will be displayed as `<None>`.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/current.json",
    "content": "{\n  \"version.label\": {\n    \"message\": \"Next\",\n    \"description\": \"The label for version current\"\n  },\n  \"sidebar.tutorialSidebar.category.Introduction\": {\n    \"message\": \"Introduction\",\n    \"description\": \"The label for category Introduction in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Prerequisites\": {\n    \"message\": \"Prerequisites\",\n    \"description\": \"The label for category Prerequisites in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Docker\": {\n    \"message\": \"Docker\",\n    \"description\": \"The label for category Docker in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Kubernetes\": {\n    \"message\": \"Setup Kubernetes\",\n    \"description\": \"The label for category Setup Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.4. Install Kubernetes\": {\n    \"message\": \"4. Install Kubernetes\",\n    \"description\": \"The label for category 4. 
Install Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Components\": {\n    \"message\": \"Setup Components\",\n    \"description\": \"The label for category Setup Components in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow UI Guide\": {\n    \"message\": \"Kubeflow UI Guide\",\n    \"description\": \"The label for category Kubeflow UI Guide in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow\": {\n    \"message\": \"Kubeflow\",\n    \"description\": \"The label for category Kubeflow in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.API Deployment\": {\n    \"message\": \"API Deployment\",\n    \"description\": \"The label for category API Deployment in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Further Readings\": {\n    \"message\": \"Further Readings\",\n    \"description\": \"The label for category Further Readings in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Appendix\": {\n    \"message\": \"Appendix\",\n    \"description\": \"The label for category Appendix in sidebar tutorialSidebar\"\n  },\n  \"sidebar.preSidebar.category.Docker\": {\n    \"message\": \"Docker\",\n    \"description\": \"The label for category Docker in sidebar preSidebar\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/_category_.json",
    "content": "{\n  \"label\": \"API Deployment\",\n  \"position\": 7,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/seldon-children.md",
    "content": "---\ntitle : \"6. Multi Models\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\nPreviously, the methods explained were all targeted at a single model. On this page, we will look at how to connect multiple models. \n\nFirst, we will create a pipeline that creates two models. We will add a StandardScaler to the SVC model we used before and store it.\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_scaler_from_csv(\n    data_path: InputPath(\"csv\"),\n    scaled_data_path: OutputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n):\n    import dill\n    import pandas as pd\n    from sklearn.preprocessing import StandardScaler\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    data = pd.read_csv(data_path)\n\n    scaler = StandardScaler()\n    scaled_data = scaler.fit_transform(data)\n    scaled_data = pd.DataFrame(scaled_data, columns=data.columns, index=data.index)\n\n    scaled_data.to_csv(scaled_data_path, index=False)\n\n    with open(model_path, mode=\"wb\") 
as file_writer:\n        dill.dump(scaler, file_writer)\n\n    input_example = data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(data, scaler.transform(data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_svc_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        
dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"multi_model_pipeline\")\ndef multi_model_pipeline(kernel: str = \"rbf\"):\n    iris_data = load_iris_data()\n    scaled_data = train_scaler_from_csv(data=iris_data.outputs[\"data\"])\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"scaler\",\n        model=scaled_data.outputs[\"model\"],\n        input_example=scaled_data.outputs[\"input_example\"],\n        
signature=scaled_data.outputs[\"signature\"],\n        conda_env=scaled_data.outputs[\"conda_env\"],\n    )\n    model = train_svc_from_csv(\n        train_data=scaled_data.outputs[\"scaled_data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"svc\",\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(multi_model_pipeline, \"multi_model_pipeline.yaml\")\n\n```\n\nIf you upload the pipeline, it will look like this.\n![children-kubeflow.png](./img/children-kubeflow.png)\n\nWhen you check the MLflow dashboard, two models will be generated, as shown below. \n\n![children-mlflow.png](./img/children-mlflow.png)\n\nAfter checking the run_id of each one, define the SeldonDeployment spec as follows.\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - 
\"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\n```\nTwo models have been created so each model's initContainer and container must be defined. This field takes input as an array and the order does not matter. 
The order in which the models are executed is defined in the graph.\n```bash\ngraph:\n  name: scaler\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  - name: predict_method\n    type: STRING\n    value: \"transform\"\n  children:\n  - name: svc\n    type: MODEL\n    parameters:\n    - name: model_uri\n      type: STRING\n      value: \"/mnt/models\"\n```\n\nThe operation of the graph is to convert the initial value received into a predefined predict_method and then pass it to the model defined as children. In this case, the data is passed from scaler -> svc.\n\nNow let's create the above specifications in a yaml file.\n\n```bash\ncat <<EOF > multi-model.yaml\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n   
       image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\nEOF\n```\n\nCreate an API through the following command.\n```bash\nkubectl apply -f multi-model.yaml\n```\n\nIf properly performed, it will be outputted as follows.\n```bash\nseldondeployment.machinelearning.seldon.io/multi-model-example created\n```\n\nCheck to see if it has been generated normally.\n```bash\nkubectl get po -n kubeflow-user-example-com | grep multi-model-example\n```\n\nIf it is created normally, a similar pod will be created.\n```bash\nmulti-model-example-model-0-scaler-svc-9955fb795-n9ffw   4/4     Running     0          2m30s\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/seldon-fields.md",
    "content": "---\ntitle : \"4. Seldon Fields\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\nSummary of how Seldon Core creates an API server:\n\n1. initContainer downloads the required model from the model repository.\n2. The downloaded model is passed to the container.\n3. The container runs an API server enclosing the model.\n4. The API can be requested at the generated API server address to receive the inference values from the model.\n\nThe yaml file defining the custom resource, SeldonDeployment, which is most commonly used when using Seldon Core is as follows:\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n\n        containers:\n        - name: model\n          image: seldonio/sklearnserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n\n```\n\nThe `name` and `predictors` fields of SeldonDeployment are required fields. `name` is mainly used as a name to differentiate pods in Kubernetes and does not have a major effect. 
`predictors` must be an array with a single element defining `name`, `componentSpecs`, and `graph`. Here too, `name` is mainly used to distinguish pods in Kubernetes and has no major effect.\n\nNow let's take a look at the fields that need to be defined in `componentSpecs` and `graph`.\n\n## componentSpecs\n\n`componentSpecs` must be an array with a single element containing the `spec` key. The `spec` must have the fields `volumes`, `initContainers` and `containers` defined.\n\n### volumes\n\n```bash\nvolumes:\n- name: model-provision-location\n  emptyDir: {}\n```\n`volumes` defines the space used to store the models downloaded by the initContainer; it is an array whose entries contain `name` and `emptyDir`. These values are used only while downloading and moving the models, so they rarely need to be modified.\n\n### initContainers\n\n```bash\n- name: model-initializer\n  image: gcr.io/kfserving/storage-initializer:v0.4.0\n  args:\n    - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n    - \"/mnt/models\"\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\nThe initContainer is responsible for downloading the model to be served by the API, so its fields specify the information needed to download the model from the model registry and move it to the specified model path.\n\n`initContainers` is an array with one entry per model, so each model must be specified separately.\n\n#### name\n`name` is the name of the pod in Kubernetes, and it is recommended to use `{model_name}-initializer` for debugging.
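\n\nFor example (a hypothetical sketch; the model names `scaler` and `svc` here are illustrative, not taken from the manifest above), a deployment serving two models would define one initContainer per model, each following this naming convention:\n\n```bash\ninitContainers:\n- name: scaler-initializer\n  image: gcr.io/kfserving/storage-initializer:v0.4.0\n- name: svc-initializer\n  image: gcr.io/kfserving/storage-initializer:v0.4.0\n```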
\n\n#### image\n\n`image` is the name of the image used to download the model. There are two recommended images:\n- gcr.io/kfserving/storage-initializer:v0.4.0\n- seldonio/rclone-storage-initializer:1.13.0-dev\n\nFor more detailed information, please refer to the following resources:\n\n- [kfserving](https://docs.seldon.io/projects/seldon-core/en/latest/servers/kfserving-storage-initializer.html)\n- [rclone](https://github.com/SeldonIO/seldon-core/tree/master/components/rclone-storage-initializer)\n\nIn MLOps for ALL, we use kfserving for downloading and storing models.\n\n#### args\n\n```bash\nargs:\n  - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n  - \"/mnt/models\"\n```\n\nWhen the gcr.io/kfserving/storage-initializer:v0.4.0 Docker image is run, it takes its arguments as an array. The first value is the address of the model to be downloaded. The second value is the path where the downloaded model will be stored (Seldon Core usually stores it in `/mnt/models`).\n\n#### volumeMounts\n\n```bash\nvolumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\n\n`volumeMounts` mounts the volume defined in `volumes` into the container so that `/mnt/models` is shared. For more information, refer to the [Kubernetes Volume](https://kubernetes.io/docs/concepts/storage/volumes/) documentation.\n\n### containers\n\n```bash\ncontainers:\n- name: model\n  image: seldonio/sklearnserver:1.8.0-dev\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n    readOnly: true\n  securityContext:\n    privileged: true\n    runAsUser: 0\n    runAsGroup: 0\n```\n\n`containers` defines the configuration used when the model is run as an API.\n\n#### name\n\nThe `name` field refers to the name of the pod in Kubernetes. It should be the name of the model being used.\n\n#### image\n\nThe `image` field represents the image used to convert the model into an API.\n
The image should have all the packages needed to load the model installed.\n\nSeldon Core provides official images for different types of models, including:\n\n- seldonio/sklearnserver\n- seldonio/mlflowserver\n- seldonio/xgboostserver\n- seldonio/tfserving\n\nYou can choose the appropriate image based on the type of model you are using.\n\n#### volumeMounts\n\n```bash\nvolumeMounts:\n- mountPath: /mnt/models\n  name: model-provision-location\n  readOnly: true\n```\n\nThis field specifies the path where the model downloaded by the initContainer is located. `readOnly: true` is also set to prevent the model from being modified.\n\n#### securityContext\n\n```bash\nsecurityContext:\n  privileged: true\n  runAsUser: 0\n  runAsGroup: 0\n```\n\nThe pod may fail to install required packages due to insufficient permissions, so root permission is granted here (note that this can cause security issues in a production service).\n\n## graph\n\n```bash\ngraph:\n  name: model\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  children: []\n```\n\nThis field defines the order in which the models run.\n\n### name\n\nThe `name` field refers to the name of the model graph. It should match the name defined in the container.\n\n### type\n\nThe `type` field can have four different values:\n\n1. TRANSFORMER\n2. MODEL\n3. OUTPUT_TRANSFORMER\n4. ROUTER\n\nFor detailed explanations of each type, you can refer to the [Seldon Core Complex Graphs Metadata Example](https://docs.seldon.io/projects/seldon-core/en/latest/examples/graph-metadata.html).\n\n### parameters\n\nThe `parameters` field contains the values passed to the serving class's `__init__`.\n
For the sklearnserver, you can find the required values in the [following file](https://github.com/SeldonIO/seldon-core/blob/master/servers/sklearnserver/sklearnserver/SKLearnServer.py).\n```python\nclass SKLearnServer(SeldonComponent):\n    def __init__(self, model_uri: str = None, method: str = \"predict_proba\"):\n```\n\nAs the code shows, `model_uri` and `method` can be defined.\n\n### children\n\nThe `children` field is used to define the sequence in which models are connected. More details about this field will be explained on the following page.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/seldon-iris.md",
    "content": "---\ntitle : \"2. Deploy SeldonDeployment\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\", \"SeungTae Kim\"]\n---\n\n## Deploy with SeldonDeployment\n\nLet's deploy our trained model as an API using SeldonDeployment. SeldonDeployment is a custom resource definition (CRD) defined to deploy models as REST/gRPC servers on Kubernetes.\n\n#### 1. Prerequisites\n\nWe will conduct the SeldonDeployment related practice in a new namespace called seldon-deploy. After creating the namespace, set seldon-deploy as the current namespace.\n\n```bash\nkubectl create namespace seldon-deploy\nkubectl config set-context --current --namespace=seldon-deploy\n```\n\n### 2. Define Spec\n\nGenerate a yaml file to deploy SeldonDeployment. \nIn this page, we will use a publicly available iris model.\nBecause this iris model is trained through the sklearn framework, we use SKLEARN_SERVER.\n\n```bash\ncat <<EOF > iris-sdep.yaml\napiVersion: machinelearning.seldon.io/v1alpha2\nkind: SeldonDeployment\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\nspec:\n  name: iris\n  predictors:\n  - graph:\n      children: []\n      implementation: SKLEARN_SERVER\n      modelUri: gs://seldon-models/v1.12.0-dev/sklearn/iris\n      name: classifier\n    name: default\n    replicas: 1\nEOF\n```\n\nDeploy yaml file.\n\n```bash\nkubectl apply -f iris-sdep.yaml\n```\n\nCheck if the deployment was successful through the following command.\n\n```bash\nkubectl get pods --selector seldon-app=sklearn-default -n seldon-deploy\n```\n\nIf everyone runs, similar results will be printed.\n\n```bash\nNAME                                            READY   STATUS    RESTARTS   AGE\nsklearn-default-0-classifier-5fdfd7bb77-ls9tr   2/2     Running   0          5m\n```\n\n## Ingress URL\n\nNow, send a inference request to the deployed model to get the inference result. 
The API created by the SeldonDeployment follows this rule:\n`http://{NODE_IP}:{NODE_PORT}/seldon/{namespace}/{seldon-deployment-name}/api/v1.0/{method-name}/`\n\n### NODE_IP / NODE_PORT\n\n[Since Seldon Core was installed with Ambassador as the Ingress Controller](../setup-components/install-components-seldon.md), all APIs created by SeldonDeployment can be requested through the Ambassador Ingress gateway.\n\nTherefore, first set the address of the Ambassador Ingress Gateway as environment variables.\n\n```bash\nexport NODE_IP=$(kubectl get nodes -o jsonpath='{ $.items[*].status.addresses[?(@.type==\"InternalIP\")].address }')\nexport NODE_PORT=$(kubectl get service ambassador -n seldon-system -o jsonpath=\"{.spec.ports[0].nodePort}\")\n```\n\nCheck the configured values.\n\n```bash\necho \"NODE_IP\"=$NODE_IP\necho \"NODE_PORT\"=$NODE_PORT\n```\n\nThe output should look similar to the following; if the cluster runs in the cloud, you will see the internal IP address here.\n\n```bash\nNODE_IP=192.168.0.19\nNODE_PORT=30486\n```\n\n### namespace / seldon-deployment-name\n\nThese are the `namespace` where the SeldonDeployment is deployed and its name, that is, the values defined in the metadata of the spec.\n\n```bash\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\n```\n\nIn the example above, `namespace` is seldon-deploy and `seldon-deployment-name` is sklearn.\n\n### method-name\n\nIn SeldonDeployment, the commonly used `method-name` has two options:\n\n1. doc\n2. predictions\n\nThe detailed usage of each method is explained below.\n\n## Using Swagger\n\nFirst, let's explore how to use the doc method, which gives access to the Swagger documentation generated by Seldon.\n\n### 1. Accessing Swagger\n\nAccording to the ingress URL rule above, you can access the Swagger documentation at the following URL:\n`http://192.168.0.19:30486/seldon/seldon-deploy/sklearn/api/v1.0/doc/`\n\n![iris-swagger1.png](./img/iris-swagger1.png)\n\n### 2. 
Selecting Swagger Predictions\n\nIn the Swagger UI, select the `/seldon/seldon-deploy/sklearn/api/v1.0/predictions` endpoint.\n\n![iris-swagger2.png](./img/iris-swagger2.png)\n\n### 3. Choosing *Try it out*\n\n![iris-swagger3.png](./img/iris-swagger3.png)\n\n### 4. Inputting data in the Request body\n\n![iris-swagger4.png](./img/iris-swagger4.png)\n\nEnter the following data into the Request body.\n\n```bash\n{\n  \"data\": {\n    \"ndarray\":[[1.0, 2.0, 5.0, 6.0]]\n  }\n}\n```\n\n### 5. Check the inference results\n\nYou can click the `Execute` button to obtain the inference result.\n\n![iris-swagger5.png](./img/iris-swagger5.png)\n\nIf everything is executed successfully, you will obtain the following inference result.\n\n```bash\n{\n  \"data\": {\n    \"names\": [\n      \"t:0\",\n      \"t:1\",\n      \"t:2\"\n    ],\n    \"ndarray\": [\n      [\n        9.912315378486697e-7,\n        0.0007015931307746079,\n        0.9992974156376876\n      ]\n    ]\n  },\n  \"meta\": {\n    \"requestPath\": {\n      \"classifier\": \"seldonio/sklearnserver:1.11.2\"\n    }\n  }\n}\n```\n\n## Using CLI\n\nYou can also use HTTP client CLI tools such as curl to make API requests.\nFor example, request `/predictions` as follows.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\nYou can confirm that the following response is returned.\n\n```bash\n{\"data\":{\"names\":[\"t:0\",\"t:1\",\"t:2\"],\"ndarray\":[[0.0006985194531162835,0.00366803903943666,0.995633441507447]]},\"meta\":{\"requestPath\":{\"classifier\":\"seldonio/sklearnserver:1.11.2\"}}}\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/seldon-mlflow.md",
"content": "---\ntitle : \"5. Model from MLflow\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Model from MLflow\n\nOn this page, we will learn how to create an API using a model saved in the [MLflow Component](../kubeflow/advanced-mlflow.md).\n\n## Secret\n\nThe initContainer needs credentials to access minio and download the model. The secret holding the minio credentials looks as follows.\n\n```bash\napiVersion: v1\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ntype: Opaque\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n  USE_SSL: ZmFsc2U=\n```\n\nThe input value for `AWS_ACCESS_KEY_ID` is `minio`. However, since the values in a Secret's data field must be base64-encoded, what is actually entered is the encoded form produced by the command below.\n\nThe plain values that need to be entered in data are as follows.\n\n- AWS_ACCESS_KEY_ID: minio\n- AWS_SECRET_ACCESS_KEY: minio123\n- AWS_ENDPOINT_URL: http://minio-service.kubeflow.svc:9000\n- USE_SSL: false\n\nThe encoding can be done using the following command.\n\n```bash\necho -n minio | base64\n```\n\nThen the following value will be output.\n\n```bash\nbWluaW8=\n```\n\nEncoding every value in the same way gives:\n\n- AWS_ACCESS_KEY_ID: bWluaW8=\n- AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n- AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n- USE_SSL: ZmFsc2U=\n\nYou can generate a yaml file through the following command to create the secret.\n\n```bash\ncat <<EOF > seldon-init-container-secret.yaml\napiVersion: v1\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ntype: Opaque\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n  USE_SSL: ZmFsc2U=\nEOF\n```\n\nCreate the secret through the following command.\n\n```bash\nkubectl apply -f seldon-init-container-secret.yaml\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\nsecret/seldon-init-container-secret created\n```\n\n## Seldon Core yaml\n\nNow let's write the yaml file to create the SeldonDeployment.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n```\n\nThere are two major changes compared to the previously created [Seldon Fields](../api-deployment/seldon-fields.md):\n\n1. The `envFrom` field is added to the initContainer.\n2. The address in the args has been changed to `s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc`.\n\n### args\n\nPreviously, we mentioned that the first element of the args array is the path to the model we want to download. So, how can we determine the path of the model stored in MLflow?\n\nTo find the path, go back to MLflow and click on the run, then click on the model, as shown below:\n\n![seldon-mlflow-0.png](./img/seldon-mlflow-0.png)\n\nYou can use the path obtained from there.\n\n### envFrom\n\nThis provides the environment variables required to access MinIO and download the model. We will use the `seldon-init-container-secret` created earlier.\n\n## API Creation\n\nFirst, let's generate the YAML file based on the specification defined above.\n\n```bash\ncat <<EOF > seldon-mlflow.yaml\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: xtype\n        type: STRING\n        value: \"dataframe\"\n      children: []\nEOF\n```\n\nCreate the seldon pod.\n\n```bash\nkubectl apply -f seldon-mlflow.yaml\n```\n\nIf it runs normally, the output will be as follows.\n\n```bash\nseldondeployment.machinelearning.seldon.io/seldon-example created\n```\n\nNow we wait until the pod is up and running properly.\n\n```bash\nkubectl get po -n kubeflow-user-example-com | grep seldon\n```\n\nIf the output looks similar to the following, the API has been created normally.\n\n```bash\nseldon-example-model-0-model-5c949bd894-c5f28      3/3     Running     0          69s\n```\n\nYou can verify the created API by sending the following request via the CLI.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/kubeflow-user-example-com/seldon-example/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{\n    \"data\": {\n        \"ndarray\": [\n            [\n                143.0,\n                0.0,\n                30.0,\n                30.0\n            ]\n        ],\n        \"names\": [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n            \"petal width (cm)\"\n        ]\n    }\n}'\n```\n\nIf executed normally, you can get the following results.\n\n```bash\n{\"data\":{\"names\":[],\"ndarray\":[\"Virginica\"]},\"meta\":{\"requestPath\":{\"model\":\"ghcr.io/mlops-for-all/mlflowserver:e141f57\"}}}\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/seldon-pg.md",
"content": "---\ntitle : \"3. Seldon Monitoring\"\ndescription: \"Checking Prometheus & Grafana\"\nsidebar_position: 3\ndate: 2021-12-24\nlastmod: 2021-12-24\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Grafana & Prometheus\n\nNow, let's send repeated API requests to the SeldonDeployment we created on the [previous page](../api-deployment/seldon-iris.md) and check whether the dashboard changes.\n\n### Dashboard\n\n[Port-forward the dashboard created earlier](../setup-components/install-components-pg.md).\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\n### Request API\n\nSend **repeated** requests to the [previously created SeldonDeployment](../api-deployment/seldon-iris.md#using-cli).\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\nWhen checking the Grafana dashboard, you can observe that the Global Request Rate momentarily increases from `0 ops`.\n\n![repeat-raise.png](./img/repeat-raise.png)\n\nThis confirms that Prometheus and Grafana have been successfully installed and configured.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/api-deployment/what-is-api-deployment.md",
"content": "---\ntitle : \"1. What is API Deployment?\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## What is API Deployment?\n\nAfter training a machine learning model, how should it be used? During training, the goal is to produce a model with the best possible performance; at inference time, the goal is to get results from the trained model quickly and easily.\n\nTo check the inference results of a model, you can load the trained model and run inference in a Jupyter notebook or a Python script. However, this approach becomes inefficient as the model gets bigger, and the model can only be used in the environment where the trained artifact exists, so it cannot be shared by many people.\n\nTherefore, when machine learning is used in actual services, the trained model is exposed through an API. The model is loaded only once in the environment where the API server is running, anyone can get inference results via the server's DNS name, and the API can also be integrated with other services.\n\nHowever, a lot of ancillary work is needed to turn a model into an API. To make this easier, machine learning frameworks such as Tensorflow have developed inference engines.\n\nUsing inference engines, we can create APIs (REST or gRPC) that load and run inference on machine learning models developed and trained in the corresponding frameworks. 
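\n\nAs an illustration, a REST request to a TensorFlow Serving instance has the following shape (the host and the model name `iris` are hypothetical; TensorFlow Serving exposes `POST /v1/models/<name>:predict`):\n\n```bash\n# build the JSON body carrying the data we want to infer on\nPAYLOAD='{\"instances\": [[5.1, 3.5, 1.4, 0.2]]}'\necho \"$PAYLOAD\"\n# hypothetical endpoint; requires a running TensorFlow Serving instance:\n# curl -X POST http://localhost:8501/v1/models/iris:predict -d \"$PAYLOAD\"\n```\n\n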
When we send a request with the data we want to infer to an API server built using these inference engines, the engine performs the inference and sends back the results in the response.\n\nSome well-known open-source inference engines include:\n\n- [Tensorflow: Tensorflow Serving](https://github.com/tensorflow/serving)\n- [PyTorch: Torchserve](https://github.com/pytorch/serve)\n- [ONNX: ONNX Runtime](https://github.com/microsoft/onnxruntime)\n\nThere are also inference engines for popular frameworks like sklearn and XGBoost, although they are not maintained as part of those frameworks themselves.\n\nDeploying and serving a model's inference results through an API is called **API deployment**.\n\n## Serving Framework\n\nWe have seen that various inference engines have been developed. Now, if we want to deploy these inference engines in a Kubernetes environment for API deployment, what steps are involved? We need to deploy various Kubernetes resources: Deployments for the inference engines, Services to create endpoints for sending inference requests, and Ingresses to forward external inference requests to the engines. Additionally, we may need to handle requirements such as scaling out when there is a high volume of inference requests, monitoring the status of the inference engines, and updating the version when an improved model is available. Operating an inference engine involves many such considerations, well beyond a few one-off tasks.\n\nTo address these requirements, serving frameworks have been developed that further abstract the deployment of inference engines in a Kubernetes environment.\n\nSome popular serving frameworks include:\n\n- [Seldon Core](https://github.com/SeldonIO/seldon-core)\n- [KServe](https://github.com/kserve)\n- [BentoML](https://github.com/bentoml/BentoML)\n\nIn *MLOps for ALL*, we use Seldon Core to demonstrate the process of API deployment.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/appendix/_category_.json",
    "content": "{\n  \"label\": \"Appendix\",\n  \"position\": 9,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/appendix/metallb.md",
"content": "---\ntitle: \"2. Install load balancer metallb for Bare Metal Cluster\"\nsidebar_position: 2\n---\n\n## What is MetalLB?\n\nWhen using Kubernetes on cloud platforms such as AWS, GCP, and Azure, they provide their own load balancers. However, for on-premises clusters, an additional module needs to be installed to enable load balancing. [MetalLB](https://metallb.universe.tf/) is an open-source project that provides a load balancer for bare metal environments.\n\n## Requirements\n\n| Requirement                                                 | Version and Details                                          |\n| ----------------------------------------------------------- | ------------------------------------------------------------ |\n| Kubernetes                                                  | Version >= v1.13.0 without built-in load balancing            |\n| [Compatible Network CNI](https://metallb.universe.tf/installation/network-addons/) | Calico, Canal, Cilium, Flannel, Kube-ovn, Kube-router, Weave Net |\n| IPv4 addresses                                              | A range of addresses for MetalLB to hand out                 |\n| BGP mode                                                    | One or more routers that support BGP functionality           |\n| TCP/UDP port 7946 open between nodes                         | Memberlist requirement                                      |\n\n## Installing MetalLB\n\n### Preparation\n\nIf you are using kube-proxy in IPVS mode, starting from Kubernetes v1.14.2 you need to enable strict ARP mode.  \nBy default, Kube-router enables strict ARP, so this step is not required if you are using Kube-router as a service proxy.  
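\n\nThe strict ARP change below boils down to a `grep`/`sed` rewrite of one field in the kube-proxy ConfigMap. Here is a self-contained sketch of the same transformation applied to a mock ConfigMap snippet instead of the live cluster object (the snippet content is illustrative):\n\n```bash\n# mock kube-proxy config fragment standing in for the real ConfigMap\nMOCK_CONFIG='mode: \"ipvs\"\nstrictARP: false'\n# the same sed rewrite that is applied to the cluster object below\nPATCHED=$(echo \"$MOCK_CONFIG\" | sed -e \"s/strictARP: false/strictARP: true/\")\necho \"$PATCHED\" | grep strictARP\n```\n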
\nBefore applying strict ARP mode, check the current mode.\n\n```bash\n# see what changes would be made, returns nonzero returncode if different\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\ngrep strictARP\n```\n\n```bash\nstrictARP: false\n```\n\nIf `strictARP: false` is output, run the following to change it to `strictARP: true`.\n(If `strictARP: true` is already output, you do not need to execute the following command.)\n\n```bash\n# actually apply the changes, returns nonzero returncode on errors only\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\nsed -e \"s/strictARP: false/strictARP: true/\" | \\\nkubectl apply -f - -n kube-system\n```\n\nIf performed normally, it will be output as follows.\n\n```bash\nWarning: resource configmaps/kube-proxy is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.\nconfigmap/kube-proxy configured\n```\n\n### Installation - Manifest\n\n#### 1. Install MetalLB\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/namespace.yaml\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/metallb.yaml\n```\n\n#### 2. 
Check installation\n\nWait until both pods in the metallb-system namespace are Running.\n\n```bash\nkubectl get pod -n metallb-system\n```\n\nWhen everything is Running, similar results will be output.\n\n```bash\nNAME                          READY   STATUS    RESTARTS   AGE\ncontroller-7dcc8764f4-8n92q   1/1     Running   1          1m\nspeaker-fnf8l                 1/1     Running   1          1m\n```\n\nThe components of the manifest are as follows:\n\n- metallb-system/controller\n  - Deployed as a deployment, responsible for assigning external IP addresses for load balancing.\n- metallb-system/speaker\n  - Deployed as a daemonset, responsible for configuring network communication to connect external traffic and services.\n\nThe manifest also includes the RBAC permissions necessary for the controller and speaker components to operate.\n\n## Configuration\n\nThe load balancing policy of MetalLB is configured by deploying a configmap containing the related configuration information.\n\nThere are two modes that can be configured in MetalLB:\n\n1. [Layer 2 Mode](https://metallb.universe.tf/concepts/layer2/) \n2. [BGP Mode](https://metallb.universe.tf/concepts/bgp/) \n\nHere we will proceed with Layer 2 mode.\n\n### Layer 2 Configuration\n\nIn Layer 2 mode, it is enough to simply set the range of IP addresses to be used.  
\nWhen using Layer 2 mode, it is not necessary to bind an IP to the network interface of the worker node, because it works by responding directly to ARP requests on the local network and handing the machine's MAC address to clients.\n\nThe following `metallb_config.yaml` file configures MetalLB to control the IP range 192.168.35.100 ~ 192.168.35.110 and to use Layer 2 mode.\n\nIn case the cluster node and the client node are separated, the range 192.168.35.100 ~ 192.168.35.110 must be accessible by both the client node and the cluster node.\n\n#### metallb_config.yaml\n\n```bash\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  namespace: metallb-system\n  name: config\ndata:\n  config: |\n    address-pools:\n    - name: default\n      protocol: layer2\n      addresses:\n      - 192.168.35.100-192.168.35.110  # IP address range\n```\n\nApply the above settings.\n\n```bash\nkubectl apply -f metallb_config.yaml\n```\n\nIf deployed normally, it will output as follows.\n\n```bash\nconfigmap/config created\n```\n\n## Using MetalLB\n\n### Kubeflow Dashboard\n\nFirst, check the current status before changing the type of the istio-ingressgateway service in the istio-system namespace, which provides the Kubeflow Dashboard, to `LoadBalancer` so that it receives the load-balancing feature from MetalLB.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\nThe type of this service is ClusterIP and you can see that the External-IP value is `none`.\n\n```bash\nNAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                                        AGE\nistio-ingressgateway   ClusterIP   10.103.72.5   <none>        15021/TCP,80/TCP,443/TCP,31400/TCP,15443/TCP   4h21m\n```\n\nChange the type to LoadBalancer and, if you want to use a specific IP address, add the loadBalancerIP item.  
\nIf you do not add it, IP addresses will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nspec:\n  clusterIP: 10.103.72.5\n  clusterIPs:\n  - 10.103.72.5\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: status-port\n    port: 15021\n    protocol: TCP\n    targetPort: 15021\n  - name: http2\n    port: 80\n    protocol: TCP\n    targetPort: 8080\n  - name: https\n    port: 443\n    protocol: TCP\n    targetPort: 8443\n  - name: tcp\n    port: 31400\n    protocol: TCP\n    targetPort: 31400\n  - name: tls\n    port: 15443\n    protocol: TCP\n    targetPort: 15443\n  selector:\n    app: istio-ingressgateway\n    istio: ingressgateway\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.100   # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf you check again, you will see that the External-IP value is `192.168.35.100`.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nNAME                   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)                                                                      AGE\nistio-ingressgateway   LoadBalancer   10.103.72.5   192.168.35.100   15021:31054/TCP,80:30853/TCP,443:30443/TCP,31400:30012/TCP,15443:31650/TCP   5h1m\n```\n\nOpen a web browser and connect to [http://192.168.35.100](http://192.168.35.100) to verify that the following screen is displayed.\n\n![login-after-istio-ingressgateway-setting.png](./img/login-after-istio-ingressgateway-setting.png)\n\n### minio Dashboard\n\nFirst, check the current status before changing the type of the minio-service service in the kubeflow namespace, which provides the minio Dashboard, to LoadBalancer so that it receives the load-balancing function from MetalLB.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\nThe type of this service is ClusterIP and you can confirm that the 
External-IP value is `none`.\n\n```bash\nNAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE\nminio-service   ClusterIP   10.109.209.87   <none>        9000/TCP   5h14m\n```\n\nChange the type to LoadBalancer and, if you want to use a specific IP address, add the loadBalancerIP item. If you do not add it, the IP address will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/minio-service -n kubeflow\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    kubectl.kubernetes.io/last-applied-configuration: |\n      {\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"labels\":{\"application-crd-id\":\"kubeflow-pipelines\"},\"name\":\"minio-ser>\n  creationTimestamp: \"2022-01-05T08:44:23Z\"\n  labels:\n    application-crd-id: kubeflow-pipelines\n  name: minio-service\n  namespace: kubeflow\n  resourceVersion: \"21120\"\n  uid: 0053ee28-4f87-47bb-ad6b-7ad68aa29a48\nspec:\n  clusterIP: 10.109.209.87\n  clusterIPs:\n  - 10.109.209.87\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: http\n    port: 9000\n    protocol: TCP\n    targetPort: 9000\n  selector:\n    app: minio\n    application-crd-id: kubeflow-pipelines\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.101 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf we check again, we can see that the External-IP value is `192.168.35.101`.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\n```bash\nNAME            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE\nminio-service   LoadBalancer   10.109.209.87   192.168.35.101   9000:31371/TCP   5h21m\n```\n\nOpen a web browser and connect to [http://192.168.35.101:9000](http://192.168.35.101:9000) to confirm that the following screen is displayed. 
\n\n![login-after-minio-setting.png](./img/login-after-minio-setting.png)\n\n### mlflow Dashboard\n\nFirst, check the current status before changing the type of the mlflow-server-service service in the mlflow-system namespace, which provides the mlflow Dashboard, to LoadBalancer so that it receives the load-balancing function from MetalLB.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\nThe type of this service is ClusterIP and you can confirm that the External-IP value is `none`.\n\n```bash\nNAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE\nmlflow-server-service   ClusterIP   10.111.173.209   <none>        5000/TCP   4m50s\n```\n\nChange the type to LoadBalancer and, if you want to use a specific IP address, add the loadBalancerIP item.  \nIf you do not add it, the IP address will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: mlflow-server\n    meta.helm.sh/release-namespace: mlflow-system\n  creationTimestamp: \"2022-01-07T04:00:19Z\"\n  labels:\n    app.kubernetes.io/managed-by: Helm\n  name: mlflow-server-service\n  namespace: mlflow-system\n  resourceVersion: \"276246\"\n  uid: e5d39fb7-ad98-47e7-b512-f9c673055356\nspec:\n  clusterIP: 10.111.173.209\n  clusterIPs:\n  - 10.111.173.209\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - port: 5000\n    protocol: TCP\n    targetPort: 5000\n  selector:\n    app.kubernetes.io/name: mlflow-server\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.102 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf we check again, we can see that the External-IP value is `192.168.35.102`.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\nNAME                    TYPE           CLUSTER-IP      
 EXTERNAL-IP      PORT(S)          AGE\nmlflow-server-service   LoadBalancer   10.111.173.209   192.168.35.102   5000:32287/TCP   6m11s\n```\n\nOpen a web browser and connect to [http://192.168.35.102:5000](http://192.168.35.102:5000) to confirm that the following screen is displayed.\n\n![login-after-mlflow-setting.png](./img/login-after-mlflow-setting.png)\n\n### Grafana Dashboard\n\nFirst, check the current status before changing the type of the seldon-core-analytics-grafana service in the seldon-system namespace, which provides the Grafana Dashboard, to LoadBalancer so that it receives the load-balancing function from MetalLB.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\nThe type of this service is ClusterIP, and you can see that the External-IP value is `none`.\n\n```bash\nNAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE\nseldon-core-analytics-grafana   ClusterIP   10.109.20.161   <none>        80/TCP    94s\n```\n\nChange the type to LoadBalancer and, if you want to use a specific IP address, add the loadBalancerIP item.  
\nIf you do not add it, an IP address will be assigned sequentially from the IP address pool set above.\n\n```bash\nkubectl edit svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: seldon-core-analytics\n    meta.helm.sh/release-namespace: seldon-system\n  creationTimestamp: \"2022-01-07T04:16:47Z\"\n  labels:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/managed-by: Helm\n    app.kubernetes.io/name: grafana\n    app.kubernetes.io/version: 7.0.3\n    helm.sh/chart: grafana-5.1.4\n  name: seldon-core-analytics-grafana\n  namespace: seldon-system\n  resourceVersion: \"280605\"\n  uid: 75073b78-92ec-472c-b0d5-240038ea8fa5\nspec:\n  clusterIP: 10.109.20.161\n  clusterIPs:\n  - 10.109.20.161\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: service\n    port: 80\n    protocol: TCP\n    targetPort: 3000\n  selector:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/name: grafana\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.103 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\nIf you check again, you can see that the External-IP value is `192.168.35.103`.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\nNAME                            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE\nseldon-core-analytics-grafana   LoadBalancer   10.109.20.161   192.168.35.103   80:31191/TCP   5m14s\n```\n\nOpen a web browser and connect to [http://192.168.35.103:80](http://192.168.35.103:80) to confirm that the following screen is displayed.\n\n![login-after-grafana-setting.png](./img/login-after-grafana-setting.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/appendix/pyenv.md",
    "content": "---\ntitle: \"1. Install Python virtual environment\"\nsidebar_position: 1\n---\n\n## Python virtual environment\n\nWhen working with Python, there may be cases where you want to use multiple versions of Python environments or manage package versions separately for different projects.\n\nTo easily manage Python environments or Python package environments in a virtualized manner, there are tools available such as pyenv, conda, virtualenv, and venv.\n\nAmong these, *MLOps for ALL* covers the installation of [pyenv](https://github.com/pyenv/pyenv) and [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv).  \npyenv helps manage Python versions, while pyenv-virtualenv is a plugin for pyenv that helps manage Python package environments.\n\n## Installing pyenv\n\n### Prerequisites\n\nPrerequisites vary depending on the operating system. Please refer to the [following page](https://github.com/pyenv/pyenv/wiki#suggested-build-environment) and install the required packages accordingly.\n\n### Installation - macOS\n\n1. Install pyenv, pyenv-virtualenv\n\n```bash\nbrew update\nbrew install pyenv\nbrew install pyenv-virtualenv\n```\n\n2. 
Set pyenv\n\nOn macOS we assume zsh, since zsh has been the default shell since Catalina; configure pyenv as follows.\n\n```bash\necho 'eval \"$(pyenv init -)\"' >> ~/.zshrc\necho 'eval \"$(pyenv virtualenv-init -)\"' >> ~/.zshrc\nsource ~/.zshrc\n```\n\nCheck if the pyenv command is executed properly.\n\n```bash\npyenv --help\n```\n\n```bash\n$ pyenv --help\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for 
a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n### Installation - Ubuntu\n\n1. Install pyenv and pyenv-virtualenv\n\n```bash\ncurl https://pyenv.run | bash\n```\n\nIf output like the following is printed, the installation was successful.\n\n```bash\n  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n100   270  100   270    0     0    239      0  0:00:01  0:00:01 --:--:--   239\nCloning into '/home/mlops/.pyenv'...\n...\nSkip...\n...\nremote: Enumerating objects: 10, done.\nremote: Counting objects: 100% (10/10), done.\nremote: Compressing objects: 100% (6/6), done.\nremote: Total 10 (delta 1), reused 6 (delta 0), pack-reused 0\nUnpacking objects: 100% (10/10), 2.92 KiB | 2.92 MiB/s, done.\n\nWARNING: seems you still have not added 'pyenv' to the load path.\n\n\n# See the README for instructions on how to set up\n# your shell environment for Pyenv.\n\n# Load pyenv-virtualenv automatically by adding\n# the following to ~/.bashrc:\n\neval \"$(pyenv virtualenv-init -)\"\n\n```\n\n2. 
Set pyenv\n\nAssuming bash is your default shell, configure pyenv and pyenv-virtualenv for use in bash.\n\n```bash\nvi ~/.bashrc\n```\n\nAdd the following lines and save the file.\n\n```bash\nexport PATH=\"$HOME/.pyenv/bin:$PATH\"\neval \"$(pyenv init -)\"\neval \"$(pyenv virtualenv-init -)\"\n```\n\nRestart the shell.\n\n```bash\nexec $SHELL\n```\n\nCheck if the pyenv command is executed properly.\n\n```bash\npyenv --help\n```\n\nIf the following message is displayed, it means that the settings have been configured correctly.\n\n```bash\n$ pyenv\npyenv 2.2.2\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   doctor      Verify pyenv installation and development tools to build pythons.\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python 
versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n## Using pyenv\n\n### Install python version\n\nUsing the `pyenv install <Python-Version>` command, you can install the desired Python version.  \nOn this page, as an example, we will install Python 3.7.12, the version Kubeflow uses by default.\n\n```bash\npyenv install 3.7.12\n```\n\nIf the installation succeeds, the following message will be printed.\n\n```bash\n$ pyenv install 3.7.12\nDownloading Python-3.7.12.tar.xz...\n-> https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tar.xz\nInstalling Python-3.7.12...\npatching file Doc/library/ctypes.rst\npatching file Lib/test/test_unicode.py\npatching file Modules/_ctypes/_ctypes.c\npatching file Modules/_ctypes/callproc.c\npatching file Modules/_ctypes/ctypes.h\npatching file setup.py\npatching file 'Misc/NEWS.d/next/Core and Builtins/2020-06-30-04-44-29.bpo-41100.PJwA6F.rst'\npatching file Modules/_decimal/libmpdec/mpdecimal.h\nInstalled Python-3.7.12 to /home/mlops/.pyenv/versions/3.7.12\n```\n\n### Create python virtual environment\n\nUse the `pyenv virtualenv <Installed-Python-Version> <Virtual-Environment-Name>` command to create a Python virtual environment with the desired Python version.\n\nFor example, let's create a Python virtual environment called `demo` with Python 3.7.12.\n\n```bash\npyenv 
virtualenv 3.7.12 demo\n```\n\n```bash\n$ pyenv virtualenv 3.7.12 demo\nLooking in links: /tmp/tmpffqys0gv\nRequirement already satisfied: setuptools in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (47.1.0)\nRequirement already satisfied: pip in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (20.1.1)\n```\n\n### Activating python virtual environment\n\nTo use the virtual environment you created, run the `pyenv activate <Virtual-Environment-Name>` command.\n\nFor example, let's activate the Python virtual environment called `demo`.\n\n```bash\npyenv activate demo\n```\n\nYou can see that the name of the current virtual environment is printed at the front of the shell prompt.\n\n  Before\n\n  ```bash\n  mlops@ubuntu:~$ pyenv activate demo\n  ```\n\n  After\n\n  ```bash\n  pyenv-virtualenv: prompt changing will be removed from future release. configure `export PYENV_VIRTUALENV_DISABLE_PROMPT=1' to simulate the behavior.\n  (demo) mlops@ubuntu:~$ \n  ```\n\n### Deactivating python virtual environment\n\nYou can deactivate the currently active virtual environment with the `source deactivate` command.\n\n```bash\nsource deactivate\n```\n\n  Before\n\n  ```bash\n  (demo) mlops@ubuntu:~$ source deactivate\n  ```\n\n  After\n\n  ```bash\n  mlops@ubuntu:~$ \n  ```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/further-readings/_category_.json",
    "content": "{\n  \"label\": \"Further Readings\",\n  \"position\": 8,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/further-readings/info.md",
"content": "---\ntitle: \"Further Readings\"\ndate: 2021-12-21\nlastmod: 2021-12-21\n---\n\n## MLOps Component\n\nThe following diagram illustrates the components covered in [MLOps Concepts](../introduction/component.md).\n\n![open-stacks-0.png](./img/open-stacks-0.png)\n\nThe technology stacks covered in *Everyone's MLOps* are as follows.\n\n![open-stacks-1.png](./img/open-stacks-1.png)\n\n| Mgmt.                      | Component                   | Open Source                           |\n| -------------------------- | --------------------------- | ------------------------------------- |\n| | Storage | [Minio](https://min.io/)                            |\n| | Data Processing | [Apache Spark](https://spark.apache.org/)                             |\n| | Data Visualization | [Tableau](https://www.tableau.com/)                               |\n| Workflow Mgmt.             | Orchestration               | [Airflow](https://airflow.apache.org/)                              |\n| | Scheduling               | [Kubernetes](https://kubernetes.io/)                            |\n| Security & Compliance      | Authentication & Authorization | [Ldap](https://www.openldap.org/)                               |\n| | Data Encryption & Tokenization | [Vault](https://www.vaultproject.io/)                         |\n| | Governance & Auditing | [Open Policy Agent](https://www.openpolicyagent.org/)              |\n\nAs you can see, there are still many MLOps components that we have not covered yet. We could not cover them all due to time constraints, but if you need them, the following open-source projects are a good place to start.\n\n![open-stacks-2.png](./img/open-stacks-2.png)\n\nFor details:\n\n| Mgmt.                      | Component                   | Open Source                           |\n| -------------------------- | --------------------------- | ------------------------------------- |\n| Data Mgmt.                 
| Collection                  | [Kafka](https://kafka.apache.org/)                                 |\n|                            | Validation                  | [Beam](https://beam.apache.org/)                                  |\n|                            | Feature Store               | [Flink](https://flink.apache.org/)                                 |\n| ML Model Dev. & Experiment | Modeling                    | [Jupyter](https://jupyter.org/)                               |\n|                            | Analysis & Experiment Mgmt. | [MLflow](https://mlflow.org/)                                |\n|                            | HPO Tuning & AutoML         | [Katib](https://github.com/kubeflow/katib)                                 |\n| Deploy Mgmt.               | Serving Framework           | [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/index.html)                           |\n|                            | A/B Test                    | [Iter8](https://iter8.tools/)                                 |\n|                            | Monitoring                  | [Grafana](https://grafana.com/oss/grafana/), [Prometheus](https://prometheus.io/)                   |\n| Process Mgmt.              | pipeline                    | [Kubeflow](https://www.kubeflow.org/)                              |\n|                            | CI/CD                       | [Github Action](https://docs.github.com/en/actions)                         |\n|                            | Continuous Training         | [Argo Events](https://argoproj.github.io/events/)                           |\n| Platform Mgmt.             | Configuration Mgmt.         | [Consul](https://www.consul.io/)                                |\n|                            | Code Version Mgmt.          
| [Github](https://github.com/), [Minio](https://min.io/)                         |\n|                            | Logging                     | (EFK) [Elastic Search](https://www.elastic.co/kr/elasticsearch/), [Fluentd](https://www.fluentd.org/), [Kibana](https://www.elastic.co/kr/kibana/) |\n|                            | Resource Mgmt.              | [Kubernetes](https://kubernetes.io/)                            |\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/introduction/_category_.json",
    "content": "{\n  \"label\": \"Introduction\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/introduction/component.md",
    "content": "---\ntitle : \"3. Components of MLOps\"\ndescription: \"Describe MLOps Components\"\nsidebar_position: 3\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## Practitioners guide to MLOps\n\nGoogle's white paper [Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning] published in May 2021 mentions the following core functionalities of MLOps: \n\n![mlops-component](./img/mlops-component.png)\n\nLet's look at what each feature does.\n\n### 1. Experimentation\n\nExperimentation provides machine learning engineers with the following capabilities for data analysis, prototyping model development, and implementing training functionality:\n\n- Integration with version control tools like Git and a notebook (Jupyter Notebook) environment\n- Experiment tracking capabilities including data used, hyperparameters, and evaluation metrics\n- Data and model analysis and visualization capabilities\n\n### 2. Data Processing\n\nData Processing enables working with large volumes of data during the stages of model development, continuous training, and API deployment by providing the following functionalities:\n\n- Data connectors compatible with various data sources and services\n- Data encoders and decoders compatible with different data formats\n- Data transformation and feature engineering capabilities for different data types\n- Scalable batch and streaming data processing capabilities for training and serving\n\n### 3. Model Training\n\nModel Training offers functionalities to efficiently execute algorithms for model training:\n\n- Environment provisioning for ML framework execution\n- Distributed training environment for multiple GPUs and distributed training\n- Hyperparameter tuning and optimization capabilities\n\n### 4. 
Model Evaluation\n\nModel evaluation provides the following capabilities to observe the performance of models in both experimental and production environments:\n\n- Model performance evaluation on evaluation datasets\n- Tracking prediction performance across different continuous training runs\n- Comparison and visualization of performance between different models\n- Model output interpretation using interpretable AI techniques\n\n### 5. Model Serving\n\nModel serving offers functionalities to deploy and serve models in production environments:\n\n- Low-latency and high-availability inference capabilities\n- Support for various ML model serving frameworks (TensorFlow Serving, TorchServe, NVIDIA Triton, Scikit-learn, XGBoost, etc.)\n- Advanced inference routines, such as preprocessing or postprocessing, and multi-model ensembling for final results\n- Autoscaling capabilities to handle spiking inference requests\n- Logging of inference requests and results\n\n### 6. Online Experimentation\n\nOnline experimentation provides capabilities to validate the performance of newly generated models when deployed. This functionality should be integrated with a Model Registry to coordinate the deployment of new models.\n\n- Canary and shadow deployment features\n- A/B testing capabilities\n- Multi-armed bandit testing functionality\n\n### 7. Model Monitoring\n\nModel monitoring enables the monitoring of deployed models in production environments to ensure proper functioning and provides information on model performance degradation and the need for updates.\n\n### 8. 
ML Pipeline\n\nML Pipeline offers the following functionalities to configure, control, and automate complex ML training and inference workflows in production environments:\n\n- Pipeline execution through various event sources\n- ML metadata tracking and integration for pipeline parameter and artifact management\n- Support for built-in components for common ML tasks and user-defined components\n- Provisioning of different execution environments\n\n### 9. Model Registry\n\nThe Model Registry provides the capability to manage the lifecycle of machine learning models in a centralized repository.\n\n- Registration, tracking, and versioning of trained and deployed models\n- Storage of information about the required data and runtime packages for deployment\n\n### 10. Dataset and Feature Repository\n\n- Sharing, search, reuse, and versioning capabilities for datasets\n- Real-time processing and low-latency serving capabilities for event streaming and online inference tasks\n- Support for various types of data, such as images, text, and tabular data\n\n### 11. ML Metadata and Artifact Tracking\n\nIn each stage of MLOps, various artifacts are generated. ML metadata refers to the information about these artifacts. ML metadata and artifact management provide the following functionalities to manage the location, type, attributes, and associations with experiments:\n\n- History management for ML artifacts\n- Tracking and sharing of experiments and pipeline parameter configurations\n- Storage, access, visualization, and download capabilities for ML artifacts\n- Integration with other MLOps functionalities"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/introduction/intro.md",
"content": "---\ntitle : \"1. What is MLOps?\"\ndescription: \"Introduction to MLOps\"\nsidebar_position: 1\ndate: 2021-12-03\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Machine Learning Project\n\nSince 2012, when AlexNet was introduced, Machine Learning and Deep Learning have been adopted in virtually every domain where data exists, such as Computer Vision and Natural Language Processing. Deep Learning and Machine Learning came to be referred to collectively as AI, and many media outlets proclaimed the need for AI. Many companies, in turn, launched numerous projects using Machine Learning and Deep Learning. But what was the result? Byungchan Eum, the Head of North East Asia at Element AI, said, “If 10 companies start an AI project, 9 of them will only get as far as a proof of concept (POC)”.\n\nIn this way, in many projects, Machine Learning and Deep Learning merely showed that they could potentially solve a problem and then disappeared. Around this time, predictions that an [AI Winter was coming again](https://www.aifutures.org/2021/ai-winter-is-coming/) also began to emerge.\n\nWhy did most projects end at the proof-of-concept (POC) stage? Because it is impossible to operate an actual service with only Machine Learning and Deep Learning code.\n\nIn an actual service, the portion taken up by machine learning and deep learning code is much smaller than one might think, so many other aspects besides model performance must be considered. Google pointed out this problem in its 2015 paper [Hidden Technical Debt in Machine Learning Systems](https://proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf). However, at the time the paper was released, many ML engineers were busy proving the potential of deep learning and machine learning, so its points were not given much attention. 
\n\nA few years later, machine learning and deep learning had proven their potential, and people were now looking to apply them to actual services. However, many soon realized that building actual services was not as easy as they had thought.\n\n## DevOps\n\nMLOps is not a new concept, but rather a term derived from the development methodology called DevOps. Therefore, understanding DevOps can help in understanding MLOps.\n\n### DevOps\n\nDevOps is a portmanteau of \"Development\" and \"Operations,\" referring to a development and operations methodology that emphasizes communication, collaboration, and integration between software developers and IT professionals. It encompasses both the development and operation phases of software, aiming to achieve a symbiotic relationship between the two. The primary goal of DevOps is to enable organizations to develop and deploy software products and services rapidly by fostering close collaboration and interdependence between development and operations teams.\n\n### Silo Effect\n\nLet's explore why DevOps is necessary through a simple scenario.\n\nIn the early stages of a service, there are fewer supported features, and the team or company is relatively small. At this point, there may not be a clear distinction between development and operations, or the teams may be small. The key point here is the small scale. In such cases, there are many points of contact for effective communication, and with a limited number of services to focus on, it is possible to rapidly improve the service.\n\nHowever, as the service scales up, the development and operations teams tend to separate, and the physical limitations of communication channels become apparent. For example, in meetings involving multiple teams, only team leaders or a small number of seniors may attend, rather than the entire team. These limitations in communication channels inevitably lead to a lack of communication. 
Consequently, the development team continues to develop new features, while the operations team faces issues during deployment caused by the features developed by the development team.\n\nWhen such situations are repeated, it can lead to organizational silos, a phenomenon known as silo mentality.\n\n![silo](./img/silo.png)\n\n> Indeed, the term \"silo\" originally refers to a tall, cylindrical structure used for storing grain or livestock feed. Silos are designed to keep the stored materials separate and prevent them from mixing. \n> In the context of organizations, the \"silo effect\" or \"organizational silos effect\" refers to a phenomenon where departments or teams within an organization operate independently and prioritize their own interests without effective collaboration. It reflects a mentality where individual departments focus on building their own \"silos\" and solely pursue their own interests.\n\nThe silo effect can lead to a decline in service quality and hinder organizational performance. To address this issue, DevOps emerged as a solution. DevOps emphasizes collaboration, communication, and integration between development and operations teams, breaking down the barriers and fostering a culture of shared responsibility and collaboration. By promoting cross-functional teamwork and streamlining processes, DevOps aims to overcome silos and improve the efficiency and effectiveness of software development and operations.\n\n### CI/CD\n\nContinuous Integration (CI) and Continuous Delivery (CD) are concrete methods to break down the barriers between development teams and operations teams.\n\n![cicd](./img/cicd.png)\n\nThrough this method, the development team can understand the operational environment and check whether the features being developed can be seamlessly deployed. The operations team can deploy validated features or improved products more often to increase customer product experience. 
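
The CI/CD loop described here can be sketched as a minimal workflow configuration (the trigger, job name, and build commands below are illustrative assumptions, not a prescribed setup):

```yaml
# Minimal CI sketch: every push runs the same build and test steps,
# so development and operations validate changes through one shared process.
name: ci
on:
  push:
    branches:
      - main
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build
        run: make build   # hypothetical build command
      - name: Test
        run: make test    # hypothetical test command
```

Because every change passes through the same automated gate, the operations team can deploy validated artifacts frequently, which is the shared-responsibility loop CI/CD is meant to create.
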
In summary, DevOps is a methodology to solve the problem between development teams and operations teams.\n\n## MLOps\n\n### 1) ML + Ops\n\nDevOps is a methodology that addresses the challenges between development and operations teams, promoting collaboration and effective communication. By applying DevOps principles, development teams gain a better understanding of the operational environment, and the developed features can be seamlessly integrated and deployed. On the other hand, operations teams can deploy validated features or improved products more frequently, enhancing the overall customer experience.\n\nMLOps, which stands for Machine Learning Operations, extends the DevOps principles and practices specifically to the field of machine learning. In MLOps, the \"Dev\" in DevOps is replaced with \"ML\" to emphasize the unique challenges and considerations related to machine learning.\n\nMLOps aims to address the issues that arise between machine learning teams and operations teams. To understand these issues, let's consider an example using a recommendation system.\n\n#### Rule-Based Approach\n\nIn the initial stages of building a recommendation system, a simple rule-based approach may be used. For example, items could be recommended based on the highest sales volume in the past week. With this approach, there is no need for model updates unless there are specific reasons for modification.\n\n#### Machine Learning Approach\n\nAs the scale of the service grows and more log data accumulates, machine learning models can be developed based on item-based or user-based recommendations. In this case, the models are periodically retrained and redeployed.\n\n#### Deep Learning Approach\n\nWhen there is a greater demand for personalized recommendations and a need for models that deliver higher performance, deep learning models are developed. 
Similar to machine learning, these models are periodically retrained and redeployed.\n\nBy considering these examples, it becomes evident that challenges can arise between the machine learning team and the operations team. MLOps aims to address these challenges and provide a methodology and set of practices to facilitate the development, deployment, and operation of machine learning models in a collaborative and efficient manner.\n\n![graph](./img/graph.png)\n\nIf we represent the concepts explained earlier on a graph, with model complexity on the x-axis and model performance on the y-axis, we can observe an upward trend where the model performance improves as the complexity increases. This often leads to the emergence of separate machine learning teams specializing in transitioning from traditional machine learning to deep learning.\n\nIf there are only a few models to manage, collaboration between teams can be sufficient to address the challenges. However, as the number of models to develop increases, silos similar to those observed in DevOps can emerge.\n\nConsidering the goals of DevOps, we can understand the goals of MLOps as ensuring that the developed models can be deployed successfully. While DevOps focuses on verifying that the features developed by the development team can be deployed correctly, MLOps focuses on verifying that the models developed by the machine learning team can be deployed effectively.\n\n### 2) ML -> Ops\n\nHowever, recent MLOps-related products and explanations indicate that the goals are not limited to what was previously described. In some cases, the goal is to enable the machine learning team to directly operate and manage the models they develop. This need arises from the process of ongoing machine learning projects.\n\nIn the case of recommendation systems, it was possible to start with simple models in operations. 
However, in domains such as natural language processing and image analysis, it is common to first run a proof of concept (POC) to determine whether deep learning models can solve the given tasks. Once the POC is complete, the focus shifts to developing the operational environment for serving the models. Yet it may not be easy for the machine learning team to handle this challenge with its internal capabilities alone. This is where MLOps becomes necessary.\n\n### 3) Conclusion\n\nIn summary, MLOps has two main goals. The earlier explanation of MLOps focused on ML+Ops, aiming to enhance productivity and collaboration between the two teams. On the other hand, the latter explanation focused on ML -> Ops, aiming to enable the machine learning team to directly operate and manage their models.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/introduction/levels.md",
"content": "---\ntitle : \"2. Levels of MLOps\"\ndescription: \"Levels of MLOps\"\nsidebar_position: 2\ndate: 2021-12-03\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\"]\n\n---\n\nThis page looks at the levels of MLOps outlined by Google and explores what the core features of MLOps are.\n\n## Hidden Technical Debt in ML System\n\nGoogle has been talking about the need for MLOps since as far back as 2015. The paper Hidden Technical Debt in Machine Learning Systems encapsulates this idea from Google.\n\n![paper](./img/paper.png)\n\nThe key takeaway from this paper is that the machine learning code is only a small part of the entire system when it comes to building products with machine learning.\n\nGoogle developed MLOps by evolving this paper and expanding the term. More details can be found on the [Google Cloud homepage](https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning). In this post, we will try to explain what Google means by MLOps.\n\nGoogle divided the evolution of MLOps into three stages (levels 0-2). Before explaining each stage, let's review some of the concepts described in the previous post.\n\nIn order to operate a machine learning model, there is a machine learning team responsible for developing the model and an operations team responsible for deployment and operations. MLOps is needed for the successful collaboration of these two teams. We have previously said that it can be done simply through Continuous Integration (CI) / Continuous Deployment (CD), so let's see how CI/CD can be done.\n\n## Level 0: Manual Process\n\n![level-0](./img/level-0.png)\n\nAt level 0, the two teams communicate through a \"model\". The machine learning team trains the model with accumulated data and delivers the trained model to the operation team. 
The operation team then deploys the model delivered in this way.\n\n![toon](./img/toon.png)\n\nInitial machine learning models are deployed through this \"model\"-centered communication. However, there are several problems with this delivery method. For example, if some functions use Python 3.7 and some use Python 3.8, we often end up in situations like the one shown in the cartoon above.\n\nThe reason for this situation lies in the characteristics of the machine learning model. Three things are needed for the trained machine learning model to work:\n\n1. Python code\n2. Trained weights\n3. Environment (Packages, versions)\n\nIf any of these three is communicated incorrectly, the model may fail to function or make unexpected predictions. However, in many cases, models fail to work due to environmental mismatches. Machine learning relies on various open-source libraries, and due to the nature of open-source, even the same function can produce different results depending on the version used.\n\nIn the early stages of a service, when there are not many models to manage, these issues can be resolved quickly. However, as the number of managed models increases and communication becomes more challenging, it becomes difficult to deploy models with better performance quickly.\n\n## Level 1: Automated ML Pipeline\n\n### Pipeline\n\n![level-1-pipeline](./img/level-1-pipeline.png)\n\nSo, in MLOps, a \"pipeline\" is used to prevent such problems. The MLOps pipeline ensures that the model operates in the same environment as the one used by the machine learning engineer during model development, using containers like Docker. This helps prevent situations where the model doesn't work due to differences in the environment.\n\nHowever, the term \"pipeline\" is used in a broader context and in various tasks. What is the role of the pipeline that machine learning engineers create? The pipeline created by machine learning engineers produces trained models. 
Therefore, it would be more accurate to refer to it as a training pipeline rather than just a pipeline.\n\n### Continuous Training\n\n![level-1-ct.png](./img/level-1-ct.png)\n\nLevel 1 also adds the concept of Continuous Training (CT). So why is CT necessary?\n\n#### Auto Retrain\n\nIn the real world, data exhibits a characteristic called \"Data Shift,\" where the data distribution keeps changing over time. As a result, models trained in the past may experience performance degradation over time. The simplest and most effective solution to this problem is to retrain the model using recent data. By retraining the model on the changed data distribution, it can regain its performance.\n\n#### Auto Deploy\n\nHowever, in industries such as manufacturing, where multiple recipes are processed in a single factory, it may not always be desirable to retrain the model unconditionally. A common example is the blind spot.\n\nFor example, suppose an automotive production line trained a machine learning model for car model A and used it for predictions. When an entirely different car model B is introduced, its data patterns are unseen, so a new machine learning model is trained for car model B.\n\nNow, predictions are made for car model B. However, if production switches back to car model A, what should be done? \nIf there is only a retraining rule, a machine learning model for car model A will be trained again from scratch. However, machine learning models require a sufficient amount of data to reach satisfactory performance. The term \"blind spot\" refers to the period in which no usable model exists while enough data is being gathered.\n\nThere is a simple solution to this blind spot: check whether a model was previously trained for car model A and, if so, use that previous model for prediction instead of immediately training a new one. Automatically switching models in this way, using the meta-data associated with each model, is known as Auto Deploy.\n\nTo summarize, for Continuous Training (CT), both Auto Retrain and Auto Deploy are necessary. 
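The Auto Retrain / Auto Deploy decision described above can be sketched in a few lines of Python (the in-memory registry and the training function here are hypothetical stand-ins for real model meta-data storage and a real training job):

```python
# Hypothetical in-memory registry standing in for real model meta-data storage
registry = {}

def get_model(recipe_id, train_fn):
    # Auto Deploy: if a model was already trained for this recipe,
    # switch back to it instead of retraining and hitting a blind spot
    if recipe_id in registry:
        return registry[recipe_id]
    # Auto Retrain: no previous model exists for this recipe, so train one
    model = train_fn()
    registry[recipe_id] = model
    return model

model_a = get_model("car-model-A", lambda: object())  # first seen: trained
model_b = get_model("car-model-B", lambda: object())  # first seen: trained
reused = get_model("car-model-A", lambda: object())   # switch back: reused
assert reused is model_a
```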
They complement each other's weaknesses and enable the model's performance to be maintained continuously.\n\n## Level 2: Automating the CI/CD Pipeline\n\n![level-2](./img/level-2.png)\n\nLevel 2 is about automating CI and CD. In DevOps, the focus of CI/CD is on source code. So what is the focus of CI/CD in MLOps?\n\nIn MLOps, the focus of CI/CD is also on source code, but more specifically, it is the training pipeline.\n\nTherefore, when it comes to training models, it is important to verify whether the model is trained correctly (CI) and whether the trained model functions properly (CD) in response to any change that can impact the training process. Hence, CI/CD should be performed when there are direct modifications to the code used for training.\n\nIn addition to code, changes in the versions of the packages used and in the Python version are also part of CI/CD. In many cases, machine learning utilizes open-source packages. However, the internal logic of functions in open-source packages can change when their versions are updated. Although deprecation warnings are sometimes given for version updates, breaking changes between versions can go unnoticed. Therefore, when the versions of the packages used change, it is important to perform CI/CD to ensure that the model is still trained and functions correctly.\n\nIn summary, in MLOps, CI/CD focuses on the source code, particularly the training pipeline, to verify that the model is trained correctly and functions properly. This includes checking for direct code modifications as well as changes in package versions or the Python version, to ensure the integrity of the model's training and operation.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/introduction/why_kubernetes.md",
    "content": "---\ntitle : \"4. Why Kubernetes?\"\ndescription: \"Reason for using k8s in MLOps\"\nsidebar_position: 4\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## MLOps & Kubernetes\n\nWhen talking about MLOps, why is the word Kubernetes always heard together?\n\nTo build a successful MLOps system, various components are needed as described in [Components of MLOps](../introduction/component.md), but to operate them organically at the infrastructure level, there are many issues to be solved. For example, simply running a large number of machine learning model requests in order, ensuring the same execution environment in other workspaces, and responding quickly when a deployed service has a failure.\n\nThe need for containers and container orchestration systems appears here. With the introduction of container orchestration systems such as Kubernetes, efficient isolation and management of execution environments can be achieved. By introducing a container orchestration system, it is possible to prevent situations such as *'Is anyone using cluster 1?', 'Who killed my process that was using GPU?', 'Who updated the x package on the cluster?* when developing and deploying machine learning models while a few developers share a small number of clusters.\n\n## Container\n\nMicrosoft defines a container as follows: What is a container then? In Microsoft, a container is defined as [follows](https://azure.microsoft.com/en-us/overview/what-is-a-container/).\n\n> Container: Standardized, portable packaging of an application's code, libraries, and configuration files\n\nBut why is a container needed for machine learning? Machine learning models can behave differently depending on the operating system, Python execution environment, package version, etc. To prevent this, the technology used to share and execute the entire dependent execution environment with the source code used in machine learning is called containerization technology. 
This packaged form is called a container image, and by sharing the container image, users can ensure the same execution results on any system. In other words, by sharing not just the Jupyter Notebook file or the source code and requirements.txt file of the model, but the entire container image with the execution environment, you can avoid situations such as *\"It works on my notebook, why not yours?\"*.\n\nOne of the common misunderstandings that people who are new to containers often make is to assume that \"container == Docker\". Docker is not synonymous with containers; rather, it is a tool that makes containers easier and more flexible to use, providing features such as launching containers and creating and sharing container images. In summary, the container is a virtualization technology, and Docker is one implementation of that technology.\n\nHowever, among the various container virtualization tools, Docker quickly became mainstream thanks to its ease of use and high efficiency, so when people think of containers, they often automatically think of Docker. There are various reasons why the container and Docker ecosystem became mainstream, but we won't go into the technical details since they are outside the scope of MLOps for ALL.\n\n## Container Orchestration System\n\nThen what is a container orchestration system? As the word \"orchestration\" suggests, it can be compared to a conductor that coordinates numerous containers so that they work together harmoniously.\n\nIn container-based systems, services are provided to users in the form of containers. If the number of containers to be managed is small, a single operator can sufficiently handle all situations. 
However, if there are hundreds of containers running in dozens of clusters and they need to function continuously without failures, it becomes nearly impossible for a single operator to monitor all services and respond to issues.\n\nFor example, continuous monitoring is required to ensure that all services are functioning properly. If a specific service experiences a failure, the operator needs to investigate the problem by examining the logs of multiple containers. Additionally, they need to handle tasks such as scheduling and load balancing to prevent work overload on specific clusters or containers, as well as scaling operations.\n\nA container orchestration system is software that continuously and automatically manages and operates the states of numerous containers, making it considerably easier to manage and operate a large number of containers.\n\nHow can it be used in machine learning? For example, a container that packages deep learning training code requiring a GPU can be executed on a cluster with available GPUs, and a container that packages data preprocessing code requiring a large amount of memory can be executed on a cluster with ample memory. If there is an issue with a cluster during training, the system can automatically move the same container to a different cluster and continue the training. The goal is to develop such a system that automates management without requiring manual intervention.\n\nAs of the writing of this text in 2022, Kubernetes is considered the de facto standard for container orchestration systems.\n\nAccording to the [survey](https://www.cncf.io/blog/2018/08/29/cncf-survey-use-of-cloud-native-technologies-in-production-has-grown-over-200-percent/) released by CNCF in 2018, Kubernetes was already showing its prominence. 
The [survey](https://www.cncf.io/wp-content/uploads/2020/08/CNCF_Survey_Report.pdf) published in 2019 indicates that 78% of respondents were using Kubernetes at a production level.\n\n![k8s-graph](./img/k8s-graph.png)\n\nThe growth of the Kubernetes ecosystem can be attributed to various reasons. However, similar to Docker, Kubernetes is not exclusively limited to machine learning-based services. Since delving into detailed technical content would require a substantial amount of discussion, this edition of \"MLOps for ALL\" will omit the detailed explanation of Kubernetes.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow\",\n  \"position\": 6,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/advanced-component.md",
    "content": "---\ntitle : \"8. Component - InputPath/OutputPath\"\ndescription: \"\"\nsidebar_position: 8\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## Complex Outputs\n\nOn this page, we will write the code example from [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents) as a component.\n\n## Component Contents\n\nBelow is the component content used in [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents).\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target = pd.read_csv(train_target_path)\n\nclf = SVC(kernel=kernel)\nclf.fit(train_data, train_target)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n    dill.dump(clf, file_writer)\n```\n\n## Component Wrapper\n\n### Define a standalone Python function\n\nWith the necessary Configs for the Component Wrapper, it will look like this.\n\n```python\ndef train_from_csv(\n    train_data_path: str,\n    train_target_path: str,\n    model_path: str,\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\nIn the [Basic Usage Component]](../kubeflow/basic-component), we explained that you should provide type hints for input and output when describing. But what about complex objects such as dataframes, models, that cannot be used in json?\n\nWhen passing values between functions in Python, objects can be returned and their value will be stored in the host's memory, so the same object can be used in the next function. 
However, in Kubeflow, each component runs independently in its own container, that is, the components do not share the same memory, so you cannot pass objects in the same way as in a normal Python function. The only information that can be passed between components is in `json` format. Therefore, objects of types that cannot be converted into json format, such as a Model or DataFrame, must be passed in some other way.\n\nKubeflow solves this by storing the data in a file instead of memory, and then using the file to pass information. Since the path of the stored file is a string, it can be passed between components. However, in Kubeflow, the user does not know the path of the file before execution. For this, Kubeflow provides two special argument types for input and output paths: `InputPath` and `OutputPath`.\n\n`InputPath` literally means the input path, and `OutputPath` literally means the output path.\n\nFor example, in a component that generates and returns data, `data_path: OutputPath()` is declared as an argument. And in a component that receives data, `data_path: InputPath()` is declared as an argument.\n\nOnce these are declared, Kubeflow automatically generates and fills in the necessary paths when the components are connected in a pipeline. 
Therefore, users no longer need to worry about the paths and only need to consider the relationships between components.\n\nBased on this information, when rewriting the component wrapper, it would look like the following.\n\n```python\nfrom kfp.components import InputPath, OutputPath\n\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\nInputPath or OutputPath can accept a string. This string is the format of the file to be input or output.  \nHowever, it does not necessarily mean that the file has to be stored in this format.  \nIt just serves as a helper for type checking when compiling the pipeline.  
\nIf the file format is not fixed, then no input is needed (it serves the role of something like `Any` in type hints).\n\n### Convert to Kubeflow Format\n\nConvert the written component into a format that can be used in Kubeflow.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\n## Rule for using InputPath/OutputPath\n\nThere are rules to follow when using InputPath or OutputPath arguments in pipeline.\n\n### Load Data Component\n\nTo execute the previously written component, a component that generates data is created since data is required.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n```\n\n### Write Pipeline\n\nNow let's write the pipeline.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"complex_pipeline\")\ndef complex_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        
train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n```\n\nHave you noticed something strange?  \nAll the `_path` suffixes have disappeared from the arguments received in the input and output.  \nWe can see that instead of accessing `iris_data.outputs[\"data_path\"]`, we are accessing `iris_data.outputs[\"data\"]`.  \nThis happens because Kubeflow has a rule that paths created with `InputPath` and `OutputPath` can be accessed without the `_path` suffix when accessed from the pipeline.\n\nHowever, if you upload the pipeline just written, it will not run.  \nThe reason is explained on the next page.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/advanced-environment.md",
    "content": "---\ntitle : \"9. Component - Environment\"\ndescription: \"\"\nsidebar_position: 9\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Component Environment\n\nWhen we run the pipeline written in [8. Component - InputPath/OutputPath](../kubeflow/advanced-component.md), it fails. Let's find out why it fails and modify it so that it can run properly. \n\n### Convert to Kubeflow Format\n\nLet's convert the component written [earlier](../kubeflow/advanced-component.md#convert-to-kubeflow-format) into a yaml file.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\nIf you run the script above, you will get a `train_from_csv.yaml` file like the one below.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: model, type: dill}\n- {name: kernel, type: String}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n       
   train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --model\n    - {inputPath: model}\n    - --kernel\n    - {inputValue: kernel}\n```\n\nAccording to the content explained in the [Basic Usage Component](../kubeflow/basic-component.md#convert-to-kubeflow-format) previously mentioned, this component will be executed as follows:\n\n1. `docker pull python:3.7`\n2. run `command`\n\nHowever, when running the component created above, an error will occur.  \nThe reason is in the way the component wrapper is executed.  \nKubeflow uses Kubernetes, so the component wrapper runs the component content on its own separate container.\n\nIn detail, the image specified in the generated `train_from_csv.yaml` is `image: python:3.7`.\n\nThere may be some people who notice why it is not running for some reason.\n\nThe `python:3.7` image does not have the packages we want to use, such as `dill`, `pandas`, and `sklearn`, installed.  
\nTherefore, when executed, it fails with an error indicating that the packages are not found.\n\nSo, how can we add the packages?\n\n## Adding packages\n\nWhen converting a component to Kubeflow format, there are two ways to add packages:\n\n1. Using `base_image`\n2. Using `packages_to_install`\n\nLet's check which arguments `create_component_from_func`, the function used to compile components, can receive.\n\n```python\ndef create_component_from_func(\n    func: Callable,\n    output_component_file: Optional[str] = None,\n    base_image: Optional[str] = None,\n    packages_to_install: List[str] = None,\n    annotations: Optional[Mapping[str, str]] = None,\n):\n```\n\n- `func`: the component wrapper function to be converted into a component.\n- `base_image`: the image that the component wrapper will run on.\n- `packages_to_install`: additional packages that need to be installed before the component runs.\n\n### 1. base_image\n\nLooking more closely, the component is executed in the following sequence:\n\n1. `docker pull base_image`\n2. `pip install packages_to_install`\n3. run `command`\n\nIf the base_image used by the component already has all the packages installed, you can use it without installing additional packages.\n\nFor example, on this page we are going to write a Dockerfile like this:\n\n```dockerfile\nFROM python:3.7\n\nRUN pip install dill pandas scikit-learn\n```\n\nLet's build the image using the Dockerfile above. The container registry we will use in this practice is ghcr (GitHub Container Registry).  \nYou can choose a registry according to your environment and upload the image there.\n\n```bash\ndocker build . 
-f Dockerfile -t ghcr.io/mlops-for-all/base-image\ndocker push ghcr.io/mlops-for-all/base-image\n```\n\nNow let's try inputting the base image.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    base_image=\"ghcr.io/mlops-for-all/base-image:latest\",\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\nIf you compile the generated component, it will appear as follows.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: ghcr.io/mlops-for-all/base-image:latest\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def _make_parent_dirs_and_return_path(file_path: str):\n          import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = 
pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\nWe can confirm that the base_image has been changed to the value we have set.\n\n### 2. 
packages_to_install\n\nHowever, when packages are added, it takes a lot of time to create a new Docker image.\nIn this case, we can use the `packages_to_install` argument to easily add packages to the container.\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill==0.3.4\", \"pandas==1.3.4\", \"scikit-learn==1.0.1\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\nIf you execute the script, the `train_from_csv.yaml` file will be generated.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def _make_parent_dirs_and_return_path(file_path: str):\n       
   import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\nIf we take a closer look at the order in which the components written above are executed, it looks like this:\n\n1. `docker pull python:3.7`\n2. `pip install dill==0.3.4 pandas==1.3.4 scikit-learn==1.0.1`\n3. 
run `command`\n\nWhen the generated yaml file is closely examined, the following lines are automatically added, so that the necessary packages are installed and the program runs smoothly without errors.\n\n```bash\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/advanced-mlflow.md",
    "content": "---\ntitle : \"12. Component - MLFlow\"\ndescription: \"\"\nsidebar_position: 12\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## MLFlow Component\n\nIn this page, we will explain the process of writing a component to store the model in MLFlow so that the model trained in [Advanced Usage Component](../kubeflow/advanced-component.md) can be linked to API deployment.\n\n## MLFlow in Local\n\nIn order to store the model in MLFlow and use it in serving, the following items are needed.\n\n- model\n- signature\n- input_example\n- conda_env\n\nWe will look into the process of saving a model to MLFlow through Python code.\n\n### 1. Train model\n\nThe following steps involve training an SVC model using the iris dataset.\n\n```python\nimport pandas as pd\nfrom sklearn.datasets import load_iris\nfrom sklearn.svm import SVC\n\niris = load_iris()\n\ndata = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\ntarget = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\nclf = SVC(kernel=\"rbf\")\nclf.fit(data, target)\n\n```\n\n### 2. 
MLFlow Infos\n\nThis process creates the necessary information for MLFlow.\n\n```python\nfrom mlflow.models.signature import infer_signature\nfrom mlflow.utils.environment import _mlflow_conda_env\n\ninput_example = data.sample(1)\nsignature = infer_signature(data, clf.predict(data))\nconda_env = _mlflow_conda_env(additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"])\n```\n\nEach variable's content is as follows.\n\n- `input_example`\n\n    | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |\n    | --- | --- | --- | --- |\n    | 6.5 | 6.7 | 3.1 | 4.4 |\n\n- `signature`\n\n    ```python\n    inputs:\n      ['sepal length (cm)': double, 'sepal width (cm)': double, 'petal length (cm)': double, 'petal width (cm)': double]\n    outputs:\n      [Tensor('int64', (-1,))]\n    ```\n\n- `conda_env`\n\n    ```python\n    {'name': 'mlflow-env',\n     'channels': ['conda-forge'],\n     'dependencies': ['python=3.8.10',\n      'pip',\n      {'pip': ['mlflow', 'dill', 'pandas', 'scikit-learn']}]}\n    ```\n\n### 3. Save MLFlow Infos\n\nNext, we save the information created above together with the trained model. 
Since the trained model uses the sklearn package, we can easily save the model using `mlflow.sklearn`.\n\n```python\nfrom mlflow.sklearn import save_model\n\nsave_model(\n    sk_model=clf,\n    path=\"svc\",\n    serialization_format=\"cloudpickle\",\n    conda_env=conda_env,\n    signature=signature,\n    input_example=input_example,\n)\n```\n\nIf you work locally, a svc folder will be created and the following files will be generated.\n\n```bash\nls svc\n```\n\nIf you execute the command above, you can check the following output value.\n\n```bash\nMLmodel            conda.yaml         input_example.json model.pkl          requirements.txt\n```\n\nEach file will be as follows if checked.\n\n- MLmodel\n\n    ```bash\n    flavors:\n      python_function:\n        env: conda.yaml\n        loader_module: mlflow.sklearn\n        model_path: model.pkl\n        python_version: 3.8.10\n      sklearn:\n        pickled_model: model.pkl\n        serialization_format: cloudpickle\n        sklearn_version: 1.0.1\n    saved_input_example_info:\n      artifact_path: input_example.json\n      pandas_orient: split\n      type: dataframe\n    signature:\n      inputs: '[{\"name\": \"sepal length (cm)\", \"type\": \"double\"}, {\"name\": \"sepal width\n        (cm)\", \"type\": \"double\"}, {\"name\": \"petal length (cm)\", \"type\": \"double\"}, {\"name\":\n        \"petal width (cm)\", \"type\": \"double\"}]'\n      outputs: '[{\"type\": \"tensor\", \"tensor-spec\": {\"dtype\": \"int64\", \"shape\": [-1]}}]'\n    utc_time_created: '2021-12-06 06:52:30.612810'\n    ```\n\n- conda.yaml\n\n    ```bash\n    channels:\n    - conda-forge\n    dependencies:\n    - python=3.8.10\n    - pip\n    - pip:\n      - mlflow\n      - dill\n      - pandas\n      - scikit-learn\n    name: mlflow-env\n    ```\n\n- input_example.json\n\n    ```bash\n    {\n        \"columns\": \n        [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n      
      \"petal width (cm)\"\n        ],\n        \"data\": \n        [\n            [6.7, 3.1, 4.4, 1.4]\n        ]\n    }\n    ```\n\n- requirements.txt\n\n    ```bash\n    mlflow\n    dill\n    pandas\n    scikit-learn\n    ```\n\n- model.pkl\n\n## MLFlow on Server\n\nNow, let's proceed with the task of uploading the saved model to the MLflow server.\n\n```python\nimport mlflow\n\nwith mlflow.start_run():\n    mlflow.log_artifact(\"svc/\")\n```\n\nSave and open the `mlruns` directory generated path with `mlflow ui` command to launch mlflow server and dashboard.\nAccess the mlflow dashboard, click the generated run to view it as below.\n\n![mlflow-0.png](./img/mlflow-0.png)\n(This screen may vary depending on the version of mlflow.)\n\n## MLFlow Component\n\nNow, let's write a reusable component in Kubeflow.\n\nThe ways of writing components that can be reused are broadly divided into three categories.\n\n1. After saving the necessary environment in the component responsible for model training, the MLflow component is only responsible for the upload.\n\n    ![mlflow-1.png](./img/mlflow-1.png)\n\n2. Pass the trained model and data to the MLflow component, which is responsible for saving and uploading.\n\n    ![mlflow-2.png](./img/mlflow-2.png)\n\n3. 
The component responsible for model training handles both saving and uploading.\n\n    ![mlflow-3.png](./img/mlflow-3.png)\n\nWe are trying to manage the model through the first approach.\nThe reason is that we don't need to write the code to upload the MLFlow model every time like three times for each component written.\n\nReusing components is possible by the methods 1 and 2.\nHowever, in the case of 2, it is necessary to deliver the trained image and packages to the component, so ultimately additional information about the component must be delivered.\n\nIn order to proceed with the method 1, the learning component must also be changed.\nCode that stores the environment needed to save the model must be added.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as 
file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n```\n\nWrite a component to upload to MLFlow.\nAt this time, configure the uploaded MLFlow endpoint to be connected to the [mlflow service](../setup-components/install-components-mlflow.md) that we installed.  \nIn this case, use the Kubernetes Service DNS Name of the Minio installed at the time of MLFlow Server installation. As this service is created in the Kubeflow namespace with the name minio-service, set it to `http://minio-service.kubeflow.svc:9000`.  \nSimilarly, for the tracking_uri address, use the Kubernetes Service DNS Name of the MLFlow server and set it to `http://mlflow-server-service.mlflow-system.svc:5000`.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = 
dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n```\n\n## MLFlow Pipeline\n\nNow let's connect the components we have written and create a pipeline. \n\n### Data Component\n\nThe data we will use to train the model is sklearn's iris.\nWe will write a component to generate the data.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n```\n\n### Pipeline\n\nThe pipeline code can be written as follows.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        
input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n```\n\n### Run\n\nIf you organize the components and pipelines written above into a single Python file, it would look like this.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        
dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef 
mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(mlflow_pipeline, \"mlflow_pipeline.yaml\")\n```\n\n<p>\n  <details>\n    <summary>mlflow_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: mlflow-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-19T14:14:11.999807',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"kernel\", \"type\":\n      \"String\"}, {\"name\": \"model_name\", \"type\": \"String\"}], \"name\": \"mlflow_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10}\nspec:\n  entrypoint: mlflow-pipeline\n  templates:\n  - name: load-iris-data\n    container:\n      args: [--data, /tmp/outputs/data/data, --target, /tmp/outputs/target/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'pandas' 'scikit-learn' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n        install --quiet --no-warn-script-location 'pandas' 'scikit-learn' --user)\n        && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            
os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def load_iris_data(\n            data_path,\n            target_path,\n        ):\n            import pandas as pd\n            from sklearn.datasets import load_iris\n\n            iris = load_iris()\n\n            data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n            target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n            data.to_csv(data_path, index=False)\n            target.to_csv(target_path, index=False)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Load iris data', description='')\n        _parser.add_argument(\"--data\", dest=\"data_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--target\", dest=\"target_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = load_iris_data(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/outputs/data/data}\n      - {name: load-iris-data-target, path: /tmp/outputs/target/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--data\", {\"outputPath\": \"data\"}, \"--target\", {\"outputPath\": \"target\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''pandas'' ''scikit-learn'' ||\n          PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n          ''pandas'' ''scikit-learn'' --user) && \\\"$0\\\" 
\\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef load_iris_data(\\n    data_path,\\n    target_path,\\n):\\n    import\n          pandas as pd\\n    from sklearn.datasets import load_iris\\n\\n    iris = load_iris()\\n\\n    data\n          = pd.DataFrame(iris[\\\"data\\\"], columns=iris[\\\"feature_names\\\"])\\n    target\n          = pd.DataFrame(iris[\\\"target\\\"], columns=[\\\"target\\\"])\\n\\n    data.to_csv(data_path,\n          index=False)\\n    target.to_csv(target_path, index=False)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Load iris data'', description='''')\\n_parser.add_argument(\\\"--data\\\",\n          dest=\\\"data_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--target\\\", dest=\\\"target_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = load_iris_data(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"name\": \"Load iris data\", \"outputs\": [{\"name\":\n          \"data\", \"type\": \"csv\"}, {\"name\": \"target\", \"type\": \"csv\"}]}', pipelines.kubeflow.org/component_ref: '{}'}\n  - name: mlflow-pipeline\n    inputs:\n      parameters:\n      - {name: kernel}\n      - {name: model_name}\n    dag:\n      tasks:\n      - {name: load-iris-data, template: load-iris-data}\n      - name: train-from-csv\n        template: train-from-csv\n        dependencies: [load-iris-data]\n        arguments:\n          parameters:\n          - {name: kernel, value: '{{inputs.parameters.kernel}}'}\n    
      artifacts:\n          - {name: load-iris-data-data, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-data}}'}\n          - {name: load-iris-data-target, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-target}}'}\n      - name: upload-sklearn-model-to-mlflow\n        template: upload-sklearn-model-to-mlflow\n        dependencies: [train-from-csv]\n        arguments:\n          parameters:\n          - {name: model_name, value: '{{inputs.parameters.model_name}}'}\n          artifacts:\n          - {name: train-from-csv-conda_env, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-conda_env}}'}\n          - {name: train-from-csv-input_example, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-input_example}}'}\n          - {name: train-from-csv-model, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-model}}'}\n          - {name: train-from-csv-signature, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-signature}}'}\n  - name: train-from-csv\n    container:\n      args: [--train-data, /tmp/inputs/train_data/data, --train-target, /tmp/inputs/train_target/data,\n        --kernel, '{{inputs.parameters.kernel}}', --model, /tmp/outputs/model/data,\n        --input-example, /tmp/outputs/input_example/data, --signature, /tmp/outputs/signature/data,\n        --conda-env, /tmp/outputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: 
str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def train_from_csv(\n            train_data_path,\n            train_target_path,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n            kernel,\n        ):\n            import dill\n            import pandas as pd\n            from sklearn.svm import SVC\n\n            from mlflow.models.signature import infer_signature\n            from mlflow.utils.environment import _mlflow_conda_env\n\n            train_data = pd.read_csv(train_data_path)\n            train_target = pd.read_csv(train_target_path)\n\n            clf = SVC(kernel=kernel)\n            clf.fit(train_data, train_target)\n\n            with open(model_path, mode=\"wb\") as file_writer:\n                dill.dump(clf, file_writer)\n\n            input_example = train_data.sample(1)\n            with open(input_example_path, \"wb\") as file_writer:\n                dill.dump(input_example, file_writer)\n\n            signature = infer_signature(train_data, clf.predict(train_data))\n            with open(signature_path, \"wb\") as file_writer:\n                dill.dump(signature, file_writer)\n\n            conda_env = _mlflow_conda_env(\n                additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n            )\n            with open(conda_env_path, \"wb\") as file_writer:\n                dill.dump(conda_env, file_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n        _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, 
default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", dest=\"input_example_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = train_from_csv(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: kernel}\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/inputs/train_data/data}\n      - {name: load-iris-data-target, path: /tmp/inputs/train_target/data}\n    outputs:\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/outputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/outputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/outputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/outputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--train-data\", {\"inputPath\": \"train_data\"}, \"--train-target\",\n          {\"inputPath\": \"train_target\"}, \"--kernel\", {\"inputValue\": \"kernel\"}, \"--model\",\n          {\"outputPath\": \"model\"}, \"--input-example\", {\"outputPath\": \"input_example\"},\n          \"--signature\", {\"outputPath\": \"signature\"}, 
\"--conda-env\", {\"outputPath\":\n          \"conda_env\"}], \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''dill'' ''pandas''\n          ''scikit-learn'' ''mlflow'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m\n          pip install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef train_from_csv(\\n    train_data_path,\\n    train_target_path,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n    kernel,\\n):\\n    import\n          dill\\n    import pandas as pd\\n    from sklearn.svm import SVC\\n\\n    from\n          mlflow.models.signature import infer_signature\\n    from mlflow.utils.environment\n          import _mlflow_conda_env\\n\\n    train_data = pd.read_csv(train_data_path)\\n    train_target\n          = pd.read_csv(train_target_path)\\n\\n    clf = SVC(kernel=kernel)\\n    clf.fit(train_data,\n          train_target)\\n\\n    with open(model_path, mode=\\\"wb\\\") as file_writer:\\n        dill.dump(clf,\n          file_writer)\\n\\n    input_example = train_data.sample(1)\\n    with open(input_example_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(input_example, file_writer)\\n\\n    signature\n          = infer_signature(train_data, clf.predict(train_data))\\n    with open(signature_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(signature, file_writer)\\n\\n    conda_env\n          = _mlflow_conda_env(\\n        additional_pip_deps=[\\\"dill\\\", \\\"pandas\\\",\n  
        \\\"scikit-learn\\\"]\\n    )\\n    with open(conda_env_path, \\\"wb\\\") as file_writer:\\n        dill.dump(conda_env,\n          file_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Train\n          from csv'', description='''')\\n_parser.add_argument(\\\"--train-data\\\", dest=\\\"train_data_path\\\",\n          type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--train-target\\\",\n          dest=\\\"train_target_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--kernel\\\",\n          dest=\\\"kernel\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\", dest=\\\"input_example_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\", dest=\\\"conda_env_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = train_from_csv(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"train_data\", \"type\": \"csv\"},\n          {\"name\": \"train_target\", \"type\": \"csv\"}, {\"name\": \"kernel\", \"type\": \"String\"}],\n          \"name\": \"Train from csv\", \"outputs\": [{\"name\": \"model\", \"type\": \"dill\"},\n          {\"name\": \"input_example\", \"type\": \"dill\"}, {\"name\": \"signature\", \"type\":\n          \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}]}', pipelines.kubeflow.org/component_ref: '{}',\n        
pipelines.kubeflow.org/arguments.parameters: '{\"kernel\": \"{{inputs.parameters.kernel}}\"}'}\n  - name: upload-sklearn-model-to-mlflow\n    container:\n      args: [--model-name, '{{inputs.parameters.model_name}}', --model, /tmp/inputs/model/data,\n        --input-example, /tmp/inputs/input_example/data, --signature, /tmp/inputs/signature/data,\n        --conda-env, /tmp/inputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' 'boto3' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' 'boto3' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def upload_sklearn_model_to_mlflow(\n            model_name,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n        ):\n            import os\n            import dill\n            from mlflow.sklearn import save_model\n\n            from mlflow.tracking.client import MlflowClient\n\n            os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n            os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n            os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n            client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n            with open(model_path, mode=\"rb\") as file_reader:\n                clf = dill.load(file_reader)\n\n            with open(input_example_path, \"rb\") as file_reader:\n                input_example = dill.load(file_reader)\n\n            with open(signature_path, \"rb\") as file_reader:\n                signature = dill.load(file_reader)\n\n            
with open(conda_env_path, \"rb\") as file_reader:\n                conda_env = dill.load(file_reader)\n\n            save_model(\n                sk_model=clf,\n                path=model_name,\n                serialization_format=\"cloudpickle\",\n                conda_env=conda_env,\n                signature=signature,\n                input_example=input_example,\n            )\n            run = client.create_run(experiment_id=\"0\")\n            client.log_artifact(run.info.run_id, model_name)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Upload sklearn model to mlflow', description='')\n        _parser.add_argument(\"--model-name\", dest=\"model_name\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", dest=\"input_example_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: model_name}\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/inputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/inputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/inputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/inputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: 
{pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--model-name\", {\"inputValue\": \"model_name\"}, \"--model\", {\"inputPath\":\n          \"model\"}, \"--input-example\", {\"inputPath\": \"input_example\"}, \"--signature\",\n          {\"inputPath\": \"signature\"}, \"--conda-env\", {\"inputPath\": \"conda_env\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' ''boto3'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install\n          --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn'' ''mlflow''\n          ''boto3'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def upload_sklearn_model_to_mlflow(\\n    model_name,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n):\\n    import\n          os\\n    import dill\\n    from mlflow.sklearn import save_model\\n\\n    from\n          mlflow.tracking.client import MlflowClient\\n\\n    os.environ[\\\"MLFLOW_S3_ENDPOINT_URL\\\"]\n          = \\\"http://minio-service.kubeflow.svc:9000\\\"\\n    os.environ[\\\"AWS_ACCESS_KEY_ID\\\"]\n          = \\\"minio\\\"\\n    os.environ[\\\"AWS_SECRET_ACCESS_KEY\\\"] = \\\"minio123\\\"\\n\\n    client\n          = MlflowClient(\\\"http://mlflow-server-service.mlflow-system.svc:5000\\\")\\n\\n    with\n          open(model_path, mode=\\\"rb\\\") as file_reader:\\n        clf = dill.load(file_reader)\\n\\n    with\n          open(input_example_path, \\\"rb\\\") as file_reader:\\n        input_example\n          = dill.load(file_reader)\\n\\n    with open(signature_path, \\\"rb\\\") as file_reader:\\n        signature\n          = dill.load(file_reader)\\n\\n 
   with open(conda_env_path, \\\"rb\\\") as file_reader:\\n        conda_env\n          = dill.load(file_reader)\\n\\n    save_model(\\n        sk_model=clf,\\n        path=model_name,\\n        serialization_format=\\\"cloudpickle\\\",\\n        conda_env=conda_env,\\n        signature=signature,\\n        input_example=input_example,\\n    )\\n    run\n          = client.create_run(experiment_id=\\\"0\\\")\\n    client.log_artifact(run.info.run_id,\n          model_name)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Upload\n          sklearn model to mlflow'', description='''')\\n_parser.add_argument(\\\"--model-name\\\",\n          dest=\\\"model_name\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\",\n          dest=\\\"input_example_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\",\n          dest=\\\"conda_env_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"model_name\", \"type\": \"String\"},\n          {\"name\": \"model\", \"type\": \"dill\"}, {\"name\": \"input_example\", \"type\": \"dill\"},\n          {\"name\": \"signature\", \"type\": \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}],\n          \"name\": \"Upload sklearn model to mlflow\"}', pipelines.kubeflow.org/component_ref: '{}',\n        pipelines.kubeflow.org/arguments.parameters: '{\"model_name\": \"{{inputs.parameters.model_name}}\"}'}\n  arguments:\n    parameters:\n    - {name: kernel}\n    - 
{name: model_name}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter running the script to generate the mlflow_pipeline.yaml file, upload the pipeline and execute a run to check the results.\n\n![mlflow-svc-0](./img/mlflow-svc-0.png)\n\nPort-forward the mlflow service to access the MLflow UI.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\nOpen a web browser and connect to localhost:5000. You will see that the run has been created as follows.\n\n![mlflow-svc-1](./img/mlflow-svc-1.png)\n\nClick on the run to verify that the trained model file is present.\n\n![mlflow-svc-2](./img/mlflow-svc-2.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/advanced-pipeline.md",
    "content": "---\ntitle : \"10. Pipeline - Setting\"\ndescription: \"\"\nsidebar_position: 10\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline Setting\n\nOn this page, we will look at the values that can be configured in a pipeline.\n\n## Display Name\n\nComponents created within a pipeline have two names:\n\n- task_name: the function name when writing the component\n- display_name: the name that appears in the kubeflow UI\n\nFor example, when both components are named Print and return number, it is difficult to tell which is number 1 and which is number 2.\n\n![run-7](./img/run-7.png)\n\n### set_display_name\n\nThe solution to this is the display_name.  \nWe can set the display_name in the pipeline by using the set_display_name [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ContainerOp.set_display_name) of the component.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nIf you run this script and check the resulting `example_pipeline.yaml`, it will look like this.\n\n<p>\n  <details>\n    
<summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-09T18:11:43.193190',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def 
print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is number 1, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": 
\"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n      labels:\n        
pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    
outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with 
open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is sum of number\n          1 and number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": 
[\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\",\n          {\"inputValue\": \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nIf compared with the previous file, the **`pipelines.kubeflow.org/task_display_name`** key has been newly created.\n\n### UI in Kubeflow\n\n\nWe will upload the version of the previously created [pipeline](../kubeflow/basic-pipeline-upload.md#upload-pipeline-version) using the files we created 
earlier.\n\n![adv-pipeline-0.png](./img/adv-pipeline-0.png)\n\nAs you can see, the configured name is displayed as shown above.\n\n## Resources\n\n### GPU\n\nBy default, when the pipeline runs components as Kubernetes pods, it uses the default resource specifications.  \nIf you need to train a model using a GPU and the Kubernetes environment doesn't allocate a GPU, the training may not be performed correctly.  \nTo address this, you can use the `set_gpu_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.UserContainer.set_gpu_limit) to set the GPU limit.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1)\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nIf you execute the above script and look closely at `sum-and-print-numbers` in the generated file, you can see that a resources section with `{nvidia.com/gpu: 1}` has been added.\nThis is how you allocate a GPU.\n\n```bash\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        
'{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n      resources:\n        limits: {nvidia.com/gpu: 1}\n```\n\n### CPU\n\nThe number of CPUs can be set using the `.set_cpu_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_cpu_limit).  
\nThe difference from GPUs is that the input must be a string, not an int.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_cpu_limit(\"16\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nOnly the changed part is shown below.\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, cpu: '16'}\n```\n\n### Memory\n\nMemory can be set using the `.set_memory_limit()` [method](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_memory_limit).\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = 
print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_memory_limit(\"1G\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nThe changed part is as follows.\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, memory: 1G}\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/advanced-run.md",
    "content": "---\ntitle : \"11. Pipeline - Run Result\"\ndescription: \"\"\nsidebar_position: 11\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n\n## Run Result\n\nClick on a run result and you will see three tabs:\nGraph, Run Output, and Config.\n\n![advanced-run-0.png](./img/advanced-run-0.png)\n\n## Graph\n\n![advanced-run-1.png](./img/advanced-run-1.png)\n\nIn the graph, if you click on a component, you can check its execution information.\n\n### Input/Output\n\nThe Input/Output tab allows you to view and download the configurations and the input and output artifacts used by the component.\n\n### Logs\n\nIn the Logs tab, you can view all the stdout output generated while the Python code was running.\nHowever, because pods are deleted after a certain period of time, the logs may eventually no longer be viewable in this tab.\nIn that case, you can check them in the main-logs entry of the Output artifacts.\n\n### Visualizations\n\nThe Visualizations tab displays plots generated by the components.\n\nTo generate a plot, declare `mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")` as a component argument and write the desired values to it. 
The plot should be in HTML format.\nThe conversion process is as follows.\n\n```python\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(\n    mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")\n):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n```\n\nWritten as a full pipeline, it looks like this.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n\n\n@pipeline(name=\"plot_pipeline\")\ndef 
plot_pipeline():\n    plot_linear()\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(plot_pipeline, \"plot_pipeline.yaml\")\n```\n\nIf you run this script and check the resulting `plot_pipeline.yaml`, you will see the following.\n\n<p>\n  <details>\n    <summary>plot_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: plot-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-17T13:31:32.963214',\n    pipelines.kubeflow.org/pipeline_spec: '{\"name\": \"plot_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: plot-pipeline\n  templates:\n  - name: plot-linear\n    container:\n      args: [--mlpipeline-ui-metadata, /tmp/outputs/mlpipeline_ui_metadata/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'matplotlib' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet\n        --no-warn-script-location 'matplotlib' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n        def plot_linear(mlpipeline_ui_metadata):\n            import base64\n            import json\n            from io import BytesIO\n            import matplotlib.pyplot as plt\n            plt.plot([1, 2, 3], [1, 2, 3])\n            tmpfile = BytesIO()\n            plt.savefig(tmpfile, format=\"png\")\n            encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n            html = f\"<img src='data:image/png;base64,{encoded}'>\"\n            
metadata = {\n                \"outputs\": [\n                    {\n                        \"type\": \"web-app\",\n                        \"storage\": \"inline\",\n                        \"source\": html,\n                    },\n                ],\n            }\n            with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n                json.dump(metadata, html_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Plot linear', description='')\n        _parser.add_argument(\"--mlpipeline-ui-metadata\", dest=\"mlpipeline_ui_metadata\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n        _outputs = plot_linear(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: mlpipeline-ui-metadata, path: /tmp/outputs/mlpipeline_ui_metadata/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--mlpipeline-ui-metadata\", {\"outputPath\": \"mlpipeline_ui_metadata\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''matplotlib'' || PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''matplotlib''\n          --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef 
plot_linear(mlpipeline_ui_metadata):\\n    import\n          base64\\n    import json\\n    from io import BytesIO\\n\\n    import matplotlib.pyplot\n          as plt\\n\\n    plt.plot([1, 2, 3], [1, 2, 3])\\n\\n    tmpfile = BytesIO()\\n    plt.savefig(tmpfile,\n          format=\\\"png\\\")\\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\\\"utf-8\\\")\\n\\n    html\n          = f\\\"<img src=''data:image/png;base64,{encoded}''>\\\"\\n    metadata = {\\n        \\\"outputs\\\":\n          [\\n            {\\n                \\\"type\\\": \\\"web-app\\\",\\n                \\\"storage\\\":\n          \\\"inline\\\",\\n                \\\"source\\\": html,\\n            },\\n        ],\\n    }\\n    with\n          open(mlpipeline_ui_metadata, \\\"w\\\") as html_writer:\\n        json.dump(metadata,\n          html_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Plot\n          linear'', description='''')\\n_parser.add_argument(\\\"--mlpipeline-ui-metadata\\\",\n          dest=\\\"mlpipeline_ui_metadata\\\", type=_make_parent_dirs_and_return_path,\n          required=True, default=argparse.SUPPRESS)\\n_parsed_args = vars(_parser.parse_args())\\n\\n_outputs\n          = plot_linear(**_parsed_args)\\n\"], \"image\": \"python:3.7\"}}, \"name\": \"Plot\n          linear\", \"outputs\": [{\"name\": \"mlpipeline_ui_metadata\", \"type\": \"UI_Metadata\"}]}',\n        pipelines.kubeflow.org/component_ref: '{}'}\n  - name: plot-pipeline\n    dag:\n      tasks:\n      - {name: plot-linear, template: plot-linear}\n  arguments:\n    parameters: []\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter running, click Visualization.\n\n![advanced-run-5.png](./img/advanced-run-5.png)\n\n## Run output\n\n![advanced-run-2.png](./img/advanced-run-2.png)\n\nRun output is where Kubeflow gathers the Artifacts generated in the specified form and shows the evaluation index (Metric).\n\nTo show the evaluation index (Metric), you can 
save the name and value you want to show in the `mlpipeline_metrics_path: OutputPath(\"Metrics\")` argument in json format. For example, you can write it like this.\n\n```python\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n                \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n```\n\nWe will add a component to generate evaluation metrics to the pipeline created in the [Pipeline](../kubeflow/basic-pipeline.md) and execute it. The whole pipeline is as follows.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int) -> int:\n    sum_number = number_1 + number_2\n    print(sum_number)\n    return sum_number\n\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n                \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n    show_metric_of_sum(sum_result.output)\n\n\nif __name__ == 
\"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nAfter execution, click Run Output and you will see the following.\n\n![advanced-run-4.png](./img/advanced-run-4.png)\n\n## Config\n\n![advanced-run-3.png](./img/advanced-run-3.png)\n\nIn the Config tab, you can view all the values received as pipeline configurations.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/basic-component.md",
"content": "---\ntitle : \"4. Component - Write\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\n\n## Component\n\nWriting a component involves two steps:\n\n1. Writing Component Contents\n2. Writing Component Wrapper\n\nNow, let's look at each process.\n\n## Component Contents\n\nComponent Contents are no different from the Python code we commonly write.  \nFor example, let's try writing a component that takes a number as input, prints it, and then returns it.\nWe can write it in Python code like this.\n\n```python\nprint(number)\n```\n\nHowever, running this code raises an error, because the `number` to be printed is not defined.\n\nAs we saw in [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md), values like `number` that are required in component contents are defined in **Config**. In order to execute component contents, the necessary Configs must be passed from the component wrapper.\n\n## Component Wrapper\n\n### Define a standalone Python function\n\nNow we need to create a component wrapper so that the required Configs can be passed in.\n\nWithout a separate Config, the contents wrapped in a component wrapper look like this.\n\n```python\ndef print_and_return_number():\n    print(number)\n    return number\n```\n\nNow we add the Config the contents require as an argument to the wrapper. However, you must write not only the argument but also its type hint. When Kubeflow converts the pipeline into the Kubeflow format, it checks whether the specified input and output types match in the connections between components. 
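\n\nFor intuition, here is a stdlib-only sketch (not the SDK's actual implementation) of reading these annotations the way a pipeline compiler conceptually would, via `typing.get_type_hints`:\n\n```python\nfrom typing import get_type_hints\n\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n# Read the declared input and output types from the function's annotations.\nhints = get_type_hints(print_and_return_number)\nprint(hints[\"number\"], hints[\"return\"])  # <class 'int'> <class 'int'>\n```\n\n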
If the format of the input required by the component does not match the output received from another component, the pipeline cannot be created.\n\nNow we complete the component wrapper by writing down the argument, its type and the type to be returned as follows.\n\n```python\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\nIn Kubeflow, you can only use types that can be expressed in json as return values. The most commonly used and recommended types are as follows:\n\n- int\n- float\n- str\n\nIf you want to return multiple values instead of a single value, you must use `collections.namedtuple`.  \nFor more details, please refer to the Kubeflow official documentation [Kubeflow Official Documentation](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#passing-parameters-by-value).  \nFor example, if you want to write a component that returns the quotient and remainder of a number when divided by 2, it should be written as follows.\n\n```python\nfrom typing import NamedTuple\n\n\ndef divide_and_return_number(\n    number: int,\n) -> NamedTuple(\"DivideOutputs\", [(\"quotient\", int), (\"remainder\", int)]):\n    from collections import namedtuple\n\n    quotient, remainder = divmod(number, 2)\n    print(\"quotient is\", quotient)\n    print(\"remainder is\", remainder)\n\n    divide_outputs = namedtuple(\n        \"DivideOutputs\",\n        [\n            \"quotient\",\n            \"remainder\",\n        ],\n    )\n    return divide_outputs(quotient, remainder)\n```\n\n### Convert to Kubeflow Format\n\nNow you have to convert the written component into a format that can be used in Kubeflow. The conversion can be done through `kfp.components.create_component_from_func`. 
This converted form can be imported as a function in Python and used in the pipeline.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\n### Share component with yaml file\n\nIf it is not possible to share with Python code, you can share components with a YAML file and use them.\nTo do this, first convert the component to a YAML file and then use it in the pipeline with `kfp.components.load_component_from_file`.\n\nFirst, let's explain the process of converting the written component to a YAML file.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\nif __name__ == \"__main__\":\n    print_and_return_number.component_spec.save(\"print_and_return_number.yaml\")\n```\n\nIf you run the Python code you wrote, a file called `print_and_return_number.yaml` will be created. 
When you check the file, it will be as follows.\n\n```bash\nname: Print and return number\ninputs:\n- {name: number, type: Integer}\noutputs:\n- {name: Output, type: Integer}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def print_and_return_number(number):\n          print(number)\n          return number\n\n      def _serialize_int(int_value: int) -> str:\n          if isinstance(int_value, str):\n              return int_value\n          if not isinstance(int_value, int):\n              raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n          return str(int_value)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n      _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n      _parsed_args = vars(_parser.parse_args())\n      _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n      _outputs = print_and_return_number(**_parsed_args)\n\n      _outputs = [_outputs]\n\n      _output_serializers = [\n          _serialize_int,\n\n      ]\n\n      import os\n      for idx, output_file in enumerate(_output_files):\n          try:\n              os.makedirs(os.path.dirname(output_file))\n          except OSError:\n              pass\n          with open(output_file, 'w') as f:\n              f.write(_output_serializers[idx](_outputs[idx]))\n    args:\n    - --number\n    - {inputValue: number}\n    - '----output-paths'\n    - {outputPath: Output}\n```\n\nNow the generated file can be shared and used in the pipeline as follows.\n\n```python\nfrom kfp.components import 
load_component_from_file\n\nprint_and_return_number = load_component_from_file(\"print_and_return_number.yaml\")\n```\n\n## How Kubeflow executes component\n\nIn Kubeflow, the execution order of components is as follows:\n\n1. `docker pull <image>`: Pull the image containing the execution environment information of the defined component.\n2. Run `command`: Execute the component's content within the pulled image.\n\nTaking `print_and_return_number.yaml` as an example, the default image in `@create_component_from_func` is `python:3.7`, so the component's content will be executed based on that image.\n\n1. `docker pull python:3.7`\n2. `print(number)`\n\n## References:\n- [Getting Started With Python function based components](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#getting-started-with-python-function-based-components)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/basic-pipeline-upload.md",
"content": "---\ntitle : \"6. Pipeline - Upload\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Upload Pipeline\n\nNow, let's upload the pipeline we created directly to Kubeflow.  \nPipeline uploads can be done through the Kubeflow dashboard UI.\nUse the port-forwarding method from [Install Kubeflow](../setup-components/install-components-kf.md).\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nAccess [http://localhost:8080](http://localhost:8080) to open the dashboard.\n\n### 1. Click Pipelines Tab\n\n![pipeline-gui-0.png](./img/pipeline-gui-0.png)\n\n### 2. Click Upload Pipeline\n\n![pipeline-gui-1.png](./img/pipeline-gui-1.png)\n\n### 3. Click Choose file\n\n![pipeline-gui-2.png](./img/pipeline-gui-2.png)\n\n### 4. Upload created YAML file\n\n![pipeline-gui-3.png](./img/pipeline-gui-3.png)\n\n### 5. Create\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\n## Upload Pipeline Version\n\nUploaded pipelines can be versioned through further uploads. Note, however, that this groups pipelines sharing the same name rather than providing code-level version management as on GitHub.\nIn the example above, clicking on example_pipeline will bring up the following screen.\n\n![pipeline-gui-5.png](./img/pipeline-gui-5.png)\n\nClicking it shows the following screen.\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\nIf you click Upload Version, a screen appears where you can upload the pipeline.\n\n![pipeline-gui-6.png](./img/pipeline-gui-6.png)\n\nNow, upload your pipeline.\n\n![pipeline-gui-7.png](./img/pipeline-gui-7.png)\n\nOnce uploaded, you can check the pipeline version as follows.\n\n![pipeline-gui-8.png](./img/pipeline-gui-8.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/basic-pipeline.md",
"content": "---\ntitle : \"5. Pipeline - Write\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline\n\nComponents do not run on their own; they run as parts of a pipeline. Therefore, in order to run a component, a pipeline must be written.\nWriting a pipeline requires a set of components and the order in which those components execute.\n\nOn this page, we will create a pipeline from a component that takes a number as input and prints it, and a component that takes the two numbers produced by those components and prints their sum.\n\n## Component Set\n\nFirst, let's create the components that will be used in the pipeline.\n\n1. `print_and_return_number`\n\n   This component prints and returns the input number.  \n   Since the component returns the input value, we specify `int` as the return type hint.\n\n   ```python\n   @create_component_from_func\n   def print_and_return_number(number: int) -> int:\n       print(number)\n       return number\n   ```\n\n2. `sum_and_print_numbers`\n\n   This component calculates the sum of two input numbers and prints it.  \n   Similarly, since the component returns the sum, we specify `int` as the return type hint.\n\n   ```python\n   @create_component_from_func\n   def sum_and_print_numbers(number_1: int, number_2: int) -> int:\n       sum_num = number_1 + number_2\n       print(sum_num)\n       return sum_num\n   ```\n\n## Component Order\n\n### Define Order\n\nIf you have created the necessary set of components, the next step is to define their sequence.  
\nThe diagram below represents the order of the pipeline components to be created on this page.\n\n![pipeline-0.png](./img/pipeline-0.png)\n\n### Single Output\n\nNow let's translate this sequence into code.\n\nFirst, writing `print_and_return_number_1` and `print_and_return_number_2` from the picture above would look like this.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n```\n\nRun the components and store the return values in `number_1_result` and `number_2_result`, respectively.  \nThe stored return value of `number_1_result` can then be accessed through `number_1_result.output`.\n\n### Multi Output\n\nIn the example above, the components return a single value, so it can be used directly via `output`.  \nHowever, if there are multiple return values, they are stored in `outputs` as a dictionary. You can use the keys to access the desired return values.\nLet's consider an example with a component that returns multiple values, like the one mentioned in the [component](../kubeflow/basic-component.md#define-a-standalone-python-function) definition. The `divide_and_return_number` component returns `quotient` and `remainder`. 
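\n\nAs a plain-Python reminder (independent of kfp; a minimal sketch), a `namedtuple` exposes its fields by name, which is what the component's `.outputs[...]` dictionary mirrors:\n\n```python\nfrom collections import namedtuple\n\n# Hypothetical stand-in for the component's multi-value return.\nDivideOutputs = namedtuple(\"DivideOutputs\", [\"quotient\", \"remainder\"])\nresult = DivideOutputs(*divmod(7, 2))\nprint(result.quotient, result.remainder)  # 3 1\n```\n\n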
Here's an example of passing these two values to `print_and_return_number`:\n\n```python\ndef multi_pipeline():\n    divided_result = divide_and_return_number(number)\n    num_1_result = print_and_return_number(divided_result.outputs[\"quotient\"])\n    num_2_result = print_and_return_number(divided_result.outputs[\"remainder\"])\n```\n\nStore the result of `divide_and_return_number` in `divided_result`, and you can get each value via `divided_result.outputs[\"quotient\"]` and `divided_result.outputs[\"remainder\"]`.\n\n### Write to Python code\n\nNow, let's get back to the main topic and pass the results of these two components to `sum_and_print_numbers`.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nNext, gather the Configs each component needs and define them as the pipeline's Config.\n\n```python\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\n## Convert to Kubeflow Format\n\nFinally, convert it into a format that can be used in Kubeflow. 
The conversion can be done using the `kfp.dsl.pipeline` decorator.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nTo run the pipeline in Kubeflow, it must be compiled into the designated YAML format, since Kubeflow only accepts pipelines in that format.\nCompilation can be done using the following command.\n\n```python\nif __name__ == \"__main__\":\n    import kfp\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\n## Conclusion\n\nGathering everything explained so far into a single Python script looks like this.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nThe compiled result is as follows.\n\n<details>\n  <summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: 
{pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-05T13:38:51.566777',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if 
isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > 
\\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - 
sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      
annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', 
pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\", {\"inputValue\":\n          \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          
description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n</details>\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/basic-requirements.md",
    "content": "---\ntitle : \"3. Install Requirements\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\"]\n---\n\nThe recommended Python version for practice is python>=3.7. For those unfamiliar with the Python environment, please refer to [Appendix 1. Python Virtual Environment](../appendix/pyenv) and install the packages on the **client node**.\n\nThe packages and versions required for the practice are as follows:\n\n- requirements.txt\n\n  ```bash\n  kfp==1.8.9\n  scikit-learn==1.0.1\n  mlflow==1.21.0\n  pandas==1.3.4\n  dill==0.3.4\n  ```\n\nActivate the [Python virtual environment](../appendix/pyenv.md#python-가상환경-생성) created in the previous section.\n\n```bash\npyenv activate demo\n```\n\nWe are proceeding with the package installation.\n\n```bash\npip3 install -U pip\npip3 install kfp==1.8.9 scikit-learn==1.0.1 mlflow==1.21.0 pandas==1.3.4 dill==0.3.4\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/basic-run.md",
    "content": "---\ntitle : \"7. Pipeline - Run\"\ndescription: \"\"\nsidebar_position: 7\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Run Pipeline\n\nNow we will run the uploaded pipeline.\n\n## Before Run\n\n### 1. Create Experiment\n\nExperiments in Kubeflow are units that logically manage runs executed within them.\n\nWhen you first enter the namespace in Kubeflow, there are no Experiments created. Therefore, you must create an Experiment beforehand in order to run the pipeline. If an Experiment already exists, you can go to [Run Pipeline](../kubeflow/basic-run.md#run-pipeline-1).\n\nExperiments can be created via the Create Experiment button.\n\n![run-0.png](./img/run-0.png)\n\n### 2. Name 입력\n\n![run-1.png](./img/run-1.png)\n\n## Run Pipeline\n\n### 1. Select Create Run\n\n![run-2.png](./img/run-2.png)\n\n### 2. Select Experiment\n\n![run-9.png](./img/run-9.png)\n\n![run-10.png](./img/run-10.png)\n\n### 3. Enter Pipeline Config\n\nFill in the values of the Config provided when creating the pipeline. The uploaded pipeline requires input values for `number_1` and `number_2`.\n\n![run-3.png](./img/run-3.png)\n\n### 4. Start\n\nClick the Start button after entering the values. The pipeline will start running.\n\n![run-4.png](./img/run-4.png)\n\n## Run Result\n\nThe executed pipelines can be viewed in the Runs tab.\nClicking on a run provides detailed information related to the executed pipeline.\n\n![run-5.png](./img/run-5.png)\n\nUpon clicking, the following screen appears. Components that have not yet executed are displayed in gray.\n\n![run-6.png](./img/run-6.png)\n\nWhen a component has completed execution, it is marked with a green checkmark.\n\n![run-7.png](./img/run-7.png)\n\nIf we look at the last component, we can see that it has outputted the sum of the input values, which in this case is 8 (the sum of 3 and 5).\n\n![run-8.png](./img/run-8.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/how-to-debug.md",
    "content": "---\ntitle : \"13. Component - Debugging\"\ndescription: \"\"\nsidebar_position: 13\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Debugging Pipeline\n\nThis page covers how to debug Kubeflow components.\n\n## Failed Component\n\nWe will modify a pipeline used in [Component - MLFlow](../kubeflow/advanced-mlflow.md#mlflow-pipeline) in this page.\n\nFirst, let's modify the pipeline so that the component fails.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n    \n    data[\"sepal length (cm)\"] = None\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna()\n    data.to_csv(output_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import 
SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n\n@pipeline(name=\"debugging_pipeline\")\ndef debugging_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    drop_data = drop_na_from_csv(data=iris_data.outputs[\"data\"])\n    model = train_from_csv(\n        train_data=drop_data.outputs[\"output\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(debugging_pipeline, \"debugging_pipeline.yaml\")\n\n```\n\nThe modifications are as follows:\n\n1. In the `load_iris_data` component for loading data, `None` was injected into the `sepal length (cm)` feature.\n2. In the `drop_na_from_csv` component, use the `drop_na()` function to remove rows with na values.\n\nNow let's upload and run the pipeline.  
\nAfter running, if you press Run you will see that it has failed in the `Train from csv` component.\n\n![debug-0.png](./img/debug-0.png)\n\nClick on the failed component and check the log to see the reason for the failure.\n\n![debug-2.png](./img/debug-2.png)\n\nIf the log shows that the data count is 0 and the component did not run, there may be an issue with the input data.  \nLet's investigate what might be the problem.\n\nFirst, click on the component and go to the Input/Output tab to download the input data.  \nYou can click on the link indicated by the red square to download the data.\n\n\n![debug-5.png](./img/debug-5.png)\n\nDownload both files to the same location. Then navigate to the specified path and check the downloaded files.\n\n\n```bash\nls\n```\n\nThere are two files as follows.\n\n```bash\ndrop-na-from-csv-output.tgz load-iris-data-target.tgz\n```\n\nI will try to unzip it.\n\n```bash\ntar -xzvf load-iris-data-target.tgz ; mv data target.csv\ntar -xzvf drop-na-from-csv-output.tgz ; mv data data.csv\n```\n\nAnd then run the component code using a Jupyter notebook.\n![debug-3.png](./img/debug-3.png)\n\nDebugging revealed that dropping the data was based on rows instead of columns, resulting in all the data being removed.\nNow that we know the cause of the problem, we can modify the component to drop based on columns.\n\n```python\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna(axis=\"columns\")\n    data.to_csv(output_path, index=False)\n```\n\nAfter modifying, upload the pipeline again and run it to confirm that it is running normally as follows.\n\n![debug-6.png](./img/debug-6.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/kubeflow-concepts.md",
    "content": "---\ntitle : \"2. Kubeflow Concepts\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Component\n\nA component is composed of Component contents and a Component wrapper.\nA single component is delivered to Kubeflow through a Component wrapper and the delivered component executes the defined Component contents and produces artifacts.\n\n![concept-0.png](./img/concept-0.png)\n\n### Component Contents\n\nThere are three components that make up the component contents:\n\n![concept-1.png](./img/concept-1.png)\n\n1. Environment\n2. Python code w/ Config\n3. Generates Artifacts\n\nLet's explore each component with an example.\nHere is a Python code that loads data, trains an SVC (Support Vector Classifier) model, and saves the SVC model.\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target= pd.read_csv(train_target_path)\n\nclf= SVC(\n    kernel=kernel\n)\nclf.fit(train_data)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n     dill.dump(clf, file_writer)\n```\n\nThe above Python code can be divided into components contents as follows.\n\n![concept-2.png](./img/concept-2.png)\n\nEnvironment is the part of the Python code where the packages used in the code are imported.  \nNext, Python Code w\\ Config is where the given Config is used to actually perform the training.  \nFinally, there is a process to save the artifacts.  \n\n### Component Wrapper\n\nComponent wrappers deliver the necessary Config and execute tasks for component content.\n\n![concept-3.png](./img/concept-3.png)\n\nIn Kubeflow, component wrappers are defined as functions, similar to the `train_svc_from_csv` example above.\nWhen a component wrapper wraps the contents, it looks like the following:\n\n![concept-4.png](./img/concept-4.png)\n\n### Artifacts\n\nIn the explanation above, it was mentioned that the component creates Artifacts. 
Artifacts is a term used to refer to any form of a file that is generated, such as evaluation results, logs, etc.\nOf the ones that we are interested in, the following are significant: Models, Data, Metrics, and etc.\n\n![concept-5.png](./img/concept-5.png)\n\n- Model\n- Data\n- Metric\n- etc\n\n#### Model\n\nWe defined the model as follows: \n\n> A model is a form that includes Python code, trained weights and network architecture, and an environment to run it.\n\n#### Data\n\nData includes preprocessed features, model predictions, etc. \n\n#### Metric\n\nMetric is divided into two categories: dynamic metrics and static metrics.\n\n- Dynamic metrics refer to values that continuously change during the training process, such as train loss per epoch.\n- Static metrics refer to evaluation metrics, such as accuracy, that are calculated after the training is completed.\n\n## Pipeline\n\nA pipeline consists of a collection of components and the order in which they are executed. The order forms a directed acyclic graph (DAG), which can include simple conditional statements.\n\n![concept-6.png](./img/concept-6.png)\n\n### Pipeline Config\n\nAs mentioned earlier, components require config to be executed. The pipeline config contains the configs for all the components in the pipeline.\n\n![concept-7.png](./img/concept-7.png)\n\n## Run\n\nTo execute a pipeline, the pipeline config specific to that pipeline is required. In Kubeflow, an executed pipeline is called a \"Run.\"\n\n![concept-8.png](./img/concept-8.png)\n\nWhen a pipeline is executed, each component generates artifacts. Kubeflow pipeline assigns a unique ID to each Run, and all artifacts generated during the Run are stored.\n\n![concept-9.png](./img/concept-9.png)\n\nNow, let's learn how to write components and pipelines.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow/kubeflow-intro.md",
    "content": "---\ntitle : \"1. Kubeflow Introduction\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\"]\n---\n\nTo use Kubeflow, you need to write components and pipelines.\n\nThe approach described in *MLOps for ALL* differs slightly from the method described on the [Kubeflow Pipeline official website](https://www.kubeflow.org/docs/components/pipelines/overview/quickstart/). Here, Kubeflow Pipeline is used as one of the components in the [elements that make up MLOps](../kubeflow/kubeflow-concepts.md#component-contents) rather than a standalone workflow.\n\nNow, let's understand what components and pipelines are and how to write them.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow UI Guide\",\n  \"position\": 5,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/experiments-and-others.md",
    "content": "---\ntitle : \"6. Kubeflow Pipeline Relates\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nIn the left tabs of the Central Dashboard (KFP Experiments, Pipelines, Runs, Recurring Runs, Artifacts, Executions) you can manage Kubeflow Pipelines and the results of Pipeline execution and Pipeline Runs.\n\n![left-tabs](./img/left-tabs.png)\n\nKubeflow Pipelines are the main reason for using Kubeflow in *MLOps for ALL*, and details on how to create, execute, and check the results of Kubeflow Pipelines can be found in [3.Kubeflow](../kubeflow/kubeflow-intro).\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/experiments.md",
    "content": "---\ntitle : \"5. Experiments(AutoML)\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nNext, we will click the Experiments(AutoML) tab on the left of the Central Dashboard.\n\n![left-tabs](./img/left-tabs.png)\n\n![automl](./img/automl.png)\n\nThe Experiments(AutoML) page is where you can manage [Katib](https://www.kubeflow.org/docs/components/katib/overview/), which is responsible for AutoML through Hyperparameter Tuning and Neural Architecture Search in Kubeflow.\n\nThe usage of Katib and Experiments(AutoML) is not covered in *MLOps for Everyone* v1.0, and will be added in v2.0.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/intro.md",
    "content": "---\ntitle : \"1. Central Dashboard\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\nOnce you have completed [Kubeflow installation](../setup-components/install-components-kf.md), you can access the dashboard through the following command.\n\n```bash\nkubectl port-forward --address 0.0.0.0 svc/istio-ingressgateway -n istio-system 8080:80\n```\n\n![after-login](./img/after-login.png)\n\nThe Central Dashboard is a UI that integrates all the features provided by Kubeflow. The features provided by the Central Dashboard can be divided based on the tabs on the left side\n\n![left-tabs](./img/left-tabs.png)\n\n- Home\n- Notebooks\n- Tensorboards\n- Volumes\n- Models\n- Experiments(AutoML)\n- Experiments(KFP)\n- Pipelines\n- Runs\n- Recurring Runs\n- Artifacts\n- Executions\n\nLet's now look at the simple usage of each feature.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/notebooks.md",
    "content": "---\ntitle : \"2. Notebooks\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Launch Notebook Server\n\nClick on the Notebooks tab on the left side of the Central Dashboard.\n\n![left-tabs](./img/left-tabs.png)\n\nYou will see a similar screen.\n\nThe Notebooks tab is a page where users can independently create and access jupyter notebook and code server environments (hereinafter referred to as a notebook server).\n\n![notebook-home](./img/notebook-home.png)\n\nClick the \"+ NEW NOTEBOOK\" button at the top right. \n\n![new-notebook](./img/new-notebook.png)\n\nWhen the screen shown below appears, now specify the spec (Spec) of the notebook server to be created.\n\n![create](./img/create.png)\n\n\n<details>\n<summary>For details for spec:</summary>\n\n- **name**:\n  - Specifies a name to identify the notebook server.\n- **namespace**:\n  - Cannot be changed. (It is automatically set to the namespace of the currently logged-in user account.)\n- **Image**:\n  - Selects the image to use from pre-installed JupyterLab images with Python packages like sklearn, pytorch, tensorflow, etc.\n    - If you want to use an image that utilizes GPU within the notebook server, refer to the **GPUs** section below.\n  - If you want to use a custom notebook server that includes additional packages or source code, you can create a custom image and deploy it for use.\n- **CPU / RAM**:\n  - Specifies the amount of resources required.\n    - cpu: in core units\n      - Represents the number of virtual cores, and can also be specified as a float value such as `1.5`, `2.7`, etc.\n    - memory: in Gi units\n- **GPUs**:\n  - Specifies the number of GPUs to allocate to the Jupyter notebook.\n    - `None`\n      - When GPU resources are not required.\n    - 1, 2, 4\n      - Allocates 1, 2, or 4 GPUs.\n  - GPU Vendor:\n    - If you have followed the [(Optional) Setup GPU](../setup-kubernetes/setup-nvidia-gpu.md) guide and installed the NVIDIA GPU 
plugin, select NVIDIA.\n- **Workspace Volume**:\n  - Specifies the amount of disk space required within the notebook server.\n  - Do not change the Type and Name fields unless you want to increase the disk space or change the AccessMode.\n    - Check the **\"Don't use Persistent Storage for User's home\"** checkbox only if it is not necessary to save the notebook server's work. **It is generally recommended not to check this option.**\n    - If you want to use a pre-existing Persistent Volume Claim (PVC), select Type as \"Existing\" and enter the name of the PVC to use.\n- **Data Volumes**:\n  - If additional storage resources are required, click the **\"+ ADD VOLUME\"** button to create them.\n- ~~Configurations, Affinity/Tolerations, Miscellaneous Settings~~\n  - These are generally not needed, so detailed explanations are omitted in *MLOps for All*.\n\n</details>\n\nIf you followed the [Setup GPU (Optional)](../setup-kubernetes/setup-nvidia-gpu.md), select NVIDIA if you have installed the nvidia gpu plugin.\n\n![creating](./img/creating.png)\n\nAfter creation, the **Status** will change to a green check mark icon, and the **CONNECT button** will be activated.\n![created](./img/created.png)\n\n---\n## Accessing the Notebook Server\n\nClicking the **CONNECT button** will open a new browser window, where you will see the following screen:\n\n![notebook-access](./img/notebook-access.png)\n\nYou can use the Notebook, Console, and Terminal icons in the **Launcher** to start using them.\n\n  Notebook Interface\n\n![notebook-console](./img/notebook-console.png)\n\n  Terminal Interface\n\n![terminal-console](./img/terminal-console.png)\n\n---\n\n## Stopping the Notebook Server\n\nIf you haven't used the notebook server for an extended period of time, you can stop it to optimize resource usage in the Kubernetes cluster. 
**Note that stopping the notebook server will result in the deletion of all data stored outside the Workspace Volume or Data Volume specified when creating the notebook server.**  \nIf you haven't changed the path during notebook server creation, the default Workspace Volume path is `/home/jovyan` inside the notebook server, so any data stored outside the `/home/jovyan` directory will be deleted.\n\nClicking the `STOP` button as shown below will stop the notebook server:\n\n![notebook-stop](./img/notebook-stop.png)\n\nOnce the server is stopped, the `CONNECT` button will be disabled. To restart the notebook server and use it again, click the `PLAY` button.\n\n![notebook-restart](./img/notebook-restart.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/tensorboards.md",
    "content": "---\ntitle : \"3. Tensorboards\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nLet's click on the Tensorboards tab of the left tabs of the Central Dashboard next.\n\n![left-tabs](./img/left-tabs.png)\n\nWe can see the following screen. \n\n![tensorboard](./img/tensorboard.png)\n\nThe TensorBoard server created in this way can be used just like a regular remote TensorBoard server, or it can be used for the purpose of storing data directly from a Kubeflow Pipeline run for visualization purposes.\n\nYou can refer to the [TensorBoard documentation](https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#tensorboard) for more information on using TensorBoard with Kubeflow Pipeline runs.\n\nThere are various ways to visualize the results of Kubeflow Pipeline runs, and in *MLOps for ALL*, we will utilize the Visualization feature of Kubeflow components and the visualization capabilities of MLflow to enable more general use cases. Therefore, detailed explanations of the TensorBoards page will be omitted in this context.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/kubeflow-dashboard-guide/volumes.md",
    "content": "---\ntitle : \"4. Volumes\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Volumes\n\nNext, let's click on the Volumes tab in the left of the Central Dashboard.\n\n![left-tabs](./img/left-tabs.png)\n\nYou will see the following screen.\n\n![volumes](./img/volumes.png)\n\n\nVolumes tab provides the functionality to manage the Persistent Volume Claims (PVC) belonging to the current user's namespace in Kubernetes' Volume (Volume).\n\nBy looking at the screenshot, you can see the information of the Volume created on the [1. Notebooks](../kubeflow-dashboard-guide/notebooks) page. It can be seen that the Storage Class of the Volume is set to local-path, which is the Default Storage Class installed at the time of Kubernetes cluster installation.\n\nIn addition, the Volumes page can be used if you want to create, view, or delete a new Volume in the user namespace.\n\n---\n\n## Creating a Volume\n\nBy clicking the `+ NEW VOLUME` button at the top right, you can see the following screen.\n\n![new-volume](./img/new-volume.png)\n\n\nYou can create a volume by specifying its name, size, storage class, and access mode.\n\nWhen you specify the desired resource specs to create a volume, its Status will be shown as Pending on this page. When you hover over the Status icon, you will see a message that this *(This volume will be bound when its first consumer is created.)*  \nThis is according to the volume creation policy of the [StorageClass](https://kubernetes.io/ko/docs/concepts/storage/storage-classes/) used in the lab, which is local-path. **This is not a problem situation.**  \nWhen the Status is shown as Pending on this page, you can still specify the name of the volume in the notebook server or pod that you want to use the volume and the volume creation will be triggered at that time.\n\n![creating-volume](./img/creating-volume.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/_category_.json",
    "content": "{\n  \"label\": \"Prerequisites\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/_category_.json",
    "content": "{\n  \"label\": \"Docker\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/advanced.md",
    "content": "---\ntitle : \"[Practice] Docker Advanced\"\ndescription: \"Practice to use docker more advanced way.\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Making a good Docker image\n\n### Considerations to make Docker image:\n\nWhen creating a Docker image using a Dockerfile, the **order** of the commands is important.  \nThis is because Docker images are composed of many Read-Only layers and when building the image, existing layers are **cached** and reused, so if you structure your Dockerfile with this in mind, you can **reduce the build time**.\n\nEach of the `RUN`, `ADD`, `COPY` commands in a Dockerfile are stored as one layer.\n\nFor example, if we have the following `Dockerfile`:\n\n```docker\n# Layer 1\nFROM ubuntu:latest\n\n# Layer 2\nRUN apt-get update && apt-get install python3 pip3 -y\n\n# Layer 3\nRUN pip3 install -U pip && pip3 install torch\n\n# Layer 4\nCOPY src/ src/\n\n# Layer 5\nCMD python src/app.py\n```\n\nIf you run the image built with the above `Dockerfile` with the command `docker run -it app:latest /bin/bash`, it can be represented in the following layers. \n\n![layers.png](./img/layers.png)\n\nThe topmost R/W layer does not affect the image. In other words, any changes made inside the container are volatile.\n\nWhen a lower layer is changed, all the layers above it need to be rebuilt. Therefore, the order of Dockerfile instructions is important. It is recommended to place the parts that are frequently changed towards the end. (e.g., `COPY src/ app/src/`)\n\nConversely, parts that are unlikely to change should be placed towards the beginning.\n\nIf there are parts that are rarely changed but used in multiple places, they can be consolidated. 
It is advisable to create a separate image for those common parts in advance and use it as a base image.\n\nFor example, if you want to create separate images for an environment that uses `tensorflow-cpu` and another environment that uses `tensorflow-gpu`, you can do the following:\nCreate a base image [`ghcr.io/makinarocks/python:3.8-base`](http://ghcr.io/makinarocks/python:3.8-base-cpu) that includes Python and other basic packages installed. Then, when creating the images with the CPU and GPU versions of TensorFlow, you can use the base image as the `FROM` instruction and write the separate instructions for installing TensorFlow in each Dockerfile. Managing two Dockerfiles in this way improves readability and reduces build time.\n\nCombining layers had performance benefits in older versions of Docker. However, since you cannot guarantee the Docker version in which your Docker containers will run, it is recommended to combine layers for readability purposes. It is best to combine layers that can be combined appropriately.\n\nHere is an example of a Dockerfile:\n\n```docker\n# Bad Case\nRUN apt-get update\nRUN apt-get install build-essential -y\nRUN apt-get install curl -y\nRUN apt-get install jq -y\nRUN apt-get install git -y\n```\n\nThis can be written by combining it as follows.\n\n```docker\n# Better Case\nRUN apt-get update && \\\n    apt-get install -y \\\n    build-essential \\\n    curl \\\n    jq \\\n    git\n```\n\nFor convenience, it is better to use `.dockerignore`.  \n`.dockerignore` is similar to `.gitignore` in the sense that it can be excluded when doing a `docker build` just like when doing a `git add`. \n\nMore information can be found in the [Docker Official Documentation](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/).\n\n### ENTRYPOINT vs CMD\n\n`ENTRYPOINT` and `CMD` are both used when you want to execute a command at the runtime of the container. 
One of them must be present in the Dockerfile.\n\n- **Difference**\n  - `CMD`: Easily modifiable when running `docker run` command\n  - `ENTRYPOINT`: Requires the use of `--entrypoint` to modify\n\nWhen `ENTRYPOINT` and `CMD` are used together, `CMD` typically represents the arguments (parameters) for the command specified in `ENTRYPOINT`.\n\nFor example, consider the following Dockerfile:\n\n```docker\nFROM ubuntu:latest\n\n# 아래 4 가지 option 을 바꿔가며 직접 테스트해보시면 이해하기 편합니다.\n# 단, NO ENTRYPOINT 옵션은 base image 인 ubuntu:latest 에 이미 있어서 테스트해볼 수는 없고 나머지 v2, 3, 5, 6, 8, 9, 11, 12 를 테스트해볼 수 있습니다.\n# ENTRYPOINT echo \"Hello ENTRYPOINT\"\n# ENTRYPOINT [\"echo\", \"Hello ENTRYPOINT\"]\n# CMD echo \"Hello CMD\"\n# CMD [\"echo\", \"Hello CMD\"]\n```\n\n\nIf you build and run the above `Dockerfile` with the parts marked as comments deactivated, you can get the following results: \n\n|                    | No ENTRYPOINT  | ENTRYPOINT a b | ENTRYPOINT [\"a\", \"b\"] |\n| ------------------ | -------------- | -------------- | --------------------- |\n| **NO CMD**         | Error!         | /bin/sh -c a b | a b                   |\n| **CMD [\"x\", \"y\"]** | x y            | /bin/sh -c a b | a b x y               |\n| **CMD x y**        | /bin/sh -c x y | /bin/sh -c a b | a b /bin/sh -c x y    |\n\n- In Kubernetes pod, \n    - `ENTRYPOINT` corresponds to the command\n    - `CMD` corresponds to the arguments\n\n### Naming docker tag\n\nRecommend not using \"latest\" as a tag for a Docker image, as it is the default tag name and can be easily overwritten unintentionally.\n\nIt is important to ensure uniqueness of one image with one tag for the sake of collaboration and debugging in the production stage.  \nUsing the same tag for different contents can lead to dangling images, which are not shown in the `docker images` but still take up storage space.\n\n### ETC\n\n1. 
Logs and other information are stored separately from the container, not inside it.\n    This is because data written from within the container can be lost at any time.\n2. Secrets and environment-dependent information should not be written directly into the Dockerfile but should be passed in via environment variables or a .env config file.\n3. There is a **linter** for Dockerfiles, so it is useful to use it when collaborating.\n    [https://github.com/hadolint/hadolint](https://github.com/hadolint/hadolint)\n\n## Several options for docker run\n\nWhen using Docker containers, there are some inconveniences.\nSpecifically, Docker does not store any of the work done within the Docker container by default.\nThis is because Docker containers use isolated file systems. Therefore, it is difficult to share data between multiple Docker containers.\n\nTo solve this problem, there are two approaches offered by Docker.\n\n![storage.png](./img/storage.png)\n\n#### Docker volume\n\n- Use the Docker CLI to directly manage a resource called `volume`.\n- Create a specific directory under the Docker area (`/var/lib/docker`) on the host and mount that path to a Docker container.\n\n#### Bind mount\n\n- Mount a specific path on the host to a Docker container.\n\n#### How to use?\n\nThe usage is through the same interface, using the `-v` option.  
\nHowever, when using volumes, you need to manage them directly by performing commands like `docker volume create`, `docker volume ls`, `docker volume rm`, etc.\n\n- Docker volume\n\n    ```bash\n    docker run \\\n        -v my_volume:/app \\\n        nginx:latest\n    ````\n\n- Blind mount\n\n    ```bash\n    docker run \\\n        -v /home/user/some/path:/app \\\n        nginx:latest\n    ```\n\nWhen developing locally, bind mount can be convenient, but if you want to maintain a clean environment, using Docker volume and explicitly performing create and rm operations can be another approach.\n\nThe way storage is provided in Kubernetes ultimately relies on Docker's bind mount as well.\n\n### Docker run with resource limit\n\nBasically, docker containers can **fully utilize the CPU and memory resources of the host OS**. However, when using this, depending on the resource situation of the host OS, docker containers may abnormally terminate due to issues such as **OOM**.\nTo address this problem, docker provides the `-m` [option](https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory) which allows you to **limit the usage of CPU and memory** when running the docker container.\n\n```bash\ndocker run -d -m 512m --memory-reservation=256m --name 512-limit ubuntu sleep 3600\ndocker run -d -m 1g --memory-reservation=256m --name 1g-limit ubuntu sleep 3600\n```\n\nAfter running the Docker above, you can check the usage through the 'docker stats' command.\n\n```bash\nCONTAINER ID   NAME        CPU %     MEM USAGE / LIMIT   MEM %     NET I/O       BLOCK I/O   PIDS\n4ea1258e2e09   1g-limit    0.00%     300KiB / 1GiB       0.03%     1kB / 0B      0B / 0B     1\n4edf94b9a3e5   512-limit   0.00%     296KiB / 512MiB     0.06%     1.11kB / 0B   0B / 0B     1\n```\n\nIn Kubernetes, when you limit the CPU and memory resources of a pod resource, it is provided using this technique.\n\n### docker run with restart policy\n\nIf there is a 
need to keep a particular container running continuously, the `--restart=always` option will re-create the container immediately whenever it terminates.\n\nRun docker with this option:\n\n```bash\ndocker run --restart=always ubuntu\n```\n\nRun `watch -n1 docker ps` to check whether it keeps restarting.\nIf it is working as intended, `Restarting (0)` will appear in the STATUS column.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED          STATUS                         PORTS     NAMES\na911850276e8   ubuntu    \"bash\"    35 seconds ago   Restarting (0) 6 seconds ago             hungry_vaughan\n```\n\n- [https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart](https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart)\n  - Provides options such as \"on-failure with max retries\" and \"always\"\n\nWhen you specify the restart option for a job resource in Kubernetes, this approach is used.\n\n### Running docker run as a background process\n\nBy default, a Docker container runs as a foreground process. This means the terminal that launched the container is attached to it, preventing you from running other commands.\n\nLet's try an example. Open two terminals; in one, continuously monitor `docker ps`, and in the other, execute the following commands one by one and observe the behavior.\n\n#### First Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\nYour terminal is blocked for 10 seconds and you cannot run any other commands in it. After 10 seconds, `docker ps` shows that the container has terminated.\n\n#### Second Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\nAfter that, press `ctrl + p` -> `ctrl + q`.\n\nNow you can run other commands in that terminal, and `docker ps` shows the container is still alive for up to 10 seconds. 
This situation, where you have exited from the Docker container while it keeps running, is called \"detached\". Docker provides an option to start containers directly in detached mode, which lets you run the container in the background from the `run` command onward.\n\n#### Third Practice\n\n```bash\ndocker run -d ubuntu sleep 10\n```\n\nIn detached mode, you can keep working in the terminal that executed the command.\n\nUse detached mode as appropriate for the situation.  \nFor example, when developing a backend API server that talks to a DB, the backend API server needs to be constantly watched with hot reloading while you change the source code, but the DB does not need to be monitored. So you can run the DB container in detached mode, and run the backend API server in attached mode to follow its logs.\n\n\n## References\n\n- [https://towardsdatascience.com/docker-storage-598e385f4efe](https://towardsdatascience.com/docker-storage-598e385f4efe)\n- [https://vsupalov.com/docker-latest-tag/](https://vsupalov.com/docker-latest-tag/)\n- [https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version](https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version)\n- [https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/](https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/command.md",
    "content": "---\ntitle : \"[Practice] Docker command\"\ndescription: \"Practice to use docker command.\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 1. Normal installation confirmation\n\n```bash\ndocker run hello-world\n```\n\nIf installed correctly, you should be able to see the following message.\n\n```bash\nHello from Docker!\nThis message shows that your installation appears to be working correctly.\n....\n```\n\n\n**(For ubuntu)** If you want to use without sudo, please refer to the following site.\n\n- [https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)\n\n## 2. Docker Pull\n\nDocker pull is a command to download Docker images from a Docker image registry (a repository where Docker images are stored and shared).\n\nYou can check the arguments available in docker pull using the command below.\n\n```bash\ndocker pull --help\n```\n\nIf performed normally, it prints out as follows.\n\n```bash\nUsage:  docker pull [OPTIONS] NAME[:TAG|@DIGEST]\n\nPull an image or a repository from a registry\n\nOptions:\n  -a, --all-tags                Download all tagged images in the repository\n      --disable-content-trust   Skip image verification (default true)\n      --platform string         Set platform if server is multi-platform capable\n  -q, --quiet                   Suppress verbose output\n```\n\nIt can be seen here that docker pull takes two types of arguments. \n\n1. `[OPTIONS]`\n2. `NAME[:TAG|@DIGEST]`\n\nIn order to use the `-a` and `-q` options from help, they must be used before the NAME. 
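\nFor example, a quiet pull (suppressing the verbose progress output) places the `-q` option between `pull` and the image name:\n\n```bash\ndocker pull -q ubuntu:18.04\n```\n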
\nLet's pull the `ubuntu:18.04` image ourselves.\n\n```bash\ndocker pull ubuntu:18.04\n```\n\nInterpreted, this command means: pull the image named `ubuntu` with the tag `18.04`.\n\nIf performed successfully, it will produce an output similar to the following.\n\n```bash\n18.04: Pulling from library/ubuntu\n20d796c36622: Pull complete \nDigest: sha256:42cd9143b6060261187a72716906187294b8b66653b50d70bc7a90ccade5c984\nStatus: Downloaded newer image for ubuntu:18.04\ndocker.io/library/ubuntu:18.04\n```\n\nIf you perform the above command, you will download the image `ubuntu:18.04` from a registry named [docker.io/library](http://docker.io/library/) to your local machine.\n\n- Note that \n  - in the future, if you need to get a docker image from a certain **private** registry instead of docker.io or public docker hub, you can use [`docker login`](https://docs.docker.com/engine/reference/commandline/login/) to authenticate against that registry, then use `docker pull`. Alternatively, you can set up an [insecure registry](https://stackoverflow.com/questions/42211380/add-insecure-registry-to-docker). \n  - Also note that the [`docker save`](https://docs.docker.com/engine/reference/commandline/save/) and [`docker load`](https://docs.docker.com/engine/reference/commandline/load/) commands are available to store and share docker images as `.tar` files within an intranet.\n\n\n## 3. 
Docker images\n\nThis is the command to list the Docker images that exist locally.\n\n```bash\ndocker images --help\n```\n\nThe arguments available for use in docker images are as follows.\n\n```bash\nUsage:  docker images [OPTIONS] [REPOSITORY[:TAG]]\n\nList images\n\nOptions:\n  -a, --all             Show all images (default hides intermediate images)\n      --digests         Show digests\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print images using a Go template\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only show image IDs\n```\n\nLet's try executing the command below directly.\n\n```bash\ndocker images\n```\n\nIf you install Docker and proceed with this practice, it will output something similar to this.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED      SIZE\nubuntu       18.04     29e70752d7b2   2 days ago   56.7MB\n```\n\nIf you use the `-q` argument among the possible arguments, only the `IMAGE ID` will be printed.\n\n```bash\ndocker images -q\n```\n\n```bash\n29e70752d7b2\n```\n\n## 4. 
Docker ps\n\nCommand to output the list of currently running Docker containers.\n\n```bash\ndocker ps --help\n```\n\nThe following arguments can be used with `docker ps`:\n\n```bash\nUsage:  docker ps [OPTIONS]\n\nList containers\n\nOptions:\n  -a, --all             Show all containers (default shows just running)\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print containers using a Go template\n  -n, --last int        Show n last created containers (includes all states) (default -1)\n  -l, --latest          Show the latest created container (includes all states)\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only display container IDs\n  -s, --size            Display total file sizes\n```\n\nLet's try running the command below directly.\n\n```bash\ndocker ps\n```\n\nIf there are no currently running containers, it will be as follows.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\nIf there is a container running, it will look similar to this.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND        CREATED          STATUS          PORTS     NAMES\nc1e8f5e89d8d   ubuntu    \"sleep 3600\"   13 seconds ago   Up 12 seconds             trusting_newton\n```\n\n## 5. Docker run\n\nCommand to run a Docker container.\n\n```bash\ndocker run --help\n```\n\nThe usage of `docker run` is as follows.\n\n```bash\nUsage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\n\nRun a command in a new container\n```\n\nWhat we need to confirm here is that the docker run command takes three types of arguments. \n\n1. `[OPTIONS]`\n2. `[COMMAND]`\n3. 
`[ARG...]`\n\nLet's try running a docker container ourselves.\n\n```bash\n## Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\ndocker run -it --name demo1 ubuntu:18.04 /bin/bash\n```\n\n- `-it`: Combination of `-i` and `-t` options\n  - Runs the container and connects it to an interactive terminal\n- `--name`: Assigns a name to the container for easier identification instead of using the container ID\n- `/bin/bash`: Specifies the command to be executed in the container upon startup, where `/bin/bash` opens a bash shell.\n\nAfter running the command, you can exit the container by using the `exit` command.\n\nWhen you enter the previously learned `docker ps` command, the following output will be displayed.\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\nWe said this command lists running containers, yet the container we just ran does not appear. The reason is that `docker ps` shows only currently running containers by default. If you want to see stopped containers too, add the `-a` option.\n```bash\ndocker ps -a\n```\n\nThen the list of terminated containers will also be displayed.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND       CREATED         STATUS                     PORTS     NAMES\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"   2 minutes ago   Exited (0) 2 minutes ago             demo1\n```\n\n## 6. 
Docker exec\n\nDocker exec is a command used to issue commands to, or access the inside of, a running Docker container.\n\n```bash\ndocker exec --help\n```\n\nFor example, let's try running the following command.\n\n```bash\ndocker run -d --name demo2 ubuntu:18.04 sleep 3600\n```\n\nHere, the `-d` option runs the Docker container in the background, so it keeps running even after you disconnect from it.\n\nUse `docker ps` to check if it is currently running.\n\nIt can be confirmed that it is running as follows.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED         STATUS         PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   4 seconds ago   Up 3 seconds             demo2\n```\n\nNow let's connect to the running docker container through the `docker exec` command.\n\n```bash\ndocker exec -it demo2 /bin/bash\n```\n\nJust as with the earlier `docker run` command, this gives you access to the inside of the container.\n\nYou can exit using `exit`.\n\n## 7. Docker logs\n\n```bash\ndocker logs --help\n```\n\nLet's run the following container.\n\n```bash\ndocker run --name demo3 -d busybox sh -c \"while true; do $(echo date); sleep 1; done\"\n```\n\nThe command above runs a busybox container named \"demo3\" in the background, printing the current time once every second.\n\nNow let's check the log with the command below.\n\n```bash\ndocker logs demo3\n```\n\nIf performed normally, it will be similar to below.\n\n```bash\nSun Mar  6 11:06:49 UTC 2022\nSun Mar  6 11:06:50 UTC 2022\nSun Mar  6 11:06:51 UTC 2022\nSun Mar  6 11:06:52 UTC 2022\nSun Mar  6 11:06:53 UTC 2022\nSun Mar  6 11:06:54 UTC 2022\n```\nHowever, used this way, you can only see the logs collected so far.  \nIn this case, you can use the `-f` option to keep following the output.\n\n```bash\ndocker logs demo3 -f\n```\n\n## 8. 
Docker stop\n\nCommand to stop a running Docker container.\n\n```bash\ndocker stop --help\n```\n\nThrough `docker ps`, you can check the containers currently running, as follows.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED              STATUS              PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   About a minute ago   Up About a minute             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             4 minutes ago        Up 4 minutes                  demo2\n```\nNow let's stop a container with `docker stop`.\n\n```bash\ndocker stop demo2\n```\n\nAfter executing, type `docker ps` again.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS     NAMES\n730391669c39   busybox   \"sh -c 'while true; …\"   2 minutes ago   Up 2 minutes             demo3\n```\n\nComparing with the above result, you can see that the demo2 container has disappeared from the list of currently running containers.\nLet's stop the remaining container as well.\n\n```bash\ndocker stop demo3\n```\n\n## 9. Docker rm\n\nCommand to delete a Docker container.\n\n```bash\ndocker rm --help\n```\n\nStopped Docker containers are not deleted automatically; that's why you can still see them with `docker ps -a`.\nBut why do we have to delete the stopped containers?  \nEven when stopped, the data used by the container remains inside it,\nso the container can be started again later. 
However, this consumes disk space.\nSo, to delete containers that are no longer used at all, we use the `docker rm` command.\n\nFirst, let's check the current containers.\n\n```bash\ndocker ps -a\n```\n\nThere are three containers as follows.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED          STATUS                            PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   4 minutes ago    Exited (137) About a minute ago             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             7 minutes ago    Exited (137) 2 minutes ago                  demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"              10 minutes ago   Exited (0) 10 minutes ago                   demo1\n```\n\nLet's delete the `demo3` container with the following command.\n\n```bash\ndocker rm demo3\n```\n\nRunning `docker ps -a` again now shows only two containers.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED          STATUS                       PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   13 minutes ago   Exited (137) 8 minutes ago             demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"    16 minutes ago   Exited (0) 16 minutes ago              demo1\n```\n\nDelete the remaining containers as well.\n\n```bash\ndocker rm demo2\ndocker rm demo1\n```\n\n## 10. 
Docker rmi\n\nCommand to delete a Docker image.\n\n```bash\ndocker rmi --help\n```\n\nUse the following command to check which images currently exist locally.\n\n```bash\ndocker images\n```\n\nThe output is as follows.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nbusybox      latest    a8440bba1bc0   32 hours ago   1.41MB\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\nLet's delete the `busybox` image.\n\n```bash\ndocker rmi busybox\n```\n\nIf you type `docker images` again, the following will appear.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\n## References\n\n- [https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/docker.md",
    "content": "---\ntitle : \"What is Docker?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n\n## Container\n\n- Containerization:\n  - A technology that allows applications to be executed uniformly anywhere.\n- Container Image:\n  - A collection of all the files required to run an application.\n  - → Similar to a mold for making fish-shaped bread (Bungeoppang).\n- Container:\n  - A single process that is executed based on a container image.\n  - → A fish-shaped bread (Bungeoppang) produced using a mold.\n\n## Docker\n\nDocker is a platform that allows you to manage and use containers.  \nIts slogan is \"Build Once, Run Anywhere,\" guaranteeing the same execution results anywhere.\n\nIn Docker, container resources are isolated and lifecycles are controlled by Linux kernel features such as cgroups.  \nHowever, these interfaces are too difficult to use directly, so an abstraction layer was created.\n\n![docker-layer.png](./img/docker-layer.png)\n\nThrough this, users can easily control containers with just the user-friendly API **Docker CLI**.\n\n## Interpretation of Layer\n\nThe roles of the layers mentioned above are as follows:\n\n1. runC: Utilizes the functionality of the Linux kernel to isolate namespaces, CPUs, memory, filesystems, etc., for a container, which is a single process.\n2. containerd: Acts as an abstraction layer that communicates with runC (the OCI layer) through the standardized interface (OCI).\n3. dockerd: Solely responsible for issuing commands to containerd.\n4. 
Docker CLI: Users only need to issue commands to dockerd (the Docker daemon) using the Docker CLI.\n   - This communication goes through a Unix socket, which is why Docker-related errors such as \"the /var/run/docker.sock is in use\" or \"insufficient permissions\" sometimes occur.\n\nAlthough Docker comprises many layers, the term \"Docker\" can refer to the Docker CLI, dockerd (the Docker daemon), or even a single Docker container, which can lead to confusion.  \nIn the upcoming text, the term \"Docker\" may be used in various contexts.\n\n## For ML Engineer\n\nML engineers use Docker for the following reasons:\n\n1. ML training/inference code needs to be independent of the underlying operating system, Python version, Python environment, and specific versions of Python packages.\n2. Therefore, the goal is to bundle not only the code but also all the dependent packages, environment variables, folder names, etc., into a single package. Containerization technology enables this.\n3. Docker is one of the software tools that makes it easy to use and manage this technology, and the packaged units are referred to as Docker images.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/images.md",
    "content": "---\ntitle : \"[Practice] Docker images\"\ndescription: \"Practice to use docker image.\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n- `docker commit`\n  - A way to create a docker image from a running container\n  - `docker commit -m \"message\" -a \"author\" <container-id> <image-name>`\n  - With `docker commit`, you can create a docker image without manually writing a Dockerfile.\n    ```\n    touch Dockerfile\n    ```\n\n3. Move to the docker-practice folder.\n\n4. Create an empty file called Dockerfile.\n\n1. Which instruction installs a specific package into the image?\n\nAnswer: `RUN`\n\nLet's look at the basic commands that can be used in a Dockerfile one by one. FROM specifies which image to use as the base image for the Dockerfile. When creating a Docker image, instead of building the intended environment from scratch, you can use a pre-made image such as `python:3.9` or `python-3.9-alpine` as the base, then install pytorch and add your own source code on top.\n```docker\nFROM <image>[:<tag>] [AS <name>]\n\n# examples\nFROM ubuntu\nFROM ubuntu:18.04\nFROM nginx:latest AS ngx\n```\n\nCOPY copies files or directories from the `<src>` path on the host (local) to the `<dest>` path inside the container.\n```docker\nCOPY <src>... <dest>\n\n# examples\nCOPY a.txt /some-directory/b.txt\nCOPY my-directory /some-directory-2\n```\n\nADD is similar to COPY but has additional features.\n```docker\n# 1 - can extract a compressed file on the host while copying it into the container\nADD scripts.tar.gz /tmp\n# 2 - can specify a file at a remote URL as the source path\nADD http://www.example.com/script.sh /tmp\n\n# it is recommended to use ADD instead of COPY only when you need these two features\n```\n\nRUN executes the specified command inside the Docker container during the build. 
\nDocker images maintain the state after these commands have been executed.\n```docker\nRUN <command>\nRUN [\"executable-command\", \"parameter1\", \"parameter2\"]\n\n# examples\nRUN pip install torch\nRUN pip install -r requirements.txt\n```\n\nCMD specifies a command that the Docker container will **run when it starts**. There is a similar command called **ENTRYPOINT**; the difference between them will be discussed **later**. Note that only one **CMD** takes effect in a Docker image, which is different from the **RUN** command.\n```docker\nCMD <command>\nCMD [\"executable-command\", \"parameter1\", \"parameter2\"]\nCMD [\"parameter1\", \"parameter2\"] # when used together with ENTRYPOINT\n\n# example\nCMD python main.py\n```\n\nWORKDIR specifies the directory inside the container where subsequent commands will be performed. If the directory does not exist, it will be created.\n```docker\nWORKDIR /path/to/workdir\n\n# examples\nWORKDIR /home/demo\nRUN pwd # prints /home/demo\n```\n\nENV sets the value of environment variables that will be used continuously inside the container.\n```docker\nENV <KEY> <VALUE>\nENV <KEY>=<VALUE>\n\n# examples\n# default locale setting\nRUN locale-gen ko_KR.UTF-8\nENV LANG ko_KR.UTF-8\nENV LANGUAGE ko_KR.UTF-8\nENV LC_ALL ko_KR.UTF-8\n```\n\nEXPOSE specifies the port/protocol to be opened from the container. 
If `<protocol>` is not specified, TCP is set as the default.\n```docker\nEXPOSE <port>\nEXPOSE <port>/<protocol>\n\n# example\nEXPOSE 8080\n```\nWrite a simple Dockerfile using `vim Dockerfile` or an editor like vscode, with the following contents:\n```docker\n# set the base image to ubuntu 18.04\nFROM ubuntu:18.04\n\n# run the apt-get update command\nRUN apt-get update\n\n# set the TEST env var to hello\nENV TEST hello\n\n# when the docker container starts, print the value of the TEST env var\nCMD echo $TEST\n```\n\nUse the `docker build` command to create a Docker image from a Dockerfile.\n```bash\ndocker build --help\n```\n\nRun the following command from the path where the Dockerfile is located.\n```bash\ndocker build -t my-image:v1.0.0 .\n```\n\nThe command above builds an image named \"my-image\" with the tag \"v1.0.0\" from the Dockerfile in the current path. Let's check whether the image was built successfully.\n```bash\n# grep: filter the output for lines containing my-image\ndocker images | grep my-image\n```\nIf performed normally, it will output as follows.\n```bash\nmy-image     v1.0.0    143114710b2d   3 seconds ago   87.9MB\n```\n\nLet's now **run** a docker container with the `my-image:v1.0.0` image that we just built.\n```bash\ndocker run my-image:v1.0.0\n```\n\nIf performed normally, it will result in the following.\n```bash\nhello\n```\n\nLet's run a docker container from `my-image:v1.0.0`, overriding the value of the `TEST` env var at run time.\n```bash\ndocker run -e TEST=bye my-image:v1.0.0\n```\nIf performed normally, it will be as follows.\n```bash\nbye\n```\n\n\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/install.md",
    "content": "---\ntitle : \"Install Docker\"\ndescription: \"Install docker to start.\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Docker\n\nTo practice Docker, you need to install Docker.  \nThe installation procedure varies depending on which OS you are using.  \nPlease refer to the official website for the Docker installation that fits your environment:\n\n- [ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [mac](https://docs.docker.com/desktop/mac/install/)\n- [windows](https://docs.docker.com/desktop/windows/install/)\n\n## Check Installation\n\nTo check the installation, you need an OS and terminal environment where `docker run hello-world` runs correctly.\n\n| OS      | Docker Engine  | Terminal           |\n| ------- | -------------- | ------------------ |\n| MacOS   | Docker Desktop | zsh                |\n| Windows | Docker Desktop | Powershell         |\n| Windows | Docker Desktop | WSL2               |\n| Ubuntu  | Docker Engine  | bash               |\n\n## Before diving in..\n\nSince this guide explains the Docker usage needed for MLOps, many of the metaphors and examples will be oriented toward MLOps.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/prerequisites/docker/introduction.md",
    "content": "---\ntitle : \"Why Docker & Kubernetes ?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Why Kubernetes ?\n\nTo operationalize machine learning models, additional functionalities beyond model development are required.\n\n1. Training Phase\n   - Schedule management for model training commands\n   - Ensuring reproducibility of trained models\n2. Deployment Phase\n   - Traffic distribution\n   - Monitoring service failures\n   - Troubleshooting in case of failures\n\nFortunately, the software development field has already put a lot of thought and effort into addressing these needs. Therefore, when deploying machine learning models, leveraging the outcomes of these considerations can be highly beneficial. Docker and Kubernetes are two prominent software products widely used in MLOps to address these needs.\n\n## Docker & Kubernetes\n\n### Not a software but a product\n\nDocker and Kubernetes are representative software (products) that provide containerization and container orchestration functions respectively.\n\n#### Docker\n\nDocker was the mainstream in the past, but its usage has been decreasing gradually with the addition of various paid policies.  
\nHowever, as of March 2022, it is still the most commonly used container virtualization software.\n\n![sysdig-2019.png](./img/sysdig-2019.png)\n\n<center> [from sysdig 2019] </center>\n\n![sysdig-2021.png](./img/sysdig-2021.png)\n\n<center> [from sysdig 2021]  </center>\n\n#### Kubernetes\n\nKubernetes, by contrast, is a product with almost no competition so far.\n\n![cncf-survey.png](./img/cncf-survey.png)\n\n<center> [from cncf survey] </center>\n\n![t4-ai.png](./img/t4-ai.png)\n\n<center> [from t4.ai]  </center>\n\n### History of Open source\n\n#### Initial Docker & Kubernetes\n\nAt the beginning of Docker development, **one package** called Docker Engine contained multiple features such as API, CLI, networking, storage, etc., but it began to be **divided one by one** according to the philosophy of **MSA**.  \nHowever, the initial Kubernetes included Docker Engine for container virtualization.  \nTherefore, whenever the Docker version was updated, the interface of Docker Engine changed and Kubernetes was greatly affected.\n\n#### Open Container Initiative\n\nIn order to alleviate such inconveniences, many groups interested in container technology such as Google came together to start the Open Container Initiative (OCI) project to set standards for containers.  \nDocker further separated its interface and developed containerd, a container runtime that adheres to the OCI standard, and added an abstraction layer so that dockerd calls the API of containerd.\n\nFollowing this flow, since version 1.5 Kubernetes supports not only Docker but any container runtime that adheres to the OCI standard and implements the Container Runtime Interface (CRI) specification. 
\n\n#### CRI-O\n\nCRI-O is a container runtime developed by Red Hat, Intel, SUSE, and IBM, which adheres to the OCI standard + CRI specifications, specifically for Kubernetes.\n\n#### Current docker & kubernetes\n\nKubernetes long used Docker Engine as its default container runtime, but since Docker's API did not match the CRI specification (*though it follows OCI*), Kubernetes developed and maintained a **dockershim** to make Docker's API compatible with CRI (*a huge burden for Kubernetes, not for Docker*). This was **deprecated in Kubernetes v1.20 and removed in v1.23**.\n\n- v1.23 was released in December 2021\n\nSo from Kubernetes v1.23, you can no longer use Docker natively.\nHowever, **users are not much affected by this change**, because Docker images created through Docker Engine comply with the OCI standard, so they can be used regardless of which container runtime Kubernetes uses.\n\n### References\n\n- [*https://www.linkedin.com/pulse/containerd는-무엇이고-왜-중요할까-sean-lee/?originalSubdomain=kr*](https://www.linkedin.com/pulse/containerd%EB%8A%94-%EB%AC%B4%EC%97%87%EC%9D%B4%EA%B3%A0-%EC%99%9C-%EC%A4%91%EC%9A%94%ED%95%A0%EA%B9%8C-sean-lee/?originalSubdomain=kr)\n- [https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/](https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/)\n- [https://kubernetes.io/blog/2020/12/02/dockershim-faq/](https://kubernetes.io/blog/2020/12/02/dockershim-faq/)\n- [https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n- [https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-components/_category_.json",
    "content": "{\n  \"label\": \"Setup Components\",\n  \"position\": 3,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-components/install-components-kf.md",
    "content": "---\ntitle : \"1. Kubeflow\"\ndescription: \"Setup components - Kubeflow\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\n## Prepare the installation file\n\nPrepare the installation files for installing Kubeflow **v1.4.0**.\n\nClone the [kubeflow/manifests Repository](https://github.com/kubeflow/manifests) with the **v1.4.0** tag, and move to the corresponding folder.\n\n```bash\ngit clone -b v1.4.0 https://github.com/kubeflow/manifests.git\ncd manifests\n```\n\n## Install each component\n\nThe kubeflow/manifests repository provides installation commands for each component, but it often lacks information on potential issues that may arise during installation or how to verify if the installation was successful. This can make it challenging for first-time users.  \nTherefore, in this document, we will provide instructions on how to verify the successful installation of each component.\n\nPlease note that this document will not cover the installation of components that are not covered in *MLOps for ALL*, such as Knative, KFServing, and MPI Operator, as we prioritize efficient resource usage.\n\n### Cert-manager\n\n1. 
Install cert-manager.\n\n  ```bash\n  kustomize build common/cert-manager/cert-manager/base | kubectl apply -f -\n  ```\n\n  If the installation is successful, you should see output similar to the following:\n\n  ```bash\n  namespace/cert-manager created\n  customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created\n  serviceaccount/cert-manager created\n  serviceaccount/cert-manager-cainjector created\n  serviceaccount/cert-manager-webhook created\n  role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  role.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-edit created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-view created\n  
clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  service/cert-manager created\n  service/cert-manager-webhook created\n  deployment.apps/cert-manager created\n  deployment.apps/cert-manager-cainjector created\n  deployment.apps/cert-manager-webhook created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  ```\n\n  Wait for all 3 pods in the cert-manager namespace to become Running:\n\n  ```bash\n  kubectl get pod -n cert-manager\n  ```\n\n  Once all the pods are Running, you should see output similar to the following:\n\n  ```bash\n  NAME                                       READY   STATUS    RESTARTS   AGE\n  cert-manager-7dd5854bb4-7nmpd              1/1     Running   0          2m10s\n  cert-manager-cainjector-64c949654c-2scxr   
1/1     Running   0          2m10s\n  cert-manager-webhook-6b57b9b886-7q6g2      1/1     Running   0          2m10s\n  ```\n\n2. To install `kubeflow-issuer`, run the following command:\n\n  ```bash\n  kustomize build common/cert-manager/kubeflow-issuer/base | kubectl apply -f -\n  ```\n\n  If the installation is successful, you should see the following output:\n\n  ```bash\n  clusterissuer.cert-manager.io/kubeflow-self-signing-issuer created\n  ```\n\n  Note: If the `cert-manager-webhook` deployment is not yet in the Running state, you may encounter an error similar to the one below, and the `kubeflow-issuer` will not be installed. In that case, make sure that all 3 cert-manager pods are Running, then retry the command above.\n\n  ```bash\n  Error from server: error when retrieving current configuration of:\n  Resource: \"cert-manager.io/v1alpha2, Resource=clusterissuers\", GroupVersionKind: \"cert-manager.io/v1alpha2, Kind=ClusterIssuer\"\n  Name: \"kubeflow-self-signing-issuer\", Namespace: \"\"\n  from server for: \"STDIN\": conversion webhook for cert-manager.io/v1, Kind=ClusterIssuer failed: Post \"https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s\": dial tcp 10.101.177.157:443: connect: connection refused\n  ```\n\n### Istio\n\n1. 
Install the Custom Resource Definitions (CRDs) for istio.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-crds/base | kubectl apply -f -\n  ```\n\n  If run properly, you should see the following output:\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/authorizationpolicies.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/istiooperators.install.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/peerauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/requestauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/sidecars.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadgroups.networking.istio.io created\n  ```\n\n2. Install the istio namespace.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-namespace/base | kubectl apply -f -\n  ```\n\n  If run properly, you should see the following output:\n\n  ```bash\n  namespace/istio-system created\n  ```\n\n3. 
Install istio.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-install/base | kubectl apply -f -\n  ```\n\n  If run properly, you should see the following output:\n\n  ```bash\n  serviceaccount/istio-ingressgateway-service-account created\n  serviceaccount/istio-reader-service-account created\n  serviceaccount/istiod-service-account created\n  role.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  role.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istiod-istio-system created\n  rolebinding.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  rolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  configmap/istio created\n  configmap/istio-sidecar-injector created\n  service/istio-ingressgateway created\n  service/istiod created\n  deployment.apps/istio-ingressgateway created\n  deployment.apps/istiod created\n  envoyfilter.networking.istio.io/metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/stats-filter-1.8 created\n  envoyfilter.networking.istio.io/stats-filter-1.9 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.8 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.9 created\n  envoyfilter.networking.istio.io/x-forwarded-host created\n  gateway.networking.istio.io/istio-ingressgateway created\n  authorizationpolicy.security.istio.io/global-deny-all created\n  authorizationpolicy.security.istio.io/istio-ingressgateway created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/istio-sidecar-injector 
created\n  validatingwebhookconfiguration.admissionregistration.k8s.io/istiod-istio-system created\n  ```\n\n  Wait for both pods in the istio-system namespace to become Running:\n\n  ```bash\n  kubectl get po -n istio-system\n  ```\n\n  Once all the pods are Running, you should see output similar to the following:\n\n  ```bash\n  NAME                                   READY   STATUS    RESTARTS   AGE\n  istio-ingressgateway-79b665c95-xm22l   1/1     Running   0          16s\n  istiod-86457659bb-5h58w                1/1     Running   0          16s\n  ```\n\n### Dex\n\nNow, let's install Dex.\n\n```bash\nkustomize build common/dex/overlays/istio | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n\n```bash\nnamespace/auth created\ncustomresourcedefinition.apiextensions.k8s.io/authcodes.dex.coreos.com created\nserviceaccount/dex created\nclusterrole.rbac.authorization.k8s.io/dex created\nclusterrolebinding.rbac.authorization.k8s.io/dex created\nconfigmap/dex created\nsecret/dex-oidc-client created\nservice/dex created\ndeployment.apps/dex created\nvirtualservice.networking.istio.io/dex created\n```\n\nWait until the single pod in the auth namespace is Running:\n```bash\nkubectl get po -n auth\n```\n\nOnce the pod is Running, you should see output similar to the following:\n```bash\nNAME                   READY   STATUS    RESTARTS   AGE\ndex-5ddf47d88d-458cs   1/1     Running   1          12s\n```\n\nInstall OIDC AuthService.\n```bash\nkustomize build common/oidc-authservice/base | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\nconfigmap/oidc-authservice-parameters created\nsecret/oidc-authservice-client created\nservice/authservice created\npersistentvolumeclaim/authservice-pvc created\nstatefulset.apps/authservice created\nenvoyfilter.networking.istio.io/authn-filter created\n```\n\nWait until the authservice-0 pod in the istio-system namespace is Running:\n```bash\nkubectl get po -n istio-system 
-w\n```\n\nOnce all the pods are Running, you should see output similar to the following:\n```bash\nNAME                                   READY   STATUS    RESTARTS   AGE\nauthservice-0                          1/1     Running   0          14s\nistio-ingressgateway-79b665c95-xm22l   1/1     Running   0          2m37s\nistiod-86457659bb-5h58w                1/1     Running   0          2m37s\n```\n\nCreate the Kubeflow namespace.\n```bash\nkustomize build common/kubeflow-namespace/base | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\nnamespace/kubeflow created\n```\n\nRetrieve the Kubeflow namespace.\n```bash\nkubectl get ns kubeflow\n```\n\nIf it was created successfully, you should see output similar to the following:\n```bash\nNAME       STATUS   AGE\nkubeflow   Active   8s\n```\n\nInstall kubeflow-roles.\n```bash\nkustomize build common/kubeflow-roles/base | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-view created\nclusterrole.rbac.authorization.k8s.io/kubeflow-view created\n```\n\nRetrieve the kubeflow roles just created.\n```bash\nkubectl get clusterrole | grep kubeflow\n```\n\nThe following 6 clusterroles will be output:\n```bash\nkubeflow-admin                                                         2021-12-03T08:51:36Z\nkubeflow-edit                                                          2021-12-03T08:51:36Z\nkubeflow-kubernetes-admin                                              2021-12-03T08:51:36Z\nkubeflow-kubernetes-edit                                               2021-12-03T08:51:36Z\nkubeflow-kubernetes-view                                               2021-12-03T08:51:36Z\nkubeflow-view    
                                                       2021-12-03T08:51:36Z\n```\n\nInstall Kubeflow Istio Resources.\n```bash\nkustomize build common/istio-1-9/kubeflow-istio-resources/base | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-view created\ngateway.networking.istio.io/kubeflow-gateway created\n```\n\nRetrieve the Kubeflow roles just created.\n```bash\nkubectl get clusterrole | grep kubeflow-istio\n```\nThe following three clusterroles are output:\n```bash\nkubeflow-istio-admin                                                   2021-12-03T08:53:17Z\nkubeflow-istio-edit                                                    2021-12-03T08:53:17Z\nkubeflow-istio-view                                                    2021-12-03T08:53:17Z\n```\n\nCheck that the gateway has been created in the kubeflow namespace.\n```bash\nkubectl get gateway -n kubeflow\n```\n\nIf it was created successfully, you should see output similar to the following:\n```bash\nNAME               AGE\nkubeflow-gateway   31s\n```\n\nInstall Kubeflow Pipelines.\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\nIf the installation is successful, you should see the following output:\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/clusterworkflowtemplates.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/cronworkflows.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/workfloweventbindings.argoproj.io created\n...(omitted)\nauthorizationpolicy.security.istio.io/ml-pipeline-visualizationserver created\nauthorizationpolicy.security.istio.io/mysql created\nauthorizationpolicy.security.istio.io/service-cache-server created\n```\n\nThis command installs many resources at once, and some of them depend on others being created first. Depending on timing, you may therefore see an error similar to the following:\n```bash\nerror: unable to recognize \"STDIN\": no matches for kind \"CompositeController\" in version \"metacontroller.k8s.io/v1alpha1\"\n```\n\nIf a similar error occurs, wait about 10 seconds and then run the command above again.\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\n\nCheck that it has been installed correctly.\n```bash\nkubectl get po -n kubeflow\n```\n\nWait until all 16 pods are Running, as follows:\n```bash\nNAME                                                     READY   STATUS    RESTARTS   AGE\ncache-deployer-deployment-79fdf9c5c9-bjnbg               2/2     Running   1          5m3s\ncache-server-5bdf4f4457-48gbp                            2/2     Running   0          5m3s\nkubeflow-pipelines-profile-controller-7b947f4748-8d26b   1/1     Running   0          5m3s\nmetacontroller-0                                         1/1     Running   0          5m3s\nmetadata-envoy-deployment-5b4856dd5-xtlkd                1/1     Running   0          5m3s\nmetadata-grpc-deployment-6b5685488-kwvv7                 2/2     Running   3          5m3s\nmetadata-writer-548bd879bb-zjkcn                         2/2     Running   1          5m3s\nminio-5b65df66c9-k5gzg                                   2/2     Running   0          5m3s\nml-pipeline-8c4b99589-85jw6                              2/2     Running   1          5m3s\nml-pipeline-persistenceagent-d6bdc77bd-ssxrv             2/2     Running   0          5m3s\nml-pipeline-scheduledworkflow-5db54d75c5-zk2cw           2/2     Running   0          5m2s\nml-pipeline-ui-5bd8d6dc84-j7wqr                          2/2     Running   0          5m2s\nml-pipeline-viewer-crd-68fb5f4d58-mbcbg                  2/2     Running   1          5m2s\nml-pipeline-visualizationserver-8476b5c645-wljfm         2/2     Running   0          
5m2s\nmysql-f7b9b7dd4-xfnw4                                    2/2     Running   0          5m2s\nworkflow-controller-5cbbb49bd8-5zrwx                     2/2     Running   1          5m2s\n```\n\nAdditionally, check that the ml-pipeline UI is reachable.\n```bash\nkubectl port-forward svc/ml-pipeline-ui -n kubeflow 8888:80\n```\n\nOpen the web browser and connect to the path [http://localhost:8888/#/pipelines/](http://localhost:8888/#/pipelines/). Confirm that the following screen is displayed.\n\nIf you get a \"connection refused\" error on localhost (for example, because the browser runs on a different machine), you can bind the port-forward to all addresses with 0.0.0.0, provided this raises no security concerns:\n```bash\nkubectl port-forward --address 0.0.0.0 svc/ml-pipeline-ui -n kubeflow 8888:80\n```\nIf the connection is still refused despite the above option, allow access to port 8888 (or the relevant TCP ports) in your firewall settings.\n\nWhen you open the web browser and access the path `http://<your virtual instance public IP>:8888/#/pipelines/`, you can see the ml-pipeline UI screen.\n\nFor the other UIs port-forwarded below, proceed in the same way: run the port-forward command with `--address 0.0.0.0` and open the corresponding port in the firewall.\n\nInstall Katib.\n```bash\nkustomize build apps/katib/upstream/installs/katib-with-kubeflow | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/experiments.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/suggestions.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/trials.kubeflow.org created\nserviceaccount/katib-controller created\nserviceaccount/katib-ui created\nclusterrole.rbac.authorization.k8s.io/katib-controller 
created\nclusterrole.rbac.authorization.k8s.io/katib-ui created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-view created\nclusterrolebinding.rbac.authorization.k8s.io/katib-controller created\nclusterrolebinding.rbac.authorization.k8s.io/katib-ui created\nconfigmap/katib-config created\nconfigmap/trial-templates created\nsecret/katib-mysql-secrets created\nservice/katib-controller created\nservice/katib-db-manager created\nservice/katib-mysql created\nservice/katib-ui created\npersistentvolumeclaim/katib-mysql created\ndeployment.apps/katib-controller created\ndeployment.apps/katib-db-manager created\ndeployment.apps/katib-mysql created\ndeployment.apps/katib-ui created\ncertificate.cert-manager.io/katib-webhook-cert created\nissuer.cert-manager.io/katib-selfsigned-issuer created\nvirtualservice.networking.istio.io/katib-ui created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\nvalidatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\n```\n\nCheck that it has been installed correctly.\n```bash\nkubectl get po -n kubeflow | grep katib\n```\nWait until the four Katib pods are Running:\n```bash\nkatib-controller-68c47fbf8b-b985z                        1/1     Running   0          82s\nkatib-db-manager-6c948b6b76-2d9gr                        1/1     Running   0          82s\nkatib-mysql-7894994f88-scs62                             1/1     Running   0          82s\nkatib-ui-64bb96d5bf-d89kp                                1/1     Running   0          82s\n```\n\nAdditionally, confirm that the Katib UI is reachable.\n```bash\nkubectl port-forward svc/katib-ui -n kubeflow 8081:80\n```\n\nOpen the web browser and access the path [http://localhost:8081/katib/](http://localhost:8081/katib/) to confirm the following screen is 
displayed.\n\nInstall the Central Dashboard.\n\n```bash\nkustomize build apps/centraldashboard/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\nserviceaccount/centraldashboard created\nrole.rbac.authorization.k8s.io/centraldashboard created\nclusterrole.rbac.authorization.k8s.io/centraldashboard created\nrolebinding.rbac.authorization.k8s.io/centraldashboard created\nclusterrolebinding.rbac.authorization.k8s.io/centraldashboard created\nconfigmap/centraldashboard-config created\nconfigmap/centraldashboard-parameters created\nservice/centraldashboard created\ndeployment.apps/centraldashboard created\nvirtualservice.networking.istio.io/centraldashboard created\n```\n\nCheck that it has been installed correctly.\n```bash\nkubectl get po -n kubeflow | grep centraldashboard\n```\n\nWait until the one centraldashboard-related pod in the kubeflow namespace becomes Running:\n```bash\ncentraldashboard-8fc7d8cc-xl7ts                          1/1     Running   0          52s\n```\n\nAdditionally, check that the Central Dashboard UI is reachable.\n```bash\nkubectl port-forward svc/centraldashboard -n kubeflow 8082:80\n```\nOpen the web browser and connect to the path [http://localhost:8082/](http://localhost:8082/) to check that the following screen is displayed.\n\nInstall the Admission Webhook.\n\n```bash\nkustomize build apps/admission-webhook/upstream/overlays/cert-manager | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/poddefaults.kubeflow.org created\nserviceaccount/admission-webhook-service-account created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-cluster-role created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-admin created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-edit created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-view 
created\nclusterrolebinding.rbac.authorization.k8s.io/admission-webhook-cluster-role-binding created\nservice/admission-webhook-service created\ndeployment.apps/admission-webhook-deployment created\ncertificate.cert-manager.io/admission-webhook-cert created\nissuer.cert-manager.io/admission-webhook-selfsigned-issuer created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/admission-webhook-mutating-webhook-configuration created\n```\n\nCheck that it has been installed correctly.\n```bash\nkubectl get po -n kubeflow | grep admission-webhook\n```\n\nWait until one pod is Running.\n```bash\nadmission-webhook-deployment-667bd68d94-2hhrx            1/1     Running   0          11s\n```\n\nInstall the Notebook Controller.\n\n```bash\nkustomize build apps/jupyter/notebook-controller/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf the installation is successful, you should see output similar to the following:\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/notebooks.kubeflow.org created\nserviceaccount/notebook-controller-service-account created\nrole.rbac.authorization.k8s.io/notebook-controller-leader-election-role created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-admin created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-edit created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-view created\nclusterrole.rbac.authorization.k8s.io/notebook-controller-role created\nrolebinding.rbac.authorization.k8s.io/notebook-controller-leader-election-rolebinding created\nclusterrolebinding.rbac.authorization.k8s.io/notebook-controller-role-binding created\nconfigmap/notebook-controller-config-m... created\ndeployment.apps/notebook-controller created\n```\n\nCheck if the installation was successful. 
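The repeated wait-until-Running checks in this guide can also be scripted. The sketch below is illustrative rather than part of the official manifests (the helper only parses `kubectl get pod` output, and the function names are our own):

```bash
# Count pods whose STATUS column is not Running (the header line is skipped).
# Note: this is a rough check; it does not inspect the READY column.
pods_not_running() {
  tail -n +2 | grep -cv ' Running ' || true
}

# Poll a namespace every 5 seconds until every pod reports Running.
wait_for_pods() {
  until [ $(kubectl get pod -n $1 | pods_not_running) -eq 0 ]; do
    sleep 5
  done
}

# Example (assumes a configured cluster): wait_for_pods kubeflow
```

Once `wait_for_pods kubeflow` returns, every pod in that namespace reports Running.
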
Wait until one pod is Running:\n```bash\nkubectl get po -n kubeflow | grep notebook-controller\n```\n\nInstall the Jupyter Web App.\n\n```bash\nkustomize build apps/jupyter/jupyter-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf the installation is successful, you should see output similar to the following (some lines omitted):\n\n```bash\nconfigmap/jupyter-web-app-config-76844k4cd7 created\nconfigmap/jupyter-web-app-logos created\nconfigmap/jupyter-web-app-parameters-chmg88cm48 created\nservice/jupyter-web-app-service created\ndeployment.apps/jupyter-web-app-deployment created\nvirtualservice.networking.istio.io/jupyter-web-app-jupyter-web-app created\n```\n\nCheck that it has been installed correctly:\n```bash\nkubectl get po -n kubeflow | grep jupyter-web-app\n```\n\nWait until one pod is Running.\n\nInstall the Profile Controller.\n```bash\nkustomize build apps/profiles/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/profiles.kubeflow.org created\nserviceaccount/profiles-controller-service-account created\nrole.rbac.authorization.k8s.io/profiles-leader-election-role created\nrolebinding.rbac.authorization.k8s.io/profiles-leader-election-rolebinding created\nclusterrolebinding.rbac.authorization.k8s.io/profiles-cluster-role-binding created\nconfigmap/namespace-labels-data-48h7kd55mc created\nconfigmap/profiles-config-46c7tgh6fd created\nservice/profiles-kfam created\ndeployment.apps/profiles-deployment created\nvirtualservice.networking.istio.io/profiles-kfam created\n```\n\nCheck that it has been installed correctly.\n```bash\nkubectl get po -n kubeflow | grep profiles-deployment\n```\n\nWait until one pod is Running.\n```bash\nprofiles-deployment-89f7d88b-qsnrd                       2/2     Running   0          42s\n```\n\nInstall the Volumes Web App.\n```bash\nkustomize build apps/volumes-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n```bash\nserviceaccount/volumes-web-app-service-account 
created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-cluster-role created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-admin created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-edit created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-view created\nclusterrolebinding.rbac.authorization.k8s.io/volumes-web-app-cluster-role-binding created\nconfigmap/volumes-web-app-parameters-4gg8cm2gmk created\nservice/volumes-web-app-service created\ndeployment.apps/volumes-web-app-deployment created\nvirtualservice.networking.istio.io/volumes-web-app-volumes-web-app created\n```\n\nCheck that it has been installed correctly.\n```bash\nkubectl get po -n kubeflow | grep volumes-web-app\n```\n\nWait until one pod is Running.\n```bash\nvolumes-web-app-deployment-8589d664cc-62svl              1/1     Running   0          27s\n```\n\nInstall the Tensorboard Web App.\n\n```bash\nkustomize build apps/tensorboard/tensorboards-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\nIf the installation is successful, you should see output similar to the following:\n\n```bash\nserviceaccount/tensorboards-web-app-service-account created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-admin created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-edit created\nclusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-view created\nclusterrolebinding.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role-binding created\nconfigmap/tensorboards-web-app-parameters-g28fbd6cch created\nservice/tensorboards-web-app-service created\ndeployment.apps/tensorboards-web-app-deployment created\nvirtualservice.networking.istio.io/t...\n```\n\nCheck that it has been installed correctly, and wait until the tensorboards-web-app pod (for example, tensorboards-web-app-deployment-6ff79b7f44-qbzmw) is Running.\n\nInstall the Tensorboard Controller.\n\n```bash\nkustomize build apps/tensorboard/tensorboard-controller/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf the installation is successful, a custom resource definition for tensorboards.tensorboard.kubeflow.org is created, along with a 
service account, roles, role bindings, a config map, and a deployment for the controller manager metrics service.\n\nCheck that deployment.apps/tensorboard-controller-controller-manager was installed correctly, and wait for its one pod to be Running.\n\nInstall the Training Operator.\n```bash\nkustomize build apps/training-operator/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/mxjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/pytorchjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/tfjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/xgboostjobs.kubeflow.org created\nserviceaccount/training-operator created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-view created\nclusterrole.rbac.authorization.k8s.io/training-operator created\nclusterrolebinding.rbac.authorization.k8s.io/training-operator created\nservice/training-operator created\ndeployment.apps/training-operator created\n```\n\nCheck that it has been installed correctly.\n\n```bash\nkubectl get po -n kubeflow | grep training-operator\n```\n\nWait until one pod is Running.\n\n```bash\ntraining-operator-7d98f9dd88-6887f                          1/1     Running   0          28s\n```\n\n### User Namespace\n\nTo use Kubeflow, create a Kubeflow Profile for the user.\n\n```bash\nkustomize build common/user-namespace/base | kubectl apply -f -\n```\n\nIf the installation is successful, you should see the following output:\n\n```bash\nconfigmap/default-install-config-9h2h2b6hbk created\nprofile.kubeflow.org/kubeflow-user-example-com created\n```\n\nConfirm that the kubeflow-user-example-com profile has been created.\n\n```bash\nkubectl get 
profile\n```\n\nIf it was created successfully, you should see output similar to the following:\n\n```bash\nkubeflow-user-example-com   37s\n```\n\n## Check installation\n\nTo confirm the installation, port-forward the istio-ingressgateway service and open the Kubeflow central dashboard in a web browser.\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nOpen a web browser and connect to [http://localhost:8080](http://localhost:8080) to confirm that the following screen is displayed.\n![login-ui](./img/login-after-install.png)\n\nLog in with the following credentials.\n\n- Email Address: `user@example.com`\n- Password: `12341234`\n\n![central-dashboard](./img/after-login.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-components/install-components-mlflow.md",
    "content": "---\ntitle : \"2. MLflow Tracking Server\"\ndescription: \"구성요소 설치 - MLflow\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Install MLflow Tracking Server\n\nMLflow is a popular open-source ML experiment management tool. In addition to [experiment management](https://mlflow.org/docs/latest/tracking.html#tracking), MLflow provides functionalities for ML [model packaging](https://mlflow.org/docs/latest/projects.html#projects), [deployment management](https://mlflow.org/docs/latest/models.html#models), and [model storage](https://mlflow.org/docs/latest/model-registry.html#registry).\n\nIn *MLOps for ALL*, we will be using MLflow for experiment management purposes.   \no store the data managed by MLflow and provide a user interface, we will deploy the MLflow Tracking Server on the Kubernetes cluster.\n\n## Before Install MLflow Tracking Server\n\n### Install PostgreSQL DB\n\nMLflow Tracking Server deploys a PostgreSQL DB for use as a Backend Store to a Kubernetes cluster.\n\nFirst, create a namespace called `mlflow-system`.\n\n```bash\nkubectl create ns mlflow-system\n```\n\nIf the following message is output, it means that it has been generated normally.\n\n```bash\nnamespace/mlflow-system created\n```\n\nCreate a Postgresql DB in the `mlflow-system` namespace.\n\n```bash\nkubectl -n mlflow-system apply -f https://raw.githubusercontent.com/mlops-for-all/helm-charts/b94b5fe4133f769c04b25068b98ccfa7a505aa60/mlflow/manifests/postgres.yaml \n```\n\nIf performed normally, it will be outputted as follows.\n\n```bash\nservice/postgresql-mlflow-service created\ndeployment.apps/postgresql-mlflow created\npersistentvolumeclaim/postgresql-mlflow-pvc created\n```\n\nWait until one postgresql related pod is running in the mlflow-system namespace.\n\n```bash\nkubectl get pod -n mlflow-system | grep postgresql\n```\n\nIf it is output similar to the following, it has executed 
normally.\n\n```bash\npostgresql-mlflow-7b9bc8c79f-srkh7   1/1     Running   0          38s\n```\n\n### Setup Minio\n\nWe will utilize the Minio that was installed in the previous Kubeflow installation step. \nHowever, to keep Kubeflow and MLflow data separate, we will create an MLflow-specific bucket.  \nFirst, port-forward the minio-service to access Minio and create the bucket.\n\n```bash\nkubectl port-forward svc/minio-service -n kubeflow 9000:9000\n```\n\nOpen a web browser and connect to [localhost:9000](http://localhost:9000), and the following screen will be displayed.\n\n![minio-install](./img/minio-install.png)\n\n\nEnter the following credentials to log in: \n\n- Username: `minio`\n- Password: `minio123`\n\nClick the **`+`** button at the bottom right, then click `Create Bucket`. \n\n![create-bucket](./img/create-bucket.png)\n\n\nEnter `mlflow` in `Bucket Name` to create the bucket.\n\nIf successfully created, you will see a bucket named `mlflow` on the left.\n\n![mlflow-bucket](./img/mlflow-bucket.png)\n\n\n---\n\n## Let's Install MLflow Tracking Server\n\n### Add Helm Repository\n\n```bash\nhelm repo add mlops-for-all https://mlops-for-all.github.io/helm-charts\n```\n\nIf the following message is displayed, it means it has been added successfully.\n\n```bash\n\"mlops-for-all\" has been added to your repositories\n```\n\n### Update Helm Repository\n\n```bash\nhelm repo update\n```\n\nIf the following message is displayed, it means that the update has been successfully completed.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"mlops-for-all\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nInstall version 0.2.0 of the mlflow-server Helm Chart.\n\n```bash\nhelm install mlflow-server mlops-for-all/mlflow-server \\\n  --namespace mlflow-system \\\n  --version 0.2.0\n```\n\n- The above Helm chart installs MLflow with its artifact store pointed at the default Minio created during the Kubeflow installation process and its backend store pointed at the PostgreSQL DB created in [Install PostgreSQL DB](#install-postgresql-db) above.\n  - If you want to use a separate DB or object storage, please refer to the [Helm Chart Repo](https://github.com/mlops-for-all/helm-charts/tree/main/mlflow/chart) and set the values accordingly during `helm install`.\n\nThe following message should be displayed:\n\n```bash\nNAME: mlflow-server\nLAST DEPLOYED: Sat Dec 18 22:02:13 2021\nNAMESPACE: mlflow-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\nCheck to see if it was installed successfully.\n\n```bash\nkubectl get pod -n mlflow-system | grep mlflow-server\n```\n\nWait until the mlflow-server-related pod is Running in the mlflow-system namespace.  \nIf the output is similar to the following, it has started successfully.\n\n```bash\nmlflow-server-ffd66d858-6hm62        1/1     Running   0          74s\n```\n\n### Check installation\n\nLet's now check if we can successfully connect to the MLflow Server.\n\nFirst, we will perform port forwarding in order to connect from the client node.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\nOpen a web browser and connect to [localhost:5000](http://localhost:5000), and the following screen will be displayed.\n\n![mlflow-install](./img/mlflow-install.png)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-components/install-components-pg.md",
    "content": "---\ntitle : \"4. Prometheus & Grafana\"\ndescription: \"구성요소 설치 - Prometheus & Grafana\"\nsidebar_position: 4\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Prometheus & Grafana\n\nPrometheus and Grafana are tools for monitoring.  \nFor stable service operation, it is necessary to continuously observe the status of the service and infrastructure where the service is operating, and to respond quickly based on the observed metrics when a problem arises.  \nAmong the many tools to efficiently perform such monitoring, *Everyone's MLOps* will use open source Prometheus and Grafana.\n\nFor more information, please refer to the [Prometheus Official Documentation](https://prometheus.io/docs/introduction/overview/) and [Grafana Official Documentation](https://grafana.com/docs/).\n\nPrometheus is a tool to collect metrics from various targets, and Grafana is a tool to help visualize the gathered data. Although there is no dependency between them, they are often used together complementary to each other.\n\nIn this page, we will install Prometheus and Grafana on a Kubernetes cluster, then send API requests to a SeldonDeployment created with Seldon-Core and check if metrics are collected successfully.\n\nWe also install a dashboard to efficiently monitor the metrics of the SeldonDeployment created in Seldon-Core using Helm Chart version 1.12.0 from seldonio/seldon-core-analytics Helm Repository.\n\n### Add Helm Repository\n\n```bash\nhelm repo add seldonio https://storage.googleapis.com/seldon-charts\n```\n\nIf the following message is output, it means that it has been added successfully.\n\n```bash\n\"seldonio\" has been added to your repositories\n```\n\n### Update Helm Repository\n\n```bash\nhelm repo update\n```\n\nIf the following message is displayed, it means that the update was successful.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the 
\"seldonio\" chart repository\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. ⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nInstall version 1.12.0 of the seldon-core-analytics Helm Chart.\n\n```bash\nhelm install seldon-core-analytics seldonio/seldon-core-analytics \\\n  --namespace seldon-system \\\n  --version 1.12.0\n```\n\nThe following message should be output.\n\n```bash\nSkip...\nNAME: seldon-core-analytics\nLAST DEPLOYED: Tue Dec 14 18:29:38 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\n```\n\nCheck to see if it was installed normally.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-core-analytics\n```\n\n\nWait until 6 seldon-core-analytics related pods are Running in the seldon-system namespace.\n```bash\nseldon-core-analytics-grafana-657c956c88-ng8wn                  2/2     Running   0          114s\nseldon-core-analytics-kube-state-metrics-94bb6cb9-svs82         1/1     Running   0          114s\nseldon-core-analytics-prometheus-alertmanager-64cf7b8f5-nxbl8   2/2     Running   0          114s\nseldon-core-analytics-prometheus-node-exporter-5rrj5            1/1     Running   0          114s\nseldon-core-analytics-prometheus-pushgateway-8476474cff-sr4n6   1/1     Running   0          114s\nseldon-core-analytics-prometheus-seldon-685c664894-7cr45        2/2     Running   0          114s\n```\n\n### Check installation\n\nLet's now check if we can connect to Grafana normally. 
First, we will port forward so that we can connect from the client node.\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\nOpen a web browser and connect to [localhost:8090](http://localhost:8090), then the following screen will be displayed.\n\n![grafana-install](./img/grafana-install.png)\n\nEnter the following credentials to log in:\n\n- Email or username: `admin`\n- Password: `password`\n\nWhen you log in, the following screen will be displayed.\n\n![grafana-login](./img/grafana-login.png)\n\nClick the dashboard icon on the left and click the `Manage` button.\n\n![dashboard-click](./img/dashboard-click.png)\n\nYou can see that the default Grafana dashboards are included. Among them, click the `Prediction Analytics` dashboard.\n\n![dashboard](./img/dashboard.png)\n\nThe Seldon Core API Dashboard will be displayed as follows.\n\n![seldon-dashboard](./img/seldon-dashboard.png)\n\n## References\n\n- [Seldon-Core-Analytics Helm Chart](https://github.com/SeldonIO/seldon-core/tree/master/helm-charts/seldon-core-analytics)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-components/install-components-seldon.md",
    "content": "---\ntitle : \"3. Seldon-Core\"\ndescription: \"구성요소 설치 - Seldon-Core\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Seldon-Core\n\nSeldon-Core is one of the open source frameworks that can deploy and manage numerous machine learning models in Kubernetes environments.  \nFor more details, please refer to the official [product description page](https://www.seldon.io/tech/products/core/) and [GitHub](https://github.com/SeldonIO/seldon-core) of Seldon-Core and API Deployment part.\n\n## Installing Seldon-Core\n\nIn order to use Seldon-Core, modules such as Ambassador, which is responsible for Ingress of Kubernetes, and Istio are required [here](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).  \nSeldon-Core officially supports only Ambassador and Istio, and *MLOps for everyone* will use Ambassador to use Seldon-core, so we will install Ambassador.\n\n### Adding Ambassador to the Helm Repository\n\n```bash\nhelm repo add datawire https://www.getambassador.io\n```\n\nIf the following message is displayed, it means it has been added normally.\n\n```bash\n\"datawire\" has been added to your repositories\n```\n\n### Update Ambassador - Helm Repository\n\n```bash\nhelm repo update\n```\n\nIf the following message is output, it means that the update has been completed normally.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Ambassador - Helm Install\n\nInstall version 6.9.3 of the Ambassador Chart.\n\n```bash\nhelm install ambassador datawire/ambassador \\\n  --namespace seldon-system \\\n  --create-namespace \\\n  --set image.repository=quay.io/datawire/ambassador \\\n  --set enableAES=false \\\n  --set crds.keep=false \\\n  --version 6.9.3\n```\n\nThe following message should be displayed.\n\n```bash\nSkip...\n\nW1206 17:01:36.026326   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role\nW1206 17:01:36.029764   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding\nNAME: ambassador\nLAST DEPLOYED: Mon Dec  6 17:01:34 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\nNOTES:\n-------------------------------------------------------------------------------\n  Congratulations! 
You've successfully installed Ambassador!\n\n-------------------------------------------------------------------------------\nTo get the IP address of Ambassador, run the following commands:\nNOTE: It may take a few minutes for the LoadBalancer IP to be available.\n     You can watch the status of by running 'kubectl get svc -w  --namespace seldon-system ambassador'\n\n  On GKE/Azure:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')\n\n  On AWS:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')\n\n  echo http://$SERVICE_IP:\n\nFor help, visit our Slack at http://a8r.io/Slack or view the documentation online at https://www.getambassador.io.\n```\n\nWait until the four pods are Running in the seldon-system namespace.\n\n```bash\nkubectl get pod -n seldon-system\n```\n\n```bash\nambassador-7f596c8b57-4s9xh                  1/1     Running   0          7m15s\nambassador-7f596c8b57-dt6lr                  1/1     Running   0          7m15s\nambassador-7f596c8b57-h5l6f                  1/1     Running   0          7m15s\nambassador-agent-77bccdfcd5-d5jxj            1/1     Running   0          7m15s\n```\n\n### Seldon-Core - Helm Install\n\nInstall version 1.11.2 of the seldon-core-operator Chart.\n\n```bash\nhelm install seldon-core seldon-core-operator \\\n    --repo https://storage.googleapis.com/seldon-charts \\\n    --namespace seldon-system \\\n    --set usageMetrics.enabled=true \\\n    --set ambassador.enabled=true \\\n    --version 1.11.2\n```\n\nThe following message should be displayed.\n\n```bash\nSkip...\n\nW1206 17:05:38.336391   28181 warnings.go:70] admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration\nNAME: seldon-core\nLAST DEPLOYED: Mon Dec  6 17:05:34 2021\nNAMESPACE: 
seldon-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\nWait until one seldon-controller-manager pod is Running in the seldon-system namespace.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-controller\n```\n\n```bash\nseldon-controller-manager-8457b8b5c7-r2frm   1/1     Running   0          2m22s\n```\n\n## References\n\n- [Example Model Servers with Seldon](https://docs.seldon.io/projects/seldon-core/en/latest/examples/server_examples.html#examples-server-examples--page-root)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"Setup Kubernetes\",\n  \"position\": 2,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/install-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"4. Install Kubernetes\",\n  \"position\": 4,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/install-kubernetes/kubernetes-with-k3s.md",
    "content": "---\ntitle: \"4.1. K3s\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ndraft: false\nweight: 221\ncontributors: [\"Jongseob Jeon\"]\nmenu:\n  docs:\n    parent:../setup-kubernetes\"\nimages: []\n---\n\n## 1. Prerequisite\n\nBefore setting up a Kubernetes cluster, install the necessary components on the **cluster**.\n\nPlease refer to [Install Prerequisite](../../setup-kubernetes/install-prerequisite.md) to install the necessary components on the **cluster** before installing Kubernetes.\n\nk3s uses containerd as the backend by default.\nHowever, we need to use docker as the backend to use GPU, so we will install the backend with the `--docker` option.\n\n```bash\ncurl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.21.7+k3s1 sh -s - server --disable traefik --disable servicelb --disable local-storage --docker\n```\n\nAfter installing k3s, check the k3s config.\n\n```bash\nsudo cat /etc/rancher/k3s/k3s.yaml\n```\n\nIf installed correctly, the following items will be output. (Security related keys are hidden with <...>.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://127.0.0.1:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 2. Setup Kubernetes Cluster\n\nSet up the Kubernetes cluster by copying the k3s config to be used as the cluster’s kubeconfig.\n\n```bash\nmkdir .kube\nsudo cp /etc/rancher/k3s/k3s.yaml .kube/config\n```\n\nGrant user access permission to the copied config file.\n\n```bash\nsudo chown $USER:$USER .kube/config\n```\n\n## 3. 
Setup Kubernetes Client\n\nNow copy the kubeconfig created on the cluster to the local client machine.\nSave it to `~/.kube/config` on the client.\n\nThe copied config file has the server IP set to `https://127.0.0.1:6443`. \nModify this value to match the IP of the cluster. \n(We changed it to `https://192.168.0.19:6443` to match the IP of the cluster used in this page.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://192.168.0.19:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 4. Install Kubernetes Default Modules\n\nPlease refer to [Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md) to install the following components:\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. Verify Successful Installation\n\nFinally, check if the nodes are Ready and verify the OS, Docker, and Kubernetes versions.\n\n```bash\nkubectl get nodes -o wide\n```\n\nIf you see the following message, it means that the installation was successful.\n\n```bash\nNAME    STATUS   ROLES                  AGE   VERSION        INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   11m   v1.21.7+k3s1   192.168.0.19   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n\n## 6. References\n\n- [https://rancher.com/docs/k3s/latest/en/installation/install-options/](https://rancher.com/docs/k3s/latest/en/installation/install-options/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/install-kubernetes/kubernetes-with-kubeadm.md",
    "content": "---\ntitle: \"4.3. Kubeadm\"\ndescription: \"\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## 1. Prerequisite\n\nBefore building a Kubernetes cluster, install the necessary components to the **cluster**.\n\nPlease refer to [Install Prerequisite](../../setup-kubernetes/install-prerequisite.md) and install the necessary components to the **cluster**.\n\nChange the configuration of the network for Kubernetes.\n\n```bash\nsudo modprobe br_netfilter\n\ncat <<EOF | sudo tee /etc/modules-load.d/k8s.conf\nbr_netfilter\nEOF\n\ncat <<EOF | sudo tee /etc/sysctl.d/k8s.conf\nnet.bridge.bridge-nf-call-ip6tables = 1\nnet.bridge.bridge-nf-call-iptables = 1\nEOF\nsudo sysctl --system\n```\n\n## 2. Setup Kubernetes Cluster\n\n- kubeadm : Automates the installation process by registering kubelet as a service and issuing certificates for communication between cluster components.\n- kubelet : Container handler responsible for starting and stopping container resources.\n- kubectl : CLI tool used to interact with and manage Kubernetes clusters from the terminal.\n\nInstall kubeadm, kubelet, and kubectl using the following commands. 
We pin the versions of these components with `apt-mark hold`, since accidental version changes can lead to unexpected issues.\n\n```bash\nsudo apt-get update\nsudo apt-get install -y apt-transport-https ca-certificates curl &&\nsudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg &&\necho \"deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main\" | sudo tee /etc/apt/sources.list.d/kubernetes.list &&\nsudo apt-get update\nsudo apt-get install -y kubelet=1.21.7-00 kubeadm=1.21.7-00 kubectl=1.21.7-00 &&\nsudo apt-mark hold kubelet kubeadm kubectl\n```\n\nCheck if kubeadm, kubelet, and kubectl are installed correctly.\n\n```bash\nmlops@ubuntu:~$ kubeadm version\nkubeadm version: &version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:40:08Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\n```bash\nmlops@ubuntu:~$ kubelet --version\nKubernetes v1.21.7\n```\n\n```bash\nmlops@ubuntu:~$ kubectl version --client\nClient Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\nNow we will use kubeadm to install Kubernetes.\n\n```bash\nkubeadm config images list\nkubeadm config images pull\n\nsudo kubeadm init --pod-network-cidr=10.244.0.0/16\n```\n\nCopy the admin kubeconfig to `$HOME/.kube/config` so that you can control the Kubernetes cluster through kubectl.\n\n```bash\nmkdir -p $HOME/.kube\nsudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config\nsudo chown $(id -u):$(id -g) $HOME/.kube/config\n```\n\nInstall CNI. 
The CNI is responsible for setting up the network inside Kubernetes. There are various kinds of CNI; *MLOps for All* uses flannel.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.13.0/Documentation/kube-flannel.yml\n```\n\nThere are two types of Kubernetes nodes: `Master Node` and `Worker Node`. For stability, it is generally recommended that only tasks to control the Kubernetes cluster run on the `Master Node`; however, this guide assumes a single-node cluster, so we remove the master taint to allow all types of tasks to run on the `Master Node`.\n\n```bash\nkubectl taint nodes --all node-role.kubernetes.io/master-\n```\n\n## 3. Setup Kubernetes Client\n\nCopy the kubeconfig file created on the cluster to the **client** to control the cluster through kubectl.\n\n```bash\nmkdir -p $HOME/.kube\nscp -p {CLUSTER_USER_ID}@{CLUSTER_IP}:~/.kube/config ~/.kube/config\n```\n\n## 4. Install Kubernetes Default Modules\n\nPlease refer to [Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md) to install the following components:\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. Verify Successful Installation\n\nFinally, check if the nodes are Ready and verify the OS, Docker, and Kubernetes versions.\n\n```bash\nkubectl get nodes\n```\n\nWhen the node is in the \"Ready\" state, the output will be similar to the following:\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION\nubuntu   Ready    control-plane,master   2m55s   v1.21.7\n```\n\n## 6. References\n\n- [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/install-kubernetes/kubernetes-with-minikube.md",
    "content": "---\ntitle: \"4.2. Minikube\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## 1. Prerequisite\n\nBefore setting up a Kubernetes cluster, install the necessary components on the **cluster**.\n\nPlease refer to [Install Prerequisite](../../setup-kubernetes/install-prerequisite.md) to install the necessary components on the **cluster** before installing Kubernetes.\n\n### Minikube binary\n\nInstall the v1.24.0 version of the Minikube binary to use Minikube.\n\n```bash\nwget https://github.com/kubernetes/minikube/releases/download/v1.24.0/minikube-linux-amd64\nsudo install minikube-linux-amd64 /usr/local/bin/minikube\n```\n\nCheck if it is installed properly.\n\n```bash\nminikube version\n```\n\nIf this message appears, it means the installation was successful.\n\n```bash\nmlops@ubuntu:~$ minikube version\nminikube version: v1.24.0\ncommit: 76b94fb3c4e8ac5062daf70d60cf03ddcc0a741b\n```\n\n## 2. Setup Kubernetes Cluster\n\nNow let's build the Kubernetes cluster using Minikube.\nTo facilitate the smooth use of GPUs and communication between cluster and client, Minikube is run using the `driver=none` option. Please note that this option must be run as root user. \n\nSwitch to root user.\n\n```bash\nsudo su\n```\n\nRun `minikube start` to build the Kubernetes cluster for Kubeflow's smooth operation, specifying the Kubernetes version as v1.21.7 and adding `--extra-config`.\n\n```bash\nminikube start --driver=none \\\n  --kubernetes-version=v1.21.7 \\\n  --extra-config=apiserver.service-account-signing-key-file=/var/lib/minikube/certs/sa.key \\\n  --extra-config=apiserver.service-account-issuer=kubernetes.default.svc\n```\n\n### Disable default addons\n\nWhen installing Minikube, there are default addons that are installed. 
We will disable any addons that we do not intend to use.\n\n```bash\nminikube addons disable storage-provisioner\nminikube addons disable default-storageclass\n```\n\nConfirm that all addons are disabled.\n\n```bash\nminikube addons list\n```\n\nIf the following message appears, it means that the installation was successful.\n\n```bash\nroot@ubuntu:/home/mlops# minikube addons list\n|-----------------------------|----------|--------------|-----------------------|\n|         ADDON NAME          | PROFILE  |    STATUS    |      MAINTAINER       |\n|-----------------------------|----------|--------------|-----------------------|\n| ambassador                  | minikube | disabled     | unknown (third-party) |\n| auto-pause                  | minikube | disabled     | google                |\n| csi-hostpath-driver         | minikube | disabled     | kubernetes            |\n| dashboard                   | minikube | disabled     | kubernetes            |\n| default-storageclass        | minikube | disabled     | kubernetes            |\n| efk                         | minikube | disabled     | unknown (third-party) |\n| freshpod                    | minikube | disabled     | google                |\n| gcp-auth                    | minikube | disabled     | google                |\n| gvisor                      | minikube | disabled     | google                |\n| helm-tiller                 | minikube | disabled     | unknown (third-party) |\n| ingress                     | minikube | disabled     | unknown (third-party) |\n| ingress-dns                 | minikube | disabled     | unknown (third-party) |\n| istio                       | minikube | disabled     | unknown (third-party) |\n| istio-provisioner           | minikube | disabled     | unknown (third-party) |\n| kubevirt                    | minikube | disabled     | unknown (third-party) |\n| logviewer                   | minikube | disabled     | google                |\n| metallb                     | 
minikube | disabled     | unknown (third-party) |\n| metrics-server              | minikube | disabled     | kubernetes            |\n| nvidia-driver-installer     | minikube | disabled     | google                |\n| nvidia-gpu-device-plugin    | minikube | disabled     | unknown (third-party) |\n| olm                         | minikube | disabled     | unknown (third-party) |\n| pod-security-policy         | minikube | disabled     | unknown (third-party) |\n| portainer                   | minikube | disabled     | portainer.io          |\n| registry                    | minikube | disabled     | google                |\n| registry-aliases            | minikube | disabled     | unknown (third-party) |\n| registry-creds              | minikube | disabled     | unknown (third-party) |\n| storage-provisioner         | minikube | disabled     | kubernetes            |\n| storage-provisioner-gluster | minikube | disabled     | unknown (third-party) |\n| volumesnapshots             | minikube | disabled     | kubernetes            |\n|-----------------------------|----------|--------------|-----------------------|\n```\n\n## 3. Setup Kubernetes Client\n\nNow, let's install the necessary tools for smooth usage of Kubernetes on the **client** machine. If the **client** and **cluster** nodes are not separated, please note that you need to perform all the operations as the root user.\n\nIf the **client** and **cluster** nodes are separated, first, we need to retrieve the Kubernetes administrator credentials from the **cluster** to the **client**.\n\n1. Check the config on the **cluster**:\n\n  ```bash\n  # Cluster node\n  minikube kubectl -- config view --flatten\n  ```\n\n2. 
The following information will be displayed:\n\n  ```bash\n  apiVersion: v1\n  clusters:\n  - cluster:\n      certificate-authority-data: LS0tLS1CRUd....\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: cluster_info\n      server: https://192.168.0.62:8443\n    name: minikube\n  contexts:\n  - context:\n      cluster: minikube\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: context_info\n      namespace: default\n      user: minikube\n    name: minikube\n  current-context: minikube\n  kind: Config\n  preferences: {}\n  users:\n  - name: minikube\n    user:\n      client-certificate-data: LS0tLS1CRUdJTi....\n      client-key-data: LS0tLS1CRUdJTiBSU0....\n  ```\n\n3. Create the `.kube` folder on the **client** node:\n\n  ```bash\n  # Client node\n  mkdir -p /home/$USER/.kube\n  ```\n\n4. Paste the information obtained from Step 2 into the file and save it:\n\n  ```bash\n  vi /home/$USER/.kube/config\n  ```\n\n## 4. Install Kubernetes Default Modules\n\nPlease refer to [Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md) to install the following components:\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 
Verify Successful Installation\n\nFinally, check that the node is Ready, and check the OS, Docker, and Kubernetes versions.\n\n```bash\nkubectl get nodes -o wide\n```\n\nIf this message appears, it means that the installation has completed normally.\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   2d23h   v1.21.7   192.168.0.75   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/install-kubernetes-module.md",
    "content": "---\ntitle: \"5. Install Kubernetes Modules\"\ndescription: \"Install Helm, Kustomize\"\nsidebar_position: 5\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Modules\n\n\nOn this page, we will explain how to install the modules that will be used on the cluster from the client nodes.  \nAll the processes introduced here will be done on the **client nodes**.\n\n## Helm\n\nHelm is one of the package management tools that helps to deploy and manage resources related to Kubernetes packages at once.\n\n1. Download Helm version 3.7.1 into the current folder.\n\n- For Linux amd64\n\n  ```bash\n  wget https://get.helm.sh/helm-v3.7.1-linux-amd64.tar.gz\n  ```\n\n- Other OS refer to the [official website](https://github.com/helm/helm/releases/tag/v3.7.1) for the download path of the binary that matches the OS and CPU of your client node.\n\n2. Unzip the file to use helm and move the file to its desired location.\n\n  ```bash\n  tar -zxvf helm-v3.7.1-linux-amd64.tar.gz\n  sudo mv linux-amd64/helm /usr/local/bin/helm\n  ```\n\n3. Check to see if the installation was successful:\n  ```bash\n  helm help\n  ```\n\n  If you see the following message, it means that it has been installed normally. \n\n  ```bash\n  The Kubernetes package manager\n\n  Common actions for Helm:\n\n  - helm search:    search for charts\n  - helm pull:      download a chart to your local directory to view\n  - helm install:   upload the chart to Kubernetes\n  - helm list:      list releases of charts\n\n  Environment variables:\n\n  | Name                     | Description                                                         |\n  |--------------------------|---------------------------------------------------------------------|\n  | $HELM_CACHE_HOME         | set an alternative location for storing cached files.               |\n  | $HELM_CONFIG_HOME        | set an alternative location for storing Helm configuration.         
|\n  | $HELM_DATA_HOME          | set an alternative location for storing Helm data.                  |\n\n  ...\n  ```\n\n## Kustomize\n\nKustomize is a package management tool that helps you deploy and manage multiple Kubernetes resources at once.\n\n1. Download the kustomize v3.10.0 binary into the current folder.\n\n- For Linux amd64\n\n  ```bash\n  wget https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv3.10.0/kustomize_v3.10.0_linux_amd64.tar.gz\n  ```\n\n- For other operating systems, check [kustomize/v3.10.0](https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv3.10.0) and download the binary that matches your OS and CPU.\n\n2. Extract the archive and move the `kustomize` binary to its destination.\n\n  ```bash\n  tar -zxvf kustomize_v3.10.0_linux_amd64.tar.gz\n  sudo mv kustomize /usr/local/bin/kustomize\n  ```\n\n3. Check that it is installed correctly:\n\n  ```bash\n  kustomize help\n  ```\n\n  If you see the following message, it means that Kustomize has been installed correctly.\n\n  ```bash\n  Manages declarative configuration of Kubernetes.\n  See https://sigs.k8s.io/kustomize\n\n  Usage:\n    kustomize [command]\n\n  Available Commands:\n    build                     Print configuration per contents of kustomization.yaml\n    cfg                       Commands for reading and writing configuration.\n    completion                Generate shell completion script\n    create                    Create a new kustomization in the current directory\n    edit                      Edits a kustomization file\n    fn                        Commands for running functions against configuration.\n  ...\n  ```\n\n## CSI Plugin: Local Path Provisioner\n\n1. The CSI plugin is the module responsible for storage in Kubernetes. 
Install Local Path Provisioner, a CSI plugin that is easy to use on single-node clusters.\n\n  ```bash\n  kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.20/deploy/local-path-storage.yaml\n  ```\n\n  If you see the following messages, it means that the installation was successful:\n\n  ```bash\n  namespace/local-path-storage created\n  serviceaccount/local-path-provisioner-service-account created\n  clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created\n  clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created\n  deployment.apps/local-path-provisioner created\n  storageclass.storage.k8s.io/local-path created\n  configmap/local-path-config created\n  ```\n\n2. Also, check if the provisioner pod in the local-path-storage namespace is Running by executing the following command:\n\n  ```bash\n  kubectl -n local-path-storage get pod\n  ```\n\n  If successful, it will display the following output:\n\n  ```bash\n  NAME                                     READY     STATUS    RESTARTS   AGE\n  local-path-provisioner-d744ccf98-xfcbk   1/1       Running   0          7m\n  ```\n\n3. Execute the following command to change the default storage class:\n\n  ```bash\n  kubectl patch storageclass local-path -p '{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"}}}'\n  ```\n\n  If the command is successful, the following output will be displayed:\n\n  ```bash\n  storageclass.storage.k8s.io/local-path patched\n  ```\n\n4. Verify that the default storage class has been set:\n\n  ```bash\n  kubectl get sc\n  ```\n\n  Check if there is a storage class with the name `local-path (default)` in the NAME column:\n\n  ```bash\n  NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE\n  local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  2h\n  ```\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/install-prerequisite.md",
    "content": "---\ntitle: \"3. Install Prerequisite\"\ndescription: \"Install docker\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Sangwoo Shim\"]\n---\n\nOn this page, we describe the components that need to be installed or configured on the **Cluster** and **Client** prior to installing Kubernetes.\n\n## Install apt packages\n\nIn order to enable smooth communication between the Client and the Cluster, Port-Forwarding needs to be performed. To enable Port-Forwarding, the following packages need to be installed on the **Cluster**.\n```bash\nsudo apt-get update\nsudo apt-get install -y socat\n```\n\n## Install Docker\n\n1. Install apt packages for docker.\n\n   ```bash\n   sudo apt-get update && sudo apt-get install -y ca-certificates curl gnupg lsb-release\n   ```\n\n2. add docker official GPG key.\n\n   ```bash\n   curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg\n   ```\n\n3. When installing Docker using the apt package manager, configure it to retrieve from the stable repository:\n\n   ```bash\n   echo \\\n   \"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \\\n   $(lsb_release -cs) stable\" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null\n   ```\n\n4. Check the currently available Docker versions for installation:\n\n   ```bash\n   sudo apt-get update && apt-cache madison docker-ce\n   ```\n\n   Verify if the version `5:20.10.11~3-0~ubuntu-focal` is listed among the output:\n\n   ```bash\n   apt-cache madison docker-ce | grep 5:20.10.11~3-0~ubuntu-focal\n   ```\n\n   If the addition was successful, the following output will be displayed:\n\n   ```bash\n   docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages\n   ```\n\n5. 
Install Docker version `5:20.10.11~3-0~ubuntu-focal`:\n\n   ```bash\n   sudo apt-get install -y containerd.io docker-ce=5:20.10.11~3-0~ubuntu-focal docker-ce-cli=5:20.10.11~3-0~ubuntu-focal\n   ```\n\n6. Check that Docker is installed:\n\n   ```bash\n   sudo docker run hello-world\n   ```\n\n   If the installation was successful, it will output as follows:\n\n   ```bash\n   mlops@ubuntu:~$ sudo docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n7. Add permissions to use Docker commands without the `sudo` keyword by executing the following commands:\n\n   ```bash\n   sudo groupadd docker\n   sudo usermod -aG docker $USER\n   newgrp docker\n   ```\n\n8. 
To verify that you can now use Docker commands without `sudo`, run the `docker run` command again:\n\n   ```bash\n   docker run hello-world\n   ```\n\n   If you see the following message after executing the command, it means that the permissions have been successfully added:\n\n   ```bash\n   mlops@ubuntu:~$ docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n## Turn off Swap Memory\n\nFor kubelet to work properly, swap (virtual memory) must be turned off on the **cluster** nodes. The following commands turn off swap.  \n**(When the cluster and client run on the same desktop, turning off swap memory may slow the machine down.)**\n\n```bash\nsudo sed -i '/ swap / s/^\\(.*\\)$/#\\1/g' /etc/fstab\nsudo swapoff -a\n```\n\n## Install Kubectl\n\nkubectl is a client tool used to make API requests to a Kubernetes cluster. It needs to be installed on the client node.\n\n1. Download kubectl version v1.21.7 to the current folder:\n\n   ```bash\n   curl -LO https://dl.k8s.io/release/v1.21.7/bin/linux/amd64/kubectl\n   ```\n\n2. 
Change the file permissions and move it to the appropriate location to make kubectl executable:\n\n   ```bash\n   sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl\n   ```\n\n3. Verify that kubectl is installed correctly:\n\n   ```bash\n   kubectl version --client\n   ```\n\n   If you see the following message, it means that kubectl is installed successfully:\n\n   ```bash\n   Client Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n   ```\n\n4. If you work with multiple Kubernetes clusters and need to manage multiple kubeconfig files or kube-contexts efficiently, you can refer to the following resources:\n\n   - [Configuring Multiple kubeconfig on Your Machine](https://dev.to/aabiseverywhere/configuring-multiple-kubeconfig-on-your-machine-59eo)\n   - [kubectx - Switch between Kubernetes contexts easily](https://github.com/ahmetb/kubectx)\n\n## References\n\n- [Install Docker Engine on Ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [Install and Set Up kubectl on Linux](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/intro.md",
    "content": "---\ntitle: \"1. Introduction\"\ndescription: \"Setup Introduction\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Youngdon Tae\", \"SeungTae Kim\"]\n---\n\n## Build MLOps System\n\nThe biggest barrier when studying MLOps is the difficulty of setting up and using an MLOps system. Using public cloud platforms like AWS or GCP, or commercial tools like Weights & Biases or neptune.ai, can be costly, and starting from scratch to build the entire environment can be overwhelming and confusing.\n\nTo address these challenges and help those who haven't been able to start with MLOps, *MLOps for ALL* will guide you on how to build and use an MLOps system from scratch, requiring only a desktop with Ubuntu installed.\n\nFor those who cannot prepare a Ubuntu desktop environment, use virtual machines to set up the environment.\n\n> If you are using Windows or an Intel-based Mac for the *MLOps for ALL* practical exercises, you can prepare an Ubuntu desktop environment using virtual machine software such as VirtualBox or VMware. Please make sure to meet the recommended specifications when creating the virtual machine.\n> However, for those using an M1 Mac, as of the date of writing (February 2022), VirtualBox and VMware are not available. ([Check if macOS apps are optimized for M1 Apple Silicon Mac](https://isapplesiliconready.com/kr))\n> Therefore, if you are not using a cloud environment, you can install UTM, Virtual machines for Mac, to use virtual machines. \n> (Purchasing and downloading software from the App Store is a form of donation-based payment. 
The free version is sufficient as it only differs in automatic updates.)\n> This virtual machine software supports the *Ubuntu 20.04.3 LTS* practice operating system, enabling you to perform the exercises on an M1 Mac.\n\nHowever, since it is not possible to cover every element described in the [Components of MLOps](../introduction/component.md), *MLOps for ALL* mainly focuses on installing representative open source software and connecting the tools to each other.\n\nThe open source software installed in *MLOps for ALL* is not meant to be a standard; we recommend choosing the tools that best fit your situation.\n\n## Components\n\nThe components of the MLOps system that we will build, and their versions, have been verified in the following environment.\n\nTo facilitate smooth testing, we will explain the setup of the **Cluster** and **Client** as separate entities.\n\nThe **Cluster** refers to a single desktop with Ubuntu installed.  \nThe **Client** should ideally be a different machine, such as a laptop or another desktop, that has access to the Cluster where Kubernetes is installed. However, if you only have one machine available, you can use the same desktop for both Cluster and Client purposes.\n\n### Cluster\n\n#### 1. Software\n\nBelow is the list of software that needs to be installed on the Cluster:\n\n| Software        | Version     |\n| --------------- | ----------- |\n| Ubuntu          | 20.04.3 LTS |\n| Docker (Server) | 20.10.11    |\n| NVIDIA Driver   | 470.86      |\n| Kubernetes      | v1.21.7     |\n| Kubeflow        | v1.4.0      |\n| MLFlow          | v1.21.0     |\n\n#### 2. 
Helm Chart\n\nBelow is the list of third-party software that needs to be installed using Helm:\n\n| Helm Chart Repo Name          | Version |\n| ----------------------------- | ------- |\n| datawire/ambassador           | 6.9.3   |\n| seldonio/seldon-core-operator | 1.11.2  |\n\n### Client\n\nThe Client has been validated on macOS (Intel CPU) and Ubuntu 20.04.\n\n| Software        | Version   |\n| --------------- | --------- |\n| kubectl         | v1.21.7   |\n| helm            | v3.7.1    |\n| kustomize       | v3.10.0   |\n\n### Minimum System Requirements\n\nIt is recommended that the Cluster meet the following specifications, which are based on the recommended specifications for Kubernetes and Kubeflow:\n\n- CPU: 6 cores\n- RAM: 12GB\n- DISK: 50GB\n- GPU: NVIDIA GPU (optional)\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/kubernetes.md",
    "content": "---\ntitle : \"2. Setup Kubernetes\"\ndescription: \"Setup Kubernetes\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Cluster\n\nFor those learning Kubernetes for the first time, the first barrier to entry is setting up a Kubernetes practice environment.\n\nThe official tool that supports building a production-level Kubernetes cluster is kubeadm, but there are also tools such as kubespray and kops that help users set up more easily, and tools such as k3s, minikube, microk8s, and kind that help you set up a compact Kubernetes cluster easily for learning purposes.\n\nEach tool has its own advantages and disadvantages, so considering the preferences of each user, this article will use three tools: kubeadm, k3s, and minikube to set up a Kubernetes cluster.\nFor detailed comparisons of each tool, please refer to the official Kubernetes [documentation](https://kubernetes.io/ko/docs/tasks/tools/).\n\n*MLOps for ALL* recommends **k3s** as a tool that is easy to use when setting up a Kubernetes cluster.\n\nIf you want to use all the features of Kubernetes and configure the nodes, we recommend **kubeadm**.  \n**minikube** has the advantage of being able to easily install other Kubernetes in an add-on format, in addition to the components we describe.\n\nIn this *MLOps for ALL*, in order to use the components that will be built for MLOps smoothly, there are additional settings that must be configured when building the Kubernetes cluster using each of the tools.\n\nThe scope of this **Setup Kubernetes** section is to build a k8s cluster on a desktop that already has Ubuntu OS installed and to confirm that external client nodes can access the Kubernetes cluster.\n\nThe detailed setup procedure is composed of the following flow, as each of the three tools has its own setup procedure.\n```bash\n3. Setup Prerequisite\n4. Setup Kubernetes\n  4.1. with k3s\n  4.2. with minikube\n  4.3. 
with kubeadm\n5. Setup Kubernetes Modules\n```\n\nLet's now build a Kubernetes cluster using these tools. You don't have to use all of them; pick whichever tool you are most familiar with.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0/setup-kubernetes/setup-nvidia-gpu.md",
    "content": "---\ntitle: \"6. (Optional) Setup GPU\"\ndescription: \"Install nvidia docker, nvidia device plugin\"\nsidebar_position: 6\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nFor using GPU in Kubernetes and Kubeflow, the following tasks are required.\n\n## 1. Install NVIDIA Driver\n\nIf the following screen is output when executing `nvidia-smi`, please omit this step.\n\n  ```bash\n  mlops@ubuntu:~$ nvidia-smi \n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     7W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  |    0   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                198MiB |\n  |    0   N/A  N/A      1893      G   /usr/bin/gnome-shell               10MiB |\n  |    1   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                  4MiB |\n  +-----------------------------------------------------------------------------+\n  ```\n\nIf the output of nvidia-smi does not look like the above, please install the NVIDIA driver that matches your GPU.\n\nIf you are not familiar with installing NVIDIA drivers, you can install them with the following commands.\n\n  ```bash\n  sudo add-apt-repository ppa:graphics-drivers/ppa\n  sudo apt update && sudo apt install -y ubuntu-drivers-common\n  sudo ubuntu-drivers autoinstall\n  sudo reboot\n  ```\n\n## 2. Install NVIDIA-Docker\n\nLet's install NVIDIA-Docker.\n\n```bash\ncurl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \\\n  sudo apt-key add -\ndistribution=$(. 
/etc/os-release;echo $ID$VERSION_ID)\ncurl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list\nsudo apt-get update\nsudo apt-get install -y nvidia-docker2 &&\nsudo systemctl restart docker\n```\n\nTo check if it is installed correctly, we will run the docker container using the GPU.\n\n```bash\nsudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n```\n\nIf the following message appears, it means that the installation was successful: \n\n  ```bash\n  mlops@ubuntu:~$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     6W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  +-----------------------------------------------------------------------------+\n  ```\n\n## 3. Setting NVIDIA-Docker as the Default Container Runtime\n\nBy default, Kubernetes uses Docker-CE as the default container runtime. To use NVIDIA GPU within Docker containers, you need to configure NVIDIA-Docker as the container runtime and modify the default runtime for creating pods.\n\n1. Open the `/etc/docker/daemon.json` file and make the following modifications:\n\n  ```bash\n  sudo vi /etc/docker/daemon.json\n\n  {\n    \"default-runtime\": \"nvidia\",\n    \"runtimes\": {\n        \"nvidia\": {\n            \"path\": \"nvidia-container-runtime\",\n            \"runtimeArgs\": []\n        }\n    }\n  }\n  ```\n\n2. After confirming the file changes, restart Docker.\n\n  ```bash\n  sudo systemctl daemon-reload\n  sudo service docker restart\n  ```\n\n3. 
Verify that the changes have been applied.\n\n  ```bash\n  sudo docker info | grep nvidia\n  ```\n\n  If you see the following message, it means that the installation was successful.\n\n  ```bash\n  mlops@ubuntu:~$ docker info | grep nvidia\n  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc\n  Default Runtime: nvidia\n  ```\n\n## 4. Nvidia-Device-Plugin\n\n1. Create the nvidia-device-plugin daemonset.\n\n  ```bash\n  kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.10.0/nvidia-device-plugin.yml\n  ```\n\n2. Verify that the nvidia-device-plugin pod is in the Running state.\n\n  ```bash\n  kubectl get pod -n kube-system | grep nvidia\n  ```\n\n  You should see the following output:\n\n  ```bash\n  kube-system   nvidia-device-plugin-daemonset-nlqh2   1/1     Running   0    1h\n  ```\n\n3. Verify that the nodes have been configured to have GPUs available.\n\n  ```bash\n  kubectl get nodes \"-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\\.com/gpu\"\n  ```\n\n  If you see the following message, it means that the configuration was successful.  \n  (In the *MLOps for ALL* tutorial cluster, there are two GPUs, so the output is 2.\n  If the output shows the correct number of GPUs for your cluster, it is fine.)\n\n  ```bash\n  NAME       GPU\n  ubuntu     2\n  ```\n\n  If it is not configured, the GPU value will be displayed as `<none>`.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs/version-1.0.json",
    "content": "{\n  \"version.label\": {\n    \"message\": \"1.0\",\n    \"description\": \"The label for version 1.0\"\n  },\n  \"sidebar.tutorialSidebar.category.Introduction\": {\n    \"message\": \"Introduction\",\n    \"description\": \"The label for category Introduction in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Kubernetes\": {\n    \"message\": \"Setup Kubernetes\",\n    \"description\": \"The label for category Setup Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.4. Install Kubernetes\": {\n    \"message\": \"4. Install Kubernetes\",\n    \"description\": \"The label for category 4. Install Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Components\": {\n    \"message\": \"Setup Components\",\n    \"description\": \"The label for category Setup Components in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow UI Guide\": {\n    \"message\": \"Kubeflow UI Guide\",\n    \"description\": \"The label for category Kubeflow UI Guide in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow\": {\n    \"message\": \"Kubeflow\",\n    \"description\": \"The label for category Kubeflow in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.API Deployment\": {\n    \"message\": \"API Deployment\",\n    \"description\": \"The label for category API Deployment in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Appendix\": {\n    \"message\": \"Appendix\",\n    \"description\": \"The label for category Appendix in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Further Readings\": {\n    \"message\": \"Further Readings\",\n    \"description\": \"The label for category Further Readings in sidebar tutorialSidebar\"\n  },\n  \"sidebar.preSidebar.category.Docker\": {\n    \"message\": \"Docker\",\n    \"description\": \"The label for category Docker 
in sidebar preSidebar\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs-community/current/community/community.md",
    "content": "---\ntitle: \"Community\"\nsidebar_position: 1\n---\n\n### *MLOps for ALL* 릴리즈 소식\n\n새로운 포스트나 수정사항은 [Announcements](https://github.com/mlops-for-all/mlops-for-all.github.io/discussions/categories/announcements)에서 확인할 수 있습니다.\n\n### Question\n\n프로젝트 내용과 관련된 궁금점은 [Q&A](https://github.com/mlops-for-all/mlops-for-all.github.io/discussions/categories/q-a)를 통해 질문할 수 있습니다.\n\n### Suggestion\n\n제안점은 [Ideas](https://github.com/mlops-for-all/mlops-for-all.github.io/discussions/categories/ideas)를 통해 제안해 주시면 됩니다.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs-community/current/community/contributors.md",
    "content": "---\nsidebar_position: 3\n---\n\n# Contributors\n\n## Main Authors\n\nimport {\n  MainAuthorRow,\n} from '@site/src/components/TeamProfileCards';\n\n<MainAuthorRow />\n\n\n## Contributors\nThank you for contributing our tutorials!\n\nimport {\n  ContributorsRow,\n} from '@site/src/components/TeamProfileCards';\n\n<ContributorsRow />\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs-community/current/community/how-to-contribute.md",
    "content": "---\ntitle: \"How to Contribute\"\nsidebar_position: 2\n---\n\n## How to Start\n\n### Git Repo 준비\n\n1. [*MLOps for ALL* GitHub Repository](https://github.com/mlops-for-all/mlops-for-all.github.io)에 접속합니다.\n\n2. 여러분의 개인 Repository로 `Fork`합니다.\n\n3. Forked Repository를 여러분의 작업 환경으로 `git clone`합니다.\n\n### 환경 설정\n\n1. MLOps for ALL는 Hugo 와 Node를 이용하고 있습니다.  \n  다음 명령어를 통해 필요한 패키지가 설치되어 있는지 확인합니다.\n\n- node & npm\n\n    ```bash\n    npm --version\n    ```\n\n- hugo\n\n    ```bash\n    hugo version\n    ```\n\n1. 필요한 node module을 설치합니다.\n\n    ```bash\n    npm install\n    ```\n\n2. 프로젝트에서는 각 글의 일관성을 위해서 여러 markdown lint를 적용하고 있습니다.  \n  다음 명령어를 실행해 test를 진행한 후 커밋합니다.내용 수정 및 추가 후 lint가 맞는지 확인합니다.\n\n    ```bash\n    npm test\n    ```\n\n4. lint 확인 완료 후 ci 를 실행합니다.\n\n    ```bash\n    npm ci\n    ```\n\n4. 로컬에서 실행 후 수정한 글이 정상적으로 나오는지 확인합니다.\n\n    ```bash\n    npm run start\n    ```\n\n## How to Contribute\n\n### 1. 새로운 포스트를 작성할 때\n\n새로운 포스트는 각 챕터와 포스트의 위치에 맞는 weight를 설정합니다.\n\n- Introduction: 1xx\n- Setup: 2xx\n- Kubeflow: 3xx\n- API Deployment: 4xx\n- Help: 10xx\n\n### 2. 기존의 포스트를 수정할 때\n\n기존의 포스트를 수정할 때 Contributor에 본인의 이름을 입력합니다.\n\n```markdown\ncontributors: [\"John Doe\", \"Adam Smith\"]\n```\n\n### 3. 프로젝트에 처음 기여할 때\n\n만약 프로젝트에 처음 기여 할 때 `content/kor/contributors`에 본인의 이름으로 폴더를 생성한 후, `_index.md`라는 파일을 작성합니다.\n\n예를 들어, `minsoo kim`이 본인의 영어 이름이라면, 폴더명은 `minsoo-kim`으로 하여 해당 폴더 내부의 `_index.md`파일에 다음의 내용을 작성합니다.\n폴더명은 하이픈(-)으로 연결한 소문자로, title은 띄어쓰기를 포함한 CamelCase로 작성합니다.\n\n```markdown\n---\ntitle: \"John Doe\"\ndraft: false\n---\n```\n\n## After Pull Request\n\nPull Request를 생성하면 프로젝트에서는 자동으로 *MLOps for ALL* 운영진에게 리뷰 요청이 전해집니다. 최대 일주일 이내로 확인 후 Comment를 드릴 예정입니다.\n"
  },
  {
    "path": "i18n/en/docusaurus-plugin-content-docs-community/current.json",
    "content": "{\n  \"version.label\": {\n    \"message\": \"Next\",\n    \"description\": \"The label for version current\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-theme-classic/footer.json",
    "content": "{\n  \"copyright\": {\n    \"message\": \"Copyright © 2021-2023 MakinaRocks. Built with Docusaurus.\",\n    \"description\": \"The footer copyright\"\n  }\n}\n"
  },
  {
    "path": "i18n/en/docusaurus-theme-classic/navbar.json",
    "content": "{\n  \"title\": {\n    \"message\": \"MLOps for ALL\",\n    \"description\": \"The title in the navbar\"\n  },\n  \"logo.alt\": {\n    \"message\": \"My Site Logo\",\n    \"description\": \"The alt text of navbar logo\"\n  },\n  \"item.label.Tutorial\": {\n    \"message\": \"Tutorial\",\n    \"description\": \"Navbar item with label Tutorial\"\n  },\n  \"item.label.Prerequisites\": {\n    \"message\": \"Prerequisites\",\n    \"description\": \"Navbar item with label Prerequisites\"\n  },\n  \"item.label.Community\": {\n    \"message\": \"Community\",\n    \"description\": \"Navbar item with label Community\"\n  },\n  \"item.label.GitHub\": {\n    \"message\": \"GitHub\",\n    \"description\": \"Navbar item with label GitHub\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/code.json",
    "content": "{\n  \"team.profile.Jongseob Jeon.body\": {\n    \"message\": \"마키나락스에서 머신러닝 엔지니어로 일하고 있습니다. 모두의 딥러닝을 통해 많은 사람들이 딥러닝을 쉽게 접했듯이 모두의 MLOps를 통해 많은 사람들이 MLOps에 쉽게 접할수 있길 바랍니다.\"\n  },\n  \"team.profile.Jaeyeon Kim.body\": {\n    \"message\": \"비효율적인 작업을 자동화하는 것에 관심이 많습니다.\"\n  },\n  \"team.profile.Youngchel Jang.body\": {\n    \"message\": \"마키나락스에서 MLOps Engineer로 일하고 있습니다. 단순하게 생각하는 노력을 하고 있습니다.\"\n  },\n  \"team.profile.Jongsun Shinn.body\": {\n    \"message\": \"마키나락스에서 ML Engineer로 일하고 있습니다.\"\n  },\n  \"team.profile.Sangwoo Shim.body\": {\n    \"message\": \"마키나락스에서 CTO로 일하고 있습니다. 마키나락스는 머신러닝 기반의 산업용 AI 솔루션을 개발하는 스타트업입니다. 산업 현장의 문제 해결을 통해 사람이 본연의 일에 집중할 수 있게 만드는 것, 그것이 우리가 하는 일입니다.\"\n  },\n  \"team.profile.Seunghyun Ko.body\": {\n    \"message\": \"3i에서 MLOps Engineer로 일하고 있습니다. kubeflow에 관심이 많습니다.\"\n  },\n  \"team.profile.SeungTae Kim.body\": {\n    \"message\": \"Genesis Lab이라는 스타트업에서 Applied AI Engineer 인턴 업무를 수행하고 있습니다. 머신러닝 생태계가 우리 산업 전반에 큰 변화을 가져올 것이라 믿으며, 한 걸음씩 나아가고 있습니다.\"\n  },\n  \"team.profile.Youngdon Tae.body\": {\n    \"message\": \"백패커에서 ML 엔지니어로 일하고 있습니다. 
자연어처리, 추천시스템, MLOps에 관심이 많습니다.\"\n  },\n  \"theme.ErrorPageContent.title\": {\n    \"message\": \"페이지에 오류가 발생하였습니다.\",\n    \"description\": \"The title of the fallback page when the page crashed\"\n  },\n  \"theme.NotFound.title\": {\n    \"message\": \"페이지를 찾을 수 없습니다.\",\n    \"description\": \"The title of the 404 page\"\n  },\n  \"theme.NotFound.p1\": {\n    \"message\": \"원하는 페이지를 찾을 수 없습니다.\",\n    \"description\": \"The first paragraph of the 404 page\"\n  },\n  \"theme.NotFound.p2\": {\n    \"message\": \"사이트 관리자에게 링크가 깨진 것을 알려주세요.\",\n    \"description\": \"The 2nd paragraph of the 404 page\"\n  },\n  \"theme.admonition.note\": {\n    \"message\": \"노트\",\n    \"description\": \"The default label used for the Note admonition (:::note)\"\n  },\n  \"theme.admonition.tip\": {\n    \"message\": \"팁\",\n    \"description\": \"The default label used for the Tip admonition (:::tip)\"\n  },\n  \"theme.admonition.danger\": {\n    \"message\": \"위험\",\n    \"description\": \"The default label used for the Danger admonition (:::danger)\"\n  },\n  \"theme.admonition.info\": {\n    \"message\": \"정보\",\n    \"description\": \"The default label used for the Info admonition (:::info)\"\n  },\n  \"theme.admonition.caution\": {\n    \"message\": \"주의\",\n    \"description\": \"The default label used for the Caution admonition (:::caution)\"\n  },\n  \"theme.BackToTopButton.buttonAriaLabel\": {\n    \"message\": \"맨 위로 스크롤하기\",\n    \"description\": \"The ARIA label for the back to top button\"\n  },\n  \"theme.blog.paginator.navAriaLabel\": {\n    \"message\": \"블로그 게시물 목록 탐색\",\n    \"description\": \"The ARIA label for the blog pagination\"\n  },\n  \"theme.blog.paginator.newerEntries\": {\n    \"message\": \"이전 페이지\",\n    \"description\": \"The label used to navigate to the newer blog posts page (previous page)\"\n  },\n  \"theme.blog.paginator.olderEntries\": {\n    \"message\": \"다음 페이지\",\n    \"description\": \"The label used to navigate to the older blog posts 
page (next page)\"\n  },\n  \"theme.blog.archive.title\": {\n    \"message\": \"게시물 목록\",\n    \"description\": \"The page & hero title of the blog archive page\"\n  },\n  \"theme.blog.archive.description\": {\n    \"message\": \"게시물 목록\",\n    \"description\": \"The page & hero description of the blog archive page\"\n  },\n  \"theme.blog.post.paginator.navAriaLabel\": {\n    \"message\": \"블로그 게시물 탐색\",\n    \"description\": \"The ARIA label for the blog posts pagination\"\n  },\n  \"theme.blog.post.paginator.newerPost\": {\n    \"message\": \"이전 게시물\",\n    \"description\": \"The blog post button label to navigate to the newer/previous post\"\n  },\n  \"theme.blog.post.paginator.olderPost\": {\n    \"message\": \"다음 게시물\",\n    \"description\": \"The blog post button label to navigate to the older/next post\"\n  },\n  \"theme.blog.post.plurals\": {\n    \"message\": \"{count}개 게시물\",\n    \"description\": \"Pluralized label for \\\"{count} posts\\\". Use as much plural forms (separated by \\\"|\\\") as your language support (see https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html)\"\n  },\n  \"theme.blog.tagTitle\": {\n    \"message\": \"\\\"{tagName}\\\" 태그로 연결된 {nPosts}개의 게시물이 있습니다.\",\n    \"description\": \"The title of the page for a blog tag\"\n  },\n  \"theme.tags.tagsPageLink\": {\n    \"message\": \"모든 태그 보기\",\n    \"description\": \"The label of the link targeting the tag list page\"\n  },\n  \"theme.colorToggle.ariaLabel\": {\n    \"message\": \"어두운 모드와 밝은 모드 전환하기 (현재 {mode})\",\n    \"description\": \"The ARIA label for the navbar color mode toggle\"\n  },\n  \"theme.colorToggle.ariaLabel.mode.dark\": {\n    \"message\": \"어두운 모드\",\n    \"description\": \"The name for the dark color mode\"\n  },\n  \"theme.colorToggle.ariaLabel.mode.light\": {\n    \"message\": \"밝은 모드\",\n    \"description\": \"The name for the light color mode\"\n  },\n  \"theme.docs.DocCard.categoryDescription\": {\n    \"message\": \"{count} 
항목\",\n    \"description\": \"The default description for a category card in the generated index about how many items this category includes\"\n  },\n  \"theme.docs.breadcrumbs.navAriaLabel\": {\n    \"message\": \"Breadcrumbs\",\n    \"description\": \"The ARIA label for the breadcrumbs\"\n  },\n  \"theme.docs.paginator.navAriaLabel\": {\n    \"message\": \"문서 페이지\",\n    \"description\": \"The ARIA label for the docs pagination\"\n  },\n  \"theme.docs.paginator.previous\": {\n    \"message\": \"이전\",\n    \"description\": \"The label used to navigate to the previous doc\"\n  },\n  \"theme.docs.paginator.next\": {\n    \"message\": \"다음\",\n    \"description\": \"The label used to navigate to the next doc\"\n  },\n  \"theme.docs.tagDocListPageTitle.nDocsTagged\": {\n    \"message\": \"{count}개 문서가\",\n    \"description\": \"Pluralized label for \\\"{count} docs tagged\\\". Use as much plural forms (separated by \\\"|\\\") as your language support (see https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html)\"\n  },\n  \"theme.docs.tagDocListPageTitle\": {\n    \"message\": \"{nDocsTagged} \\\"{tagName}\\\" 태그에 분류되었습니다\",\n    \"description\": \"The title of the page for a docs tag\"\n  },\n  \"theme.docs.versionBadge.label\": {\n    \"message\": \"버전: {versionLabel}\"\n  },\n  \"theme.docs.versions.unreleasedVersionLabel\": {\n    \"message\": \"{siteTitle} {versionLabel} 문서는 아직 정식 공개되지 않았습니다.\",\n    \"description\": \"The label used to tell the user that he's browsing an unreleased doc version\"\n  },\n  \"theme.docs.versions.unmaintainedVersionLabel\": {\n    \"message\": \"{siteTitle} {versionLabel} 문서는 더 이상 업데이트되지 않습니다.\",\n    \"description\": \"The label used to tell the user that he's browsing an unmaintained doc version\"\n  },\n  \"theme.docs.versions.latestVersionSuggestionLabel\": {\n    \"message\": \"최신 문서는 {latestVersionLink} ({versionLabel})을 확인하세요.\",\n    \"description\": \"The label used to tell the user to check 
the latest version\"\n  },\n  \"theme.docs.versions.latestVersionLinkLabel\": {\n    \"message\": \"최신 버전\",\n    \"description\": \"The label used for the latest version suggestion link label\"\n  },\n  \"theme.common.editThisPage\": {\n    \"message\": \"페이지 편집\",\n    \"description\": \"The link label to edit the current page\"\n  },\n  \"theme.common.headingLinkTitle\": {\n    \"message\": \"{heading}에 대한 직접 링크\",\n    \"description\": \"Title for link to heading\"\n  },\n  \"theme.lastUpdated.atDate\": {\n    \"message\": \" {date}에\",\n    \"description\": \"The words used to describe on which date a page has been last updated\"\n  },\n  \"theme.lastUpdated.byUser\": {\n    \"message\": \" {user}가\",\n    \"description\": \"The words used to describe by who the page has been last updated\"\n  },\n  \"theme.lastUpdated.lastUpdatedAtBy\": {\n    \"message\": \"최종 수정: {atDate}{byUser}\",\n    \"description\": \"The sentence used to display when a page has been last updated, and by who\"\n  },\n  \"theme.navbar.mobileVersionsDropdown.label\": {\n    \"message\": \"버전\",\n    \"description\": \"The label for the navbar versions dropdown on mobile view\"\n  },\n  \"theme.tags.tagsListLabel\": {\n    \"message\": \"태그:\",\n    \"description\": \"The label alongside a tag list\"\n  },\n  \"theme.AnnouncementBar.closeButtonAriaLabel\": {\n    \"message\": \"닫기\",\n    \"description\": \"The ARIA label for close button of announcement bar\"\n  },\n  \"theme.blog.sidebar.navAriaLabel\": {\n    \"message\": \"최근 블로그 문서 둘러보기\",\n    \"description\": \"The ARIA label for recent posts in the blog sidebar\"\n  },\n  \"theme.CodeBlock.wordWrapToggle\": {\n    \"message\": \"줄 바꿈 전환\",\n    \"description\": \"The title attribute for toggle word wrapping button of code block lines\"\n  },\n  \"theme.CodeBlock.copied\": {\n    \"message\": \"복사했습니다\",\n    \"description\": \"The copied button label on code blocks\"\n  },\n  \"theme.CodeBlock.copyButtonAriaLabel\": {\n    
\"message\": \"클립보드에 코드 복사\",\n    \"description\": \"The ARIA label for copy code blocks button\"\n  },\n  \"theme.CodeBlock.copy\": {\n    \"message\": \"복사\",\n    \"description\": \"The copy button label on code blocks\"\n  },\n  \"theme.DocSidebarItem.toggleCollapsedCategoryAriaLabel\": {\n    \"message\": \"접을 수 있는 사이드바 분류 '{label}' 접기(펼치기)\",\n    \"description\": \"The ARIA label to toggle the collapsible sidebar category\"\n  },\n  \"theme.NavBar.navAriaLabel\": {\n    \"message\": \"Main\",\n    \"description\": \"The ARIA label for the main navigation\"\n  },\n  \"theme.navbar.mobileLanguageDropdown.label\": {\n    \"message\": \"언어\",\n    \"description\": \"The label for the mobile language switcher dropdown\"\n  },\n  \"theme.TOCCollapsible.toggleButtonLabel\": {\n    \"message\": \"이 페이지에서\",\n    \"description\": \"The label used by the button on the collapsible TOC component\"\n  },\n  \"theme.blog.post.readMore\": {\n    \"message\": \"자세히 보기\",\n    \"description\": \"The label used in blog post item excerpts to link to full blog posts\"\n  },\n  \"theme.blog.post.readMoreLabel\": {\n    \"message\": \"{title} 에 대해 더 읽어보기\",\n    \"description\": \"The ARIA label for the link to full blog posts from excerpts\"\n  },\n  \"theme.blog.post.readingTime.plurals\": {\n    \"message\": \"약 {readingTime}분\",\n    \"description\": \"Pluralized label for \\\"{readingTime} min read\\\". 
Use as much plural forms (separated by \\\"|\\\") as your language support (see https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html)\"\n  },\n  \"theme.docs.breadcrumbs.home\": {\n    \"message\": \"홈\",\n    \"description\": \"The ARIA label for the home page in the breadcrumbs\"\n  },\n  \"theme.docs.sidebar.collapseButtonTitle\": {\n    \"message\": \"사이드바 숨기기\",\n    \"description\": \"The title attribute for collapse button of doc sidebar\"\n  },\n  \"theme.docs.sidebar.collapseButtonAriaLabel\": {\n    \"message\": \"사이드바 숨기기\",\n    \"description\": \"The title attribute for collapse button of doc sidebar\"\n  },\n  \"theme.docs.sidebar.navAriaLabel\": {\n    \"message\": \"Docs sidebar\",\n    \"description\": \"The ARIA label for the sidebar navigation\"\n  },\n  \"theme.docs.sidebar.closeSidebarButtonAriaLabel\": {\n    \"message\": \"Close navigation bar\",\n    \"description\": \"The ARIA label for close button of mobile sidebar\"\n  },\n  \"theme.navbar.mobileSidebarSecondaryMenu.backButtonLabel\": {\n    \"message\": \"← 메인 메뉴로 돌아가기\",\n    \"description\": \"The label of the back button to return to main menu, inside the mobile navbar sidebar secondary menu (notably used to display the docs sidebar)\"\n  },\n  \"theme.docs.sidebar.toggleSidebarButtonAriaLabel\": {\n    \"message\": \"Toggle navigation bar\",\n    \"description\": \"The ARIA label for hamburger menu button of mobile navigation\"\n  },\n  \"theme.docs.sidebar.expandButtonTitle\": {\n    \"message\": \"사이드바 열기\",\n    \"description\": \"The ARIA label and title attribute for expand button of doc sidebar\"\n  },\n  \"theme.docs.sidebar.expandButtonAriaLabel\": {\n    \"message\": \"사이드바 열기\",\n    \"description\": \"The ARIA label and title attribute for expand button of doc sidebar\"\n  },\n  \"theme.ErrorPageContent.tryAgain\": {\n    \"message\": \"다시 시도해 보세요\",\n    \"description\": \"The label of the button to try again rendering when the React 
error boundary captures an error\"\n  },\n  \"theme.common.skipToMainContent\": {\n    \"message\": \"본문으로 건너뛰기\",\n    \"description\": \"The skip to content label used for accessibility, allowing to rapidly navigate to main content with keyboard tab/enter navigation\"\n  },\n  \"theme.tags.tagsPageTitle\": {\n    \"message\": \"태그\",\n    \"description\": \"The title of the tag list page\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/docusaurus-plugin-content-blog/options.json",
    "content": "{\n  \"title\": {\n    \"message\": \"Blog\",\n    \"description\": \"The title for the blog used in SEO\"\n  },\n  \"description\": {\n    \"message\": \"Blog\",\n    \"description\": \"The description for the blog used in SEO\"\n  },\n  \"sidebar.title\": {\n    \"message\": \"Recent posts\",\n    \"description\": \"The label for the left sidebar\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/docusaurus-plugin-content-docs/current.json",
    "content": "{\n  \"version.label\": {\n    \"message\": \"1.0\",\n    \"description\": \"The label for version current\"\n  },\n  \"sidebar.tutorialSidebar.category.Introduction\": {\n    \"message\": \"Introduction\",\n    \"description\": \"The label for category Introduction in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Kubernetes\": {\n    \"message\": \"Setup Kubernetes\",\n    \"description\": \"The label for category Setup Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.4. Install Kubernetes\": {\n    \"message\": \"4. Install Kubernetes\",\n    \"description\": \"The label for category 4. Install Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Components\": {\n    \"message\": \"Setup Components\",\n    \"description\": \"The label for category Setup Components in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow UI Guide\": {\n    \"message\": \"Kubeflow UI Guide\",\n    \"description\": \"The label for category Kubeflow UI Guide in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow\": {\n    \"message\": \"Kubeflow\",\n    \"description\": \"The label for category Kubeflow in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.API Deployment\": {\n    \"message\": \"API Deployment\",\n    \"description\": \"The label for category API Deployment in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Appendix\": {\n    \"message\": \"Appendix\",\n    \"description\": \"The label for category Appendix in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Further Readings\": {\n    \"message\": \"Further Readings\",\n    \"description\": \"The label for category Further Readings in sidebar tutorialSidebar\"\n  },\n  \"sidebar.preSidebar.category.Docker\": {\n    \"message\": \"Docker\",\n    \"description\": \"The label for category 
Docker in sidebar preSidebar\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/docusaurus-plugin-content-docs/version-1.0.json",
    "content": "{\n  \"version.label\": {\n    \"message\": \"1.0\",\n    \"description\": \"The label for version 1.0\"\n  },\n  \"sidebar.tutorialSidebar.category.Introduction\": {\n    \"message\": \"Introduction\",\n    \"description\": \"The label for category Introduction in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Kubernetes\": {\n    \"message\": \"Setup Kubernetes\",\n    \"description\": \"The label for category Setup Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.4. Install Kubernetes\": {\n    \"message\": \"4. Install Kubernetes\",\n    \"description\": \"The label for category 4. Install Kubernetes in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Setup Components\": {\n    \"message\": \"Setup Components\",\n    \"description\": \"The label for category Setup Components in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow UI Guide\": {\n    \"message\": \"Kubeflow UI Guide\",\n    \"description\": \"The label for category Kubeflow UI Guide in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Kubeflow\": {\n    \"message\": \"Kubeflow\",\n    \"description\": \"The label for category Kubeflow in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.API Deployment\": {\n    \"message\": \"API Deployment\",\n    \"description\": \"The label for category API Deployment in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Appendix\": {\n    \"message\": \"Appendix\",\n    \"description\": \"The label for category Appendix in sidebar tutorialSidebar\"\n  },\n  \"sidebar.tutorialSidebar.category.Further Readings\": {\n    \"message\": \"Further Readings\",\n    \"description\": \"The label for category Further Readings in sidebar tutorialSidebar\"\n  },\n  \"sidebar.preSidebar.category.Docker\": {\n    \"message\": \"Docker\",\n    \"description\": \"The label for category Docker 
in sidebar preSidebar\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/docusaurus-plugin-content-docs-community/current.json",
    "content": "{\n  \"version.label\": {\n    \"message\": \"Next\",\n    \"description\": \"The label for version current\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/docusaurus-theme-classic/footer.json",
    "content": "{\n  \"copyright\": {\n    \"message\": \"Copyright © 2021-2023 MakinaRocks. Built with Docusaurus.\",\n    \"description\": \"The footer copyright\"\n  }\n}\n"
  },
  {
    "path": "i18n/ko/docusaurus-theme-classic/navbar.json",
    "content": "{\n  \"title\": {\n    \"message\": \"MLOps for ALL\",\n    \"description\": \"The title in the navbar\"\n  },\n  \"logo.alt\": {\n    \"message\": \"MLOps for ALL Logo\",\n    \"description\": \"The alt text of navbar logo\"\n  },\n  \"item.label.Tutorial\": {\n    \"message\": \"Tutorial\",\n    \"description\": \"Navbar item with label Tutorial\"\n  },\n  \"item.label.Prerequisites\": {\n    \"message\": \"Prerequisites\",\n    \"description\": \"Navbar item with label Prerequisites\"\n  },\n  \"item.label.Community\": {\n    \"message\": \"Community\",\n    \"description\": \"Navbar item with label Community\"\n  },\n  \"item.label.GitHub\": {\n    \"message\": \"GitHub\",\n    \"description\": \"Navbar item with label GitHub\"\n  }\n}\n"
  },
  {
    "path": "package.json",
    "content": "{\n  \"name\": \"v-2\",\n  \"version\": \"0.0.0\",\n  \"private\": true,\n  \"scripts\": {\n    \"docusaurus\": \"docusaurus\",\n    \"start\": \"docusaurus start\",\n    \"build\": \"docusaurus build\",\n    \"swizzle\": \"docusaurus swizzle\",\n    \"deploy\": \"docusaurus deploy\",\n    \"clear\": \"docusaurus clear\",\n    \"serve\": \"docusaurus serve\",\n    \"write-translations\": \"docusaurus write-translations\",\n    \"write-heading-ids\": \"docusaurus write-heading-ids\",\n    \"typecheck\": \"tsc\"\n  },\n  \"dependencies\": {\n    \"@docusaurus/core\": \"2.4.1\",\n    \"@docusaurus/plugin-content-docs\": \"^2.4.1\",\n    \"@docusaurus/plugin-google-gtag\": \"^2.4.1\",\n    \"@docusaurus/plugin-sitemap\": \"^2.4.1\",\n    \"@docusaurus/preset-classic\": \"2.4.1\",\n    \"@mdx-js/react\": \"^1.6.22\",\n    \"clsx\": \"^1.2.1\",\n    \"prism-react-renderer\": \"^1.3.5\",\n    \"react\": \"^17.0.2\",\n    \"react-dom\": \"^17.0.2\"\n  },\n  \"devDependencies\": {\n    \"@docusaurus/module-type-aliases\": \"2.4.1\",\n    \"@tsconfig/docusaurus\": \"^1.0.5\",\n    \"typescript\": \"^4.7.4\"\n  },\n  \"browserslist\": {\n    \"production\": [\n      \">0.5%\",\n      \"not dead\",\n      \"not op_mini all\"\n    ],\n    \"development\": [\n      \"last 1 chrome version\",\n      \"last 1 firefox version\",\n      \"last 1 safari version\"\n    ]\n  },\n  \"engines\": {\n    \"node\": \">=16.14\"\n  }\n}\n"
  },
  {
    "path": "python/env/.gitkeep",
    "content": ""
  },
  {
    "path": "python/pyproject.toml",
    "content": "[tool.poetry]\nname = \"mlops-for-all\"\nversion = \"0.1.0\"\ndescription = \"Scripts for translation\"\nauthors = [\"Aiden-Jeon <aiden.jongseob@gmail.com>\"]\nreadme = \"README.md\"\npackages = [{include = \"mlops_for_all\"}]\n\n[tool.poetry.dependencies]\npython = \"^3.9\"\nopenai = \"^0.27.8\"\npython-dotenv = \"^1.0.0\"\nlangchain = {extras = [\"llms\"], version = \"^0.0.228\"}\n\n\n[build-system]\nrequires = [\"poetry-core\"]\nbuild-backend = \"poetry.core.masonry.api\"\n"
  },
  {
    "path": "python/translation/main.py",
    "content": "import os\nfrom pathlib import Path\n\nimport dotenv\nfrom langchain.llms import OpenAI\nfrom langchain.schema import HumanMessage\n\n\nROOT_PATH = Path(__file__).parent\nOPENAI_ENV_PATH = ROOT_PATH.parent / \"env\" / \"openai.env\"\ndotenv.load_dotenv(OPENAI_ENV_PATH)\n\nOPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\")\n\nOPENAI_MODEL = OpenAI(openai_api_key=OPENAI_API_KEY)\n\n\ndef request_prompt(source_sentence):\n    translated_sentence = \"\\n\"\n    if source_sentence:\n        translation_prompt = HumanMessage(\n            content=f\"Translate these sentences from Korean to English. {source_sentence}\"\n        )\n        translated_sentence = OPENAI_MODEL.predict_messages([translation_prompt]).content\n    return translated_sentence + \"\\n\"\n\n\ndef translate(source_path, dest_path):\n    translate_lines = []\n    with open(source_path, \"r\") as f:\n        line = f.readline()\n        translate_lines += [line]\n        lines = []\n        is_codeblock = False\n        is_header = True\n        while line:\n            line = f.readline()\n            # Front-matter delimiter\n            if line.startswith(\"---\"):\n                is_header = False\n                translate_lines += [line]\n                continue\n            if is_header:\n                # Inside the front-matter block: copy lines through untranslated.\n                translate_lines += [line]\n                continue\n\n            # Code block fence\n            if line.startswith(\"```\"):\n                if not is_codeblock:\n                    # At the start of a code block, translate the accumulated prose first.\n                    source_sentence = \"\".join(lines)\n                    translated_sentence = request_prompt(source_sentence)\n                    translate_lines += [translated_sentence]\n                    # Reset the buffer and mark that we are inside a code block.\n                    lines = []\n                    is_codeblock = True\n                else:\n                    # End of the code block\n                    is_codeblock = False\n                    translate_lines += [line]\n                    continue\n            if is_codeblock:\n                # Inside a code block: copy lines through untranslated.\n                translate_lines += [line]\n                continue\n            lines += [line]\n            if len(lines) > 10:\n                # Translate early once enough lines have accumulated.\n                source_sentence = \"\".join(lines)\n                translated_sentence = request_prompt(source_sentence)\n                translate_lines += [translated_sentence]\n                lines = []\n\n    # Translate whatever remains in the buffer.\n    source_sentence = \"\".join(lines)\n    translated_sentence = request_prompt(source_sentence)\n    translate_lines += [translated_sentence]\n\n    docs = \"\".join(translate_lines)\n    with open(dest_path, \"w\") as f:\n        f.write(docs)\n\n\nif __name__ == \"__main__\":\n    from argparse import ArgumentParser\n\n    parser = ArgumentParser()\n    parser.add_argument(\"--chapter\", type=str)\n    args = parser.parse_args()\n    REPO_ROOT = ROOT_PATH.parent.parent\n    DOCS_ROOT = REPO_ROOT / \"docs\" / args.chapter\n    DEST_ROOT = REPO_ROOT / \"i18n/en/docusaurus-plugin-content-docs/version-1.0\" / args.chapter\n    # Make sure the destination directory exists before writing.\n    DEST_ROOT.mkdir(parents=True, exist_ok=True)\n\n    for source_path in DOCS_ROOT.glob(\"*.md\"):\n        dest_path = DEST_ROOT / source_path.name\n        print(\"source : \", source_path)\n        translate(source_path, dest_path)\n        print(\"dest : \", dest_path)\n"
  },
  {
    "path": "sidebars.js",
    "content": "/**\n * Creating a sidebar enables you to:\n - create an ordered group of docs\n - render a sidebar for each doc of that group\n - provide next/previous navigation\n\n The sidebars can be generated from the filesystem, or explicitly defined here.\n\n Create as many sidebars as you want.\n */\n\n// @ts-check\n\n/** @type {import('@docusaurus/plugin-content-docs').SidebarsConfig} */\nconst sidebars = {\n  tutorialSidebar: [\n    {\n      type: \"category\",\n      label: \"Introduction\",\n      items: [\n        \"introduction/intro\",\n        \"introduction/levels\",\n        \"introduction/component\",\n        \"introduction/why_kubernetes\",\n      ],\n    },\n    {\n      type: \"category\",\n      label: \"Setup Kubernetes\",\n      items: [\n        \"setup-kubernetes/intro\",\n        \"setup-kubernetes/kubernetes\",\n        \"setup-kubernetes/install-prerequisite\",\n        {\n          type: \"category\",\n          label: \"4. Install Kubernetes\",\n          items: [\n            \"setup-kubernetes/install-kubernetes/kubernetes-with-k3s\",\n            \"setup-kubernetes/install-kubernetes/kubernetes-with-kubeadm\",\n            \"setup-kubernetes/install-kubernetes/kubernetes-with-minikube\",\n          ],\n        },\n        \"setup-kubernetes/install-kubernetes-module\",\n        \"setup-kubernetes/setup-nvidia-gpu\",\n      ],\n    },\n    {\n      type: \"category\",\n      label: \"Setup Components\",\n      items: [\n        \"setup-components/install-components-kf\",\n        \"setup-components/install-components-mlflow\",\n        \"setup-components/install-components-seldon\",\n        \"setup-components/install-components-pg\",\n      ],\n    },\n    {\n      type: \"category\",\n      label: \"Kubeflow UI Guide\",\n      items: [\n        \"kubeflow-dashboard-guide/intro\",\n        \"kubeflow-dashboard-guide/notebooks\",\n        \"kubeflow-dashboard-guide/tensorboards\",\n        \"kubeflow-dashboard-guide/volumes\",\n   
     \"kubeflow-dashboard-guide/experiments\",\n        \"kubeflow-dashboard-guide/experiments-and-others\",\n      ],\n    },\n    {\n      type: \"category\",\n      label: \"Kubeflow\",\n      items: [\n        \"kubeflow/kubeflow-intro\",\n        \"kubeflow/kubeflow-concepts\",\n        \"kubeflow/basic-requirements\",\n        \"kubeflow/basic-component\",\n        \"kubeflow/basic-pipeline\",\n        \"kubeflow/basic-pipeline-upload\",\n        \"kubeflow/basic-run\",\n        \"kubeflow/advanced-component\",\n        \"kubeflow/advanced-environment\",\n        \"kubeflow/advanced-pipeline\",\n        \"kubeflow/advanced-run\",\n        \"kubeflow/advanced-mlflow\",\n        \"kubeflow/how-to-debug\",\n      ],\n    },\n    {\n      type: \"category\",\n      label: \"API Deployment\",\n      items: [\n        \"api-deployment/what-is-api-deployment\",\n        \"api-deployment/seldon-iris\",\n        \"api-deployment/seldon-pg\",\n        \"api-deployment/seldon-fields\",\n        \"api-deployment/seldon-mlflow\",\n        \"api-deployment/seldon-children\",\n      ],\n    },\n    {\n      type: \"category\",\n      label: \"Appendix\",\n      items: [\"appendix/pyenv\", \"appendix/metallb\"],\n    },\n    {\n      type: \"category\",\n      label: \"Further Readings\",\n      items: [\"further-readings/info\"],\n    },\n  ],\n\n  preSidebar: [\n    {\n      type: \"category\",\n      label: \"Docker\",\n      items: [\n        \"prerequisites/docker/install\",\n        \"prerequisites/docker/introduction\",\n        \"prerequisites/docker/docker\",\n        \"prerequisites/docker/command\",\n        \"prerequisites/docker/images\",\n        \"prerequisites/docker/advanced\",\n      ],\n    },\n  ],\n};\n\nmodule.exports = sidebars;\n"
  },
  {
    "path": "sidebarsCommunity.js",
    "content": "/**\n * Creating a sidebar enables you to:\n - create an ordered group of docs\n - render a sidebar for each doc of that group\n - provide next/previous navigation\n\n The sidebars can be generated from the filesystem, or explicitly defined here.\n\n Create as many sidebars as you want.\n */\n\n// @ts-check\n\n/** @type {import('@docusaurus/plugin-content-docs').SidebarsConfig} */\nconst sidebars = {\n  // By default, Docusaurus generates a sidebar from the docs folder structure\n  tutorialSidebar: [{type: 'autogenerated', dirName: '.'}],\n\n  // But you can create a sidebar manually\n  /*\n  tutorialSidebar: [\n    'intro',\n    'hello',\n    {\n      type: 'category',\n      label: 'Tutorial',\n      items: ['tutorial-basics/create-a-document'],\n    },\n  ],\n   */\n};\n\nmodule.exports = sidebars;\n"
  },
  {
    "path": "src/components/HomepageFeatures/index.tsx",
    "content": "import React from 'react';\nimport clsx from 'clsx';\nimport styles from './styles.module.css';\n\ntype FeatureItem = {\n  title: JSX.Element;\n  Svg: React.ComponentType<React.ComponentProps<'svg'>>;\n  description: JSX.Element;\n};\n\nconst FeatureList: FeatureItem[] = [\n  {\n    title: <a href=\"https://makinarocks.ai/\">MakinaRocks</a>,\n    Svg: require('@site/static/img/undraw_docusaurus_tree.svg').default,\n    description: (\n      <>\n        <p>\n          Sponsored by MakinaRocks\n        </p>\n        \n        이 프로젝트는 MakinaRocks의 지원을 받아 제작되었습니다.\n      </>\n    ),\n  },\n  {\n    title: <a href=\"https://mlops-for-mle.github.io/tutorial\">MLOps for MLE</a>,\n    Svg: require('@site/static/img/undraw_docusaurus_mountain.svg').default,\n    description: (\n      <>\n        <p>\n          ML Engineer를 위한 MLOps Release!\n        </p>\n        \n        구글에서 제안한 MLOps 0단계를 직접 구현하며 MLOps 가 무엇인지 공부할 수 있는 튜토리얼을 오픈했습니다!\n      </>\n    ),\n  },\n];\n\nfunction Feature({title, Svg, description}: FeatureItem) {\n  return (\n    <div className={clsx('col col--6')}>\n      <div className=\"text--center\">\n        <Svg className={styles.featureSvg} role=\"img\" />\n      </div>\n      <div className=\"text--center padding-horiz--md\">\n        <h3>{title}</h3>\n        <div>{description}</div>\n      </div>\n    </div>\n  );\n}\n\nexport default function HomepageFeatures(): JSX.Element {\n  return (\n    <section className={styles.features}>\n      <div className=\"container\">\n        <div className=\"row\">\n          {FeatureList.map((props, idx) => (\n            <Feature key={idx} {...props} />\n          ))}\n        </div>\n      </div>\n    </section>\n  );\n}\n"
  },
  {
    "path": "src/components/HomepageFeatures/styles.module.css",
    "content": ".features {\n  display: flex;\n  align-items: center;\n  padding: 2rem 0;\n  width: 100%;\n}\n\n.featureSvg {\n  height: 200px;\n  width: 200px;\n}\n"
  },
  {
    "path": "src/components/TeamProfileCards/index.tsx",
    "content": "/**\n * Copyright (c) Facebook, Inc. and its affiliates.\n *\n * This source code is licensed under the MIT license found in the\n * LICENSE file in the root directory of this source tree.\n */\n\nimport React, {type ReactNode} from 'react';\nimport Translate from '@docusaurus/Translate';\n\n\ntype ProfileProps = {\n  className?: string;\n  name: string;\n  children: ReactNode;\n  githubUrl: string;\n  linkedinUrl?: string;\n  role?: string;\n};\n\nfunction TeamProfileCard({\n  className,\n  name,\n  children,\n  githubUrl,\n  linkedinUrl,\n  role,\n}: ProfileProps) {\n  return (\n    <div className={className}>\n      <div className=\"card card--full-height\">\n        <div className=\"card__header\">\n          <div className=\"avatar avatar--vertical\">\n            <img\n              className=\"avatar__photo avatar__photo--xl\"\n              src={`${githubUrl}.png`}\n              alt={`${name}'s avatar`}\n            />\n            <div className=\"avatar__intro\">\n              <h3 className=\"avatar__name\">{name}</h3>\n            </div>\n            <div className=\"avatar__role\">\n              <h5 className=\"avatar__role\">{role}</h5>\n            </div>\n          </div>\n        </div>\n        <div className=\"card__body\">{children}</div>\n        <div className=\"card__footer\">\n          <div className=\"button-group button-group--block\">\n            {githubUrl && (\n              <a className=\"button button--secondary\" href={githubUrl}>\n                GitHub\n              </a>\n            )}\n            {linkedinUrl && (\n              <a className=\"button button--secondary\" href={linkedinUrl}>\n                LinkedIn\n              </a>\n            )}\n          </div>\n        </div>\n      </div>\n    </div>\n  );\n}\n\nfunction TeamProfileCardCol(props: ProfileProps) {\n  return (\n    <TeamProfileCard {...props} className=\"col col--6 margin-bottom--lg\" />\n  );\n}\n\nexport function MainAuthorRow(): JSX.Element {\n  
return (\n    <div className=\"row\">\n      <TeamProfileCardCol\n        name=\"Jongseob Jeon\"\n        githubUrl=\"https://github.com/aiden-jeon\"\n        linkedinUrl=\"https://www.linkedin.com/in/jongseob-jeon/\"\n        role=\"Project Leader\"\n        >\n        <Translate id=\"team.profile.Jongseob Jeon.body\">\n        \n        \n        마키나락스에서 머신러닝 엔지니어로 일하고 있습니다.\n\n        모두의 딥러닝을 통해 많은 사람들이 딥러닝을 쉽게 접했듯이  \n        모두의 MLOps를 통해 많은 사람들이 MLOps에 쉽게 접할수 있길 바랍니다.\n\n        </Translate>\n      </TeamProfileCardCol>\n      <TeamProfileCardCol\n        name=\"Jayeon Kim\"\n        githubUrl=\"https://github.com/anencore94\"\n        linkedinUrl=\"https://www.linkedin.com/in/anencore94\"\n        role=\"Project Member\"\n        >\n        <Translate id=\"team.profile.Jaeyeon Kim.body\">\n        비효율적인 작업을 자동화하는 것에 관심이 많습니다.\n        </Translate>\n      </TeamProfileCardCol>\n      <TeamProfileCardCol\n        name=\"Youngchel Jang\"\n        githubUrl=\"https://github.com/zamonia500\"\n        linkedinUrl=\"https://www.linkedin.com/in/youngcheol-jang-b04a45187\"\n        role=\"Project Member\"\n        >\n        <Translate id=\"team.profile.Youngchel Jang.body\">\n        마키나락스에서 MLOps Engineer로 일하고 있습니다.\n\n        단순하게 생각하는 노력을 하고 있습니다.\n        </Translate>\n      </TeamProfileCardCol>\n    </div>\n  );\n}\n\nexport function ContributorsRow(): JSX.Element {\n  return (\n    <div className=\"row\">\n      <TeamProfileCardCol\n        name=\"Jongsun Shinn\"\n        githubUrl=\"https://github.com/jsshinn\"\n        linkedinUrl=\"https://www.linkedin.com/in/jongsun-shinn-311b00140/\"\n        >\n        <Translate id=\"team.profile.Jongsun Shinn.body\">\n        마키나락스에서 ML Engineer로 일하고 있습니다.\n        </Translate>\n      </TeamProfileCardCol>\n      <TeamProfileCardCol\n        name=\"Sangwoo Shim\"\n        githubUrl=\"https://github.com/borishim\"\n        linkedinUrl=\"https://www.linkedin.com/in/sangwooshim/\"\n        >\n        <Translate 
id=\"team.profile.Sangwoo Shim.body\">\n        마키나락스에서 CTO로 일하고 있습니다.\n        마키나락스는 머신러닝 기반의 산업용 AI 솔루션을 개발하는 스타트업입니다.\n        산업 현장의 문제 해결을 통해 사람이 본연의 일에 집중할 수 있게 만드는 것,\n        그것이 우리가 하는 일입니다.\n        </Translate>\n      </TeamProfileCardCol>\n      <TeamProfileCardCol\n        name=\"Seunghyun Ko\"\n        githubUrl=\"https://github.com/kosehy\"\n        linkedinUrl=\"https://www.linkedin.com/in/seunghyunko/\"\n        >\n        <Translate id=\"team.profile.Seunghyun Ko.body\">\n        3i에서 MLOps Engineer로 일하고 있습니다.\n\n        kubeflow에 관심이 많습니다.\n        </Translate>\n      </TeamProfileCardCol>\n      <TeamProfileCardCol\n        name=\"SeungTae Kim\"\n        githubUrl=\"https://github.com/RyanKor\"\n        linkedinUrl=\"https://www.linkedin.com/in/seung-tae-kim-3bb15715b/\"\n        >\n        <Translate id=\"team.profile.SeungTae Kim.body\">\n        Genesis Lab이라는 스타트업에서 Applied AI Engineer 인턴 업무를 수행하고 있습니다.\n\n        머신러닝 생태계가 우리 산업 전반에 큰 변화을 가져올 것이라 믿으며, 한 걸음씩 나아가고 있습니다.\n        </Translate>\n      </TeamProfileCardCol>\n      <TeamProfileCardCol\n        name=\"Youngdon Tae\"\n        githubUrl=\"https://github.com/taepd\"\n        linkedinUrl=\"https://www.linkedin.com/in/taepd/\"\n        >\n        <Translate id=\"team.profile.Youngdon Tae.body\">\n        백패커에서 ML 엔지니어로 일하고 있습니다.\n\n        자연어처리, 추천시스템, MLOps에 관심이 많습니다.\n        </Translate>\n      </TeamProfileCardCol>\n    </div>\n  );\n}\n"
  },
  {
    "path": "src/css/custom.css",
    "content": "/**\n * Any CSS included here will be global. The classic template\n * bundles Infima by default. Infima is a CSS framework designed to\n * work well for content-centric websites.\n */\n\n/* You can override the default Infima variables here. */\n:root {\n  --ifm-color-primary: #2e8555;\n  --ifm-color-primary-dark: #29784c;\n  --ifm-color-primary-darker: #277148;\n  --ifm-color-primary-darkest: #205d3b;\n  --ifm-color-primary-light: #33925d;\n  --ifm-color-primary-lighter: #359962;\n  --ifm-color-primary-lightest: #3cad6e;\n  --ifm-code-font-size: 95%;\n  --docusaurus-highlighted-code-line-bg: rgba(0, 0, 0, 0.1);\n}\n\n/* For readability concerns, you should choose a lighter palette in dark mode. */\n[data-theme='dark'] {\n  --ifm-color-primary: #25c2a0;\n  --ifm-color-primary-dark: #21af90;\n  --ifm-color-primary-darker: #1fa588;\n  --ifm-color-primary-darkest: #1a8870;\n  --ifm-color-primary-light: #29d5b0;\n  --ifm-color-primary-lighter: #32d8b4;\n  --ifm-color-primary-lightest: #4fddbf;\n  --docusaurus-highlighted-code-line-bg: rgba(0, 0, 0, 0.3);\n}\n"
  },
  {
    "path": "src/pages/index.module.css",
    "content": "/**\n * CSS files with the .module.css suffix will be treated as CSS modules\n * and scoped locally.\n */\n\n.heroBanner {\n  padding: 4rem 0;\n  text-align: center;\n  position: relative;\n  overflow: hidden;\n}\n\n@media screen and (max-width: 996px) {\n  .heroBanner {\n    padding: 2rem;\n  }\n}\n\n.buttons {\n  display: flex;\n  align-items: center;\n  justify-content: center;\n}\n"
  },
  {
    "path": "src/pages/index.tsx",
    "content": "import React from \"react\";\nimport clsx from \"clsx\";\nimport Link from \"@docusaurus/Link\";\nimport useDocusaurusContext from \"@docusaurus/useDocusaurusContext\";\nimport Layout from \"@theme/Layout\";\nimport HomepageFeatures from \"@site/src/components/HomepageFeatures\";\n\nimport styles from \"./index.module.css\";\n\nfunction HomepageHeader() {\n  const { siteConfig } = useDocusaurusContext();\n  return (\n    <header className={clsx(\"hero hero--primary\", styles.heroBanner)}>\n      <div className=\"container\">\n        <h1 className=\"hero__title\">{siteConfig.title}</h1>\n        <p className=\"hero__subtitle\">{siteConfig.tagline}</p>\n        <div className={styles.buttons}>\n          <Link\n            className=\"button button--secondary button--lg\"\n            to=\"/docs/introduction/intro\"\n          >\n            Let's Start!\n          </Link>\n        </div>\n      </div>\n    </header>\n  );\n}\n\nexport default function Home(): JSX.Element {\n  const { siteConfig } = useDocusaurusContext();\n  return (\n    <Layout\n      title={`MLOps for ALL`}\n      description=\"Description will go into a meta tag in <head />\"\n    >\n      <HomepageHeader />\n      <main>\n        <HomepageFeatures />\n      </main>\n    </Layout>\n  );\n}\n"
  },
  {
    "path": "src/pages/markdown-page.md",
    "content": "---\ntitle: Markdown page example\n---\n\n# Markdown page example\n\nYou don't need React to write simple standalone pages.\n"
  },
  {
    "path": "static/.nojekyll",
    "content": ""
  },
  {
    "path": "static/googlee5904fe980148e9b.html",
    "content": "google-site-verification: googlee5904fe980148e9b.html"
  },
  {
    "path": "static/img/site.webmanifest",
    "content": "{\"name\":\"\",\"short_name\":\"\",\"icons\":[{\"src\":\"/android-chrome-192x192.png\",\"sizes\":\"192x192\",\"type\":\"image/png\"},{\"src\":\"/android-chrome-512x512.png\",\"sizes\":\"512x512\",\"type\":\"image/png\"}],\"theme_color\":\"#ffffff\",\"background_color\":\"#ffffff\",\"display\":\"standalone\"}"
  },
  {
    "path": "tsconfig.json",
    "content": "{\n  // This file is not used in compilation. It is here just for a nice editor experience.\n  \"extends\": \"@tsconfig/docusaurus/tsconfig.json\",\n  \"compilerOptions\": {\n    \"baseUrl\": \".\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/_category_.json",
    "content": "{\n  \"label\": \"API Deployment\",\n  \"position\": 7,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/seldon-children.md",
    "content": "---\ntitle : \"6. Multi Models\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Multi Models\n\n앞서 설명했던 방법들은 모두 단일 모델을 대상으로 했습니다.  \n이번 페이지에서는 여러 개의 모델을 연결하는 방법에 대해서 알아봅니다.\n\n## Pipeline\n\n우선 모델을 2개를 생성하는 파이프라인을 작성하겠습니다.\n\n모델은 앞서 사용한 SVC 모델에 StandardScaler를 추가하고 저장하도록 하겠습니다.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_scaler_from_csv(\n    data_path: InputPath(\"csv\"),\n    scaled_data_path: OutputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n):\n    import dill\n    import pandas as pd\n    from sklearn.preprocessing import StandardScaler\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    data = pd.read_csv(data_path)\n\n    scaler = StandardScaler()\n    scaled_data = scaler.fit_transform(data)\n    scaled_data = pd.DataFrame(scaled_data, columns=data.columns, index=data.index)\n\n    scaled_data.to_csv(scaled_data_path, index=False)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(scaler, file_writer)\n\n    
input_example = data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(data, scaler.transform(data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_svc_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"scikit-learn\"],\n        install_mlflow=False\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    
create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"multi_model_pipeline\")\ndef multi_model_pipeline(kernel: str = \"rbf\"):\n    iris_data = load_iris_data()\n    scaled_data = train_scaler_from_csv(data=iris_data.outputs[\"data\"])\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"scaler\",\n        model=scaled_data.outputs[\"model\"],\n        input_example=scaled_data.outputs[\"input_example\"],\n        signature=scaled_data.outputs[\"signature\"],\n        
conda_env=scaled_data.outputs[\"conda_env\"],\n    )\n    model = train_svc_from_csv(\n        train_data=scaled_data.outputs[\"scaled_data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=\"svc\",\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(multi_model_pipeline, \"multi_model_pipeline.yaml\")\n\n```\n\n파이프라인을 업로드하면 다음과 같이 나옵니다.\n\n![children-kubeflow.png](./img/children-kubeflow.png)\n\nMLflow 대시보드를 확인하면 다음과 같이 두 개의 모델이 생성됩니다.\n\n![children-mlflow.png](./img/children-mlflow.png)\n\n각각의 run_id를 확인 후 다음과 같이 SeldonDeployment 스펙을 정의합니다.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: 
model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: seldonio/mlflowserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\n```\n\n모델이 두 개가 되었으므로 각 모델의 initContainer와 container를 정의해주어야 합니다.\n이 필드는 입력값을 array로 받으며 순서는 관계없습니다.\n\n모델이 실행하는 순서는 graph에서 정의됩니다.\n\n```bash\ngraph:\n  name: scaler\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  - name: predict_method\n    type: STRING\n    value: \"transform\"\n  children:\n  - name: svc\n    type: MODEL\n    parameters:\n    - name: model_uri\n      type: STRING\n      value: \"/mnt/models\"\n```\n\ngraph의 동작 방식은 처음 받은 값을 정해진 predict_method로 변환한 뒤 children으로 정의된 모델에 전달하는 방식입니다.\n이 경우 scaler -> svc 로 데이터가 전달됩니다.\n\n이제 위의 스펙을 yaml파일로 생성해 보겠습니다.\n\n```bash\ncat <<EOF > multi-model.yaml\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: multi-model-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: 
model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: scaler-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/7f445015a0e94519b003d316478766ef/artifacts/scaler\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n        - name: svc-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/87eb168e76264b39a24b0e5ca0fe922b/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: scaler\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n        - name: svc\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: scaler\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: predict_method\n        type: STRING\n        value: \"transform\"\n      children:\n      - name: 
svc\n        type: MODEL\n        parameters:\n        - name: model_uri\n          type: STRING\n          value: \"/mnt/models\"\nEOF\n```\n\n다음 명령어를 통해 API를 생성합니다.\n\n```bash\nkubectl apply -f multi-model.yaml\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nseldondeployment.machinelearning.seldon.io/multi-model-example created\n```\n\n정상적으로 생성됐는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow-user-example-com | grep multi-model-example\n```\n\n정상적으로 생성되면 다음과 비슷한 pod이 생성됩니다.\n\n```bash\nmulti-model-example-model-0-scaler-svc-9955fb795-n9ffw   4/4     Running     0          2m30s\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/seldon-fields.md",
    "content": "---\ntitle : \"4. Seldon Fields\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## How Seldon Core works?\n\nSeldon Core가 API 서버를 생성하는 과정을 요약하면 다음과 같습니다.\n\n![seldon-fields-0.png](./img/seldon-fields-0.png)\n\n1. initContainer는 모델 저장소에서 필요한 모델을 다운로드 받습니다.\n2. 다운로드받은 모델을 container로 전달합니다.\n3. container는 전달받은 모델을 감싼 API 서버를 실행합니다.\n4. 생성된 API 서버 주소로 API를 요청하여 모델의 추론 값을 받을 수 있습니다.\n\n## SeldonDeployment Spec\n\nSeldon Core를 사용할 때, 주로 사용하게 되는 커스텀 리소스인 SeldonDeployment를 정의하는 yaml 파일은 다음과 같습니다.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n\n        containers:\n        - name: model\n          image: seldonio/sklearnserver:1.8.0-dev\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n\n```\n\nSeldonDeployment spec 중 `name` 과 `predictors` 필드는 required 필드입니다.  \n`name`은 쿠버네티스 상에서 pod의 구분을 위한 이름으로 크게 영향을 미치지 않습니다.  \n`predictors`는 한 개로 구성된 array로 `name`, `componentSpecs` 와 `graph` 가 정의되어야 합니다.  \n여기서도 `name`은 pod의 구분을 위한 이름으로 크게 영향을 미치지 않습니다.  
\n\n이제 `componentSpecs` 와 `graph`에서 정의해야 할 필드들에 대해서 알아보겠습니다.\n\n## componentSpecs\n\n`componentSpecs` 는 하나로 구성된 array로 `spec` 키값이 정의되어야 합니다.  \n`spec` 에는 `volumes`, `initContainers`, `containers` 의 필드가 정의되어야 합니다.\n\n### volumes\n\n```bash\nvolumes:\n- name: model-provision-location\n  emptyDir: {}\n```\n\n`volumes`은 initContainer에서 다운로드받는 모델을 저장하기 위한 공간을 의미합니다.  \narray로 입력을 받으며 array의 구성 요소는 `name`과 `emptyDir` 입니다.  \n이 값들은 모델을 다운로드받고 옮길 때 한번 사용되므로 크게 수정하지 않아도 됩니다.\n\n### initContainer\n\n```bash\n- name: model-initializer\n  image: gcr.io/kfserving/storage-initializer:v0.4.0\n  args:\n    - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n    - \"/mnt/models\"\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\n\ninitContainer는 API에서 사용할 모델을 다운로드받는 역할을 합니다.  \n그래서 사용되는 필드들은 모델 저장소(Model Registry)로부터 데이터를 다운로드받을 때 필요한 정보들을 정해줍니다.\n\ninitContainer의 값은 n개의 array로 구성되어 있으며 사용하는 모델마다 각각 지정해주어야 합니다.\n\n#### name\n\n`name`은 쿠버네티스 상의 pod의 이름입니다.  \n디버깅을 위해 `{model_name}-initializer` 로 사용하길 권장합니다.\n\n#### image\n\n`image` 는 모델을 다운로드 받기 위해 사용할 이미지 이름입니다.  \nseldon core에서 권장하는 이미지는 크게 두 가지입니다.\n\n- gcr.io/kfserving/storage-initializer:v0.4.0\n- seldonio/rclone-storage-initializer:1.13.0-dev\n\n각각의 자세한 내용은 다음을 참고 바랍니다.\n\n- [kfserving](https://docs.seldon.io/projects/seldon-core/en/latest/servers/kfserving-storage-initializer.html)\n- [rclone](https://github.com/SeldonIO/seldon-core/tree/master/components/rclone-storage-initializer)\n\n*모두의 MLOps* 에서는 kfserving을 사용합니다.\n\n#### args\n\n```bash\nargs:\n  - \"gs://seldon-models/v1.12.0-dev/sklearn/iris\"\n  - \"/mnt/models\"\n```\n\ngcr.io/kfserving/storage-initializer:v0.4.0 도커 이미지가 실행(`run`)될 때 입력받는 argument를 입력합니다.  \narray로 구성되며 첫 번째 array의 값은 다운로드받을 모델의 주소를 적습니다.  \n두 번째 array의 값은 다운로드받은 모델을 저장할 주소를 적습니다. 
(seldon core에서는 주로 `/mnt/models`에 저장합니다.)\n\n### volumeMounts\n\n```bash\nvolumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n```\n\n`volumneMounts`는 volumes에서 설명한 것과 같이 `/mnt/models`를 쿠버네티스 상에서 공유할 수 있도록 볼륨을 붙여주는 필드입니다.  \n자세한 내용은 [쿠버네티스 Volume](https://kubernetes.io/docs/concepts/storage/volumes/)을 참조 바랍니다.\n\n### container\n\n```bash\ncontainers:\n- name: model\n  image: seldonio/sklearnserver:1.8.0-dev\n  volumeMounts:\n  - mountPath: /mnt/models\n    name: model-provision-location\n    readOnly: true\n  securityContext:\n    privileged: true\n    runAsUser: 0\n    runAsGroup: 0\n```\n\ncontainer는 실제로 모델이 API 형식으로 실행될 때의 설정을 정의하는 필드입니다.  \n\n#### name\n\n`name`은 쿠버네티스 상의 pod의 이름입니다. 사용하는 모델의 이름을 적습니다.\n\n#### image\n\n`image` 는 모델을 API로 만드는 데 사용할 이미지입니다.  \n이미지에는 모델이 로드될 때 필요한 패키지들이 모두 설치되어 있어야 합니다.\n\nSeldon Core에서 지원하는 공식 이미지는 다음과 같습니다.\n\n- seldonio/sklearnserver\n- seldonio/mlflowserver\n- seldonio/xgboostserver\n- seldonio/tfserving\n\n#### volumeMounts\n\n```bash\nvolumeMounts:\n- mountPath: /mnt/models\n  name: model-provision-location\n  readOnly: true\n```\n\ninitContainer에서 다운로드받은 데이터가 있는 경로를 알려주는 필드입니다.  \n이때 모델이 수정되는 것을 방지하기 위해 `readOnly: true`도 같이 주겠습니다.\n\n#### securityContext\n\n```bash\nsecurityContext:\n  privileged: true\n  runAsUser: 0\n  runAsGroup: 0\n```\n\n필요한 패키지를 설치할 때 pod이 권한이 없어서 패키지 설치를 수행하지 못할 수 있습니다.  \n이를 위해서 root 권한을 부여합니다. (다만 이 작업은 실제 서빙 시 보안 문제가 생길 수 있습니다.)\n\n## graph\n\n```bash\ngraph:\n  name: model\n  type: MODEL\n  parameters:\n  - name: model_uri\n    type: STRING\n    value: \"/mnt/models\"\n  children: []\n```\n\n모델이 동작하는 순서를 정의한 필드입니다.\n\n### name\n\n모델 그래프의 이름입니다. container에서 정의된 이름을 사용합니다.\n\n### type\n\ntype은 크게 4가지가 있습니다.\n\n1. TRANSFORMER\n2. MODEL\n3. OUTPUT_TRANSFORMER\n4. ROUTER\n\n각 type에 대한 자세한 설명은 [Seldon Core Complex Graphs Metadata Example](https://docs.seldon.io/projects/seldon-core/en/latest/examples/graph-metadata.html)을 참조 바랍니다.\n\n### parameters\n\nclass init 에서 사용되는 값들입니다. 
 \nsklearnserver에서 필요한 값은 [다음 파일](https://github.com/SeldonIO/seldon-core/blob/master/servers/sklearnserver/sklearnserver/SKLearnServer.py)에서 확인할 수 있습니다.\n\n```python\nclass SKLearnServer(SeldonComponent):\n    def __init__(self, model_uri: str = None, method: str = \"predict_proba\"):\n```\n\n코드를 보면 `model_uri`와 `method`를 정의할 수 있습니다.\n\n### children\n\n순서도를 작성할 때 사용됩니다. 자세한 내용은 다음 페이지에서 설명합니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/seldon-iris.md",
    "content": "---\ntitle : \"2. Deploy SeldonDeployment\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\", \"SeungTae Kim\"]\n---\n\n## SeldonDeployment를 통해 배포하기\n\n이번에는 학습된 모델이 있을 때 SeldonDeployment를 통해 API Deployment를 해보겠습니다.\nSeldonDeployment는 쿠버네티스(Kubernetes)에 모델을 REST/gRPC 서버의 형태로 배포하기 위해 정의된 CRD(CustomResourceDefinition)입니다.\n\n### 1. Prerequisites\n\nSeldonDeployment 관련된 실습은 seldon-deploy라는 새로운 네임스페이스(namespace)에서 진행하도록 하겠습니다.\n네임스페이스를 생성한 뒤, seldon-deploy를 현재 네임스페이스로 설정합니다.\n\n```bash\nkubectl create namespace seldon-deploy\nkubectl config set-context --current --namespace=seldon-deploy\n```\n\n### 2. 스펙 정의\n\nSeldonDeployment를 배포하기 위한 yaml 파일을 생성합니다.\n이번 페이지에서는 공개된 iris model을 사용하도록 하겠습니다.\n이 iris model은 sklearn 프레임워크를 통해 학습되었기 때문에 SKLEARN_SERVER를 사용합니다.\n\n```bash\ncat <<EOF > iris-sdep.yaml\napiVersion: machinelearning.seldon.io/v1alpha2\nkind: SeldonDeployment\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\nspec:\n  name: iris\n  predictors:\n  - graph:\n      children: []\n      implementation: SKLEARN_SERVER\n      modelUri: gs://seldon-models/v1.12.0-dev/sklearn/iris\n      name: classifier\n    name: default\n    replicas: 1\nEOF\n```\n\nyaml 파일을 배포합니다.\n\n```bash\nkubectl apply -f iris-sdep.yaml\n```\n\n다음 명령어를 통해 정상적으로 배포가 되었는지 확인합니다.\n\n```bash\nkubectl get pods --selector seldon-app=sklearn-default -n seldon-deploy\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                                            READY   STATUS    RESTARTS   AGE\nsklearn-default-0-classifier-5fdfd7bb77-ls9tr   2/2     Running   0          5m\n```\n\n## Ingress URL\n\n이제 배포된 모델에 추론 요청(predict request)를 보내서 추론 결괏값을 받아옵니다.\n배포된 API는 다음과 같은 규칙으로 생성됩니다.\n`http://{NODE_IP}:{NODE_PORT}/seldon/{namespace}/{seldon-deployment-name}/api/v1.0/{method-name}/`\n\n### NODE_IP / NODE_PORT\n\n[Seldon Core 설치 시, Ambassador를 Ingress Controller로 
설정하였으므로](../setup-components/install-components-seldon.md), SeldonDeployment로 생성된 API 서버는 모두 Ambassador의 Ingress gateway를 통해 요청할 수 있습니다.\n\n따라서 우선 Ambassador Ingress Gateway의 url을 환경 변수로 설정합니다.\n\n```bash\nexport NODE_IP=$(kubectl get nodes -o jsonpath='{ $.items[*].status.addresses[?(@.type==\"InternalIP\")].address }')\nexport NODE_PORT=$(kubectl get service ambassador -n seldon-system -o jsonpath=\"{.spec.ports[0].nodePort}\")\n```\n\n설정된 url을 확인합니다.\n\n```bash\necho \"NODE_IP\"=$NODE_IP\necho \"NODE_PORT\"=$NODE_PORT\n```\n\n다음과 비슷하게 출력되어야 하며, 클라우드 등을 통해 설정할 경우, internal ip 주소가 설정되는 것을 확인할 수 있습니다.\n\n```bash\nNODE_IP=192.168.0.19\nNODE_PORT=30486\n```\n\n### namespace / seldon-deployment-name\n\nSeldonDeployment가 배포된 `namespace`와 `seldon-deployment-name`를 의미합니다.\n이는 스펙을 정의할 때 metadata에 정의된 값을 사용합니다.\n\n```bash\nmetadata:\n  name: sklearn\n  namespace: seldon-deploy\n```\n\n위의 예시에서는 `namespace`는 seldon-deploy, `seldon-deployment-name`은 sklearn 입니다.\n\n### method-name\n\nSeldonDeployment에서 주로 사용하는 `method-name`은 두 가지가 있습니다.\n\n1. doc\n2. predictions\n\n각각의 method의 자세한 사용 방법은 아래에서 설명합니다.\n\n## Using Swagger\n\n우선 doc method를 사용하는 방법입니다. doc method를 이용하면 seldon에서 생성한 swagger에 접속할 수 있습니다.\n\n### 1. Swagger 접속\n\n위에서 설명한 ingress url 규칙에 따라 아래 주소를 통해 swagger에 접근할 수 있습니다.  \n`http://192.168.0.19:30486/seldon/seldon-deploy/sklearn/api/v1.0/doc/`\n\n![iris-swagger1.png](./img/iris-swagger1.png)\n\n### 2. Swagger Predictions 메뉴 선택\n\nUI에서 `/seldon/seldon-deploy/sklearn/api/v1.0/predictions` 메뉴를 선택합니다.\n\n![iris-swagger2.png](./img/iris-swagger2.png)\n\n### 3. *Try it out* 선택\n\n![iris-swagger3.png](./img/iris-swagger3.png)\n\n### 4. Request body에 data 입력\n\n![iris-swagger4.png](./img/iris-swagger4.png)\n\n다음 데이터를 입력합니다.\n\n```bash\n{\n  \"data\": {\n    \"ndarray\":[[1.0, 2.0, 5.0, 6.0]]\n  }\n}\n```\n\n### 5. 
추론 결과 확인\n\n`Execute` 버튼을 눌러서 추론 결과를 확인할 수 있습니다.\n\n![iris-swagger5.png](./img/iris-swagger5.png)\n\n정상적으로 수행되면 다음과 같은 추론 결과를 얻습니다.\n\n```bash\n{\n  \"data\": {\n    \"names\": [\n      \"t:0\",\n      \"t:1\",\n      \"t:2\"\n    ],\n    \"ndarray\": [\n      [\n        9.912315378486697e-7,\n        0.0007015931307746079,\n        0.9992974156376876\n      ]\n    ]\n  },\n  \"meta\": {\n    \"requestPath\": {\n      \"classifier\": \"seldonio/sklearnserver:1.11.2\"\n    }\n  }\n}\n```\n\n## Using CLI\n\n또한, curl과 같은 http client CLI 도구를 활용해서도 API 요청을 수행할 수 있습니다.\n\n예를 들어, 다음과 같이 `/predictions`를 요청하면\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\n아래와 같은 응답이 정상적으로 출력되는 것을 확인할 수 있습니다.\n\n```bash\n{\"data\":{\"names\":[\"t:0\",\"t:1\",\"t:2\"],\"ndarray\":[[0.0006985194531162835,0.00366803903943666,0.995633441507447]]},\"meta\":{\"requestPath\":{\"classifier\":\"seldonio/sklearnserver:1.11.2\"}}}\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/seldon-mlflow.md",
    "content": "---\ntitle : \"5. Model from MLflow\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Model from MLflow\n\n이번 페이지에서는 [MLflow Component](../kubeflow/advanced-mlflow.md)에서 저장된 모델을 이용해 API를 생성하는 방법에 대해서 알아보겠습니다.\n\n## Secret\n\ninitContainer가 minio에 접근해서 모델을 다운로드받으려면 credentials가 필요합니다.\nminio에 접근하기 위한 credentials는 다음과 같습니다.\n\n```bash\napiVersion: v1\ntype: Opaque\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8K=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLm1ha2luYXJvY2tzLmFp\n  USE_SSL: ZmFsc2U=\n```\n\n`AWS_ACCESS_KEY_ID` 의 입력값은 `minio`입니다. 다만 secret의 입력값은 인코딩된 값이여야 되기 때문에 실제로 입력되는 값은 다음을 수행후 나오는 값이어야 합니다.\n\ndata에 입력되어야 하는 값들은 다음과 같습니다.\n\n- AWS_ACCESS_KEY_ID: minio\n- AWS_SECRET_ACCESS_KEY: minio123\n- AWS_ENDPOINT_URL: http://minio-service.kubeflow.svc:9000\n- USE_SSL: false\n\n인코딩은 다음 명령어를 통해서 할 수 있습니다.\n\n```bash\necho -n minio | base64\n```\n\n그러면 다음과 같은 값이 출력됩니다.\n\n```bash\nbWluaW8=\n```\n\n인코딩을 전체 값에 대해서 진행하면 다음과 같이 됩니다.\n\n- AWS_ACCESS_KEY_ID: bWluaW8=\n- AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n- AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n- USE_SSL: ZmFsc2U=\n\n다음 명령어를 통해 secret을 생성할 수 있는 yaml파일을 생성합니다.\n\n```bash\ncat <<EOF > seldon-init-container-secret.yaml\napiVersion: v1\nkind: Secret\nmetadata:\n  name: seldon-init-container-secret\n  namespace: kubeflow-user-example-com\ntype: Opaque\ndata:\n  AWS_ACCESS_KEY_ID: bWluaW8=\n  AWS_SECRET_ACCESS_KEY: bWluaW8xMjM=\n  AWS_ENDPOINT_URL: aHR0cDovL21pbmlvLXNlcnZpY2Uua3ViZWZsb3cuc3ZjOjkwMDA=\n  USE_SSL: ZmFsc2U=\nEOF\n```\n\n다음 명령어를 통해 secret을 생성합니다.\n\n```bash\nkubectl apply -f seldon-init-container-secret.yaml\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nsecret/seldon-init-container-secret created\n```\n\n## Seldon Core yaml\n\n이제 Seldon Core를 생성하는 yaml파일을 작성합니다.\n\n```bash\napiVersion: 
machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      children: []\n```\n\n이 전에 작성한 [Seldon Fields](../api-deployment/seldon-fields.md)와 달라진 점은 크게 두 부분입니다.\ninitContainer에 `envFrom` 필드가 추가되었으며 args의 주소가 `s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc` 로 바뀌었습니다.\n\n### args\n\n앞서 args의 첫번째 array는 우리가 다운로드받을 모델의 경로라고 했습니다.  
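\n\n문서 예시의 경로는, MLflow의 기본 artifact 저장 규칙을 따른다고 가정하면 experiment id, run id, artifact 이름의 조합으로 만들어집니다. 아래는 이 조합을 확인해 보는 간단한 sketch이며, artifact root(`s3://mlflow/mlflow/artifacts`)는 이 문서의 예시 값을 그대로 가정한 것입니다.\n\n```bash\n# 가정: MLflow 기본 설정에서 artifact root가 s3://mlflow/mlflow/artifacts 인 경우\nEXPERIMENT_ID=0\nRUN_ID=74ba8e33994144f599e50b3be176cdb0\nARTIFACT_NAME=svc\necho s3://mlflow/mlflow/artifacts/$EXPERIMENT_ID/$RUN_ID/artifacts/$ARTIFACT_NAME\n```\n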
\n그럼 mlflow에 저장된 모델의 경로는 어떻게 알 수 있을까요?\n\n다시 mlflow에 들어가서 run을 클릭하고 모델을 누르면 다음과 같이 확인할 수 있습니다.\n\n![seldon-mlflow-0.png](./img/seldon-mlflow-0.png)\n\n이렇게 확인된 경로를 입력하면 됩니다.\n\n### envFrom\n\nminio에 접근해서 모델을 다운로드 받는 데 필요한 환경변수를 입력해주는 과정입니다.\n앞서 만든 `seldon-init-container-secret`를 이용합니다.\n\n## API 생성\n\n우선 위에서 정의한 스펙을 yaml 파일로 생성하겠습니다.\n\n```bash\napiVersion: machinelearning.seldon.io/v1\nkind: SeldonDeployment\nmetadata:\n  name: seldon-example\n  namespace: kubeflow-user-example-com\nspec:\n  name: model\n  predictors:\n  - name: model\n\n    componentSpecs:\n    - spec:\n        volumes:\n        - name: model-provision-location\n          emptyDir: {}\n\n        initContainers:\n        - name: model-initializer\n          image: gcr.io/kfserving/storage-initializer:v0.4.0\n          args:\n            - \"s3://mlflow/mlflow/artifacts/0/74ba8e33994144f599e50b3be176cdb0/artifacts/svc\"\n            - \"/mnt/models\"\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n          envFrom:\n          - secretRef:\n              name: seldon-init-container-secret\n\n        containers:\n        - name: model\n          image: ghcr.io/mlops-for-all/mlflowserver\n          volumeMounts:\n          - mountPath: /mnt/models\n            name: model-provision-location\n            readOnly: true\n          securityContext:\n            privileged: true\n            runAsUser: 0\n            runAsGroup: 0\n\n    graph:\n      name: model\n      type: MODEL\n      parameters:\n      - name: model_uri\n        type: STRING\n        value: \"/mnt/models\"\n      - name: xtype\n        type: STRING\n        value: \"dataframe\"\n      children: []\nEOF\n```\n\nseldon pod을 생성합니다.\n\n```bash\nkubectl apply -f seldon-mlflow.yaml\n\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nseldondeployment.machinelearning.seldon.io/seldon-example created\n```\n\n이제 pod이 정상적으로 뜰 때까지 기다립니다.\n\n```bash\nkubectl get po -n kubeflow-user-example-com 
| grep seldon\n```\n\n다음과 비슷하게 출력되면 정상적으로 API를 생성했습니다.\n\n```bash\nseldon-example-model-0-model-5c949bd894-c5f28      3/3     Running     0          69s\n```\n\nCLI를 이용해 생성된 API에는 다음 request를 통해 실행을 확인할 수 있습니다.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{\n    \"data\": {\n        \"ndarray\": [\n            [\n                143.0,\n                0.0,\n                30.0,\n                30.0\n            ]\n        ],\n        \"names\": [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n            \"petal width (cm)\"\n        ]\n    }\n}'\n```\n\n정상적으로 실행될 경우 다음과 같은 결과를 받을 수 있습니다.\n\n```bash\n{\"data\":{\"names\":[],\"ndarray\":[\"Virginica\"]},\"meta\":{\"requestPath\":{\"model\":\"ghcr.io/mlops-for-all/mlflowserver:e141f57\"}}}\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/seldon-pg.md",
    "content": "---\ntitle : \"3. Seldon Monitoring\"\ndescription: \"Prometheus & Grafana 확인하기\"\nsidebar_position: 3\ndate: 2021-12-24\nlastmod: 2021-12-24\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Grafana & Prometheus\n\n이제, [지난 페이지](../api-deployment/seldon-iris.md)에서 생성했던 SeldonDeployment 로 API Request 를 반복적으로 수행해보고, 대시보드에 변화가 일어나는지 확인해봅니다.\n\n### 대시보드\n\n[앞서 생성한 대시보드](../setup-components/install-components-pg.md)를 포트 포워딩합니다.\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\n### API 요청\n\n[앞서 생성한 Seldon Deployment](../api-deployment/seldon-iris.md#using-cli)에 요청을 **반복해서** 보냅니다.\n\n```bash\ncurl -X POST http://$NODE_IP:$NODE_PORT/seldon/seldon-deploy/sklearn/api/v1.0/predictions \\\n-H 'Content-Type: application/json' \\\n-d '{ \"data\": { \"ndarray\": [[1,2,3,4]] } }'\n```\n\n그리고 그라파나 대시보드를 확인하면 다음과 같이 Global Request Rate 이 `0 ops` 에서 순간적으로 상승하는 것을 확인할 수 있습니다.\n\n![repeat-raise.png](./img/repeat-raise.png)\n\n이렇게 프로메테우스와 그라파나가 정상적으로 설치된 것을 확인할 수 있습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/api-deployment/what-is-api-deployment.md",
    "content": "---\ntitle : \"1. What is API Deployment?\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-22\nlastmod: 2021-12-22\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## API Deployment란?\n\n머신러닝 모델을 학습한 뒤에는 어떻게 사용해야 할까요?  \n머신러닝을 학습할 때는 더 높은 성능의 모델이 나오기를 기대하지만, 학습된 모델을 사용하여 추론을 할 때는 빠르고 쉽게 추론 결과를 받아보고 싶을 것입니다.\n\n모델의 추론 결과를 확인하고자 할 때 주피터 노트북이나 파이썬 스크립트를 통해 학습된 모델을 로드한 뒤 추론할 수 있습니다.  \n그렇지만 이런 방법은 모델이 클수록 모델을 불러오는 데 많은 시간을 소요하게 되어서 비효율적입니다. 또한 이렇게 이용하면 많은 사람이 모델을 이용할 수 없고 학습된 모델이 있는 환경에서밖에 사용할 수 없습니다.\n\n그래서 실제 서비스에서 머신러닝이 사용될 때는 API를 이용해서 학습된 모델을 사용합니다. 모델은 API 서버가 구동되는 환경에서 한 번만 로드가 되며, DNS를 활용하여 외부에서도 쉽게 추론 결과를 받을 수 있고 다른 서비스와 연동할 수 있습니다.\n\n하지만 모델을 API로 만드는 작업에는 생각보다 많은 부수적인 작업이 필요합니다.  \n그래서 API로 만드는 작업을 더 쉽게 하기 위해서 Tensorflow와 같은 머신러닝 프레임워크 진영에서는 추론 엔진(Inference engine)을 개발하였습니다.\n\n추론 엔진들을 이용하면 해당 머신러닝 프레임워크로 개발되고 학습된 모델을 불러와 추론이 가능한 API(REST 또는 gRPC)를 생성합니다.  \n이러한 추론 엔진을 활용하여 구축한 API 서버로 추론하고자 하는 데이터를 담아 요청을 보내면, 추론 엔진이 추론 결과를 응답에 담아 전송하는 것입니다.\n\n대표적으로 다음과 같은 오픈소스 추론 엔진들이 개발되었습니다.\n\n- [Tensorflow : Tensorflow Serving](https://github.com/tensorflow/serving)\n- [PyTorch : Torchserve](https://github.com/pytorch/serve)\n- [Onnx : Onnx Runtime](https://github.com/microsoft/onnxruntime)\n\n오프소스에서 공식적으로 지원하지는 않지만, 많이 쓰이는 sklearn, xgboost 프레임워크를 위한 추론 엔진도 개발되어 있습니다.\n\n이처럼 모델의 추론 결과를 API의 형태로 받아볼 수 있도록 배포하는 것을 **API Deployment**라고 합니다.\n\n## Serving Framework\n\n위에서 다양한 추론 엔진들이 개발되었다는 사실을 소개해 드렸습니다.\n쿠버네티스 환경에서 이러한 추론 엔진들을 사용하여 API Deployment를 한다면 어떤 작업이 필요할까요?\n추론 엔진을 배포하기 위한 Deployment, 추론 요청을 보낼 Endpoint를 생성하기 위한 Service,\n외부에서의 추론 요청을 추론 엔진으로 보내기 위한 Ingress 등 많은 쿠버네티스 리소스를 배포해 주어야 합니다.\n이것 이외에도, 많은 추론 요청이 들어왔을 경우의 스케일 아웃(scale-out), 추론 엔진 상태에 대한 모니터링, 개선된 모델이 나왔을 경우 버전 업데이트 등 추론 엔진을 운영할 때의 요구사항은 한두 가지가 아닙니다.\n\n이러한 많은 요구사항을 처리하기 위해 추론 엔진들을 쿠버네티스 환경 위에서 한 번 더 추상화한 **Serving Framework**들이 개발되었습니다.\n\n개발된 Serving Framework들은 다음과 같은 오픈소스들이 있습니다.\n\n- [Seldon Core](https://github.com/SeldonIO/seldon-core)\n- [Kserve](https://github.com/kserve)\n- 
[BentoML](https://github.com/bentoml/BentoML)\n\n*모두의 MLOps*에서는 Seldon Core를 사용하여 API Deployment를 하는 과정을 다루어 보도록 하겠습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/appendix/_category_.json",
    "content": "{\n  \"label\": \"Appendix\",\n  \"position\": 9,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/appendix/metallb.md",
    "content": "---\ntitle: \"2. Bare Metal 클러스터용 load balancer metallb 설치\"\nsidebar_position: 2\n---\n\n## MetalLB란?\n\nKubernetes 사용 시 AWS, GCP, Azure 와 같은 클라우드 플랫폼에서는 자체적으로 로드 벨런서(Load Balancer)를 제공해 주지만, 온프레미스 클러스터에서는 로드 벨런싱 기능을 제공하는 모듈을 추가적으로 설치해야 합니다.  \n[MetalLB](https://metallb.universe.tf/)는 베어메탈 환경에서 사용할 수 있는 로드 벨런서를 제공하는 오픈소스 프로젝트 입니다.\n\n## 요구사항\n\n| 요구 사항                                                    | 버전 및 내용                                                 |\n| ------------------------------------------------------------ | ------------------------------------------------------------ |\n| Kubernetes                                                   | 로드 벨런싱 기능이 없는 >= v1.13.0                           |\n| [호환가능한 네트워크  CNI](https://metallb.universe.tf/installation/network-addons/) | Calico, Canal, Cilium, Flannel, Kube-ovn, Kube-router, Weave  Net |\n| IPv4 주소                                                    | MetalLB 배포에 사용                                          |\n| BGP 모드를 사용할 경우                                       | BGP 기능을 지원하는 하나 이상의 라우터                       |\n| 노드 간 포트 TCP/UDP 7946 오픈                               | memberlist 요구 사항  \n\n## MetalLB 설치\n\n### Preparation\n\nIPVS 모드에서 kube-proxy를 사용하는 경우 Kubernetes v1.14.2 이후부터는 엄격한 ARP(strictARP) 모드를 사용하도록 설정해야 합니다.  \nKube-router는 기본적으로 엄격한 ARP를 활성화하므로 서비스 프록시로 사용할 경우에는 이 기능이 필요하지 않습니다.  
\n엄격한 ARP 모드를 적용하기에 앞서, 현재 모드를 확인합니다.\n\n```bash\n# see what changes would be made, returns nonzero returncode if different\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\ngrep strictARP\n```\n\n```bash\nstrictARP: false\n```\n\nstrictARP: false 가 출력되는 경우 다음을 실행하여 strictARP: true로 변경합니다.\n(strictARP: true가 이미 출력된다면 다음 커맨드를 수행하지 않으셔도 됩니다.)\n\n```bash\n# actually apply the changes, returns nonzero returncode on errors only\nkubectl get configmap kube-proxy -n kube-system -o yaml | \\\nsed -e \"s/strictARP: false/strictARP: true/\" | \\\nkubectl apply -f - -n kube-system\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nWarning: resource configmaps/kube-proxy is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.\nconfigmap/kube-proxy configured\n```\n\n### 설치 - Manifest\n\n#### 1. MetalLB 를 설치합니다.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/namespace.yaml\nkubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.11.0/manifests/metallb.yaml\n```\n\n#### 2. 
정상 설치 확인\n\nmetallb-system namespace 의 2 개의 pod 이 모두 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n metallb-system\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                          READY   STATUS    RESTARTS   AGE\ncontroller-7dcc8764f4-8n92q   1/1     Running   1          1m\nspeaker-fnf8l                 1/1     Running   1          1m\n```\n\n매니페스트의 구성 요소는 다음과 같습니다.\n\n- metallb-system/controller\n  - deployment 로 배포되며, 로드 벨런싱을 수행할 external IP 주소의 할당을 처리하는 역할을 담당합니다.\n- metallb-system/speaker\n  - daemonset 형태로 배포되며, 외부 트래픽과 서비스를 연결해 네트워크 통신이 가능하도록 구성하는 역할을 담당합니다.\n\n서비스에는 컨트롤러 및 스피커와 구성 요소가 작동하는 데 필요한 RBAC 사용 권한이 포함됩니다.\n\n## Configuration\n\nMetalLB 의 로드 벨런싱 정책 설정은 관련 설정 정보를 담은 configmap 을 배포하여 설정할 수 있습니다.\n\nMetalLB 에서 구성할 수 있는 모드로는 다음과 같이 2가지가 있습니다.\n\n1. [Layer 2 모드](https://metallb.universe.tf/concepts/layer2/)\n2. [BGP 모드](https://metallb.universe.tf/concepts/bgp/)\n\n여기에서는 Layer 2 모드로 진행하겠습니다.\n\n### Layer 2 Configuration\n\nLayer 2 모드는 간단하게 사용할 IP 주소의 대역만 설정하면 됩니다.  
\nLayer 2 모드를 사용할 경우 워커 노드의 네트워크 인터페이스에 IP를 바인딩 하지 않아도 되는데 로컬 네트워크의 ARP 요청에 직접 응답하여 컴퓨터의 MAC주소를 클라이언트에 제공하는 방식으로 작동하기 때문입니다.\n\n다음 `metallb_config.yaml` 파일은 MetalLB 가 192.168.35.100 ~ 192.168.35.110의 IP에 대한 제어 권한을 제공하고 Layer 2 모드를 구성하는 설정입니다.\n\n클러스터 노드와 클라이언트 노드가 분리된 경우, 192.168.35.100 ~ 192.168.35.110 대역이 클라이언트 노드와 클러스터 노드 모두 접근 가능한 대역이어야 합니다.\n\n#### metallb_config.yaml\n\n```bash\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  namespace: metallb-system\n  name: config\ndata:\n  config: |\n    address-pools:\n    - name: default\n      protocol: layer2\n      addresses:\n      - 192.168.35.100-192.168.35.110  # IP 대역폭\n```\n\n위의 설정을 적용합니다.\n\n```test\nkubectl apply -f metallb_config.yaml \n```\n\n정상적으로 배포하면 다음과 같이 출력됩니다.\n\n```test\nconfigmap/config created\n```\n\n## MetalLB 사용\n\n### Kubeflow Dashboard\n\n먼저 kubeflow의 Dashboard 를 제공하는 istio-system 네임스페이스의 istio-ingressgateway 서비스의 타입을 `LoadBalancer`로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 있습니다.\n\n```bash\nNAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                                        AGE\nistio-ingressgateway   ClusterIP   10.103.72.5   <none>        15021/TCP,80/TCP,443/TCP,31400/TCP,15443/TCP   4h21m\n```\n\ntype 을 LoadBalancer 로 변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  
\n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nspec:\n  clusterIP: 10.103.72.5\n  clusterIPs:\n  - 10.103.72.5\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: status-port\n    port: 15021\n    protocol: TCP\n    targetPort: 15021\n  - name: http2\n    port: 80\n    protocol: TCP\n    targetPort: 8080\n  - name: https\n    port: 443\n    protocol: TCP\n    targetPort: 8443\n  - name: tcp\n    port: 31400\n    protocol: TCP\n    targetPort: 31400\n  - name: tls\n    port: 15443\n    protocol: TCP\n    targetPort: 15443\n  selector:\n    app: istio-ingressgateway\n    istio: ingressgateway\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.100   # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.100` 인 것을 확인합니다.\n\n```bash\nkubectl get svc/istio-ingressgateway -n istio-system\n```\n\n```bash\nNAME                   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)                                                                      AGE\nistio-ingressgateway   LoadBalancer   10.103.72.5   192.168.35.100   15021:31054/TCP,80:30853/TCP,443:30443/TCP,31400:30012/TCP,15443:31650/TCP   5h1m\n```\n\nWeb Browser 를 열어 [http://192.168.35.100](http://192.168.35.100) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-after-istio-ingressgateway-setting.png](./img/login-after-istio-ingressgateway-setting.png)\n\n### minio Dashboard\n\n먼저 minio 의 Dashboard 를 제공하는 kubeflow 네임스페이스의 minio-service 서비스의 타입을 LoadBalancer로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 있습니다.\n\n```bash\nNAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE\nminio-service   ClusterIP   10.109.209.87   <none>        9000/TCP   5h14m\n```\n\ntype 을 LoadBalancer 로 
변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  \n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/minio-service -n kubeflow\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    kubectl.kubernetes.io/last-applied-configuration: |\n      {\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"labels\":{\"application-crd-id\":\"kubeflow-pipelines\"},\"name\":\"minio-ser>\n  creationTimestamp: \"2022-01-05T08:44:23Z\"\n  labels:\n    application-crd-id: kubeflow-pipelines\n  name: minio-service\n  namespace: kubeflow\n  resourceVersion: \"21120\"\n  uid: 0053ee28-4f87-47bb-ad6b-7ad68aa29a48\nspec:\n  clusterIP: 10.109.209.87\n  clusterIPs:\n  - 10.109.209.87\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: http\n    port: 9000\n    protocol: TCP\n    targetPort: 9000\n  selector:\n    app: minio\n    application-crd-id: kubeflow-pipelines\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.101 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.101` 인 것을 확인할 수 있습니다.\n\n```bash\nkubectl get svc/minio-service -n kubeflow\n```\n\n```bash\nNAME            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE\nminio-service   LoadBalancer   10.109.209.87   192.168.35.101   9000:31371/TCP   5h21m\n```\n\nWeb Browser 를 열어 [http://192.168.35.101:9000](http://192.168.35.101:9000) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-after-minio-setting.png](./img/login-after-minio-setting.png)\n\n### mlflow Dashboard\n\n먼저 mlflow 의 Dashboard 를 제공하는 mlflow-system 네임스페이스의 mlflow-server-service 서비스의 타입을 LoadBalancer로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 있습니다.\n\n```bash\nNAME                    TYPE        CLUSTER-IP       
EXTERNAL-IP   PORT(S)    AGE\nmlflow-server-service   ClusterIP   10.111.173.209   <none>        5000/TCP   4m50s\n```\n\ntype 을 LoadBalancer 로 변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  \n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: mlflow-server\n    meta.helm.sh/release-namespace: mlflow-system\n  creationTimestamp: \"2022-01-07T04:00:19Z\"\n  labels:\n    app.kubernetes.io/managed-by: Helm\n  name: mlflow-server-service\n  namespace: mlflow-system\n  resourceVersion: \"276246\"\n  uid: e5d39fb7-ad98-47e7-b512-f9c673055356\nspec:\n  clusterIP: 10.111.173.209\n  clusterIPs:\n  - 10.111.173.209\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - port: 5000\n    protocol: TCP\n    targetPort: 5000\n  selector:\n    app.kubernetes.io/name: mlflow-server\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.102 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.102` 인 것을 확인할 수 있습니다.\n\n```bash\nkubectl get svc/mlflow-server-service -n mlflow-system\n```\n\n```bash\nNAME                    TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE\nmlflow-server-service   LoadBalancer   10.111.173.209   192.168.35.102   5000:32287/TCP   6m11s\n```\n\nWeb Browser 를 열어 [http://192.168.35.102:5000](http://192.168.35.102:5000) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-after-mlflow-setting.png](./img/login-after-mlflow-setting.png)\n\n### Grafana Dashboard\n\n먼저 Grafana 의 Dashboard 를 제공하는 seldon-system 네임스페이스의 seldon-core-analytics-grafana 서비스의 타입을 LoadBalancer로 변경하여 MetalLB로부터 로드 벨런싱 기능을 제공받기 전에, 현재 상태를 확인합니다.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n해당 서비스의 타입은 ClusterIP이며, External-IP 값은 `none` 인 것을 확인할 수 
있습니다.\n\n```bash\nNAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE\nseldon-core-analytics-grafana   ClusterIP   10.109.20.161   <none>        80/TCP    94s\n```\n\ntype 을 LoadBalancer 로 변경하고 원하는 IP 주소를 입력하고 싶은 경우 loadBalancerIP 항목을 추가합니다.  \n추가 하지 않을 경우에는 위에서 설정한 IP 주소풀에서 순차적으로 IP 주소가 배정됩니다.\n\n```bash\nkubectl edit svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\napiVersion: v1\nkind: Service\nmetadata:\n  annotations:\n    meta.helm.sh/release-name: seldon-core-analytics\n    meta.helm.sh/release-namespace: seldon-system\n  creationTimestamp: \"2022-01-07T04:16:47Z\"\n  labels:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/managed-by: Helm\n    app.kubernetes.io/name: grafana\n    app.kubernetes.io/version: 7.0.3\n    helm.sh/chart: grafana-5.1.4\n  name: seldon-core-analytics-grafana\n  namespace: seldon-system\n  resourceVersion: \"280605\"\n  uid: 75073b78-92ec-472c-b0d5-240038ea8fa5\nspec:\n  clusterIP: 10.109.20.161\n  clusterIPs:\n  - 10.109.20.161\n  ipFamilies:\n  - IPv4\n  ipFamilyPolicy: SingleStack\n  ports:\n  - name: service\n    port: 80\n    protocol: TCP\n    targetPort: 3000\n  selector:\n    app.kubernetes.io/instance: seldon-core-analytics\n    app.kubernetes.io/name: grafana\n  sessionAffinity: None\n  type: LoadBalancer # Change ClusterIP to LoadBalancer\n  loadBalancerIP: 192.168.35.103 # Add IP\nstatus:\n  loadBalancer: {}\n```\n\n다시 확인을 해보면 External-IP 값이 `192.168.35.103` 인 것을 확인할 수 있습니다.\n\n```bash\nkubectl get svc/seldon-core-analytics-grafana -n seldon-system\n```\n\n```bash\nNAME                            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE\nseldon-core-analytics-grafana   LoadBalancer   10.109.20.161   192.168.35.103   80:31191/TCP   5m14s\n```\n\nWeb Browser 를 열어 [http://192.168.35.103:80](http://192.168.35.103:80) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 
확인합니다.\n\n![login-after-grafana-setting.png](./img/login-after-grafana-setting.png)\n"
  },
  {
    "path": "versioned_docs/version-1.0/appendix/pyenv.md",
    "content": "---\ntitle: \"1. Python 가상환경 설치\"\nsidebar_position: 1\n---\n\n## 파이썬 가상환경\n\nPython 환경을 사용하다 보면 여러 버전의 Python 환경을 사용하고 싶은 경우나, 여러 프로젝트별 패키지 버전을 따로 관리하고 싶은 경우가 발생합니다.\n\n이처럼 Python 환경 혹은 Python Package 환경을 가상화하여 관리하는 것을 쉽게 도와주는 도구로는 pyenv, conda, virtualenv, venv 등이 존재합니다.\n\n이 중 *모두의 MLOps*에서는 [pyenv](https://github.com/pyenv/pyenv)와 [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv)를 설치하는 방법을 다룹니다.  \npyenv는 Python 버전을 관리하는 것을 도와주며, pyenv-virtualenv는 pyenv의 plugin으로써 파이썬 패키지 환경을 관리하는 것을 도와줍니다.\n\n## pyenv 설치\n\n### Prerequisites\n\n운영 체제별로 Prerequisites가 다릅니다. [다음 페이지](https://github.com/pyenv/pyenv/wiki#suggested-build-environment)를 참고하여 필수 패키지들을 설치해주시기 바랍니다.\n\n### 설치 - macOS\n\n1. pyenv, pyenv-virtualenv 설치\n\n```bash\nbrew update\nbrew install pyenv\nbrew install pyenv-virtualenv\n```\n\n2. pyenv 설정\n\nmacOS의 경우 카탈리나 버전 이후 기본 shell이 zsh로 변경되었기 때문에 zsh을 사용하는 경우를 가정하였습니다.\n\n```bash\necho 'eval \"$(pyenv init -)\"' >> ~/.zshrc\necho 'eval \"$(pyenv virtualenv-init -)\"' >> ~/.zshrc\nsource ~/.zshrc\n```\n\npyenv 명령이 정상적으로 수행되는지 확인합니다.\n\n```bash\npyenv --help\n```\n\n```bash\n$ pyenv --help\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where 
versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n### 설치 - Ubuntu\n\n1. 
pyenv, pyenv-virtualenv 설치\n\n```bash\ncurl https://pyenv.run | bash\n```\n\n다음과 같은 내용이 출력되면 정상적으로 설치된 것을 의미합니다.\n\n```bash\n  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n  0     0    0     0    0     0      0      0 --:--:-- --:--:--   0     0    0     0    0     0      0      0 --:--:-- --:--:-- 100   270  100   270    0     0    239      0  0:00:01  0:00:01 --:--:--   239\nCloning into '/home/mlops/.pyenv'...\nr\n...\n중략...\n...\nremote: Enumerating objects: 10, done.\nremote: Counting objects: 100% (10/10), done.\nremote: Compressing objects: 100% (6/6), done.\nremote: Total 10 (delta 1), reused 6 (delta 0), pack-reused 0\nUnpacking objects: 100% (10/10), 2.92 KiB | 2.92 MiB/s, done.\n\nWARNING: seems you still have not added 'pyenv' to the load path.\n\n\n# See the README for instructions on how to set up\n# your shell environment for Pyenv.\n\n# Load pyenv-virtualenv automatically by adding\n# the following to ~/.bashrc:\n\neval \"$(pyenv virtualenv-init -)\"\n\n```\n\n2. 
pyenv 설정\n\n기본 shell로 bash shell을 사용하는 경우를 가정하였습니다.\nbash에서 pyenv와 pyenv-virtualenv 를 사용할 수 있도록 설정합니다.\n\n```bash\nvi ~/.bashrc\n```\n\n다음 문자열을 입력한 후 저장합니다.\n\n```bash\nexport PATH=\"$HOME/.pyenv/bin:$PATH\"\neval \"$(pyenv init -)\"\neval \"$(pyenv virtualenv-init -)\"\n```\n\nshell을 restart합니다.\n\n```bash\nexec $SHELL\n```\n\npyenv 명령이 정상적으로 수행되는지 확인합니다.\n\n```bash\npyenv --help\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 설정된 것을 의미합니다.\n\n```bash\n$ pyenv\npyenv 2.2.2\nUsage: pyenv <command> [<args>]\n\nSome useful pyenv commands are:\n   --version   Display the version of pyenv\n   activate    Activate virtual environment\n   commands    List all available pyenv commands\n   deactivate   Deactivate virtual environment\n   doctor      Verify pyenv installation and development tools to build pythons.\n   exec        Run an executable with the selected Python version\n   global      Set or show the global Python version(s)\n   help        Display help for a command\n   hooks       List hook scripts for a given pyenv command\n   init        Configure the shell environment for pyenv\n   install     Install a Python version using python-build\n   local       Set or show the local application-specific Python version(s)\n   prefix      Display prefix for a Python version\n   rehash      Rehash pyenv shims (run this after installing executables)\n   root        Display the root directory where versions and shims are kept\n   shell       Set or show the shell-specific Python version\n   shims       List existing pyenv shims\n   uninstall   Uninstall a specific Python version\n   version     Show the current Python version(s) and its origin\n   version-file   Detect the file that sets the current pyenv version\n   version-name   Show the current Python version\n   version-origin   Explain how the current Python version is set\n   versions    List all Python versions available to pyenv\n   virtualenv   Create a Python virtualenv using the pyenv-virtualenv plugin\n   
virtualenv-delete   Uninstall a specific Python virtualenv\n   virtualenv-init   Configure the shell environment for pyenv-virtualenv\n   virtualenv-prefix   Display real_prefix for a Python virtualenv version\n   virtualenvs   List all Python virtualenvs found in `$PYENV_ROOT/versions/*'.\n   whence      List all Python versions that contain the given executable\n   which       Display the full path to an executable\n\nSee `pyenv help <command>' for information on a specific command.\nFor full documentation, see: https://github.com/pyenv/pyenv#readme\n```\n\n## pyenv 사용\n\n### Python 버전 설치\n\n`pyenv install <Python-Version>` 명령을 통해 원하는 파이썬 버전을 설치할 수 있습니다.\n이번 페이지에서는 예시로 kubeflow에서 기본으로 사용하는 파이썬 3.7.12 버전을 설치하겠습니다.\n\n```bash\npyenv install 3.7.12\n```\n\n정상적으로 설치되면 다음과 같은 메시지가 출력됩니다.\n\n```bash\n$ pyenv install 3.7.12\nDownloading Python-3.7.12.tar.xz...\n-> https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tar.xz\nInstalling Python-3.7.12...\npatching file Doc/library/ctypes.rst\npatching file Lib/test/test_unicode.py\npatching file Modules/_ctypes/_ctypes.c\npatching file Modules/_ctypes/callproc.c\npatching file Modules/_ctypes/ctypes.h\npatching file setup.py\npatching file 'Misc/NEWS.d/next/Core and Builtins/2020-06-30-04-44-29.bpo-41100.PJwA6F.rst'\npatching file Modules/_decimal/libmpdec/mpdecimal.h\nInstalled Python-3.7.12 to /home/mlops/.pyenv/versions/3.7.12\n```\n\n### Python 가상환경 생성\n\n`pyenv virtualenv <Installed-Python-Version> <가상환경-이름>` 명령을 통해 원하는 파이썬 버전의 파이썬 가상환경을 생성할 수 있습니다.\n\n예시로 Python 3.7.12 버전의 `demo`라는 이름의 Python 가상환경을 생성하겠습니다.\n\n```bash\npyenv virtualenv 3.7.12 demo\n```\n\n```bash\n$ pyenv virtualenv 3.7.12 demo\nLooking in links: /tmp/tmpffqys0gv\nRequirement already satisfied: setuptools in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (47.1.0)\nRequirement already satisfied: pip in /home/mlops/.pyenv/versions/3.7.12/envs/demo/lib/python3.7/site-packages (20.1.1)\n```\n\n### Python 가상환경 사용\n\n`pyenv 
activate <가상환경 이름>` 명령을 통해 위와 같은 방식으로 생성한 가상환경을 사용할 수 있습니다.\n\n예시로는 `demo`라는 이름의 Python 가상환경을 사용하겠습니다.\n\n```bash\npyenv activate demo\n```\n\n다음과 같이 현재 가상환경의 정보가 shell의 맨 앞에 출력되는 것을 확인할 수 있습니다.\n\n  Before\n\n  ```bash\n  mlops@ubuntu:~$ pyenv activate demo\n  ```\n\n  After\n\n  ```bash\n  pyenv-virtualenv: prompt changing will be removed from future release. configure `export PYENV_VIRTUALENV_DISABLE_PROMPT=1' to simulate the behavior.\n  (demo) mlops@ubuntu:~$ \n  ```\n\n### Python 가상환경 비활성화\n\n`source deactivate` 명령을 통해 현재 사용 중인 가상환경을 비활성화할 수 있습니다.\n\n```bash\nsource deactivate\n```\n\n  Before\n\n  ```bash\n  (demo) mlops@ubuntu:~$ source deactivate\n  ```\n\n  After\n\n  ```bash\n  mlops@ubuntu:~$ \n  ```\n"
  },
  {
    "path": "versioned_docs/version-1.0/further-readings/_category_.json",
    "content": "{\n  \"label\": \"Further Readings\",\n  \"position\": 8,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/further-readings/info.md",
    "content": "---\ntitle: \"다루지 못한 것들\"\ndate: 2021-12-21\nlastmod: 2021-12-21\n---\n\n## MLOps Component\n\n[MLOps Concepts](../introduction/component.md)에서 다루었던 컴포넌트를 도식화하면 다음과 같습니다.\n\n![open-stacks-0.png](./img/open-stacks-0.png)\n\n이 중 *모두의 MLOps* 에서 다룬 기술 스택들은 다음과 같습니다.\n\n![open-stacks-1.png](./img/open-stacks-1.png)\n\n보시는 것처럼 아직 우리가 다루지 못한 많은 MLOps 컴포넌트들이 있습니다.  \n\n시간 관계상 이번에 모두 다루지는 못했지만, 만약 필요하다면 다음과 같은 오픈소스들을 먼저 참고해보면 좋을 것 같습니다.\n\n![open-stacks-2.png](./img/open-stacks-2.png)\n\n세부 내용은 다음과 같습니다.\n\n| Mgmt.                      | Component                   | Open Soruce                           |\n| -------------------------- | --------------------------- | ------------------------------------- |\n| Data Mgmt.                 | Collection                  | [Kafka](https://kafka.apache.org/)                                 |\n|                            | Validation                  | [Beam](https://beam.apache.org/)                                  |\n|                            | Feature Store               | [Flink](https://flink.apache.org/)                                 |\n| ML Model Dev. & Experiment | Modeling                    | [Jupyter](https://jupyter.org/)                               |\n|                            | Analysis & Experiment Mgmt. | [MLflow](https://mlflow.org/)                                |\n|                            | HPO Tuning & AutoML         | [Katib](https://github.com/kubeflow/katib)                                 |\n| Deploy Mgmt.               | Serving Framework           | [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/index.html)                           |\n|                            | A/B Test                    | [Iter8](https://iter8.tools/)                                 |\n|                            | Monitoring                  | [Grafana](https://grafana.com/oss/grafana/), [Prometheus](https://prometheus.io/)                   |\n| Process Mgmt.              
| pipeline                    | [Kubeflow](https://www.kubeflow.org/)                              |\n|                            | CI/CD                       | [Github Action](https://docs.github.com/en/actions)                         |\n|                            | Continuous Training         | [Argo Events](https://argoproj.github.io/events/)                           |\n| Platform Mgmt.             | Configuration Mgmt.         | [Consul](https://www.consul.io/)                                |\n|                            | Code Version Mgmt.          | [Github](https://github.com/), [Minio](https://min.io/)                         |\n|                            | Logging                     | (EFK) [Elastic Search](https://www.elastic.co/kr/elasticsearch/), [Fluentd](https://www.fluentd.org/), [Kibana](https://www.elastic.co/kr/kibana/) |\n|                            | Resource Mgmt.              | [Kubernetes](https://kubernetes.io/)                            |\n"
  },
  {
    "path": "versioned_docs/version-1.0/introduction/_category_.json",
    "content": "{\n  \"label\": \"Introduction\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/introduction/component.md",
    "content": "---\ntitle : \"3. Components of MLOps\"\ndescription: \"Describe MLOps Components\"\nsidebar_position: 3\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## Practitioners guide to MLOps\n\n 2021년 5월에 발표된 구글의 [white paper : Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning](https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf)에서는 MLOps의 핵심 기능들로 다음과 같은 것들을 언급하였습니다.\n\n\n![mlops-component](./img/mlops-component.png)\n\n\n 각 기능이 어떤 역할을 하는지 살펴보겠습니다.\n\n### 1. Experimentation\n\n 실험(Experimentation)은 머신러닝 엔지니어들이 데이터를 분석하고, 프로토타입 모델을 만들며 학습 기능을 구현할 수 있도록 하는 다음과 같은 기능을 제공합니다.\n\n- 깃(Git)과 같은 버전 컨트롤 도구와 통합된 노트북(Jupyter Notebook) 환경 제공\n- 사용한 데이터, 하이퍼 파라미터, 평가 지표를 포함한 실험 추적 기능 제공\n- 데이터와 모델에 대한 분석 및 시각화 기능 제공\n\n### 2. Data Processing\n\n 데이터 처리(Data Processing)는 머신러닝 모델 개발 단계, 지속적인 학습(Continuous Training) 단계, 그리고 API 배포(API Deployment) 단계에서 많은 양의 데이터를 사용할 수 있게 해 주는 다음과 같은 기능을 제공합니다.\n\n- 다양한 데이터 소스와 서비스에 호환되는 데이터 커넥터(connector) 기능 제공\n- 다양한 형태의 데이터와 호환되는 데이터 인코더(encoder) & 디코더(decoder) 기능 제공\n- 다양한 형태의 데이터에 대한 데이터 변환과 피처 엔지니어링(feature engineering) 기능 제공\n- 학습과 서빙을 위한 확장 가능한 배치, 스트림 데이터 처리 기능 제공\n\n### 3. Model training\n\n 모델 학습(Model training)은 모델 학습을 위한 알고리즘을 효율적으로 실행시켜주는 다음과 같은 기능을 제공합니다.\n\n- ML 프레임워크의 실행을 위한 환경 제공\n- 다수의 GPU / 분산 학습 사용을 위한 분산 학습 환경 제공\n- 하이퍼 파라미터 튜닝과 최적화 기능 제공\n\n### 4. Model evaluation\n\n 모델 평가(Model evaluation)는 실험 환경과 상용 환경에서 동작하는 모델의 성능을 관찰할 수 있는 다음과 같은 기능을 제공합니다.\n\n- 평가 데이터에 대한 모델 성능 평가 기능\n- 서로 다른 지속 학습 실행 결과에 대한 예측 성능 추적\n- 서로 다른 모델의 성능 비교와 시각화\n- 해석할 수 있는 AI 기술을 이용한 모델 출력 해석 기능 제공\n\n### 5. Model serving\n\n 모델 서빙(Model serving)은 상용 환경에 모델을 배포하고 서빙하기 위한 다음과 같은 기능들을 제공합니다.\n\n- 저 지연 추론과 고가용성 추론 기능 제공\n- 다양한 ML 모델 서빙 프레임워크 지원(Tensorflow Serving, TorchServe, NVIDIA Triton, Scikit-learn, XGGoost. 
etc)\n- 복잡한 형태의 추론 루틴 기능 제공, 예를 들어 전처리(preprocess) 또는 후처리(postprocess) 기능과 최종 결과를 위해 다수의 모델이 사용되는 경우를 말합니다.\n- 순간적으로 치솟는 추론 요청을 처리하기 위한 오토 스케일링(autoscaling) 기능 제공\n- 추론 요청과 추론 결과에 대한 로깅 기능 제공\n\n### 6. Online experimentation\n\n 온라인 실험(Online experimentation)은 새로운 모델이 생성되었을 때, 이 모델을 배포하면 어느 정도의 성능을 보일 것인지 검증하는 기능을 제공합니다. 이 기능은 새 모델의 배포까지 이어지도록 모델 저장소(Model Registry)와 연동되어야 합니다.\n\n- 카나리(canary) & 섀도(shadow) 배포 기능 제공\n- A/B 테스트 기능 제공\n- 멀티 암드 밴딧(Multi-armed bandit) 테스트 기능 제공\n\n### 7. Model Monitoring\n\n모델 모니터링(Model Monitoring)은 상용 환경에 배포된 모델이 정상적으로 동작하고 있는지를 모니터링하는 기능을 제공합니다. 예를 들어 모델의 성능이 떨어져 업데이트가 필요한지에 대한 정보 등을 제공합니다.\n\n### 8. ML Pipeline\n\n머신러닝 파이프라인(ML Pipeline)은 상용 환경에서 복잡한 ML 학습과 추론 작업을 구성하고 제어하고 자동화하기 위한 다음과 같은 기능을 제공합니다.\n\n- 다양한 이벤트 소스를 통한 파이프라인 실행 기능\n- 파이프라인 파라미터와 생성되는 산출물 관리를 위한 머신러닝 메타데이터 추적과 연동 기능\n- 일반적인 머신러닝 작업을 위한 내장 컴포넌트 지원과 사용자가 직접 구현한 컴포넌트에 대한 지원 기능\n- 서로 다른 실행 환경 제공 기능\n\n### 9. Model Registry\n\n 모델 저장소(Model Registry)는 머신러닝 모델의 생명 주기(Lifecycle)를 중앙 저장소에서 관리할 수 있게 해 주는 기능을 제공합니다.\n\n- 학습된 모델 그리고 배포된 모델에 대한 등록, 추적, 버저닝 기능 제공\n- 배포를 위해 필요한 데이터와 런타임 패키지들에 대한 정보 저장 기능\n\n### 10. Dataset and Feature Repository\n\n- 데이터에 대한 공유, 검색, 재사용 그리고 버전 관리 기능\n- 이벤트 스트리밍 및 온라인 추론 작업에 대한 실시간 처리 및 저 지연 서빙 기능\n- 사진, 텍스트, 테이블 형태의 데이터와 같은 다양한 형태의 데이터 지원 기능\n\n### 11. ML Metadata and Artifact Tracking\n\n MLOps의 각 단계에서는 다양한 형태의 산출물들이 생성됩니다. ML 메타데이터는 이런 산출물들에 대한 정보를 의미합니다.\n ML 메타데이터와 산출물 관리는 산출물의 위치, 타입, 속성, 그리고 관련된 실험(experiment)에 대한 정보를 관리하기 위해 다음과 같은 기능들을 제공합니다.\n\n- ML 산출물에 대한 히스토리 관리 기능\n- 실험과 파이프라인 파라미터 설정에 대한 추적, 공유 기능\n- ML 산출물에 대한 저장, 접근, 시각화, 다운로드 기능 제공\n- 다른 MLOps 기능과의 통합 기능 제공\n"
  },
  {
    "path": "versioned_docs/version-1.0/introduction/intro.md",
    "content": "---\ntitle : \"1. What is MLOps?\"\ndescription: \"Introduction to MLOps\"\nsidebar_position: 1\ndate: 2021-1./img to MLOps\"\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Machine Learning Project\n\n2012년 Alexnet 이후 CV, NLP를 비롯하여 데이터가 존재하는 도메인이라면 어디서든 머신러닝과 딥러닝을 도입하고자 하였습니다.  \n딥러닝과 머신러닝은 AI라는 단어로 묶이며 불렸고 많은 매체에서 AI의 필요성을 외쳤습니다. 그리고 무수히 많은 기업에서 머신러닝과 딥러닝을 이용한 수많은 프로젝트를 진행하였습니다. 하지만 그 결과는 어떻게 되었을까요?  \n엘리먼트 AI의 음병찬 동북아 지역 총괄책임자는 [*\"10개 기업에 AI 프로젝트를 시작한다면 그중 9개는 컨셉검증(POC)만 하다 끝난다\"*](https://zdnet.co.kr/view/?no=20200611062002)고 말했습니다.\n\n이처럼 많은 프로젝트에서 머신러닝과 딥러닝은 이 문제를 풀 수 있을 것 같다는 가능성만을 보여주고 사라졌습니다. 그리고 이 시기쯤에 [AI에 다시 겨울](https://www.aifutures.org/2021/ai-winter-is-coming/)이 다가오고 있다는 전망도 나오기 시작했습니다.\n\n왜 프로젝트 대부분이 컨셉검증(POC) 단계에서 끝났을까요?  \n머신러닝과 딥러닝 코드만으로는 실제 서비스를 운영할 수 없기 때문입니다.\n\n실제 서비스 단계에서 머신러닝과 딥러닝의 코드가 차지하는 부분은 생각보다 크지 않기 때문에, 단순히 모델의 성능만이 아닌 다른 많은 부분을 고려해야 합니다.  \n구글은 이런 문제를 2015년 [Hidden Technical Debt in Machine Learning Systems](https://proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf)에서 지적한 바 있습니다.  \n하지만 이 논문이 나올 당시에는 아직 많은 머신러닝 엔지니어들이 딥러닝과 머신러닝의 가능성을 입증하기 바쁜 시기였기 때문에, 논문이 지적하는 바에 많은 주의를 기울이지는 않았습니다.\n\n그리고 몇 년이 지난 후 머신러닝과 딥러닝은 가능성을 입증해내어, 이제 사람들은 실제 서비스에 적용하고자 했습니다.  \n하지만 곧 많은 사람이 실제 서비스는 쉽지 않다는 것을 깨달았습니다.\n\n## Devops\n\nMLOps는 이전에 없던 새로운 개념이 아니라 DevOps라고 불리는 개발 방법론에서 파생된 단어입니다. 그렇기에 DevOps를 이해한다면 MLOps를 이해하는 데 도움이 됩니다.\n\n### DevOps\n\nDevOps는 Development(개발)와 Operations(운영)의 합성어로 소프트웨어의 개발(Development)과 운영(Operations)의 합성어로서 소프트웨어 개발자와 정보기술 전문가 간의 소통, 협업 및 통합을 강조하는 개발 환경이나 문화를 말합니다.\nDevOps의 목적은 소프트웨어 개발 조직과 운영 조직간의 상호 의존적 대응이며 조직이 소프트웨어 제품과 서비스를 빠른 시간에 개발 및 배포하는 것을 목적으로 합니다.\n\n### Silo Effect\n\n그럼 간단한 상황 설명을 통해 DevOps가 왜 필요한지 알아보도록 하겠습니다.\n\n서비스 초기에는 지원하는 기능이 많지 않으며 팀 또는 회사의 규모가 작습니다. 이때에는 개발팀과 운영팀의 구분이 없거나 작은 규모의 팀으로 구분되어 있습니다. 핵심은 규모가 작다는 것에 있습니다. 
이때는 서로 소통할 수 있는 접점이 많고, 집중해야 하는 서비스가 적기 때문에 빠르게 서비스를 개선해 나갈 수 있습니다.\n\n하지만 서비스의 규모가 커질수록 개발팀과 운영팀은 분리되고 서로 소통할 수 있는 채널의 물리적인 한계가 오게 됩니다. 예를 들어서 다른 팀과 함께하는 미팅에 팀원 전체가 미팅을 하는 것이 아니라 각 팀의 팀장 혹은 소수의 시니어만 참석하여 미팅을 진행하게 됩니다. 이런 소통 채널의 한계는 필연적으로 소통의 부재로 이어지게 됩니다. 그러다 보면 개발팀은 새로운 기능들을 계속해서 개발하고 운영팀 입장에서는 개발팀에서 개발한 기능이 배포 시 장애를 일으키는 등 여러 문제가 생기게 됩니다.\n\n위와 같은 상황이 반복되면 조직 이기주의라고 불리는 사일로 현상이 생길 수 있습니다.\n\n![silo](./img/silo.png)\n\n> 사일로(silo)는 곡식이나 사료를 저장하는 굴뚝 모양의 창고를 의미한다. 사일로는 독립적으로 존재하며 저장되는 물품이 서로 섞이지 않도록 철저히 관리할 수 있도록 도와준다.  \n> 사일로 효과(Organizational Silos Effect)는 조직 부서 간에 서로 협력하지 않고 내부 이익만을 추구하는 현상을 의미한다. 조직 내에서 개별 부서끼리 서로 담을 쌓고 각자의 이익에만 몰두하는 부서 이기주의를 일컫는다.\n\n사일로 현상은 서비스 품질의 저하로 이어지게 됩니다. 이러한 사일로 현상을 해결하기 위해 나온 것이 바로 DevOps입니다.\n\n### CI/CD\n\nContinuous Integration(CI)과 Continuous Delivery(CD)는 개발팀과 운영팀 사이의 장벽을 허물기 위한 구체적인 방법입니다.\n\n![cicd](./img/cicd.png)\n\n이 방법을 통해서 개발팀에서는 운영팀의 환경을 이해하고 개발팀에서 개발 중인 기능이 정상적으로 배포까지 이어질 수 있는지 확인합니다. 운영팀은 검증된 기능 또는 개선된 제품을 더 자주 배포해 고객의 제품 경험을 상승시킵니다.  \n앞에서 설명한 내용을 종합하자면, DevOps는 개발팀과 운영팀 간에 생기는 문제를 해결하기 위한 방법론입니다.\n\n## MLOps\n\n### 1) ML+Ops\n\nMLOps는 Machine Learning 과 Operations의 합성어로 DevOps에서 Dev가 ML로 바뀌었습니다. 이제 앞에서 살펴본 DevOps를 통해 MLOps가 무엇인지 짐작해 볼 수 있습니다.\n“MLOps는 머신러닝팀과 운영팀의 문제를 해결하기 위한 방법입니다.”\n이 말은 머신러닝팀과 운영팀 사이에 문제가 발생했다는 의미입니다. 그럼 왜 머신러닝팀과 운영팀에는 문제가 발생했을까요? 두 팀 간의 문제를 알아보기 위해서 추천시스템을 예시로 알아보겠습니다.\n\n#### Rule Based\n\n처음 추천시스템을 만드는 경우 간단한 규칙을 기반으로 아이템을 추천합니다. 예를 들어서 1주일간 판매량이 가장 많은 순서대로 보여주는 식의 방식을 이용합니다. 이 방식으로 모델을 정한다면 특별한 이유가 없는 이상 모델의 수정이 필요 없습니다.\n\n#### Machine Learning\n\n서비스의 규모가 조금 커지고 로그 데이터가 많이 쌓인다면 이를 이용해 아이템 기반 혹은 유저 기반의 머신러닝 모델을 생성합니다. 이때 모델은 정해진 주기에 따라 재학습 후 재배포합니다.\n\n#### Deep Learning\n\n개인화 추천에 대한 요구가 더 커지고 더 좋은 성능을 내는 모델이 필요해질 경우 딥러닝을 이용한 모델을 개발하기 시작합니다. 이때 만드는 모델 역시 머신러닝과 같이 정해진 주기에 따라 재학습 후 재배포합니다.\n\n![graph](./img/graph.png)\n\n위에서 설명한 것을 x축을 모델의 복잡도, y축을 모델의 성능으로 두고 그래프로 표현한다면 다음과 같이 복잡도가 올라갈 때 모델의 성능이 올라가는 상승 관계를 갖습니다. 
머신러닝에서 딥러닝으로 넘어갈 때 머신러닝 팀이 새로 생기게 됩니다.\n\n만약 관리해야 할 모델이 적다면 서로 협업을 통해서 충분히 해결할 수 있지만 개발해야 할 모델이 많아진다면 DevOps의 경우와 같이 사일로 현상이 나타나게 됩니다.\n\nDevOps의 목표에 맞춰서 생각해보면, 개발팀에서 개발한 기능이 정상적으로 배포될 수 있는지 확인하는 것이 DevOps의 목표였다면, MLOps의 목표는 머신러닝 팀에서 개발한 모델이 정상적으로 배포될 수 있는지 확인하는 것입니다.\n\n### 2) ML -> Ops\n\n하지만 최근 나오고 있는 MLOps 관련 제품과 설명을 보면 꼭 앞에서 설명한 목표만을 대상으로 하고 있지 않습니다.\n어떤 경우에는 머신러닝 팀에서 만든 모델을 이용해 직접 운영을 할 수 있도록 도와주려고 합니다. 이러한 니즈는 최근 머신러닝 프로젝트가 진행되는 과정에서 알 수 있습니다.\n\n추천시스템의 경우 간단한 모델부터 시작해 운영할 수 있었습니다. 하지만 자연어, 이미지와 같은 곳에서는 규칙 기반의 모델보다는 딥러닝을 이용해 주어진 태스크를 해결할 수 있는지 검증(POC)을 선행하는 경우가 많습니다. 검증이 끝난 프로젝트는 이제 서비스를 위한 운영 환경을 개발하기 시작합니다. 하지만 머신러닝 팀 내의 자체 역량으로는 이 문제를 해결하기 쉽지 않습니다. 이를 해결하기 위해서 MLOps가 필요한 경우도 있습니다.\n\n### 3) 결론\n\n요약하자면 MLOps는 두 가지 목표가 있습니다.\n앞에서 설명한 MLOps는 ML+Ops 로 두 팀의 생산성 향상을 위한 것이었습니다.\n반면, 뒤에서 설명한 것은 ML->Ops 로 머신러닝 팀에서 직접 운영을 할 수 있도록 도와주는 것을 말합니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/introduction/levels.md",
    "content": "---\ntitle : \"2. Levels of MLOps\"\ndescription: \"Levels of MLOps\"\nsidebar_position: 2\ndate: 2021-12-03\nlastmod: 2022-03-05\ncontributors: [\"Jongseob Jeon\"]\n\n---\n\n이번 페이지에서는 구글에서 발표한 MLOps의 단계를 보며 MLOps의 핵심 기능은 무엇인지 알아 보겠습니다.\n\n## Hidden Technical Debt in ML System\n\n구글은 무려 2015년부터 MLOps의 필요성을 말했습니다. Hidden Technical Debt in Machine Learning Systems 은 그런 구글의 생각을 담은 논문입니다.\n\n![paper](./img/paper.png)\n\n이 논문의 핵심은 바로 머신러닝을 이용한 제품을 만드는데 있어서 머신러닝 코드는 전체 시스템을 구성하는데 있어서 아주 일부일 뿐이라는 것입니다.\n\n![paper-2](./img/paper-2.png)\n\n구글은 이 논문을 더 발전시켜서 MLOps라는 용어를 만들어 확장시켰습니다. 더 자세한 내용은 [구글 클라우드 홈페이지](https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning)에서 더 자세한 내용을 확인할 수 있습니다. 이번 포스트에서는 구글에서 말하는 MLOps란 어떤 것인지에 대해서 설명해보고자 합니다.\n\n구글에서는 MLOps의 발전 단계를 총 3(0~2)단계로 나누었습니다. 각 단계들에 대해 설명하기 앞서 이전 포스트에서 설명했던 개념 중 필요한 부분을 다시 한번 보겠습니다.\n\n머신러닝 모델을 운영하기 위해서는 모델을 개발하는 머신러닝 팀과 배포 및 운영을 담당하는 운영팀이 있습니다. 이 두 팀의 원할한 협업을 위해서 MLOps가 필요하게 되었습니다. 이전에는 간단히 Continuous Integration(CI)/Continuous Deployment(CD)를 통해서 할 수 있다고 하였는데, 어떻게 CI/CD를 하는지에 대해서 알아 보겠습니다.\n\n## 0단계: 수동 프로세스\n\n![level-0](./img/level-0.png)\n\n0단계에서 두 팀은 “모델”을 통해 소통합니다. 머신 러닝팀은 쌓여있는 데이터로 모델을 학습시키고 학습된 모델을 운영팀에게 전달 합니다. 운영팀은 이렇게 전달받은 모델을 배포합니다.\n\n![toon](./img/toon.png)\n\n초기의 머신 러닝 모델들은 이 “모델” 중심의 소통을 통해 배포합니다. 그런데 이런 배포 방식은 여러 문제가 있습니다.  \n예를 들어서 어떤 기능에서는 파이썬 3.7을 쓰고 어떤 기능에서는 파이썬 3.8을 쓴다면 다음과 같은 상황을 자주 목격할 수 있습니다.\n\n이러한 상황이 일어나는 이유는 머신러닝 모델의 특성에 있습니다. 학습된 머신러닝 모델이 동작하기 위해서는 3가지가 필요합니다.\n\n1. 파이썬 코드\n2. 학습된 가중치\n3. 환경 (패키지, 버전 등)\n\n만약 이 3가지 중 한 가지라도 전달이 잘못 된다면 모델이 동작하지 않거나 예상하지 못한 예측을 할수 있습니다. 그런데 많은 경우 환경이 일치하지 않아서 동작하지 않는 경우가 많습니다. 머신러닝은 다양한 오픈소스를 사용하는데 오픈소스는 특성상 어떤 버전을 쓰는지에 따라서 같은 함수라도 결과가 다를 수 있습니다.\n\n이러한 문제는 서비스 초기에는 관리할 모델이 많지 않기 때문에 금방 해결할 수 있습니다. 
하지만 관리하는 기능들이 많아지고 서로 소통에 어려움을 겪게 된다면 성능이 더 좋은 모델을 빠르게 배포할 수 없게 됩니다.\n\n## 1단계: ML 파이프라인 자동화\n\n### Pipeline\n\n![level-1-pipeline](./img/level-1-pipeline.png)\n\n그래서 MLOps에서는 “파이프라인(Pipeline)”을 이용해 이러한 문제를 방지하고자 했습니다. MLOps의 파이프라인은 도커와 같은 컨테이너를 이용해 머신러닝 엔지니어가 모델 개발에 사용한 것과 동일한 환경으로 동작되는 것을 보장합니다. 이를 통해서 환경이 달라서 모델이 동작하지 않는 상황을 방지합니다.\n\n그런데 파이프라인은 범용적인 용어로 여러 다양한 태스크에서 사용됩니다. 머신러닝 엔지니어가 작성하는 파이프라인의 역할은 무엇일까요?  \n머신러닝 엔지니어가 작성하는 파이프라인은 학습된 모델을 생산합니다. 그래서 파이프라인 대신 학습 파이프라인(Training Pipeline)이 더 정확하다고 볼 수 있습니다.\n\n### Continuous Training\n\n![level-1-ct.png](./img/level-1-ct.png)\n\n그리고 Continuous Training(CT) 개념이 추가됩니다. 그렇다면 CT는 왜 필요할까요?\n\n#### Auto Retrain\n\nReal World에서 데이터는 분포가 계속해서 변하는 특징이 있는데, 이를 Data Shift라고 합니다. 그래서 과거에 학습한 모델이 시간이 지남에 따라 모델의 성능이 저하되는 문제가 있습니다. 이 문제를 해결하는 가장 간단하고 효과적인 해결책은 바로 최근 데이터를 이용해 모델을 재학습하는 것입니다. 변화된 데이터 분포에 맞춰서 모델을 재학습하면 다시 준수한 성능을 낼 수 있습니다.\n\n#### Auto Deploy\n\n하지만 제조업과 같이 한 공장에서 여러 레시피를 처리하는 경우 무조건 재학습을 하는 것이 좋지 않을 수도 있습니다. Blind Spot이 대표적인 예입니다.\n\n예를 들어서 자동차 생산 라인에서 모델 A에 대해서 모델을 만들고 이를 이용해 예측을 진행하고 있었습니다. 만약 전혀 다른 모델 B가 들어오면 이전에 보지 못한 데이터 패턴이기 때문에 모델 B에 대해서 새로운 모델을 학습합니다.\n\n이제 모델 B에 대해서 모델을 만들었기 때문에 모델은 예측을 진행할 것입니다. 그런데 만약 데이터가 다시 모델 A로 바뀐다면 어떻게 할까요?  \n만약 Retraining 규칙만 있다면 다시 모델 A에 대해서 새로운 모델을 학습하게 됩니다. 그런데 머신러닝 모델이 충분한 성능을 보이기 위해서는 충분한 양의 데이터가 모여야 합니다. Blind Spot이란 이렇게 데이터를 모으기 위해서 모델이 동작하지 않는 구간을 말합니다.\n\n이러한 Blind Spot을 해결하는 방법은 간단할 수 있습니다. 바로 모델 A에 대한 모델이 과거에 있었는지 확인하고 만약 있었다면 새로운 모델을 바로 학습하기보다는 이전 모델을 이용해 다시 예측을 하면 이런 Blind Spot을 해결할 수 있습니다. 이렇게 모델과 같은 메타 데이터를 이용해 모델을 자동으로 변환해주는 것을 Auto Deploy라고 합니다.\n\n정리하자면 CT를 위해서는 Auto Retraining과 Auto Deploy 두 가지 기능이 필요합니다. 둘은 서로의 단점을 보완해 계속해서 모델의 성능을 유지할 수 있게 합니다.\n\n## 2단계: CI/CD 파이프라인의 자동화\n\n![level-2](./img/level-2.png)\n\n2단계의 제목은 CI와 CD의 자동화입니다. DevOps에서의 CI/CD의 대상은 소스 코드입니다. 
그렇다면 MLOps는 어떤 것이 CI/CD의 대상일까요?\n\nMLOps의 CI/CD 대상 또한 소스 코드인 것은 맞지만 조금 더 엄밀히 정의하자면 학습 파이프라인이라고 볼 수 있습니다.\n\n그래서 모델 학습에 영향을 주는 변화에 대해서 실제로 모델이 정상적으로 학습이 되는지 (CI), 학습된 모델이 정상적으로 동작하는지 (CD)를 확인해야 합니다. 따라서 학습을 하는 코드에 직접적인 수정이 있는 경우에는 CI/CD를 진행해야 합니다.\n\n코드 외에도 사용하는 패키지의 버전, 파이썬의 버전 변경도 CI/CD의 대상입니다. 많은 경우 머신 러닝은 오픈 소스를 이용합니다. 하지만 오픈 소스는 그 특성상 버전이 바뀌었을 때 함수의 내부 로직이 변하는 경우도 있습니다. 물론 어느 정도 버전이 올라갈 때 이와 관련된 알림을 주지만 한 번에 버전이 크게 바뀐다면 이러한 변화를 모를 수도 있습니다.  \n그래서 사용하는 패키지의 버전이 변하는 경우에도 CI/CD를 통해 정상적으로 모델이 학습, 동작하는지 확인을 해야 합니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/introduction/why_kubernetes.md",
    "content": "---\ntitle : \"4. Why Kubernetes?\"\ndescription: \"Reason for using k8s in MLOps\"\nsidebar_position: 4\ndate: 2021-12-03\nlastmod: 2021-12-10\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## MLOps & Kubernetes\n\n그렇다면 MLOps를 이야기할 때, 쿠버네티스(Kubernetes)라는 단어가 항상 함께 들리는 이유가 무엇일까요?\n\n성공적인 MLOps 시스템을 구축하기 위해서는 [MLOps의 구성요소](../introduction/component.md) 에서 설명한 것처럼 다양한 구성 요소들이 필요하지만, 각각의 구성 요소들이 유기적으로 운영되기 위해서는 인프라 레벨에서 수많은 이슈를 해결해야 합니다.  \n간단하게는 수많은 머신러닝 모델의 학습 요청을 차례대로 실행하는 것, 다른 작업 공간에서도 같은 실행 환경을 보장해야 하는 것, 배포된 서비스에 장애가 생겼을 때 빠르게 대응해야 하는 것 등의 이슈 등을 생각해볼 수 있습니다.  \n여기서 컨테이너(Container)와 컨테이너 오케스트레이션 시스템(Container Orchestration System)의 필요성이 등장합니다.\n\n쿠버네티스와 같은 컨테이너 오케스트레이션 시스템을 도입하면 실행 환경의 격리와 관리를 효율적으로 수행할 수 있습니다. 컨테이너 오케스트레이션 시스템을 도입한다면, 머신러닝 모델을 개발하고 배포하는 과정에서 다수의 개발자가 소수의 클러스터를 공유하면서 *'1번 클러스터 사용 중이신가요?', 'GPU 사용 중이던 제 프로세스 누가 죽였나요?', '누가 클러스터에 x 패키지 업데이트했나요?'* 와 같은 상황을 방지할 수 있습니다.\n\n## Container\n\n그렇다면 컨테이너란 무엇일까요? 마이크로소프트에서는 컨테이너를 [다음](https://azure.microsoft.com/ko-kr/overview/what-is-a-container/)과 같이 정의하고 있습니다.\n\n> 컨테이너란 : 애플리케이션의 표준화된 이식 가능한 패키징\n\n그런데 왜 머신러닝에서 컨테이너가 필요할까요? 머신러닝 모델들은 운영체제나 Python 실행 환경, 패키지 버전 등에 따라 다르게 동작할 수 있습니다.  \n이를 방지하기 위해서 머신러닝에 사용된 소스 코드와 함께 종속적인 실행 환경 전체를 **하나로 묶어서(패키징해서)** 공유하고 실행하는 데 활용할 수 있는 기술이 컨테이너라이제이션(Containerization) 기술입니다.\n이렇게 패키징된 형태를 컨테이너 이미지라고 부르며, 컨테이너 이미지를 공유함으로써 사용자들은 어떤 시스템에서든 같은 실행 결과를 보장할 수 있게 됩니다.  \n즉, 단순히 Jupyter Notebook 파일이나, 모델의 소스 코드와 requirements.txt 파일을 공유하는 것이 아닌, 모든 실행 환경이 담긴 컨테이너 이미지를 공유한다면 *\"제 노트북에서는 잘 되는데요?\"* 와 같은 상황을 피할 수 있습니다.\n\n컨테이너를 처음 접하시는 분들이 흔히 하시는 오해 중 하나는 \"**컨테이너 == 도커**\"라고 받아들이는 것입니다.  \n도커는 컨테이너와 같은 의미를 지니는 개념이 아니라, 컨테이너를 띄우거나, 컨테이너 이미지를 만들고 공유하는 것과 같이 컨테이너를 더욱더 쉽고 유연하게 사용할 수 있는 기능을 제공해주는 도구입니다. 정리하자면 컨테이너는 가상화 기술이고, 도커는 가상화 기술의 구현체라고 말할 수 있습니다.\n\n다만, 도커는 여러 컨테이너 가상화 도구 중에서 쉬운 사용성과 높은 효율성을 바탕으로 가장 빠르게 성장하여 대세가 되었기에 컨테이너하면 도커라는 이미지가 자동으로 떠오르게 되었습니다. 
이렇게 컨테이너와 도커 생태계가 대세가 되기까지는 다양한 이유가 있지만, 기술적으로 자세한 이야기는 *모두의 MLOps*의 범위를 넘어서기 때문에 다루지는 않겠습니다.\n\n컨테이너 혹은 도커를 처음 들어보시는 분들에게는 *모두의 MLOps*의 내용이 다소 어렵게 느껴질 수 있으므로, [생활코딩](https://opentutorials.org/course/4781), [subicura 님의 개인 블로그 글](https://subicura.com/2017/01/19/docker-guide-for-beginners-1.html) 등의 자료를 먼저 살펴보는 것을 권장합니다.\n\n## Container Orchestration System\n\n그렇다면 컨테이너 오케스트레이션 시스템은 무엇일까요? **오케스트레이션**이라는 단어에서 추측해 볼 수 있듯이, 수많은 컨테이너가 있을 때 컨테이너들이 서로 조화롭게 구동될 수 있도록 지휘하는 시스템에 비유할 수 있습니다.\n\n컨테이너 기반의 시스템에서 서비스는 컨테이너의 형태로 사용자들에게 제공됩니다. 이때 관리해야 할 컨테이너의 수가 적다면 운영 담당자 한 명이서도 충분히 모든 상황에 대응할 수 있습니다.  \n하지만, 수백 개 이상의 컨테이너가 수 십 대 이상의 클러스터에서 구동되고 있고 장애를 일으키지 않고 항상 정상 동작해야 한다면, 모든 서비스의 정상 동작 여부를 담당자 한 명이 파악하고 이슈에 대응하는 것은 불가능에 가깝습니다.\n\n예를 들면, 모든 서비스가 정상적으로 동작하고 있는지를 계속해서 모니터링(Monitoring)해야 합니다.  \n만약, 특정 서비스가 장애를 일으켰다면 여러 컨테이너의 로그를 확인해가며 문제를 파악해야 합니다.  \n또한, 특정 클러스터나 특정 컨테이너에 작업이 몰리지 않도록 스케줄링(Scheduling)하고 로드 밸런싱(Load Balancing)하며, 스케일링(Scaling)하는 등의 수많은 작업을 담당해야 합니다.\n이렇게 수많은 컨테이너의 상태를 지속해서 관리하고 운영하는 과정을 조금이나마 쉽게, 자동으로 할 수 있는 기능을 제공해주는 소프트웨어가 바로 컨테이너 오케스트레이션 시스템입니다.  \n\n머신러닝에서는 어떻게 쓰일 수 있을까요?  \n예를 들어서 GPU가 있어야 하는 딥러닝 학습 코드가 패키징된 컨테이너는 사용 가능한 GPU가 있는 클러스터에서 수행하고, 많은 메모리를 필요로 하는 데이터 전처리 코드가 패키징된 컨테이너는 메모리의 여유가 많은 클러스터에서 수행하고, 학습 중에 클러스터에 문제가 생기면 자동으로 같은 컨테이너를 다른 클러스터로 이동시키고 다시 학습을 진행하는 등의 작업을 사람이 일일이 수행하지 않고, 자동으로 관리하는 시스템을 개발한 뒤 맡기는 것입니다.\n\n집필을 하는 2022년을 기준으로 쿠버네티스는 컨테이너 오케스트레이션 시스템의 사실상의 표준(De facto standard)입니다.\n\nCNCF에서 2018년 발표한 [Survey](https://www.cncf.io/blog/2018/08/29/cncf-survey-use-of-cloud-native-technologies-in-production-has-grown-over-200-percent/) 에 따르면 다음 그림과 같이 이미 두각을 나타내고 있었으며, 2019년 발표한 [Survey](https://www.cncf.io/wp-content/uploads/2020/08/CNCF_Survey_Report.pdf)에 따르면 그중 78%가 상용 수준(Production Level)에서 사용하고 있다는 것을 알 수 있습니다.\n\n![k8s-graph](./img/k8s-graph.png)\n\n쿠버네티스 생태계가 이처럼 커지게 된 이유에는 여러 가지 이유가 있습니다. 
하지만 도커와 마찬가지로 쿠버네티스 역시 머신러닝 기반의 서비스에서만 사용하는 기술이 아니기에, 자세히 다루기에는 상당히 많은 양의 기술적인 내용을 다루어야 하므로 이번 *모두의 MLOps*에서는 자세한 내용은 생략할 예정입니다.\n\n다만, *모두의 MLOps*에서 앞으로 다룰 내용은 도커와 쿠버네티스에 대한 내용을 어느 정도 알고 계신 분들을 대상으로 작성하였습니다. 따라서 쿠버네티스에 대해 익숙하지 않으신 분들은 다음 [쿠버네티스 공식 문서](https://kubernetes.io/ko/docs/concepts/overview/what-is-kubernetes/), [subicura 님의 개인 블로그 글](https://subicura.com/k8s/) 등의 쉽고 자세한 자료들을 먼저 참고해주시는 것을 권장합니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow\",\n  \"position\": 6,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/advanced-component.md",
    "content": "---\ntitle : \"8. Component - InputPath/OutputPath\"\ndescription: \"\"\nsidebar_position: 8\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n\n## Complex Outputs\n\n이번 페이지에서는 [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents) 예시로 나왔던 코드를 컴포넌트로 작성해 보겠습니다.\n\n## Component Contents\n\n아래 코드는 [Kubeflow Concepts](../kubeflow/kubeflow-concepts.md#component-contents)에서 사용했던 컴포넌트 콘텐츠입니다.\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target = pd.read_csv(train_target_path)\n\nclf = SVC(kernel=kernel)\nclf.fit(train_data, train_target)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n    dill.dump(clf, file_writer)\n```\n\n## Component Wrapper\n\n### Define a standalone Python function\n\n컴포넌트 래퍼에 필요한 Config들과 함께 작성하면 다음과 같이 됩니다.\n\n```python\ndef train_from_csv(\n    train_data_path: str,\n    train_target_path: str,\n    model_path: str,\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\n[Basic Usage Component](../kubeflow/basic-component)에서 설명할 때 입력과 출력에 대한 타입 힌트를 적어야 한다고 설명 했었습니다. 그런데 만약 json에서 사용할 수 있는 기본 타입이 아닌 dataframe, model와 같이 복잡한 객체들은 어떻게 할까요?\n\n파이썬에서 함수간에 값을 전달할 때, 객체를 반환해도 그 값이 호스트의 메모리에 저장되어 있으므로 다음 함수에서도 같은 객체를 사용할 수 있습니다. 하지만 kubeflow에서 컴포넌트들은 각각 컨테이너 위에서 서로 독립적으로 실행됩니다. 즉, 같은 메모리를 공유하고 있지 않기 때문에, 보통의 파이썬 함수에서 사용하는 방식과 같이 객체를 전달할 수 없습니다. 컴포넌트 간에 넘겨 줄 수 있는 정보는 `json` 으로만 가능합니다. 따라서 Model이나 DataFrame과 같이 json 형식으로 변환할 수 없는 타입의 객체는 다른 방법을 통해야 합니다.\n\nKubeflow에서는 이를 해결하기 위해 json-serializable 하지 않은 타입의 객체는 메모리 대신 파일에 데이터를 저장한 뒤, 그 파일을 이용해 정보를 전달합니다. 저장된 파일의 경로는 str이기 때문에 컴포넌트 간에 전달할 수 있기 때문입니다. 
그런데 kubeflow에서는 minio를 이용해 파일을 저장하는데 유저는 실행을 하기 전에는 각 파일의 경로를 알 수 없습니다. 이를 위해서 kubeflow에서는 입력과 출력의 경로와 관련된 매직을 제공하는데 바로 `InputPath`와 `OutputPath` 입니다.\n\n`InputPath`는 단어 그대로 입력 경로를 `OutputPath` 는 단어 그대로 출력 경로를 의미합니다.\n\n예를 들어서 데이터를 생성하고 반환하는 컴포넌트에서는 `data_path: OutputPath()`를 argument로 만듭니다.\n그리고 데이터를 받는 컴포넌트에서는 `data_path: InputPath()`을 argument로 생성합니다.\n\n이렇게 만든 후 파이프라인에서 서로 연결을 하면 kubeflow에서 필요한 경로를 자동으로 생성후 입력해 주기 때문에 더 이상 유저는 경로를 신경쓰지 않고 컴포넌트간의 관계만 신경쓰면 됩니다.\n\n이제 이 내용을 바탕으로 다시 컴포넌트 래퍼를 작성하면 다음과 같이 됩니다.\n\n```python\nfrom kfp.components import InputPath, OutputPath\n\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\nInputPath나 OutputPath는 string을 입력할 수 있습니다. 이 string은 입력 또는 출력하려고 하는 파일의 포맷입니다.  \n그렇다고 꼭 이 포맷으로 파일 형태로 저장이 강제되는 것은 아닙니다.  \n다만 파이프라인을 컴파일할 때 최소한의 타입 체크를 위한 도우미 역할을 합니다.  
\n만약 파일 포맷이 고정되지 않는다면 입력하지 않으면 됩니다 (타입 힌트 에서 `Any` 와 같은 역할을 합니다).\n\n### Convert to Kubeflow Format\n\n작성한 컴포넌트를 kubeflow에서 사용할 수 있는 포맷으로 변환합니다.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n```\n\n## Rule to use InputPath/OutputPath\n\nInputPath나 OutputPath argument는 파이프라인으로 작성할 때 지켜야하는 규칙이 있습니다.\n\n### Load Data Component\n\n위에서 작성한 컴포넌트를 실행하기 위해서는 데이터가 필요하므로 데이터를 생성하는 컴포넌트를 작성합니다.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n```\n\n### Write Pipeline\n\n이제 파이프라인을 작성해 보도록 하겠습니다.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"complex_pipeline\")\ndef complex_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n```\n\n한 가지 이상한 점을 확인하셨나요?  
\n바로 입력과 출력에서 받는 argument중 경로와 관련된 것들에 `_path` 접미사가 모두 사라졌습니다.  \n`iris_data.outputs[\"data_path\"]` 가 아닌 `iris_data.outputs[\"data\"]` 으로 접근하는 것을 확인할 수 있습니다.  \n이는 kubeflow에서 정한 법칙으로 `InputPath` 와 `OutputPath` 으로 생성된 경로들은 파이프라인에서 접근할 때는 `_path` 접미사를 생략하여 접근합니다.\n\n다만 방금 작성한 파이프라인을 업로드할 경우 실행이 되지 않습니다.\n이유는 다음 페이지에서 설명합니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/advanced-environment.md",
    "content": "---\ntitle : \"9. Component - Environment\"\ndescription: \"\"\nsidebar_position: 9\ncontributors: [\"Jongseob Jeon\"]\n---\n\n\n## Component Environment\n\n앞서  [8. Component - InputPath/OutputPath](../kubeflow/advanced-component.md)에서 작성한 파이프라인을 실행하면 실패하게 됩니다. 왜 실패하는지 알아보고 정상적으로 실행될 수 있도록 수정합니다.\n\n### Convert to Kubeflow Format\n\n[앞에서 작성한 컴포넌트](../kubeflow/advanced-component.md#convert-to-kubeflow-format)를 yaml파일로 변환하도록 하겠습니다.\n\n```python\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n\n@create_component_from_func\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\n위의 스크립트를 실행하면 다음과 같은 `train_from_csv.yaml` 파일을 얻을 수 있습니다.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: model, type: dill}\n- {name: kernel, type: String}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n    
      clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --model\n    - {inputPath: model}\n    - --kernel\n    - {inputValue: kernel}\n```\n\n앞서 [Basic Usage Component](../kubeflow/basic-component.md#convert-to-kubeflow-format)에서 설명한 내용에 따르면 이 컴포넌트는 다음과 같이 실행됩니다.\n\n1. `docker pull python:3.7`\n2. run `command`\n\n하지만 위에서 생성된 컴포넌트를 실행하면 오류가 발생하게 됩니다.  \n그 이유는 컴포넌트 래퍼가 실행되는 방식에 있습니다.  \nKubeflow는 쿠버네티스를 이용하기 때문에 컴포넌트 래퍼는 각각 독립된 컨테이너 위에서 컴포넌트 콘텐츠를 실행합니다.\n\n자세히 보면 생성된 만든 `train_from_csv.yaml` 에서 정해진 이미지는  `image: python:3.7` 입니다.\n\n이제 어떤 이유 때문에 실행이 안 되는지 눈치채신 분들도 있을 것입니다.\n\n`python:3.7` 이미지에는 우리가 사용하고자 하는 `dill`, `pandas`, `sklearn` 이 설치되어 있지 않습니다.  \n그러므로 실행할 때 해당 패키지가 존재하지 않는다는 에러와 함께 실행이 안 됩니다.\n\n그럼 어떻게 패키지를 추가할 수 있을까요?\n\n## 패키지 추가 방법\n\nKubeflow를 변환하는 과정에서 두 가지 방법을 통해 패키지를 추가할 수 있습니다.\n\n1. `base_image` 사용\n2. 
`packages_to_install` 사용\n\n컴포넌트를 컴파일할 때 사용했던 함수 `create_component_from_func` 가 어떤 argument들을 받을 수 있는지 확인해 보겠습니다.\n\n```python\ndef create_component_from_func(\n    func: Callable,\n    output_component_file: Optional[str] = None,\n    base_image: Optional[str] = None,\n    packages_to_install: List[str] = None,\n    annotations: Optional[Mapping[str, str]] = None,\n):\n```\n\n- `func`: 컴포넌트로 만들 컴포넌트 래퍼 함수\n- `base_image`: 컴포넌트 래퍼가 실행할 이미지\n- `packages_to_install`: 컴포넌트에서 사용하기 때문에 추가로 설치해야 하는 패키지\n\n### 1. base_image\n\n컴포넌트가 실행되는 순서를 좀 더 자세히 들여다보면 다음과 같습니다.\n\n1. `docker pull base_image`\n2. `pip install packages_to_install`\n3. run `command`\n\n만약 컴포넌트가 사용하는 base_image에 패키지들이 전부 설치되어 있다면 추가적인 패키지 설치 없이 바로 사용할 수 있습니다.\n\n예를 들어, 이번 페이지에서는 다음과 같은 Dockerfile을 작성하겠습니다.\n\n```dockerfile\nFROM python:3.7\n\nRUN pip install dill pandas scikit-learn\n```\n\n위의 Dockerfile을 이용해 이미지를 빌드해 보겠습니다. 실습에서 사용해볼 이미지 저장소(registry)는 ghcr입니다.  \n각자 환경에 맞추어서 이미지 저장소를 선택 후 업로드하면 됩니다.\n\n```bash\ndocker build . -f Dockerfile -t ghcr.io/mlops-for-all/base-image\ndocker push ghcr.io/mlops-for-all/base-image\n```\n\n이제 base_image를 입력해 보겠습니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    base_image=\"ghcr.io/mlops-for-all/base-image:latest\",\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\n이제 생성된 컴포넌트를 컴파일하면 다음과 같이 
나옵니다.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: ghcr.io/mlops-for-all/base-image:latest\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def _make_parent_dirs_and_return_path(file_path: str):\n          import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - 
--kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\nbase_image가 우리가 설정한 값으로 바뀐 것을 확인할 수 있습니다.\n\n### 2. packages_to_install\n\n하지만 패키지가 추가될 때마다 docker 이미지를 계속해서 새로 생성하는 작업은 많은 시간이 소요됩니다.\n이 때, `packages_to_install` argument 를 사용하면 패키지를 컨테이너에 쉽게 추가할 수 있습니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill==0.3.4\", \"pandas==1.3.4\", \"scikit-learn==1.0.1\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n\n    from sklearn.svm import SVC\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\nif __name__ == \"__main__\":\n    train_from_csv.component_spec.save(\"train_from_csv.yaml\")\n```\n\n스크립트를 실행하면 다음과 같은 `train_from_csv.yaml` 파일이 생성됩니다.\n\n```bash\nname: Train from csv\ninputs:\n- {name: train_data, type: csv}\n- {name: train_target, type: csv}\n- {name: kernel, type: String}\noutputs:\n- {name: model, type: dill}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def 
_make_parent_dirs_and_return_path(file_path: str):\n          import os\n          os.makedirs(os.path.dirname(file_path), exist_ok=True)\n          return file_path\n\n      def train_from_csv(\n          train_data_path,\n          train_target_path,\n          model_path,\n          kernel,\n      ):\n          import dill\n          import pandas as pd\n\n          from sklearn.svm import SVC\n\n          train_data = pd.read_csv(train_data_path)\n          train_target = pd.read_csv(train_target_path)\n\n          clf = SVC(kernel=kernel)\n          clf.fit(train_data, train_target)\n\n          with open(model_path, mode=\"wb\") as file_writer:\n              dill.dump(clf, file_writer)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n      _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n      _parsed_args = vars(_parser.parse_args())\n\n      _outputs = train_from_csv(**_parsed_args)\n    args:\n    - --train-data\n    - {inputPath: train_data}\n    - --train-target\n    - {inputPath: train_target}\n    - --kernel\n    - {inputValue: kernel}\n    - --model\n    - {outputPath: model}\n```\n\n위에 작성한 컴포넌트가 실행되는 순서를 좀 더 자세히 들여다보면 다음과 같습니다.\n\n1. `docker pull python:3.7`\n2. `pip install dill==0.3.4 pandas==1.3.4 scikit-learn==1.0.1`\n3. 
run `command`\n\n생성된 yaml 파일을 자세히 보면, 다음과 같은 줄이 자동으로 추가되어 필요한 패키지가 설치되기 때문에 오류 없이 정상적으로 실행됩니다.\n\n```bash\n    command:\n    - sh\n    - -c\n    - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n      'dill==0.3.4' 'pandas==1.3.4' 'scikit-learn==1.0.1' || PIP_DISABLE_PIP_VERSION_CHECK=1\n      python3 -m pip install --quiet --no-warn-script-location 'dill==0.3.4' 'pandas==1.3.4'\n      'scikit-learn==1.0.1' --user) && \"$0\" \"$@\"\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/advanced-mlflow.md",
    "content": "---\ntitle : \"12. Component - MLFlow\"\ndescription: \"\"\nsidebar_position: 12\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## MLFlow Component\n\n[Advanced Usage Component](../kubeflow/advanced-component.md) 에서 학습한 모델이 API Deployment까지 이어지기 위해서는 MLFlow에 모델을 저장해야 합니다.\n\n이번 페이지에서는 MLFlow에 모델을 저장할 수 있는 컴포넌트를 작성하는 과정을 설명합니다.\n\n## MLFlow in Local\n\nMLFlow에서 모델을 저장하고 서빙에서 사용하기 위해서는 다음의 항목들이 필요합니다.\n\n- model\n- signature\n- input_example\n- conda_env\n\n파이썬 코드를 통해서 MLFLow에 모델을 저장하는 과정에 대해서 알아보겠습니다.\n\n### 1. 모델 학습\n\n아래 과정은 iris 데이터를 이용해 SVC 모델을 학습하는 과정입니다.\n\n```python\nimport pandas as pd\nfrom sklearn.datasets import load_iris\nfrom sklearn.svm import SVC\n\niris = load_iris()\n\ndata = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\ntarget = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\nclf = SVC(kernel=\"rbf\")\nclf.fit(data, target)\n\n```\n\n### 2. MLFLow Infos\n\nmlflow에 필요한 정보들을 만드는 과정입니다.\n\n```python\nfrom mlflow.models.signature import infer_signature\nfrom mlflow.utils.environment import _mlflow_conda_env\n\ninput_example = data.sample(1)\nsignature = infer_signature(data, clf.predict(data))\nconda_env = _mlflow_conda_env(additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"])\n```\n\n각 변수의 내용을 확인하면 다음과 같습니다.\n\n- `input_example`\n\n    | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |\n    | --- | --- | --- | --- |\n    | 6.5 | 6.7 | 3.1 | 4.4 |\n\n- `signature`\n\n    ```python\n    inputs:\n      ['sepal length (cm)': double, 'sepal width (cm)': double, 'petal length (cm)': double, 'petal width (cm)': double]\n    outputs:\n      [Tensor('int64', (-1,))]\n    ```\n\n- `conda_env`\n\n    ```python\n    {'name': 'mlflow-env',\n     'channels': ['conda-forge'],\n     'dependencies': ['python=3.8.10',\n      'pip',\n      {'pip': ['mlflow', 'dill', 'pandas', 'scikit-learn']}]}\n    ```\n\n### 3. 
Save MLFLow Infos\n\n다음으로 학습한 정보들과 모델을 저장합니다.\n학습한 모델이 sklearn 패키지를 이용하기 때문에 `mlflow.sklearn` 을 이용하면 쉽게 모델을 저장할 수 있습니다.\n\n```python\nfrom mlflow.sklearn import save_model\n\nsave_model(\n    sk_model=clf,\n    path=\"svc\",\n    serialization_format=\"cloudpickle\",\n    conda_env=conda_env,\n    signature=signature,\n    input_example=input_example,\n)\n```\n\n로컬에서 작업하면 다음과 같은 svc 폴더가 생기며 아래와 같은 파일들이 생성됩니다.\n\n```bash\nls svc\n```\n\n위의 명령어를 실행하면 다음의 출력값을 확인할 수 있습니다.\n\n```bash\nMLmodel            conda.yaml         input_example.json model.pkl          requirements.txt\n```\n\n각 파일을 확인하면 다음과 같습니다.\n\n- MLmodel\n\n    ```bash\n    flavors:\n      python_function:\n        env: conda.yaml\n        loader_module: mlflow.sklearn\n        model_path: model.pkl\n        python_version: 3.8.10\n      sklearn:\n        pickled_model: model.pkl\n        serialization_format: cloudpickle\n        sklearn_version: 1.0.1\n    saved_input_example_info:\n      artifact_path: input_example.json\n      pandas_orient: split\n      type: dataframe\n    signature:\n      inputs: '[{\"name\": \"sepal length (cm)\", \"type\": \"double\"}, {\"name\": \"sepal width\n        (cm)\", \"type\": \"double\"}, {\"name\": \"petal length (cm)\", \"type\": \"double\"}, {\"name\":\n        \"petal width (cm)\", \"type\": \"double\"}]'\n      outputs: '[{\"type\": \"tensor\", \"tensor-spec\": {\"dtype\": \"int64\", \"shape\": [-1]}}]'\n    utc_time_created: '2021-12-06 06:52:30.612810'\n    ```\n\n- conda.yaml\n\n    ```bash\n    channels:\n    - conda-forge\n    dependencies:\n    - python=3.8.10\n    - pip\n    - pip:\n      - mlflow\n      - dill\n      - pandas\n      - scikit-learn\n    name: mlflow-env\n    ```\n\n- input_example.json\n\n    ```bash\n    {\n        \"columns\": \n        [\n            \"sepal length (cm)\",\n            \"sepal width (cm)\",\n            \"petal length (cm)\",\n            \"petal width (cm)\"\n        ],\n        \"data\": \n        [\n            [6.7, 
3.1, 4.4, 1.4]\n        ]\n    }\n    ```\n\n- requirements.txt\n\n    ```bash\n    mlflow\n    dill\n    pandas\n    scikit-learn\n    ```\n\n- model.pkl\n\n## MLFlow on Server\n\n이제 저장된 모델을 mlflow 서버에 올리는 작업을 해보겠습니다.\n\n```python\nimport mlflow\n\nwith mlflow.start_run():\n    mlflow.log_artifact(\"svc/\")\n```\n\n저장하고 `mlruns` 가 생성된 경로에서 `mlflow ui` 명령어를 이용해 mlflow 서버와 대시보드를 띄웁니다.\nmlflow 대시보드에 접속하여 생성된 run을 클릭하면 다음과 같이 보입니다.\n\n![mlflow-0.png](./img/mlflow-0.png)\n(해당 화면은 mlflow 버전에 따라 다를 수 있습니다.)\n\n## MLFlow Component\n\n이제 Kubeflow에서 재사용할 수 있는 컴포넌트를 작성해 보겠습니다.\n\n재사용할 수 있는 컴포넌트를 작성하는 방법은 크게 3가지가 있습니다.\n\n1. 모델을 학습하는 컴포넌트에서 필요한 환경을 저장 후 MLFlow 컴포넌트는 업로드만 담당\n\n    ![mlflow-1.png](./img/mlflow-1.png)\n\n2. 학습된 모델과 데이터를 MLFlow 컴포넌트에 전달 후 컴포넌트에서 저장과 업로드 담당\n\n    ![mlflow-2.png](./img/mlflow-2.png)\n\n3. 모델을 학습하는 컴포넌트에서 저장과 업로드를 담당\n\n    ![mlflow-3.png](./img/mlflow-3.png)\n\n저희는 이 중 1번의 접근 방법을 통해 모델을 관리하려고 합니다.\n이유는 MLFlow 모델을 업로드하는 코드는 바뀌지 않기 때문에 매번 3번처럼 컴포넌트 작성마다 작성할 필요는 없기 때문입니다.\n\n컴포넌트를 재활용하는 방법은 1번과 2번의 방법으로 가능합니다.\n다만 2번의 경우 모델이 학습된 이미지와 패키지들을 전달해야 하므로 결국 컴포넌트에 대한 추가 정보를 전달해야 합니다.\n\n1번의 방법으로 진행하기 위해서는 학습하는 컴포넌트 또한 변경되어야 합니다.\n모델을 저장하는데 필요한 환경들을 저장해주는 코드가 추가되어야 합니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    
train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n```\n\n그리고 MLFlow에 업로드하는 컴포넌트를 작성합니다.\n이 때 업로드되는 MLflow의 endpoint를 우리가 설치한 [mlflow service](../setup-components/install-components-mlflow.md) 로 이어지게 설정해주어야 합니다.  \n이 때 S3 Endpoint의 주소는 MLflow Server 설치 당시 설치한 minio의 [쿠버네티스 서비스 DNS 네임을 활용](https://kubernetes.io/ko/docs/concepts/services-networking/dns-pod-service/)합니다. 해당 service 는 kubeflow namespace에서 minio-service라는 이름으로 생성되었으므로, `http://minio-service.kubeflow.svc:9000` 로 설정합니다.  
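\n\n여기서 사용하는 쿠버네티스 서비스 DNS 네임은 `<서비스 이름>.<네임스페이스>.svc:<포트>` 형식을 따릅니다. 아래는 클러스터에 접근할 수 있는 `kubectl` 환경을 가정하고, 주소를 하드코딩하기 전에 실제 서비스 이름과 포트를 확인해 보는 예시입니다.\n\n```bash\n# kubeflow 네임스페이스의 minio 서비스 확인 (S3 Endpoint 주소에 사용)\nkubectl get svc -n kubeflow minio-service\n\n# mlflow-system 네임스페이스의 mlflow server 서비스 확인 (tracking_uri 주소에 사용)\nkubectl get svc -n mlflow-system mlflow-server-service\n```\n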
\n이와 비슷하게 tracking_uri의 주소는 mlflow server의 쿠버네티스 서비스 DNS 네임을 활용하여, `http://mlflow-server-service.mlflow-system.svc:5000` 로 설정합니다.\n\n```python\nfrom functools import partial\nfrom kfp.components import InputPath, create_component_from_func\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n```\n\n## MLFlow Pipeline\n\n이제 작성한 컴포넌트들을 연결해서 파이프라인으로 만들어 보겠습니다.\n\n### Data Component\n\n모델을 학습할 때 쓸 데이터는 sklearn의 iris 입니다.\n데이터를 생성하는 컴포넌트를 작성합니다.\n\n```python\nfrom functools import partial\n\nfrom kfp.components import InputPath, 
OutputPath, create_component_from_func\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n```\n\n### Pipeline\n\n파이프라인 코드는 다음과 같이 작성할 수 있습니다.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n```\n\n### Run\n\n위에서 작성된 컴포넌트와 파이프라인을 하나의 파이썬 파일에 정리하면 다음과 같습니다.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, 
index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\", \"boto3\"],\n)\ndef upload_sklearn_model_to_mlflow(\n    model_name: str,\n    model_path: InputPath(\"dill\"),\n    input_example_path: InputPath(\"dill\"),\n    signature_path: InputPath(\"dill\"),\n    conda_env_path: InputPath(\"dill\"),\n):\n    import os\n    import dill\n    from mlflow.sklearn import save_model\n    \n    from mlflow.tracking.client import MlflowClient\n\n    os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = 
\"http://minio-service.kubeflow.svc:9000\"\n    os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n    client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n    with open(model_path, mode=\"rb\") as file_reader:\n        clf = dill.load(file_reader)\n\n    with open(input_example_path, \"rb\") as file_reader:\n        input_example = dill.load(file_reader)\n\n    with open(signature_path, \"rb\") as file_reader:\n        signature = dill.load(file_reader)\n\n    with open(conda_env_path, \"rb\") as file_reader:\n        conda_env = dill.load(file_reader)\n\n    save_model(\n        sk_model=clf,\n        path=model_name,\n        serialization_format=\"cloudpickle\",\n        conda_env=conda_env,\n        signature=signature,\n        input_example=input_example,\n    )\n    run = client.create_run(experiment_id=\"0\")\n    client.log_artifact(run.info.run_id, model_name)\n\n\n@pipeline(name=\"mlflow_pipeline\")\ndef mlflow_pipeline(kernel: str, model_name: str):\n    iris_data = load_iris_data()\n    model = train_from_csv(\n        train_data=iris_data.outputs[\"data\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n    _ = upload_sklearn_model_to_mlflow(\n        model_name=model_name,\n        model=model.outputs[\"model\"],\n        input_example=model.outputs[\"input_example\"],\n        signature=model.outputs[\"signature\"],\n        conda_env=model.outputs[\"conda_env\"],\n    )\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(mlflow_pipeline, \"mlflow_pipeline.yaml\")\n```\n\n<p>\n  <details>\n    <summary>mlflow_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: mlflow-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-19T14:14:11.999807',\n    
pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"kernel\", \"type\":\n      \"String\"}, {\"name\": \"model_name\", \"type\": \"String\"}], \"name\": \"mlflow_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10}\nspec:\n  entrypoint: mlflow-pipeline\n  templates:\n  - name: load-iris-data\n    container:\n      args: [--data, /tmp/outputs/data/data, --target, /tmp/outputs/target/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'pandas' 'scikit-learn' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n        install --quiet --no-warn-script-location 'pandas' 'scikit-learn' --user)\n        && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def load_iris_data(\n            data_path,\n            target_path,\n        ):\n            import pandas as pd\n            from sklearn.datasets import load_iris\n\n            iris = load_iris()\n\n            data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n            target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n\n            data.to_csv(data_path, index=False)\n            target.to_csv(target_path, index=False)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Load iris data', description='')\n        _parser.add_argument(\"--data\", dest=\"data_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--target\", dest=\"target_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n       
 _parsed_args = vars(_parser.parse_args())\n\n        _outputs = load_iris_data(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/outputs/data/data}\n      - {name: load-iris-data-target, path: /tmp/outputs/target/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--data\", {\"outputPath\": \"data\"}, \"--target\", {\"outputPath\": \"target\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''pandas'' ''scikit-learn'' ||\n          PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n          ''pandas'' ''scikit-learn'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef load_iris_data(\\n    data_path,\\n    target_path,\\n):\\n    import\n          pandas as pd\\n    from sklearn.datasets import load_iris\\n\\n    iris = load_iris()\\n\\n    data\n          = pd.DataFrame(iris[\\\"data\\\"], columns=iris[\\\"feature_names\\\"])\\n    target\n          = pd.DataFrame(iris[\\\"target\\\"], columns=[\\\"target\\\"])\\n\\n    data.to_csv(data_path,\n          index=False)\\n    target.to_csv(target_path, index=False)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Load iris data'', description='''')\\n_parser.add_argument(\\\"--data\\\",\n   
       dest=\\\"data_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--target\\\", dest=\\\"target_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = load_iris_data(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"name\": \"Load iris data\", \"outputs\": [{\"name\":\n          \"data\", \"type\": \"csv\"}, {\"name\": \"target\", \"type\": \"csv\"}]}', pipelines.kubeflow.org/component_ref: '{}'}\n  - name: mlflow-pipeline\n    inputs:\n      parameters:\n      - {name: kernel}\n      - {name: model_name}\n    dag:\n      tasks:\n      - {name: load-iris-data, template: load-iris-data}\n      - name: train-from-csv\n        template: train-from-csv\n        dependencies: [load-iris-data]\n        arguments:\n          parameters:\n          - {name: kernel, value: '{{inputs.parameters.kernel}}'}\n          artifacts:\n          - {name: load-iris-data-data, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-data}}'}\n          - {name: load-iris-data-target, from: '{{tasks.load-iris-data.outputs.artifacts.load-iris-data-target}}'}\n      - name: upload-sklearn-model-to-mlflow\n        template: upload-sklearn-model-to-mlflow\n        dependencies: [train-from-csv]\n        arguments:\n          parameters:\n          - {name: model_name, value: '{{inputs.parameters.model_name}}'}\n          artifacts:\n          - {name: train-from-csv-conda_env, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-conda_env}}'}\n          - {name: train-from-csv-input_example, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-input_example}}'}\n          - {name: train-from-csv-model, from: '{{tasks.train-from-csv.outputs.artifacts.train-from-csv-model}}'}\n          - {name: train-from-csv-signature, from: 
'{{tasks.train-from-csv.outputs.artifacts.train-from-csv-signature}}'}\n  - name: train-from-csv\n    container:\n      args: [--train-data, /tmp/inputs/train_data/data, --train-target, /tmp/inputs/train_target/data,\n        --kernel, '{{inputs.parameters.kernel}}', --model, /tmp/outputs/model/data,\n        --input-example, /tmp/outputs/input_example/data, --signature, /tmp/outputs/signature/data,\n        --conda-env, /tmp/outputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n\n        def train_from_csv(\n            train_data_path,\n            train_target_path,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n            kernel,\n        ):\n            import dill\n            import pandas as pd\n            from sklearn.svm import SVC\n\n            from mlflow.models.signature import infer_signature\n            from mlflow.utils.environment import _mlflow_conda_env\n\n            train_data = pd.read_csv(train_data_path)\n            train_target = pd.read_csv(train_target_path)\n\n            clf = SVC(kernel=kernel)\n            clf.fit(train_data, train_target)\n\n            with open(model_path, mode=\"wb\") as file_writer:\n                dill.dump(clf, file_writer)\n\n            input_example = 
train_data.sample(1)\n            with open(input_example_path, \"wb\") as file_writer:\n                dill.dump(input_example, file_writer)\n\n            signature = infer_signature(train_data, clf.predict(train_data))\n            with open(signature_path, \"wb\") as file_writer:\n                dill.dump(signature, file_writer)\n\n            conda_env = _mlflow_conda_env(\n                additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n            )\n            with open(conda_env_path, \"wb\") as file_writer:\n                dill.dump(conda_env, file_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Train from csv', description='')\n        _parser.add_argument(\"--train-data\", dest=\"train_data_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--train-target\", dest=\"train_target_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--kernel\", dest=\"kernel\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", dest=\"input_example_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = train_from_csv(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: kernel}\n      artifacts:\n      - {name: load-iris-data-data, path: /tmp/inputs/train_data/data}\n      - {name: load-iris-data-target, 
path: /tmp/inputs/train_target/data}\n    outputs:\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/outputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/outputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/outputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/outputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--train-data\", {\"inputPath\": \"train_data\"}, \"--train-target\",\n          {\"inputPath\": \"train_target\"}, \"--kernel\", {\"inputValue\": \"kernel\"}, \"--model\",\n          {\"outputPath\": \"model\"}, \"--input-example\", {\"outputPath\": \"input_example\"},\n          \"--signature\", {\"outputPath\": \"signature\"}, \"--conda-env\", {\"outputPath\":\n          \"conda_env\"}], \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''dill'' ''pandas''\n          ''scikit-learn'' ''mlflow'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m\n          pip install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef train_from_csv(\\n    train_data_path,\\n    train_target_path,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n    
kernel,\\n):\\n    import\n          dill\\n    import pandas as pd\\n    from sklearn.svm import SVC\\n\\n    from\n          mlflow.models.signature import infer_signature\\n    from mlflow.utils.environment\n          import _mlflow_conda_env\\n\\n    train_data = pd.read_csv(train_data_path)\\n    train_target\n          = pd.read_csv(train_target_path)\\n\\n    clf = SVC(kernel=kernel)\\n    clf.fit(train_data,\n          train_target)\\n\\n    with open(model_path, mode=\\\"wb\\\") as file_writer:\\n        dill.dump(clf,\n          file_writer)\\n\\n    input_example = train_data.sample(1)\\n    with open(input_example_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(input_example, file_writer)\\n\\n    signature\n          = infer_signature(train_data, clf.predict(train_data))\\n    with open(signature_path,\n          \\\"wb\\\") as file_writer:\\n        dill.dump(signature, file_writer)\\n\\n    conda_env\n          = _mlflow_conda_env(\\n        additional_pip_deps=[\\\"dill\\\", \\\"pandas\\\",\n          \\\"scikit-learn\\\"]\\n    )\\n    with open(conda_env_path, \\\"wb\\\") as file_writer:\\n        dill.dump(conda_env,\n          file_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Train\n          from csv'', description='''')\\n_parser.add_argument(\\\"--train-data\\\", dest=\\\"train_data_path\\\",\n          type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--train-target\\\",\n          dest=\\\"train_target_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--kernel\\\",\n          dest=\\\"kernel\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\", dest=\\\"input_example_path\\\",\n          
type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=_make_parent_dirs_and_return_path, required=True,\n          default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\", dest=\\\"conda_env_path\\\",\n          type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = train_from_csv(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"train_data\", \"type\": \"csv\"},\n          {\"name\": \"train_target\", \"type\": \"csv\"}, {\"name\": \"kernel\", \"type\": \"String\"}],\n          \"name\": \"Train from csv\", \"outputs\": [{\"name\": \"model\", \"type\": \"dill\"},\n          {\"name\": \"input_example\", \"type\": \"dill\"}, {\"name\": \"signature\", \"type\":\n          \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}]}', pipelines.kubeflow.org/component_ref: '{}',\n        pipelines.kubeflow.org/arguments.parameters: '{\"kernel\": \"{{inputs.parameters.kernel}}\"}'}\n  - name: upload-sklearn-model-to-mlflow\n    container:\n      args: [--model-name, '{{inputs.parameters.model_name}}', --model, /tmp/inputs/model/data,\n        --input-example, /tmp/inputs/input_example/data, --signature, /tmp/inputs/signature/data,\n        --conda-env, /tmp/inputs/conda_env/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'dill' 'pandas' 'scikit-learn' 'mlflow' 'boto3' || PIP_DISABLE_PIP_VERSION_CHECK=1\n        python3 -m pip install --quiet --no-warn-script-location 'dill' 'pandas' 'scikit-learn'\n        'mlflow' 'boto3' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" 
\"$@\"\n      - |\n        def upload_sklearn_model_to_mlflow(\n            model_name,\n            model_path,\n            input_example_path,\n            signature_path,\n            conda_env_path,\n        ):\n            import os\n            import dill\n            from mlflow.sklearn import save_model\n\n            from mlflow.tracking.client import MlflowClient\n\n            os.environ[\"MLFLOW_S3_ENDPOINT_URL\"] = \"http://minio-service.kubeflow.svc:9000\"\n            os.environ[\"AWS_ACCESS_KEY_ID\"] = \"minio\"\n            os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"minio123\"\n\n            client = MlflowClient(\"http://mlflow-server-service.mlflow-system.svc:5000\")\n\n            with open(model_path, mode=\"rb\") as file_reader:\n                clf = dill.load(file_reader)\n\n            with open(input_example_path, \"rb\") as file_reader:\n                input_example = dill.load(file_reader)\n\n            with open(signature_path, \"rb\") as file_reader:\n                signature = dill.load(file_reader)\n\n            with open(conda_env_path, \"rb\") as file_reader:\n                conda_env = dill.load(file_reader)\n\n            save_model(\n                sk_model=clf,\n                path=model_name,\n                serialization_format=\"cloudpickle\",\n                conda_env=conda_env,\n                signature=signature,\n                input_example=input_example,\n            )\n            run = client.create_run(experiment_id=\"0\")\n            client.log_artifact(run.info.run_id, model_name)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Upload sklearn model to mlflow', description='')\n        _parser.add_argument(\"--model-name\", dest=\"model_name\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--model\", dest=\"model_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--input-example\", 
dest=\"input_example_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--signature\", dest=\"signature_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--conda-env\", dest=\"conda_env_path\", type=str, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: model_name}\n      artifacts:\n      - {name: train-from-csv-conda_env, path: /tmp/inputs/conda_env/data}\n      - {name: train-from-csv-input_example, path: /tmp/inputs/input_example/data}\n      - {name: train-from-csv-model, path: /tmp/inputs/model/data}\n      - {name: train-from-csv-signature, path: /tmp/inputs/signature/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--model-name\", {\"inputValue\": \"model_name\"}, \"--model\", {\"inputPath\":\n          \"model\"}, \"--input-example\", {\"inputPath\": \"input_example\"}, \"--signature\",\n          {\"inputPath\": \"signature\"}, \"--conda-env\", {\"inputPath\": \"conda_env\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn''\n          ''mlflow'' ''boto3'' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install\n          --quiet --no-warn-script-location ''dill'' ''pandas'' ''scikit-learn'' ''mlflow''\n          ''boto3'' --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u 
\\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def upload_sklearn_model_to_mlflow(\\n    model_name,\\n    model_path,\\n    input_example_path,\\n    signature_path,\\n    conda_env_path,\\n):\\n    import\n          os\\n    import dill\\n    from mlflow.sklearn import save_model\\n\\n    from\n          mlflow.tracking.client import MlflowClient\\n\\n    os.environ[\\\"MLFLOW_S3_ENDPOINT_URL\\\"]\n          = \\\"http://minio-service.kubeflow.svc:9000\\\"\\n    os.environ[\\\"AWS_ACCESS_KEY_ID\\\"]\n          = \\\"minio\\\"\\n    os.environ[\\\"AWS_SECRET_ACCESS_KEY\\\"] = \\\"minio123\\\"\\n\\n    client\n          = MlflowClient(\\\"http://mlflow-server-service.mlflow-system.svc:5000\\\")\\n\\n    with\n          open(model_path, mode=\\\"rb\\\") as file_reader:\\n        clf = dill.load(file_reader)\\n\\n    with\n          open(input_example_path, \\\"rb\\\") as file_reader:\\n        input_example\n          = dill.load(file_reader)\\n\\n    with open(signature_path, \\\"rb\\\") as file_reader:\\n        signature\n          = dill.load(file_reader)\\n\\n    with open(conda_env_path, \\\"rb\\\") as file_reader:\\n        conda_env\n          = dill.load(file_reader)\\n\\n    save_model(\\n        sk_model=clf,\\n        path=model_name,\\n        serialization_format=\\\"cloudpickle\\\",\\n        conda_env=conda_env,\\n        signature=signature,\\n        input_example=input_example,\\n    )\\n    run\n          = client.create_run(experiment_id=\\\"0\\\")\\n    client.log_artifact(run.info.run_id,\n          model_name)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Upload\n          sklearn model to mlflow'', description='''')\\n_parser.add_argument(\\\"--model-name\\\",\n          dest=\\\"model_name\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--model\\\",\n          dest=\\\"model_path\\\", type=str, required=True, 
default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--input-example\\\",\n          dest=\\\"input_example_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--signature\\\",\n          dest=\\\"signature_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--conda-env\\\",\n          dest=\\\"conda_env_path\\\", type=str, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = upload_sklearn_model_to_mlflow(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"model_name\", \"type\": \"String\"},\n          {\"name\": \"model\", \"type\": \"dill\"}, {\"name\": \"input_example\", \"type\": \"dill\"},\n          {\"name\": \"signature\", \"type\": \"dill\"}, {\"name\": \"conda_env\", \"type\": \"dill\"}],\n          \"name\": \"Upload sklearn model to mlflow\"}', pipelines.kubeflow.org/component_ref: '{}',\n        pipelines.kubeflow.org/arguments.parameters: '{\"model_name\": \"{{inputs.parameters.model_name}}\"}'}\n  arguments:\n    parameters:\n    - {name: kernel}\n    - {name: model_name}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nAfter running the script, upload the generated mlflow_pipeline.yaml file as a pipeline, then create a run and check its result.\n\n![mlflow-svc-0](./img/mlflow-svc-0.png)\n\nPort-forward the MLflow service to access the MLflow UI.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\nOpen a web browser and go to localhost:5000; you can see that a run has been created, as shown below.\n\n![mlflow-svc-1](./img/mlflow-svc-1.png)\n\nClick the run to confirm that it contains the trained model file.\n\n![mlflow-svc-2](./img/mlflow-svc-2.png)\n
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/advanced-pipeline.md",
    "content": "---\ntitle : \"10. Pipeline - Setting\"\ndescription: \"\"\nsidebar_position: 10\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline Setting\n\nThis page covers the values you can configure on a pipeline.\n\n## Display Name\n\nWithin a generated pipeline, each component has two names.\n\n- task_name: the name of the function used to write the component\n- display_name: the name shown in the Kubeflow UI\n\nFor example, in the following case both components are labeled Print and return number, so it is hard to tell which component is number 1 and which is number 2.\n\n![run-7](./img/run-7.png)\n\n### set_display_name\n\nThis is exactly what display_name is for.  \nTo set it, call the `set_display_name` [attribute](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ContainerOp.set_display_name) on a component in the pipeline as follows.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nRunning this script produces the following `example_pipeline.yaml`.\n\n<p>\n  <details>\n    <summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: 
'2021-12-09T18:11:43.193190',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n       
         raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is number 1, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return 
number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', 
/tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(\n                    str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      annotations: 
{pipelines.kubeflow.org/task_display_name: This is number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\",\n          {\"outputPath\": \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(\\n            str(int_value),\n          str(type(int_value))))\\n    return str(int_value)\\n\\nimport argparse\\n_parser\n          = argparse.ArgumentParser(prog=''Print and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": 
\"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      annotations: {pipelines.kubeflow.org/task_display_name: This is sum of number\n          1 and number 2, pipelines.kubeflow.org/component_spec: '{\"implementation\":\n          {\"container\": {\"args\": [\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\",\n          {\"inputValue\": \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" 
\\\"$@\\\"\\n\",\n          \"def sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\nCompared with the previous file, a new `pipelines.kubeflow.org/task_display_name` key has been created.\n\n### UI in Kubeflow\n\nUsing the file created above, let's upload a new version of the [pipeline](../kubeflow/basic-pipeline-upload.md#upload-pipeline-version) created earlier.\n\n![adv-pipeline-0.png](./img/adv-pipeline-0.png)\n\nAs shown above, the configured display names are now visible.\n\n## Resources\n\n### GPU\n\nUnless configured otherwise, a pipeline runs each component as a Kubernetes pod with the default resource spec.  \nSo when a model needs a GPU for training, the pod is not allocated one on Kubernetes and training does not work properly.  
\nTo fix this, you can request a GPU with the `set_gpu_limit()` [attribute](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.UserContainer.set_gpu_limit).\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1)\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nIf you run the script above and look closely at `sum-and-print-numbers` in the generated file, you can see that `{nvidia.com/gpu: 1}` has been added under resources.\nThis is how the component gets a GPU allocated.\n\n```bash\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n        '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, 
default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n      resources:\n        limits: {nvidia.com/gpu: 1}\n```\n\n### CPU\n\nThe number of CPUs can be set with the `.set_cpu_limit()` [attribute](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_cpu_limit).  \nUnlike the GPU setting, the value must be passed as a string rather than an int.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_cpu_limit(\"16\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nOnly the changed part is shown below.\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, cpu: '16'}\n```\n\n### Memory\n\nMemory can be set with the `.set_memory_limit()` [attribute](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html?highlight=set_gpu_limit#kfp.dsl.Sidecar.set_memory_limit).\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import 
pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1).set_display_name(\"This is number 1\")\n    number_2_result = print_and_return_number(number_2).set_display_name(\"This is number 2\")\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    ).set_display_name(\"This is sum of number 1 and number 2\").set_gpu_limit(1).set_memory_limit(\"1G\")\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\nOnly the changed part is shown below.\n\n```bash\n      resources:\n        limits: {nvidia.com/gpu: 1, memory: 1G}\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/advanced-run.md",
    "content": "---\ntitle : \"11. Pipeline - Run Result\"\ndescription: \"\"\nsidebar_position: 11\ncontributors: [\"Jongseob Jeon\", \"SeungTae Kim\"]\n---\n\n## Run Result\n\nClick a run's result and you will see three tabs: Graph, Run output, and Config.\n\n![advanced-run-0.png](./img/advanced-run-0.png)\n\n## Graph\n\n![advanced-run-1.png](./img/advanced-run-1.png)\n\nIn the graph, click an executed component to see its execution details.\n\n### Input/Output\n\nThe Input/Output tab lets you inspect and download the configs the component used as well as its input and output artifacts.\n\n### Logs\n\nThe Logs tab shows all stdout produced while the Python code runs.\nHowever, because the pod is deleted after a certain period, the logs eventually disappear from this tab.\nIn that case, you can still find them in the main-logs output artifact.\n\n### Visualizations\n\nThe Visualizations tab shows plots generated by a component.\n\nTo create a plot, save the value you want to display through the `mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")` argument. The plot must be in HTML format.\nThe conversion looks like this.\n\n```python\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(\n    mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")\n):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n```\n\nWritten as a pipeline, it looks like this.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    
packages_to_install=[\"matplotlib\"],\n)\ndef plot_linear(mlpipeline_ui_metadata: OutputPath(\"UI_Metadata\")):\n    import base64\n    import json\n    from io import BytesIO\n\n    import matplotlib.pyplot as plt\n\n    plt.plot([1, 2, 3], [1, 2, 3])\n\n    tmpfile = BytesIO()\n    plt.savefig(tmpfile, format=\"png\")\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n\n    html = f\"<img src='data:image/png;base64,{encoded}'>\"\n    metadata = {\n        \"outputs\": [\n            {\n                \"type\": \"web-app\",\n                \"storage\": \"inline\",\n                \"source\": html,\n            },\n        ],\n    }\n    with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n        json.dump(metadata, html_writer)\n\n\n@pipeline(name=\"plot_pipeline\")\ndef plot_pipeline():\n    plot_linear()\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(plot_pipeline, \"plot_pipeline.yaml\")\n```\n\nRunning this script produces the following `plot_pipeline.yaml`.\n\n<p>\n  <details>\n    <summary>plot_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: plot-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-17T13:31:32.963214',\n    pipelines.kubeflow.org/pipeline_spec: '{\"name\": \"plot_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.9}\nspec:\n  entrypoint: plot-pipeline\n  templates:\n  - name: plot-linear\n    container:\n      args: [--mlpipeline-ui-metadata, /tmp/outputs/mlpipeline_ui_metadata/data]\n      command:\n      - sh\n      - -c\n      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location\n        'matplotlib' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet\n        --no-warn-script-location 'matplotlib' --user) && \"$0\" \"$@\"\n      - sh\n      - -ec\n      - |\n        
program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def _make_parent_dirs_and_return_path(file_path: str):\n            import os\n            os.makedirs(os.path.dirname(file_path), exist_ok=True)\n            return file_path\n        def plot_linear(mlpipeline_ui_metadata):\n            import base64\n            import json\n            from io import BytesIO\n            import matplotlib.pyplot as plt\n            plt.plot([1, 2, 3], [1, 2, 3])\n            tmpfile = BytesIO()\n            plt.savefig(tmpfile, format=\"png\")\n            encoded = base64.b64encode(tmpfile.getvalue()).decode(\"utf-8\")\n            html = f\"<img src='data:image/png;base64,{encoded}'>\"\n            metadata = {\n                \"outputs\": [\n                    {\n                        \"type\": \"web-app\",\n                        \"storage\": \"inline\",\n                        \"source\": html,\n                    },\n                ],\n            }\n            with open(mlpipeline_ui_metadata, \"w\") as html_writer:\n                json.dump(metadata, html_writer)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Plot linear', description='')\n        _parser.add_argument(\"--mlpipeline-ui-metadata\", dest=\"mlpipeline_ui_metadata\", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n        _outputs = plot_linear(**_parsed_args)\n      image: python:3.7\n    outputs:\n      artifacts:\n      - {name: mlpipeline-ui-metadata, path: /tmp/outputs/mlpipeline_ui_metadata/data}\n    metadata:\n      labels:\n        pipelines.kubeflow.org/kfp_sdk_version: 1.8.9\n        pipelines.kubeflow.org/pipeline-sdk-type: kfp\n        pipelines.kubeflow.org/enable_caching: \"true\"\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n       
   {\"args\": [\"--mlpipeline-ui-metadata\", {\"outputPath\": \"mlpipeline_ui_metadata\"}],\n          \"command\": [\"sh\", \"-c\", \"(PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip\n          install --quiet --no-warn-script-location ''matplotlib'' || PIP_DISABLE_PIP_VERSION_CHECK=1\n          python3 -m pip install --quiet --no-warn-script-location ''matplotlib''\n          --user) && \\\"$0\\\" \\\"$@\\\"\", \"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf\n          \\\"%s\\\" \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\",\n          \"def _make_parent_dirs_and_return_path(file_path: str):\\n    import os\\n    os.makedirs(os.path.dirname(file_path),\n          exist_ok=True)\\n    return file_path\\n\\ndef plot_linear(mlpipeline_ui_metadata):\\n    import\n          base64\\n    import json\\n    from io import BytesIO\\n\\n    import matplotlib.pyplot\n          as plt\\n\\n    plt.plot([1, 2, 3], [1, 2, 3])\\n\\n    tmpfile = BytesIO()\\n    plt.savefig(tmpfile,\n          format=\\\"png\\\")\\n    encoded = base64.b64encode(tmpfile.getvalue()).decode(\\\"utf-8\\\")\\n\\n    html\n          = f\\\"<img src=''data:image/png;base64,{encoded}''>\\\"\\n    metadata = {\\n        \\\"outputs\\\":\n          [\\n            {\\n                \\\"type\\\": \\\"web-app\\\",\\n                \\\"storage\\\":\n          \\\"inline\\\",\\n                \\\"source\\\": html,\\n            },\\n        ],\\n    }\\n    with\n          open(mlpipeline_ui_metadata, \\\"w\\\") as html_writer:\\n        json.dump(metadata,\n          html_writer)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Plot\n          linear'', description='''')\\n_parser.add_argument(\\\"--mlpipeline-ui-metadata\\\",\n          dest=\\\"mlpipeline_ui_metadata\\\", type=_make_parent_dirs_and_return_path,\n          required=True, default=argparse.SUPPRESS)\\n_parsed_args = vars(_parser.parse_args())\\n\\n_outputs\n          = 
plot_linear(**_parsed_args)\\n\"], \"image\": \"python:3.7\"}}, \"name\": \"Plot\n          linear\", \"outputs\": [{\"name\": \"mlpipeline_ui_metadata\", \"type\": \"UI_Metadata\"}]}',\n        pipelines.kubeflow.org/component_ref: '{}'}\n  - name: plot-pipeline\n    dag:\n      tasks:\n      - {name: plot-linear, template: plot-linear}\n  arguments:\n    parameters: []\n  serviceAccountName: pipeline-runner\n```\n\n  </details>\n</p>\n\n실행 후 Visualization을 클릭합니다.\n\n![advanced-run-5.png](./img/advanced-run-5.png)\n\n## Run output\n\n![advanced-run-2.png](./img/advanced-run-2.png)\n\nRun output은 kubeflow에서 지정한 형태로 생긴 Artifacts를 모아서 보여주는 곳이며 평가 지표(Metric)를 보여줍니다.\n\n평가 지표(Metric)을 보여주기 위해서는 `mlpipeline_metrics_path: OutputPath(\"Metrics\")` argument에 보여주고 싶은 이름과 값을 json 형태로 저장하면 됩니다.\n예를 들어서 다음과 같이 작성할 수 있습니다.\n\n```python\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n                \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n```\n\n평가 지표를 생성하는 컴포넌트를 [파이프라인](../kubeflow/basic-pipeline.md)에서 생성한 파이프라인에 추가 후 실행해 보겠습니다.\n전체 파이프라인은 다음과 같습니다.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func, OutputPath\nfrom kfp.dsl import pipeline\n\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int) -> int:\n    sum_number = number_1 + number_2\n    print(sum_number)\n    return sum_number\n\n@create_component_from_func\ndef show_metric_of_sum(\n    number: int,\n    mlpipeline_metrics_path: OutputPath(\"Metrics\"),\n  ):\n    import json\n    metrics = {\n        \"metrics\": [\n            {\n         
       \"name\": \"sum_value\",\n                \"numberValue\": number,\n            },\n        ],\n    }\n    with open(mlpipeline_metrics_path, \"w\") as f:\n        json.dump(metrics, f)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n    show_metric_of_sum(sum_result.output)\n\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\n실행 후 Run Output을 클릭하면 다음과 같이 나옵니다.\n\n![advanced-run-4.png](./img/advanced-run-4.png)\n\n## Config\n\n![advanced-run-3.png](./img/advanced-run-3.png)\n\nConfig에서는 파이프라인 Config로 입력받은 모든 값을 확인할 수 있습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/basic-component.md",
    "content": "---\ntitle : \"4. Component - Write\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\"]\n---\n\n\n## Component\n\n컴포넌트(Component)를 작성하기 위해서는 다음과 같은 내용을 작성해야 합니다.\n\n1. 컴포넌트 콘텐츠(Component Contents) 작성\n2. 컴포넌트 래퍼(Component Wrapper) 작성\n\n이제 각 과정에 대해서 알아보도록 하겠습니다.\n\n## Component Contents\n\n컴포넌트 콘텐츠는 우리가 흔히 작성하는 파이썬 코드와 다르지 않습니다.  \n예를 들어서 숫자를 입력으로 받고 입력받은 숫자를 출력한 뒤 반환하는 컴포넌트를 작성해 보겠습니다.  \n파이썬 코드로 작성하면 다음과 같이 작성할 수 있습니다.\n\n```python\nprint(number)\n```\n\n그런데 이 코드를 실행하면 에러가 나고 동작하지 않는데 그 이유는 출력해야 할 `number`가 정의되어 있지 않기 때문입니다.\n\n[Kubeflow Concepts](../kubeflow/kubeflow-concepts.md)에서 `number` 와 같이 컴포넌트 콘텐츠에서 필요한 값들은 **Config**로 정의한다고 했습니다. 컴포넌트 콘텐츠를 실행시키기 위해 필요한 Config들은 컴포넌트 래퍼에서 전달이 되어야 합니다.\n\n## Component Wrapper\n\n### Define a standalone Python function\n\n이제 필요한 Config를 전달할 수 있도록 컴포넌트 래퍼를 만들어야 합니다.\n\n별도의 Config 없이 컴포넌트 래퍼로 감쌀 경우 다음과 같이 됩니다.\n\n```python\ndef print_and_return_number():\n    print(number)\n    return number\n```\n\n이제 콘텐츠에서 필요한 Config를 래퍼의 argument로 추가합니다. 다만, argument 만을 적는 것이 아니라 argument의 타입 힌트도 작성해야 합니다. Kubeflow에서는 파이프라인을 Kubeflow 포맷으로 변환할 때, 컴포넌트 간의 연결에서 정해진 입력과 출력의 타입이 일치하는지 체크합니다. 만약 컴포넌트가 필요로 하는 입력과 다른 컴포넌트로부터 전달받은 출력의 포맷이 일치하지 않을 경우 파이프라인 생성을 할 수 없습니다.\n\n이제 다음과 같이 argument와 그 타입, 그리고 반환하는 타입을 적어서 컴포넌트 래퍼를 완성합니다.\n\n```python\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\nKubeflow에서 반환 값으로 사용할 수 있는 타입은 json에서 표현할 수 있는 타입들만 사용할 수 있습니다. 대표적으로 사용되며 권장하는 타입들은 다음과 같습니다.\n\n- int\n- float\n- str\n\n만약 단일 값이 아닌 여러 값을 반환하려면 `collections.namedtuple` 을 이용해야 합니다.  \n자세한 내용은 [Kubeflow 공식 문서](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#passing-parameters-by-value)를 참고 하시길 바랍니다.  
\n예를 들어서 입력받은 숫자를 2로 나눈 몫과 나머지를 반환하는 컴포넌트는 다음과 같이 작성해야 합니다.\n\n```python\nfrom typing import NamedTuple\n\n\ndef divide_and_return_number(\n    number: int,\n) -> NamedTuple(\"DivideOutputs\", [(\"quotient\", int), (\"remainder\", int)]):\n    from collections import namedtuple\n\n    quotient, remainder = divmod(number, 2)\n    print(\"quotient is\", quotient)\n    print(\"remainder is\", remainder)\n\n    divide_outputs = namedtuple(\n        \"DivideOutputs\",\n        [\n            \"quotient\",\n            \"remainder\",\n        ],\n    )\n    return divide_outputs(quotient, remainder)\n```\n\n### Convert to Kubeflow Format\n\n이제 작성한 컴포넌트를 kubeflow에서 사용할 수 있는 포맷으로 변환해야 합니다. 변환은 `kfp.components.create_component_from_func` 를 통해서 할 수 있습니다.  \n이렇게 변환된 형태는 파이썬에서 함수로 import 하여서 파이프라인에서 사용할 수 있습니다.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n```\n\n### Share component with yaml file\n\n만약 파이썬 코드로 공유를 할 수 없는 경우 YAML 파일로 컴포넌트를 공유해서 사용할 수 있습니다.\n이를 위해서는 우선 컴포넌트를 YAML 파일로 변환한 뒤 `kfp.components.load_component_from_file` 을 통해 파이프라인에서 사용할 수 있습니다.\n\n우선 작성한 컴포넌트를 YAML 파일로 변환하는 과정에 대해서 설명합니다.\n\n```python\nfrom kfp.components import create_component_from_func\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\nif __name__ == \"__main__\":\n    print_and_return_number.component_spec.save(\"print_and_return_number.yaml\")\n```\n\n작성한 파이썬 코드를 실행하면 `print_and_return_number.yaml` 파일이 생성됩니다. 
파일을 확인하면 다음과 같습니다.\n\n```bash\nname: Print and return number\ninputs:\n- {name: number, type: Integer}\noutputs:\n- {name: Output, type: Integer}\nimplementation:\n  container:\n    image: python:3.7\n    command:\n    - sh\n    - -ec\n    - |\n      program_path=$(mktemp)\n      printf \"%s\" \"$0\" > \"$program_path\"\n      python3 -u \"$program_path\" \"$@\"\n    - |\n      def print_and_return_number(number):\n          print(number)\n          return number\n\n      def _serialize_int(int_value: int) -> str:\n          if isinstance(int_value, str):\n              return int_value\n          if not isinstance(int_value, int):\n              raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n          return str(int_value)\n\n      import argparse\n      _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n      _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n      _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n      _parsed_args = vars(_parser.parse_args())\n      _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n      _outputs = print_and_return_number(**_parsed_args)\n\n      _outputs = [_outputs]\n\n      _output_serializers = [\n          _serialize_int,\n\n      ]\n\n      import os\n      for idx, output_file in enumerate(_output_files):\n          try:\n              os.makedirs(os.path.dirname(output_file))\n          except OSError:\n              pass\n          with open(output_file, 'w') as f:\n              f.write(_output_serializers[idx](_outputs[idx]))\n    args:\n    - --number\n    - {inputValue: number}\n    - '----output-paths'\n    - {outputPath: Output}\n```\n\n이제 생성된 파일을 공유해서 파이프라인에서 다음과 같이 사용할 수 있습니다.\n\n```python\nfrom kfp.components import load_component_from_file\n\nprint_and_return_number = 
load_component_from_file(\"print_and_return_number.yaml\")\n```\n\n## How Kubeflow executes component\n\nKubeflow에서 컴포넌트가 실행되는 순서는 다음과 같습니다.\n\n1. `docker pull <image>`: 정의된 컴포넌트의 실행 환경 정보가 담긴 이미지를 pull\n2. run `command`: pull 한 이미지에서 컴포넌트 콘텐츠를 실행합니다.  \n\n`print_and_return_number.yaml` 를 예시로 들자면 `@create_component_from_func` 의 default image 는 python:3.7 이므로 해당 이미지를 기준으로 컴포넌트 콘텐츠를 실행하게 됩니다.  \n\n1. `docker pull python:3.7`\n2. `print(number)`\n\n## References:\n\n- [Getting Started With Python function based components](https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#getting-started-with-python-function-based-components)\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/basic-pipeline-upload.md",
    "content": "---\ntitle : \"6. Pipeline - Upload\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Upload Pipeline\n\n이제 우리가 만든 파이프라인을 직접 kubeflow에서 업로드 해 보겠습니다.  \n파이프라인 업로드는 kubeflow 대시보드 UI를 통해 진행할 수 있습니다.\n[Install Kubeflow](../setup-components/install-components-kf.md#정상-설치-확인) 에서 사용한 방법을 이용해 포트포워딩합니다.\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\n[http://localhost:8080](http://localhost:8080)에 접속해 대시보드를 열어줍니다.\n\n### 1. Pipelines 탭 선택\n\n![pipeline-gui-0.png](./img/pipeline-gui-0.png)\n\n### 2. Upload Pipeline 선택\n\n![pipeline-gui-1.png](./img/pipeline-gui-1.png)\n\n### 3. Choose file 선택\n\n![pipeline-gui-2.png](./img/pipeline-gui-2.png)\n\n### 4. 생성된 yaml파일 업로드\n\n![pipeline-gui-3.png](./img/pipeline-gui-3.png)\n\n### 5. Create\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\n## Upload Pipeline Version\n\n업로드된 파이프라인은 업로드를 통해서 버전을 관리할 수 있습니다. 다만 깃헙과 같은 코드 차원의 버전 관리가 아닌 같은 이름의 파이프라인을 모아서 보여주는 역할을 합니다.\n위의 예시에서 파이프라인을 업로드한 경우 다음과 같이 example_pipeline이 생성된 것을 확인할 수 있습니다.\n\n![pipeline-gui-5.png](./img/pipeline-gui-5.png)\n\n클릭하면 다음과 같은 화면이 나옵니다.\n\n![pipeline-gui-4.png](./img/pipeline-gui-4.png)\n\nUpload Version을 클릭하면 다음과 같이 파이프라인을 업로드할 수 있는 화면이 생성됩니다.\n\n![pipeline-gui-6.png](./img/pipeline-gui-6.png)\n\n파이프라인을 업로드 합니다.\n\n![pipeline-gui-7.png](./img/pipeline-gui-7.png)\n\n업로드된 경우 다음과 같이 파이프라인 버전을 확인할 수 있습니다.\n\n![pipeline-gui-8.png](./img/pipeline-gui-8.png)\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/basic-pipeline.md",
    "content": "---\ntitle : \"5. Pipeline - Write\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Pipeline\n\n컴포넌트는 독립적으로 실행되지 않고 파이프라인의 구성요소로써 실행됩니다. 그러므로 컴포넌트를 실행해 보려면 파이프라인을 작성해야 합니다.\n그리고 파이프라인을 작성하기 위해서는 컴포넌트의 집합과 컴포넌트의 실행 순서가 필요합니다.\n\n이번 페이지에서는 숫자를 입력받고 출력하는 컴포넌트와 두 개의 컴포넌트로부터 숫자를 받아서 합을 출력하는 컴포넌트가 있는 파이프라인을 만들어 보도록 하겠습니다.\n\n## Component Set\n\n우선 파이프라인에서 사용할 컴포넌트들을 작성합니다.\n\n1. `print_and_return_number`\n\n  입력받은 숫자를 출력하고 반환하는 컴포넌트입니다.  \n  컴포넌트가 입력받은 값을 반환하기 때문에 int를 return의 타입 힌트로 입력합니다.\n\n  ```python\n  @create_component_from_func\n  def print_and_return_number(number: int) -> int:\n      print(number)\n      return number\n  ```\n\n2. `sum_and_print_numbers`\n\n  입력받은 두 개의 숫자의 합을 출력하는 컴포넌트입니다.  \n  이 컴포넌트 역시 두 숫자의 합을 반환하기 때문에 int를 return의 타입 힌트로 입력합니다.\n\n  ```python\n  @create_component_from_func\n  def sum_and_print_numbers(number_1: int, number_2: int) -> int:\n      sum_num = number_1 + number_2\n      print(sum_num)\n      return sum_num\n  ```\n\n## Component Order\n\n### Define Order\n\n필요한 컴포넌트의 집합을 만들었으면, 다음으로는 이들의 순서를 정의해야 합니다.  \n이번 페이지에서 만들 파이프라인의 순서를 그림으로 표현하면 다음과 같이 됩니다.\n\n![pipeline-0.png](./img/pipeline-0.png)\n\n### Single Output\n\n이제 이 순서를 코드로 옮겨보겠습니다.  \n\n우선 위의 그림에서 `print_and_return_number_1` 과 `print_and_return_number_2` 를 작성하면 다음과 같이 됩니다.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n```\n\n컴포넌트를 실행하고 그 반환 값을 각각 `number_1_result` 와 `number_2_result` 에 저장합니다.  \n저장된 `number_1_result` 의 반환 값은 `number_1_resulst.output` 를 통해 사용할 수 있습니다.\n\n### Multi Output\n\n위의 예시에서 컴포넌트는 단일 값만을 반환하기 때문에 `output`을 이용해 바로 사용할 수 있습니다.  \n만약, 여러 개의 반환 값이 있다면 `outputs`에 저장이 되며 dict 타입이기에 key를 이용해 원하는 반환 값을 사용할 수 있습니다.\n예를 들어서 앞에서 작성한 여러 개를 반환하는 [컴포넌트](../kubeflow/basic-component.md#define-a-standalone-python-function) 의 경우를 보겠습니다.\n`divde_and_return_number` 의 return 값은 `quotient` 와 `remainder` 가 있습니다. 
이 두 값을 `print_and_return_number` 에 전달하는 예시를 보면 다음과 같습니다.\n\n```python\ndef multi_pipeline():\n    divided_result = divde_and_return_number(number)\n    num_1_result = print_and_return_number(divided_result.outputs[\"quotient\"])\n    num_2_result = print_and_return_number(divided_result.outputs[\"remainder\"])\n```\n\n`divde_and_return_number`의 결과를 `divided_result`에 저장하고 각각 `divided_result.outputs[\"quotient\"]`, `divided_result.outputs[\"remainder\"]`로 값을 가져올 수 있습니다.\n\n### Write to python code\n\n이제 다시 본론으로 돌아와서 이 두 값의 결과를 `sum_and_print_numbers` 에 전달합니다.\n\n```python\ndef example_pipeline():\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\n다음으로 각 컴포넌트에 필요한 Config들을 모아서 파이프라인 Config로 정의 합니다.\n\n```python\ndef example_pipeline(number_1: int, number_2:int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\n## Convert to Kubeflow Format\n\n마지막으로 kubeflow에서 사용할 수 있는 형식으로 변환합니다. 
변환은 `kfp.dsl.pipeline` 함수를 이용해 할 수 있습니다.\n\n```python\nfrom kfp.dsl import pipeline\n\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n```\n\nKubeflow에서 파이프라인을 실행하기 위해서는 yaml 형식으로만 가능하기 때문에 생성한 파이프라인을 정해진 yaml 형식으로 컴파일(Compile) 해 주어야 합니다.\n컴파일은 다음 명령어를 이용해 생성할 수 있습니다.\n\n```python\nif __name__ == \"__main__\":\n    import kfp\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\n## Conclusion\n\n앞서 설명한 내용을 한 파이썬 코드로 모으면 다음과 같이 됩니다.\n\n```python\nimport kfp\nfrom kfp.components import create_component_from_func\nfrom kfp.dsl import pipeline\n\n@create_component_from_func\ndef print_and_return_number(number: int) -> int:\n    print(number)\n    return number\n\n@create_component_from_func\ndef sum_and_print_numbers(number_1: int, number_2: int):\n    print(number_1 + number_2)\n\n@pipeline(name=\"example_pipeline\")\ndef example_pipeline(number_1: int, number_2: int):\n    number_1_result = print_and_return_number(number_1)\n    number_2_result = print_and_return_number(number_2)\n    sum_result = sum_and_print_numbers(\n        number_1=number_1_result.output, number_2=number_2_result.output\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(example_pipeline, \"example_pipeline.yaml\")\n```\n\n컴파일된 결과를 보면 다음과 같습니다.\n\n<details>\n  <summary>example_pipeline.yaml</summary>\n\n```bash\napiVersion: argoproj.io/v1alpha1\nkind: Workflow\nmetadata:\n  generateName: example-pipeline-\n  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-05T13:38:51.566777',\n    pipelines.kubeflow.org/pipeline_spec: '{\"inputs\": [{\"name\": \"number_1\", \"type\":\n      \"Integer\"}, 
{\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"example_pipeline\"}'}\n  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3}\nspec:\n  entrypoint: example-pipeline\n  templates:\n  - name: example-pipeline\n    inputs:\n      parameters:\n      - {name: number_1}\n      - {name: number_2}\n    dag:\n      tasks:\n      - name: print-and-return-number\n        template: print-and-return-number\n        arguments:\n          parameters:\n          - {name: number_1, value: '{{inputs.parameters.number_1}}'}\n      - name: print-and-return-number-2\n        template: print-and-return-number-2\n        arguments:\n          parameters:\n          - {name: number_2, value: '{{inputs.parameters.number_2}}'}\n      - name: sum-and-print-numbers\n        template: sum-and-print-numbers\n        dependencies: [print-and-return-number, print-and-return-number-2]\n        arguments:\n          parameters:\n          - {name: print-and-return-number-2-Output, value: '{{tasks.print-and-return-number-2.outputs.parameters.print-and-return-number-2-Output}}'}\n          - {name: print-and-return-number-Output, value: '{{tasks.print-and-return-number.outputs.parameters.print-and-return-number-Output}}'}\n  - name: print-and-return-number\n    container:\n      args: [--number, '{{inputs.parameters.number_1}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def _serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return 
str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_1}\n    outputs:\n      parameters:\n      - name: print-and-return-number-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n     
     int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_1}}\"}'}\n  - name: print-and-return-number-2\n    container:\n      args: [--number, '{{inputs.parameters.number_2}}', '----output-paths', /tmp/outputs/Output/data]\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def print_and_return_number(number):\n            print(number)\n            return number\n\n        def 
_serialize_int(int_value: int) -> str:\n            if isinstance(int_value, str):\n                return int_value\n            if not isinstance(int_value, int):\n                raise TypeError('Value \"{}\" has type \"{}\" instead of int.'.format(str(int_value), str(type(int_value))))\n            return str(int_value)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Print and return number', description='')\n        _parser.add_argument(\"--number\", dest=\"number\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"----output-paths\", dest=\"_output_paths\", type=str, nargs=1)\n        _parsed_args = vars(_parser.parse_args())\n        _output_files = _parsed_args.pop(\"_output_paths\", [])\n\n        _outputs = print_and_return_number(**_parsed_args)\n\n        _outputs = [_outputs]\n\n        _output_serializers = [\n            _serialize_int,\n\n        ]\n\n        import os\n        for idx, output_file in enumerate(_output_files):\n            try:\n                os.makedirs(os.path.dirname(output_file))\n            except OSError:\n                pass\n            with open(output_file, 'w') as f:\n                f.write(_output_serializers[idx](_outputs[idx]))\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: number_2}\n    outputs:\n      parameters:\n      - name: print-and-return-number-2-Output\n        valueFrom: {path: /tmp/outputs/Output/data}\n      artifacts:\n      - {name: print-and-return-number-2-Output, path: /tmp/outputs/Output/data}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number\", {\"inputValue\": \"number\"}, \"----output-paths\", {\"outputPath\":\n          \"Output\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf 
\\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          print_and_return_number(number):\\n    print(number)\\n    return number\\n\\ndef\n          _serialize_int(int_value: int) -> str:\\n    if isinstance(int_value, str):\\n        return\n          int_value\\n    if not isinstance(int_value, int):\\n        raise TypeError(''Value\n          \\\"{}\\\" has type \\\"{}\\\" instead of int.''.format(str(int_value), str(type(int_value))))\\n    return\n          str(int_value)\\n\\nimport argparse\\n_parser = argparse.ArgumentParser(prog=''Print\n          and return number'', description='''')\\n_parser.add_argument(\\\"--number\\\",\n          dest=\\\"number\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"----output-paths\\\",\n          dest=\\\"_output_paths\\\", type=str, nargs=1)\\n_parsed_args = vars(_parser.parse_args())\\n_output_files\n          = _parsed_args.pop(\\\"_output_paths\\\", [])\\n\\n_outputs = print_and_return_number(**_parsed_args)\\n\\n_outputs\n          = [_outputs]\\n\\n_output_serializers = [\\n    _serialize_int,\\n\\n]\\n\\nimport\n          os\\nfor idx, output_file in enumerate(_output_files):\\n    try:\\n        os.makedirs(os.path.dirname(output_file))\\n    except\n          OSError:\\n        pass\\n    with open(output_file, ''w'') as f:\\n        f.write(_output_serializers[idx](_outputs[idx]))\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number\", \"type\": \"Integer\"}],\n          \"name\": \"Print and return number\", \"outputs\": [{\"name\": \"Output\", \"type\":\n          \"Integer\"}]}', pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number\":\n          \"{{inputs.parameters.number_2}}\"}'}\n  - name: sum-and-print-numbers\n    container:\n      args: [--number-1, '{{inputs.parameters.print-and-return-number-Output}}', --number-2,\n       
 '{{inputs.parameters.print-and-return-number-2-Output}}']\n      command:\n      - sh\n      - -ec\n      - |\n        program_path=$(mktemp)\n        printf \"%s\" \"$0\" > \"$program_path\"\n        python3 -u \"$program_path\" \"$@\"\n      - |\n        def sum_and_print_numbers(number_1, number_2):\n            print(number_1 + number_2)\n\n        import argparse\n        _parser = argparse.ArgumentParser(prog='Sum and print numbers', description='')\n        _parser.add_argument(\"--number-1\", dest=\"number_1\", type=int, required=True, default=argparse.SUPPRESS)\n        _parser.add_argument(\"--number-2\", dest=\"number_2\", type=int, required=True, default=argparse.SUPPRESS)\n        _parsed_args = vars(_parser.parse_args())\n\n        _outputs = sum_and_print_numbers(**_parsed_args)\n      image: python:3.7\n    inputs:\n      parameters:\n      - {name: print-and-return-number-2-Output}\n      - {name: print-and-return-number-Output}\n    metadata:\n      labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.6.3, pipelines.kubeflow.org/pipeline-sdk-type: kfp}\n      annotations: {pipelines.kubeflow.org/component_spec: '{\"implementation\": {\"container\":\n          {\"args\": [\"--number-1\", {\"inputValue\": \"number_1\"}, \"--number-2\", {\"inputValue\":\n          \"number_2\"}], \"command\": [\"sh\", \"-ec\", \"program_path=$(mktemp)\\nprintf \\\"%s\\\"\n          \\\"$0\\\" > \\\"$program_path\\\"\\npython3 -u \\\"$program_path\\\" \\\"$@\\\"\\n\", \"def\n          sum_and_print_numbers(number_1, number_2):\\n    print(number_1 + number_2)\\n\\nimport\n          argparse\\n_parser = argparse.ArgumentParser(prog=''Sum and print numbers'',\n          description='''')\\n_parser.add_argument(\\\"--number-1\\\", dest=\\\"number_1\\\",\n          type=int, required=True, default=argparse.SUPPRESS)\\n_parser.add_argument(\\\"--number-2\\\",\n          dest=\\\"number_2\\\", type=int, required=True, default=argparse.SUPPRESS)\\n_parsed_args\n          = 
vars(_parser.parse_args())\\n\\n_outputs = sum_and_print_numbers(**_parsed_args)\\n\"],\n          \"image\": \"python:3.7\"}}, \"inputs\": [{\"name\": \"number_1\", \"type\": \"Integer\"},\n          {\"name\": \"number_2\", \"type\": \"Integer\"}], \"name\": \"Sum and print numbers\"}',\n        pipelines.kubeflow.org/component_ref: '{}', pipelines.kubeflow.org/arguments.parameters: '{\"number_1\":\n          \"{{inputs.parameters.print-and-return-number-Output}}\", \"number_2\": \"{{inputs.parameters.print-and-return-number-2-Output}}\"}'}\n  arguments:\n    parameters:\n    - {name: number_1}\n    - {name: number_2}\n  serviceAccountName: pipeline-runner\n```\n\n</details>\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/basic-requirements.md",
    "content": "---\ntitle : \"3. Install Requirements\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\"]\n---\n\n실습을 위해 권장하는 파이썬 버전은 python>=3.7입니다. 파이썬 환경에 익숙하지 않은 분들은 다음 [Appendix 1. 파이썬 가상환경](../appendix/pyenv)을 참고하여 **클라이언트 노드**에 설치해주신 뒤 패키지 설치를 진행해주시기를 바랍니다.\n\n실습을 진행하는 데 필요한 패키지들과 버전은 다음과 같습니다.\n\n- requirements.txt\n\n  ```bash\n  kfp==1.8.9\n  scikit-learn==1.0.1\n  mlflow==1.21.0\n  pandas==1.3.4\n  dill==0.3.4\n  ```\n\n[앞에서 만든 파이썬 가상환경](../appendix/pyenv.md#python-가상환경-생성)을 활성화합니다.\n\n```bash\npyenv activate demo\n```\n\n패키지 설치를 진행합니다.\n\n```bash\npip3 install -U pip\npip3 install kfp==1.8.9 scikit-learn==1.0.1 mlflow==1.21.0 pandas==1.3.4 dill==0.3.4\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/basic-run.md",
    "content": "---\ntitle : \"7. Pipeline - Run\"\ndescription: \"\"\nsidebar_position: 7\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Run Pipeline\n\n이제 업로드한 파이프라인을 실행시켜 보겠습니다.\n\n## Before Run\n\n### 1. Create Experiment\n\nExperiment란 Kubeflow 에서 실행되는 Run을 논리적으로 관리하는 단위입니다.  \n\nKubeflow namespace에 처음 들어가면 생성되어 있는 Experiment가 없습니다. 따라서 파이프라인을 실행하기 전에 미리 Experiment를 생성해두어야 합니다. Experiment가 있다면 [Run Pipeline](../kubeflow/basic-run.md#run-pipeline-1)으로 넘어가도 무방합니다.\n\nExperiment는 Create Experiment 버튼을 통해 생성할 수 있습니다.\n\n![run-0.png](./img/run-0.png)\n\n### 2. Name 입력\n\nExperiment로 사용할 이름을 입력합니다.\n![run-1.png](./img/run-1.png)\n\n## Run Pipeline\n\n### 1. Create Run 선택\n\n![run-2.png](./img/run-2.png)\n\n### 2. Experiment 선택\n\n![run-9.png](./img/run-9.png)\n\n![run-10.png](./img/run-10.png)\n\n### 3. Pipeline Config 입력\n\n파이프라인을 생성할 때 입력한 Config 값들을 채워 넣습니다.\n업로드한 파이프라인은 number_1과 number_2를 입력해야 합니다.\n\n![run-3.png](./img/run-3.png)\n\n### 4. Start\n\n입력 후 Start 버튼을 누르면 파이프라인이 실행됩니다.\n\n![run-4.png](./img/run-4.png)\n\n## Run Result\n\n실행된 파이프라인들은 Runs 탭에서 확인할 수 있습니다.\nRun을 클릭하면 실행된 파이프라인과 관련된 자세한 내용을 확인해 볼 수 있습니다.\n\n![run-5.png](./img/run-5.png)\n\n클릭하면 다음과 같은 화면이 나옵니다. 아직 실행되지 않은 컴포넌트는 회색 표시로 나옵니다.\n\n![run-6.png](./img/run-6.png)\n\n컴포넌트가 실행이 완료되면 초록색 체크 표시가 나옵니다.\n\n![run-7.png](./img/run-7.png)\n\n가장 마지막 컴포넌트를 보면 입력한 Config인 3과 5의 합인 8이 출력된 것을 확인할 수 있습니다.\n\n![run-8.png](./img/run-8.png)\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/how-to-debug.md",
    "content": "---\ntitle : \"13. Component - Debugging\"\ndescription: \"\"\nsidebar_position: 13\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Debugging Pipeline\n\n이번 페이지에서는 Kubeflow 컴포넌트를 디버깅하는 방법에 대해서 알아봅니다.\n\n## Failed Component\n\n이번 페이지에서는 [Component - MLFlow](../kubeflow/advanced-mlflow.md#mlflow-pipeline) 에서 이용한 파이프라인을 조금 수정해서 사용합니다.\n\n우선 컴포넌트가 실패하도록 파이프라인을 변경하도록 하겠습니다.\n\n```python\nfrom functools import partial\n\nimport kfp\nfrom kfp.components import InputPath, OutputPath, create_component_from_func\nfrom kfp.dsl import pipeline\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\", \"scikit-learn\"],\n)\ndef load_iris_data(\n    data_path: OutputPath(\"csv\"),\n    target_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n    from sklearn.datasets import load_iris\n\n    iris = load_iris()\n\n    data = pd.DataFrame(iris[\"data\"], columns=iris[\"feature_names\"])\n    target = pd.DataFrame(iris[\"target\"], columns=[\"target\"])\n    \n    data[\"sepal length (cm)\"] = None\n    data.to_csv(data_path, index=False)\n    target.to_csv(target_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna()\n    data.to_csv(output_path, index=False)\n\n\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"dill\", \"pandas\", \"scikit-learn\", \"mlflow\"],\n)\ndef train_from_csv(\n    train_data_path: InputPath(\"csv\"),\n    train_target_path: InputPath(\"csv\"),\n    model_path: OutputPath(\"dill\"),\n    input_example_path: OutputPath(\"dill\"),\n    signature_path: OutputPath(\"dill\"),\n    conda_env_path: OutputPath(\"dill\"),\n    kernel: str,\n):\n    import dill\n    import pandas as pd\n    from sklearn.svm import SVC\n\n    from mlflow.models.signature 
import infer_signature\n    from mlflow.utils.environment import _mlflow_conda_env\n\n    train_data = pd.read_csv(train_data_path)\n    train_target = pd.read_csv(train_target_path)\n\n    clf = SVC(kernel=kernel)\n    clf.fit(train_data, train_target)\n\n    with open(model_path, mode=\"wb\") as file_writer:\n        dill.dump(clf, file_writer)\n\n    input_example = train_data.sample(1)\n    with open(input_example_path, \"wb\") as file_writer:\n        dill.dump(input_example, file_writer)\n\n    signature = infer_signature(train_data, clf.predict(train_data))\n    with open(signature_path, \"wb\") as file_writer:\n        dill.dump(signature, file_writer)\n\n    conda_env = _mlflow_conda_env(\n        additional_pip_deps=[\"dill\", \"pandas\", \"scikit-learn\"]\n    )\n    with open(conda_env_path, \"wb\") as file_writer:\n        dill.dump(conda_env, file_writer)\n\n\n\n@pipeline(name=\"debugging_pipeline\")\ndef debugging_pipeline(kernel: str):\n    iris_data = load_iris_data()\n    drop_data = drop_na_from_csv(data=iris_data.outputs[\"data\"])\n    model = train_from_csv(\n        train_data=drop_data.outputs[\"output\"],\n        train_target=iris_data.outputs[\"target\"],\n        kernel=kernel,\n    )\n\nif __name__ == \"__main__\":\n    kfp.compiler.Compiler().compile(debugging_pipeline, \"debugging_pipeline.yaml\")\n\n```\n\n수정한 점은 다음과 같습니다.\n\n1. 데이터를 불러오는 `load_iris_data` 컴포넌트에서 `sepal length (cm)` 피처에 `None` 값을 주입\n2. `drop_na_from_csv` 컴포넌트에서 `dropna()` 함수를 이용해 na 값이 포함된 `row`를 제거\n\n이제 파이프라인을 업로드하고 실행해 보겠습니다.  \n실행 후 Run을 눌러서 확인해보면 `Train from csv` 컴포넌트에서 실패했다고 나옵니다.\n\n![debug-0.png](./img/debug-0.png)\n\n실패한 컴포넌트를 클릭하고 로그를 확인해서 실패한 이유를 확인해 보겠습니다.\n\n![debug-2.png](./img/debug-2.png)\n\n로그를 확인하면 데이터의 개수가 0이어서 실행되지 않았다고 나옵니다.  \n분명 정상적으로 데이터를 전달했는데 왜 데이터의 개수가 0개일까요?  \n\n이제 입력받은 데이터에 어떤 문제가 있었는지 확인해 보겠습니다.  \n우선 컴포넌트를 클릭하고 Input/Output 탭에서 입력값으로 들어간 데이터들을 다운로드 받습니다.  
\n다운로드는 빨간색 네모로 표시된 곳의 링크를 클릭하면 됩니다.\n\n![debug-5.png](./img/debug-5.png)\n\n두 개의 파일을 같은 경로에 다운로드합니다.  \n그리고 해당 경로로 이동해서 파일을 확인합니다.\n\n```bash\nls\n```\n\n다음과 같이 두 개의 파일이 있습니다.\n\n```bash\ndrop-na-from-csv-output.tgz load-iris-data-target.tgz\n```\n\n압축을 풀어보겠습니다.\n\n```bash\ntar -xzvf load-iris-data-target.tgz ; mv data target.csv\ntar -xzvf drop-na-from-csv-output.tgz ; mv data data.csv\n```\n\n그리고 다운로드한 파일들을 이용해 주피터 노트북에서 컴포넌트 코드를 실행합니다.\n\n![debug-3.png](./img/debug-3.png)\n\n디버깅을 해본 결과 dropna 할 때 column을 기준으로 drop을 해야 하는데 row를 기준으로 drop을 해서 데이터가 모두 사라졌습니다.\n이제 문제의 원인을 알아냈으니 column을 기준으로 drop이 되게 컴포넌트를 수정합니다.\n\n```python\n@partial(\n    create_component_from_func,\n    packages_to_install=[\"pandas\"],\n)\ndef drop_na_from_csv(\n    data_path: InputPath(\"csv\"),\n    output_path: OutputPath(\"csv\"),\n):\n    import pandas as pd\n\n    data = pd.read_csv(data_path)\n    data = data.dropna(axis=\"columns\")\n    data.to_csv(output_path, index=False)\n```\n\n수정 후 파이프라인을 다시 업로드하고 실행하면 다음과 같이 정상적으로 수행되는 것을 확인할 수 있습니다.\n\n![debug-6.png](./img/debug-6.png)\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/kubeflow-concepts.md",
    "content": "---\ntitle : \"2. Kubeflow Concepts\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\"]\n---\n\n## Component\n\n컴포넌트(Component)는 컴포넌트 콘텐츠(Component contents)와 컴포넌트 래퍼(Component wrapper)로 구성되어 있습니다.\n하나의 컴포넌트는 컴포넌트 래퍼를 통해 kubeflow에 전달되며 전달된 컴포넌트는 정의된 컴포넌트 콘텐츠를 실행(execute)하고 아티팩트(artifacts)들을 생산합니다.\n\n![concept-0.png](./img/concept-0.png)\n\n### Component Contents\n\n컴포넌트 콘텐츠를 구성하는 것은 총 3가지가 있습니다.\n\n![concept-1.png](./img/concept-1.png)\n\n1. Environment\n2. Python code w/ Config\n3. Generates Artifacts\n\n예시와 함께 각 구성 요소가 어떤 것인지 알아보도록 하겠습니다.\n다음과 같이 데이터를 불러와 SVC(Support Vector Classifier)를 학습한 후 SVC 모델을 저장하는 과정을 적은 파이썬 코드가 있습니다.\n\n```python\nimport dill\nimport pandas as pd\n\nfrom sklearn.svm import SVC\n\ntrain_data = pd.read_csv(train_data_path)\ntrain_target = pd.read_csv(train_target_path)\n\nclf = SVC(\n    kernel=kernel\n)\nclf.fit(train_data, train_target)\n\nwith open(model_path, mode=\"wb\") as file_writer:\n    dill.dump(clf, file_writer)\n```\n\n위의 파이썬 코드는 다음과 같이 컴포넌트 콘텐츠로 나눌 수 있습니다.\n\n![concept-2.png](./img/concept-2.png)\n\nEnvironment는 파이썬 코드에서 사용하는 패키지들을 import하는 부분입니다.  \n다음으로 Python Code w/ Config 에서는 주어진 Config를 이용해 실제로 학습을 수행합니다.  \n마지막으로 아티팩트를 저장하는 과정이 있습니다.\n\n### Component Wrapper\n\n컴포넌트 래퍼는 컴포넌트 콘텐츠에 필요한 Config를 전달하고 실행시키는 작업을 합니다.\n\n![concept-3.png](./img/concept-3.png)\n\nKubeflow에서는 컴포넌트 래퍼를 위의 `train_svc_from_csv`와 같이 함수의 형태로 정의합니다.\n컴포넌트 래퍼가 콘텐츠를 감싸면 다음과 같이 됩니다.\n\n![concept-4.png](./img/concept-4.png)\n\n### Artifacts\n\n위의 설명에서 컴포넌트는 아티팩트(Artifacts)를 생성한다고 했습니다. 
아티팩트란 evaluation result, log 등 어떤 형태로든 파일로 생성되는 것을 통틀어서 칭하는 용어입니다.\n그중 우리가 관심을 두는 유의미한 것들은 다음과 같은 것들이 있습니다.\n\n![concept-5.png](./img/concept-5.png)\n\n- Model\n- Data\n- Metric\n- etc\n\n#### Model\n\n저희는 모델을 다음과 같이 정의했습니다.\n\n> 모델이란 파이썬 코드와 학습된 Weights와 Network 구조 그리고 이를 실행시키기 위한 환경이 모두 포함된 형태\n\n#### Data\n\n데이터는 전처리된 피처, 모델의 예측 값 등을 포함합니다.\n\n#### Metric\n\nMetric은 동적 지표와 정적 지표 두 가지로 나누었습니다.\n\n- 동적 지표란 train loss와 같이 학습이 진행되는 중 에폭(Epoch)마다 계속해서 변화하는 값을 의미합니다.\n- 정적 지표란 학습이 끝난 후 최종적으로 모델을 평가하는 정확도 등을 의미합니다.\n\n## Pipeline\n\n파이프라인은 컴포넌트의 집합과 컴포넌트를 실행시키는 순서도로 구성되어 있습니다. 이때, 순서도는 방향 순환이 없는 그래프로 이루어져 있으며, 간단한 조건문을 포함할 수 있습니다.\n\n![concept-6.png](./img/concept-6.png)\n\n### Pipeline Config\n\n앞서 컴포넌트를 실행시키기 위해서는 Config가 필요하다고 설명했습니다. 파이프라인을 구성하는 컴포넌트의 Config 들을 모아 둔 것이 파이프라인 Config입니다.\n\n![concept-7.png](./img/concept-7.png)\n\n## Run\n\n파이프라인이 필요로 하는 파이프라인 Config가 주어져야만 파이프라인을 실행할 수 있습니다.  \nKubeflow에서는 실행된 파이프라인을 Run 이라고 부릅니다.\n\n![concept-8.png](./img/concept-8.png)\n\n파이프라인이 실행되면 각 컴포넌트가 아티팩트들을 생성합니다.\nKubeflow pipeline에서는 Run 하나당 고유한 ID 를 생성하고, Run에서 생성되는 모든 아티팩트들을 저장합니다.\n\n![concept-9.png](./img/concept-9.png)\n\n그러면 이제 직접 컴포넌트와 파이프라인을 작성하는 방법에 대해서 알아보도록 하겠습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow/kubeflow-intro.md",
    "content": "---\ntitle : \"1. Kubeflow Introduction\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\"]\n---\n\nKubeflow를 사용하기 위해서는 컴포넌트(Component)와 파이프라인(Pipeline)을 작성해야 합니다.\n\n*모두의 MLOps*에서 설명하는 방식은 [Kubeflow Pipeline 공식 홈페이지](https://www.kubeflow.org/docs/components/pipelines/overview/quickstart/)에서 설명하는 방식과는 다소 차이가 있습니다. 여기에서는 Kubeflow Pipeline을 워크플로(Workflow)가 아닌 앞서 설명한 [MLOps를 구성하는 요소](../kubeflow/kubeflow-concepts.md#component-contents) 중 하나의 컴포넌트로 사용하기 때문입니다.\n\n그럼 이제 컴포넌트와 파이프라인은 무엇이며 어떻게 작성할 수 있는지 알아보도록 하겠습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/_category_.json",
    "content": "{\n  \"label\": \"Kubeflow UI Guide\",\n  \"position\": 5,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/experiments-and-others.md",
    "content": "---\ntitle : \"6. Kubeflow Pipeline 관련\"\ndescription: \"\"\nsidebar_position: 6\ncontributors: [\"Jaeyeon Kim\"]\n---\n\nCentral Dashboard의 왼쪽 탭의 Experiments(KFP), Pipelines, Runs, Recurring Runs, Artifacts, Executions 페이지들에서는 Kubeflow Pipeline과 Pipeline의 실행 그리고 Pipeline Run의 결과를 관리합니다.\n\n![left-tabs](./img/left-tabs.png)\n\nKubeflow Pipeline이 *모두의 MLOps*에서 Kubeflow를 사용하는 주된 이유이며, Kubeflow Pipeline을 만드는 방법, 실행하는 방법, 결과를 확인하는 방법 등 자세한 내용은 [3.Kubeflow](../kubeflow/kubeflow-intro)에서 다룹니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/experiments.md",
    "content": "---\ntitle : \"5. Experiments(AutoML)\"\ndescription: \"\"\nsidebar_position: 5\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Experiments(AutoML)을 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n![automl](./img/automl.png)\n\nExperiments(AutoML) 페이지는 Kubeflow에서 Hyperparameter Tuning과 Neural Architecture Search를 통한 AutoML을 담당하는 [Katib](https://www.kubeflow.org/docs/components/katib/overview/)를 관리할 수 있는 페이지입니다.\n\nKatib와 Experiments(AutoML)에 대한 사용법은 *모두의 MLOps* v1.0에서는 다루지 않으며, v2.0에 추가될 예정입니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/intro.md",
    "content": "---\ntitle : \"1. Central Dashboard\"\ndescription: \"\"\nsidebar_position: 1\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\n[Kubeflow 설치](../setup-components/install-components-kf.md)를 완료하면, 다음 커맨드를 통해 대시보드에 접속할 수 있습니다.\n\n```bash\nkubectl port-forward --address 0.0.0.0 svc/istio-ingressgateway -n istio-system 8080:80\n```\n\n![after-login](./img/after-login.png)\n\nCentral Dashboard는 Kubeflow에서 제공하는 모든 기능을 통합하여 제공하는 UI입니다. Central Dashboard에서 제공하는 기능은 크게 왼쪽의 탭을 기준으로 구분할 수 있습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n- Home\n- Notebooks\n- Tensorboards\n- Volumes\n- Models\n- Experiments(AutoML)\n- Experiments(KFP)\n- Pipelines\n- Runs\n- Recurring Runs\n- Artifacts\n- Executions\n\n그럼 이제 기능별 간단한 사용법을 알아보겠습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/notebooks.md",
    "content": "---\ntitle : \"2. Notebooks\"\ndescription: \"\"\nsidebar_position: 2\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## 노트북 서버(Notebook Server) 생성하기\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Notebooks를 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n다음과 같은 화면을 볼 수 있습니다.\n\nNotebooks 탭은 JupyterHub와 비슷하게 유저별로 jupyter notebook 및 code server 환경(이하 노트북 서버)을 독립적으로 생성하고 접속할 수 있는 페이지입니다.\n\n![notebook-home](./img/notebook-home.png)\n\n오른쪽 위의 `+ NEW NOTEBOOK` 버튼을 클릭합니다.\n\n![new-notebook](./img/new-notebook.png)\n\n아래와 같은 화면이 나타나면, 이제 생성할 노트북 서버의 스펙(Spec)을 명시하여 생성합니다.\n\n![create](./img/create.png)\n\n<details>\n<summary>각 스펙에 대한 자세한 내용은 아래와 같습니다.</summary>\n\n- **name**:\n  - 노트북 서버를 구분할 수 있는 이름으로 생성합니다.\n- **namespace** :\n  - 따로 변경할 수 없습니다. (현재 로그인한 user 계정의 namespace가 자동으로 지정되어 있습니다.)\n- **Image**:\n  - sklearn, pytorch, tensorflow 등의 파이썬 패키지가 미리 설치된 jupyter lab 이미지 중 사용할 이미지를 선택합니다.\n    - 노트북 서버 내에서 GPU를 사용하기 위해 tensorflow-cuda, pytorch-cuda 등의 이미지를 사용하는 경우, **하단의 GPUs** 부분을 확인하시기 바랍니다.\n  - 추가적인 패키지나 소스코드 등을 포함한 커스텀(Custom) 노트북 서버를 사용하고 싶은 경우에는 커스텀 이미지(Custom Image)를 만들고 배포 후 사용할 수도 있습니다.\n- **CPU / RAM**\n  - 필요한 자원 사용량을 입력합니다.\n    - cpu : core 단위\n      - 가상 core 개수 단위를 의미하며, int 형식이 아닌 `1.5`, `2.7` 등의 float 형식도 입력할 수 있습니다.\n    - memory : Gi 단위\n- **GPUs**\n  - 주피터 노트북에 할당할 GPU 개수를 입력합니다.\n    - `None`\n      - GPU 자원이 필요하지 않은 상황\n    - 1, 2, 4\n      - GPU 1, 2, 4 개 할당\n  - GPU Vendor\n    - 앞의 [(Optional) Setup GPU](../setup-kubernetes/setup-nvidia-gpu.md) 를 따라 nvidia gpu plugin을 설치하였다면 NVIDIA를 선택합니다.\n- **Workspace Volume**\n  - 노트북 서버 내에서 필요한 만큼의 디스크 용량을 입력합니다.\n  - Type 과 Name 은 변경하지 않고, **디스크 용량을 늘리고 싶거나** **AccessMode 를 변경하고 싶을** 때에만 변경해서 사용하시면 됩니다.\n    - **\"Don't use Persistent Storage for User's home\"** 체크박스는 노트북 서버의 작업 내용을 저장하지 않아도 상관없을 때에만 클릭합니다. 
**일반적으로는 누르지 않는 것을 권장합니다.**\n    - 기존에 미리 생성해두었던 PVC를 사용하고 싶을 때에는, Type을 \"Existing\" 으로 입력하여 해당 PVC의 이름을 입력하여 사용하시면 됩니다.\n- **Data Volumes**\n  - 추가적인 스토리지 자원이 필요하다면 **\"+ ADD VOLUME\"** 버튼을 클릭하여 생성할 수 있습니다.\n- ~~Configurations, Affinity/Tolerations, Miscellaneous Settings~~\n  - 일반적으로는 필요하지 않으므로 *모두의 MLOps*에서는 자세한 설명을 생략합니다.\n\n</details>\n\n모두 정상적으로 입력하였다면 하단의 **LAUNCH** 버튼이 활성화되며, 버튼을 클릭하면 노트북 서버 생성이 시작됩니다.\n\n![creating](./img/creating.png)\n\n생성 후 아래와 같이 **Status** 가 초록색 체크 표시 아이콘으로 변하며, **CONNECT 버튼**이 활성화됩니다.\n\n![created](./img/created.png)\n\n---\n\n## 노트북 서버 접속하기\n\n**CONNECT 버튼**을 클릭하면 브라우저에 새 창이 열리며, 다음과 같은 화면이 보입니다.\n\n![notebook-access](./img/notebook-access.png)\n\n**Launcher**의 Notebook, Console, Terminal 아이콘을 클릭하여 사용할 수 있습니다.\n\n  생성된 Notebook 화면\n\n![notebook-console](./img/notebook-console.png)\n\n  생성된 Terminal 화면\n\n![terminal-console](./img/terminal-console.png)\n\n---\n\n## 노트북 서버 중단하기\n\n노트북 서버를 오랜 시간 사용하지 않는 경우, 쿠버네티스 클러스터의 효율적인 리소스 사용을 위해서 노트북 서버를 중단(Stop)할 수 있습니다. **단, 이 경우 노트북 서버 생성 시 Workspace Volume 또는 Data Volume으로 지정해놓은 경로 외에 저장된 데이터는 모두 초기화되는 것에 주의하시기 바랍니다.**  \n노트북 서버 생성 당시 경로를 변경하지 않았다면, 디폴트(Default) Workspace Volume의 경로는 노트북 서버 내의 `/home/jovyan` 이므로, `/home/jovyan` 의 하위 경로 이외의 경로에 저장된 데이터는 모두 사라집니다.\n\n다음과 같이 `STOP` 버튼을 클릭하면 노트북 서버가 중단됩니다.\n\n![notebook-stop](./img/notebook-stop.png)\n\n중단이 완료되면 다음과 같이 `CONNECT` 버튼이 비활성화되며, `PLAY` 버튼을 클릭하면 다시 정상적으로 사용할 수 있습니다.\n\n![notebook-restart](./img/notebook-restart.png)\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/tensorboards.md",
    "content": "---\ntitle : \"3. Tensorboards\"\ndescription: \"\"\nsidebar_position: 3\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Tensorboards를 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n다음과 같은 화면을 볼 수 있습니다.\n\n![tensorboard](./img/tensorboard.png)\n\nTensorboards 탭은 Tensorflow, PyTorch 등의 프레임워크에서 제공하는 Tensorboard 유틸이 생성한 ML 학습 관련 데이터를 시각화하는 텐서보드 서버(Tensorboard Server)를 쿠버네티스 클러스터에 생성하는 기능을 제공합니다.\n\n이렇게 생성한 텐서보드 서버는, 일반적인 원격 텐서보드 서버의 사용법과 같이 사용할 수도 있으며, [Kubeflow 파이프라인 런에서 바로 텐서보드 서버에 데이터를 저장하는 용도](https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/#tensorboard)로 활용할 수 있습니다.\n\nKubeflow 파이프라인 런의 결과를 시각화하는 방법에는 [다양한 방식](https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/)이 있으며, *모두의 MLOps*에서는 더 일반적으로 활용할 수 있도록 Kubeflow 컴포넌트의 Visualization 기능과 MLflow의 시각화 기능을 활용할 예정이므로, Tensorboards 페이지에 대한 자세한 설명은 생략하겠습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/kubeflow-dashboard-guide/volumes.md",
    "content": "---\ntitle : \"4. Volumes\"\ndescription: \"\"\nsidebar_position: 4\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Volumes\n\n다음으로는 Central Dashboard의 왼쪽 탭의 Volumes를 클릭해보겠습니다.\n\n![left-tabs](./img/left-tabs.png)\n\n다음과 같은 화면을 볼 수 있습니다.\n\n![volumes](./img/volumes.png)\n\nVolumes 탭은 [Kubernetes의 볼륨(Volume)](https://kubernetes.io/ko/docs/concepts/storage/volumes/), 정확히는 [퍼시스턴트 볼륨 클레임(Persistent Volume Claim, 이하 pvc)](https://kubernetes.io/ko/docs/concepts/storage/persistent-volumes/) 중 현재 user의 namespace에 속한 pvc를 관리하는 기능을 제공합니다.\n\n위 스크린샷을 보면, [1. Notebooks](../kubeflow-dashboard-guide/notebooks) 페이지에서 생성한 Volume의 정보를 확인할 수 있습니다. 해당 Volume의 Storage Class는 쿠버네티스 클러스터 설치 당시 설치한 Default Storage Class인 local-path로 설정되어있음을 확인할 수 있습니다.\n\n이외에도 user namespace에 새로운 볼륨을 생성하거나, 조회하거나, 삭제하고 싶은 경우에 Volumes 페이지를 활용할 수 있습니다.\n\n---\n\n## 볼륨 생성하기\n\n오른쪽 위의 `+ NEW VOLUME` 버튼을 클릭하면 다음과 같은 화면을 볼 수 있습니다.\n\n![new-volume](./img/new-volume.png)\n\nname, size, storage class, access mode를 지정하여 생성할 수 있습니다.\n\n원하는 리소스 스펙을 지정하여 생성하면 다음과 같이 볼륨의 Status가 `Pending`으로 조회됩니다. `Status` 아이콘에 마우스 커서를 가져다 대면 *해당 볼륨은 mount하여 사용하는 first consumer가 나타날 때 실제로 생성을 진행한다(This volume will be bound when its first consumer is created.)*는 메시지를 확인할 수 있습니다.  \n이는 실습을 진행하는 [StorageClass](https://kubernetes.io/ko/docs/concepts/storage/storage-classes/)인 `local-path`의 볼륨 생성 정책에 해당하며, **문제 상황이 아닙니다.**  \n해당 페이지에서 Status가 `Pending` 으로 보이더라도 해당 볼륨을 사용하길 원하는 노트북 서버 혹은 파드(Pod)에서는 해당 볼륨의 이름을 지정하여 사용할 수 있으며, 그때 실제로 볼륨 생성이 진행됩니다.\n\n![creating-volume](./img/creating-volume.png)\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/_category_.json",
    "content": "{\n  \"label\": \"Prerequisites\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/_category_.json",
    "content": "{\n  \"label\": \"Docker\",\n  \"position\": 1,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/advanced.md",
    "content": "---\ntitle : \"[Practice] Docker Advanced\"\ndescription: \"Practice to use docker more advanced way.\"\nsidebar_position: 6\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 도커 이미지 잘 만들기\n\n### 도커 이미지를 만들 때 고려해야 될 점\n\nDockerfile 을 활용하여 도커 이미지를 만들 때는 명령어의 **순서**가 중요합니다.  \n그 이유는 도커 이미지는 여러 개의 Read-Only Layer 로 구성되어있고, 이미지를 빌드할 때 이미 존재하는 레이어는 **캐시되어** 재사용되기 때문에, 이를 생각해서 Dockerfile 을 구성한다면 **빌드 시간을 줄일 수 있습니다.**\n\nDockerfile에서 `RUN`, `ADD`, `COPY` 명령어 하나가 하나의 레이어로 저장됩니다.\n\n예를 들어서 다음과 같은 `Dockerfile`이 있습니다.\n\n```docker\n# Layer 1\nFROM ubuntu:latest\n\n# Layer 2\nRUN apt-get update && apt-get install python3 python3-pip -y\n\n# Layer 3\nRUN pip3 install -U pip && pip3 install torch\n\n# Layer 4\nCOPY src/ src/\n\n# Layer 5\nCMD python3 src/app.py\n```\n\n위의 `Dockerfile`로 빌드된 이미지를 `docker run -it app:latest /bin/bash` 명령어로 실행하면 다음과 같은 레이어로 표현할 수 있습니다.\n\n![layers.png](./img/layers.png)\n\n최상단의 R/W Layer 는 이미지에 영향을 주지 않습니다. 즉, 컨테이너 내부에서 작업한 내역은 모두 휘발성입니다.\n\n하단의 레이어가 변경되면, 그 위의 레이어는 모두 새로 빌드됩니다. 그래서 Dockerfile 명령어의 순서가 중요합니다.  \n예를 들면, **자주 변경**되는 부분은 **최대한 뒤쪽으로** 정렬하는 것을 추천합니다. (ex. `COPY src/ app/src/`)\n\n그렇기 때문에 반대로 변경되지 않는 부분은 최대한 앞쪽으로 정렬하는 게 좋습니다.\n\n거의 **변경되지 않지만**, 여러 곳에서 **자주** 쓰이는 부분은 공통화할 수도 있습니다.\n해당 공통부분만 묶어서 별도의 이미지로 미리 만들어둔 다음, **베이스 이미지** 로 활용하는 것이 좋습니다.\n\n예를 들어, 다른 건 거의 똑같은데, tensorflow-cpu 를 사용하는 이미지와, tensorflow-gpu 를 사용하는 환경을 분리해서 이미지로 만들고 싶은 경우에는 다음과 같이 할 수 있습니다.  \npython 과 기타 기본적인 패키지가 설치된 [`ghcr.io/makinarocks/python:3.8-base`](http://ghcr.io/makinarocks/python:3.8-base-cpu) 를 만들어두고, **tensorflow cpu 버전과 gpu 버전이** 설치된 이미지를 새로 만들 때는, 위의 이미지를 `FROM` 으로 불러온 다음, tensorflow install 하는 부분만 별도로 작성해서 Dockerfile 을 2 개로 관리한다면 가독성도 좋고 빌드 시간도 줄일 수 있습니다.\n\n**합칠 수 있는 Layer 는 합치는 것**이 Old version 의 도커에서는 성능 향상 효과를 이끌었습니다. 
여러분의 도커 컨테이너가 어떤 도커 버전에서 실행될 것인지 보장할 수 없으며, **가독성**을 위해서도 합칠 수 있는 Layer 는 적절히 합치는 것이 좋습니다.\n\n예를 들면, 다음과 같이 작성된 `Dockerfile`이 있습니다.\n\n```docker\n# Bad Case\nRUN apt-get update\nRUN apt-get install build-essential -y\nRUN apt-get install curl -y\nRUN apt-get install jq -y\nRUN apt-get install git -y\n```\n\n이를 아래와 같이 합쳐서 적을 수 있습니다.\n\n```docker\n# Better Case\nRUN apt-get update && \\\n    apt-get install -y \\\n    build-essential \\\n    curl \\\n    jq \\\n    git\n```\n\n편의를 위해서는 `.dockerignore` 도 사용하는게 좋습니다.\n`.dockerignore`는 `.gitignore` 와 비슷한 역할을 한다고 이해하면 됩니다. (git add 할 때 제외할 수 있듯이, docker build 할 때 자동으로 제외)\n\n더 많은 정보는 [Docker 공식 문서](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)에서 확인하실 수 있습니다.\n\n### ENTRYPOINT vs CMD\n\n`ENTRYPOINT` 와 `CMD` 는 모두 컨테이너의 실행 시점에서 어떤 명령어를 실행시키고 싶을 때 사용합니다.\n그리고 이 둘 중 하나는 반드시 존재해야 합니다.\n\n- **차이점**\n  - `CMD`: docker run 을 수행할 때, 쉽게 변경하여 사용할 수 있음\n  - `ENTRYPOINT`: `--entrypoint`  를 사용해야 변경할 수 있음\n\n`ENTRYPOINT` 와 `CMD` 가 함께 쓰일 때는 보통 `CMD`는 `ENTRYPOINT` 에서 적은 명령의 arguments(parameters) 를 의미합니다.\n\n예를 들어서 다음과 같은 `Dockerfile` 이 있습니다.\n\n```docker\nFROM ubuntu:latest\n\n# 아래 4 가지 option 을 바꿔가며 직접 테스트해보시면 이해하기 편합니다.\n# 단, NO ENTRYPOINT 옵션은 base image 인 ubuntu:latest 에 이미 있어서 테스트해볼 수는 없고 나머지 v2, 3, 5, 6, 8, 9, 11, 12 를 테스트해볼 수 있습니다.\n# ENTRYPOINT echo \"Hello ENTRYPOINT\"\n# ENTRYPOINT [\"echo\", \"Hello ENTRYPOINT\"]\n# CMD echo \"Hello CMD\"\n# CMD [\"echo\", \"Hello CMD\"]\n```\n\n위의 `Dockerfile`에서 주석으로 표시된 부분들을 해제하며 빌드하고 실행하면 다음과 같은 결과를 얻을 수 있습니다.\n\n|                    | No ENTRYPOINT  | ENTRYPOINT a b | ENTRYPOINT [\"a\", \"b\"] |\n| ------------------ | -------------- | -------------- | --------------------- |\n| **NO CMD**         | Error!         
| /bin/sh -c a b | a b                   |\n| **CMD [\"x\", \"y\"]** | x y            | /bin/sh -c a b | a b x y               |\n| **CMD x y**        | /bin/sh -c x y | /bin/sh -c a b | a b /bin/sh -c x y    |\n\n- In Kubernetes pod\n  - `ENTRYPOINT` → command\n  - `CMD` → args\n\n### Docker tag 이름 짓기\n\n도커 이미지의 tag 로 **latest 는 사용하지 않는 것을 권장**합니다.  \n이유는 latest 는 default tag name 이므로 **의도치 않게 overwritten** 되는 경우가 너무 많이 발생하기 때문입니다.\n\n하나의 이미지는 하나의 태그를 가짐(**uniqueness**)을 보장해야 추후 Production 단계에서 **협업/디버깅**에 용이합니다.  \n내용은 다르지만, 동일한 tag 를 사용하게 되면 추후 dangling image 로 취급되어 관리하기 어려워집니다.  \ndangling image는 `docker images`에서 `<none>` 으로 표시되며 계속해서 저장소를 차지하고 있습니다.\n\n### ETC\n\n1. log 등의 정보는 container 내부가 아닌 곳에 따로 저장합니다.\n    container 내부에서 write 한 data 는 언제든지 사라질 수 있기 때문입니다.\n2. secret 한 정보, 환경(dev/prod) dependent 한 정보 등은 Dockerfile 에 직접 적는 게 아니라, env var 또는 .env config file 을 사용합니다.\n3. Dockerfile **linter** 도 존재하므로, 협업 시에는 활용하면 좋습니다.\n    [https://github.com/hadolint/hadolint](https://github.com/hadolint/hadolint)\n\n## docker run 의 다양한 옵션\n\n### docker run with volume\n\nDocker container 사용 시 불편한 점이 있습니다.\n바로 Docker **container 내부에서 작업한 모든 사항은 기본적으로 저장되지 않는다**는 점입니다.\n이유는 Docker container 가 각각 격리된 파일시스템을 사용하기 때문입니다. 따라서, **여러 docker container 끼리 데이터를 공유하기 어렵습니다.**\n\n이 문제를 해결하기 위해서 Docker에서 제공하는 방식은 **2 가지**가 있습니다.\n\n![storage.png](./img/storage.png)\n\n#### Docker volume\n\n- docker cli 를 사용해 `volume` 이라는 리소스를 직접 관리\n- host 에서 Docker area(`/var/lib/docker`) 아래에 특정 디렉토리를 생성한 다음, 해당 경로를 docker container 에 mount\n\n#### Bind mount\n\n- host 의 특정 경로를 docker container 에 mount\n\n#### How to use?\n\n사용 방식은 **동일한 인터페이스**로 `-v` 옵션을 통해 사용할 수 있습니다.  
\n다만, volume 을 사용할 때에는 `docker volume create`, `docker volume ls`, `docker volume rm` 등을 수행하여 직접 관리해주어야 합니다.\n\n- Docker volume\n\n    ```bash\n    docker run \\\n        -v my_volume:/app \\\n        nginx:latest\n    ```\n\n- Bind mount\n\n    ```bash\n    docker run \\\n        -v /home/user/some/path:/app \\\n        nginx:latest\n    ```\n\n로컬에서 개발할 때는 bind mount 가 편하긴 하지만, 환경을 깔끔하게 유지하고 싶다면 docker volume 을 사용하여 create, rm 을 명시적으로 수행하는 것도 하나의 방법입니다.\n\n쿠버네티스에서 스토리지를 제공하는 방식도 결국 docker 의 bind mount 를 활용하여 제공합니다.\n\n### docker run with resource limit\n\n기본적으로 docker container 는 **host OS 의 cpu, memory 자원을 fully 사용**할 수 있습니다. 하지만 이렇게 사용하게 되면 host OS 의 자원 상황에 따라서 **OOM** 등의 이슈로 docker container 가 비정상적으로 종료되는 상황이 발생할 수 있습니다.  \n이런 문제를 다루기 위해 **docker container 실행 시, cpu 와 memory 의 사용량 제한**을 걸 수 있는 `-m` [옵션](https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory)을 제공합니다.\n\n```bash\ndocker run -d -m 512m --memory-reservation=256m --name 512-limit ubuntu sleep 3600\ndocker run -d -m 1g --memory-reservation=256m --name 1g-limit ubuntu sleep 3600\n```\n\n위의 커맨드를 실행한 후 `docker stats` 커맨드를 통해 사용량을 확인할 수 있습니다.\n\n```bash\nCONTAINER ID   NAME        CPU %     MEM USAGE / LIMIT   MEM %     NET I/O       BLOCK I/O   PIDS\n4ea1258e2e09   1g-limit    0.00%     300KiB / 1GiB       0.03%     1kB / 0B      0B / 0B     1\n4edf94b9a3e5   512-limit   0.00%     296KiB / 512MiB     0.06%     1.11kB / 0B   0B / 0B     1\n```\n\n쿠버네티스에서 pod 라는 리소스에 cpu, memory 제한을 줄 때, 이 방식을 활용하여 제공합니다.\n\n### docker run with restart policy\n\n특정 컨테이너가 계속해서 running 상태를 유지시켜야 하는 경우가 존재합니다. 
이런 경우를 위해서 해당 컨테이너가 종료되자마자 바로 재생성을 시도할 수 있는 `--restart=always` 옵션을 제공하고 있습니다.\n\n옵션 입력 후 도커를 실행합니다.\n\n```bash\ndocker run --restart=always ubuntu\n```\n\n`watch -n1 docker ps`를 통해 재실행이 되고 있는지 확인합니다.\n정상적으로 수행되고 있다면 다음과 같이 STATUS에 `Restarting (0)` 이 출력됩니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED          STATUS                         PORTS     NAMES\na911850276e8   ubuntu    \"bash\"    35 seconds ago   Restarting (0) 6 seconds ago             hungry_vaughan\n```\n\n- [https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart](https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart)\n  - on-failure with max retries\n  - always 등의 선택지 제공\n\n쿠버네티스에서 job 이라는 resource 의 restart 옵션을 줄 때, 이 방식을 활용하여 제공합니다.\n\n### docker run as a background process\n\n도커 컨테이너를 실행할 때는 기본적으로 foreground process 로 실행됩니다. 즉, 컨테이너를 실행한 터미널이 해당 컨테이너에 자동으로 attach 되어 있어, 다른 명령을 실행할 수 없습니다.\n\n다음과 같은 예시를 수행해봅니다.  \n우선 터미널 2 개를 열어, 하나의 터미널에서는 `docker ps` 를 지켜보고, 다른 하나의 터미널에서는 다음과 같은 명령을 차례로 실행해보며 동작을 지켜봅니다.\n\n#### First Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\n10 초동안 멈춰 있어야 하고, 해당 컨테이너에서 다른 명령을 수행할 수 없습니다. 10초 뒤에는 docker ps 에서 container 가 종료되는 것을 확인할 수 있습니다.\n\n#### Second Practice\n\n```bash\ndocker run -it ubuntu sleep 10\n```\n\n이후, `ctrl + p` -> `ctrl + q`\n\n해당 터미널에서 이제 다른 명령을 수행할 수 있게 되었으며, docker ps 로도 10초까지는 해당 컨테이너가 살아있는 것을 확인할 수 있습니다.\n이렇게 docker container 내부에서 빠져나온 상황을 detached 라고 부릅니다.\n도커에서는 run 을 실행함과 동시에 detached mode 로 실행시킬 수 있는 옵션을 제공합니다.\n\n#### Third Practice\n\n```bash\ndocker run -d ubuntu sleep 10\n```\n\ndetached mode 이므로 해당 명령을 실행시킨 터미널에서 다른 액션을 수행시킬 수 있습니다.\n\n상황에 따라 detached mode 를 적절히 활용하면 좋습니다.  \n예를 들어, DB 와 통신하는 Backend API server 를 개발할 때 Backend API server 는 source code 를 변경시켜가면서 hot-loading 으로 계속해서 로그를 확인해봐야 하지만, DB 는 로그를 지켜볼 필요는 없는 경우라면 다음과 같이 실행할 수 있습니다.  
\nDB 는 docker container 를 detached mode 로 실행시키고, Backend API server 는 attached mode 로 log 를 following 하면서 실행시키면 효율적입니다.\n\n## References\n\n- [https://towardsdatascience.com/docker-storage-598e385f4efe](https://towardsdatascience.com/docker-storage-598e385f4efe)\n- [https://vsupalov.com/docker-latest-tag/](https://vsupalov.com/docker-latest-tag/)\n- [https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version](https://docs.microsoft.com/ko-kr/azure/container-registry/container-registry-image-tag-version)\n- [https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/](https://stevelasker.blog/2018/03/01/docker-tagging-best-practices-for-tagging-and-versioning-docker-images/)\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/command.md",
    "content": "---\ntitle : \"[Practice] Docker command\"\ndescription: \"Practice to use docker command.\"\nsidebar_position: 4\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 1. 정상 설치 확인\n\n```bash\ndocker run hello-world\n```\n\n정상적으로 설치된 경우 다음과 같은 메시지를 확인할 수 있습니다.\n\n```bash\nHello from Docker!\nThis message shows that your installation appears to be working correctly.\n....\n```\n\n**(For ubuntu)** sudo 없이 사용하고 싶다면 아래 사이트를 참고합니다.\n\n- [https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)\n\n## 2. Docker Pull\n\ndocker image registry(도커 이미지를 저장하고 공유할 수 있는 저장소)로부터 Docker image 를 로컬에 다운로드 받는 커맨드입니다.\n\n아래 커맨드를 통해 docker pull에서 사용 가능한 argument들을 확인할 수 있습니다.\n\n```bash\ndocker pull --help\n```\n\n정상적으로 수행되면 아래와 같이 출력됩니다.\n\n```bash\nUsage:  docker pull [OPTIONS] NAME[:TAG|@DIGEST]\n\nPull an image or a repository from a registry\n\nOptions:\n  -a, --all-tags                Download all tagged images in the repository\n      --disable-content-trust   Skip image verification (default true)\n      --platform string         Set platform if server is multi-platform capable\n  -q, --quiet                   Suppress verbose output\n```\n\n여기서 docker pull은 두 가지 타입의 argument를 받는다는 것을 알 수 있습니다.\n\n1. `[OPTIONS]`\n2. 
`NAME[:TAG|@DIGEST]`\n\nhelp에서 나온 `-a`, `-q` 옵션을 사용하기 위해서는 NAME 앞에 사용해야 합니다.\n\n직접 `ubuntu:18.04` 이미지를 pull 해보겠습니다.\n\n```bash\ndocker pull ubuntu:18.04\n```\n\n위 명령어를 해석하면 `ubuntu` 라는 이름을 가진 이미지 중 `18.04` 태그가 달려있는 이미지를 가져오라는 뜻입니다.\n\n만약, 정상적으로 수행된다면 다음과 비슷하게 출력됩니다.\n\n```bash\n18.04: Pulling from library/ubuntu\n20d796c36622: Pull complete \nDigest: sha256:42cd9143b6060261187a72716906187294b8b66653b50d70bc7a90ccade5c984\nStatus: Downloaded newer image for ubuntu:18.04\ndocker.io/library/ubuntu:18.04\n```\n\n위의 명령어를 수행하면 [docker.io/library](http://docker.io/library/) 라는 이름의 registry 에서 ubuntu:18.04 라는 image 를 여러분의 노트북에 다운로드 받게 됩니다.\n\n- 참고사항\n  - 추후 [docker.io](http://docker.io) 나 public 한 docker hub 와 같은 registry 대신에, 특정 **private** 한 registry 에서 docker image 를 가져와야 하는 경우에는, [`docker login`](https://docs.docker.com/engine/reference/commandline/login/) 을 통해서 특정 registry 를 바라보도록 한 뒤, docker pull 을 수행하는 형태로 사용합니다. 혹은 insecure registry 를 설정하는 [방안](https://stackoverflow.com/questions/42211380/add-insecure-registry-to-docker)도 활용할 수 있습니다.\n  - 폐쇄망에서 docker image 를 `.tar` 파일과 같은 형태로 저장하고 공유할 수 있도록 [`docker save`](https://docs.docker.com/engine/reference/commandline/save/), [`docker load`](https://docs.docker.com/engine/reference/commandline/load/) 와 같은 명령어도 존재합니다.\n\n## 3. 
Docker images\n\n로컬에 존재하는 docker image 리스트를 출력하는 커맨드입니다.\n\n```bash\ndocker images --help\n```\n\ndocker images에서 사용할 수 있는 argument는 다음과 같습니다.\n\n```bash\nUsage:  docker images [OPTIONS] [REPOSITORY[:TAG]]\n\nList images\n\nOptions:\n  -a, --all             Show all images (default hides intermediate images)\n      --digests         Show digests\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print images using a Go template\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only show image IDs\n```\n\n아래 명령어를 이용해 직접 실행해 보겠습니다.\n\n```bash\ndocker images\n```\n\n만약 도커를 최초 설치 후 이 실습을 진행한다면 다음과 비슷하게 출력됩니다.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED      SIZE\nubuntu       18.04     29e70752d7b2   2 days ago   56.7MB\n```\n\n줄 수 있는 argument중 `-q`를 사용하면 `IMAGE ID` 만 출력됩니다.\n\n```bash\ndocker images -q\n```\n\n```bash\n29e70752d7b2\n```\n\n## 4. Docker ps\n\n현재 실행 중인 도커 컨테이너 리스트를 출력하는 커맨드입니다.\n\n```bash\ndocker ps --help\n```\n\ndocker ps에서 사용할 수 있는 argument는 다음과 같습니다.\n\n```bash\nUsage:  docker ps [OPTIONS]\n\nList containers\n\nOptions:\n  -a, --all             Show all containers (default shows just running)\n  -f, --filter filter   Filter output based on conditions provided\n      --format string   Pretty-print containers using a Go template\n  -n, --last int        Show n last created containers (includes all states) (default -1)\n  -l, --latest          Show the latest created container (includes all states)\n      --no-trunc        Don't truncate output\n  -q, --quiet           Only display container IDs\n  -s, --size            Display total file sizes\n```\n\n아래 명령어를 이용해 직접 실행해 보겠습니다.\n\n```bash\ndocker ps\n```\n\n현재 실행 중인 컨테이너가 없다면 다음과 같이 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\n만약 실행되는 컨테이너가 있다면 다음과 비슷하게 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND        CREATED          STATUS          PORTS     
NAMES\nc1e8f5e89d8d   ubuntu    \"sleep 3600\"   13 seconds ago   Up 12 seconds             trusting_newton\n```\n\n## 5. Docker run\n\n도커 컨테이너를 실행시키는 커맨드입니다.\n\n```bash\ndocker run --help\n```\n\ndocker run 의 사용법은 다음과 같습니다.\n\n```bash\nUsage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\n\nRun a command in a new container\n```\n\n여기서 확인해야 하는 것은 docker run이 세 가지 타입의 argument를 받는다는 점입니다.\n\n1. `[OPTIONS]`\n2. `[COMMAND]`\n3. `[ARG...]`\n\n직접 도커 컨테이너를 실행해 보겠습니다.\n\n```bash\n## Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]\ndocker run -it --name demo1 ubuntu:18.04 /bin/bash\n```\n\n- `-it` : `-i` 옵션 + `-t` 옵션\n  - container 를 실행시킴과 동시에 interactive 한 terminal 로 접속시켜주는 옵션\n- `--name` : name\n  - 컨테이너 id 대신, 구분하기 쉽도록 지정해주는 이름\n- `/bin/bash`\n  - 컨테이너를 실행시킴과 동시에 실행할 커맨드로, `/bin/bash` 는 bash 쉘을 여는 것을 의미합니다.\n\n실행 후 `exit` 명령어를 통해 컨테이너를 종료합니다.\n\n이제 앞서 배웠던 `docker ps` 명령어를 치면 다음과 같이 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES\n```\n\n실행되고 있는 컨테이너가 나온다고 했지만 어째서인지 방금 실행한 컨테이너가 보이지 않습니다.\n그 이유는 `docker ps`는 기본값으로 현재 실행 중인 컨테이너를 보여주기 때문입니다.\n\n만약 종료된 컨테이너들도 보고 싶다면 `-a` 옵션을 주어야 합니다.\n\n```bash\ndocker ps -a\n```\n\n그러면 다음과 같이 종료된 컨테이너 목록도 나옵니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND       CREATED         STATUS                     PORTS     NAMES\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"   2 minutes ago   Exited (0) 2 minutes ago             demo1\n```\n\n## 6. 
Docker exec\n\nDocker 컨테이너 내부에서 명령을 내리거나, 내부로 접속하는 커맨드입니다.\n\n```bash\ndocker exec --help\n```\n\n예를 들어서 다음과 같은 명령어를 실행해 보겠습니다.\n\n```bash\ndocker run -d --name demo2 ubuntu:18.04 sleep 3600\n```\n\n여기서 `-d` 옵션은 도커 컨테이너를 백그라운드에서 실행시켜서, 컨테이너에서 접속 종료를 하더라도, 계속 실행 중이 되도록 하는 옵션입니다.\n\n`docker ps`를 통해 현재 실행중인지 확인합니다.\n\n다음과 같이 실행 중임을 확인할 수 있습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED         STATUS         PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   4 seconds ago   Up 3 seconds             demo2\n```\n\n이제 `docker exec` 명령어를 통해서 실행중인 도커 컨테이너에 접속해 보겠습니다.\n\n```bash\ndocker exec -it demo2 /bin/bash\n```\n\n이전의 `docker run`과 동일하게 container 내부에 접속할 수 있습니다.\n\n`exit`을 통해 종료합니다.\n\n## 7. Docker logs\n\n도커 컨테이너의 log를 확인하는 커맨드입니다.\n\n```bash\ndocker logs --help\n```\n\n다음과 같은 컨테이너를 실행시키도록 하겠습니다.\n\n```bash\ndocker run --name demo3 -d busybox sh -c \"while true; do $(echo date); sleep 1; done\"\n```\n\n위 명령어를 통해서 demo3 라는 이름의 busybox 컨테이너를 백그라운드에서 도커 컨테이너로 실행하여, 1초에 한 번씩 현재 시간을 출력하도록 했습니다.\n\n이제 아래 명령어를 통해 log를 확인해 보겠습니다.\n\n```bash\ndocker logs demo3\n```\n\n정상적으로 수행되면 아래와 비슷하게 나옵니다.\n\n```bash\nSun Mar  6 11:06:49 UTC 2022\nSun Mar  6 11:06:50 UTC 2022\nSun Mar  6 11:06:51 UTC 2022\nSun Mar  6 11:06:52 UTC 2022\nSun Mar  6 11:06:53 UTC 2022\nSun Mar  6 11:06:54 UTC 2022\n```\n\n그런데 이렇게 사용할 경우 여태까지 찍힌 log 밖에 확인할 수 없습니다.  \n이때 `-f` 옵션을 이용해 계속 watch 하며 출력할 수 있습니다.\n\n```bash\ndocker logs demo3 -f\n```\n\n## 8. 
Docker stop\n\n실행 중인 도커 컨테이너를 중단시키는 커맨드입니다.\n\n```bash\ndocker stop --help\n```\n\n`docker ps`를 통해 현재 실행 중인 컨테이너를 확인하면 다음과 같습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED              STATUS              PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   About a minute ago   Up About a minute             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             4 minutes ago        Up 4 minutes                  demo2\n```\n\n이제 `docker stop` 을 통해 컨테이너를 정지해 보겠습니다.\n\n```bash\ndocker stop demo2\n```\n\n실행 후 `docker ps`를 다시 입력합니다.\n\n```bash\nCONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS     NAMES\n730391669c39   busybox   \"sh -c 'while true; …\"   2 minutes ago   Up 2 minutes             demo3\n```\n\n위의 결과와 비교했을 때 demo2 컨테이너가 현재 실행 중인 컨테이너 목록에서 사라진 것을 확인할 수 있습니다.\n\n나머지 컨테이너도 정지합니다.\n\n```bash\ndocker stop demo3\n```\n\n## 9. Docker rm\n\n도커 컨테이너를 삭제하는 커맨드입니다.\n\n```bash\ndocker rm --help\n```\n\n도커 컨테이너는 종료되어도 바로 삭제되지 않고 남아 있습니다. 그래서 `docker ps -a`를 통해서 종료된 컨테이너도 볼 수 있습니다.\n그런데 종료된 컨테이너는 왜 지워야 할까요?  
\n종료되어 있는 도커에는 이전에 사용한 데이터가 아직 컨테이너 내부에 남아있습니다.\n그래서 restart 등을 통해서 컨테이너를 재시작할 수 있습니다.\n그런데 이 과정에서 disk를 사용하게 됩니다.\n\n그래서 완전히 사용하지 않는 컨테이너를 지우기 위해서는 `docker rm` 명령어를 사용해야 합니다.\n\n우선 현재 컨테이너들을 확인합니다.\n\n```bash\ndocker ps -a\n```\n\n다음과 같이 3개의 컨테이너가 있습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND                  CREATED          STATUS                            PORTS     NAMES\n730391669c39   busybox        \"sh -c 'while true; …\"   4 minutes ago    Exited (137) About a minute ago             demo3\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"             7 minutes ago    Exited (137) 2 minutes ago                  demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"              10 minutes ago   Exited (0) 10 minutes ago                   demo1\n```\n\n아래 명령어를 통해 `demo3` 컨테이너를 삭제해 보겠습니다.\n\n```bash\ndocker rm demo3\n```\n\n`docker ps -a` 명령어를 치면 다음과 같이 2개로 줄었습니다.\n\n```bash\nCONTAINER ID   IMAGE          COMMAND        CREATED          STATUS                       PORTS     NAMES\nfc88a83e90f0   ubuntu:18.04   \"sleep 3600\"   13 minutes ago   Exited (137) 8 minutes ago             demo2\n4c1aa74a382a   ubuntu:18.04   \"/bin/bash\"    16 minutes ago   Exited (0) 16 minutes ago              demo1\n```\n\n나머지 컨테이너들도 삭제합니다.\n\n```bash\ndocker rm demo2\ndocker rm demo1\n```\n\n## 10. 
Docker rmi\n\n도커 이미지를 삭제하는 커맨드입니다.\n\n```bash\ndocker rmi --help\n```\n\n아래 명령어를 통해 현재 어떤 이미지들이 로컬에 있는지 확인합니다.\n\n```bash\ndocker images\n```\n\n다음과 같이 출력됩니다.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nbusybox      latest    a8440bba1bc0   32 hours ago   1.41MB\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\n`busybox` 이미지를 삭제해 보겠습니다.\n\n```bash\ndocker rmi busybox\n```\n\n다시 `docker images`를 칠 경우 다음과 같이 나옵니다.\n\n```bash\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nubuntu       18.04     29e70752d7b2   2 days ago     56.7MB\n```\n\n## References\n\n- [https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry)\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/docker.md",
    "content": "---\ntitle : \"What is Docker?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 3\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n\n## 컨테이너\n\n- 컨테이너 가상화\n  - 어플리케이션을 어디에서나 동일하게 실행하는 기술\n- 컨테이너 이미지\n  - 어플리케이션을 실행시키기 위해 필요한 모든 파일들의 집합\n  - → 붕어빵 틀\n- 컨테이너란?\n  - 컨테이너 이미지를 기반으로 실행된 한 개의 프로세스\n  - → 붕어빵 틀로 찍어낸 붕어빵\n\n## 도커\n\n도커는 **컨테이너를 관리**하고 사용할 수 있게 해주는 플랫폼입니다.  \n이러한 도커의 슬로건은 바로 **Build Once, Run Anywhere** 로 어디에서나 동일한 실행 결과를 보장합니다.\n\n도커 내부에서 동작하는 과정을 보자면 실제로 container 를 위한 리소스를 분리하고, lifecycle 을 제어하는 기능은 linux kernel 의 cgroup 등이 수행합니다.\n하지만 이러한 인터페이스를 바로 사용하는 것은 **너무 어렵기 때문에** 다음과 같은 추상화 layer를 만들게 됩니다.\n\n![docker-layer.png](./img/docker-layer.png)\n\n이를 통해 사용자는 사용자 친화적인 API 인 **Docker CLI** 만으로 쉽게 컨테이너를 제어할 수 있습니다.\n\n## Layer 해석\n\n위에서 나온 layer들의 역할은 다음과 같습니다.\n\n1. runC: linux kernel 의 기능을 직접 사용해서, container 라는 하나의 프로세스가 사용할 네임스페이스와 cpu, memory, filesystem 등을 격리시켜주는 기능을 수행합니다.\n2. containerd: runC(OCI layer) 에게 명령을 내리기 위한 추상화 단계이며, 표준화된 인터페이스(OCI)를 사용합니다.\n3. dockerd: containerd 에게 명령을 내리는 역할만 합니다.\n4. docker cli: 사용자는 docker cli 로 dockerd (Docker daemon)에게 명령을 내리기만 하면 됩니다.\n    - 이 통신 과정에서 unix socket 을 사용하기 때문에 가끔 도커 관련 에러가 나면 `/var/run/docker.sock` 가 사용 중이다, 권한이 없다 등등의 에러 메시지가 나오는 것입니다.\n\n이처럼 도커는 많은 단계를 감싸고 있지만, 흔히 도커라는 용어를 사용할 때는 Docker CLI 를 말할 때도 있고, Dockerd 를 말할 때도 있고 Docker Container 하나를 말할 때도 있어서 혼란이 생길 수 있습니다.  \n앞으로 나오는 글에서도 도커가 여러가지 의미로 쓰일 수 있습니다.\n\n## For ML Engineer\n\n머신러닝 엔지니어가 도커를 사용하는 이유는 다음과 같습니다.\n\n1. 나의 ML 학습/추론 코드를 OS, python version, python 환경, 특정 python package 버전에 independent 하도록 해야 한다.\n2. 그래서 코드 뿐만이 아닌 **해당 코드가 실행되기 위해 필요한 모든 종속적인 패키지, 환경 변수, 폴더명 등등을 하나의 패키지로** 묶을 수 있는 기술이 컨테이너화 기술이다.\n3. 이 기술을 쉽게 사용하고 관리할 수 있는 소프트웨어 중 하나가 도커이며, 패키지를 도커 이미지라고 부른다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/images.md",
    "content": "---\ntitle : \"[Practice] Docker images\"\ndescription: \"Practice to use docker image.\"\nsidebar_position: 5\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## 1. Dockerfile 만들기\n\n도커 이미지를 만드는 가장 쉬운 방법은 도커에서 제공하는 템플릿인 Dockerfile을 사용하는 것입니다.  \n이외에는 running container 를 docker image 로 만드는 `docker commit` 등을 활용하는 방법이 있습니다.\n\n- `Dockerfile`\n  - 사용자가 도커 이미지를 쉽게 만들 수 있도록, 제공하는 템플릿\n  - 파일명은 꼭 `Dockerfile` 이 아니어도 상관없지만, `docker build` 수행 시, default 로 사용하는 파일명이 `Dockerfile` 입니다.\n  - 도커 이미지를 만드는 `docker build` 를 수행할 때, `-f` 옵션을 주면 다른 파일명으로도 사용 가능합니다.\n    - ex) `docker build -f dockerfile-asdf .` 도 가능\n\n1. 실습을 위해서 편한 디렉토리로 이동합니다.\n\n    ```bash\n    cd <SOME-DIRECTORY>\n    ```\n\n2. docker-practice 라는 이름의 폴더를 생성합니다.\n\n    ```bash\n    mkdir docker-practice\n    ```\n\n3. docker-practice 폴더로 이동합니다.\n\n    ```bash\n    cd docker-practice\n    ```\n\n4. Dockerfile 이라는 빈 파일을 생성합니다.\n\n    ```bash\n    touch Dockerfile\n    ```\n\n5. 정상적으로 생성되었는지 확인합니다.\n\n    ```bash\n    ls\n    ```\n\n## 2. Dockerfile 내장 명령어\n\nDockerfile 에서 사용할 수 있는 기본적인 명령어에 대해서 하나씩 알아보겠습니다.\n\n### FROM\n\nDockerfile 이 base image 로 어떠한 이미지를 사용할 것인지를 명시하는 명령어입니다.  \n도커 이미지를 만들 때, 아무것도 없는 빈 환경에서부터 하나하나씩 제가 의도한 환경을 만들어가는 게 아니라, python 3.9 버전이 설치된 환경을 베이스로 해두고, 저는 pytorch 를 설치하고, 제 소스코드만 넣어두는 형태로 활용할 수가 있습니다.  \n이러한 경우에는 `python:3.9`, `python:3.9-alpine`, ... 등의 잘 만들어진 이미지를 베이스로 활용합니다.\n\n```docker\nFROM <image>[:<tag>] [AS <name>]\n\n# 예시\nFROM ubuntu\nFROM ubuntu:18.04\nFROM nginx:latest AS ngx\n```\n\n### COPY\n\n**host(로컬)에서의 `<src>`** 경로의 파일 혹은 디렉토리를 **container 내부에서의 `<dest>`** 경로에 복사하는 명령어입니다.\n\n```docker\nCOPY <src>... 
<dest>\n\n# 예시\nCOPY a.txt /some-directory/b.txt\nCOPY my-directory /some-directory-2\n```\n\n`ADD` 는 `COPY` 와 비슷하지만 추가적인 기능을 품고 있습니다.\n\n```docker\n# 1 - 호스트에 압축되어있는 파일을 풀면서 컨테이너 내부로 copy 할 수 있음\nADD scripts.tar.gz /tmp\n# 2 - Remote URLs 에 있는 파일을 소스 경로로 지정할 수 있음\nADD http://www.example.com/script.sh /tmp\n\n# 위 두 가지 기능을 사용하고 싶을 경우에만 COPY 대신 ADD 를 사용하는 것을 권장\n```\n\n### RUN\n\n명시한 커맨드를 도커 컨테이너 내부에서 실행하는 명령어입니다.  \n도커 이미지는 해당 커맨드들이 실행된 상태를 유지합니다.\n\n```docker\nRUN <command>\nRUN [\"executable-command\", \"parameter1\", \"parameter2\"]\n\n# 예시\nRUN pip install torch\nRUN pip install -r requirements.txt\n```\n\n### CMD\n\n명시한 커맨드를 도커 컨테이너가 **시작될 때**, 실행하는 것을 명시하는 명령어입니다.  \n비슷한 역할을 하는 명령어로 **ENTRYPOINT** 가 있습니다. 이 둘의 차이에 대해서는 **뒤에서** 다룹니다.  \n하나의 도커 이미지에서는 하나의 **CMD** 만 실행할 수 있다는 점에서 **RUN** 명령어와 다릅니다.\n\n```docker\nCMD <command>\nCMD [\"executable-command\", \"parameter1\", \"parameter2\"]\nCMD [\"parameter1\", \"parameter2\"] # ENTRYPOINT 와 함께 사용될 때\n\n# 예시\nCMD python main.py\n```\n\n### WORKDIR\n\n이후 추가될 명령어를 컨테이너 내의 어떤 디렉토리에서 수행할 것인지를 명시하는 명령어입니다.  \n만약, 해당 디렉토리가 없다면 생성합니다.\n\n```docker\nWORKDIR /path/to/workdir\n\n# 예시\nWORKDIR /home/demo\nRUN pwd # /home/demo 가 출력됨\n```\n\n### ENV\n\n컨테이너 내부에서 지속적으로 사용될 environment variable 의 값을 설정하는 명령어입니다.\n\n```docker\nENV <KEY> <VALUE>\nENV <KEY>=<VALUE>\n\n# 예시\n# default 언어 설정\nRUN locale-gen ko_KR.UTF-8\nENV LANG ko_KR.UTF-8\nENV LANGUAGE ko_KR.UTF-8\nENV LC_ALL ko_KR.UTF-8\n```\n\n### EXPOSE\n\n컨테이너에서 뚫어줄 포트/프로토콜을 지정할 수 있습니다.  \n`<protocol>` 을 지정하지 않으면 TCP 가 디폴트로 설정됩니다.\n\n```docker\nEXPOSE <port>\nEXPOSE <port>/<protocol>\n\n# 예시\nEXPOSE 8080\n```\n\n## 3. 간단한 Dockerfile 작성해보기\n\n`vim Dockerfile` 혹은 vscode 등 본인이 사용하는 편집기로 `Dockerfile` 을 열어 다음과 같이 작성해줍니다.\n\n```docker\n# base image 를 ubuntu 18.04 로 설정합니다.\nFROM ubuntu:18.04\n\n# apt-get update 명령을 실행합니다.\nRUN apt-get update\n\n# TEST env var의 값을 hello 로 지정합니다.\nENV TEST hello\n\n# DOCKER CONTAINER 가 시작될 때, 환경변수 TEST 의 값을 출력합니다.\nCMD echo $TEST\n```\n\n## 4. 
Docker build from Dockerfile\n\n`docker build` 명령어로 Dockerfile 로부터 Docker Image 를 만들어봅니다.\n\n```bash\ndocker build --help\n```\n\nDockerfile 이 있는 경로에서 다음 명령을 실행합니다.\n\n```bash\ndocker build -t my-image:v1.0.0 .\n```\n\n위 커맨드를 설명하면 다음과 같습니다.\n\n- `.` : **현재 경로**에 있는 Dockerfile 로부터\n- `-t` : my-image 라는 **이름**과 v1.0.0 이라는 **태그**로 **이미지**를\n- 빌드하겠다는 명령어\n\n정상적으로 이미지가 빌드되었는지 확인해 보겠습니다.\n\n```bash\n# grep : my-image 가 포함된 줄만 걸러내는(grep) 명령어\ndocker images | grep my-image\n```\n\n정상적으로 수행된다면 다음과 같이 출력됩니다.\n\n```bash\nmy-image     v1.0.0    143114710b2d   3 seconds ago   87.9MB\n```\n\n## 5. Docker run from Dockerfile\n\n그럼 이제 방금 빌드한 `my-image:v1.0.0` 이미지로 docker 컨테이너를 **run** 해보겠습니다.\n\n```bash\ndocker run my-image:v1.0.0\n```\n\n정상적으로 수행된다면 다음과 같이 나옵니다.\n\n```bash\nhello\n```\n\n## 6. Docker run with env\n\n이번에는 방금 빌드한 `my-image:v1.0.0` 이미지를 실행하는 시점에, `TEST` env var 의 값을 변경하여 docker 컨테이너를 run 해보겠습니다.\n\n```bash\ndocker run -e TEST=bye my-image:v1.0.0\n```\n\n정상적으로 수행된다면 다음과 같이 나옵니다.\n\n```bash\nbye\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/install.md",
    "content": "---\ntitle : \"Install Docker\"\ndescription: \"Install docker to start.\"\nsidebar_position: 1\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Docker\n\n도커 실습을 위해 도커를 설치해야 합니다.  \n도커 설치는 어떤 OS를 사용하는지에 따라 달라집니다.  \n각 환경에 맞는 도커 설치는 공식 홈페이지를 참고해주세요.\n\n- [ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [mac](https://docs.docker.com/desktop/mac/install/)\n- [windows](https://docs.docker.com/desktop/windows/install/)\n\n## 설치 확인\n\n`docker run hello-world` 가 정상적으로 수행되는 OS, 터미널 환경이 필요합니다.\n\n| OS      | Docker Engine  | Terminal           |\n| ------- | -------------- | ------------------ |\n| MacOS   | Docker Desktop | zsh                |\n| Windows | Docker Desktop | Powershell         |\n| Windows | Docker Desktop | WSL2               |\n| Ubuntu  | Docker Engine  | bash               |\n\n## 들어가기 앞서서..\n\nMLOps를 사용하기 위해 필요한 도커 사용법을 설명하니 많은 비유와 예시가 MLOps 쪽으로 치중되어 있을 수 있습니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/prerequisites/docker/introduction.md",
    "content": "---\ntitle : \"Why Docker & Kubernetes ?\"\ndescription: \"Introduction to Docker.\"\nsidebar_position: 2\ncontributors: [\"Jongseob Jeon\", \"Jaeyeon Kim\"]\n---\n\n## Why Kubernetes ?\n\n머신러닝 모델을 서비스화하기 위해서는 모델 개발 외에도 많은 **부가적인** 기능들이 필요합니다.\n\n1. 학습 단계\n    - 모델 학습 명령의 스케줄 관리\n    - 학습된 모델의 Reproducibility 보장\n2. 배포 단계\n    - 트래픽 분산\n    - 서비스 장애 모니터링\n    - 장애 시 트러블슈팅\n\n다행히도 이런 기능들에 대한 needs는 소프트웨어 개발 쪽에서 이미 많은 고민을 거쳐 발전되어 왔습니다.  \n따라서 머신러닝 모델을 배포할 때도 이런 고민의 결과물들을 활용하면 큰 도움을 받을 수 있습니다.\nMLOps에서 대표적으로 활용하는 소프트웨어 제품이 바로 도커와 쿠버네티스입니다.\n\n## 도커와 쿠버네티스\n\n### 기술 이름이 아니라 제품 이름\n\n도커와 쿠버네티스는 각각 컨테이너라이제이션(Containerization) 기능과 컨테이너 오케스트레이션(Container Orchestration) 기능을 제공하는 대표 소프트웨어(제품)입니다.\n\n#### 도커\n\n도커는 과거에 대세였지만 유료화 관련 정책들을 하나씩 추가하면서 점점 사용 빈도가 하락세입니다.\n하지만 2022년 3월 기준으로 아직까지도 가장 일반적으로 사용되는 컨테이너 가상화 소프트웨어입니다.\n\n![sysdig-2019.png](./img/sysdig-2019.png)\n\n<center> [from sysdig 2019] </center>\n\n![sysdig-2021.png](./img/sysdig-2021.png)\n\n<center> [from sysdig 2021]  </center>\n\n#### 쿠버네티스\n\n쿠버네티스는 지금까지는 비교 대상조차 거의 없는 제품입니다.\n\n![cncf-survey.png](./img/cncf-survey.png)\n\n<center> [from cncf survey] </center>\n\n![t4-ai.png](./img/t4-ai.png)\n\n<center> [from t4.ai]  </center>\n\n### **재미있는 오픈소스 역사 이야기**\n\n#### 초기 도커 & 쿠버네티스\n\n초기 도커 개발시에는 Docker Engine이라는 **하나의 패키지**에 API, CLI, 네트워크, 스토리지 등 여러 기능들을 모두 포함했으나, **MSA** 의 철학을 담아 **하나씩 분리**하기 시작했습니다.  \n하지만 초기의 쿠버네티스는 컨테이너 가상화를 위해 Docker Engine을 내장하고 있었습니다.  \n따라서 도커 버전이 업데이트될 때마다 Docker Engine 의 인터페이스가 변경되어 쿠버네티스에서 크게 영향을 받는 일이 계속해서 발생하였습니다.\n\n#### Open Container Initiative\n\n그래서 **이런 불편함을 해소**하고자, 도커를 중심으로 구글 등 컨테이너 기술에 관심있는 **여러 집단**들이 한데 모여 **Open Container Initiative,** 이하 **OCI**라는 프로젝트를 시작하여 컨테이너에 관한 **표준**을 정하는 일들을 시작하였습니다.  
\n도커에서도 인터페이스를 **한 번 더 분리**해서, OCI 표준을 준수하는 **containerd**라는 Container Runtime 을 개발하고, **dockerd** 가 containerd 의 API 를 호출하도록 추상화 레이어를 추가하였습니다.\n\n이러한 흐름에 맞추어서 쿠버네티스에서도 이제부터는 도커만을 지원하지 않고, **OCI 표준을** 준수하고, 정해진 스펙을 지키는 컨테이너 런타임은 무엇이든 쿠버네티스에서 사용할 수 있도록, Container Runtime Interface, 이하 **CRI 스펙**을 버전 1.5부터 제공하기 시작했습니다.\n\n#### CRI-O\n\nRed Hat, Intel, SUSE, IBM에서 **OCI 표준+CRI 스펙을** 따라 Kubernetes 전용 Container Runtime 을 목적으로 개발한 컨테이너 런타임입니다.\n\n#### 지금의 도커 & 쿠버네티스\n\n쿠버네티스는 Docker Engine 을 디폴트 컨테이너 런타임으로 사용해왔지만, 도커의 API 가 **CRI** 스펙에 맞지 않아(*OCI 는 따름*) 도커의 API를 **CRI**와 호환되게 바꿔주는 **dockershim**을 쿠버네티스 자체적으로 개발 및 지원해왔었는데,(*도커 측이 아니라 쿠버네티스 측에서 지원했다는 점이 굉장히 큰 짐이었습니다.*) 이걸 쿠버네티스 **v1.20 부터는 Deprecated하고,** **v1.23 부터는 지원을 포기**하기로 결정하였습니다.\n\n- v1.23 은 2021 년 12월 릴리즈\n\n그래서 쿠버네티스 v1.23 부터는 도커를 native 하게 쓸 수 없습니다.  \n그렇지만 **사용자들은 이런 변화에 크게 영향을 받지 않습니다.**\n왜냐하면 Docker Engine을 통해 만들어진 도커 이미지는 OCI 표준을 준수하기 때문에, 쿠버네티스가 어떤 컨테이너 런타임으로 이루어져있든 사용 가능하기 때문입니다.\n\n### References\n\n- [*https://www.linkedin.com/pulse/containerd는-무엇이고-왜-중요할까-sean-lee/?originalSubdomain=kr*](https://www.linkedin.com/pulse/containerd%EB%8A%94-%EB%AC%B4%EC%97%87%EC%9D%B4%EA%B3%A0-%EC%99%9C-%EC%A4%91%EC%9A%94%ED%95%A0%EA%B9%8C-sean-lee/?originalSubdomain=kr)\n- [https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/](https://kubernetes.io/blog/2021/12/07/kubernetes-1-23-release-announcement/)\n- [https://kubernetes.io/blog/2020/12/02/dockershim-faq/](https://kubernetes.io/blog/2020/12/02/dockershim-faq/)\n- [https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n- [https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/](https://kubernetes.io/ko/blog/2020/12/02/dont-panic-kubernetes-and-docker/)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-components/_category_.json",
    "content": "{\n  \"label\": \"Setup Components\",\n  \"position\": 3,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-components/install-components-kf.md",
    "content": "---\ntitle : \"1. Kubeflow\"\ndescription: \"구성요소 설치 - Kubeflow\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"SeungTae Kim\"]\n---\n\n## 설치 파일 준비\n\nKubeflow **v1.4.0** 버전을 설치하기 위해서, 설치에 필요한 manifests 파일들을 준비합니다.\n\n[kubeflow/manifests Repository](https://github.com/kubeflow/manifests) 를 **v1.4.0** 태그로 깃 클론한 뒤, 해당 폴더로 이동합니다.\n\n```bash\ngit clone -b v1.4.0 https://github.com/kubeflow/manifests.git\ncd manifests\n```\n\n## 각 구성 요소별 설치\n\nkubeflow/manifests Repository 에 각 구성 요소별 설치 커맨드가 적혀져 있지만, 설치하며 발생할 수 있는 이슈 혹은 정상적으로 설치되었는지 확인하는 방법이 적혀져 있지 않아 처음 설치하는 경우 어려움을 겪는 경우가 많습니다.  \n따라서, 각 구성 요소별로 정상적으로 설치되었는지 확인하는 방법을 함께 작성합니다.  \n\n또한, 본 문서에서는 **모두의 MLOps** 에서 다루지 않는 구성요소인 Knative, KFServing, MPI Operator 의 설치는 리소스의 효율적 사용을 위해 따로 설치하지 않습니다.\n\n### Cert-manager\n\n1. cert-manager 를 설치합니다.\n\n  ```bash\n  kustomize build common/cert-manager/cert-manager/base | kubectl apply -f -\n  ```\n\n  정상적으로 설치되면 다음과 같이 출력됩니다.\n\n  ```bash\n  namespace/cert-manager created\n  customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created\n  customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created\n  serviceaccount/cert-manager created\n  serviceaccount/cert-manager-cainjector created\n  serviceaccount/cert-manager-webhook created\n  role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  role.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created\n  
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-edit created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-view created\n  clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created\n  rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created\n  rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created\n  clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created\n  service/cert-manager created\n  service/cert-manager-webhook created\n  
deployment.apps/cert-manager created\n  deployment.apps/cert-manager-cainjector created\n  deployment.apps/cert-manager-webhook created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created\n  ```\n\n  cert-manager namespace 의 3 개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  kubectl get pod -n cert-manager\n  ```\n\n  모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n  ```bash\n  NAME                                       READY   STATUS    RESTARTS   AGE\n  cert-manager-7dd5854bb4-7nmpd              1/1     Running   0          2m10s\n  cert-manager-cainjector-64c949654c-2scxr   1/1     Running   0          2m10s\n  cert-manager-webhook-6b57b9b886-7q6g2      1/1     Running   0          2m10s\n  ```\n\n2. kubeflow-issuer 를 설치합니다.\n\n  ```bash\n  kustomize build common/cert-manager/kubeflow-issuer/base | kubectl apply -f -\n  ```\n\n  정상적으로 설치되면 다음과 같이 출력됩니다.\n\n  ```bash\n  clusterissuer.cert-manager.io/kubeflow-self-signing-issuer created\n  ```\n\n- cert-manager-webhook 이슈\n\n  cert-manager-webhook deployment 가 Running 이 아닌 경우, 다음과 비슷한 에러가 발생하며 kubeflow-issuer가 설치되지 않을 수 있음에 주의하시기 바랍니다.  \n  해당 에러가 발생한 경우, cert-manager 의 3개의 pod 가 모두 Running 이 되는 것을 확인한 이후 다시 명령어를 수행하시기 바랍니다.\n\n  ```bash\n  Error from server: error when retrieving current configuration of:\n  Resource: \"cert-manager.io/v1alpha2, Resource=clusterissuers\", GroupVersionKind: \"cert-manager.io/v1alpha2, Kind=ClusterIssuer\"\n  Name: \"kubeflow-self-signing-issuer\", Namespace: \"\"\n  from server for: \"STDIN\": conversion webhook for cert-manager.io/v1, Kind=ClusterIssuer failed: Post \"https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s\": dial tcp 10.101.177.157:443: connect: connection refused\n  ```\n\n### Istio\n\n1. 
istio 관련 Custom Resource Definition(CRD) 를 설치합니다.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-crds/base | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/authorizationpolicies.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/istiooperators.install.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/peerauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/requestauthentications.security.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/sidecars.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadentries.networking.istio.io created\n  customresourcedefinition.apiextensions.k8s.io/workloadgroups.networking.istio.io created\n  ```\n\n2. istio namespace 를 설치합니다.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-namespace/base | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  namespace/istio-system created\n  ```\n\n3. 
istio 를 설치합니다.\n\n  ```bash\n  kustomize build common/istio-1-9/istio-install/base | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  serviceaccount/istio-ingressgateway-service-account created\n  serviceaccount/istio-reader-service-account created\n  serviceaccount/istiod-service-account created\n  role.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  role.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrole.rbac.authorization.k8s.io/istiod-istio-system created\n  rolebinding.rbac.authorization.k8s.io/istio-ingressgateway-sds created\n  rolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istio-reader-istio-system created\n  clusterrolebinding.rbac.authorization.k8s.io/istiod-istio-system created\n  configmap/istio created\n  configmap/istio-sidecar-injector created\n  service/istio-ingressgateway created\n  service/istiod created\n  deployment.apps/istio-ingressgateway created\n  deployment.apps/istiod created\n  envoyfilter.networking.istio.io/metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/stats-filter-1.8 created\n  envoyfilter.networking.istio.io/stats-filter-1.9 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.8 created\n  envoyfilter.networking.istio.io/tcp-metadata-exchange-1.9 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.8 created\n  envoyfilter.networking.istio.io/tcp-stats-filter-1.9 created\n  envoyfilter.networking.istio.io/x-forwarded-host created\n  gateway.networking.istio.io/istio-ingressgateway created\n  authorizationpolicy.security.istio.io/global-deny-all created\n  authorizationpolicy.security.istio.io/istio-ingressgateway created\n  mutatingwebhookconfiguration.admissionregistration.k8s.io/istio-sidecar-injector created\n  
validatingwebhookconfiguration.admissionregistration.k8s.io/istiod-istio-system created\n  ```\n\n  istio-system namespace 의 2 개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  kubectl get po -n istio-system\n  ```\n\n  모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n  ```bash\n  NAME                                   READY   STATUS    RESTARTS   AGE\n  istio-ingressgateway-79b665c95-xm22l   1/1     Running   0          16s\n  istiod-86457659bb-5h58w                1/1     Running   0          16s\n  ```\n\n### Dex\n\ndex 를 설치합니다.\n\n```bash\nkustomize build common/dex/overlays/istio | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nnamespace/auth created\ncustomresourcedefinition.apiextensions.k8s.io/authcodes.dex.coreos.com created\nserviceaccount/dex created\nclusterrole.rbac.authorization.k8s.io/dex created\nclusterrolebinding.rbac.authorization.k8s.io/dex created\nconfigmap/dex created\nsecret/dex-oidc-client created\nservice/dex created\ndeployment.apps/dex created\nvirtualservice.networking.istio.io/dex created\n```\n\nauth namespace 의 1 개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get po -n auth\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                   READY   STATUS    RESTARTS   AGE\ndex-5ddf47d88d-458cs   1/1     Running   1          12s\n```\n\n### OIDC AuthService\n\nOIDC AuthService 를 설치합니다.\n\n```bash\nkustomize build common/oidc-authservice/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nconfigmap/oidc-authservice-parameters created\nsecret/oidc-authservice-client created\nservice/authservice created\npersistentvolumeclaim/authservice-pvc created\nstatefulset.apps/authservice created\nenvoyfilter.networking.istio.io/authn-filter created\n```\n\nistio-system namespace 에 authservice-0 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get po -n istio-system -w\n```\n\n모두 Running 이 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME                                   READY   STATUS    RESTARTS   
AGE\nauthservice-0                          1/1     Running   0          14s\nistio-ingressgateway-79b665c95-xm22l   1/1     Running   0          2m37s\nistiod-86457659bb-5h58w                1/1     Running   0          2m37s\n```\n\n### Kubeflow Namespace\n\nkubeflow namespace 를 생성합니다.\n\n```bash\nkustomize build common/kubeflow-namespace/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nnamespace/kubeflow created\n```\n\nkubeflow namespace 를 조회합니다.\n\n```bash\nkubectl get ns kubeflow\n```\n\n정상적으로 생성되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME       STATUS   AGE\nkubeflow   Active   8s\n```\n\n### Kubeflow Roles\n\nkubeflow-roles 를 설치합니다.\n\n```bash\nkustomize build common/kubeflow-roles/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-view created\nclusterrole.rbac.authorization.k8s.io/kubeflow-view created\n```\n\n방금 생성한 kubeflow roles 를 조회합니다.\n\n```bash\nkubectl get clusterrole | grep kubeflow\n```\n\n다음과 같이 총 6개의 clusterrole 이 출력됩니다.\n\n```bash\nkubeflow-admin                                                         2021-12-03T08:51:36Z\nkubeflow-edit                                                          2021-12-03T08:51:36Z\nkubeflow-kubernetes-admin                                              2021-12-03T08:51:36Z\nkubeflow-kubernetes-edit                                               2021-12-03T08:51:36Z\nkubeflow-kubernetes-view                                               2021-12-03T08:51:36Z\nkubeflow-view                                                          2021-12-03T08:51:36Z\n```\n\n### Kubeflow Istio Resources\n\nkubeflow-istio-resources 를 설치합니다.\n\n```bash\nkustomize build 
common/istio-1-9/kubeflow-istio-resources/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-istio-view created\ngateway.networking.istio.io/kubeflow-gateway created\n```\n\n방금 생성한 kubeflow roles 를 조회합니다.\n\n```bash\nkubectl get clusterrole | grep kubeflow-istio\n```\n\n다음과 같이 총 3개의 clusterrole 이 출력됩니다.\n\n```bash\nkubeflow-istio-admin                                                   2021-12-03T08:53:17Z\nkubeflow-istio-edit                                                    2021-12-03T08:53:17Z\nkubeflow-istio-view                                                    2021-12-03T08:53:17Z\n```\n\nKubeflow namespace 에 gateway 가 정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get gateway -n kubeflow\n```\n\n정상적으로 생성되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME               AGE\nkubeflow-gateway   31s\n```\n\n### Kubeflow Pipelines\n\nkubeflow pipelines 를 설치합니다.\n\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/clusterworkflowtemplates.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/cronworkflows.argoproj.io created\ncustomresourcedefinition.apiextensions.k8s.io/workfloweventbindings.argoproj.io created\n...(생략)\nauthorizationpolicy.security.istio.io/ml-pipeline-visualizationserver created\nauthorizationpolicy.security.istio.io/mysql created\nauthorizationpolicy.security.istio.io/service-cache-server created\n```\n\n위 명령어는 여러 resources 를 한 번에 설치하고 있지만, 설치 순서의 의존성이 있는 리소스가 존재합니다.  
\n따라서 때에 따라 다음과 비슷한 에러가 발생할 수 있습니다.\n\n```bash\n\"error: unable to recognize \"STDIN\": no matches for kind \"CompositeController\" in version \"metacontroller.k8s.io/v1alpha1\"\"  \n```\n\n위와 비슷한 에러가 발생한다면, 10 초 정도 기다린 뒤 다시 위의 명령을 수행합니다.\n\n```bash\nkustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow\n```\n\n다음과 같이 총 16개의 pod 가 모두 Running 이 될 때까지 기다립니다.\n\n```bash\nNAME                                                     READY   STATUS    RESTARTS   AGE\ncache-deployer-deployment-79fdf9c5c9-bjnbg               2/2     Running   1          5m3s\ncache-server-5bdf4f4457-48gbp                            2/2     Running   0          5m3s\nkubeflow-pipelines-profile-controller-7b947f4748-8d26b   1/1     Running   0          5m3s\nmetacontroller-0                                         1/1     Running   0          5m3s\nmetadata-envoy-deployment-5b4856dd5-xtlkd                1/1     Running   0          5m3s\nmetadata-grpc-deployment-6b5685488-kwvv7                 2/2     Running   3          5m3s\nmetadata-writer-548bd879bb-zjkcn                         2/2     Running   1          5m3s\nminio-5b65df66c9-k5gzg                                   2/2     Running   0          5m3s\nml-pipeline-8c4b99589-85jw6                              2/2     Running   1          5m3s\nml-pipeline-persistenceagent-d6bdc77bd-ssxrv             2/2     Running   0          5m3s\nml-pipeline-scheduledworkflow-5db54d75c5-zk2cw           2/2     Running   0          5m2s\nml-pipeline-ui-5bd8d6dc84-j7wqr                          2/2     Running   0          5m2s\nml-pipeline-viewer-crd-68fb5f4d58-mbcbg                  2/2     Running   1          5m2s\nml-pipeline-visualizationserver-8476b5c645-wljfm         2/2     Running   0          5m2s\nmysql-f7b9b7dd4-xfnw4                                    2/2     Running   0          5m2s\nworkflow-controller-5cbbb49bd8-5zrwx                
     2/2     Running   1          5m2s\n```\n\n추가로 ml-pipeline UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward svc/ml-pipeline-ui -n kubeflow 8888:80\n```\n\n웹 브라우저를 열어 [http://localhost:8888/#/pipelines/](http://localhost:8888/#/pipelines/) 경로에 접속합니다.\n\n다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![pipeline-ui](./img/pipeline-ui.png)\n\n- localhost 연결 거부 이슈\n\n![localhost-reject](./img/localhost-reject.png)\n\n만약 다음과 같이 `localhost에서 연결을 거부했습니다` 라는 에러가 출력될 경우, 커맨드로 address 설정을 통해 접근하는 것이 가능합니다.\n\n**보안상의 문제가 되지 않는다면,** 아래와 같이 `0.0.0.0` 로 모든 주소의 bind를 열어주는 방향으로 ml-pipeline UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward --address 0.0.0.0 svc/ml-pipeline-ui -n kubeflow 8888:80\n```\n\n- 위의 옵션으로 실행했음에도 여전히 연결 거부 이슈가 발생할 경우\n\n방화벽 설정으로 접속해 모든 tcp 프로토콜의 포트에 대한 접속을 허가 또는 8888번 포트의 접속 허가를 추가해 접근 권한을 허가해줍니다.\n\n웹 브라우저를 열어 `http://<당신의 가상 인스턴스 공인 ip 주소>:8888/#/pipelines/` 경로에 접속하면, ml-pipeline UI 화면이 출력되는 것을 확인할 수 있습니다.\n\n하단에서 진행되는 다른 포트의 경로에 접속할 때도 위의 절차와 동일하게 커맨드를 실행하고, 방화벽에 포트 번호를 추가해주면 실행하는 것이 가능합니다.\n\n### Katib\n\nKatib 를 설치합니다.\n\n```bash\nkustomize build apps/katib/upstream/installs/katib-with-kubeflow | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/experiments.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/suggestions.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/trials.kubeflow.org created\nserviceaccount/katib-controller created\nserviceaccount/katib-ui created\nclusterrole.rbac.authorization.k8s.io/katib-controller created\nclusterrole.rbac.authorization.k8s.io/katib-ui created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-admin created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-katib-view created\nclusterrolebinding.rbac.authorization.k8s.io/katib-controller created\nclusterrolebinding.rbac.authorization.k8s.io/katib-ui created\nconfigmap/katib-config 
created\nconfigmap/trial-templates created\nsecret/katib-mysql-secrets created\nservice/katib-controller created\nservice/katib-db-manager created\nservice/katib-mysql created\nservice/katib-ui created\npersistentvolumeclaim/katib-mysql created\ndeployment.apps/katib-controller created\ndeployment.apps/katib-db-manager created\ndeployment.apps/katib-mysql created\ndeployment.apps/katib-ui created\ncertificate.cert-manager.io/katib-webhook-cert created\nissuer.cert-manager.io/katib-selfsigned-issuer created\nvirtualservice.networking.istio.io/katib-ui created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\nvalidatingwebhookconfiguration.admissionregistration.k8s.io/katib.kubeflow.org created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep katib\n```\n\n다음과 같이 총 4 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkatib-controller-68c47fbf8b-b985z                        1/1     Running   0          82s\nkatib-db-manager-6c948b6b76-2d9gr                        1/1     Running   0          82s\nkatib-mysql-7894994f88-scs62                             1/1     Running   0          82s\nkatib-ui-64bb96d5bf-d89kp                                1/1     Running   0          82s\n```\n\n추가로 katib UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward svc/katib-ui -n kubeflow 8081:80\n```\n\n웹 브라우저를 열어 [http://localhost:8081/katib/](http://localhost:8081/katib/) 경로에 접속합니다.\n\n다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![katib-ui](./img/katib-ui.png)\n\n### Central Dashboard\n\nDashboard 를 설치합니다.\n\n```bash\nkustomize build apps/centraldashboard/upstream/overlays/istio | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nserviceaccount/centraldashboard created\nrole.rbac.authorization.k8s.io/centraldashboard created\nclusterrole.rbac.authorization.k8s.io/centraldashboard created\nrolebinding.rbac.authorization.k8s.io/centraldashboard created\nclusterrolebinding.rbac.authorization.k8s.io/centraldashboard 
created\nconfigmap/centraldashboard-config created\nconfigmap/centraldashboard-parameters created\nservice/centraldashboard created\ndeployment.apps/centraldashboard created\nvirtualservice.networking.istio.io/centraldashboard created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep centraldashboard\n```\n\nkubeflow namespace 에 centraldashboard 관련 1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\ncentraldashboard-8fc7d8cc-xl7ts                          1/1     Running   0          52s\n```\n\n추가로 Central Dashboard UI가 정상적으로 접속되는지 확인합니다.\n\n```bash\nkubectl port-forward svc/centraldashboard -n kubeflow 8082:80\n```\n\n웹 브라우저를 열어 [http://localhost:8082/](http://localhost:8082/) 경로에 접속합니다.\n\n다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![central-dashboard](./img/central-dashboard.png)\n\n### Admission Webhook\n\n```bash\nkustomize build apps/admission-webhook/upstream/overlays/cert-manager | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/poddefaults.kubeflow.org created\nserviceaccount/admission-webhook-service-account created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-cluster-role created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-admin created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-edit created\nclusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-view created\nclusterrolebinding.rbac.authorization.k8s.io/admission-webhook-cluster-role-binding created\nservice/admission-webhook-service created\ndeployment.apps/admission-webhook-deployment created\ncertificate.cert-manager.io/admission-webhook-cert created\nissuer.cert-manager.io/admission-webhook-selfsigned-issuer created\nmutatingwebhookconfiguration.admissionregistration.k8s.io/admission-webhook-mutating-webhook-configuration created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep admission-webhook\n```\n\n1 개의 
pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nadmission-webhook-deployment-667bd68d94-2hhrx            1/1     Running   0          11s\n```\n\n### Notebooks & Jupyter Web App\n\n1. Notebook controller 를 설치합니다.\n\n  ```bash\n  kustomize build apps/jupyter/notebook-controller/upstream/overlays/kubeflow | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/notebooks.kubeflow.org created\n  serviceaccount/notebook-controller-service-account created\n  role.rbac.authorization.k8s.io/notebook-controller-leader-election-role created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-admin created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-edit created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-kubeflow-notebooks-view created\n  clusterrole.rbac.authorization.k8s.io/notebook-controller-role created\n  rolebinding.rbac.authorization.k8s.io/notebook-controller-leader-election-rolebinding created\n  clusterrolebinding.rbac.authorization.k8s.io/notebook-controller-role-binding created\n  configmap/notebook-controller-config-m44cmb547t created\n  service/notebook-controller-service created\n  deployment.apps/notebook-controller-deployment created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep notebook-controller\n  ```\n\n  1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  notebook-controller-deployment-75b4f7b578-w4d4l          1/1     Running   0          105s\n  ```\n\n2. 
Jupyter Web App 을 설치합니다.\n\n  ```bash\n  kustomize build apps/jupyter/jupyter-web-app/upstream/overlays/istio | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  serviceaccount/jupyter-web-app-service-account created\n  role.rbac.authorization.k8s.io/jupyter-web-app-jupyter-notebook-role created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-cluster-role created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-admin created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-edit created\n  clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-view created\n  rolebinding.rbac.authorization.k8s.io/jupyter-web-app-jupyter-notebook-role-binding created\n  clusterrolebinding.rbac.authorization.k8s.io/jupyter-web-app-cluster-role-binding created\n  configmap/jupyter-web-app-config-76844k4cd7 created\n  configmap/jupyter-web-app-logos created\n  configmap/jupyter-web-app-parameters-chmg88cm48 created\n  service/jupyter-web-app-service created\n  deployment.apps/jupyter-web-app-deployment created\n  virtualservice.networking.istio.io/jupyter-web-app-jupyter-web-app created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep jupyter-web-app\n  ```\n\n  1개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  jupyter-web-app-deployment-6f744fbc54-p27ts              1/1     Running   0          2m\n  ```\n\n### Profiles + KFAM\n\nProfile Controller를 설치합니다.\n\n```bash\nkustomize build apps/profiles/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/profiles.kubeflow.org created\nserviceaccount/profiles-controller-service-account created\nrole.rbac.authorization.k8s.io/profiles-leader-election-role created\nrolebinding.rbac.authorization.k8s.io/profiles-leader-election-rolebinding created\nclusterrolebinding.rbac.authorization.k8s.io/profiles-cluster-role-binding 
created\nconfigmap/namespace-labels-data-48h7kd55mc created\nconfigmap/profiles-config-46c7tgh6fd created\nservice/profiles-kfam created\ndeployment.apps/profiles-deployment created\nvirtualservice.networking.istio.io/profiles-kfam created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep profiles-deployment\n```\n\n1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nprofiles-deployment-89f7d88b-qsnrd                       2/2     Running   0          42s\n```\n\n### Volumes Web App\n\nVolumes Web App 을 설치합니다.\n\n```bash\nkustomize build apps/volumes-web-app/upstream/overlays/istio | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nserviceaccount/volumes-web-app-service-account created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-cluster-role created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-admin created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-edit created\nclusterrole.rbac.authorization.k8s.io/volumes-web-app-kubeflow-volume-ui-view created\nclusterrolebinding.rbac.authorization.k8s.io/volumes-web-app-cluster-role-binding created\nconfigmap/volumes-web-app-parameters-4gg8cm2gmk created\nservice/volumes-web-app-service created\ndeployment.apps/volumes-web-app-deployment created\nvirtualservice.networking.istio.io/volumes-web-app-volumes-web-app created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep volumes-web-app\n```\n\n1개의 pod가 Running 이 될 때까지 기다립니다.\n\n```bash\nvolumes-web-app-deployment-8589d664cc-62svl              1/1     Running   0          27s\n```\n\n### Tensorboard & Tensorboard Web App\n\n1. 
Tensorboard Web App 를 설치합니다.\n\n  ```bash\n  kustomize build apps/tensorboard/tensorboards-web-app/upstream/overlays/istio | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  serviceaccount/tensorboards-web-app-service-account created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-admin created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-edit created\n  clusterrole.rbac.authorization.k8s.io/tensorboards-web-app-kubeflow-tensorboard-ui-view created\n  clusterrolebinding.rbac.authorization.k8s.io/tensorboards-web-app-cluster-role-binding created\n  configmap/tensorboards-web-app-parameters-g28fbd6cch created\n  service/tensorboards-web-app-service created\n  deployment.apps/tensorboards-web-app-deployment created\n  virtualservice.networking.istio.io/tensorboards-web-app-tensorboards-web-app created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep tensorboards-web-app\n  ```\n\n  1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  tensorboards-web-app-deployment-6ff79b7f44-qbzmw            1/1     Running             0          22s\n  ```\n\n2. 
Tensorboard Controller 를 설치합니다.\n\n  ```bash\n  kustomize build apps/tensorboard/tensorboard-controller/upstream/overlays/kubeflow | kubectl apply -f -\n  ```\n\n  정상적으로 수행되면 다음과 같이 출력됩니다.\n\n  ```bash\n  customresourcedefinition.apiextensions.k8s.io/tensorboards.tensorboard.kubeflow.org created\n  serviceaccount/tensorboard-controller created\n  role.rbac.authorization.k8s.io/tensorboard-controller-leader-election-role created\n  clusterrole.rbac.authorization.k8s.io/tensorboard-controller-manager-role created\n  clusterrole.rbac.authorization.k8s.io/tensorboard-controller-proxy-role created\n  rolebinding.rbac.authorization.k8s.io/tensorboard-controller-leader-election-rolebinding created\n  clusterrolebinding.rbac.authorization.k8s.io/tensorboard-controller-manager-rolebinding created\n  clusterrolebinding.rbac.authorization.k8s.io/tensorboard-controller-proxy-rolebinding created\n  configmap/tensorboard-controller-config-bf88mm96c8 created\n  service/tensorboard-controller-controller-manager-metrics-service created\n  deployment.apps/tensorboard-controller-controller-manager created\n  ```\n\n  정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kubectl get po -n kubeflow | grep tensorboard-controller\n  ```\n\n  1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n  ```bash\n  tensorboard-controller-controller-manager-954b7c544-vjpzj   3/3     Running   1          73s\n  ```\n\n### Training Operator\n\nTraining Operator 를 설치합니다.\n\n```bash\nkustomize build apps/training-operator/upstream/overlays/kubeflow | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\ncustomresourcedefinition.apiextensions.k8s.io/mxjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/pytorchjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/tfjobs.kubeflow.org created\ncustomresourcedefinition.apiextensions.k8s.io/xgboostjobs.kubeflow.org created\nserviceaccount/training-operator created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-admin 
created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-edit created\nclusterrole.rbac.authorization.k8s.io/kubeflow-training-view created\nclusterrole.rbac.authorization.k8s.io/training-operator created\nclusterrolebinding.rbac.authorization.k8s.io/training-operator created\nservice/training-operator created\ndeployment.apps/training-operator created\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get po -n kubeflow | grep training-operator\n```\n\n1 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\ntraining-operator-7d98f9dd88-6887f                          1/1     Running   0          28s\n```\n\n### User Namespace\n\nKubeflow 사용을 위해, 사용할 User의 Kubeflow Profile 을 생성합니다.\n\n```bash\nkustomize build common/user-namespace/base | kubectl apply -f -\n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nconfigmap/default-install-config-9h2h2b6hbk created\nprofile.kubeflow.org/kubeflow-user-example-com created\n```\n\nkubeflow-user-example-com profile 이 생성된 것을 확인합니다.\n\n```bash\nkubectl get profile\n```\n\n```bash\nkubeflow-user-example-com   37s\n```\n\n## 정상 설치 확인\n\nKubeflow central dashboard에 web browser로 접속하기 위해 포트 포워딩합니다.\n\n```bash\nkubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80\n```\n\nWeb Browser 를 열어 [http://localhost:8080](http://localhost:8080) 으로 접속하여, 다음과 같은 화면이 출력되는 것을 확인합니다.\n\n![login-ui](./img/login-after-install.png)\n\n다음 접속 정보를 입력하여 접속합니다.\n\n- Email Address: `user@example.com`\n- Password: `12341234`\n\n![central-dashboard](./img/after-login.png)\n\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-components/install-components-mlflow.md",
    "content": "---\ntitle : \"2. MLflow Tracking Server\"\ndescription: \"구성요소 설치 - MLflow\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Install MLflow Tracking Server\n\nMLflow는 대표적인 오픈소스 ML 실험 관리 도구입니다. MLflow는 [실험 관리 용도](https://mlflow.org/docs/latest/tracking.html#tracking) 외에도 [ML Model 패키징](https://mlflow.org/docs/latest/projects.html#projects), [ML 모델 배포 관리](https://mlflow.org/docs/latest/models.html#models), [ML 모델 저장](https://mlflow.org/docs/latest/model-registry.html#registry)과 같은 기능도 제공하고 있습니다.\n\n*모두의 MLOps*에서는 MLflow를 실험 관리 용도로 사용합니다.  \n그래서 MLflow에서 관리하는 데이터를 저장하고 UI를 제공하는 MLflow Tracking Server를 쿠버네티스 클러스터에 배포하여 사용할 예정입니다.\n\n## Before Install MLflow Tracking Server\n\n### PostgreSQL DB 설치\n\nMLflow Tracking Server가 Backend Store로 사용할 용도의 PostgreSQL DB를 쿠버네티스 클러스터에 배포합니다.\n\n먼저 `mlflow-system`이라는 namespace 를 생성합니다.\n\n```bash\nkubectl create ns mlflow-system\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 생성된 것을 의미합니다.\n\n```bash\nnamespace/mlflow-system created\n```\n\npostgresql DB를 `mlflow-system` namespace 에 생성합니다.\n\n```bash\nkubectl -n mlflow-system apply -f https://raw.githubusercontent.com/mlops-for-all/helm-charts/b94b5fe4133f769c04b25068b98ccfa7a505aa60/mlflow/manifests/postgres.yaml \n```\n\n정상적으로 수행되면 다음과 같이 출력됩니다.\n\n```bash\nservice/postgresql-mlflow-service created\ndeployment.apps/postgresql-mlflow created\npersistentvolumeclaim/postgresql-mlflow-pvc created\n```\n\nmlflow-system namespace 에 1개의 postgresql 관련 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n mlflow-system | grep postgresql\n```\n\n다음과 비슷하게 출력되면 정상적으로 실행된 것입니다.\n\n```bash\npostgresql-mlflow-7b9bc8c79f-srkh7   1/1     Running   0          38s\n```\n\n### Minio 설정\n\nMLflow Tracking Server가 Artifacts Store로 사용할 용도의 Minio는 이전 Kubeflow 설치 단계에서 설치한 Minio를 활용합니다.  \n단, kubeflow 용도와 mlflow 용도를 분리하기 위해, mlflow 전용 버킷(bucket)을 생성하겠습니다.  
\nminio 에 접속하여 버킷을 생성하기 위해, 우선 minio-service 를 포트포워딩합니다.\n\n```bash\nkubectl port-forward svc/minio-service -n kubeflow 9000:9000\n```\n\n웹 브라우저를 열어 [localhost:9000](http://localhost:9000)으로 접속하면 다음과 같은 화면이 출력됩니다.\n\n![minio-install](./img/minio-install.png)\n\n\n다음과 같은 접속 정보를 입력하여 로그인합니다.\n\n- Username: `minio`\n- Password: `minio123`\n\n우측 하단의 **`+`** 버튼을 클릭하여, `Create Bucket`를 클릭합니다.\n\n![create-bucket](./img/create-bucket.png)\n\n\n`Bucket Name`에 `mlflow`를 입력하여 버킷을 생성합니다.\n\n정상적으로 생성되면 다음과 같이 왼쪽에 `mlflow`라는 이름의 버킷이 생성됩니다.\n\n![mlflow-bucket](./img/mlflow-bucket.png)\n\n\n---\n\n## Let's Install MLflow Tracking Server\n\n### Helm Repository 추가\n\n```bash\nhelm repo add mlops-for-all https://mlops-for-all.github.io/helm-charts\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 추가된 것을 의미합니다.\n\n```bash\n\"mlops-for-all\" has been added to your repositories\n```\n\n### Helm Repository 업데이트\n\n```bash\nhelm repo update\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 업데이트된 것을 의미합니다.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"mlops-for-all\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nmlflow-server Helm Chart 0.2.0 버전을 설치합니다.\n\n```bash\nhelm install mlflow-server mlops-for-all/mlflow-server \\\n  --namespace mlflow-system \\\n  --version 0.2.0\n```\n\n- **주의**: 위의 helm chart는 MLflow 의 backend store 와 artifacts store 의 접속 정보를 kubeflow 설치 과정에서 생성한 minio와 위의 [PostgreSQL DB 설치](#postgresql-db-설치)에서 생성한 postgresql 정보를 default로 하여 설치합니다.\n  - 별개로 생성한 DB 혹은 Object storage를 활용하고 싶은 경우, [Helm Chart Repo](https://github.com/mlops-for-all/helm-charts/tree/main/mlflow/chart)를 참고하여 helm install 시 value를 따로 설정하여 설치하시기 바랍니다.\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\nNAME: mlflow-server\nLAST DEPLOYED: Sat Dec 18 22:02:13 2021\nNAMESPACE: mlflow-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get pod -n mlflow-system | grep mlflow-server\n```\n\nmlflow-system namespace 에 1 개의 mlflow-server 관련 pod 가 Running 이 될 때까지 기다립니다.  \n다음과 비슷하게 출력되면 정상적으로 실행된 것입니다.\n\n```bash\nmlflow-server-ffd66d858-6hm62        1/1     Running   0          74s\n```\n\n### 정상 설치 확인\n\n그럼 이제 MLflow Server에 정상적으로 접속되는지 확인해보겠습니다.\n\n우선 클라이언트 노드에서 접속하기 위해, 포트포워딩을 수행합니다.\n\n```bash\nkubectl port-forward svc/mlflow-server-service -n mlflow-system 5000:5000\n```\n\n웹 브라우저를 열어 [localhost:5000](http://localhost:5000)으로 접속하면 다음과 같은 화면이 출력됩니다.\n\n![mlflow-install](./img/mlflow-install.png)\n\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-components/install-components-pg.md",
    "content": "---\ntitle : \"4. Prometheus & Grafana\"\ndescription: \"구성요소 설치 - Prometheus & Grafana\"\nsidebar_position: 4\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Prometheus & Grafana\n\n프로메테우스(Prometheus) 와 그라파나(Grafana) 는 모니터링을 위한 도구입니다.  \n안정적인 서비스 운영을 위해서는 서비스와 서비스가 운영되고 있는 인프라의 상태를 지속해서 관찰하고, 관찰한 메트릭을 바탕으로 문제가 생길 때 빠르게 대응해야 합니다.  \n이러한 모니터링을 효율적으로 수행하기 위한 많은 도구 중 *모두의 MLOps*에서는 오픈소스인 프로메테우스와 그라파나를 사용할 예정입니다.\n\n더 자세한 내용은 [Prometheus 공식 문서](https://prometheus.io/docs/introduction/overview/), [Grafana 공식 문서](https://grafana.com/docs/)를 확인해주시기를 바랍니다.\n\n프로메테우스는 다양한 대상으로부터 Metric을 수집하는 도구이며, 그라파나는 모인 데이터를 시각화하는 것을 도와주는 도구입니다. 서로 간의 종속성은 없지만 상호 보완적으로 사용할 수 있어 함께 사용되는 경우가 많습니다.\n\n이번 페이지에서는 쿠버네티스 클러스터에 프로메테우스와 그라파나를 설치한 뒤, Seldon-Core 로 생성한 SeldonDeployment 로 API 요청을 보내, 정상적으로 Metrics 이 수집되는지 확인해보겠습니다.\n\n본 글에서는 seldonio/seldon-core-analytics Helm Chart 1.12.0 버전을 활용해 쿠버네티스 클러스터에 프로메테우스와 그라파나를 설치하고, Seldon-Core 에서 생성한 SeldonDeployment의 Metrics 을 효율적으로 확인하기 위한 대시보드도 함께 설치합니다.\n\n### Helm Repository 추가\n\n```bash\nhelm repo add seldonio https://storage.googleapis.com/seldon-charts\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 추가된 것을 의미합니다.\n\n```bash\n\"seldonio\" has been added to your repositories\n```\n\n### Helm Repository 업데이트\n\n```bash\nhelm repo update\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 업데이트된 것을 의미합니다.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"seldonio\" chart repository\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Helm Install\n\nseldon-core-analytics Helm Chart 1.12.0 버전을 설치합니다.\n\n```bash\nhelm install seldon-core-analytics seldonio/seldon-core-analytics \\\n  --namespace seldon-system \\\n  --version 1.12.0\n```\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\n생략...\nNAME: seldon-core-analytics\nLAST DEPLOYED: Tue Dec 14 18:29:38 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-core-analytics\n```\n\nseldon-system namespace 에 6개의 seldon-core-analytics 관련 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nseldon-core-analytics-grafana-657c956c88-ng8wn                  2/2     Running   0          114s\nseldon-core-analytics-kube-state-metrics-94bb6cb9-svs82         1/1     Running   0          114s\nseldon-core-analytics-prometheus-alertmanager-64cf7b8f5-nxbl8   2/2     Running   0          114s\nseldon-core-analytics-prometheus-node-exporter-5rrj5            1/1     Running   0          114s\nseldon-core-analytics-prometheus-pushgateway-8476474cff-sr4n6   1/1     Running   0          114s\nseldon-core-analytics-prometheus-seldon-685c664894-7cr45        2/2     Running   0          114s\n```\n\n### 정상 설치 확인\n\n그럼 이제 그라파나에 정상적으로 접속되는지 확인해보겠습니다.\n\n우선 클라이언트 노드에서 접속하기 위해, 포트포워딩을 수행합니다.\n\n```bash\nkubectl port-forward svc/seldon-core-analytics-grafana -n seldon-system 8090:80\n```\n\n웹 브라우저를 열어 [localhost:8090](http://localhost:8090)으로 접속하면 다음과 같은 화면이 출력됩니다.\n\n![grafana-install](./img/grafana-install.png)\n\n다음과 같은 접속정보를 입력하여 접속합니다.\n\n- Email or username : `admin`\n- Password : `password`\n\n로그인하면 다음과 같은 화면이 출력됩니다.\n\n![grafana-login](./img/grafana-login.png)\n\n좌측의 대시보드 아이콘을 클릭하여, `Manage` 버튼을 클릭합니다.\n\n![dashboard-click](./img/dashboard-click.png)\n\n기본적인 그라파나 대시보드가 포함되어있는 것을 확인할 수 있습니다. 
이 중 `Prediction Analytics` 대시보드를 클릭합니다.\n\n![dashboard](./img/dashboard.png)\n\nSeldon Core API Dashboard 가 보이고, 다음과 같이 출력되는 것을 확인할 수 있습니다.\n\n![seldon-dashboard](./img/seldon-dashboard.png)\n\n## References\n\n- [Seldon-Core-Analytics Helm Chart](https://github.com/SeldonIO/seldon-core/tree/master/helm-charts/seldon-core-analytics)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-components/install-components-seldon.md",
    "content": "---\ntitle : \"3. Seldon-Core\"\ndescription: \"구성요소 설치 - Seldon-Core\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Seldon-Core\n\nSeldon-Core는 쿠버네티스 환경에 수많은 머신러닝 모델을 배포하고 관리할 수 있는 오픈소스 프레임워크 중 하나입니다.  \n더 자세한 내용은 Seldon-Core 의 공식 [제품 설명 페이지](https://www.seldon.io/tech/products/core/) 와 [깃헙](https://github.com/SeldonIO/seldon-core) 그리고 API Deployment 파트를 참고해주시기를 바랍니다.\n\n## Selon-Core 설치\n\nSeldon-Core를 사용하기 위해서는 쿠버네티스의 인그레스(Ingress)를 담당하는 Ambassador 와 Istio 와 같은 [모듈이 필요합니다](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).  \nSeldon-Core 에서는 Ambassador 와 Istio 만을 공식적으로 지원하며, *모두의 MLOps*에서는 Ambassador를 사용해 Seldon-core를 사용하므로 Ambassador를 설치하겠습니다.\n\n### Ambassador - Helm Repository 추가\n\n```bash\nhelm repo add datawire https://www.getambassador.io\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 추가된 것을 의미합니다.\n\n```bash\n\"datawire\" has been added to your repositories\n```\n\n### Ambassador - Helm Repository 업데이트\n\n```bash\nhelm repo update\n```\n\n다음과 같은 메시지가 출력되면 정상적으로 업데이트된 것을 의미합니다.\n\n```bash\nHang tight while we grab the latest from your chart repositories...\n...Successfully got an update from the \"datawire\" chart repository\nUpdate Complete. 
⎈Happy Helming!⎈\n```\n\n### Ambassador - Helm Install\n\nambassador Chart 6.9.3 버전을 설치합니다.\n\n```bash\nhelm install ambassador datawire/ambassador \\\n  --namespace seldon-system \\\n  --create-namespace \\\n  --set image.repository=quay.io/datawire/ambassador \\\n  --set enableAES=false \\\n  --set crds.keep=false \\\n  --version 6.9.3\n```\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\n생략...\n\nW1206 17:01:36.026326   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role\nW1206 17:01:36.029764   26635 warnings.go:70] rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding\nNAME: ambassador\nLAST DEPLOYED: Mon Dec  6 17:01:34 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\nNOTES:\n-------------------------------------------------------------------------------\n  Congratulations! You've successfully installed Ambassador!\n\n-------------------------------------------------------------------------------\nTo get the IP address of Ambassador, run the following commands:\nNOTE: It may take a few minutes for the LoadBalancer IP to be available.\n     You can watch the status of by running 'kubectl get svc -w  --namespace seldon-system ambassador'\n\n  On GKE/Azure:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')\n\n  On AWS:\n  export SERVICE_IP=$(kubectl get svc --namespace seldon-system ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')\n\n  echo http://$SERVICE_IP:\n\nFor help, visit our Slack at http://a8r.io/Slack or view the documentation online at https://www.getambassador.io.\n```\n\nseldon-system 에 4 개의 pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n seldon-system\n```\n\n```bash\nambassador-7f596c8b57-4s9xh                  1/1     Running   0          
7m15s\nambassador-7f596c8b57-dt6lr                  1/1     Running   0          7m15s\nambassador-7f596c8b57-h5l6f                  1/1     Running   0          7m15s\nambassador-agent-77bccdfcd5-d5jxj            1/1     Running   0          7m15s\n```\n\n### Seldon-Core - Helm Install\n\nseldon-core-operator Chart 1.11.2 버전을 설치합니다.\n\n```bash\nhelm install seldon-core seldon-core-operator \\\n    --repo https://storage.googleapis.com/seldon-charts \\\n    --namespace seldon-system \\\n    --set usageMetrics.enabled=true \\\n    --set ambassador.enabled=true \\\n    --version 1.11.2\n```\n\n다음과 같은 메시지가 출력되어야 합니다.\n\n```bash\n생략...\n\nW1206 17:05:38.336391   28181 warnings.go:70] admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration\nNAME: seldon-core\nLAST DEPLOYED: Mon Dec  6 17:05:34 2021\nNAMESPACE: seldon-system\nSTATUS: deployed\nREVISION: 1\nTEST SUITE: None\n```\n\nseldon-system namespace 에 1 개의 seldon-controller-manager pod 가 Running 이 될 때까지 기다립니다.\n\n```bash\nkubectl get pod -n seldon-system | grep seldon-controller\n```\n\n```bash\nseldon-controller-manager-8457b8b5c7-r2frm   1/1     Running   0          2m22s\n```\n\n## References\n\n- [Example Model Servers with Seldon](https://docs.seldon.io/projects/seldon-core/en/latest/examples/server_examples.html#examples-server-examples--page-root)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"Setup Kubernetes\",\n  \"position\": 2,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/install-kubernetes/_category_.json",
    "content": "{\n  \"label\": \"4. Install Kubernetes\",\n  \"position\": 4,\n  \"link\": {\n    \"type\": \"generated-index\"\n  }\n}\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/install-kubernetes/kubernetes-with-k3s.md",
    "content": "---\ntitle: \"4.1. K3s\"\ndescription: \"\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-20\ndraft: false\nweight: 221\ncontributors: [\"Jongseob Jeon\"]\nmenu:\n  docs:\n    parent:../setup-kubernetes\"\nimages: []\n---\n\n## 1. Prerequisite\n\n쿠버네티스 클러스터를 구축하기에 앞서, 필요한 구성 요소들을 **클러스터에** 설치합니다.\n\n[Install Prerequisite](../../setup-kubernetes/install-prerequisite.md)을 참고하여 Kubernetes를 설치하기 전에 필요한 요소들을 **클러스터에** 설치해 주시기 바랍니다.\n\nk3s 에서는 기본값으로 containerd를 백엔드로 이용해 설치합니다.\n하지만 저희는 GPU를 사용하기 위해서 docker를 백엔드로 사용해야 하므로 `--docker` 옵션을 통해 백엔드를 docker로 설치하겠습니다.\n\n```bash\ncurl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.21.7+k3s1 sh -s - server --disable traefik --disable servicelb --disable local-storage --docker\n```\n\nk3s를 설치 후 k3s config를 확인합니다\n\n```bash\nsudo cat /etc/rancher/k3s/k3s.yaml\n```\n\n정상적으로 설치되면 다음과 같은 항목이 출력됩니다.  \n(보안 문제와 관련된 키들은 <...>로 가렸습니다.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://127.0.0.1:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 2. 쿠버네티스 클러스터 셋업\n\nk3s config를 클러스터의 kubeconfig로 사용하기 위해서 복사합니다.\n\n```bash\nmkdir .kube\nsudo cp /etc/rancher/k3s/k3s.yaml .kube/config\n```\n\n복사된 config 파일에 user가 접근할 수 있는 권한을 줍니다.\n\n```bash\nsudo chown $USER:$USER .kube/config\n```\n\n## 3. 쿠버네티스 클라이언트 셋업\n\n이제 클러스터에서 설정한 kubeconfig를 로컬로 이동합니다.\n로컬에서는 경로를 `~/.kube/config`로 설정합니다.\n\n처음 복사한 config 파일에는 server ip가 `https://127.0.0.1:6443` 으로 되어 있습니다.  \n이 값을 클러스터의 ip에 맞게 수정합니다.  
\n(이번 페이지에서 사용하는 클러스터의 ip에 맞춰서 `https://192.168.0.19:6443` 으로 수정했습니다.)\n\n```bash\napiVersion: v1\nclusters:\n- cluster:\n    certificate-authority-data:\n    <...>\n    server: https://192.168.0.19:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    user: default\n  name: default\ncurrent-context: default\nkind: Config\npreferences: {}\nusers:\n- name: default\n  user:\n    client-certificate-data:\n    <...>\n    client-key-data:\n    <...>\n```\n\n## 4. 쿠버네티스 기본 모듈 설치\n\n[Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md)을 참고하여 다음 컴포넌트들을 설치해 주시기 바랍니다.\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 정상 설치 확인\n\n최종적으로 node가 Ready 인지, OS, Docker, Kubernetes 버전을 확인합니다.\n\n```bash\nkubectl get nodes -o wide\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nNAME    STATUS   ROLES                  AGE   VERSION        INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   11m   v1.21.7+k3s1   192.168.0.19   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n\n## 6. References\n\n- [https://rancher.com/docs/k3s/latest/en/installation/install-options/](https://rancher.com/docs/k3s/latest/en/installation/install-options/)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/install-kubernetes/kubernetes-with-kubeadm.md",
    "content": "---\ntitle: \"4.3. Kubeadm\"\ndescription: \"\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Youngcheol Jang\"]\n---\n\n## 1. Prerequisite\n\n쿠버네티스 클러스터를 구축하기에 앞서, 필요한 구성 요소들을 **클러스터에** 설치합니다.\n\n[Install Prerequisite](../../setup-kubernetes/install-prerequisite.md)을 참고하여 Kubernetes를 설치하기 전에 필요한 요소들을 **클러스터에** 설치해 주시기 바랍니다.\n\n쿠버네티스를 위한 네트워크의 설정을 변경합니다.\n\n```bash\nsudo modprobe br_netfilter\n\ncat <<EOF | sudo tee /etc/modules-load.d/k8s.conf\nbr_netfilter\nEOF\n\ncat <<EOF | sudo tee /etc/sysctl.d/k8s.conf\nnet.bridge.bridge-nf-call-ip6tables = 1\nnet.bridge.bridge-nf-call-iptables = 1\nEOF\nsudo sysctl --system\n```\n\n## 2. 쿠버네티스 클러스터 셋업\n\n- kubeadm : kubelet을 서비스에 등록하고, 클러스터 컴포넌트들 사이의 통신을 위한 인증서 발급 등 설치 과정 자동화\n- kubelet : container 리소스를 실행, 종료를 해 주는 컨테이너 핸들러\n- kubectl : 쿠버네티스 클러스터를 터미널에서 확인, 조작하기 위한 CLI 도구\n\n다음 명령어를 통해 kubeadm, kubelet, kubectl을 설치합니다.\n실수로 이 컴포넌트들의 버전이 변경하면, 예기치 않은 장애를 낳을 수 있으므로 컴포넌트들이 변경되지 않도록 설정합니다.\n\n```bash\nsudo apt-get update\nsudo apt-get install -y apt-transport-https ca-certificates curl &&\nsudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg &&\necho \"deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main\" | sudo tee /etc/apt/sources.list.d/kubernetes.list &&\nsudo apt-get update\nsudo apt-get install -y kubelet=1.21.7-00 kubeadm=1.21.7-00 kubectl=1.21.7-00 &&\nsudo apt-mark hold kubelet kubeadm kubectl\n```\n\nkubeadm, kubelet, kubectl 이 잘 설치되었는지 확인합니다.\n\n```bash\nmlops@ubuntu:~$ kubeadm version\nkubeadm version: &version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:40:08Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\n```bash\nmlops@ubuntu:~$ kubelet --version\nKubernetes 
v1.21.7\n```\n\n```bash\nmlops@ubuntu:~$ kubectl version --client\nClient Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n```\n\n이제 kubeadm을 사용하여 쿠버네티스를 설치합니다.\n\n```bash\nkubeadm config images list\nkubeadm config images pull\n\nsudo kubeadm init --pod-network-cidr=10.244.0.0/16\n```\n\nkubectl을 통해서 쿠버네티스 클러스터를 제어할 수 있도록 admin 인증서를 $HOME/.kube/config 경로에 복사합니다.\n\n```bash\nmkdir -p $HOME/.kube\nsudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config\nsudo chown $(id -u):$(id -g) $HOME/.kube/config\n```\n\nCNI를 설치합니다.\n쿠버네티스 내부의 네트워크 설정을 전담하는 CNI는 여러 종류가 있으며, *모두의 MLOps*에서는 flannel을 사용합니다.\n\n```bash\nkubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.13.0/Documentation/kube-flannel.yml\n```\n\n쿠버네티스 노드의 종류에는 크게 `마스터 노드`와 `워커 노드`가 있습니다.\n안정성을 위하여 `마스터 노드`에는 쿠버네티스 클러스터를 제어하는 작업만 실행되도록 하는 것이 일반적이지만,\n이 매뉴얼에서는 싱글 클러스터를 가정하고 있으므로 마스터 노드에 모든 종류의 작업이 실행될 수 있도록 설정합니다.\n\n```bash\nkubectl taint nodes --all node-role.kubernetes.io/master-\n```\n\n## 3. 쿠버네티스 클라이언트 셋업\n\n클러스터에 생성된 kubeconfig 파일을 **클라이언트**에 복사하여 kubectl을 통해 클러스터를 제어할 수 있도록 합니다.\n\n```bash\nmkdir -p $HOME/.kube\nscp -p {CLUSTER_USER_ID}@{CLUSTER_IP}:~/.kube/config ~/.kube/config\n```\n\n## 4. 쿠버네티스 기본 모듈 설치\n\n[Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md)을 참고하여 다음 컴포넌트들을 설치해 주시기 바랍니다.\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 정상 설치 확인\n\n다음 명령어를 통해 노드의 STATUS가 Ready 상태가 되었는지 확인합니다.\n\n```bash\nkubectl get nodes\n```\n\nReady 가 되면 다음과 비슷한 결과가 출력됩니다.\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION\nubuntu   Ready    control-plane,master   2m55s   v1.21.7\n```\n\n## 6. 
References\n\n- [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/install-kubernetes/kubernetes-with-minikube.md",
    "content": "---\ntitle: \"4.2. Minikube\"\ndescription: \"\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## 1. Prerequisite\n\n쿠버네티스 클러스터를 구축하기에 앞서, 필요한 구성 요소들을 **클러스터에** 설치합니다.\n\n[Install Prerequisite](../../setup-kubernetes/install-prerequisite.md)을 참고하여 Kubernetes를 설치하기 전에 필요한 요소들을 **클러스터에** 설치해 주시기 바랍니다.\n\n### Minikube binary\n\nMinikube를 사용하기 위해, v1.24.0 버전의 Minikube 바이너리를 설치합니다.\n\n```bash\nwget https://github.com/kubernetes/minikube/releases/download/v1.24.0/minikube-linux-amd64\nsudo install minikube-linux-amd64 /usr/local/bin/minikube\n```\n\n정상적으로 설치되었는지 확인합니다.\n\n```bash\nminikube version\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nmlops@ubuntu:~$ minikube version\nminikube version: v1.24.0\ncommit: 76b94fb3c4e8ac5062daf70d60cf03ddcc0a741b\n```\n\n## 2. 쿠버네티스 클러스터 셋업\n\n이제 Minikube를 활용해 쿠버네티스 클러스터를 **클러스터에** 구축합니다.\nGPU 의 원활한 사용과 클러스터-클라이언트 간 통신을 간편하게 수행하기 위해, Minikube 는 `driver=none` 옵션을 활용하여 실행합니다. `driver=none` 옵션은 root user 로 실행해야 함에 주의 바랍니다.\n\nroot user로 전환합니다.\n\n```bash\nsudo su\n```\n\n`minikube start`를 수행하여 쿠버네티스 클러스터 구축을 진행합니다. Kubeflow의 원활한 사용을 위해, 쿠버네티스 버전은 v1.21.7로 지정하여 구축하며 `--extra-config`를 추가합니다.\n\n```bash\nminikube start --driver=none \\\n  --kubernetes-version=v1.21.7 \\\n  --extra-config=apiserver.service-account-signing-key-file=/var/lib/minikube/certs/sa.key \\\n  --extra-config=apiserver.service-account-issuer=kubernetes.default.svc\n```\n\n### Disable default addons\n\nMinikube를 설치하면 Default로 설치되는 addon이 존재합니다. 
이 중 저희가 사용하지 않을 addon을 비활성화합니다.\n\n```bash\nminikube addons disable storage-provisioner\nminikube addons disable default-storageclass\n```\n\n모든 addon이 비활성화된 것을 확인합니다.\n\n```bash\nminikube addons list\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nroot@ubuntu:/home/mlops# minikube addons list\n|-----------------------------|----------|--------------|-----------------------|\n|         ADDON NAME          | PROFILE  |    STATUS    |      MAINTAINER       |\n|-----------------------------|----------|--------------|-----------------------|\n| ambassador                  | minikube | disabled     | unknown (third-party) |\n| auto-pause                  | minikube | disabled     | google                |\n| csi-hostpath-driver         | minikube | disabled     | kubernetes            |\n| dashboard                   | minikube | disabled     | kubernetes            |\n| default-storageclass        | minikube | disabled     | kubernetes            |\n| efk                         | minikube | disabled     | unknown (third-party) |\n| freshpod                    | minikube | disabled     | google                |\n| gcp-auth                    | minikube | disabled     | google                |\n| gvisor                      | minikube | disabled     | google                |\n| helm-tiller                 | minikube | disabled     | unknown (third-party) |\n| ingress                     | minikube | disabled     | unknown (third-party) |\n| ingress-dns                 | minikube | disabled     | unknown (third-party) |\n| istio                       | minikube | disabled     | unknown (third-party) |\n| istio-provisioner           | minikube | disabled     | unknown (third-party) |\n| kubevirt                    | minikube | disabled     | unknown (third-party) |\n| logviewer                   | minikube | disabled     | google                |\n| metallb                     | minikube | disabled     | unknown (third-party) |\n| metrics-server              | 
minikube | disabled     | kubernetes            |\n| nvidia-driver-installer     | minikube | disabled     | google                |\n| nvidia-gpu-device-plugin    | minikube | disabled     | unknown (third-party) |\n| olm                         | minikube | disabled     | unknown (third-party) |\n| pod-security-policy         | minikube | disabled     | unknown (third-party) |\n| portainer                   | minikube | disabled     | portainer.io          |\n| registry                    | minikube | disabled     | google                |\n| registry-aliases            | minikube | disabled     | unknown (third-party) |\n| registry-creds              | minikube | disabled     | unknown (third-party) |\n| storage-provisioner         | minikube | disabled     | kubernetes            |\n| storage-provisioner-gluster | minikube | disabled     | unknown (third-party) |\n| volumesnapshots             | minikube | disabled     | kubernetes            |\n|-----------------------------|----------|--------------|-----------------------|\n```\n\n## 3. 쿠버네티스 클라이언트 셋업\n\n이번에는 **클라이언트**에 쿠버네티스의 원활한 사용을 위한 도구를 설치합니다.\n**클라이언트**와 **클러스터** 노드가 분리되지 않은 경우에는 root user로 모든 작업을 진행해야 함에 주의바랍니다.\n\n**클라이언트**와 **클러스터** 노드가 분리된 경우, 우선 kubernetes의 관리자 인증 정보를 **클라이언트**로 가져옵니다.\n\n1. **클러스터**에서 config를 확인합니다.\n\n  ```bash\n  # 클러스터 노드\n  minikube kubectl -- config view --flatten\n  ```\n\n2. 
다음과 같은 정보가 출력됩니다.\n\n  ```bash\n  apiVersion: v1\n  clusters:\n  - cluster:\n      certificate-authority-data: LS0tLS1CRUd....\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: cluster_info\n      server: https://192.168.0.62:8443\n    name: minikube\n  contexts:\n  - context:\n      cluster: minikube\n      extensions:\n      - extension:\n          last-update: Mon, 06 Dec 2021 06:55:46 UTC\n          provider: minikube.sigs.k8s.io\n          version: v1.24.0\n        name: context_info\n      namespace: default\n      user: minikube\n    name: minikube\n  current-context: minikube\n  kind: Config\n  preferences: {}\n  users:\n  - name: minikube\n    user:\n      client-certificate-data: LS0tLS1CRUdJTi....\n      client-key-data: LS0tLS1CRUdJTiBSU0....\n  ```\n\n3. **클라이언트** 노드에서 `.kube` 폴더를 생성합니다.\n\n  ```bash\n  # 클라이언트 노드\n  mkdir -p /home/$USER/.kube\n  ```\n\n4. 해당 파일에 2. 에서 출력된 정보를 붙여넣은 뒤 저장합니다.\n  \n  ```bash\n  vi /home/$USER/.kube/config\n  ```\n\n## 4. 쿠버네티스 기본 모듈 설치\n\n[Setup Kubernetes Modules](../../setup-kubernetes/install-kubernetes-module.md)을 참고하여 다음 컴포넌트들을 설치해 주시기 바랍니다.\n\n- helm\n- kustomize\n- CSI plugin\n- [Optional] nvidia-docker, nvidia-device-plugin\n\n## 5. 정상 설치 확인\n\n최종적으로 node가 Ready 인지, OS, Docker, Kubernetes 버전을 확인합니다.\n\n```bash\nkubectl get nodes -o wide\n```\n\n다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n```bash\nNAME     STATUS   ROLES                  AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME\nubuntu   Ready    control-plane,master   2d23h   v1.21.7   192.168.0.75   <none>        Ubuntu 20.04.3 LTS   5.4.0-91-generic   docker://20.10.11\n```\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/install-kubernetes-module.md",
    "content": "---\ntitle: \"5. Install Kubernetes Modules\"\ndescription: \"Install Helm, Kustomize\"\nsidebar_position: 5\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Modules\n\n이번 페이지에서는 클러스터에서 사용할 모듈을 클라이언트 노드에서 설치하는 과정에 관해서 설명합니다.  \n앞으로 소개되는 과정은 모두 **클라이언트 노드**에서 진행됩니다.\n\n## Helm\n\nHelm은 쿠버네티스 패키지와 관련된 자원을 한 번에 배포하고 관리할 수 있게 도와주는 패키지 매니징 도구 중 하나입니다.\n\n1. 현재 폴더에 Helm v3.7.1 버전을 내려받습니다.\n\n- For Linux amd64\n\n  ```bash\n  wget https://get.helm.sh/helm-v3.7.1-linux-amd64.tar.gz\n  ```\n\n- 다른 OS는 [공식 홈페이지](https://github.com/helm/helm/releases/tag/v3.7.1)를 참고하시어, 클라이언트 노드의 OS와 CPU에 맞는 바이너리의 다운 경로를 확인하시기 바랍니다.\n\n2. helm을 사용할 수 있도록 압축을 풀고, 파일의 위치를 변경합니다.\n\n  ```bash\n  tar -zxvf helm-v3.7.1-linux-amd64.tar.gz\n  sudo mv linux-amd64/helm /usr/local/bin/helm\n  ```\n\n3. 정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  helm help\n  ```\n\n  다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  The Kubernetes package manager\n\n  Common actions for Helm:\n\n  - helm search:    search for charts\n  - helm pull:      download a chart to your local directory to view\n  - helm install:   upload the chart to Kubernetes\n  - helm list:      list releases of charts\n\n  Environment variables:\n\n  | Name                     | Description                                                         |\n  |--------------------------|---------------------------------------------------------------------|\n  | $HELM_CACHE_HOME         | set an alternative location for storing cached files.               |\n  | $HELM_CONFIG_HOME        | set an alternative location for storing Helm configuration.         |\n  | $HELM_DATA_HOME          | set an alternative location for storing Helm data.                  |\n\n  ...\n  ```\n\n## Kustomize\n\nkustomize 또한 여러 쿠버네티스 리소스를 한 번에 배포하고 관리할 수 있게 도와주는 패키지 매니징 도구 중 하나입니다.\n\n1. 
현재 폴더에 kustomize v3.10.0 버전의 바이너리를 다운받습니다.\n\n- For Linux amd64\n\n  ```bash\n  wget https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv3.10.0/kustomize_v3.10.0_linux_amd64.tar.gz\n  ```\n\n- 다른 OS는 [kustomize/v3.10.0](https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv3.10.0)에서 확인 후 다운로드 받습니다.\n\n2. kustomize 를 사용할 수 있도록 압축을 풀고, 파일의 위치를 변경합니다.\n\n  ```bash\n  tar -zxvf kustomize_v3.10.0_linux_amd64.tar.gz\n  sudo mv kustomize /usr/local/bin/kustomize\n  ```\n\n3. 정상적으로 설치되었는지 확인합니다.\n\n  ```bash\n  kustomize help\n  ```\n\n  다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  Manages declarative configuration of Kubernetes.\n  See https://sigs.k8s.io/kustomize\n\n  Usage:\n    kustomize [command]\n\n  Available Commands:\n    build                     Print configuration per contents of kustomization.yaml\n    cfg                       Commands for reading and writing configuration.\n    completion                Generate shell completion script\n    create                    Create a new kustomization in the current directory\n    edit                      Edits a kustomization file\n    fn                        Commands for running functions against configuration.\n  ...\n  ```\n\n## CSI Plugin : Local Path Provisioner\n\n1. CSI Plugin은 kubernetes 내의 스토리지를 담당하는 모듈입니다. 
단일 노드 클러스터에서 쉽게 사용할 수 있는 CSI Plugin인 Local Path Provisioner를 설치합니다.\n\n  ```bash\n  kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.20/deploy/local-path-storage.yaml\n  ```\n\n  다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n  ```bash\n  namespace/local-path-storage created\n  serviceaccount/local-path-provisioner-service-account created\n  clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created\n  clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created\n  deployment.apps/local-path-provisioner created\n  storageclass.storage.k8s.io/local-path created\n  configmap/local-path-config created\n  ```\n\n2. 또한, 다음과 같이 local-path-storage namespace 에 provisioner pod이 Running 인지 확인합니다.\n\n  ```bash\n  kubectl -n local-path-storage get pod\n  ```\n\n  정상적으로 수행되면 아래와 같이 출력됩니다.\n\n  ```bash\n  NAME                                     READY     STATUS    RESTARTS   AGE\n  local-path-provisioner-d744ccf98-xfcbk   1/1       Running   0          7m\n  ```\n\n3. 다음을 수행하여 default storage class로 변경합니다.\n\n  ```bash\n  kubectl patch storageclass local-path  -p '{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"}}}'\n  ```\n\n  정상적으로 수행되면 아래와 같이 출력됩니다.\n\n  ```bash\n  storageclass.storage.k8s.io/local-path patched\n  ```\n\n4. default storage class로 설정되었는지 확인합니다.\n\n  ```bash\n  kubectl get sc\n  ```\n\n  다음과 같이 NAME에 `local-path (default)` 인 storage class가 존재하는 것을 확인합니다.\n\n  ```bash\n  NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE\n  local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  2h\n  ```\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/install-prerequisite.md",
    "content": "---\ntitle: \"3. Install Prerequisite\"\ndescription: \"Install docker\"\nsidebar_position: 3\ndate: 2021-12-13\nlastmod: 2021-12-20\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Sangwoo Shim\"]\n---\n\n\n이 페이지에서는 쿠버네티스를 설치하기에 앞서, **클러스터**와 **클라이언트**에 설치 혹은 설정해두어야 하는 컴포넌트들에 대한 매뉴얼을 설명합니다.\n\n## Install apt packages\n\n추후 클라이언트와 클러스터의 원활한 통신을 위해서는 Port-Forwarding을 수행해야 할 일이 있습니다.\nPort-Forwarding을 위해서는 **클러스터**에 다음 패키지를 설치해 주어야 합니다.\n\n```bash\nsudo apt-get update\nsudo apt-get install -y socat\n```\n\n## Install Docker\n\n1. 도커 설치에 필요한 APT 패키지들을 설치합니다.\n\n   ```bash\n   sudo apt-get update && sudo apt-get install -y ca-certificates curl gnupg lsb-release\n   ```\n\n2. 도커의 공식 GPG key를 추가합니다.\n\n   ```bash\n   curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg\n   ```\n\n3. apt 패키지 매니저로 도커를 설치할 때, stable Repository에서 받아오도록 설정합니다.\n\n   ```bash\n   echo \\\n   \"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \\\n   $(lsb_release -cs) stable\" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null\n   ```\n\n4. 현재 설치할 수 있는 도커 버전을 확인합니다.\n\n   ```bash\n   sudo apt-get update && apt-cache madison docker-ce\n   ```\n\n   출력되는 버전 중 `5:20.10.11~3-0~ubuntu-focal` 버전이 있는지 확인합니다.\n\n   ```bash\n   apt-cache madison docker-ce | grep 5:20.10.11~3-0~ubuntu-focal\n   ```\n\n   정상적으로 추가가 된 경우 다음과 같이 출력됩니다.\n\n   ```bash\n   docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages\n   ```\n\n5. `5:20.10.11~3-0~ubuntu-focal` 버전의 도커를 설치합니다.\n\n   ```bash\n   sudo apt-get install -y containerd.io docker-ce=5:20.10.11~3-0~ubuntu-focal docker-ce-cli=5:20.10.11~3-0~ubuntu-focal\n   ```\n\n6. 
도커가 정상적으로 설치된 것을 확인합니다.\n\n   ```bash\n   sudo docker run hello-world\n   ```\n\n   명령어 실행 후 다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n   ```bash\n   mlops@ubuntu:~$ sudo docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n7. docker 관련 command를 sudo 키워드 없이 사용할 수 있게 하도록 다음 명령어를 통해 권한을 추가합니다.\n\n   ```bash\n   sudo groupadd docker\n   sudo usermod -aG docker $USER\n   newgrp docker\n   ```\n\n8. sudo 키워드 없이 docker command를 사용할 수 있게 된 것을 확인하기 위해, 다시 한번 docker run을 실행합니다.\n\n   ```bash\n   docker run hello-world\n   ```\n\n   명령어 실행 후 다음과 같은 메시지가 보이면 정상적으로 권한이 추가된 것을 의미합니다.\n\n   ```bash\n   mlops@ubuntu:~$ docker run hello-world\n\n   Hello from Docker!\n   This message shows that your installation appears to be working correctly.\n\n   To generate this message, Docker took the following steps:\n   1. The Docker client contacted the Docker daemon.\n   2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n      (amd64)\n   3. The Docker daemon created a new container from that image which runs the\n      executable that produces the output you are currently reading.\n   4. 
The Docker daemon streamed that output to the Docker client, which sent it\n      to your terminal.\n\n   To try something more ambitious, you can run an Ubuntu container with:\n   $ docker run -it ubuntu bash\n\n   Share images, automate workflows, and more with a free Docker ID:\n   https://hub.docker.com/\n\n   For more examples and ideas, visit:\n   https://docs.docker.com/get-started/\n   ```\n\n## Turn off Swap Memory\n\nkubelet 이 정상적으로 동작하게 하기 위해서는 **클러스터** 노드에서 swap이라고 불리는 가상메모리를 꺼 두어야 합니다. 다음 명령어를 통해 swap을 꺼 둡니다.  \n**(클러스터와 클라이언트를 같은 데스크톱에서 사용할 때 swap 메모리를 종료하면 속도의 저하가 있을 수 있습니다)**  \n\n```bash\nsudo sed -i '/ swap / s/^\\(.*\\)$/#\\1/g' /etc/fstab\nsudo swapoff -a\n```\n\n## Install Kubectl\n\nkubectl 은 쿠버네티스 클러스터에 API를 요청할 때 사용하는 클라이언트 툴입니다. **클라이언트** 노드에 설치해두어야 합니다.\n\n1. 현재 폴더에 kubectl v1.21.7 버전을 다운받습니다.\n\n   ```bash\n   curl -LO https://dl.k8s.io/release/v1.21.7/bin/linux/amd64/kubectl\n   ```\n\n2. kubectl 을 사용할 수 있도록 파일의 권한과 위치를 변경합니다.\n\n   ```bash\n   sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl\n   ```\n\n3. 정상적으로 설치되었는지 확인합니다.\n\n   ```bash\n   kubectl version --client\n   ```\n\n   다음과 같은 메시지가 보이면 정상적으로 설치된 것을 의미합니다.\n\n   ```bash\n   Client Version: version.Info{Major:\"1\", Minor:\"21\", GitVersion:\"v1.21.7\", GitCommit:\"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4\", GitTreeState:\"clean\", BuildDate:\"2021-11-17T14:41:19Z\", GoVersion:\"go1.16.10\", Compiler:\"gc\", Platform:\"linux/amd64\"}\n   ```\n\n4. 여러 개의 쿠버네티스 클러스터를 사용하는 경우, 여러 개의 kubeconfig 파일을 관리해야 하는 경우가 있습니다.  
\n여러 개의 kubeconfig 파일 혹은 여러 개의 kube-context를 효율적으로 관리하는 방법은 다음과 같은 문서를 참고하시기 바랍니다.\n\n   - [https://dev.to/aabiseverywhere/configuring-multiple-kubeconfig-on-your-machine-59eo](https://dev.to/aabiseverywhere/configuring-multiple-kubeconfig-on-your-machine-59eo)\n   - [https://github.com/ahmetb/kubectx](https://github.com/ahmetb/kubectx)\n\n## References\n\n- [Install Docker Engine on Ubuntu](https://docs.docker.com/engine/install/ubuntu/)\n- [리눅스에 kubectl 설치 및 설정](https://kubernetes.io/ko/docs/tasks/tools/install-kubectl-linux/)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/intro.md",
    "content": "---\ntitle: \"1. Introduction\"\ndescription: \"Setup Introduction\"\nsidebar_position: 1\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\", \"Jongsun Shinn\", \"Youngdon Tae\", \"SeungTae Kim\"]\n---\n\n## MLOps 시스템 구축해보기\n\nMLOps를 공부하는 데 있어서 가장 큰 장벽은 MLOps 시스템을 구성해보고 사용해보기가 어렵다는 점입니다. AWS, GCP 등의 퍼블릭 클라우드 혹은 Weight & Bias, neptune.ai 등의 상용 툴을 사용해보기에는 과금에 대한 부담이 존재하고, 처음부터 모든 환경을 혼자서 구성하기에는 어디서부터 시작해야 할지 막막하게 느껴질 수밖에 없습니다.\n\n이런 이유들로 MLOps를 선뜻 시작해보지 못하시는 분들을 위해, *모두의 MLOps*에서는 우분투가 설치되는 데스크톱 하나만 준비되어 있다면 MLOps 시스템을 밑바닥부터 구축하고 사용해 볼 수 있는 방법을 다룰 예정입니다.\n\n우분투 데스크탑 환경을 준비할 수 없는 경우, 가상머신을 활용하여 환경을 구성하기\n\n>Windows 혹은 Intel Mac을 사용해 `모두의 MLops` 실습을 진행 중인 분들은 `Virtual Box`, `VMware` 등의 가상머신 소프트웨어를 이용하여 우분투 데스크탑 환경을 준비할 수 있습니다. 이 때, 권장 사양을 맞춰 가상 머신을 생성해주시기 바랍니다.\n>또한, M1 Mac을 사용하시는 분들은 작성일(2022년 2월) 기준으로는 Virtual Box, VMware 는 이용할 수 없습니다. ([M1 Apple Silicone Mac에 최적화된 macOS 앱 지원 확인하기](https://isapplesiliconready.com/kr))\n>따라서, 클라우드 환경을 이용해 실습하는 것이 아니라면, [UTM , Virtual machines for Mac](https://mac.getutm.app/)을 설치하여 가상 머신을 이용해주세요.\n>(앱스토어에서 구매하여 다운로드 받는 소프트웨어는 일종의 Donation 개념의 비용 지불입니다. 무료 버전과 자동 업데이트 정도의 차이가 있어, 무료버전을 사용해도 무방합니다.)\n>해당 가상머신 소프트웨어는 `Ubuntu 20.04.3 LTS` 실습 운영체제를 지원하고 있어, M1 Mac에서 실습을 수행하는 것을 가능하게 합니다.\n\n\n하지만 [MLOps의 구성요소](../introduction/component.md)에서 설명하는 요소들을 모두 사용해볼 수는 없기에, *모두의 MLOps*에서는 대표적인 오픈소스만을 설치한 뒤, 서로 연동하여 사용하는 부분을 주로 다룰 예정입니다.\n\n*모두의 MLOps*에서 설치하는 오픈소스가 표준을 의미하는 것은 아니며, 여러분의 상황에 맞게 적절한 툴을 취사선택하는 것을 권장합니다.\n\n## 구성 요소\n\n이 글에서 만들어 볼 MLOps 시스템의 구성 요소들과 각 버전은 아래와 같은 환경에서 검증되었습니다.\n\n원활한 환경에서 테스트하기 위해 **싱글 노드 클러스터 (혹은 클러스터)** 와 **클라이언트**를 분리하여 설명해 드릴 예정입니다.  \n**클러스터** 는 우분투가 설치되어 있는 데스크톱 하나를 의미합니다.  \n**클라이언트** 는 노트북 혹은 클러스터가 설치되어 있는 데스크톱 외의 클라이언트로 사용할 수 있는 다른 데스크톱을 사용하는 것을 권장합니다.  \n하지만 두 대의 머신을 준비할 수 없다면 데스크톱 하나를 동시에 클러스터와 클라이언트 용도로 사용하셔도 괜찮습니다.\n\n### 클러스터\n\n#### 1. 
Software\n\n아래는 클러스터에 설치해야 할 소프트웨어 목록입니다.\n\n| Software        | Version     |\n| --------------- | ----------- |\n| Ubuntu          | 20.04.3 LTS |\n| Docker (Server) | 20.10.11    |\n| NVIDIA-Driver   | 470.86      |\n| Kubernetes      | v1.21.7     |\n| Kubeflow        | v1.4.0      |\n| MLFlow          | v1.21.0     |\n\n#### 2. Helm Chart\n\n아래는 Helm을 이용해 설치되어야 할 써드파티 소프트웨어 목록입니다.\n\n| Helm Chart Repo Name          | Version |\n| ----------------------------- | ------- |\n| datawire/ambassador           | 6.9.3   |\n| seldonio/seldon-core-operator | 1.11.2  |\n\n### 클라이언트\n\n클라이언트는 MacOS (Intel CPU), Ubuntu 20.04 에서 검증되었습니다.\n\n| Software        | Version     |\n| --------------- | ----------- |\n| kubectl         | v1.21.7     |\n| helm            | v3.7.1      |\n| kustomize       | v3.10.0     |\n\n### Minimum System Requirements\n\n모두의 MLOps를 설치할 클러스터는 다음과 같은 사양을 만족시키는 것을 권장합니다.  \n이는 Kubernetes 및 Kubeflow 의 권장 사양에 의존합니다.\n\n- CPU : 6 core\n- RAM : 12GB\n- DISK : 50GB\n- GPU : NVIDIA GPU (Optional)\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/kubernetes.md",
    "content": "---\ntitle : \"2. Setup Kubernetes\"\ndescription: \"Setup Kubernetes\"\nsidebar_position: 2\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n## Setup Kubernetes Cluster\n\n쿠버네티스를 처음 배우시는 분들에게 첫 진입 장벽은 쿠버네티스 실습 환경을 구축하는 것입니다.\n\n프로덕션 레벨의 쿠버네티스 클러스터를 구축할 수 있게 공식적으로 지원하는 도구는 kubeadm 이지만, 사용자들이 조금 더 쉽게 구축할 수 있도록 도와주는 kubespray, kops 등의 도구도 존재하며, 학습 목적을 위해서 컴팩트한 쿠버네티스 클러스터를 정말 쉽게 구축할 수 있도록 도와주는 k3s, minikube, microk8s, kind 등의 도구도 존재합니다.\n\n각각의 도구는 장단점이 다르기에 사용자마다 선호하는 도구가 다른 점을 고려하여, 본 글에서는 kubeadm, k3s, minikube의 3가지 도구를 활용하여 쿠버네티스 클러스터를 구축하는 방법을 다룹니다.\n각 도구에 대한 자세한 비교는 다음 쿠버네티스 [공식 문서](https://kubernetes.io/ko/docs/tasks/tools/)를 확인해주시기를 바랍니다.\n\n*모두의 MLOps*에서 권장하는 툴은 **k3s**로 쿠버네티스 클러스터를 구축할 때 쉽게 할 수 있다는 장점이 있습니다.  \n만약 쿠버네티스의 모든 기능을 사용하고 노드 구성까지 활용하고 싶다면 **kubeadm**을 권장해 드립니다.  \n**minikube** 는 저희가 설명하는 컴포넌트 외에도 다른 쿠버네티스를 add-on 형식으로 쉽게 설치할 수 있다는 장점이 있습니다.\n\n본 *모두의 MLOps*에서는 구축하게 될 MLOps 구성 요소들을 원활히 사용하기 위해, 각각의 도구를 활용해 쿠버네티스 클러스터를 구축할 때, 추가로 설정해 주어야 하는 부분이 추가되어 있습니다.\n\nUbuntu OS까지는 설치되어 있는 데스크탑을 k8s cluster로 구축한 뒤, 외부 클라이언트 노드에서 쿠버네티스 클러스터에 접근하는 것을 확인하는 것까지가 본 **Setup Kubernetes**단원의 범위입니다.\n\n자세한 구축 방법은 3가지 도구마다 다르기에 다음과 같은 흐름으로 구성되어 있습니다.\n\n```bash\n3. Setup Prerequisite\n4. Setup Kubernetes\n  4.1. with k3s\n  4.2. with minikube\n  4.3. with kubeadm\n5. Setup Kubernetes Modules\n```\n\n그럼 이제 각각의 도구를 활용해 쿠버네티스 클러스터를 구축해보겠습니다. 반드시 모든 도구를 사용해 볼 필요는 없으며, 이 중 여러분이 익숙하신 도구를 활용해주시면 충분합니다.\n"
  },
  {
    "path": "versioned_docs/version-1.0/setup-kubernetes/setup-nvidia-gpu.md",
    "content": "---\ntitle: \"6. (Optional) Setup GPU\"\ndescription: \"Install nvidia docker, nvidia device plugin\"\nsidebar_position: 6\ndate: 2021-12-13\nlastmod: 2021-12-13\ncontributors: [\"Jaeyeon Kim\"]\n---\n\n쿠버네티스 및 Kubeflow 등에서 GP 를 사용하기 위해서는 다음 작업이 필요합니다.\n\n## 1. Install NVIDIA Driver\n\n`nvidia-smi` 수행 시 다음과 같은 화면이 출력된다면 이 단계는 생략해 주시기 바랍니다.\n\n  ```bash\n  mlops@ubuntu:~$ nvidia-smi \n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  
Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     7W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  |    0   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                198MiB |\n  |    0   N/A  N/A      1893      G   /usr/bin/gnome-shell               10MiB |\n  |    1   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                  4MiB |\n  +-----------------------------------------------------------------------------+\n  ```\n\nIf the output of `nvidia-smi` does not look like the above, install the NVIDIA driver that matches your GPU.\n\nIf you are not familiar with installing NVIDIA drivers, you can install one with the following commands.\n\n  ```bash\n  sudo add-apt-repository ppa:graphics-drivers/ppa\n  sudo apt update && sudo apt install -y ubuntu-drivers-common\n  sudo ubuntu-drivers autoinstall\n  sudo reboot\n  ```\n\n## 2. Install NVIDIA-Docker\n\nInstall NVIDIA-Docker.\n\n```bash\ncurl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \\\n  sudo apt-key add -\ndistribution=$(. /etc/os-release;echo $ID$VERSION_ID)\ncurl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list\nsudo apt-get update\nsudo apt-get install -y nvidia-docker2 &&\nsudo systemctl restart docker\n```\n\nTo verify the installation, run a Docker container that uses the GPU.\n\n```bash\nsudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n```\n\nIf you see output like the following, the installation succeeded.\n\n  ```bash\n  mlops@ubuntu:~$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi\n  +-----------------------------------------------------------------------------+\n  | NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |\n  |-------------------------------+----------------------+----------------------+\n  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n  |                               |                      |               MIG M. |\n  |===============================+======================+======================|\n  |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |\n  | 25%   32C    P8     4W / 120W |    211MiB /  6078MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n  |   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |\n  |  0%   34C    P8     6W / 175W |      5MiB /  7982MiB |      0%      Default |\n  |                               |                      |                  N/A |\n  +-------------------------------+----------------------+----------------------+\n                                                                                \n  +-----------------------------------------------------------------------------+\n  | Processes:                                                                  |\n  |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n  |        ID   ID                                                   Usage      |\n  |=============================================================================|\n  +-----------------------------------------------------------------------------+\n  ```\n\n## 3. Set NVIDIA-Docker as the Default Container Runtime\n\nKubernetes uses Docker-CE as its default container runtime.\nTherefore, to use NVIDIA GPUs inside Docker containers, you need to change the default runtime so that pods can be created with NVIDIA-Docker as the container runtime.\n\n1. Open the `/etc/docker/daemon.json` file and edit it as follows.\n\n  ```bash\n  sudo vi /etc/docker/daemon.json\n\n  {\n    \"default-runtime\": \"nvidia\",\n    \"runtimes\": {\n        \"nvidia\": {\n            \"path\": \"nvidia-container-runtime\",\n            \"runtimeArgs\": []\n        }\n    }\n  }\n  ```\n\n2. After confirming the file has been changed, restart Docker.\n\n  ```bash\n  sudo systemctl daemon-reload\n  sudo service docker restart\n  ```\n\n3. Verify that the change has been applied.\n\n  ```bash\n  sudo docker info | grep nvidia\n  ```\n\n  If you see output like the following, the configuration was applied successfully.\n\n  ```bash\n  mlops@ubuntu:~$ docker info | grep nvidia\n  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc\n  Default Runtime: nvidia\n  ```\n\n## 4. NVIDIA Device Plugin\n\n1. Create the nvidia-device-plugin DaemonSet.\n\n  ```bash\n  kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.10.0/nvidia-device-plugin.yml\n  ```\n\n2. Check that the nvidia-device-plugin pod has been created and is in the Running state.\n\n  ```bash\n  kubectl get pod -n kube-system | grep nvidia\n  ```\n\n  The output should look like the following.\n\n  ```bash\n  kube-system       nvidia-device-plugin-daemonset-nlqh2         1/1     Running   0      1h\n  ```\n\n3. Check that the node reports GPUs as allocatable.\n\n  ```bash\n  kubectl get nodes \"-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\\.com/gpu\"\n  ```\n\n  If you see output like the following, GPUs are configured correctly.  \n  (The cluster used for the hands-on in *모두의 MLOps* has two GPUs, so 2 is printed.\n  Any number that matches the GPU count of your own cluster is fine.)\n\n  ```bash\n  NAME       GPU\n  ubuntu     2\n  ```\n\nIf GPUs are not configured, the GPU value is shown as `<none>`.\n"
  },
  {
    "path": "versioned_sidebars/version-1.0-sidebars.json",
    "content": "{\n  \"tutorialSidebar\": [\n    {\n      \"type\": \"category\",\n      \"label\": \"Introduction\",\n      \"items\": [\n        \"introduction/intro\",\n        \"introduction/levels\",\n        \"introduction/component\",\n        \"introduction/why_kubernetes\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"Setup Kubernetes\",\n      \"items\": [\n        \"setup-kubernetes/intro\",\n        \"setup-kubernetes/kubernetes\",\n        \"setup-kubernetes/install-prerequisite\",\n        {\n          \"type\": \"category\",\n          \"label\": \"4. Install Kubernetes\",\n          \"items\": [\n            \"setup-kubernetes/install-kubernetes/kubernetes-with-k3s\",\n            \"setup-kubernetes/install-kubernetes/kubernetes-with-kubeadm\",\n            \"setup-kubernetes/install-kubernetes/kubernetes-with-minikube\"\n          ]\n        },\n        \"setup-kubernetes/install-kubernetes-module\",\n        \"setup-kubernetes/setup-nvidia-gpu\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"Setup Components\",\n      \"items\": [\n        \"setup-components/install-components-kf\",\n        \"setup-components/install-components-mlflow\",\n        \"setup-components/install-components-seldon\",\n        \"setup-components/install-components-pg\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"Kubeflow UI Guide\",\n      \"items\": [\n        \"kubeflow-dashboard-guide/intro\",\n        \"kubeflow-dashboard-guide/notebooks\",\n        \"kubeflow-dashboard-guide/tensorboards\",\n        \"kubeflow-dashboard-guide/volumes\",\n        \"kubeflow-dashboard-guide/experiments\",\n        \"kubeflow-dashboard-guide/experiments-and-others\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"Kubeflow\",\n      \"items\": [\n        \"kubeflow/kubeflow-intro\",\n        \"kubeflow/kubeflow-concepts\",\n        \"kubeflow/basic-requirements\",\n        
\"kubeflow/basic-component\",\n        \"kubeflow/basic-pipeline\",\n        \"kubeflow/basic-pipeline-upload\",\n        \"kubeflow/basic-run\",\n        \"kubeflow/advanced-component\",\n        \"kubeflow/advanced-environment\",\n        \"kubeflow/advanced-pipeline\",\n        \"kubeflow/advanced-run\",\n        \"kubeflow/advanced-mlflow\",\n        \"kubeflow/how-to-debug\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"API Deployment\",\n      \"items\": [\n        \"api-deployment/what-is-api-deployment\",\n        \"api-deployment/seldon-iris\",\n        \"api-deployment/seldon-pg\",\n        \"api-deployment/seldon-fields\",\n        \"api-deployment/seldon-mlflow\",\n        \"api-deployment/seldon-children\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"Appendix\",\n      \"items\": [\n        \"appendix/pyenv\",\n        \"appendix/metallb\"\n      ]\n    },\n    {\n      \"type\": \"category\",\n      \"label\": \"Further Readings\",\n      \"items\": [\n        \"further-readings/info\"\n      ]\n    }\n  ],\n  \"preSidebar\": [\n    {\n      \"type\": \"category\",\n      \"label\": \"Docker\",\n      \"items\": [\n        \"prerequisites/docker/install\",\n        \"prerequisites/docker/introduction\",\n        \"prerequisites/docker/docker\",\n        \"prerequisites/docker/command\",\n        \"prerequisites/docker/images\",\n        \"prerequisites/docker/advanced\"\n      ]\n    }\n  ]\n}\n"
  },
  {
    "path": "versions.json",
    "content": "[\n  \"1.0\"\n]\n"
  }
]