Repository: jina-ai/clip-as-service
Branch: main
Commit: 03410570d439
Files: 104
Total size: 598.7 KB

Directory structure:
gitextract_i8arbao_/

├── .dockerignore
├── .github/
│   ├── README-exec/
│   │   ├── onnx.readme.md
│   │   └── torch.readme.md
│   ├── codecov.yml
│   ├── labeler.yml
│   ├── release-template.ejs
│   └── workflows/
│       ├── cd.yml
│       ├── ci.yml
│       ├── force-docker-build-cas.yml
│       ├── force-docker-build.yml
│       ├── force-docs-build.yml
│       ├── force-hub-push.yml
│       ├── force-release.yml
│       ├── label-pr.yml
│       └── tag.yml
├── .gitignore
├── .pre-commit-config.yaml
├── CHANGELOG.md
├── Dockerfiles/
│   ├── base.Dockerfile
│   ├── cuda.Dockerfile
│   ├── server.Dockerfile
│   └── tensorrt.Dockerfile
├── LICENSE
├── README.md
├── client/
│   ├── clip_client/
│   │   ├── __init__.py
│   │   ├── client.py
│   │   └── helper.py
│   └── setup.py
├── docs/
│   ├── Makefile
│   ├── _static/
│   │   ├── cas-grafana.json
│   │   ├── demo-embed.html
│   │   ├── demo-text-rank.html
│   │   └── main.css
│   ├── _templates/
│   │   ├── page.html
│   │   └── sidebar/
│   │       ├── brand.html
│   │       └── navigation.html
│   ├── changelog/
│   │   └── index.md
│   ├── conf.py
│   ├── hosting/
│   │   ├── by-jina.md
│   │   ├── cas-on-colab.ipynb
│   │   ├── colab.md
│   │   └── on-jcloud.md
│   ├── html_extra/
│   │   └── robots.txt
│   ├── index.md
│   ├── makedoc.sh
│   ├── playground/
│   │   ├── embedding.md
│   │   ├── reasoning.md
│   │   └── searching.md
│   ├── requirements.txt
│   └── user-guides/
│       ├── benchmark.rst
│       ├── client.md
│       ├── faq.md
│       ├── finetuner.md
│       ├── retriever.md
│       └── server.md
├── scripts/
│   ├── MANIFEST.in
│   ├── benchmark.py
│   ├── black.sh
│   ├── docstrings_lint.sh
│   ├── get-all-test-paths.sh
│   ├── get-last-release-note.py
│   ├── get-requirements.py
│   ├── onnx_helper.py
│   ├── release.sh
│   └── setup.py
├── server/
│   ├── MANIFEST.in
│   ├── clip_server/
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   ├── executors/
│   │   │   ├── __init__.py
│   │   │   ├── clip_onnx.py
│   │   │   ├── clip_tensorrt.py
│   │   │   ├── clip_torch.py
│   │   │   └── helper.py
│   │   ├── helper.py
│   │   ├── model/
│   │   │   ├── __init__.py
│   │   │   ├── clip.py
│   │   │   ├── clip_model.py
│   │   │   ├── clip_onnx.py
│   │   │   ├── clip_trt.py
│   │   │   ├── cnclip_model.py
│   │   │   ├── flash_attention.py
│   │   │   ├── mclip_model.py
│   │   │   ├── model.py
│   │   │   ├── openclip_model.py
│   │   │   ├── pretrained_models.py
│   │   │   ├── simple_tokenizer.py
│   │   │   ├── tokenization.py
│   │   │   └── trt_utils.py
│   │   ├── onnx-flow.yml
│   │   ├── tensorrt-flow.yml
│   │   └── torch-flow.yml
│   └── setup.py
└── tests/
    ├── __init__.py
    ├── conftest.py
    ├── test_asyncio.py
    ├── test_client.py
    ├── test_helper.py
    ├── test_model.py
    ├── test_ranker.py
    ├── test_search.py
    ├── test_server.py
    ├── test_simple.py
    ├── test_tensorrt.py
    └── test_tokenization.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .dockerignore
================================================
.git
.github
scripts
docs

================================================
FILE: .github/README-exec/onnx.readme.md
================================================
# CLIPOnnxEncoder

**CLIPOnnxEncoder** is the executor implemented in [CLIP-as-service](https://github.com/jina-ai/clip-as-service). 
The various `CLIP` models implemented in the [OpenAI](https://github.com/openai/CLIP) and [OpenCLIP](https://github.com/mlfoundations/open_clip) are supported with ONNX runtime (🚀 **3x** speed up). 
The introduction of the CLIP model [can be found here](https://openai.com/blog/clip/).

- 🔀 **Automatic**: Auto-detect image and text documents depending on their content.
- ⚡ **Efficiency**: Faster CLIP model inference on CPU and GPU via ONNX runtime. 
- 📈 **Observability**: Monitoring the serving via Prometheus and Grafana (see [Usage Guide](https://docs.jina.ai/how-to/monitoring/#deploying-locally)).


## Model support

 `ViT-B-32::openai` is used as the default model. To use specific pretrained models provided by `open_clip`, please use `::` to separate model name and pretrained weight name, e.g. `ViT-B-32::laion2b_e16`. Please also note that **different models give different sizes of output dimensions**.

| Model                                 | ONNX | Output dimension | 
|---------------------------------------|------|------------------|
| RN50                                  | ✅    | 1024             | 
| RN101                                 | ✅    | 512              | 
| RN50x4                                | ✅    | 640              |
| RN50x16                               | ✅    | 768              |
| RN50x64                               | ✅    | 1024             |
| ViT-B-32                              | ✅    | 512              |
| ViT-B-16                              | ✅    | 512              |
| ViT-B-16-plus-240                     | ✅    | 640              |
| ViT-L-14                              | ✅    | 768              |
| ViT-L-14-336                          | ✅    | 768              |
| ViT-H-14                              | ✅    | 1024             |
| ViT-g-14                              | ✅    | 1024             |
| M-CLIP/XLM_Roberta-Large-Vit-B-32     | ✅    | 512              |
| M-CLIP/XLM-Roberta-Large-Vit-L-14     | ✅    | 768              |
| M-CLIP/XLM-Roberta-Large-Vit-B-16Plus | ✅    | 640              |
| M-CLIP/LABSE-Vit-L-14                 | ✅    | 768              |

✅ = First class support 

Full list of open_clip models and weights can be found [here](https://github.com/mlfoundations/open_clip#pretrained-model-interface).

```{note}
For model definition with `-quickgelu` postfix, please use non `-quickgelu` model name.
```

## Usage

### Use in Jina Flow 

- **via Docker image (recommended)**

```python
from jina import Flow
from docarray import Document
import numpy as np

f = Flow().add(
    uses='jinahub+docker://CLIPOnnxEncoder',
)
```

- **via source code**

```python
from jina import Flow
from docarray import Document
import numpy as np

f = Flow().add(
    uses='jinahub://CLIPOnnxEncoder',
)
```

You can set the following parameters via `with`:

| Parameter | Description                                                                                                                   |
|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| `name`    | Model weights, default is `ViT-B/32`. Support all OpenAI released pretrained models.                                          |
| `num_worker_preprocess` | The number of CPU workers for image & text prerpocessing, default 4.                                                          | 
| `minibatch_size` | The size of a minibatch for CPU preprocessing and GPU encoding, default 16. Reduce the size of it if you encounter OOM on GPU. |
| `device`  | `cuda` or `cpu`. Default is `None` means auto-detect.                                                                         |

### Encoding

Encoding here means getting the fixed-length vector representation of a sentence or image.

```python
from jina import Flow
from docarray import Document, DocumentArray

da = DocumentArray(
    [
        Document(text='she smiled, with pain'),
        Document(uri='apple.png'),
        Document(uri='apple.png').load_uri_to_image_tensor(),
        Document(blob=open('apple.png', 'rb').read()),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
        Document(
            uri='data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7'
        ),
    ]
)

f = Flow().add(
    uses='jinahub+docker://CLIPOnnxEncoder',
)
with f:
    f.post(on='/', inputs=da)
    da.summary()
```

From the output, you will see all the text and image docs have `embedding` attached.

```text
╭──────────────────────────── Documents Summary ─────────────────────────────╮
│                                                                            │
│   Length                        6                                          │
│   Homogenous Documents          False                                      │
│   4 Documents have attributes   ('id', 'mime_type', 'uri', 'embedding')    │
│   1 Document has attributes     ('id', 'mime_type', 'text', 'embedding')   │
│   1 Document has attributes     ('id', 'embedding')                        │
│                                                                            │
╰────────────────────────────────────────────────────────────────────────────╯
╭────────────────────── Attributes Summary ───────────────────────╮
│                                                                 │
│   Attribute   Data type      #Unique values   Has empty value   │
│  ─────────────────────────────────────────────────────────────  │
│   embedding   ('ndarray',)   6                False             │
│   id          ('str',)       6                False             │
│   mime_type   ('str',)       5                False             │
│   text        ('str',)       2                False             │
│   uri         ('str',)       4                False             │
│                                                                 │
╰─────────────────────────────────────────────────────────────────╯
```

👉 Access the embedding playground in **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/embedding), type sentence or image URL and see **live embedding**!

### Ranking

One can also rank cross-modal matches via `/rank` endpoint. 
First construct a *cross-modal* Document where the root contains an image and `.matches` contain sentences to rerank. 

```python
from docarray import Document

d = Document(
    uri='rerank.png',
    matches=[
        Document(text=f'a photo of a {p}')
        for p in (
            'control room',
            'lecture room',
            'conference room',
            'podium indoor',
            'television studio',
        )
    ],
)
```

Then send the request via `/rank` endpoint:

```python
f = Flow().add(
    uses='jinahub+docker://CLIPOnnxEncoder',
)
with f:
    r = f.post(on='/rank', inputs=[d])
    print(r['@m', ['text', 'scores__clip_score__value']])
```

Finally, in the return you can observe the matches are re-ranked according to `.scores['clip_score']`:

```bash
[['a photo of a television studio', 'a photo of a conference room', 'a photo of a lecture room', 'a photo of a control room', 'a photo of a podium indoor'], 
[0.9920725226402283, 0.006038925610482693, 0.0009973491542041302, 0.00078492151806131, 0.00010626466246321797]]
```

One can also construct `text-to-image` rerank as below:

```python
from docarray import Document

d = Document(
    text='a photo of conference room',
    matches=[
        Document(uri='https://picsum.photos/300'),
        Document(uri='https://picsum.photos/id/331/50'),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    ],
)
```

👉 Access the ranking playground in **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/reasoning/). Just input the reasoning texts as prompts, the server will rank the prompts and return sorted prompts with scores.

================================================
FILE: .github/README-exec/torch.readme.md
================================================
# CLIPTorchEncoder

**CLIPTorchEncoder** is the executor implemented in [CLIP-as-service](https://github.com/jina-ai/clip-as-service). 
The various `CLIP` models implemented in the [OpenAI](https://github.com/openai/CLIP), [OpenCLIP](https://github.com/mlfoundations/open_clip), and [MultilingualCLIP](https://github.com/FreddeFrallan/Multilingual-CLIP) are supported with PyTorch runtime.
The introduction of the CLIP model [can be found here](https://openai.com/blog/clip/).

- 🔀 **Automatic**: Auto-detect image and text documents depending on their content.
- ⚡ **Efficiency**: Faster CLIP model inference on CPU and GPU via leveraging the best practices. 
- 📈 **Observability**: Monitoring the serving via Prometheus and Grafana (see [Usage Guide](https://docs.jina.ai/how-to/monitoring/#deploying-locally)).

With advances of ONNX runtime, you can use `CLIPOnnxEncoder` (see [link](https://cloud.jina.ai/executor/2a7auwg2)) instead to achieve **3x** model inference speed up.   

## Model support

`ViT-B-32::openai` is used as the default model. To use specific pretrained models provided by `open_clip`, please use `::` to separate model name and pretrained weight name, e.g. `ViT-B-32::laion2b_e16`. Please also note that **different models give different sizes of output dimensions**.

| Model                                 | PyTorch | Output dimension | 
|---------------------------------------|---------|------------------|
| RN50                                  | ✅       | 1024             | 
| RN101                                 | ✅       | 512              | 
| RN50x4                                | ✅       | 640              |
| RN50x16                               | ✅       | 768              |
| RN50x64                               | ✅       | 1024             |
| ViT-B-32                              | ✅       | 512              |
| ViT-B-16                              | ✅       | 512              |
| ViT-B-16-plus-240                     | ✅       | 640              |
| ViT-L-14                              | ✅       | 768              |
| ViT-L-14-336                          | ✅       | 768              |
| ViT-H-14                              | ✅       | 1024             |
| ViT-g-14                              | ✅       | 1024             |
| M-CLIP/XLM_Roberta-Large-Vit-B-32     | ✅       | 512              |
| M-CLIP/XLM-Roberta-Large-Vit-L-14     | ✅       | 768              |
| M-CLIP/XLM-Roberta-Large-Vit-B-16Plus | ✅       | 640              |
| M-CLIP/LABSE-Vit-L-14                 | ✅       | 768              |

✅ = First class support


Full list of open_clip models and weights can be found [here](https://github.com/mlfoundations/open_clip#pretrained-model-interface).

```{note}
For model definition with `-quickgelu` postfix, please use non `-quickgelu` model name.
```


## Usage

### Use in Jina Flow 

- **via Docker image (recommended)**

```python
from jina import Flow
from docarray import Document
import numpy as np

f = Flow().add(
    uses='jinahub+docker://CLIPTorchEncoder',
)
```

- **via source code**

```python
from jina import Flow
from docarray import Document
import numpy as np

f = Flow().add(
    uses='jinahub://CLIPTorchEncoder',
)
```

You can set the following parameters via `with`:

| Parameter               | Description                                                                                                                    |
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------|
| `name`                  | Model weights, default is `ViT-B/32`. Support all OpenAI released pretrained models.                                           |
| `num_worker_preprocess` | The number of CPU workers for image & text prerpocessing, default 4.                                                           | 
| `minibatch_size`        | The size of a minibatch for CPU preprocessing and GPU encoding, default 32. Reduce the size of it if you encounter OOM on GPU. |
| `device`                | `cuda` or `cpu`. Default is `None` means auto-detect.                                                                          |
| `jit`                   | If to enable Torchscript JIT, default is `False`.                                                                              |

### Encoding

Encoding here means getting the fixed-length vector representation of a sentence or image.

```python
from jina import Flow
from docarray import Document, DocumentArray

da = DocumentArray(
    [
        Document(text='she smiled, with pain'),
        Document(uri='apple.png'),
        Document(uri='apple.png').load_uri_to_image_tensor(),
        Document(blob=open('apple.png', 'rb').read()),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
        Document(
            uri='data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7'
        ),
    ]
)

f = Flow().add(
    uses='jinahub+docker://CLIPTorchEncoder',
)
with f:
    f.post(on='/', inputs=da)
    da.summary()
```

From the output, you will see all the text and image docs have `embedding` attached.

```text
╭──────────────────────────── Documents Summary ─────────────────────────────╮
│                                                                            │
│   Length                        6                                          │
│   Homogenous Documents          False                                      │
│   4 Documents have attributes   ('id', 'mime_type', 'uri', 'embedding')    │
│   1 Document has attributes     ('id', 'mime_type', 'text', 'embedding')   │
│   1 Document has attributes     ('id', 'embedding')                        │
│                                                                            │
╰────────────────────────────────────────────────────────────────────────────╯
╭────────────────────── Attributes Summary ───────────────────────╮
│                                                                 │
│   Attribute   Data type      #Unique values   Has empty value   │
│  ─────────────────────────────────────────────────────────────  │
│   embedding   ('ndarray',)   6                False             │
│   id          ('str',)       6                False             │
│   mime_type   ('str',)       5                False             │
│   text        ('str',)       2                False             │
│   uri         ('str',)       4                False             │
│                                                                 │
╰─────────────────────────────────────────────────────────────────╯
```

👉 Access the embedding playground in **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/embedding), type sentence or image URL and see **live embedding**!

### Ranking

One can also rank cross-modal matches via `/rank` endpoint. 
First construct a *cross-modal* Document where the root contains an image and `.matches` contain sentences to rerank. 

```python
from docarray import Document

d = Document(
    uri='rerank.png',
    matches=[
        Document(text=f'a photo of a {p}')
        for p in (
            'control room',
            'lecture room',
            'conference room',
            'podium indoor',
            'television studio',
        )
    ],
)
```

Then send the request via `/rank` endpoint:

```python
f = Flow().add(
    uses='jinahub+docker://CLIPTorchEncoder',
)
with f:
    r = f.post(on='/rank', inputs=[d])
    print(r['@m', ['text', 'scores__clip_score__value']])
```

Finally, you can observe the matches are re-ranked based on `.scores['clip_score']`:

```bash
[['a photo of a television studio', 'a photo of a conference room', 'a photo of a lecture room', 'a photo of a control room', 'a photo of a podium indoor'], 
[0.9920725226402283, 0.006038925610482693, 0.0009973491542041302, 0.00078492151806131, 0.00010626466246321797]]
```

One can also construct `text-to-image` rerank as below:

```python
from docarray import Document

d = Document(
    text='a photo of conference room',
    matches=[
        Document(uri='https://picsum.photos/300'),
        Document(uri='https://picsum.photos/id/331/50'),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    ],
)
```

👉 Access the ranking playground in **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/reasoning/). Just input the reasoning texts as prompts, the server will rank the prompts and return sorted prompts with scores.

================================================
FILE: .github/codecov.yml
================================================
codecov:
  # https://docs.codecov.io/docs/comparing-commits
  allow_coverage_offsets: true
coverage:
  status:
    project:
      default:
        informational: true
        target: auto  # auto compares coverage to the previous base commit
  comment:
    layout: "reach, diff, flags, files"
    behavior: default
    require_changes: false  # if true: only post the comment if coverage changes
    branches:               # branch names that can post comment
      - "main"


================================================
FILE: .github/labeler.yml
================================================
# Add 'label1' to any changes within 'example' folder or any subfolders
area/docs:
  - docs/**/*
  - ./*.md

area/testing:
  - tests/**/*

area/setup:
  - setup.py
  - requirements.txt
  - MANIFEST.in

area/housekeeping:
  - .github/**/*
  - ./.gitignore
  - ./*.yaml
  - ./*.yml

area/cicd:
  - .github/workflows/**/*

area/docker:
  - Dockerfiles/**/*
  - ./.dockerignore

area/script:
  - script/**/*

component/client:
  - client/**/*

component/server:
  - server/**/*


================================================
FILE: .github/release-template.ejs
================================================
<% var groupCommits = [
{
    name: 'breaking',
    show: true,
    list: []
}, {
    name: 'feat',
    show: true,
    list: []
}, {
    name: 'perf',
    show: true,
    list: []
}, {
    name: 'fix',
    show: true,
    list: []
}, {
    name: 'refactor',
    show: true,
    list: []
}, {
    name: 'docs',
    show: true,
    list: []
},  {
    name: 'test',
    show: true,
    list: []
}, {
    name: 'other',
    show: true,
    list: []
}
]

var all_titles = {};
var all_commiters = {};
var commitHref = "https://github.com/jina-ai/clip-as-service/commit/"

commits.forEach(function (commit) {

    var result = (commit.title).match(/^(\w*)(\((.*)\))?\: (.*)$/);

    var type = result && result[1];
    var scope = result && result[3];
    var title = result && result[4];
    var committer = commit.authorName

    if (!(committer in all_commiters)) {
        all_commiters[committer] = 1
    }

    if (!(title in all_titles)) {
        all_titles[title] = 1
        if( title != null && (title.indexOf('💥')>-1 || title.indexOf(':boom:')>-1) ){
            groupCommits.find(item => item.name === 'breaking').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else if(type == 'fix' || type == 'fixed'){
            groupCommits.find(item => item.name === 'fix').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else if(type == 'perf' || type == 'performance'){
            groupCommits.find(item => item.name === 'perf').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else if(type == 'feat' || type == 'feature'){
            groupCommits.find(item => item.name === 'feat').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else if(type == 'refactor'){
            groupCommits.find(item => item.name === 'refactor').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else if(type == 'docs' || type == 'doc'){
            groupCommits.find(item => item.name === 'docs').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else if(type == 'test' || type == 'tests' || type == 'ci'){
            groupCommits.find(item => item.name === 'test').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        } else {
            groupCommits.find(item => item.name === 'other').list.push({
                type: type,
                scope: scope,
                title: title,
                commit: commit
            })
        }
    }


});


var listCommits = function(list, key){

list.forEach(function (ct) {

    var type = ct.type;
    var scope = ct.scope;
    var title = '';
    var commit = ct.commit;

    if(type){
        if(key != 'other'){
            title = (scope? '__'+scope+'__: ':'') + ct.title;
        }else{
            title = '__' + type + (scope? '('+scope+')':'') + '__ : ' + ct.title;
        }
    }else{
        title = commit.title;
    }
%> - <% if(typeof commitHref === 'undefined' || commitHref === '') { %>[```<%=commit.sha1.slice(0,8)%>```]<% } else { %>[[```<%=commit.sha1.slice(0,8)%>```](<%=commitHref%><%=commit.sha1%>)]<%}%> __-__ <%=title%> (*<%= commit.authorName %>*)
<% })} %>

🙇 We'd like to thank all contributors for this new release! In particular,
<%  Object.keys(all_commiters).forEach(function (key) { %> <%= key %>, <% }) %> 🙇

<%
        for(var i of groupCommits){
    if(i.list.length == 0) continue;

if (i.name === 'breaking' && i.show) { %>
### 💥 Breaking changes

<%	} else if (i.name === 'fix' && i.show) { %>
### 🐞 Bug fixes

<%	} else if( i.name === 'feat' && i.show) { %>
### 🆕 New Features

<%	} else if(i.name === 'perf' && i.show) { %>
### ⚡ Performance Improvements

<%	} else if(i.name === 'refactor' && i.show) { %>
### 🧼 Code Refactoring

<%	} else if(i.name === 'docs' && i.show) { %>
### 📗 Documentation

<%	} else if(i.name === 'test' && i.show) { %>
### 🏁 Unit Test and CICD

<%	} else if (i.name === 'other' && i.show) { %>
### 🍹 Other Improvements

<%	}
    i.show && listCommits(i.list, i);
} %>

================================================
FILE: .github/workflows/cd.yml
================================================
name: CD

on:
  push:
    branches:
      - main


jobs:
  prep-testbed:
    if: |
      !startsWith(github.event.head_commit.message, 'chore') &&
      !startsWith(github.event.head_commit.message, 'build: hotfix') &&
      !endsWith(github.event.head_commit.message, 'reformatted by jina-dev-bot')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - id: set-matrix
        run: |
          sudo apt-get install jq
          echo "::set-output name=matrix::$(bash scripts/get-all-test-paths.sh)"
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}

  core-test:
    needs: prep-testbed
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: [3.7]
        test-path: ${{fromJson(needs.prep-testbed.outputs.matrix)}}
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Prepare enviroment
        run: |
          python -m pip install --upgrade pip
          python -m pip install wheel
          pip install --no-cache-dir "client/[test]"
          pip install --no-cache-dir "server/[onnx]"
          pip install --no-cache-dir "server/[transformers]"
          pip install --no-cache-dir "server/[search]"
          pip install --no-cache-dir "server/[cn_clip]"
      - name: Test
        id: test
        run: |
          pytest --suppress-no-test-exit-code --cov=clip_client --cov=clip_server --cov-report=xml \
            -v -s -m "not gpu" ${{ matrix.test-path }}
          echo "::set-output name=codecov_flag::cas"
        timeout-minutes: 30
      - name: Check codecov file
        id: check_files
        uses: andstor/file-existence-action@v1
        with:
          files: "coverage.xml"
      - name: Upload coverage from test to Codecov
        uses: codecov/codecov-action@v2
        if: steps.check_files.outputs.files_exists == 'true' && ${{ matrix.python-version }} == '3.7'
        with:
          file: coverage.xml
          flags: ${{ steps.test.outputs.codecov_flag }}
          fail_ci_if_error: false
          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos

  gpu-test:
    needs: prep-testbed
    runs-on: [self-hosted, x64, gpu, linux]
    strategy:
      fail-fast: false
      matrix:
        python-version: [ 3.7 ]
    steps:
      - uses: actions/checkout@v2
        with:
          # For coverage builds fetch the whole history
          fetch-depth: 0
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Prepare enviroment
        run: |
          python -m pip install --upgrade pip
          python -m pip install wheel pytest pytest-cov nvidia-pyindex
          pip install -e "client/[test]"
          pip install -e "server/[tensorrt]"
      - name: Test
        id: test
        run: |
          pytest --suppress-no-test-exit-code --cov=clip_client --cov=clip_server --cov-report=xml \
            -v -s -m "gpu" ./tests/test_tensorrt.py
          echo "::set-output name=codecov_flag::cas"
        timeout-minutes: 30
        env:
          # fix re-initialized torch runtime error on cuda device
          JINA_MP_START_METHOD: spawn
      - name: Check codecov file
        id: check_files
        uses: andstor/file-existence-action@v1
        with:
          files: "coverage.xml"
      - name: Upload coverage from test to Codecov
        uses: codecov/codecov-action@v3
        if: steps.check_files.outputs.files_exists == 'true' && ${{ matrix.python-version }} == '3.7'
        with:
          file: coverage.xml
          name: gpu-related-codecov
          flags: ${{ steps.test.outputs.codecov_flag }}
          fail_ci_if_error: false
          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos

  prerelease:
    needs: [core-test, gpu-test]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 100
      - name: Pre-release (.devN)
        run: |
          git fetch --depth=1 origin +refs/tags/*:refs/tags/*
          pip install twine wheel
          ./scripts/release.sh
        env:
          TWINE_USERNAME: ${{ secrets.TWINE_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.TWINE_PASSWORD }}
      - name: Pre-release docker (.devN)
        uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual Docker Build
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "CD"}'
          token: ${{ secrets.JINA_DEV_BOT }}
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}
      - uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual CAS Docker Build
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "CD"}'
          token: ${{ secrets.JINA_DEV_BOT }}
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}
      - name: Pre-release hub (.devN)
        uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual Hub Push
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "CD"}'
          token: ${{ secrets.JINA_DEV_BOT }}
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}


================================================
FILE: .github/workflows/ci.yml
================================================
name: CI

on:
  pull_request:


jobs:
  commit-lint:
    runs-on: ubuntu-latest
    steps:
      - name: find the prev warning if exist
        uses: peter-evans/find-comment@v1
        id: fc
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: "github-actions[bot]"
          body-includes: "bad commit message"
      - name: Delete comment if exist
        if: ${{ steps.fc.outputs.comment-id != 0 }}
        uses: actions/github-script@v3
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            github.issues.deleteComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              comment_id: ${{ steps.fc.outputs.comment-id }},
            })
      - uses: actions/checkout@v2.5.0
        with:
          fetch-depth: 0
      - run: 'echo "module.exports = {extends: [''@commitlint/config-conventional'']}" > commitlint.config.js'
      - uses: wagoid/commitlint-github-action@v4
        env:
          GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
      - name: if lint failed
        if: ${{ failure() }}
        uses: peter-evans/create-or-update-comment@v1
        with:
          issue-number: ${{ github.event.pull_request.number }}
          body: |
            Thanks for your contribution :heart:
            :broken_heart: Unfortunately, this PR has one or more **bad commit messages**, it can not be merged. To fix this problem, please refer to:
            - [Commit Message Guideline for the First Time Contributor](https://github.com/jina-ai/jina/issues/553)
            - [Contributing Guideline](https://github.com/jina-ai/jina/blob/master/CONTRIBUTING.md)
            This message will be deleted automatically when the commit messages get fixed.
          reaction-type: "eyes"

  lint-flake-8:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.7
        uses: actions/setup-python@v2
        with:
          python-version: 3.7
      - name: Lint with flake8
        run: |
          pip install flake8
          # stop the build if there are Python syntax errors or undefined names
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics --exclude .git,__pycache__,docs/source/conf.py,old,build,dist,tests/,jina/resources/
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics --exclude .git,__pycache__,docs/source/conf.py,old,build,dist,tests/,jina/resources/

  check-black:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Set up Python 3.7
        uses: actions/setup-python@v2
        with:
          python-version: 3.7
      - id: file_changes
        uses: Ana06/get-changed-files@v1.2
      - name: check black
        run: ./scripts/black.sh
        env:
          CHANGED_FILES: ${{ steps.file_changes.outputs.added_modified }}

  prep-testbed:
    runs-on: ubuntu-latest
    needs: [lint-flake-8, check-black]
    steps:
      - uses: actions/checkout@v2
      - id: set-matrix
        run: |
          sudo apt-get install jq
          echo "::set-output name=matrix::$(bash scripts/get-all-test-paths.sh)"
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}

  core-test:
    needs: prep-testbed
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: [3.7]
        test-path: ${{fromJson(needs.prep-testbed.outputs.matrix)}}
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Prepare enviroment
        run: |
          python -m pip install --upgrade pip
          python -m pip install wheel pytest pytest-cov
          pip install --no-cache-dir "client/[test]"
          pip install --no-cache-dir "server/[onnx]"
          pip install --no-cache-dir "server/[transformers]"
          pip install --no-cache-dir "server/[search]"
          pip install --no-cache-dir "server/[cn_clip]"
      - name: Test
        id: test
        run: |
          pytest --suppress-no-test-exit-code --cov=clip_client --cov=clip_server --cov-report=xml \
            -v -s ${{ matrix.test-path }}
          echo "::set-output name=codecov_flag::cas"
        timeout-minutes: 30
      - name: Check codecov file
        id: check_files
        uses: andstor/file-existence-action@v1
        with:
          files: "coverage.xml"
      - name: Upload coverage from test to Codecov
        uses: codecov/codecov-action@v3
        if: steps.check_files.outputs.files_exists == 'true' && ${{ matrix.python-version }} == '3.7'
        with:
          file: coverage.xml
          name: ${{ matrix.test-path }}-codecov
          flags: ${{ steps.test.outputs.codecov_flag }}
          fail_ci_if_error: false
          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos

  trt-gpu-test:
    needs: prep-testbed
    runs-on: [self-hosted, x64, gpu, linux]
    strategy:
      fail-fast: false
      matrix:
        python-version: [ 3.7 ]
    steps:
      - uses: actions/checkout@v2
        with:
          # For coverage builds fetch the whole history
          fetch-depth: 0
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Prepare enviroment
        run: |
          python -m pip install pip==23.0.1
          python -m pip install wheel pytest pytest-cov nvidia-pyindex
          pip install -e "client/[test]"
          pip install -e "server/[tensorrt]"
          pip install -e "server/[onnx]"
          pip install -e "server/[transformers]"
          {
            pip install -e "server/[flash-attn]"
          } || {
            echo "flash attention was not installed."
          }
          pip install --no-cache-dir "server/[cn_clip]"
      - name: Test
        id: test
        run: |
          pytest --suppress-no-test-exit-code --cov=clip_client --cov=clip_server --cov-report=xml \
            -v -s -m "gpu" ./tests/test_tensorrt.py
          pytest --suppress-no-test-exit-code --cov=clip_client --cov=clip_server --cov-report=xml \
            -v -s -m "gpu" ./tests/test_simple.py
          echo "::set-output name=codecov_flag::cas"
        timeout-minutes: 30
        env:
          # fix re-initialized torch runtime error on cuda device
          JINA_MP_START_METHOD: spawn
      - name: Check codecov file
        id: check_files
        uses: andstor/file-existence-action@v1
        with:
          files: "coverage.xml"
      - name: Upload coverage from test to Codecov
        uses: codecov/codecov-action@v3
        if: steps.check_files.outputs.files_exists == 'true' && ${{ matrix.python-version }} == '3.7'
        with:
          file: coverage.xml
          name: gpu-related-codecov
          flags: ${{ steps.test.outputs.codecov_flag }}
          fail_ci_if_error: false
          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos

  gpu-model-test:
    needs: prep-testbed
    runs-on: [ self-hosted, x64, gpu, linux ]
    strategy:
      fail-fast: false
      matrix:
        python-version: [ 3.7 ]
    steps:
      - uses: actions/checkout@v2
        with:
          # For coverage builds fetch the whole history
          fetch-depth: 0
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Prepare enviroment
        run: |
          python -m pip install pip==23.0.1
          python -m pip install wheel pytest pytest-cov nvidia-pyindex
          pip install -e "client/[test]"
          pip install -e "server/[onnx]"
          pip install -e "server/[transformers]"
          {
            pip install -e "server/[flash-attn]"
          } || {
            echo "flash attention was not installed."
          }
          pip install --no-cache-dir "server/[cn_clip]"
      - name: Test
        id: test
        run: |
          pytest --suppress-no-test-exit-code --cov=clip_client --cov=clip_server --cov-report=xml \
            -v -s -m "gpu" ./tests/test_model.py
          echo "::set-output name=codecov_flag::cas"
        timeout-minutes: 30
        env:
          # fix re-initialized torch runtime error on cuda device
          JINA_MP_START_METHOD: spawn
      - name: Check codecov file
        id: check_files
        uses: andstor/file-existence-action@v1
        with:
          files: "coverage.xml"
      - name: Upload coverage from test to Codecov
        uses: codecov/codecov-action@v3
        if: steps.check_files.outputs.files_exists == 'true' && ${{ matrix.python-version }} == '3.7'
        with:
          file: coverage.xml
          name: gpu-related-codecov
          flags: ${{ steps.test.outputs.codecov_flag }}
          fail_ci_if_error: false
          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos

  # just for blocking the merge until all parallel core-test are successful
  success-all-test:
    needs: [commit-lint, core-test, trt-gpu-test, gpu-model-test]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: technote-space/workflow-conclusion-action@v2
      - name: Check Failure
        if: env.WORKFLOW_CONCLUSION == 'failure'
        run: exit 1
      - name: Success
        if: ${{ success() }}
        run: echo "All Done"


================================================
FILE: .github/workflows/force-docker-build-cas.yml
================================================
name: Manual CAS Docker Build

on:
  workflow_dispatch:
    inputs:
      release_token:
        description: 'Your release token'
        required: true
      triggered_by:
        description: 'CD | TAG | MANUAL'
        required: false
        default: MANUAL

jobs:
  token-check:
    runs-on: ubuntu-latest
    steps:
      - run: echo "success!"
        if: "${{ github.event.inputs.release_token }} == ${{ env.release_token }}"
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  regular-release:
    needs: token-check
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        pip_tag: [ "", "onnx", "tensorrt"]  # default: "" = core
    steps:
      - uses: actions/checkout@v2
      - name: Set envs and versions
        run: |
          VCS_REF=${{ github.ref }}
          echo "VCS_REF=$VCS_REF" >> $GITHUB_ENV
          echo "Will build $VCS_REF"
          echo "BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ')" >> $GITHUB_ENV

          if [[ "${{ matrix.pip_tag }}" == "perf" ]]; then
            echo "JINA_PIP_INSTALL_PERF=1" >> $GITHUB_ENV
          fi

          if [[ "${{ matrix.pip_tag }}" == "" ]]; then
            echo "JINA_PIP_INSTALL_CORE=1" >> $GITHUB_ENV
          fi

          JINA_VERSION=$(sed -n '/^__version__/p' ./server/clip_server/__init__.py | cut -d \' -f2)
          V_JINA_VERSION=v${JINA_VERSION}
          JINA_MINOR_VERSION=${JINA_VERSION%.*}
          JINA_MAJOR_VERSION=${JINA_MINOR_VERSION%.*}

          PY_TAG=${{matrix.py_version}}
          if [ -n "${PY_TAG}" ]; then
            PY_TAG=-py${PY_TAG//./}
          fi

          PIP_TAG=${{ matrix.pip_tag }}
          if [ -n "${PIP_TAG}" ]; then
              PIP_TAG=-${PIP_TAG}
          fi

          if [[ "${{ github.event.inputs.triggered_by }}" == "CD" ]]; then

            if [[ "${{ matrix.py_version }}" == "$DEFAULT_PY_VERSION" ]]; then
              echo "TAG_ALIAS=\
                              jinaai/clip-server:master${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:master${PIP_TAG}" \
                              >> $GITHUB_ENV
            else
              # on every CD
              echo "TAG_ALIAS=\
                              jinaai/clip-server:master${PY_TAG}${PIP_TAG}" \
                              >> $GITHUB_ENV
            fi

          elif [[ "${{ github.event.inputs.triggered_by }}" == "TAG" ]]; then
            # on every tag release

            if [[ "${{ matrix.py_version }}" == "$DEFAULT_PY_VERSION" ]]; then
              echo "TAG_ALIAS=\
                              jinaai/clip-server:latest${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_VERSION}${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_MINOR_VERSION}${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_MAJOR_VERSION}${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:latest${PIP_TAG}, \
                              jinaai/clip-server:${JINA_VERSION}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_MINOR_VERSION}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_MAJOR_VERSION}${PIP_TAG} \
                              " >> $GITHUB_ENV
            else
              echo "TAG_ALIAS=\
                              jinaai/clip-server:latest${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_VERSION}${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_MINOR_VERSION}${PY_TAG}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_MAJOR_VERSION}${PY_TAG}${PIP_TAG} \
                              " >> $GITHUB_ENV
            fi
          elif [[ "${{ github.event.inputs.triggered_by }}" == "MANUAL" ]]; then
            # on every manual release
            if [[ "${{ matrix.py_version }}" == "$DEFAULT_PY_VERSION" ]]; then
              echo "TAG_ALIAS=\
                              jinaai/clip-server:${JINA_VERSION}${PIP_TAG}, \
                              jinaai/clip-server:${JINA_VERSION}${PY_TAG}${PIP_TAG} \
                              " >> $GITHUB_ENV
            else
              echo "TAG_ALIAS=\
                              jinaai/clip-server:${JINA_VERSION}${PY_TAG}${PIP_TAG} \
                              " >> $GITHUB_ENV
            fi
          else
            echo "Bad triggered_by: ${{ github.event.inputs.triggered_by }}!"
            exit 1
          fi

          echo "JINA_VERSION=${JINA_VERSION}" >> $GITHUB_ENV

      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@v1
        with:
          install: true
      - name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKERHUB_DEVBOT_USER }}
          password: ${{ secrets.DOCKERHUB_DEVBOT_TOKEN }}
      - run: |
          # https://github.com/docker/buildx/issues/464#issuecomment-741507760
          # https://github.com/kubernetes-sigs/azuredisk-csi-driver/pull/808/files
          docker run --privileged --rm tonistiigi/binfmt --uninstall qemu-aarch64
          docker run --rm --privileged tonistiigi/binfmt --install all
      - name: Build and push
        uses: docker/build-push-action@v2
        with:
          context: .
          file: Dockerfiles/server.Dockerfile
          platforms: linux/amd64
          push: true
          tags: ${{env.TAG_ALIAS}}
          build-args: |
            BUILD_DATE=${{env.BUILD_DATE}}
            JINA_VERSION=${{env.JINA_VERSION}}
            VCS_REF=${{env.VCS_REF}}
            PIP_INSTALL_CORE=${{env.JINA_PIP_INSTALL_CORE}}
            PIP_INSTALL_PERF=${{env.JINA_PIP_INSTALL_PERF}}
            PIP_TAG=${{matrix.pip_tag}}


================================================
FILE: .github/workflows/force-docker-build.yml
================================================
name: Manual Docker Build

on:
  workflow_dispatch:
    inputs:
      release_token:
        description: 'Your release token'
        required: true
      triggered_by:
        description: 'CD | TAG | MANUAL'
        required: false
        default: MANUAL


jobs:
  token-check:
    runs-on: ubuntu-latest
    steps:
      - run: echo "success!"
        if: "${{ github.event.inputs.release_token }} == ${{ env.release_token }}"
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  docker-release:
    needs: token-check
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        pip_tag: ["", "onnx", "tensorrt"]  # default: "" = torch
        engine_tag: ["", "cuda"]  # default: "" = cpu
    steps:
      - uses: actions/checkout@v2
      - name: Set envs and versions
        run: |
          VCS_REF=${{ github.ref }}
          echo "VCS_REF=$VCS_REF" >> $GITHUB_ENV
          echo "Will build $VCS_REF"
          echo "BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ')" >> $GITHUB_ENV
          echo "BUILD_TARGET=clip_executor" >> $GITHUB_ENV

          CAS_VERSION=$(sed -n '/^__version__/p' ./server/clip_server/__init__.py | cut -d \' -f2)
          V_CAS_VERSION=v${CAS_VERSION}
          CAS_MINOR_VERSION=${CAS_VERSION%.*}
          CAS_MAJOR_VERSION=${CAS_MINOR_VERSION%.*}
          
          ENGINE_TAG=${{matrix.engine_tag}}
          if [ -n "${ENGINE_TAG}" ]; then
            ENGINE_TAG=-${ENGINE_TAG//./}
          fi

          PIP_TAG=${{ matrix.pip_tag }}
          BACKEND_TAG=torch
          if [ -n "${PIP_TAG}" ]; then
              BACKEND_TAG=${PIP_TAG}
              PIP_TAG=-${PIP_TAG}
          fi

          if [[ "${{ github.event.inputs.triggered_by }}" == "CD" ]]; then
            # on every CD release
            echo "TAG_ALIAS=\
                            jinaai/clip_executor:master${PIP_TAG}${ENGINE_TAG}" \
                            >> $GITHUB_ENV

          elif [[ "${{ github.event.inputs.triggered_by }}" == "TAG" ]]; then
            # on every tag release
            echo "TAG_ALIAS=\
                            jinaai/clip_executor:latest${PIP_TAG}${ENGINE_TAG}, \
                            jinaai/clip_executor:${CAS_VERSION}${PIP_TAG}${ENGINE_TAG}, \
                            jinaai/clip_executor:${CAS_MINOR_VERSION}${PIP_TAG}${ENGINE_TAG} \
                            " >> $GITHUB_ENV
            
          elif [[ "${{ github.event.inputs.triggered_by }}" == "MANUAL" ]]; then
            # on every manual release
            echo "TAG_ALIAS=\
                            jinaai/clip_executor:${CAS_VERSION}${PIP_TAG}${ENGINE_TAG} \
                            " >> $GITHUB_ENV
          else
            echo "Bad triggered_by: ${{ github.event.inputs.triggered_by }}!"
            exit 1
          fi

          echo "CAS_VERSION=${CAS_VERSION}" >> $GITHUB_ENV
          echo "BACKEND_TAG=${BACKEND_TAG}" >> $GITHUB_ENV

      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@v2
        with:
          install: true
      - name: Login to DockerHub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_DEVBOT_USER }}
          password: ${{ secrets.DOCKERHUB_DEVBOT_TOKEN }}
      - run: |
          # https://github.com/docker/buildx/issues/464#issuecomment-741507760
          # https://github.com/kubernetes-sigs/azuredisk-csi-driver/pull/808/files
          docker run --privileged --rm tonistiigi/binfmt --uninstall qemu-aarch64
          docker run --rm --privileged tonistiigi/binfmt --install all
      - name: CPU Build and push
        id: base_docker_build
        if: ${{ matrix.engine_tag == '' && matrix.pip_tag != 'tensorrt' }}
        uses: docker/build-push-action@v2
        with:
          context: server
          file: Dockerfiles/base.Dockerfile
          platforms: linux/amd64
          cache-from: type=registry,ref=jinaai/clip_executor:latest
          cache-to: type=inline
          push: true
          tags: ${{env.TAG_ALIAS}}
          build-args: |
            BUILD_DATE=${{env.BUILD_DATE}}
            CAS_VERSION=${{env.CAS_VERSION}}
            VCS_REF=${{env.VCS_REF}}
            BACKEND_TAG=${{env.BACKEND_TAG}}
      - name: CUDA Build and push
        id: cuda_docker_build
        if: ${{ matrix.engine_tag == 'cuda' }}
        uses: docker/build-push-action@v2
        with:
          context: server
          file: Dockerfiles/cuda.Dockerfile
          platforms: linux/amd64
          cache-from: type=registry,ref=jinaai/clip_executor:latest-cuda
          cache-to: type=inline
          push: true
          tags: ${{env.TAG_ALIAS}}
          build-args: |
            BUILD_DATE=${{env.BUILD_DATE}}
            CAS_VERSION=${{env.CAS_VERSION}}
            VCS_REF=${{env.VCS_REF}}
            BACKEND_TAG=${{env.BACKEND_TAG}}


================================================
FILE: .github/workflows/force-docs-build.yml
================================================
name: Manual Docs Build

on:
  workflow_dispatch:
    inputs:
      release_token:
        description: 'Your release token'
        required: true
      triggered_by:
        description: 'CD | TAG | MANUAL'
        required: false
        default: MANUAL

jobs:
  token-check:
    runs-on: ubuntu-latest
    steps:
      - run: echo "success!"
        if: "${{ github.event.inputs.release_token }} == ${{ env.release_token }}"
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  release-docs:
    needs: token-check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v2
        with:
          python-version: 3.7
      - name: Build doc and push to gh-pages
        run: |
          git config --local user.email "dev-bot@jina.ai"
          git config --local user.name "Jina Dev Bot"
          pip install --no-cache-dir client/
          pip install --no-cache-dir server/
          mkdir gen-html
          cd docs
          pip install -r requirements.txt
          pip install --pre -U furo
          bash makedoc.sh
          cd ./_build/dirhtml/
          cp -r ./ ../../../gen-html
          cd -  # back to ./docs
          cd ..
          git checkout -f gh-pages
          git rm -rf ./docs
          mkdir -p docs
          cd gen-html
          cp -r ./ ../docs
          cd ../docs
          ls -la
          touch .nojekyll
          cp 404/index.html 404.html
          sed -i 's/href="\.\./href="/' 404.html # fix asset urls that needs to be updated in 404.html
          echo clip-as-service.jina.ai > CNAME
          cd ..
          git add docs
          git status
          git commit -m "chore(docs): update docs due to ${{github.event_name}} on ${{github.repository}}"
          git push --force origin gh-pages

================================================
FILE: .github/workflows/force-hub-push.yml
================================================
name: Manual Hub Push

on:
  workflow_dispatch:
    inputs:
      release_token:
        description: 'Your release token'
        required: true
      triggered_by:
        description: 'CD | TAG | MANUAL'
        required: false
        default: MANUAL

#on:
#  pull_request:

jobs:
  token-check:
    runs-on: ubuntu-latest
    steps:
      - run: echo "success!"
        if: "${{ github.event.inputs.release_token }} == ${{ env.release_token }}"
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  hub-release:
    needs: token-check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set envs and versions
        run: |
          VCS_REF=${{ github.ref }}
          echo "VCS_REF=$VCS_REF" >> $GITHUB_ENV
          echo "Will push $VCS_REF"

          CAS_VERSION=$(sed -n '/^__version__/p' ./server/clip_server/__init__.py | cut -d \' -f2)
          V_CAS_VERSION=v${CAS_VERSION}
          CAS_MINOR_VERSION=${CAS_VERSION%.*}
          CAS_MAJOR_VERSION=${CAS_MINOR_VERSION%.*}

          if [[ "${{ github.event.inputs.triggered_by }}" == "CD" ]]; then
            # on every CD release
            echo "TAG_ALIAS=\
                            -t latest \
                            " >> $GITHUB_ENV
            echo "GPU_TAG_ALIAS=\
                            -t latest-gpu \
                            " >> $GITHUB_ENV

          elif [[ "${{ github.event.inputs.triggered_by }}" == "TAG" ]]; then
            # on every tag release
            echo "TAG_ALIAS=\
                            -t latest \
                            -t ${CAS_VERSION} \
                            -t ${CAS_MINOR_VERSION} \
                            " >> $GITHUB_ENV
            echo "GPU_TAG_ALIAS=\
                            -t latest-gpu \
                            -t ${CAS_VERSION}-gpu \
                            -t ${CAS_MINOR_VERSION}-gpu \
                            " >> $GITHUB_ENV

          elif [[ "${{ github.event.inputs.triggered_by }}" == "MANUAL" ]]; then
            # on every manual release
            echo "TAG_ALIAS=\
                            -t ${CAS_VERSION} \
                            " >> $GITHUB_ENV
            echo "GPU_TAG_ALIAS=\
                            -t ${CAS_VERSION}-gpu \
                            " >> $GITHUB_ENV
          else
            echo "TAG_ALIAS=\
                            -t latest \
                            " >> $GITHUB_ENV
            echo "GPU_TAG_ALIAS=\
                            -t latest-gpu \
                            " >> $GITHUB_ENV
          fi

          echo "CAS_VERSION=${CAS_VERSION}" >> $GITHUB_ENV

      - name: Prepare enviroment
        run: |
          python -m pip install --upgrade jina yq

      - name: Push Torch Executor
        id: push_torch_executor
        run: |
          # FIX the import issue
          echo -e "\
          __version__ = '$CAS_VERSION'
          from .executors.clip_torch import CLIPEncoder\n\
          " > server/clip_server/__init__.py
                    
          echo -e "\
          jtype: CLIPEncoder\n\
          metas:\n\
            py_modules:\n\
              - clip_server/__init__.py\n\
          " > server/config.yml
          
          echo -e "\
          manifest_version: 1\n\
          name: CLIPTorchEncoder\n\
          description: Embed images and sentences into fixed-length vectors with CLIP\n\
          url: https://github.com/jina-ai/clip-as-service\n\
          keywords: [clip, clip-model, clip-as-service, pytorch]\n\
          " > server/manifest.yml
          
          python scripts/get-requirements.py "" server/requirements.txt
          
          cp .github/README-exec/torch.readme.md server/README.md
          
          exec_name=`yq -r .name server/manifest.yml`
          echo executor name is $exec_name

          cp Dockerfiles/base.Dockerfile server/Dockerfile
          JINA_AUTH_TOKEN=${{secrets.JINAHUB_TOKEN}} jina hub push --force $exec_name --secret ${{secrets.TORCH_EXEC_SECRET}} server ${{env.TAG_ALIAS}}

          cp Dockerfiles/cuda.Dockerfile server/Dockerfile
          JINA_AUTH_TOKEN=${{secrets.JINAHUB_TOKEN}} jina hub push --force $exec_name --secret ${{secrets.TORCH_EXEC_SECRET}} server ${{env.GPU_TAG_ALIAS}}

      - name: Push Onnx Executor
        id: push_onnx_executor
        run: |
          # FIX the import issue
          echo -e "\
          __version__ = '$CAS_VERSION'
          from .executors.clip_onnx import CLIPEncoder\n\
          " > server/clip_server/__init__.py
          
          echo -e "\
          jtype: CLIPEncoder\n\
          metas:\n\
            py_modules:\n\
              - clip_server/__init__.py\n\
          " > server/config.yml
          
          echo -e "\
          manifest_version: 1\n\
          name: CLIPOnnxEncoder\n\
          description: Embed images and sentences into fixed-length vectors with CLIP\n\
          url: https://github.com/jina-ai/clip-as-service\n\
          keywords: [clip, clip-model, clip-as-service, onnx, onnx-runtime]\n\
          " > server/manifest.yml
          
          python scripts/get-requirements.py onnx server/requirements.txt
          
          cp .github/README-exec/onnx.readme.md server/README.md
          
          exec_name=`yq -r .name server/manifest.yml`
          echo executor name is $exec_name
          
          cp Dockerfiles/base.Dockerfile server/Dockerfile
          sed -i 's/ARG BACKEND_TAG=torch/ARG BACKEND_TAG=onnx/g' server/Dockerfile          
          JINA_AUTH_TOKEN=${{secrets.JINAHUB_TOKEN}} jina hub push --force $exec_name --secret ${{secrets.ONNX_EXEC_SECRET}} server ${{env.TAG_ALIAS}}
          
          cp Dockerfiles/cuda.Dockerfile server/Dockerfile
          sed -i 's/ARG BACKEND_TAG=torch/ARG BACKEND_TAG=onnx/g' server/Dockerfile
          JINA_AUTH_TOKEN=${{secrets.JINAHUB_TOKEN}} jina hub push --force $exec_name --secret ${{secrets.ONNX_EXEC_SECRET}} server ${{env.GPU_TAG_ALIAS}}

      - name: Push TensorRT Executor
        id: push_tensorrt_executor
        run: |
          # FIX the import issue
          echo -e "\
          __version__ = '$CAS_VERSION'
          from .executors.clip_tensorrt import CLIPEncoder\n\
          " > server/clip_server/__init__.py
          
          echo -e "\
          jtype: CLIPEncoder\n\
          metas:\n\
            py_modules:\n\
              - clip_server/__init__.py\n\
          " > server/config.yml
          
          echo -e "\
          manifest_version: 1\n\
          name: CLIPTensorRTEncoder\n\
          description: Embed images and sentences into fixed-length vectors with CLIP\n\
          url: https://github.com/jina-ai/clip-as-service\n\
          keywords: [clip, clip-model, clip-as-service, onnx, tensorrt]\n\
          " > server/manifest.yml
          
          python scripts/get-requirements.py tensorrt server/requirements.txt
          
          cp Dockerfiles/tensorrt.Dockerfile server/Dockerfile
          
          exec_name=`yq -r .name server/manifest.yml`
          echo executor name is $exec_name
          
          # FIXME: disable uploading at debugging
          # JINA_AUTH_TOKEN=${{secrets.JINAHUB_TOKEN}} jina hub push --force $exec_name --secret ${{secrets.TENSORRT_EXEC_SECRET}} server ${{env.TAG_ALIAS}}


================================================
FILE: .github/workflows/force-release.yml
================================================
name: Manual Release

on:
  workflow_dispatch:
    inputs:
      release_token:
        description: 'Your release token'
        required: true
      release_reason:
        description: 'Short reason for this manual release'
        required: true

jobs:
  token-check:
    runs-on: ubuntu-latest
    steps:
      - run: echo "success!"
        if: "${{ github.event.inputs.release_token }} == ${{ env.release_token }}"
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  regular-release:
    needs: token-check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          token: ${{ secrets.JINA_DEV_BOT }}
          fetch-depth: 100  # means max contribute history is limited to 100 lines
#          submodules: true
      - uses: actions/setup-python@v2
        with:
          python-version: 3.7
      - run: |
          git fetch --depth=1 origin +refs/tags/*:refs/tags/*
          npm install git-release-notes
          pip install twine wheel
          ./scripts/release.sh final "${{ github.event.inputs.release_reason }}" "${{github.actor}}"
        env:
          TWINE_USERNAME: ${{ secrets.TWINE_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.TWINE_PASSWORD }}
      - if: failure()
        run: echo "nothing to release"
      - name: bumping master version
        uses: ad-m/github-push-action@v0.6.0
        with:
          github_token: ${{ secrets.JINA_DEV_BOT }}
          tags: true
          branch: main


================================================
FILE: .github/workflows/label-pr.yml
================================================
name: PR

on:
  pull_request:


jobs:
  assign-label-to-pr:
    runs-on: ubuntu-latest
    if: ${{ !github.event.pull_request.head.repo.fork }}
    steps:
      - uses: codelytv/pr-size-labeler@v1
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          xs_max_size: '10'
          s_max_size: '100'
          m_max_size: '500'
          l_max_size: '1000'
          fail_if_xl: 'false'
      - uses: actions/labeler@v3
        with:
          repo-token: "${{ secrets.GITHUB_TOKEN }}"
      - id: docs_updated
        if: contains( github.event.pull_request.labels.*.name, 'area/docs')
        run: echo '::set-output name=docs::true'
    outputs:
      docs: ${{ steps.docs_updated.outputs.docs }}

  deploy-to-netlify:
    runs-on: ubuntu-latest
    needs: [assign-label-to-pr]
    if: ${{ needs.assign-label-to-pr.outputs.docs == 'true' }}
    steps:
      - run: |
          echo "BRANCH_NAME=${{ github.head_ref }}" >> $GITHUB_ENV
      - uses: actions/checkout@v2
        with:
          repository: jina-ai/clip-as-service
          ref: ${{ env.BRANCH_NAME }}
      - uses: actions/setup-python@v2
        with:
          python-version: 3.7
      - uses: actions/setup-node@v2
        with:
          node-version: '14'
      - name: Build and Deploy
        run: |
          npm i -g netlify-cli
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          git fetch origin
          export NUM_RELEASES=2 # only 2 last tags to save build time
          bash makedoc.sh development
          netlify deploy --dir=_build/dirhtml --alias="ft-${{ env.BRANCH_NAME }}" --message="Deploying docs to ${{ env.BRANCH_NAME }} branch"
        env:
          NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN1 }}
          NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}
        working-directory: docs
      - name: Find the prev comment if exists
        uses: peter-evans/find-comment@v1
        id: fc
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: 'github-actions[bot]'
          body-includes: 'Docs are deployed'
      - name: Delete comment if exists
        if: ${{ steps.fc.outputs.comment-id != 0 && !github.event.pull_request.head.repo.fork }}
        uses: actions/github-script@v3
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            github.issues.deleteComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              comment_id: ${{ steps.fc.outputs.comment-id }},
            })
      - name: Add or update comment
        uses: peter-evans/create-or-update-comment@v1
        with:
          issue-number: ${{ github.event.pull_request.number }}
          body: |
            :memo: Docs are deployed on https://ft-${{ env.BRANCH_NAME }}--jina-docs.netlify.app :tada: 


================================================
FILE: .github/workflows/tag.yml
================================================
name: Release CD

on:
  push:
    tags:
      - "v*"  # push to version tags trigger the build

jobs:
  update-doc:
    runs-on: ubuntu-latest
    steps:
      - uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual Docs Build
          token: ${{ secrets.JINA_DEV_BOT }}
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "TAG"}'
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  update-docker:
    needs: update-doc
    runs-on: ubuntu-latest
    steps:
      - name: CAS Docker Build
        uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual CAS Docker Build
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "TAG"}'
          token: ${{ secrets.JINA_DEV_BOT }}
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}
      - name: Helm Executor Build
        uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual Docker Build
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "TAG"}'
          token: ${{ secrets.JINA_DEV_BOT }}
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}
      - name: Hub Executor Build
        uses: benc-uk/workflow-dispatch@v1
        with:
          workflow: Manual Hub Push
          inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "TAG"}'
          token: ${{ secrets.JINA_DEV_BOT }}
        env:
          release_token: ${{ secrets.CAS_RELEASE_TOKEN }}

  create-release:
    needs: update-doc
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
        with:
          ref: 'main'
      - uses: actions/setup-python@v2
        with:
          python-version: 3.7
      - run: |
          python scripts/get-last-release-note.py
      - name: Create Release
        id: create_release
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # This token is provided by Actions, you do not need to create your own token
        with:
          tag_name: ${{ github.ref }}
          release_name: 💫 Patch ${{ github.ref }}
          body_path: 'tmp.md'
          draft: false
          prerelease: false


================================================
FILE: .gitignore
================================================
# Initially taken from Github's Python gitignore file

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/
docs/api/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
docs/.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
.idea/
toy*.py
.DS_Store
post/
toy*.ipynb
data/
*.c
.nes_cache
toy*.yml
*.tmp

shell/jina-wizard.sh
/junit/
/tests/junit/
/docs/chapters/proto/docs.md
/tests/.pytest-kind

# IntelliJ IDEA
*.iml
.idea

# VSCode
.vscode

# test with config in resources
tests/integration/crud/simple/simple_indexer/

# latency tracking
latency
MyIndexer/
MyMemMap/
original/
output/

# kubernetes testing
.pytest-kind
.kube

================================================
FILE: .pre-commit-config.yaml
================================================
repos:
- repo: https://github.com/ambv/black
  rev: 22.3.0
  hooks:
  - id: black
    types: [python]
    exclude: ^(docs/|server/clip_server/resources/)
    args:
      - -S
- repo: https://github.com/asottile/blacken-docs
  rev: v1.12.1
  hooks:
  -   id: blacken-docs
      args:
        - -S

================================================
FILE: CHANGELOG.md
================================================


<a name=release-note-0-0-3></a>
## Release Note (`0.0.3`)

> Release time: 2022-03-23 21:42:16


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Dmitry Kan,  varshaneya,  Ilya Usvyatsky,  Nicko van Someren,  George Gkotsis,  Jhang,  Changrui Zhang,  DomHudson,  Filip Bednárik,  🙇


### 🍹 Other Improvements

 - [[```378d82b5```](https://github.com/jina-ai/clip-as-service/commit/378d82b5d20a627a7a32239ebd9b47cbd12a5f7a)] __-__ fix setup and release script (*Han Xiao*)
 - [[```372de00f```](https://github.com/jina-ai/clip-as-service/commit/372de00f286565750cab51d6e33ca4d9471a2934)] __-__ fix workflow yaml config (*Han Xiao*)
 - [[```11822f60```](https://github.com/jina-ai/clip-as-service/commit/11822f6050096a4ed1ca7bb9ee3c082deec56fb4)] __-__ fix image (*Han Xiao*)
 - [[```78a6a8b9```](https://github.com/jina-ai/clip-as-service/commit/78a6a8b9de9cab353be63311facce670212de08d)] __-__ first commit (*Han Xiao*)
 - [[```f5e42383```](https://github.com/jina-ai/clip-as-service/commit/f5e4238397ab76a1539bb6f22d5735f563ff187e)] __-__ update readme (*Han Xiao*)
 - [[```c4790fbe```](https://github.com/jina-ai/clip-as-service/commit/c4790fbef902e8d3509d1229788cb81abcdc33d6)] __-__ modified the port 8001-&gt;8081 to match Vue.js demo (*Dmitry Kan*)
 - [[```749c8e45```](https://github.com/jina-ai/clip-as-service/commit/749c8e45d301967dcaa8744ef846190ac86d2932)] __-__ update readme header (*Han Xiao*)

<a name=release-note-0-0-4></a>
## Release Note (`0.0.4`)

> Release time: 2022-03-23 21:45:59


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🍹 Other Improvements

 - [[```f8936108```](https://github.com/jina-ai/clip-as-service/commit/f89361085bbe2715fd0d1c2d769389e0a46dc860)] __-__ fix setup and release script (*Han Xiao*)
 - [[```d2e4cfbf```](https://github.com/jina-ai/clip-as-service/commit/d2e4cfbf1977db1cf0fee433600b48b0c3626312)] __-__ __version__: the next version will be 0.0.4 (*Jina Dev Bot*)

<a name=release-note-0-0-5></a>
## Release Note (`0.0.5`)

> Release time: 2022-03-23 22:09:18


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🍹 Other Improvements

 - [[```7ed4643c```](https://github.com/jina-ai/clip-as-service/commit/7ed4643cc471a6433ee6cc8699617ae61978bcdb)] __-__ fix doc setup (*Han Xiao*)
 - [[```fe09c32c```](https://github.com/jina-ai/clip-as-service/commit/fe09c32c8629b20ea9f677b7097556e33091876f)] __-__ __version__: the next version will be 0.0.5 (*Jina Dev Bot*)
 - [[```f8936108```](https://github.com/jina-ai/clip-as-service/commit/f89361085bbe2715fd0d1c2d769389e0a46dc860)] __-__ fix setup and release script (*Han Xiao*)

<a name=release-note-0-0-6></a>
## Release Note (`0.0.6`)

> Release time: 2022-03-23 22:42:28


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🍹 Other Improvements

 - [[```f7044fb2```](https://github.com/jina-ai/clip-as-service/commit/f7044fb2f9c81f5dbfb7fec7c12c6a3b0dd54fa6)] __-__ fix doc setup (*Han Xiao*)
 - [[```c04eb30e```](https://github.com/jina-ai/clip-as-service/commit/c04eb30e8e09c89b1dd7061d0cb044286a1c5fc2)] __-__ __version__: the next version will be 0.0.6 (*Jina Dev Bot*)

<a name=release-note-0-0-7></a>
## Release Note (`0.0.7`)

> Release time: 2022-03-24 07:04:50


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🍹 Other Improvements

 - [[```e8aa643a```](https://github.com/jina-ai/clip-as-service/commit/e8aa643a802fca900f2111407093107f22f08917)] __-__ update docs and license (*Han Xiao*)
 - [[```7245f67a```](https://github.com/jina-ai/clip-as-service/commit/7245f67adf76e600762d8dbdbaee747764c0677c)] __-__ __version__: the next version will be 0.0.7 (*Jina Dev Bot*)
 - [[```f7044fb2```](https://github.com/jina-ai/clip-as-service/commit/f7044fb2f9c81f5dbfb7fec7c12c6a3b0dd54fa6)] __-__ fix doc setup (*Han Xiao*)

<a name=release-note-0-1-0></a>
## Release Note (`0.1.0`)

> Release time: 2022-03-24 08:19:14


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 📗 Documentation

 - [[```fa9d50c2```](https://github.com/jina-ai/clip-as-service/commit/fa9d50c2395ea1e1e60a2a0fc9c0105df472bcdb)] __-__ fix readme (#656) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```1a2a2af9```](https://github.com/jina-ai/clip-as-service/commit/1a2a2af932717cc0da12667453cd00c58e6ee443)] __-__ bump version (*Han Xiao*)
 - [[```44f4e52e```](https://github.com/jina-ai/clip-as-service/commit/44f4e52eba72d903883a827e115fb8dead69111f)] __-__ update docstring (*Han Xiao*)
 - [[```12bf98aa```](https://github.com/jina-ai/clip-as-service/commit/12bf98aa5b098f9017af186642990d9f854c54d7)] __-__ __version__: the next version will be 0.0.8 (*Jina Dev Bot*)
 - [[```e8aa643a```](https://github.com/jina-ai/clip-as-service/commit/e8aa643a802fca900f2111407093107f22f08917)] __-__ update docs and license (*Han Xiao*)

<a name=release-note-0-1-1></a>
## Release Note (`0.1.1`)

> Release time: 2022-03-24 09:03:13


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Wang Bo,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```dd4cb3c3```](https://github.com/jina-ai/clip-as-service/commit/dd4cb3c3e142a8b8f702c4eee57a251aab7a10d5)] __-__ url description and keywords in setup (#657) (*Wang Bo*)

### 🍹 Other Improvements

 - [[```1b679cdc```](https://github.com/jina-ai/clip-as-service/commit/1b679cdc8c657e872833f1344f19ed7d6bb57b0a)] __-__ fix banner (*Han Xiao*)
 - [[```9e0a1058```](https://github.com/jina-ai/clip-as-service/commit/9e0a1058b77f33508a3dbdffabaacf3bb0c53cb4)] __-__ __version__: the next version will be 0.1.1 (*Jina Dev Bot*)
 - [[```1a2a2af9```](https://github.com/jina-ai/clip-as-service/commit/1a2a2af932717cc0da12667453cd00c58e6ee443)] __-__ bump version (*Han Xiao*)

<a name=release-note-0-1-2></a>
## Release Note (`0.1.2`)

> Release time: 2022-03-24 10:57:53


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Alex Cureton-Griffiths,  Wang Bo,  Jina Dev Bot,  🙇


### 🧼 Code Refactoring

 - [[```715e8ba9```](https://github.com/jina-ai/clip-as-service/commit/715e8ba90faf1c58daed608cd02340404484a018)] __-__ remove unused main unify keywords (#658) (*Wang Bo*)

### 📗 Documentation

 - [[```ff16ce1d```](https://github.com/jina-ai/clip-as-service/commit/ff16ce1db88bbac14cb19c05266b1286c224e65d)] __-__ __readme__: polish (#660) (*Alex Cureton-Griffiths*)

### 🍹 Other Improvements

 - [[```0ec1fac2```](https://github.com/jina-ai/clip-as-service/commit/0ec1fac2a4f0c02a0cc704da9b2361e1d22108fa)] __-__ fix setup deps (*Han Xiao*)
 - [[```59b154a7```](https://github.com/jina-ai/clip-as-service/commit/59b154a7358cfdf1449fab8a2855599b657ddf24)] __-__ __version__: the next version will be 0.1.2 (*Jina Dev Bot*)
 - [[```1b679cdc```](https://github.com/jina-ai/clip-as-service/commit/1b679cdc8c657e872833f1344f19ed7d6bb57b0a)] __-__ fix banner (*Han Xiao*)

<a name=release-note-0-1-3></a>
## Release Note (`0.1.3`)

> Release time: 2022-03-24 13:03:30


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Wang Bo,  Jina Dev Bot,  🙇


### 🧼 Code Refactoring

 - [[```ae4d0bac```](https://github.com/jina-ai/clip-as-service/commit/ae4d0bacbfadb57e1cb1de8a5f6adca5426e62d3)] __-__ remove inference model pytorch from onnx (#661) (*Wang Bo*)

### 🍹 Other Improvements

 - [[```dece9dd0```](https://github.com/jina-ai/clip-as-service/commit/dece9dd0695bc418fc13fe9ed95ef52ba9f8f19d)] __-__ fix setup file (*Han Xiao*)
 - [[```3d04e695```](https://github.com/jina-ai/clip-as-service/commit/3d04e69511098490b1e56e15640a6378df89ff04)] __-__ __version__: the next version will be 0.1.3 (*Jina Dev Bot*)
 - [[```0ec1fac2```](https://github.com/jina-ai/clip-as-service/commit/0ec1fac2a4f0c02a0cc704da9b2361e1d22108fa)] __-__ fix setup deps (*Han Xiao*)

<a name=release-note-0-1-5></a>
## Release Note (`0.1.5`)

> Release time: 2022-03-24 19:17:54


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🍹 Other Improvements

 - [[```989a706a```](https://github.com/jina-ai/clip-as-service/commit/989a706aa53271d9018ce48d9241e37887572025)] __-__ hide top-level setup (*Han Xiao*)
 - [[```f07e0f57```](https://github.com/jina-ai/clip-as-service/commit/f07e0f57ab795688100d07c09564fe3680031a83)] __-__ __version__: the next version will be 0.1.4 (*Jina Dev Bot*)
 - [[```dece9dd0```](https://github.com/jina-ai/clip-as-service/commit/dece9dd0695bc418fc13fe9ed95ef52ba9f8f19d)] __-__ fix setup file (*Han Xiao*)

<a name=release-note-0-1-6></a>
## Release Note (`0.1.6`)

> Release time: 2022-03-29 07:02:26


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  felix-wang,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```b4624dd4```](https://github.com/jina-ai/clip-as-service/commit/b4624dd408e0ee3908f57b7e5e598e5631de3d95)] __-__ __client__: raise value when embedding is empty (#666) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```16f8c403```](https://github.com/jina-ai/clip-as-service/commit/16f8c403a274961c3acd1ecddbef823dcea488b2)] __-__ fix typo (#664) (*felix-wang*)
 - [[```da1dd85c```](https://github.com/jina-ai/clip-as-service/commit/da1dd85cba6bc72c4ff83fd0c00c2651eef56075)] __-__ hide top-level setup (*Han Xiao*)
 - [[```fe22c3f2```](https://github.com/jina-ai/clip-as-service/commit/fe22c3f21ee180c7f74e1a996572b0eb04a36a6d)] __-__ __version__: the next version will be 0.1.6 (*Jina Dev Bot*)

<a name=release-note-0-1-7></a>
## Release Note (`0.1.7`)

> Release time: 2022-03-30 11:40:37


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Roshan Jossy,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```d56b1463```](https://github.com/jina-ai/clip-as-service/commit/d56b146392c16d7f917b3a6baeba9d6b77121249)] __-__ __client__: more comprehensive progressbar (#667) (*Han Xiao*)

### 🐞 Bug fixes

 - [[```b4624dd4```](https://github.com/jina-ai/clip-as-service/commit/b4624dd408e0ee3908f57b7e5e598e5631de3d95)] __-__ __client__: raise value when embedding is empty (#666) (*Han Xiao*)

### 📗 Documentation

 - [[```cfaba711```](https://github.com/jina-ai/clip-as-service/commit/cfaba7119c334d87a5f7860cc4df65a52f19c750)] __-__ __tracking__: add scarf tracking (#665) (*Roshan Jossy*)

### 🍹 Other Improvements

 - [[```9e276744```](https://github.com/jina-ai/clip-as-service/commit/9e27674447d041c21078082478acc234d5c0e3f7)] __-__ fix readme (*Han Xiao*)
 - [[```1c2de8da```](https://github.com/jina-ai/clip-as-service/commit/1c2de8da0ac3cfa7fa7304f6c55b4979747f22a5)] __-__ __version__: the next version will be 0.1.7 (*Jina Dev Bot*)

<a name=release-note-0-1-8></a>
## Release Note (`0.1.8`)

> Release time: 2022-03-30 14:30:36


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```d56b1463```](https://github.com/jina-ai/clip-as-service/commit/d56b146392c16d7f917b3a6baeba9d6b77121249)] __-__ __client__: more comprehensive progressbar (#667) (*Han Xiao*)

### 🧼 Code Refactoring

 - [[```065d6a91```](https://github.com/jina-ai/clip-as-service/commit/065d6a910e44a31718463a51ba0e3c29ca926d1c)] __-__ __client__: use docarray pbar (#668) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```dd61bdce```](https://github.com/jina-ai/clip-as-service/commit/dd61bdce6f6e908a6a583106d1c51ca5725b1ad4)] __-__ update readme (*Han Xiao*)
 - [[```be5fff81```](https://github.com/jina-ai/clip-as-service/commit/be5fff8129628d069e8d2992705f5f8b4681e040)] __-__ __version__: the next version will be 0.1.8 (*Jina Dev Bot*)

<a name=release-note-0-1-9></a>
## Release Note (`0.1.9`)

> Release time: 2022-03-30 23:20:09


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```88123432```](https://github.com/jina-ai/clip-as-service/commit/8812343211701fd93b75dab9f5b86bd5bc9e6819)] __-__ __server__: use map_batch to overlap cpu gpu (#669) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```41b93773```](https://github.com/jina-ai/clip-as-service/commit/41b937732a4f241e22ef9089071e9c7b611a6674)] __-__ fix readme (*Han Xiao*)
 - [[```da3227d3```](https://github.com/jina-ai/clip-as-service/commit/da3227d3b3df984a2816dbea283c209020b0815a)] __-__ update readme (*Han Xiao*)
 - [[```431d4635```](https://github.com/jina-ai/clip-as-service/commit/431d46353c3382ff954fc05bc8397f3e560a7ac7)] __-__ __version__: the next version will be 0.1.9 (*Jina Dev Bot*)

<a name=release-note-0-1-10></a>
## Release Note (`0.1.10`)

> Release time: 2022-03-31 10:31:26


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```962a1b5c```](https://github.com/jina-ai/clip-as-service/commit/962a1b5ce92249d98e4850e0af003d63b95a3f9d)] __-__ __server__: reuse the preprocessing pool (#670) (*Han Xiao*)
 - [[```88123432```](https://github.com/jina-ai/clip-as-service/commit/8812343211701fd93b75dab9f5b86bd5bc9e6819)] __-__ __server__: use map_batch to overlap cpu gpu (#669) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```f0dfc34a```](https://github.com/jina-ai/clip-as-service/commit/f0dfc34adca3c3d7b9879ac518735dd929074e80)] __-__ __version__: the next version will be 0.1.10 (*Jina Dev Bot*)

<a name=release-note-0-1-11></a>
## Release Note (`0.1.11`)

> Release time: 2022-04-01 15:46:07


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```962a1b5c```](https://github.com/jina-ai/clip-as-service/commit/962a1b5ce92249d98e4850e0af003d63b95a3f9d)] __-__ __server__: reuse the preprocessing pool (#670) (*Han Xiao*)

### 📗 Documentation

 - [[```257f0393```](https://github.com/jina-ai/clip-as-service/commit/257f03931d28a0d9020a3381495a41df1185fd9c)] __-__ add http endpoint explain (#671) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```2e9e212f```](https://github.com/jina-ai/clip-as-service/commit/2e9e212f0d9501c7d55d0fd8dd82c180a067ca89)] __-__ add demo server (*Han Xiao*)
 - [[```99133924```](https://github.com/jina-ai/clip-as-service/commit/991339248bb3d7e9b6a741f2cf456f7f8bade154)] __-__ __version__: the next version will be 0.1.11 (*Jina Dev Bot*)

<a name=release-note-0-1-12></a>
## Release Note (`0.1.12`)

> Release time: 2022-04-07 02:20:52


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  samsja,  Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```ffc4bdc4```](https://github.com/jina-ai/clip-as-service/commit/ffc4bdc4e2414fa0aff67bb5cbaffd4077a7c8f4)] __-__ gitignore (#673) (*samsja*)

### 🐞 Bug fixes

 - [[```aeb64c08```](https://github.com/jina-ai/clip-as-service/commit/aeb64c082d4819ab9330ba65f57f13ba2a868268)] __-__ ignore onnxruntime-gpu on macos (#675) (*felix-wang*)

### 📗 Documentation

 - [[```257f0393```](https://github.com/jina-ai/clip-as-service/commit/257f03931d28a0d9020a3381495a41df1185fd9c)] __-__ add http endpoint explain (#671) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```e2b2ae8b```](https://github.com/jina-ai/clip-as-service/commit/e2b2ae8bb96ce8602f9808c01e9a1712b7cdf7ac)] __-__ update readme (*Han Xiao*)
 - [[```d7aa1615```](https://github.com/jina-ai/clip-as-service/commit/d7aa161503ed72e662f8efe477af702e8759375e)] __-__ __version__: the next version will be 0.1.12 (*Jina Dev Bot*)

<a name=release-note-0-1-13></a>
## Release Note (`0.1.13`)

> Release time: 2022-04-11 08:02:20


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  felix-wang,  🙇


### 🆕 New Features

 - [[```8b800eea```](https://github.com/jina-ai/clip-as-service/commit/8b800eea5d40d02417a4368fcc38ac58dc2d651d)] __-__ __server__: allow client sending tensor document (#678) (*Han Xiao*)

### 🐞 Bug fixes

 - [[```aeb64c08```](https://github.com/jina-ai/clip-as-service/commit/aeb64c082d4819ab9330ba65f57f13ba2a868268)] __-__ ignore onnxruntime-gpu on macos (#675) (*felix-wang*)

### 📗 Documentation

 - [[```b6f9d849```](https://github.com/jina-ai/clip-as-service/commit/b6f9d849e5693d6c40744feebd5d309dbeed1cb3)] __-__ __server__: docs document tensor (#679) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```fa42dc50```](https://github.com/jina-ai/clip-as-service/commit/fa42dc50f6c766c60fe246802ecc8c15e37fbdf4)] __-__ update docs (*Han Xiao*)
 - [[```c91fa4d1```](https://github.com/jina-ai/clip-as-service/commit/c91fa4d16fd01ba8cc571041919201bbb1a76e31)] __-__ __version__: the next version will be 0.1.13 (*Jina Dev Bot*)

<a name=release-note-0-1-14></a>
## Release Note (`0.1.14`)

> Release time: 2022-04-14 02:39:16


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Jina Dev Bot,  Han Xiao,  🙇


### 🐞 Bug fixes

 - [[```8286eeed```](https://github.com/jina-ai/clip-as-service/commit/8286eeed13e65b7414e7bff0ace689daac103101)] __-__ tensor input document (#681) (*felix-wang*)

### 📗 Documentation

 - [[```b6f9d849```](https://github.com/jina-ai/clip-as-service/commit/b6f9d849e5693d6c40744feebd5d309dbeed1cb3)] __-__ __server__: docs document tensor (#679) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```ef6ea254```](https://github.com/jina-ai/clip-as-service/commit/ef6ea254b48d2675364184c3cfcb787819e8433a)] __-__ __version__: the next version will be 0.1.14 (*Jina Dev Bot*)

<a name=release-note-0-1-15></a>
## Release Note (`0.1.15`)

> Release time: 2022-04-18 04:07:21


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Han Xiao,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```10d53eb4```](https://github.com/jina-ai/clip-as-service/commit/10d53eb4c4b84c95412b5f904de705cc01d7ba89)] __-__ scalable benchmark (#680) (*felix-wang*)

### 🐞 Bug fixes

 - [[```8286eeed```](https://github.com/jina-ai/clip-as-service/commit/8286eeed13e65b7414e7bff0ace689daac103101)] __-__ tensor input document (#681) (*felix-wang*)

### 🍹 Other Improvements

 - [[```fb229ae8```](https://github.com/jina-ai/clip-as-service/commit/fb229ae81f708d00f01af2d791de6eb67174e1c7)] __-__ add jcloud logo (*Han Xiao*)
 - [[```a3891eed```](https://github.com/jina-ai/clip-as-service/commit/a3891eedfa88e92f4a730f94b897da79ec28848d)] __-__ fix readme (*Han Xiao*)
 - [[```bfd04706```](https://github.com/jina-ai/clip-as-service/commit/bfd047061d7465877503caa3f11f77065718cbc4)] __-__ __version__: the next version will be 0.1.15 (*Jina Dev Bot*)

<a name=release-note-0-2-0></a>
## Release Note (`0.2.0`)

> Release time: 2022-04-18 04:27:56


🙇 We'd like to thank all contributors for this new release! In particular,
 numb3r3,  Jina Dev Bot,  felix-wang,  🙇


### ⚡ Performance Improvements

 - [[```10d53eb4```](https://github.com/jina-ai/clip-as-service/commit/10d53eb4c4b84c95412b5f904de705cc01d7ba89)] __-__ scalable benchmark (#680) (*felix-wang*)

### 🍹 Other Improvements

 - [[```67226f5c```](https://github.com/jina-ai/clip-as-service/commit/67226f5c8ec7d652f5d47ed0ab21dc8d14fc49c8)] __-__ bump version (*numb3r3*)
 - [[```5dc64878```](https://github.com/jina-ai/clip-as-service/commit/5dc6487819483ee680cf6e4c6a5fd1c0c27e2b54)] __-__ __version__: the next version will be 0.1.16 (*Jina Dev Bot*)

<a name=release-note-0-2-1></a>
## Release Note (`0.2.1`)

> Release time: 2022-04-21 15:32:35


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Han Xiao,  Jina Dev Bot,  numb3r3,  🙇


### 🐞 Bug fixes

 - [[```2558d738```](https://github.com/jina-ai/clip-as-service/commit/2558d7388a6ebdbfa99e489d9d005d9e1c22c33a)] __-__ pass extra_search path (#687) (*felix-wang*)
 - [[```71e9ebc8```](https://github.com/jina-ai/clip-as-service/commit/71e9ebc89a27b9ed64def3b6968b78c4ea1c3256)] __-__ remove process backend (#685) (*felix-wang*)
 - [[```65ad956d```](https://github.com/jina-ai/clip-as-service/commit/65ad956dab7e4f011568ac14357d636312dd4f5e)] __-__ use one iteration step (#683) (*felix-wang*)

### 🍹 Other Improvements

 - [[```e3d4e918```](https://github.com/jina-ai/clip-as-service/commit/e3d4e9181b645dfebd23620e153d86c563d38c9b)] __-__ update readme (*Han Xiao*)
 - [[```ec3a700c```](https://github.com/jina-ai/clip-as-service/commit/ec3a700c632dba97c01529c0049c383d68636aa8)] __-__ update docs (#684) (*felix-wang*)
 - [[```11487cee```](https://github.com/jina-ai/clip-as-service/commit/11487ceec184fbcf93c0c1debedbf22d097d9254)] __-__ __version__: the next version will be 0.2.1 (*Jina Dev Bot*)
 - [[```67226f5c```](https://github.com/jina-ai/clip-as-service/commit/67226f5c8ec7d652f5d47ed0ab21dc8d14fc49c8)] __-__ bump version (*numb3r3*)

<a name=release-note-0-2-2></a>
## Release Note (`0.2.2`)

> Release time: 2022-04-24 07:28:34


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```3bd74641```](https://github.com/jina-ai/clip-as-service/commit/3bd74641e52a48d9b1bac15554c8e42050d47094)] __-__ download with resume (#689) (*felix-wang*)
 - [[```2558d738```](https://github.com/jina-ai/clip-as-service/commit/2558d7388a6ebdbfa99e489d9d005d9e1c22c33a)] __-__ pass extra_search path (#687) (*felix-wang*)

### 🍹 Other Improvements

 - [[```6eadbfba```](https://github.com/jina-ai/clip-as-service/commit/6eadbfbac798c72b0319071b3f9c8480f97155d8)] __-__ __version__: the next version will be 0.2.2 (*Jina Dev Bot*)

<a name=release-note-0-2-3></a>
## Release Note (`0.2.3`)

> Release time: 2022-04-25 11:42:51


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  felix-wang,  🙇


### 🆕 New Features

 - [[```0ebc4c03```](https://github.com/jina-ai/clip-as-service/commit/0ebc4c0363aa182d2992bbc11bbcf676aeaf69df)] __-__ __server__: add rank endpoint (#694) (*Han Xiao*)

### 🐞 Bug fixes

 - [[```3bd74641```](https://github.com/jina-ai/clip-as-service/commit/3bd74641e52a48d9b1bac15554c8e42050d47094)] __-__ download with resume (#689) (*felix-wang*)

### 🍹 Other Improvements

 - [[```22cfffaf```](https://github.com/jina-ai/clip-as-service/commit/22cfffaffbd73d1d15afd87d08e2bb62bb2acbdc)] __-__ __version__: the next version will be 0.2.3 (*Jina Dev Bot*)

<a name=release-note-0-3-0></a>
## Release Note (`0.3.0`)

> Release time: 2022-04-25 15:13:21


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```b7270862```](https://github.com/jina-ai/clip-as-service/commit/b7270862e353b73531cd8bff4735d537c905a312)] __-__ __client__: add rank endpoint (#695) (*Han Xiao*)
 - [[```0ebc4c03```](https://github.com/jina-ai/clip-as-service/commit/0ebc4c0363aa182d2992bbc11bbcf676aeaf69df)] __-__ __server__: add rank endpoint (#694) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```8600286c```](https://github.com/jina-ai/clip-as-service/commit/8600286cf53755c127cf258af918b6bdf3e86691)] __-__ update readme (*Han Xiao*)
 - [[```5e1dd607```](https://github.com/jina-ai/clip-as-service/commit/5e1dd607e47a94265f48cbb2a70406c5057b86fa)] __-__ __version__: the next version will be 0.2.4 (*Jina Dev Bot*)

<a name=release-note-0-3-1></a>
## Release Note (`0.3.1`)

> Release time: 2022-04-26 08:03:08


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```ca5f3021```](https://github.com/jina-ai/clip-as-service/commit/ca5f30211d87fc324e9cfffb3fcc89682a233ba8)] __-__ __helper__: add version check for client and server (#696) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```234650f4```](https://github.com/jina-ai/clip-as-service/commit/234650f48a1e59cd274748f98ae84f6648b811af)] __-__ __version__: the next version will be 0.3.1 (*Jina Dev Bot*)
 - [[```8600286c```](https://github.com/jina-ai/clip-as-service/commit/8600286cf53755c127cf258af918b6bdf3e86691)] __-__ update readme (*Han Xiao*)

<a name=release-note-0-3-2></a>
## Release Note (`0.3.2`)

> Release time: 2022-04-26 09:16:04


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```f5ba35ab```](https://github.com/jina-ai/clip-as-service/commit/f5ba35abf37a5d140f2e8491cb0dcab13c25869f)] __-__ __helper__: add version check for client and server (*Han Xiao*)
 - [[```ca5f3021```](https://github.com/jina-ai/clip-as-service/commit/ca5f30211d87fc324e9cfffb3fcc89682a233ba8)] __-__ __helper__: add version check for client and server (#696) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```27ffd856```](https://github.com/jina-ai/clip-as-service/commit/27ffd85623407033de72ad259a725848b3412822)] __-__ __version__: the next version will be 0.3.2 (*Jina Dev Bot*)

<a name=release-note-0-3-3></a>
## Release Note (`0.3.3`)

> Release time: 2022-04-26 09:38:17


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```076d6537```](https://github.com/jina-ai/clip-as-service/commit/076d65378b3653dc9f1213f398d1d70824e67513)] __-__ __helper__: add version check for client and server (*Han Xiao*)

### 🍹 Other Improvements

 - [[```9bcbb1f9```](https://github.com/jina-ai/clip-as-service/commit/9bcbb1f9147aa3dda857a5a130b531e88f27baf2)] __-__ __version__: the next version will be 0.3.3 (*Jina Dev Bot*)

<a name=release-note-0-3-4></a>
## Release Note (`0.3.4`)

> Release time: 2022-04-30 15:17:02


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```076d6537```](https://github.com/jina-ai/clip-as-service/commit/076d65378b3653dc9f1213f398d1d70824e67513)] __-__ __helper__: add version check for client and server (*Han Xiao*)

### 🐞 Bug fixes

 - [[```8ac2e9bb```](https://github.com/jina-ai/clip-as-service/commit/8ac2e9bb68b96d1421f7e2ae6b01cec95aad3183)] __-__ __torch__: fix oom in rerank endpoint (#699) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```dd508167```](https://github.com/jina-ai/clip-as-service/commit/dd5081672718a12e7aeede0400432e8f6a01a744)] __-__ __version__: the next version will be 0.3.4 (*Jina Dev Bot*)

<a name=release-note-0-3-5></a>
## Release Note (`0.3.5`)

> Release time: 2022-04-30 18:55:10


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```8ac2e9bb```](https://github.com/jina-ai/clip-as-service/commit/8ac2e9bb68b96d1421f7e2ae6b01cec95aad3183)] __-__ __torch__: fix oom in rerank endpoint (#699) (*Han Xiao*)

### 🧼 Code Refactoring

 - [[```050c34e0```](https://github.com/jina-ai/clip-as-service/commit/050c34e0906f9593aee054ac37a0a830478a3a3b)] __-__ use packaging instead of distutil (#700) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```d2c2c872```](https://github.com/jina-ai/clip-as-service/commit/d2c2c8729e0872b0c5e299916ae8cf58be7ec516)] __-__ __version__: the next version will be 0.3.5 (*Jina Dev Bot*)

<a name=release-note-0-4-0></a>
## Release Note (`0.4.0`)

> Release time: 2022-04-30 20:25:29


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```33efcb00```](https://github.com/jina-ai/clip-as-service/commit/33efcb00414f6cbf5f860bafe7cc183773e08241)] __-__ add async rerank (#701) (*Han Xiao*)
 - [[```12d33c49```](https://github.com/jina-ai/clip-as-service/commit/12d33c49d5ac4d55a7351a442335f25218661dc8)] __-__ add async rerank (*Han Xiao*)

### 🧼 Code Refactoring

 - [[```050c34e0```](https://github.com/jina-ai/clip-as-service/commit/050c34e0906f9593aee054ac37a0a830478a3a3b)] __-__ use packaging instead of distutil (#700) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```20e66b95```](https://github.com/jina-ai/clip-as-service/commit/20e66b953af17480e062a8e84719b5a6823ba648)] __-__ __version__: the next version will be 0.3.6 (*Jina Dev Bot*)

<a name=release-note-0-4-1></a>
## Release Note (`0.4.1`)

> Release time: 2022-05-04 17:38:48


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```f66b145b```](https://github.com/jina-ai/clip-as-service/commit/f66b145be9a19b64e6404665fce8ff5c14b9552b)] __-__ add ranker endpoint for all backends (#707) (*felix-wang*)
 - [[```f7b9af40```](https://github.com/jina-ai/clip-as-service/commit/f7b9af40c3bb693ca70faddfcbc49d07d287df62)] __-__ add tensorrt support (#688) (*felix-wang*)
 - [[```33efcb00```](https://github.com/jina-ai/clip-as-service/commit/33efcb00414f6cbf5f860bafe7cc183773e08241)] __-__ add async rerank (#701) (*Han Xiao*)

### 🐞 Bug fixes

 - [[```618dbdb2```](https://github.com/jina-ai/clip-as-service/commit/618dbdb2cfe7a765c3748708bf6b16560175ca51)] __-__ cd workflow (#706) (*felix-wang*)

### 🍹 Other Improvements

 - [[```3f34d46d```](https://github.com/jina-ai/clip-as-service/commit/3f34d46d662998ae39f7bfe50aface9a9582bb0c)] __-__ __docs__: add cas async usage to readme (*Han Xiao*)
 - [[```0f941660```](https://github.com/jina-ai/clip-as-service/commit/0f941660a78879a8eea5846ecb4f7c2dabf0f34c)] __-__ __version__: the next version will be 0.4.1 (*Jina Dev Bot*)

<a name=release-note-0-4-2></a>
## Release Note (`0.4.2`)

> Release time: 2022-05-09 05:32:39


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```f66b145b```](https://github.com/jina-ai/clip-as-service/commit/f66b145be9a19b64e6404665fce8ff5c14b9552b)] __-__ add ranker endpoint for all backends (#707) (*felix-wang*)

### 🐞 Bug fixes

 - [[```835eb13f```](https://github.com/jina-ai/clip-as-service/commit/835eb13fcb84d126b6a35e18b2fe0ef9d8b835b7)] __-__ use cosine as the rank score (#708) (*felix-wang*)

### 🍹 Other Improvements

 - [[```706fa624```](https://github.com/jina-ai/clip-as-service/commit/706fa624cb567857e6bc024f52c93a10ab410651)] __-__ __docs__: update readme (*Han Xiao*)
 - [[```7fd04d2d```](https://github.com/jina-ai/clip-as-service/commit/7fd04d2d65dc033ccdec3b970046692816a84880)] __-__ __docs__: add cas async usage to readme (*Han Xiao*)
 - [[```90bb4c5c```](https://github.com/jina-ai/clip-as-service/commit/90bb4c5c70c5e51a1c29bf5aed2906d284c077f7)] __-__ __version__: the next version will be 0.4.2 (*Jina Dev Bot*)

<a name=release-note-0-4-3></a>
## Release Note (`0.4.3`)

> Release time: 2022-05-09 10:23:15


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Roshan Jossy,  Han Xiao,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```bb520d14```](https://github.com/jina-ai/clip-as-service/commit/bb520d14b6c5172fce9a971b51c4125b60418119)] __-__ keep logit_scale on same device (#710) (*felix-wang*)
 - [[```835eb13f```](https://github.com/jina-ai/clip-as-service/commit/835eb13fcb84d126b6a35e18b2fe0ef9d8b835b7)] __-__ use cosine as the rank score (#708) (*felix-wang*)

### 📗 Documentation

 - [[```da87d13a```](https://github.com/jina-ai/clip-as-service/commit/da87d13a753cdb394fe24a00af7e02f0dcc5fa00)] __-__ __tracking__: update external links&#39; source (#711) (*Roshan Jossy*)

### 🍹 Other Improvements

 - [[```099e2218```](https://github.com/jina-ai/clip-as-service/commit/099e2218a47bacb73afed2a7739b58f4e56c7c68)] __-__ __docs__: update readme (*Han Xiao*)
 - [[```ce5806d3```](https://github.com/jina-ai/clip-as-service/commit/ce5806d33ccad1ef83106900b44da3e85eb0ce64)] __-__ update index html (#709) (*felix-wang*)
 - [[```a1651079```](https://github.com/jina-ai/clip-as-service/commit/a16510799b648beb29bd422cdddc1e5dee5db061)] __-__ __version__: the next version will be 0.4.3 (*Jina Dev Bot*)

<a name=release-note-0-4-4></a>
## Release Note (`0.4.4`)

> Release time: 2022-05-11 12:00:46


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  felix-wang,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```edf0d862```](https://github.com/jina-ai/clip-as-service/commit/edf0d862228fb69ed7aa25e6ac71a322494c3c29)] __-__ add dockerfiles and cd workflow (#712) (*felix-wang*)

### 🐞 Bug fixes

 - [[```bb520d14```](https://github.com/jina-ai/clip-as-service/commit/bb520d14b6c5172fce9a971b51c4125b60418119)] __-__ keep logit_scale on same device (#710) (*felix-wang*)

### 🧼 Code Refactoring

 - [[```59c06986```](https://github.com/jina-ai/clip-as-service/commit/59c06986387b59c0f536af08d0f6abed71cd7a41)] __-__ __server__: remove redundant logics of rank (#715) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```72d69c75```](https://github.com/jina-ai/clip-as-service/commit/72d69c75be20cc8126f9c406cb621d349b87ba3c)] __-__ __docs__: update readme (*Han Xiao*)
 - [[```f898c8ce```](https://github.com/jina-ai/clip-as-service/commit/f898c8ce73ea3884037b900ca45fda4a477efdce)] __-__ __version__: the next version will be 0.4.4 (*Jina Dev Bot*)

<a name=release-note-0-4-5></a>
## Release Note (`0.4.5`)

> Release time: 2022-05-11 12:10:29


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```6ed4c484```](https://github.com/jina-ai/clip-as-service/commit/6ed4c484346e16846e0076d7bf99388c858581be)] __-__ convert distance to score (*Han Xiao*)

### 🧼 Code Refactoring

 - [[```59c06986```](https://github.com/jina-ai/clip-as-service/commit/59c06986387b59c0f536af08d0f6abed71cd7a41)] __-__ __server__: remove redundant logics of rank (#715) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```d565d31f```](https://github.com/jina-ai/clip-as-service/commit/d565d31f80e4a289529477159eba07161c6c6066)] __-__ __version__: the next version will be 0.4.5 (*Jina Dev Bot*)

<a name=release-note-0-4-6></a>
## Release Note (`0.4.6`)

> Release time: 2022-05-11 15:10:52


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```cda93fdd```](https://github.com/jina-ai/clip-as-service/commit/cda93fdd648a64f16bd9f194079e3e9220629af2)] __-__ __server__: use await gather in rank function (*Han Xiao*)

### 🐞 Bug fixes

 - [[```6ed4c484```](https://github.com/jina-ai/clip-as-service/commit/6ed4c484346e16846e0076d7bf99388c858581be)] __-__ convert distance to score (*Han Xiao*)

### 🍹 Other Improvements

 - [[```06fcd07b```](https://github.com/jina-ai/clip-as-service/commit/06fcd07bcf208753de15f051339fd48a0e8186f9)] __-__ __version__: the next version will be 0.4.6 (*Jina Dev Bot*)

<a name=release-note-0-4-7></a>
## Release Note (`0.4.7`)

> Release time: 2022-05-11 16:25:08


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```72f1bc4a```](https://github.com/jina-ai/clip-as-service/commit/72f1bc4af0bc6ad01e645a889fd8ee505f986b42)] __-__ __server__: use await gather in rank function (#716) (*Han Xiao*)
 - [[```cda93fdd```](https://github.com/jina-ai/clip-as-service/commit/cda93fdd648a64f16bd9f194079e3e9220629af2)] __-__ __server__: use await gather in rank function (*Han Xiao*)

### 🍹 Other Improvements

 - [[```66b14fc6```](https://github.com/jina-ai/clip-as-service/commit/66b14fc6f7e8abd3322d0a9e8e0bdfe942d187d3)] __-__ __version__: the next version will be 0.4.7 (*Jina Dev Bot*)

<a name=release-note-0-4-8></a>
## Release Note (`0.4.8`)

> Release time: 2022-05-13 09:24:42


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  numb3r3,  felix-wang,  Jina Dev Bot,  🙇


### ⚡ Performance Improvements

 - [[```72f1bc4a```](https://github.com/jina-ai/clip-as-service/commit/72f1bc4af0bc6ad01e645a889fd8ee505f986b42)] __-__ __server__: use await gather in rank function (#716) (*Han Xiao*)

### 🐞 Bug fixes

 - [[```65991a3f```](https://github.com/jina-ai/clip-as-service/commit/65991a3f9126b19c99f21e44cd3dd4227cbe80c7)] __-__ __client__: fix https args to tls (#722) (*Han Xiao*)
 - [[```1002a913```](https://github.com/jina-ai/clip-as-service/commit/1002a9132120dbf52a0dd4700c740692c959a422)] __-__ docker release cd (#717) (*felix-wang*)
 - [[```71d2c867```](https://github.com/jina-ai/clip-as-service/commit/71d2c867b5d45c8dfe42872b7e5697793be79f4b)] __-__ docker build push (#714) (*felix-wang*)

### 🏁 Unit Test and CICD

 - [[```38043676```](https://github.com/jina-ai/clip-as-service/commit/3804367632cccdc1f7fa1f5fc998f3530e1ee05c)] __-__ fix force release (*numb3r3*)

### 🍹 Other Improvements

 - [[```0da311e4```](https://github.com/jina-ai/clip-as-service/commit/0da311e4113b0afb60cb16779d86c48399c86f24)] __-__ __docs__: change http to https (*Han Xiao*)
 - [[```741ad796```](https://github.com/jina-ai/clip-as-service/commit/741ad796be93df808a634a41105f2860751afc2d)] __-__ __docs__: add playground (*Han Xiao*)
 - [[```a2b6d337```](https://github.com/jina-ai/clip-as-service/commit/a2b6d33738f3cea55e429db21691534c62c08320)] __-__ __version__: the next version will be 0.4.8 (*Jina Dev Bot*)

<a name=release-note-0-4-9></a>
## Release Note (`0.4.9`)

> Release time: 2022-05-23 15:13:23


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  numb3r3,  felix-wang,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```a7311fbf```](https://github.com/jina-ai/clip-as-service/commit/a7311fbf6ae6e988b78924fea239a9c8236dd8ff)] __-__ __server__: recover original contents of the input da (#726) (*Han Xiao*)
 - [[```42ef75b1```](https://github.com/jina-ai/clip-as-service/commit/42ef75b185369c1ffd64f8ea5a7b0fca8c8656b2)] __-__ __server__: remove embeddings to save bandwidth (*Han Xiao*)
 - [[```2d2da147```](https://github.com/jina-ai/clip-as-service/commit/2d2da147c3781233dc3812e2e7dfebbcbeb5f20e)] __-__ docker push cd (*numb3r3*)
 - [[```994635fa```](https://github.com/jina-ai/clip-as-service/commit/994635fabc84293b05adff123263a0d276812202)] __-__ k8s dockerize  (#725) (*felix-wang*)
 - [[```d12c5115```](https://github.com/jina-ai/clip-as-service/commit/d12c5115c946497b4ddf03a759a63c4040bcf8c7)] __-__ docker file (#719) (*felix-wang*)
 - [[```65991a3f```](https://github.com/jina-ai/clip-as-service/commit/65991a3f9126b19c99f21e44cd3dd4227cbe80c7)] __-__ __client__: fix https args to tls (#722) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```b6adcf8b```](https://github.com/jina-ai/clip-as-service/commit/b6adcf8be6a087d446e833478a0ac05a7900c24b)] __-__ __docs__: add multi gpu setting (*Han Xiao*)
 - [[```3d8c552a```](https://github.com/jina-ai/clip-as-service/commit/3d8c552a721a52fb374f73fcc9725f7d06e2383f)] __-__ __version__: the next version will be 0.4.9 (*Jina Dev Bot*)

<a name=release-note-0-4-10></a>
## Release Note (`0.4.10`)

> Release time: 2022-05-24 07:46:48


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```0054b47c```](https://github.com/jina-ai/clip-as-service/commit/0054b47cf12043b0bf493424ec22defa9448a9be)] __-__ __server__: fix content assignment (#727) (*Han Xiao*)
 - [[```a7311fbf```](https://github.com/jina-ai/clip-as-service/commit/a7311fbf6ae6e988b78924fea239a9c8236dd8ff)] __-__ __server__: recover original contents of the input da (#726) (*Han Xiao*)

### 🍹 Other Improvements

 - [[```926621bc```](https://github.com/jina-ai/clip-as-service/commit/926621bc97972694caff79700c5b70031a2677c1)] __-__ __version__: the next version will be 0.4.10 (*Jina Dev Bot*)

<a name=release-note-0-4-11></a>
## Release Note (`0.4.11`)

> Release time: 2022-05-27 07:44:46


🙇 We'd like to thank all contributors for this new release! In particular,
 samsja,  Shubham Goel,  Han Xiao,  Ziniu Yu,  Roshan Jossy,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```60a986a0```](https://github.com/jina-ai/clip-as-service/commit/60a986a07921b2374fdc64dccde7e1ec1e728cdb)] __-__ add monitoring (#674) (*samsja*)

### 🐞 Bug fixes

 - [[```59f48e60```](https://github.com/jina-ai/clip-as-service/commit/59f48e60a286344f66f37aea46b0b1b148ca90f4)] __-__ windows file name conflict (#729) (*Ziniu Yu*)
 - [[```0054b47c```](https://github.com/jina-ai/clip-as-service/commit/0054b47cf12043b0bf493424ec22defa9448a9be)] __-__ __server__: fix content assignment (#727) (*Han Xiao*)

### 📗 Documentation

 - [[```2f3a2077```](https://github.com/jina-ai/clip-as-service/commit/2f3a207734c829e6baa94dfa5715dc8d5c3f12de)] __-__ __tracking__: remove utm source in links (#728) (*Roshan Jossy*)

### 🍹 Other Improvements

 - [[```c7c96251```](https://github.com/jina-ai/clip-as-service/commit/c7c9625163d97ca3a6ad2b845309bad9e34e5d87)] __-__ Corrected replicas indentation in server.md (#731) (*Shubham Goel*)
 - [[```8d112275```](https://github.com/jina-ai/clip-as-service/commit/8d1122754dc53c75bce75e313ee74796bd9614e7)] __-__ fix docs (*Han Xiao*)
 - [[```7323d99e```](https://github.com/jina-ai/clip-as-service/commit/7323d99edd7814ece9ed8f5c0adce949047c987d)] __-__ __version__: the next version will be 0.4.11 (*Jina Dev Bot*)

<a name=release-note-0-4-12></a>
## Release Note (`0.4.12`)

> Release time: 2022-06-01 08:28:41


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Ziniu Yu,  Jina Dev Bot,  samsja,  🙇


### 🆕 New Features

 - [[```60a986a0```](https://github.com/jina-ai/clip-as-service/commit/60a986a07921b2374fdc64dccde7e1ec1e728cdb)] __-__ add monitoring (#674) (*samsja*)

### 🐞 Bug fixes

 - [[```bb8c4ce0```](https://github.com/jina-ai/clip-as-service/commit/bb8c4ce01de76d2be63444a517e90b530422110e)] __-__ better monitoring (#738) (*felix-wang*)
 - [[```751cf9de```](https://github.com/jina-ai/clip-as-service/commit/751cf9de0d8bb727c44ecaf7950a3658b868399d)] __-__ does not require port (#735) (*Ziniu Yu*)

### 📗 Documentation

 - [[```5e06667a```](https://github.com/jina-ai/clip-as-service/commit/5e06667ac9afef335b98b72a58ba0d28985d9b18)] __-__ update monitoring feature (#737) (*felix-wang*)

### 🍹 Other Improvements

 - [[```b523c624```](https://github.com/jina-ai/clip-as-service/commit/b523c62468dd6088095b00b5335160c57a1cb25e)] __-__ __version__: the next version will be 0.4.12 (*Jina Dev Bot*)

<a name=release-note-0-4-13></a>
## Release Note (`0.4.13`)

> Release time: 2022-06-09 04:42:07


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Ziniu Yu,  Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```d675148b```](https://github.com/jina-ai/clip-as-service/commit/d675148b4305338e9d17d449e42ab7c142896c06)] __-__ add clip_hg executor (#740) (*Ziniu Yu*)

### 🧼 Code Refactoring

 - [[```5eb5d7e8```](https://github.com/jina-ai/clip-as-service/commit/5eb5d7e8ed6f924c6560bb850edb08bbd809ff09)] __-__ monitor (#743) (*felix-wang*)

### 📗 Documentation

 - [[```130108c1```](https://github.com/jina-ai/clip-as-service/commit/130108c1aaf993bebc8215527c47d98c7e2169c5)] __-__ add JCloud deployment docs (#739) (*Ziniu Yu*)
 - [[```5e06667a```](https://github.com/jina-ai/clip-as-service/commit/5e06667ac9afef335b98b72a58ba0d28985d9b18)] __-__ update monitoring feature (#737) (*felix-wang*)

### 🍹 Other Improvements

 - [[```4b88e992```](https://github.com/jina-ai/clip-as-service/commit/4b88e99263a29903312f52bae01465b44b7a0cce)] __-__ fix docs (*Han Xiao*)
 - [[```b130d645```](https://github.com/jina-ai/clip-as-service/commit/b130d645409b044df6f1e0bcda78b42e79cb98d9)] __-__ add grafana dashboard (#741) (*felix-wang*)
 - [[```12ede839```](https://github.com/jina-ai/clip-as-service/commit/12ede83996f62af8c549a1d6621ae1dd32b7de7d)] __-__ __version__: the next version will be 0.4.13 (*Jina Dev Bot*)

<a name=release-note-0-4-14></a>
## Release Note (`0.4.14`)

> Release time: 2022-06-09 13:39:46


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```752202f8```](https://github.com/jina-ai/clip-as-service/commit/752202f8b730d0ab8785a703fc719dacfbe2993b)] __-__ monitor documentation (#745) (*felix-wang*)

### 🧼 Code Refactoring

 - [[```5eb5d7e8```](https://github.com/jina-ai/clip-as-service/commit/5eb5d7e8ed6f924c6560bb850edb08bbd809ff09)] __-__ monitor (#743) (*felix-wang*)

### 🍹 Other Improvements

 - [[```06097f20```](https://github.com/jina-ai/clip-as-service/commit/06097f2098190b5a8a40fc82354b642730e617e0)] __-__ __version__: the next version will be 0.4.14 (*Jina Dev Bot*)

<a name=release-note-0-4-15></a>
## Release Note (`0.4.15`)

> Release time: 2022-06-13 13:06:16


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  felix-wang,  Ziniu Yu,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```e022bd46```](https://github.com/jina-ai/clip-as-service/commit/e022bd46c8c1773620f635148cb999e23ff7167e)] __-__ add traversal paths (#750) (*felix-wang*)
 - [[```4fe5a1b1```](https://github.com/jina-ai/clip-as-service/commit/4fe5a1b1dc9672be98638ba57023936b5ed69a6c)] __-__ add traversal paths (#748) (*felix-wang*)

### 🐞 Bug fixes

 - [[```752202f8```](https://github.com/jina-ai/clip-as-service/commit/752202f8b730d0ab8785a703fc719dacfbe2993b)] __-__ monitor documentation (#745) (*felix-wang*)

### 🍹 Other Improvements

 - [[```dab8341e```](https://github.com/jina-ai/clip-as-service/commit/dab8341e9ffb0716eb8c17477534ef91f19d8c5d)] __-__ add cas on colab section (*Han Xiao*)
 - [[```29bd68a4```](https://github.com/jina-ai/clip-as-service/commit/29bd68a4bc1f17c34016d45901858c65c8cf5623)] __-__ add replicas field in all yamls (*Han Xiao*)
 - [[```d5be8c2f```](https://github.com/jina-ai/clip-as-service/commit/d5be8c2f85e47fcafa7587f3e75b76fbc42300e5)] __-__ Revert &#34;feat: add traversal paths (#748)&#34; (#749) (*Han Xiao*)
 - [[```7f2d8fe8```](https://github.com/jina-ai/clip-as-service/commit/7f2d8fe88643ae71e5d8b38547faa32570886e46)] __-__ update links in docs (#747) (*Ziniu Yu*)
 - [[```52a8b0a6```](https://github.com/jina-ai/clip-as-service/commit/52a8b0a6c62204d37556f31fd79fd1ee621b45e3)] __-__ __version__: the next version will be 0.4.15 (*Jina Dev Bot*)

<a name=release-note-0-4-16></a>
## Release Note (`0.4.16`)

> Release time: 2022-06-14 08:52:07


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Ziniu Yu,  Han Xiao,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```eca1e700```](https://github.com/jina-ai/clip-as-service/commit/eca1e700493d59f475714aa5f49ccd33247cb983)] __-__ add integerate test for client (#753) (*felix-wang*)
 - [[```b5c339fe```](https://github.com/jina-ai/clip-as-service/commit/b5c339feda8ed89538b92d59624a781a8725f304)] __-__ fix client concurrent issue (#752) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```e5ab22f5```](https://github.com/jina-ai/clip-as-service/commit/e5ab22f58ea8888aef0ead6902d2412301e9e5fc)] __-__ update slack (*Han Xiao*)
 - [[```5503becb```](https://github.com/jina-ai/clip-as-service/commit/5503becb5308ca34062bc611a4a431815de5383c)] __-__ fix docs (*Han Xiao*)
 - [[```909cdb11```](https://github.com/jina-ai/clip-as-service/commit/909cdb110ff27f811e5ed718bb99291f454af03a)] __-__ add cas on colab section (*Han Xiao*)
 - [[```3d3ef936```](https://github.com/jina-ai/clip-as-service/commit/3d3ef9363c9a322660fe84e94a2e610a24be0f0e)] __-__ __version__: the next version will be 0.4.16 (*Jina Dev Bot*)

<a name=release-note-0-4-17></a>
## Release Note (`0.4.17`)

> Release time: 2022-06-20 10:56:12


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Ziniu Yu,  numb3r3,  felix-wang,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```03541dd7```](https://github.com/jina-ai/clip-as-service/commit/03541dd765849ec453b501c83dbf4071b317bce1)] __-__ add cas server dockerfile (#757) (*Han Xiao*)
 - [[```4d069a84```](https://github.com/jina-ai/clip-as-service/commit/4d069a84ac0414059acce322f00815bf0cd12536)] __-__ upload torch executor (#723) (*Ziniu Yu*)

### 🐞 Bug fixes

 - [[```eca1e700```](https://github.com/jina-ai/clip-as-service/commit/eca1e700493d59f475714aa5f49ccd33247cb983)] __-__ add integerate test for client (#753) (*felix-wang*)

### 📗 Documentation

 - [[```7c2faae2```](https://github.com/jina-ai/clip-as-service/commit/7c2faae270e276bfc36f4c51e4abe101194f1799)] __-__ update jcloud docs (#754) (*Ziniu Yu*)
 - [[```9d872f2e```](https://github.com/jina-ai/clip-as-service/commit/9d872f2e20e53e988a6d13dff191a42fa6e7e0d2)] __-__ add disk usage / memory usage benchmark table (#751) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```9e469bf7```](https://github.com/jina-ai/clip-as-service/commit/9e469bf70f4cd314353bde9c1ca8dfbda45fa532)] __-__ fix readme (*Han Xiao*)
 - [[```4c4e74b2```](https://github.com/jina-ai/clip-as-service/commit/4c4e74b2d5ebcede408d49b6455e1a13293edf95)] __-__ upload executor in cd workflow (*numb3r3*)
 - [[```96923f12```](https://github.com/jina-ai/clip-as-service/commit/96923f12a33e2c30dc55dc993648d9758f96a132)] __-__ fix docker cd (#755) (*felix-wang*)
 - [[```1869e61f```](https://github.com/jina-ai/clip-as-service/commit/1869e61f3e0c46a7322abc42be6983e951d5806d)] __-__ add visual reasoning to docs (*Han Xiao*)
 - [[```2083f097```](https://github.com/jina-ai/clip-as-service/commit/2083f0970985a2260a7b6fbbaaaa8b1210036765)] __-__ __version__: the next version will be 0.4.17 (*Jina Dev Bot*)

<a name=release-note-0-4-18></a>
## Release Note (`0.4.18`)

> Release time: 2022-06-20 11:21:16


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🍹 Other Improvements

 - [[```a0c2661b```](https://github.com/jina-ai/clip-as-service/commit/a0c2661bc4764a74ff8737744b0d47fac4c1a5e9)] __-__ fix tag docker build job (*Han Xiao*)
 - [[```23f738ec```](https://github.com/jina-ai/clip-as-service/commit/23f738ecabebf906d001f83481f8cd10b89f5fb0)] __-__ __version__: the next version will be 0.4.18 (*Jina Dev Bot*)
 - [[```9e469bf7```](https://github.com/jina-ai/clip-as-service/commit/9e469bf70f4cd314353bde9c1ca8dfbda45fa532)] __-__ fix readme (*Han Xiao*)

<a name=release-note-0-4-19></a>
## Release Note (`0.4.19`)

> Release time: 2022-06-20 16:32:32


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```6902d2df```](https://github.com/jina-ai/clip-as-service/commit/6902d2dffc04b57e7a49f308139b27775a2193fa)] __-__ read config from stdin to allow pipe (#758) (*Han Xiao*)

### 📗 Documentation

 - [[```6e054db8```](https://github.com/jina-ai/clip-as-service/commit/6e054db893fcff4a2fe6c86073dd049e1c13f954)] __-__ read config from stdin to allow pipe (*Han Xiao*)

### 🍹 Other Improvements

 - [[```4a298d4f```](https://github.com/jina-ai/clip-as-service/commit/4a298d4f9fcbe342855f234d59c8e920e6918659)] __-__ add docker image docs (*Han Xiao*)
 - [[```1e931e8b```](https://github.com/jina-ai/clip-as-service/commit/1e931e8b2d2d8e5429c69e25df95ab15cb84ab66)] __-__ __version__: the next version will be 0.4.19 (*Jina Dev Bot*)
 - [[```a0c2661b```](https://github.com/jina-ai/clip-as-service/commit/a0c2661bc4764a74ff8737744b0d47fac4c1a5e9)] __-__ fix tag docker build job (*Han Xiao*)

<a name=release-note-0-4-20></a>
## Release Note (`0.4.20`)

> Release time: 2022-06-21 15:45:06


🙇 We'd like to thank all contributors for this new release! In particular,
 Han Xiao,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```79e85eed```](https://github.com/jina-ai/clip-as-service/commit/79e85eed7c89f31c16399bfcc1bb098f0ae5c920)] __-__ miscalling clip_server in clip_client (*Han Xiao*)

### 📗 Documentation

 - [[```6e054db8```](https://github.com/jina-ai/clip-as-service/commit/6e054db893fcff4a2fe6c86073dd049e1c13f954)] __-__ read config from stdin to allow pipe (*Han Xiao*)

### 🍹 Other Improvements

 - [[```c3e75133```](https://github.com/jina-ai/clip-as-service/commit/c3e751336722b415aa88992794119f32b7ddee77)] __-__ __version__: the next version will be 0.4.20 (*Jina Dev Bot*)

<a name=release-note-0-5-0></a>
## Release Note (`0.5.0`)

> Release time: 2022-08-03 05:13:06


🙇 We'd like to thank all contributors for this new release! In particular,
 numb3r3,  Ziniu Yu,  Alex Shan,  felix-wang,  Sha Zhou,  Jina Dev Bot,  Han Xiao,  🙇


### 🆕 New Features

 - [[```3402b1d1```](https://github.com/jina-ai/clip-as-service/commit/3402b1d1726120d8ed39ae561e441695f24ddeb3)] __-__ replace traversal_paths with access_paths (#791) (*Ziniu Yu*)
 - [[```87928a7b```](https://github.com/jina-ai/clip-as-service/commit/87928a7b8be9e8a4fce4d2352e82975252db162b)] __-__ update onnx models and md5 (#785) (*Ziniu Yu*)
 - [[```8bd83896```](https://github.com/jina-ai/clip-as-service/commit/8bd838964b7975c9c1a2394c0ae681507ed5dc18)] __-__ support onnx backend for openclip (#781) (*felix-wang*)
 - [[```f043b4d9```](https://github.com/jina-ai/clip-as-service/commit/f043b4d934a9454b5db32e7ea7331307506a1a6f)] __-__ update openclip loader (#782) (*Alex Shan*)
 - [[```fa62d8e9```](https://github.com/jina-ai/clip-as-service/commit/fa62d8e93baf2579b2934cc0ed8daca12c144d7d)] __-__ support openclip&amp;mclip models + refactor model loader (#774) (*Alex Shan*)
 - [[```32b11cd6```](https://github.com/jina-ai/clip-as-service/commit/32b11cd64bb76bca5075fbcbc84b9334952c236c)] __-__ allow model selection in client (#775) (*Ziniu Yu*)
 - [[```0ff4e252```](https://github.com/jina-ai/clip-as-service/commit/0ff4e2526394e0fa86266668f1162f4a6b922bd8)] __-__ allow credential in client (#765) (*Ziniu Yu*)
 - [[```ee7da10d```](https://github.com/jina-ai/clip-as-service/commit/ee7da10d1f56a130e6f9a85d5fb3518b80e5df0d)] __-__ support custom onnx file and update model signatures (#761) (*Ziniu Yu*)
 - [[```ed1b92d1```](https://github.com/jina-ai/clip-as-service/commit/ed1b92d1896cc0c12733b51bd1bd83040676f505)] __-__ __docs__: add qabot (#759) (*Sha Zhou*)

### 🐞 Bug fixes

 - [[```e48a7a38```](https://github.com/jina-ai/clip-as-service/commit/e48a7a38ac01fe0db47a7898ae1401f25394402f)] __-__ change onnx and trt default model name to ViT-B-32::openai (#793) (*Ziniu Yu*)
 - [[```8b8082a9```](https://github.com/jina-ai/clip-as-service/commit/8b8082a939f67f7ea01cc9f55ebce9c5368ebe1a)] __-__ mclip cuda device (#792) (*felix-wang*)
 - [[```8681b88e```](https://github.com/jina-ai/clip-as-service/commit/8681b88eb3a7806c1286eaefff3bd8a8ab28ff03)] __-__ fp16 inference (#790) (*felix-wang*)
 - [[```ab00c2ae```](https://github.com/jina-ai/clip-as-service/commit/ab00c2ae4067678b8f9c8351244867257031f3c2)] __-__ upgrade jina (#788) (*felix-wang*)
 - [[```1db43b48```](https://github.com/jina-ai/clip-as-service/commit/1db43b485b0fe368eb3949ddc052b5dd8002c279)] __-__ no allow client to change server batch size (#787) (*Ziniu Yu*)
 - [[```58772079```](https://github.com/jina-ai/clip-as-service/commit/5877207924c088739644873d6cf654aabb1f7134)] __-__ add models and md5 (#783) (*Ziniu Yu*)
 - [[```7c8285bb```](https://github.com/jina-ai/clip-as-service/commit/7c8285bbf7eb5d757cba1f85b56e6528be66396b)] __-__ async progress bar does not display (#779) (*Ziniu Yu*)
 - [[```79e85eed```](https://github.com/jina-ai/clip-as-service/commit/79e85eed7c89f31c16399bfcc1bb098f0ae5c920)] __-__ miscalling clip_server in clip_client (*Han Xiao*)

### 📗 Documentation

 - [[```c67a7f59```](https://github.com/jina-ai/clip-as-service/commit/c67a7f59c25760e32a611b330fd9ff5959aa1e4b)] __-__ add model support (#784) (*Alex Shan*)
 - [[```bc6b72e6```](https://github.com/jina-ai/clip-as-service/commit/bc6b72e65cce999ad7b09ecb93b25b07ff8f4de1)] __-__ add finetuner docs (#771) (*Ziniu Yu*)
 - [[```2b78b12e```](https://github.com/jina-ai/clip-as-service/commit/2b78b12e3aa527b386eac4ee7eed74e580eadbf6)] __-__ improve model support (#768) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```b00963c4```](https://github.com/jina-ai/clip-as-service/commit/b00963c45983dfdac6d05258b03298de5ad1edf6)] __-__ bump version to 0.5.0 (*numb3r3*)
 - [[```c458dd65```](https://github.com/jina-ai/clip-as-service/commit/c458dd6579d6e3125028ad4cb2b88f9f481b4686)] __-__ remove clip_hg (#786) (*Ziniu Yu*)
 - [[```ca03dca3```](https://github.com/jina-ai/clip-as-service/commit/ca03dca369d2e7ed55d2f2a339fa9b4e9f41667d)] __-__ fix markdown-table extention (#772) (*felix-wang*)
 - [[```7b19bffe```](https://github.com/jina-ai/clip-as-service/commit/7b19bffecb739a74a524544472aa3ad07dff2f2a)] __-__ __version__: the next version will be 0.4.21 (*Jina Dev Bot*)

<a name=release-note-0-5-1></a>
## Release Note (`0.5.1`)

> Release time: 2022-08-08 05:11:18


🙇 We'd like to thank all contributors for this new release! In particular,
 Ziniu Yu,  Jina Dev Bot,  numb3r3,  🙇


### 🆕 New Features

 - [[```65032f02```](https://github.com/jina-ai/clip-as-service/commit/65032f02db30671f7a2a6ca78e371588ae98ab2b)] __-__ encode text first when both text and uri are presented (#795) (*Ziniu Yu*)

### 📗 Documentation

 - [[```7c6708fa```](https://github.com/jina-ai/clip-as-service/commit/7c6708fa8a592b5ce306f1ab2f1af1504148484a)] __-__ update hub readme (#794) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```a7c4f490```](https://github.com/jina-ai/clip-as-service/commit/a7c4f4903df5736bcf9e85d82bb83497d850bc4d)] __-__ __version__: the next version will be 0.5.1 (*Jina Dev Bot*)
 - [[```b00963c4```](https://github.com/jina-ai/clip-as-service/commit/b00963c45983dfdac6d05258b03298de5ad1edf6)] __-__ bump version to 0.5.0 (*numb3r3*)

<a name=release-note-0-6-0></a>
## Release Note (`0.6.0`)

> Release time: 2022-08-30 04:19:21


🙇 We'd like to thank all contributors for this new release! In particular,
 numb3r3,  Ziniu Yu,  felix-wang,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```3c43eed3```](https://github.com/jina-ai/clip-as-service/commit/3c43eed38afe2ff84c8b06368f4301afcd332cf5)] __-__ do not send blob from server when it is loaded in client (#804) (*Ziniu Yu*)
 - [[```f852dfc8```](https://github.com/jina-ai/clip-as-service/commit/f852dfc876caa7b98552b5c707d4c85babc46393)] __-__ add warning if input is too large (#796) (*Ziniu Yu*)
 - [[```65032f02```](https://github.com/jina-ai/clip-as-service/commit/65032f02db30671f7a2a6ca78e371588ae98ab2b)] __-__ encode text first when both text and uri are presented (#795) (*Ziniu Yu*)

### 🐞 Bug fixes

 - [[```bb2c142b```](https://github.com/jina-ai/clip-as-service/commit/bb2c142b8899075c00db3b08e506fb970fee1478)] __-__ cast dtype for fp16 (#801) (*felix-wang*)

### 📗 Documentation

 - [[```a5893c70```](https://github.com/jina-ai/clip-as-service/commit/a5893c70531830f236d38fde5a880a9a2556474f)] __-__ update jcloud gpu usage (#809) (*Ziniu Yu*)
 - [[```b4fb0dd2```](https://github.com/jina-ai/clip-as-service/commit/b4fb0dd2823b6218da4395989c6b011cf3de1a38)] __-__ fix hub table typo (#803) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```2a80235c```](https://github.com/jina-ai/clip-as-service/commit/2a80235c0aa16eefdc6703989fc6da670cbd5c89)] __-__ bump version to 0.6.0 (*numb3r3*)
 - [[```59b9f771```](https://github.com/jina-ai/clip-as-service/commit/59b9f7716df9a325fb6e707d086ca6f2612da975)] __-__ update protobuf version (#810) (*Ziniu Yu*)
 - [[```89205f06```](https://github.com/jina-ai/clip-as-service/commit/89205f06d1b740952e79c512d6b0ef6f8db18300)] __-__ update executor docstring (#806) (*Ziniu Yu*)
 - [[```25c91e21```](https://github.com/jina-ai/clip-as-service/commit/25c91e21ee8de9e2cd1766d2c6c319f6e5609e80)] __-__ __version__: the next version will be 0.5.2 (*Jina Dev Bot*)

<a name=release-note-0-6-1></a>
## Release Note (`0.6.1`)

> Release time: 2022-08-30 13:57:32


🙇 We'd like to thank all contributors for this new release! In particular,
 felix-wang,  Jina Dev Bot,  numb3r3,  🙇


### 🐞 Bug fixes

 - [[```ea239685```](https://github.com/jina-ai/clip-as-service/commit/ea239685bff56372aeadaeb3050f5c2ccc37175f)] __-__ grpc meta auth (#811) (*felix-wang*)

### 🍹 Other Improvements

 - [[```83a8120c```](https://github.com/jina-ai/clip-as-service/commit/83a8120c22c76cf34f0d2e5966c368031e0fe9b4)] __-__ __version__: the next version will be 0.6.1 (*Jina Dev Bot*)
 - [[```2a80235c```](https://github.com/jina-ai/clip-as-service/commit/2a80235c0aa16eefdc6703989fc6da670cbd5c89)] __-__ bump version to 0.6.0 (*numb3r3*)

<a name=release-note-0-6-2></a>
## Release Note (`0.6.2`)

> Release time: 2022-09-01 04:16:27


🙇 We'd like to thank all contributors for this new release! In particular,
 Ziniu Yu,  Jina Dev Bot,  felix-wang,  🙇


### 🐞 Bug fixes

 - [[```ea239685```](https://github.com/jina-ai/clip-as-service/commit/ea239685bff56372aeadaeb3050f5c2ccc37175f)] __-__ grpc meta auth (#811) (*felix-wang*)

### 📗 Documentation

 - [[```4461d2e9```](https://github.com/jina-ai/clip-as-service/commit/4461d2e9ab07c01669237b220cd24cd6f95e30e8)] __-__ update model support table (#813) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```f7ee26a1```](https://github.com/jina-ai/clip-as-service/commit/f7ee26a17d47c1de0efc1122ccb40d3b22d217a8)] __-__ improve model not found error msg (#812) (*Ziniu Yu*)
 - [[```f1c0057d```](https://github.com/jina-ai/clip-as-service/commit/f1c0057d7e1c51953303bbf7b3743e19a9c300ab)] __-__ __version__: the next version will be 0.6.2 (*Jina Dev Bot*)

<a name=release-note-0-7-0></a>
## Release Note (`0.7.0`)

> Release time: 2022-09-13 13:47:54


🙇 We'd like to thank all contributors for this new release! In particular,
 numb3r3,  felix-wang,  Jie Fu,  Ziniu Yu,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```a07a5218```](https://github.com/jina-ai/clip-as-service/commit/a07a52182d02b3cab1135235c9aee8e1af4f280c)] __-__ support clip retrieval (#816) (*felix-wang*)

### 🐞 Bug fixes

 - [[```213ecc28```](https://github.com/jina-ai/clip-as-service/commit/213ecc28afa20bbb0984efd4ab28dd08443e9369)] __-__ always return docarray as search result (#821) (*felix-wang*)
 - [[```eca57745```](https://github.com/jina-ai/clip-as-service/commit/eca577455a0d378cc4d9974ef3109f2d2e74c1b3)] __-__ __readme__: use new demo server (#819) (*felix-wang*)

### 📗 Documentation

 - [[```8d9725fb```](https://github.com/jina-ai/clip-as-service/commit/8d9725fb874d94944cb1129ca2ccc8293c52dc90)] __-__ update clip search (#820) (*felix-wang*)
 - [[```fa7e5776```](https://github.com/jina-ai/clip-as-service/commit/fa7e577606d68e65a0e7952048c64d2b3a28e231)] __-__ docs for retrieval (#808) (*Jie Fu*)
 - [[```47144c23```](https://github.com/jina-ai/clip-as-service/commit/47144c23fd6b10f9aed0dfc4a2e37f83bc33f284)] __-__ enable horizontal scrolling in wide tables (#818) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```53636cea```](https://github.com/jina-ai/clip-as-service/commit/53636cea63bf8063bcfd744aae4577df8e0eab2e)] __-__ bump version to 0.7.0 (*numb3r3*)
 - [[```eda4aa8e```](https://github.com/jina-ai/clip-as-service/commit/eda4aa8e958bbbd83dddcd5932622bcf041f3918)] __-__ __version__: the next version will be 0.6.3 (*Jina Dev Bot*)
 - [[```f7ee26a1```](https://github.com/jina-ai/clip-as-service/commit/f7ee26a17d47c1de0efc1122ccb40d3b22d217a8)] __-__ improve model not found error msg (#812) (*Ziniu Yu*)

<a name=release-note-0-8-0></a>
## Release Note (`0.8.0`)

> Release time: 2022-10-12 08:11:40


🙇 We'd like to thank all contributors for this new release! In particular,
 numb3r3,  Jie Fu,  Ziniu Yu,  felix-wang,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```2ba8a4fe```](https://github.com/jina-ai/clip-as-service/commit/2ba8a4fe71f26faa5e92d62df04edb616389f6bd)] __-__ support large ONNX model files (#828) (*Ziniu Yu*)
 - [[```09d15485```](https://github.com/jina-ai/clip-as-service/commit/09d15485d50c51a77cb57380f4b848b41764a1b6)] __-__ support B/32, L/14, H/14, and g/14 trained on LAION-2B (#825) (*Ziniu Yu*)
 - [[```c690c247```](https://github.com/jina-ai/clip-as-service/commit/c690c247946017d178d9340d8c951342c0321943)] __-__ drop image content to boost latency (#824) (*felix-wang*)
 - [[```bcce9900```](https://github.com/jina-ai/clip-as-service/commit/bcce990032abfd618cea408ab3f0fb4e352789ae)] __-__ in-place result in clip_client; preserve output order by uid (#815) (*Ziniu Yu*)

### 📗 Documentation

 - [[```87fdc548```](https://github.com/jina-ai/clip-as-service/commit/87fdc5489c5b33b76e28dd1c0b54017a51dd4abe)] __-__ add memory profile (#841) (*Jie Fu*)
 - [[```7ee58c8b```](https://github.com/jina-ai/clip-as-service/commit/7ee58c8b2751f949790983f223209ad1d2261fca)] __-__ clip benchmark on zeroshot classification and retrieval tasks (#832) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```920b3107```](https://github.com/jina-ai/clip-as-service/commit/920b31070f54b1b6af4d4e58e7db351a576e0783)] __-__ bump version to 0.8.0 (*numb3r3*)
 - [[```54e99786```](https://github.com/jina-ai/clip-as-service/commit/54e99786ea07b9ad109f593890e3b4945d39b768)] __-__ add description for retrieval playground (#834) (*Jie Fu*)
 - [[```a26a883f```](https://github.com/jina-ai/clip-as-service/commit/a26a883fa15a47243450c9cebbfc7f472e6cfa04)] __-__ use open clip naming convention for model names (#836) (*Ziniu Yu*)
 - [[```f40513d5```](https://github.com/jina-ai/clip-as-service/commit/f40513d57c0c3f7e466160f41547c970618af85a)] __-__ fix docs website template (#833) (*Ziniu Yu*)
 - [[```d520ebb8```](https://github.com/jina-ai/clip-as-service/commit/d520ebb835e2814f7696148a0dcabbbf8bdadc76)] __-__ remove unused md (*numb3r3*)
 - [[```2c3c61f9```](https://github.com/jina-ai/clip-as-service/commit/2c3c61f9d6f5a351f235dbad45879f0c7c4fd986)] __-__ __version__: the next version will be 0.7.1 (*Jina Dev Bot*)
 - [[```53636cea```](https://github.com/jina-ai/clip-as-service/commit/53636cea63bf8063bcfd744aae4577df8e0eab2e)] __-__ bump version to 0.7.0 (*numb3r3*)

<a name=release-note-0-8-1></a>
## Release Note (`0.8.1`)

> Release time: 2022-11-15 11:15:48


🙇 We'd like to thank all contributors for this new release! In particular,
 YangXiuyu,  Ziniu Yu,  felix-wang,  Jie Fu,  Jina Dev Bot,  numb3r3,  🙇


### 🆕 New Features

 - [[```e4717a35```](https://github.com/jina-ai/clip-as-service/commit/e4717a35f850e6a2cd8b4d8b4c994fad30fd5c72)] __-__ Integrate flash attention (#853) (*YangXiuyu*)
 - [[```4fcbf68a```](https://github.com/jina-ai/clip-as-service/commit/4fcbf68a883cb3143e47738df4c8044dfec2a131)] __-__ allow custom callback in clip_client (#849) (*Ziniu Yu*)

### 🐞 Bug fixes

 - [[```71086227```](https://github.com/jina-ai/clip-as-service/commit/710862279bdef342983bd7944f413d8ee54f9603)] __-__ increase timeout ready for executor docker images (#854) (*Ziniu Yu*)
 - [[```f96ce543```](https://github.com/jina-ai/clip-as-service/commit/f96ce5433dc1ec473ae89e22f01520b93abc6071)] __-__ install transformers for executor docker images (#851) (*Ziniu Yu*)

### 📗 Documentation

 - [[```9aa8c224```](https://github.com/jina-ai/clip-as-service/commit/9aa8c224f93c4a7b52fecac8fe8a18832ce98814)] __-__ add tips for client parallelism usage (#846) (*Ziniu Yu*)
 - [[```8776784d```](https://github.com/jina-ai/clip-as-service/commit/8776784d2cf2b0bfce44724db10c18dbda7acb77)] __-__ add instructions for using clip server hosted by jina (#848) (*Ziniu Yu*)
 - [[```d91da50c```](https://github.com/jina-ai/clip-as-service/commit/d91da50cc86942623dbee2cdb6b31350d9ce6a8e)] __-__ move benchmark conclusion to beginning (#847) (*Ziniu Yu*)
 - [[```baf94b5f```](https://github.com/jina-ai/clip-as-service/commit/baf94b5f70b9c18cfe2c0fea3e284fe30e4ca093)] __-__ update finetuner docs (#843) (*Jie Fu*)

### 🍹 Other Improvements

 - [[```d2ecec60```](https://github.com/jina-ai/clip-as-service/commit/d2ecec60e9be4235518d19b0e2f2342fa5401dfc)] __-__ allow test to pass even if commit name is not good (#856) (*Ziniu Yu*)
 - [[```ebfa494c```](https://github.com/jina-ai/clip-as-service/commit/ebfa494c9218a848e0bc49a552dabecda1373dbb)] __-__ replace clip server address in docs (#857) (*Ziniu Yu*)
 - [[```fe112ea5```](https://github.com/jina-ai/clip-as-service/commit/fe112ea5ec8dd9de8fd842633b17dcb9079c79a4)] __-__ change hub url from hub.jina.ai to cloud.jina.ai (#845) (*Ziniu Yu*)
 - [[```ae05624d```](https://github.com/jina-ai/clip-as-service/commit/ae05624d68bf8c3fbcefc1d07b0adabbe1cad422)] __-__ use new free service in playground (#844) (*felix-wang*)
 - [[```6cdc3e21```](https://github.com/jina-ai/clip-as-service/commit/6cdc3e21bb6e0b0476b94e40cfa88a475d4a5f7d)] __-__ __version__: the next version will be 0.8.1 (*Jina Dev Bot*)
 - [[```920b3107```](https://github.com/jina-ai/clip-as-service/commit/920b31070f54b1b6af4d4e58e7db351a576e0783)] __-__ bump version to 0.8.0 (*numb3r3*)

<a name=release-note-0-8-2></a>
## Release Note (`0.8.2`)

> Release time: 2023-04-19 08:23:45


🙇 We'd like to thank all contributors for this new release! In particular,
 Ziniu Yu,  Yang Ruiyi,  YangXiuyu,  Jie Fu,  zawabest,  Girish Chandrashekar,  Jina Dev Bot,  🙇


### 🆕 New Features

 - [[```cce3b05a```](https://github.com/jina-ai/clip-as-service/commit/cce3b05a1cfa23db129e8a7077e75e75f5da73c6)] __-__ set prefetch in client for traffic control (#897) (*Ziniu Yu*)
 - [[```dabbe8bc```](https://github.com/jina-ai/clip-as-service/commit/dabbe8bc3ef633e4460e1be3f1c06792fe08f00c)] __-__ add cn clip model (#888) (*Yang Ruiyi*)
 - [[```1fe3a5a0```](https://github.com/jina-ai/clip-as-service/commit/1fe3a5a01123dcfea8a7981fc5aea212d42c1299)] __-__ add fp16 inference support (torch/onnx) (#871) (*YangXiuyu*)
 - [[```1eebdd7f```](https://github.com/jina-ai/clip-as-service/commit/1eebdd7f489abb8e694226d5c5c29b011eab229a)] __-__ add custom tracing spans with jina&gt;=3.12.0 (#861) (*Girish Chandrashekar*)
 - [[```f2515394```](https://github.com/jina-ai/clip-as-service/commit/f25153942464bb9230158af33c324cdb0b8b70a4)] __-__ add three new open clip roberta base models  (#860) (*YangXiuyu*)
 - [[```e4717a35```](https://github.com/jina-ai/clip-as-service/commit/e4717a35f850e6a2cd8b4d8b4c994fad30fd5c72)] __-__ Integrate flash attention (#853) (*YangXiuyu*)

### 🐞 Bug fixes

 - [[```280b925e```](https://github.com/jina-ai/clip-as-service/commit/280b925e16ab5605a124d412f66ff56caa492553)] __-__ fix docarray at v1 (#911) (*Ziniu Yu*)
 - [[```35733a0b```](https://github.com/jina-ai/clip-as-service/commit/35733a0ba7fe6d9ae64d2d4d657d6ded2df3a6d1)] __-__ replace transform ndarray with transform blob (#910) (*Ziniu Yu*)
 - [[```d70f2382```](https://github.com/jina-ai/clip-as-service/commit/d70f238220f76593fb9b14e43e50f9a9d2cecd8a)] __-__ onnx package conflict during setup (#894) (*Ziniu Yu*)
 - [[```8a576c58```](https://github.com/jina-ai/clip-as-service/commit/8a576c585756e6526b1fe4a526858252d096535a)] __-__ install pytorch cu116 for server docker image (#882) (*Ziniu Yu*)
 - [[```0b293ec8```](https://github.com/jina-ai/clip-as-service/commit/0b293ec834e80f7335aa625d683904594373a607)] __-__ dynamic convert onnx model to fp16 during start session (#876) (*YangXiuyu*)
 - [[```fd16e5ab```](https://github.com/jina-ai/clip-as-service/commit/fd16e5abef94e274572d40912f12baeffece8696)] __-__ check dtype when loading models (#872) (*Ziniu Yu*)
 - [[```67f551ca```](https://github.com/jina-ai/clip-as-service/commit/67f551ca46c2bcf8c8598d6749544bd335da8bdb)] __-__ torchvision version to avoid compatibility issue (#866) (*Jie Fu*)
 - [[```0223e6fa```](https://github.com/jina-ai/clip-as-service/commit/0223e6fa071534bfc1a3b2010dd7065623afd540)] __-__ add pip installable flash attention (#863) (*YangXiuyu*)

### 📗 Documentation

 - [[```1888ef65```](https://github.com/jina-ai/clip-as-service/commit/1888ef65f20a94b38f318696e663d447c7cb1dc6)] __-__ fix broken link in client doc (#909) (*Ziniu Yu*)
 - [[```f4eed3bc```](https://github.com/jina-ai/clip-as-service/commit/f4eed3bcbf5757571365159582d09f22c0ca8ed2)] __-__ add link and intro to inference api (#900) (*Ziniu Yu*)
 - [[```702fff88```](https://github.com/jina-ai/clip-as-service/commit/702fff88fc8070138b6eee517d9bb6167da0e87f)] __-__ default model suggestion (#874) (*Jie Fu*)

### 🍹 Other Improvements

 - [[```19b4fa51```](https://github.com/jina-ai/clip-as-service/commit/19b4fa51f7534b38a8ca236f05483602e44c0536)] __-__ remove docsqa html (#899) (*Ziniu Yu*)
 - [[```aa07d257```](https://github.com/jina-ai/clip-as-service/commit/aa07d2577fd27df03ccfff409ee00420071c41af)] __-__ remove docsqa (#898) (*Ziniu Yu*)
 - [[```f3421f7c```](https://github.com/jina-ai/clip-as-service/commit/f3421f7c1decbbdd3a5e1f1038666479c8fe60f6)] __-__ bump open-clip-torch to v2.8.0 (#883) (*Ziniu Yu*)
 - [[```c7af9f71```](https://github.com/jina-ai/clip-as-service/commit/c7af9f718550600973c6880de442619228f655e8)] __-__ fix configuration file for the search flow doc (#869) (*zawabest*)
 - [[```53cd0630```](https://github.com/jina-ai/clip-as-service/commit/53cd06301efde97e6e59a2b143323ccd5f5f2565)] __-__ hide changelog in docs (#864) (*Ziniu Yu*)
 - [[```9bb7d1f4```](https://github.com/jina-ai/clip-as-service/commit/9bb7d1f47d19e15e844108dec5f84cabcce7975d)] __-__ __version__: the next version will be 0.8.2 (*Jina Dev Bot*)

<a name=release-note-0-8-3></a>
## Release Note (`0.8.3`)

> Release time: 2023-12-20 04:13:18


🙇 We'd like to thank all contributors for this new release! In particular,
 Zihao Jing,  Han Xiao,  Nick de Silva,  Ziniu Yu,  Jina Dev Bot,  🙇


### 🐞 Bug fixes

 - [[```280b925e```](https://github.com/jina-ai/clip-as-service/commit/280b925e16ab5605a124d412f66ff56caa492553)] __-__ fix docarray at v1 (#911) (*Ziniu Yu*)

### 📗 Documentation

 - [[```ca2b25b7```](https://github.com/jina-ai/clip-as-service/commit/ca2b25b7564bc9b18ae38b93f0134e1f9aa0cee7)] __-__ remove jina self-hosted parts (#942) (*Zihao Jing*)
 - [[```6e418fe6```](https://github.com/jina-ai/clip-as-service/commit/6e418fe69c10dbac155e02267828d922a5601691)] __-__ replace free service docs with inference docs (#918) (*Ziniu Yu*)

### 🍹 Other Improvements

 - [[```d4e7a30b```](https://github.com/jina-ai/clip-as-service/commit/d4e7a30b755b2d314f89181fcc42624a1224b9ae)] __-__ Update README.md (*Han Xiao*)
 - [[```679de4e3```](https://github.com/jina-ai/clip-as-service/commit/679de4e3c9cb02b712f58540f6a3dd2e32d8e5e9)] __-__ change slack link to discord (*Han Xiao*)
 - [[```02abdc7b```](https://github.com/jina-ai/clip-as-service/commit/02abdc7b68214bedc181d9ef4be1c093ee60c609)] __-__ __version__: the next version will be 0.8.3 (*Jina Dev Bot*)


================================================
FILE: Dockerfiles/base.Dockerfile
================================================
# !!! An ARG declared before a FROM is outside of a build stage, so it can’t be used in any instruction after a FROM
ARG JINA_VERSION=3.11.0

FROM jinaai/jina:${JINA_VERSION}-py38-standard

ARG BACKEND_TAG=torch

# constant, wont invalidate cache
LABEL org.opencontainers.image.vendor="Jina AI Limited" \
      org.opencontainers.image.licenses="Apache 2.0" \
      org.opencontainers.image.title="CLIP-as-Service" \
      org.opencontainers.image.description="Embed images and sentences into fixed-length vectors with CLIP" \
      org.opencontainers.image.authors="hello@jina.ai" \
      org.opencontainers.image.url="clip-as-service" \
      org.opencontainers.image.documentation="https://clip-as-service.jina.ai/"

RUN pip3 install --no-cache-dir torch torchvision torchaudio transformers --extra-index-url https://download.pytorch.org/whl/cpu

# copy will almost always invalid the cache
COPY . /cas/

WORKDIR /cas

RUN if [ "${BACKEND_TAG}" != "torch" ]; then python3 -m pip install --no-cache-dir "./[${BACKEND_TAG}]" ; fi \
    && python3 -m pip install --no-cache-dir .

RUN echo "\
jtype: CLIPEncoder\n\
metas:\n\
  py_modules:\n\
    - clip_server.executors.clip_$BACKEND_TAG\n\
" > /tmp/config.yml


ENTRYPOINT ["jina", "executor", "--uses", "/tmp/config.yml", "--timeout-ready", "3000000"]


================================================
FILE: Dockerfiles/cuda.Dockerfile
================================================
ARG CUDA_VERSION=11.4.2

FROM nvcr.io/nvidia/cuda:${CUDA_VERSION}-cudnn8-runtime-ubuntu20.04
ENV DEBIAN_FRONTEND=noninteractive

ARG JINA_VERSION=3.11.0
ARG BACKEND_TAG=torch

# constant, wont invalidate cache
LABEL org.opencontainers.image.vendor="Jina AI Limited" \
      org.opencontainers.image.licenses="Apache 2.0" \
      org.opencontainers.image.title="CLIP-as-Service" \
      org.opencontainers.image.description="Embed images and sentences into fixed-length vectors with CLIP" \
      org.opencontainers.image.authors="hello@jina.ai" \
      org.opencontainers.image.url="clip-as-service" \
      org.opencontainers.image.documentation="https://clip-as-service.jina.ai/"

RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-setuptools python3-wheel python3-pip \
    && apt-get clean && rm -rf /var/lib/apt/lists/*;

RUN python3 -m pip install --default-timeout=1000 --no-cache-dir torch torchvision torchaudio nvidia-pyindex transformers --extra-index-url https://download.pytorch.org/whl/cu113
RUN python3 -m pip install --default-timeout=1000 --no-cache-dir "jina[standard]==${JINA_VERSION}"

# copy will almost always invalid the cache
COPY . /cas/

WORKDIR /cas

RUN if [ "${BACKEND_TAG}" != "torch" ]; then python3 -m pip install --no-cache-dir "./[${BACKEND_TAG}]" ; fi \
    && python3 -m pip install --no-cache-dir .

RUN echo "\
jtype: CLIPEncoder\n\
metas:\n\
  py_modules:\n\
    - clip_server.executors.clip_$BACKEND_TAG\n\
" > /tmp/config.yml

ENTRYPOINT ["jina", "executor", "--uses", "/tmp/config.yml", "--timeout-ready", "3000000"]


================================================
FILE: Dockerfiles/server.Dockerfile
================================================
ARG CUDA_VERSION=11.6.0

FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04

ARG CAS_NAME=cas
WORKDIR /${CAS_NAME}

ENV PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

# constant, wont invalidate cache
LABEL org.opencontainers.image.vendor="Jina AI Limited" \
      org.opencontainers.image.licenses="Apache 2.0" \
      org.opencontainers.image.title="CLIP-as-Service" \
      org.opencontainers.image.description="Embed images and sentences into fixed-length vectors with CLIP" \
      org.opencontainers.image.authors="hello@jina.ai" \
      org.opencontainers.image.url="clip-as-service" \
      org.opencontainers.image.documentation="https://clip-as-service.jina.ai/"


RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip wget \
    && ln -sf python3 /usr/bin/python \
    && ln -sf pip3 /usr/bin/pip \
    && pip install --upgrade pip \
    && pip install wheel setuptools nvidia-pyindex \
    && pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

COPY server ./server
# given by builder
ARG PIP_TAG
RUN pip install --default-timeout=1000 --compile ./server/ \
    && if [ -n "${PIP_TAG}" ]; then pip install --default-timeout=1000 --compile "./server[${PIP_TAG}]" ; fi

ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64

ARG USER_ID=1000
ARG GROUP_ID=1000
ARG USER_NAME=${CAS_NAME}
ARG GROUP_NAME=${CAS_NAME}

RUN groupadd -g ${GROUP_ID} ${USER_NAME} &&\
    useradd -l -u ${USER_ID} -g ${USER_NAME} ${GROUP_NAME} &&\
    mkdir /home/${USER_NAME} &&\
    chown ${USER_NAME}:${GROUP_NAME} /home/${USER_NAME} &&\
    chown -R ${USER_NAME}:${GROUP_NAME} /${CAS_NAME}/

USER ${USER_NAME}

ENTRYPOINT ["python", "-m", "clip_server"]

================================================
FILE: Dockerfiles/tensorrt.Dockerfile
================================================
# Dockerfile to run Clip-as-Service with TensorRT, CUDA integration

ARG TENSORRT_VERSION=22.04

FROM nvcr.io/nvidia/tensorrt:${TENSORRT_VERSION}-py3

ARG JINA_VERSION=3.7.0
ARG BACKEND_TAG=tensorrt

# constant, wont invalidate cache
LABEL org.opencontainers.image.vendor="Jina AI Limited" \
      org.opencontainers.image.licenses="Apache 2.0" \
      org.opencontainers.image.title="CLIP-as-Service" \
      org.opencontainers.image.description="Embed images and sentences into fixed-length vectors with CLIP" \
      org.opencontainers.image.authors="hello@jina.ai" \
      org.opencontainers.image.url="clip-as-service" \
      org.opencontainers.image.documentation="https://clip-as-service.jina.ai/"

RUN pip3 install --default-timeout=1000 --no-cache-dir torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
RUN pip3 -m pip install --default-timeout=1000 --no-cache-dir "jina[standard]==${JINA_VERSION}"

# copy will almost always invalid the cache
COPY . /cas/
WORKDIR /cas

RUN python3 -m pip install --no-cache-dir "./[$BACKEND_TAG]"


RUN echo "\
jtype: CLIPEncoder\n\
metas:\n\
  py_modules:\n\
    - clip_server.executors.clip_$BACKEND_TAG\n\
" > /tmp/config.yml


ENTRYPOINT ["jina", "executor", "--uses", "/tmp/config.yml"]


================================================
FILE: LICENSE
================================================
Copyright 2020-2022 Jina AI Limited.  All rights reserved.

The following two files are licensed under MIT License via https://github.com/mlfoundations/open_clip Copyright (c) 2021, OpenCLIP
    server/clip_server/model/model.py
    server/clip_server/model/simple_tokenizer.py


                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   Copyright 2020-2022 Jina AI Limited

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: README.md
================================================
<p align="center">
<a href="https://clip-as-service.jina.ai"><img src="https://github.com/jina-ai/clip-as-service/blob/main/docs/_static/logo-light.svg?raw=true" alt="CLIP-as-service logo: The data structure for unstructured data" width="200px"></a>
<br><br><br>
</p>


<p align=center>
<a href="https://pypi.org/project/clip_server/"><img alt="PyPI" src="https://img.shields.io/pypi/v/clip_server?label=Release&style=flat-square"></a>
<a href="https://discord.jina.ai"><img src="https://img.shields.io/discord/1106542220112302130?logo=discord&logoColor=white&style=flat-square"></a>
<a href="https://codecov.io/gh/jina-ai/clip-as-service"><img alt="Codecov branch" src="https://img.shields.io/codecov/c/github/jina-ai/clip-as-service/main?logo=Codecov&logoColor=white&style=flat-square"></a>
<a href="https://colab.research.google.com/github/jina-ai/clip-as-service/blob/main/docs/hosting/cas-on-colab.ipynb"><img src="https://img.shields.io/badge/Host-on%20Google%20Colab%20(GPU/TPU)-brightgreen?style=flat-square&logo=googlecolab&&logoColor=white" alt="Host on Google Colab with GPU/TPU support"></a>
</p>

<!-- start elevator-pitch -->

CLIP-as-service is a low-latency high-scalability service for embedding images and text. It can be easily integrated as a microservice into neural search solutions.

⚡ **Fast**: Serve CLIP models with TensorRT, ONNX runtime and PyTorch w/o JIT with 800QPS<sup>[*]</sup>. Non-blocking duplex streaming on requests and responses, designed for large data and long-running tasks. 

🫐 **Elastic**: Horizontally scale up and down multiple CLIP models on single GPU, with automatic load balancing.

🐥 **Easy-to-use**: No learning curve, minimalist design on client and server. Intuitive and consistent API for image and sentence embedding. 

👒 **Modern**: Async client support. Easily switch between gRPC, HTTP, WebSocket protocols with TLS and compression.

🍱 **Integration**: Smooth integration with neural search ecosystem including [Jina](https://github.com/jina-ai/jina) and [DocArray](https://github.com/jina-ai/docarray). Build cross-modal and multi-modal solutions in no time. 

<sup>[*] with default config (single replica, PyTorch no JIT) on GeForce RTX 3090. </sup>

<!-- end elevator-pitch -->

### Text & image embedding

<table>
<tr>
<td> via HTTPS 🔐 </td>
<td> via gRPC 🔐⚡⚡ </td>
</tr>
<tr>
<td>

```bash
curl \
-X POST https://<your-inference-address>-http.wolf.jina.ai/post \
-H 'Content-Type: application/json' \
-H 'Authorization: <your access token>' \
-d '{"data":[{"text": "First do it"}, 
    {"text": "then do it right"}, 
    {"text": "then do it better"}, 
    {"uri": "https://picsum.photos/200"}], 
    "execEndpoint":"/"}'
```

</td>
<td>

```python
# pip install clip-client
from clip_client import Client

c = Client(
    'grpcs://<your-inference-address>-grpc.wolf.jina.ai',
    credential={'Authorization': '<your access token>'},
)

r = c.encode(
    [
        'First do it',
        'then do it right',
        'then do it better',
        'https://picsum.photos/200',
    ]
)
print(r)
```
</td>
</tr>
</table>

### Visual reasoning

There are four basic visual reasoning skills: object recognition, object counting, color recognition, and spatial relation understanding. Let's try some:

> You need to install [`jq` (a JSON processor)](https://stedolan.github.io/jq/) to prettify the results.

<table>
<tr>
<td> Image </td>
<td> via HTTPS 🔐 </td>
</tr>
<tr>
<td>
<img src="https://picsum.photos/id/1/300/300">
</td>
<td>

```bash
curl \
-X POST https://<your-inference-address>-http.wolf.jina.ai/post \
-H 'Content-Type: application/json' \
-H 'Authorization: <your access token>' \
-d '{"data":[{"uri": "https://picsum.photos/id/1/300/300",
"matches": [{"text": "there is a woman in the photo"},
            {"text": "there is a man in the photo"}]}],
            "execEndpoint":"/rank"}' \
| jq ".data[].matches[] | (.text, .scores.clip_score.value)"
```

gives:

```
"there is a woman in the photo"
0.626907229423523
"there is a man in the photo"
0.37309277057647705
```

</td>
</tr>
<tr>
<td>
<img src="https://picsum.photos/id/133/300/300">
</td>
<td>

```bash
curl \
-X POST https://<your-inference-address>-http.wolf.jina.ai/post \
-H 'Content-Type: application/json' \
-H 'Authorization: <your access token>' \
-d '{"data":[{"uri": "https://picsum.photos/id/133/300/300",
"matches": [
{"text": "the blue car is on the left, the red car is on the right"},
{"text": "the blue car is on the right, the red car is on the left"},
{"text": "the blue car is on top of the red car"},
{"text": "the blue car is below the red car"}]}],
"execEndpoint":"/rank"}' \
| jq ".data[].matches[] | (.text, .scores.clip_score.value)"
```

gives:
```
"the blue car is on the left, the red car is on the right"
0.5232442617416382
"the blue car is on the right, the red car is on the left"
0.32878655195236206
"the blue car is below the red car"
0.11064132302999496
"the blue car is on top of the red car"
0.03732786327600479
```

</td>
</tr>


<tr>
<td>
<img src="https://picsum.photos/id/102/300/300">
</td>
<td>

```bash
curl \
-X POST https://<your-inference-address>-http.wolf.jina.ai/post \
-H 'Content-Type: application/json' \
-H 'Authorization: <your access token>' \
-d '{"data":[{"uri": "https://picsum.photos/id/102/300/300",
"matches": [{"text": "this is a photo of one berry"},
            {"text": "this is a photo of two berries"},
            {"text": "this is a photo of three berries"},
            {"text": "this is a photo of four berries"},
            {"text": "this is a photo of five berries"},
            {"text": "this is a photo of six berries"}]}],
            "execEndpoint":"/rank"}' \
| jq ".data[].matches[] | (.text, .scores.clip_score.value)"
```

gives:
```
"this is a photo of three berries"
0.48507222533226013
"this is a photo of four berries"
0.2377079576253891
"this is a photo of one berry"
0.11304923892021179
"this is a photo of five berries"
0.0731358453631401
"this is a photo of two berries"
0.05045759305357933
"this is a photo of six berries"
0.04057715833187103
```

</td>
</tr>


</table>


## [Documentation](https://clip-as-service.jina.ai)

## Install

CLIP-as-service consists of two Python packages `clip-server` and `clip-client` that can be installed _independently_. Both require Python 3.7+. 

### Install server

<table>
<tr>
<td> Pytorch Runtime ⚡ </td>
<td> ONNX Runtime ⚡⚡</td>
<td> TensorRT Runtime ⚡⚡⚡ </td>
</tr>
<tr>
<td>

```bash
pip install clip-server
```

</td>
<td>

```bash
pip install "clip-server[onnx]"
```

</td>
<td>

```bash
pip install nvidia-pyindex 
pip install "clip-server[tensorrt]"
```
</td>
</tr>
</table>

You can also [host the server on Google Colab](https://clip-as-service.jina.ai/hosting/colab/), leveraging its free GPU/TPU.

### Install client

```bash
pip install clip-client
```

### Quick check

You can run a simple connectivity check after install.


<table>
<tr>
<th> C/S </th> 
<th> Command </th> 
<th> Expect output </th>
</tr>
<tr>
<td>
Server
</td>
<td> 

```bash
python -m clip_server
```
     
</td>
<td>

<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/server-output.svg?raw=true" alt="Expected server output" width="300px">

</td>
</tr>
<tr>
<td>
Client
</td>
<td> 

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:23456')
c.profile()
```
     
</td>
<td>

<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/pyclient-output.svg?raw=true" alt="Expected clip-client output" width="300px">

</td>
</tr>
</table>


You can change `0.0.0.0` to the intranet or public IP address to test the connectivity over private and public network. 


## Get Started

### Basic usage

1. Start the server: `python -m clip_server`. Remember its address and port.
2. Create a client:
   ```python
    from clip_client import Client
   
    c = Client('grpc://0.0.0.0:51000')
    ```
3. To get sentence embedding:
    ```python    
    r = c.encode(['First do it', 'then do it right', 'then do it better'])
    
    print(r.shape)  # [3, 512] 
    ```
4. To get image embedding:
    ```python    
    r = c.encode(['apple.png',  # local image 
                  'https://clip-as-service.jina.ai/_static/favicon.png',  # remote image
                  'data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7'])  # in image URI
    
    print(r.shape)  # [3, 512]
    ```

More comprehensive server and client user guides can be found in the [docs](https://clip-as-service.jina.ai/).

### Text-to-image cross-modal search in 10 lines

Let's build a text-to-image search using CLIP-as-service. Namely, a user can input a sentence and the program returns matching images. We'll use the [Totally Looks Like](https://sites.google.com/view/totally-looks-like-dataset) dataset and [DocArray](https://github.com/jina-ai/docarray) package. Note that DocArray is included within `clip-client` as an upstream dependency, so you don't need to install it separately.

#### Load images

First we load images. You can simply pull them from Jina Cloud:

```python
from docarray import DocumentArray

da = DocumentArray.pull('ttl-original', show_progress=True, local_cache=True)
```

<details>
<summary>or download TTL dataset, unzip, load manually</summary>

Alternatively, you can go to [Totally Looks Like](https://sites.google.com/view/totally-looks-like-dataset) official website, unzip and load images:

```python
from docarray import DocumentArray

da = DocumentArray.from_files(['left/*.jpg', 'right/*.jpg'])
```

</details>

The dataset contains 12,032 images, so it may take a while to pull. Once done, you can visualize it and get the first taste of those images:

```python
da.plot_image_sprites()
```

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/ttl-image-sprites.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="50%">
</p>

#### Encode images

Start the server with `python -m clip_server`. Let's say it's at `0.0.0.0:51000` with `GRPC` protocol (you will get this information after running the server).

Create a Python client script:

```python
from clip_client import Client

c = Client(server='grpc://0.0.0.0:51000')

da = c.encode(da, show_progress=True)
```

Depending on your GPU and client-server network, it may take a while to embed 12K images. In my case, it took about two minutes.

<details>
<summary>Download the pre-encoded dataset</summary>

If you're impatient or don't have a GPU, waiting can be Hell. In this case, you can simply pull our pre-encoded image dataset:

```python
from docarray import DocumentArray

da = DocumentArray.pull('ttl-embedding', show_progress=True, local_cache=True)
```

</details>

#### Search via sentence 

Let's build a simple prompt to allow a user to type sentence:

```python
while True:
    vec = c.encode([input('sentence> ')])
    r = da.find(query=vec, limit=9)
    r[0].plot_image_sprites()
```

#### Showcase

Now you can input arbitrary English sentences and view the top-9 matching images. Search is fast and instinctive. Let's have some fun:

<table>
<tr>
<th> "a happy potato" </th> 
<th> "a super evil AI" </th> 
<th> "a guy enjoying his burger" </th>
</tr>
<tr>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/a-happy-potato.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="100%">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/a-super-evil-AI.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="100%">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/a-guy-enjoying-his-burger.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="100%">
</p>

</td>
</tr>
</table>


<table>
<tr>
<th> "professor cat is very serious" </th> 
<th> "an ego engineer lives with parent" </th> 
<th> "there will be no tomorrow so lets eat unhealthy" </th>
</tr>
<tr>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/professor-cat-is-very-serious.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="100%">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/an-ego-engineer-lives-with-parent.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="100%">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/there-will-be-no-tomorrow-so-lets-eat-unhealthy.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" width="100%">
</p>

</td>
</tr>
</table>

Let's save the embedding result for our next example: 

```python
da.save_binary('ttl-image')
```

### Image-to-text cross-modal search in 10 Lines

We can also switch the input and output of the last program to achieve image-to-text search. Precisely, given a query image find the sentence that best describes the image.

Let's use all sentences from the book "Pride and Prejudice". 

```python
from docarray import Document, DocumentArray

d = Document(uri='https://www.gutenberg.org/files/1342/1342-0.txt').load_uri_to_text()
da = DocumentArray(
    Document(text=s.strip()) for s in d.text.replace('\r\n', '').split('.') if s.strip()
)
```

Let's look at what we got:

```python
da.summary()
```

```text
            Documents Summary            
                                         
  Length                 6403            
  Homogenous Documents   True            
  Common Attributes      ('id', 'text')  
                                         
                     Attributes Summary                     
                                                            
  Attribute   Data type   #Unique values   Has empty value  
 ────────────────────────────────────────────────────────── 
  id          ('str',)    6403             False            
  text        ('str',)    6030             False            
```

#### Encode sentences

Now encode these 6,403 sentences, it may take 10 seconds or less depending on your GPU and network: 

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:51000')

r = c.encode(da, show_progress=True)
```

<details>
<summary>Download the pre-encoded dataset</summary>

Again, for people who are impatient or don't have a GPU, we have prepared a pre-encoded text dataset:

```python
from docarray import DocumentArray

da = DocumentArray.pull('ttl-textual', show_progress=True, local_cache=True)
```

</details>

#### Search via image

Let's load our previously stored image embedding, randomly sample 10 image Documents, then find top-1 nearest neighbour of each.

```python
from docarray import DocumentArray

img_da = DocumentArray.load_binary('ttl-image')

for d in img_da.sample(10):
    print(da.find(d.embedding, limit=1)[0].text)
```

#### Showcase

Fun time! Note, unlike the previous example, here the input is an image and the sentence is the output. All sentences come from the book "Pride and Prejudice". 

<table>
<tr>
<td>
<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/Besides,-there-was-truth-in-his-looks.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>


</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/Gardiner-smiled.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/what’s-his-name.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/By-tea-time,-however,-the-dose-had-been-enough,-and-Mr.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>

<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/You-do-not-look-well.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>
</tr>
<tr>
<td>Besides, there was truth in his looks</td>
<td>Gardiner smiled</td>
<td>what’s his name</td>
<td>By tea time, however, the dose had been enough, and Mr</td>
<td>You do not look well</td>
</tr>
</table>

<table>
<tr>
<td>
<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/“A-gamester!”-she-cried.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>


</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/If-you-mention-my-name-at-the-Bell,-you-will-be-attended-to.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/Never-mind-Miss-Lizzy’s-hair.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>
<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/Elizabeth-will-soon-be-the-wife-of-Mr.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>

<td>

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/I-saw-them-the-night-before-last.png?raw=true" alt="Visualization of the image sprite of Totally looks like dataset" height="100px">
</p>

</td>
</tr>
<tr>
<td>“A gamester!” she cried</td>
<td>If you mention my name at the Bell, you will be attended to</td>
<td>Never mind Miss Lizzy’s hair</td>
<td>Elizabeth will soon be the wife of Mr</td>
<td>I saw them the night before last</td>
</tr>
</table>


### Rank image-text matches via CLIP model

From `0.3.0` CLIP-as-service adds a new `/rank` endpoint that re-ranks cross-modal matches according to their joint likelihood in CLIP model. For example, given an image Document with some predefined sentence matches as below:

```python
from clip_client import Client
from docarray import Document

c = Client(server='grpc://0.0.0.0:51000')
r = c.rank(
    [
        Document(
            uri='.github/README-img/rerank.png',
            matches=[
                Document(text=f'a photo of a {p}')
                for p in (
                    'control room',
                    'lecture room',
                    'conference room',
                    'podium indoor',
                    'television studio',
                )
            ],
        )
    ]
)

print(r['@m', ['text', 'scores__clip_score__value']])
```

```text
[['a photo of a television studio', 'a photo of a conference room', 'a photo of a lecture room', 'a photo of a control room', 'a photo of a podium indoor'], 
[0.9920725226402283, 0.006038925610482693, 0.0009973491542041302, 0.00078492151806131, 0.00010626466246321797]]
```

One can see now `a photo of a television studio` is ranked to the top with `clip_score` score at `0.992`. In practice, one can use this endpoint to re-rank the matching result from another search system, for improving the cross-modal search quality.

<table>
<tr>
<td>
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/rerank.png?raw=true" alt="Rerank endpoint image input" height="150px">
</td>
<td>
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/rerank-chart.svg?raw=true" alt="Rerank endpoint output">
</td>
</tr>
</table>

### Rank text-image matches via CLIP model

In the [DALL·E Flow](https://github.com/jina-ai/dalle-flow) project, CLIP is called for ranking the generated results from DALL·E. [It has an Executor wrapped on top of `clip-client`](https://github.com/jina-ai/dalle-flow/blob/main/executors/rerank/executor.py), which calls `.arank()` - the async version of `.rank()`:

```python
from clip_client import Client
from jina import Executor, requests, DocumentArray


class ReRank(Executor):
    def __init__(self, clip_server: str, **kwargs):
        super().__init__(**kwargs)
        self._client = Client(server=clip_server)

    @requests(on='/')
    async def rerank(self, docs: DocumentArray, **kwargs):
        return await self._client.arank(docs)
```

<p align="center">
<img src="https://github.com/jina-ai/clip-as-service/blob/main/.github/README-img/client-dalle.png?raw=true" alt="CLIP-as-service used in DALLE Flow" width="300px">
</p>

Intrigued? That's only scratching the surface of what CLIP-as-service is capable of. [Read our docs to learn more](https://clip-as-service.jina.ai).

<!-- start support-pitch -->
## Support

- Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
- Watch our [Engineering All Hands](https://youtube.com/playlist?list=PL3UBBWOUVhFYRUa_gpYYKBqEAkO4sxmne) to learn Jina's new features and stay up-to-date with the latest AI techniques.
- Subscribe to the latest video tutorials on our [YouTube channel](https://youtube.com/c/jina-ai)

## Join Us

CLIP-as-service is backed by [Jina AI](https://jina.ai) and licensed under [Apache-2.0](./LICENSE). [We are actively hiring](https://jobs.jina.ai) AI engineers, solution engineers to build the next neural search ecosystem in open-source.

<!-- end support-pitch -->


================================================
FILE: client/clip_client/__init__.py
================================================
__version__ = '0.8.4'

import os

from clip_client.client import Client

if 'NO_VERSION_CHECK' not in os.environ:
    from clip_client.helper import is_latest_version

    is_latest_version(github_repo='clip-as-service')


================================================
FILE: client/clip_client/client.py
================================================
import mimetypes
import os
import time
import warnings
from typing import (
    overload,
    TYPE_CHECKING,
    Optional,
    Union,
    Iterator,
    Generator,
    Iterable,
    Dict,
)
from urllib.parse import urlparse
from functools import partial
from docarray import DocumentArray

if TYPE_CHECKING:
    import numpy as np
    from docarray import Document
    from jina.clients.base import CallbackFnType


class Client:
    def __init__(self, server: str, credential: dict = {}, **kwargs):
        """Create a Clip client object that connects to the Clip server.
        Server scheme is in the format of ``scheme://netloc:port``, where
            - scheme: one of grpc, websocket, http, grpcs, websockets, https
            - netloc: the server ip address or hostname
            - port: the public port of the server
        :param server: the server URI
        :param credential: the credential for authentication ``{'Authentication': '<token>'}``
        """
        try:
            r = urlparse(server)
            _port = r.port
            self._scheme = r.scheme
        except:
            raise ValueError(f'{server} is not a valid scheme')

        _tls = False
        if self._scheme in ('grpcs', 'https', 'wss'):
            self._scheme = self._scheme[:-1]
            _tls = True

        if self._scheme == 'ws':
            self._scheme = 'websocket'  # temp fix for the core
            if credential:
                warnings.warn(
                    'Credential is not supported for websocket, please use grpc or http'
                )

        if self._scheme in ('grpc', 'http', 'websocket'):
            _kwargs = dict(host=r.hostname, port=_port, protocol=self._scheme, tls=_tls)

            from jina import Client

            self._client = Client(**_kwargs)
            self._async_client = Client(**_kwargs, asyncio=True)
        else:
            raise ValueError(f'{server} is not a valid scheme')

        self._authorization = credential.get(
            'Authorization', os.environ.get('CLIP_AUTH_TOKEN')
        )

    def profile(self, content: Optional[str] = '') -> Dict[str, float]:
        """Profiling a single query's roundtrip including network and computation latency. Results is summarized in a table.
        :param content: the content to be sent for profiling. By default it sends an empty Document
            that helps you understand the network latency.
        :return: the latency report in a dict.
        """
        st = time.perf_counter()
        r = self._client.post(
            '/', self._iter_doc([content], DocumentArray()), return_responses=True
        )
        ed = (time.perf_counter() - st) * 1000
        route = r[0].routes
        gateway_time = (
            route[0].end_time.ToMilliseconds() - route[0].start_time.ToMilliseconds()
        )
        clip_time = (
            route[1].end_time.ToMilliseconds() - route[1].start_time.ToMilliseconds()
        )
        network_time = ed - gateway_time
        server_network = gateway_time - clip_time

        from rich.table import Table

        def make_table(_title, _time, _percent):
            table = Table(show_header=False, box=None)
            table.add_row(
                _title, f'[b]{_time:.0f}[/b]ms', f'[dim]{_percent * 100:.0f}%[/dim]'
            )
            return table

        from rich.tree import Tree

        t = Tree(make_table('Roundtrip', ed, 1))
        t.add(make_table('Client-server network', network_time, network_time / ed))
        t2 = t.add(make_table('Server', gateway_time, gateway_time / ed))
        t2.add(
            make_table(
                'Gateway-CLIP network', server_network, server_network / gateway_time
            )
        )
        t2.add(make_table('CLIP model', clip_time, clip_time / gateway_time))

        from rich import print

        print(t)

        return {
            'Roundtrip': ed,
            'Client-server network': network_time,
            'Server': gateway_time,
            'Gateway-CLIP network': server_network,
            'CLIP model': clip_time,
        }

    def _update_pbar(self, response, func: Optional['CallbackFnType'] = None):
        from rich import filesize

        r = response.data.docs
        if not self._pbar._tasks[self._r_task].started:
            self._pbar.start_task(self._r_task)
        self._pbar.update(
            self._r_task,
            advance=len(r),
            total_size=str(
                filesize.decimal(int(os.environ.get('JINA_GRPC_RECV_BYTES', '0')))
            ),
        )
        if func is not None:
            func(response)

    def _prepare_streaming(self, disable, total):
        if total is None:
            total = 500
            warnings.warn(
                'The length of the input is unknown, the progressbar would not be accurate.'
            )
        elif total > 500:
            warnings.warn(
                'Please ensure all the inputs are valid, otherwise the request will be aborted.'
            )

        from docarray.array.mixins.io.pbar import get_pbar

        self._pbar = get_pbar(disable)

        os.environ['JINA_GRPC_SEND_BYTES'] = '0'
        os.environ['JINA_GRPC_RECV_BYTES'] = '0'

        self._r_task = self._pbar.add_task(
            ':arrow_down: Progress', total=total, total_size=0, start=False
        )

    @staticmethod
    def _gather_result(
        response, results: 'DocumentArray', attribute: Optional[str] = None
    ):
        r = response.data.docs
        if attribute:
            results[r[:, 'id']][:, attribute] = r[:, attribute]

    def _iter_doc(
        self, content, results: Optional['DocumentArray'] = None
    ) -> Generator['Document', None, None]:
        from docarray import Document

        for c in content:
            if isinstance(c, str):
                _mime = mimetypes.guess_type(c)[0]
                if _mime and _mime.startswith('image'):
                    d = Document(
                        uri=c,
                    ).load_uri_to_blob()
                else:
                    d = Document(text=c)
            elif isinstance(c, Document):
                if c.content_type in ('text', 'blob'):
                    d = c
                elif not c.blob and c.uri:
                    c.load_uri_to_blob()
                    d = c
                elif c.tensor is not None:
                    d = c
                else:
                    raise TypeError(f'unsupported input type {c!r} {c.content_type}')
            else:
                raise TypeError(f'unsupported input type {c!r}')

            if results is not None:
                results.append(d)
            yield d

    def _get_post_payload(
        self, content, results: Optional['DocumentArray'] = None, **kwargs
    ):
        payload = dict(
            inputs=self._iter_doc(content, results),
            request_size=kwargs.get('batch_size', 8),
            total_docs=len(content) if hasattr(content, '__len__') else None,
        )

        if self._scheme == 'grpc' and self._authorization:
            payload.update(metadata=(('authorization', self._authorization),))
        elif self._scheme == 'http' and self._authorization:
            payload.update(headers={'Authorization': self._authorization})
        return payload

    @staticmethod
    def _unboxed_result(results: Optional['DocumentArray'] = None, unbox: bool = False):
        if results is not None:
            if results.embeddings is None:
                raise ValueError(
                    'Empty embedding returned from the server. '
                    'This often due to a mis-config of the server, '
                    'restarting the server or changing the serving port number often solves the problem'
                )
            return results.embeddings if unbox else results

    @overload
    def encode(
        self,
        content: Iterable[str],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'np.ndarray':
        """Encode images and texts into embeddings where the input is an iterable of raw strings.
        Each image and text must be represented as a string. The following strings are acceptable:
            - local image filepath, will be considered as an image
            - remote image http/https, will be considered as an image
            - a dataURI, will be considered as an image
            - plain text, will be considered as a sentence
        :param content: an iterator of image URIs or sentences, each element is an image or a text sentence as a string.
        :param batch_size: the number of elements in each request when sending ``content``
        :param show_progress: if set, show a progress bar
        :param parameters: the parameters for the encoding, you can specify the model to use when you have multiple models
        :param on_done: the callback function executed while streaming, after successful completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_error: the callback function executed while streaming, after failed completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_always: the callback function executed while streaming, after completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param prefetch: the number of in-flight batches made by the post() method. Use a lower value for expensive
            operations, and a higher value for faster response times
        :return: the embedding in a numpy ndarray with shape ``[N, D]``. ``N`` is in the same length of ``content``
        """
        ...

    @overload
    def encode(
        self,
        content: Union['DocumentArray', Iterable['Document']],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'DocumentArray':
        """Encode images and texts into embeddings where the input is an iterable of :class:`docarray.Document`.
        :param content: an iterable of :class:`docarray.Document`, each Document must be filled with `.uri`, `.text` or `.blob`.
        :param batch_size: the number of elements in each request when sending ``content``
        :param show_progress: if set, show a progress bar
        :param parameters: the parameters for the encoding, you can specify the model to use when you have multiple models
        :param on_done: the callback function executed while streaming, after successful completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_error: the callback function executed while streaming, after failed completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_always: the callback function executed while streaming, after completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param prefetch: the number of in-flight batches made by the post() method. Use a lower value for expensive
            operations, and a higher value for faster response times
        :return: the embedding in a numpy ndarray with shape ``[N, D]``. ``N`` is in the same length of ``content``
        """
        ...

    def encode(self, content, **kwargs):
        if isinstance(content, str):
            raise TypeError(
                f'Content must be an Iterable of [str, Document], try `.encode(["{content}"])` instead'
            )
        if hasattr(content, '__len__') and len(content) == 0:
            return DocumentArray() if isinstance(content, DocumentArray) else []

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(content) if hasattr(content, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(
                self._gather_result, results=results, attribute='embedding'
            )

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )
            model_name = parameters.pop('model_name', '') if parameters else ''

            self._client.post(
                on=f'/encode/{model_name}'.rstrip('/'),
                **self._get_post_payload(content, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            )

        unbox = hasattr(content, '__len__') and isinstance(content[0], str)
        return self._unboxed_result(results, unbox)

    @overload
    async def aencode(
        self,
        content: Iterator[str],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'np.ndarray':
        ...

    @overload
    async def aencode(
        self,
        content: Union['DocumentArray', Iterable['Document']],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'DocumentArray':
        ...

    async def aencode(self, content, **kwargs):
        if isinstance(content, str):
            raise TypeError(
                f'Content must be an Iterable of [str, Document], try `.aencode(["{content}"])` instead'
            )
        if hasattr(content, '__len__') and len(content) == 0:
            return DocumentArray() if isinstance(content, DocumentArray) else []

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(content) if hasattr(content, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(
                self._gather_result, results=results, attribute='embedding'
            )

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )
            model_name = parameters.get('model_name', '') if parameters else ''

            async for _ in self._async_client.post(
                on=f'/encode/{model_name}'.rstrip('/'),
                **self._get_post_payload(content, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            ):
                continue

        unbox = hasattr(content, '__len__') and isinstance(content[0], str)
        return self._unboxed_result(results, unbox)

    def _iter_rank_docs(
        self, content, results: Optional['DocumentArray'] = None, source='matches'
    ) -> Generator['Document', None, None]:
        from docarray import Document

        for c in content:
            if isinstance(c, Document):
                d = self._prepare_rank_doc(c, source)
            else:
                raise TypeError(f'Unsupported input type {c!r}')
            if results is not None:
                results.append(d)
            yield d

    def _get_rank_payload(
        self, content, results: Optional['DocumentArray'] = None, **kwargs
    ):
        payload = dict(
            inputs=self._iter_rank_docs(
                content, results, source=kwargs.get('source', 'matches')
            ),
            request_size=kwargs.get('batch_size', 8),
            total_docs=len(content) if hasattr(content, '__len__') else None,
        )
        if self._scheme == 'grpc' and self._authorization:
            payload.update(metadata=(('authorization', self._authorization),))
        elif self._scheme == 'http' and self._authorization:
            payload.update(headers={'Authorization': self._authorization})
        return payload

    @staticmethod
    def _prepare_single_doc(d: 'Document'):
        if d.content_type in ('text', 'blob'):
            return d
        elif not d.blob and d.uri:
            d.load_uri_to_blob()
            return d
        elif d.tensor is not None:
            return d
        else:
            raise TypeError(f'Unsupported input type {d!r} {d.content_type}')

    @staticmethod
    def _prepare_rank_doc(d: 'Document', _source: str = 'matches'):
        _get = lambda d: getattr(d, _source)
        if not _get(d):
            raise ValueError(f'`.rank()` requires every doc to have `.{_source}`')
        d = Client._prepare_single_doc(d)
        setattr(d, _source, [Client._prepare_single_doc(c) for c in _get(d)])
        return d

    def rank(
        self, docs: Union['DocumentArray', Iterable['Document']], **kwargs
    ) -> 'DocumentArray':
        """Rank image-text matches according to the server CLIP model.
        Given a Document with nested matches, where the root is image/text and the matches is in another modality, i.e.
        text/image; this method ranks the matches according to the CLIP model.
        Each match now has a new score inside ``clip_score`` and matches are sorted descendingly according to this score.
        More details can be found in: https://github.com/openai/CLIP#usage

        :param docs: the input Documents
        :return: the ranked Documents in a DocumentArray.
        """
        if isinstance(docs, str):
            raise TypeError(f'Content must be an Iterable of [Document]')

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(docs) if hasattr(docs, '__len__') else None,
        )

        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(self._gather_result, results=results, attribute='matches')

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )
            model_name = parameters.get('model_name', '') if parameters else ''

            self._client.post(
                on=f'/rank/{model_name}'.rstrip('/'),
                **self._get_rank_payload(docs, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            )

        return results

    async def arank(
        self, docs: Union['DocumentArray', Iterable['Document']], **kwargs
    ) -> 'DocumentArray':
        if isinstance(docs, str):
            raise TypeError(f'Content must be an Iterable of [Document]')

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(docs) if hasattr(docs, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(self._gather_result, results=results, attribute='matches')

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )
            model_name = parameters.get('model_name', '') if parameters else ''

            async for _ in self._async_client.post(
                on=f'/rank/{model_name}'.rstrip('/'),
                **self._get_rank_payload(docs, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            ):
                continue

        return results

    @overload
    def index(
        self,
        content: Iterable[str],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[Dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ):
        """Index the images or texts where their embeddings are computed by the server CLIP model.

        Each image and text must be represented as a string. The following strings are acceptable:
            - local image filepath, will be considered as an image
            - remote image http/https, will be considered as an image
            - a dataURI, will be considered as an image
            - plain text, will be considered as a sentence
        :param content: an iterator of image URIs or sentences, each element is an image or a text sentence as a string.
        :param batch_size: the number of elements in each request when sending ``content``
        :param show_progress: if set, show a progress bar
        :param parameters: the parameters for the indexing, you can specify the model to use when you have multiple models
        :param on_done: the callback function executed while streaming, after successful completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_error: the callback function executed while streaming, after an error occurs in each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_always: the callback function executed while streaming, after each request is completed.
            It takes the response ``DataRequest`` as the only argument
        :param prefetch: the number of in-flight batches made by the post() method. Use a lower value for expensive
            operations, and a higher value for faster response times
        :return: the embedding in a numpy ndarray with shape ``[N, D]``. ``N`` is in the same length of ``content``
        """
        ...

    @overload
    def index(
        self,
        content: Union['DocumentArray', Iterable['Document']],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'DocumentArray':
        """Index the images or texts where their embeddings are computed by the server CLIP model.

        :param content: an iterable of :class:`docarray.Document`, each Document must be filled with `.uri`, `.text` or `.blob`.
        :param batch_size: the number of elements in each request when sending ``content``
        :param show_progress: if set, show a progress bar
        :param parameters: the parameters for the indexing, you can specify the model to use when you have multiple models
        :param on_done: the callback function executed while streaming, after successful completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_error: the callback function executed while streaming, after an error occurs in each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_always: the callback function executed while streaming, after each request is completed.
            It takes the response ``DataRequest`` as the only argument
        :param prefetch: the number of in-flight batches made by the post() method. Use a lower value for expensive
            operations, and a higher value for faster response times
        :return: the embedding in a numpy ndarray with shape ``[N, D]``. ``N`` is in the same length of ``content``
        """
        ...

    def index(self, content, **kwargs):
        if isinstance(content, str):
            raise TypeError(
                f'content must be an Iterable of [str, Document], try `.index(["{content}"])` instead'
            )

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(content) if hasattr(content, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(
                self._gather_result, results=results, attribute='embedding'
            )

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )

            self._client.post(
                on='/index',
                **self._get_post_payload(content, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            )

        return results

    @overload
    async def aindex(
        self,
        content: Iterator[str],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[Dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ):
        ...

    @overload
    async def aindex(
        self,
        content: Union['DocumentArray', Iterable['Document']],
        *,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ):
        ...

    async def aindex(self, content, **kwargs):
        if isinstance(content, str):
            raise TypeError(
                f'content must be an Iterable of [str, Document], try `.aindex(["{content}"])` instead'
            )

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(content) if hasattr(content, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(
                self._gather_result, results=results, attribute='embedding'
            )

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )

            async for _ in self._async_client.post(
                on='/index',
                **self._get_post_payload(content, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            ):
                continue

        return results

    @overload
    def search(
        self,
        content: Iterable[str],
        *,
        limit: int = 10,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[Dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'DocumentArray':
        """Search for top k results for given query string or ``Document``.

        If the input is a string, will use this string as query. If the input is a ``Document``,
        will use this ``Document`` as query.

        :param content: list of queries.
        :param limit: the number of results to return.
        :param batch_size: the number of elements in each request when sending ``content``.
        :param show_progress: if set, show a progress bar.
        :param parameters: parameters passed to search function.
        :param on_done: the callback function executed while streaming, after successful completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_error: the callback function executed while streaming, after an error occurs in each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_always: the callback function executed while streaming, after each request is completed.
            It takes the response ``DataRequest`` as the only argument
        :param prefetch: the number of in-flight batches made by the post() method. Use a lower value for expensive
            operations, and a higher value for faster response times
        """
        ...

    @overload
    def search(
        self,
        content: Union['DocumentArray', Iterable['Document']],
        *,
        limit: int = 10,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ) -> 'DocumentArray':
        """Search for top k results for given query string or ``Document``.

        If the input is a string, will use this string as query. If the input is a ``Document``,
        will use this ``Document`` as query.

        :param content: list of queries.
        :param limit: the number of results to return.
        :param batch_size: the number of elements in each request when sending ``content``.
        :param show_progress: if set, show a progress bar.
        :param parameters: parameters passed to search function.
        :param on_done: the callback function executed while streaming, after successful completion of each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_error: the callback function executed while streaming, after an error occurs in each request.
            It takes the response ``DataRequest`` as the only argument
        :param on_always: the callback function executed while streaming, after each request is completed.
            It takes the response ``DataRequest`` as the only argument
        :param prefetch: the number of in-flight batches made by the post() method. Use a lower value for expensive
            operations, and a higher value for faster response times
        """
        ...

    def search(self, content, limit: int = 10, **kwargs) -> 'DocumentArray':
        if isinstance(content, str):
            raise TypeError(
                f'content must be an Iterable of [str, Document], try `.search(["{content}"])` instead'
            )

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(content) if hasattr(content, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(self._gather_result, results=results, attribute='matches')

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['limit'] = limit
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )

            self._client.post(
                on='/search',
                **self._get_post_payload(content, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            )

        return results

    @overload
    async def asearch(
        self,
        content: Iterator[str],
        *,
        limit: int = 10,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[Dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ):
        ...

    @overload
    async def asearch(
        self,
        content: Union['DocumentArray', Iterable['Document']],
        *,
        limit: int = 10,
        batch_size: Optional[int] = None,
        show_progress: bool = False,
        parameters: Optional[dict] = None,
        on_done: Optional['CallbackFnType'] = None,
        on_error: Optional['CallbackFnType'] = None,
        on_always: Optional['CallbackFnType'] = None,
        prefetch: int = 100,
    ):
        ...

    async def asearch(self, content, limit: int = 10, **kwargs):
        if isinstance(content, str):
            raise TypeError(
                f'content must be an Iterable of [str, Document], try `.asearch(["{content}"])` instead'
            )

        self._prepare_streaming(
            not kwargs.get('show_progress'),
            total=len(content) if hasattr(content, '__len__') else None,
        )
        on_done = kwargs.pop('on_done', None)
        on_error = kwargs.pop('on_error', None)
        on_always = kwargs.pop('on_always', None)
        prefetch = kwargs.pop('prefetch', 100)
        results = DocumentArray() if not on_done and not on_always else None
        if not on_done:
            on_done = partial(self._gather_result, results=results, attribute='matches')

        with self._pbar:
            parameters = kwargs.pop('parameters', {})
            parameters['limit'] = limit
            parameters['drop_image_content'] = parameters.get(
                'drop_image_content', True
            )

            async for _ in self._async_client.post(
                on='/search',
                **self._get_post_payload(content, results, **kwargs),
                on_done=on_done,
                on_error=on_error,
                on_always=partial(self._update_pbar, func=on_always),
                parameters=parameters,
                prefetch=prefetch,
            ):
                continue

        return results


================================================
FILE: client/clip_client/helper.py
================================================
import json
import sys
import threading
from packaging.version import Version
from urllib.request import Request, urlopen

import pkg_resources
from rich import print
from rich.panel import Panel


def _version_check(package: str = None, github_repo: str = None):
    try:

        if not package:
            package = vars(sys.modules[__name__])['__package__']
        if not github_repo:
            github_repo = package

        cur_ver = Version(pkg_resources.get_distribution(package).version)
        req = Request(
            f'https://pypi.python.org/pypi/{package}/json',
            headers={'User-Agent': 'Mozilla/5.0'},
        )
        with urlopen(
            req, timeout=1
        ) as resp:  # 'with' is important to close the resource after use
            j = json.load(resp)
            releases = j.get('releases', {})
            latest_release_ver = max(
                Version(v) for v in releases.keys() if '.dev' not in v
            )
            if cur_ver < latest_release_ver:
                print(
                    Panel(
                        f'You are using [b]{package} {cur_ver}[/b], but [bold green]{latest_release_ver}[/] is available. '
                        f'You may upgrade it via [b]pip install -U {package}[/b]. [link=https://github.com/jina-ai/{github_repo}/releases]Read Changelog here[/link].',
                        title=':new: New version available!',
                        width=50,
                    )
                )
    except Exception:
        # no network, too slow, PyPi is down
        pass


def is_latest_version(package: str = None, github_repo: str = None) -> None:
    """Check if there is a latest version from Pypi, set env `NO_VERSION_CHECK` to disable it.

    :param package: package name if none auto-detected
    :param github_repo: repo name that contains CHANGELOG if none then the same as package name
    """

    threading.Thread(target=_version_check, args=(package, github_repo)).start()


================================================
FILE: client/setup.py
================================================
import sys
from os import path

from setuptools import find_packages
from setuptools import setup

if sys.version_info < (3, 7, 0):
    raise OSError(f'CLIP-as-service requires Python >=3.7, but yours is {sys.version}')

try:
    pkg_name = 'clip-client'
    libinfo_py = path.join(
        path.dirname(__file__), pkg_name.replace('-', '_'), '__init__.py'
    )
    libinfo_content = open(libinfo_py, 'r', encoding='utf8').readlines()
    version_line = [l.strip() for l in libinfo_content if l.startswith('__version__')][
        0
    ]
    exec(version_line)  # gives __version__
except FileNotFoundError as ex:
    __version__ = '0.0.0'

try:
    with open('../README.md', encoding='utf8') as fp:
        _long_description = fp.read()
except FileNotFoundError:
    _long_description = ''

setup(
    name=pkg_name,
    packages=find_packages(),
    version=__version__,
    include_package_data=True,
    description='Embed images and sentences into fixed-length vectors via CLIP',
    author='Jina AI',
    author_email='hello@jina.ai',
    license='Apache 2.0',
    url='https://github.com/jina-ai/clip-as-service',
    download_url='https://github.com/jina-ai/clip-as-service/tags',
    long_description=_long_description,
    long_description_content_type='text/markdown',
    zip_safe=False,
    setup_requires=['setuptools>=18.0', 'wheel'],
    install_requires=[
        'jina>=3.12.0',
        'docarray[common]>=0.19.0,<0.30.0',
        'packaging',
    ],
    extras_require={
        'test': [
            'pytest',
            'pytest-timeout',
            'pytest-mock',
            'pytest-asyncio',
            'pytest-cov',
            'pytest-repeat',
            'pytest-reraise',
            'mock',
            'pytest-custom_exit_code',
            'black',
        ],
    },
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'Intended Audience :: Education',
        'Intended Audience :: Science/Research',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: 3.10',
        'Programming Language :: Unix Shell',
        'Environment :: Console',
        'License :: OSI Approved :: Apache Software License',
        'Operating System :: OS Independent',
        'Topic :: Database :: Database Engines/Servers',
        'Topic :: Scientific/Engineering :: Artificial Intelligence',
        'Topic :: Internet :: WWW/HTTP :: Indexing/Search',
        'Topic :: Scientific/Engineering :: Image Recognition',
        'Topic :: Multimedia :: Video',
        'Topic :: Scientific/Engineering',
        'Topic :: Scientific/Engineering :: Mathematics',
        'Topic :: Software Development',
        'Topic :: Software Development :: Libraries',
        'Topic :: Software Development :: Libraries :: Python Modules',
    ],
    project_urls={
        'Documentation': 'https://clip-as-service.jina.ai',
        'Source': 'https://github.com/jina-ai/clip-as-service/',
        'Tracker': 'https://github.com/jina-ai/clip-as-service/issues',
    },
    keywords='jina openai clip deep-learning cross-modal multi-modal neural-search',
)


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
# Used only for local building

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = sphinx-build
SOURCEDIR     = .
BUILDDIR      = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

================================================
FILE: docs/_static/cas-grafana.json
================================================
{
  "__inputs": [
    {
      "name": "DS_PROMETHEUS",
      "label": "Prometheus",
      "description": "",
      "type": "datasource",
      "pluginId": "prometheus",
      "pluginName": "Prometheus"
    }
  ],
  "__elements": [],
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "8.5.3"
    },
    {
      "type": "panel",
      "id": "piechart",
      "name": "Pie chart",
      "version": ""
    },
    {
      "type": "datasource",
      "id": "prometheus",
      "name": "Prometheus",
      "version": "1.0.0"
    },
    {
      "type": "panel",
      "id": "stat",
      "name": "Stat",
      "version": ""
    },
    {
      "type": "panel",
      "id": "timeseries",
      "name": "Time series",
      "version": ""
    }
  ],
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "datasource",
          "uid": "grafana"
        },
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "description": "The datashboard for CLIP-as-service",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "id": null,
  "iteration": 1654148217937,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            }
          },
          "mappings": [],
          "unit": "s"
        },
        "overrides": [
          {
            "__systemRef": "hideSeriesFrom",
            "matcher": {
              "id": "byNames",
              "options": {
                "mode": "exclude",
                "names": [
                  "gateway overhead",
                  "gateway/worker network",
                  "processing-",
                  "preproc text"
                ],
                "prefix": "All except:",
                "readOnly": true
              }
            },
            "properties": [
              {
                "id": "custom.hideFrom",
                "value": {
                  "legend": false,
                  "tooltip": false,
                  "viz": true
                }
              }
            ]
          }
        ]
      },
      "gridPos": {
        "h": 16,
        "w": 13,
        "x": 0,
        "y": 0
      },
      "id": 41,
      "options": {
        "displayLabels": [
          "name"
        ],
        "legend": {
          "displayMode": "table",
          "placement": "right",
          "values": [
            "value",
            "percent"
          ]
        },
        "pieType": "pie",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "tooltip": {
          "mode": "single",
          "sort": "none"
        }
      },
      "pluginVersion": "8.4.4",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_receiving_request_seconds_sum / jina_receiving_request_seconds_count",
          "hide": false,
          "interval": "",
          "legendFormat": "receiving-{{job}}",
          "refId": "A"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_sending_request_seconds_sum / jina_sending_request_seconds_count",
          "hide": false,
          "interval": "",
          "legendFormat": "sending-{{job}}",
          "refId": "D"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_preprocess_texts_seconds_sum / jina_preprocess_texts_seconds_count",
          "hide": false,
          "interval": "",
          "legendFormat": "preproc text",
          "refId": "B"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "exemplar": true,
          "expr": "jina_encode_texts_seconds_sum / jina_encode_texts_seconds_count",
          "hide": false,
          "interval": "",
          "legendFormat": "encode text",
          "range": true,
          "refId": "C"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "exemplar": true,
          "expr": "jina_process_request_seconds_sum / jina_process_request_seconds_count",
          "hide": false,
          "interval": "",
          "legendFormat": "processing-encode",
          "range": true,
          "refId": "E"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "expr": "jina_preprocess_images_seconds_sum / jina_preprocess_images_seconds_count",
          "hide": false,
          "legendFormat": "preproc image",
          "range": true,
          "refId": "F"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "expr": "jina_encode_images_seconds_sum / jina_encode_images_seconds_count",
          "hide": false,
          "legendFormat": "encode image",
          "range": true,
          "refId": "G"
        }
      ],
      "title": "life cycle of a request",
      "transformations": [
        {
          "id": "calculateField",
          "options": {
            "alias": "gateway overhead",
            "binary": {
              "left": "receiving-gateway",
              "operator": "-",
              "reducer": "sum",
              "right": "sending-gateway"
            },
            "mode": "binary",
            "reduce": {
              "reducer": "sum"
            }
          }
        },
        {
          "id": "calculateField",
          "options": {
            "alias": "worker-overhead",
            "binary": {
              "left": "receiving-exec",
              "operator": "-",
              "reducer": "sum",
              "right": "processing-encode"
            },
            "mode": "binary",
            "reduce": {
              "reducer": "sum"
            }
          }
        },
        {
          "id": "calculateField",
          "options": {
            "alias": "text-model-inference",
            "binary": {
              "left": "processing-encode",
              "operator": "-",
              "reducer": "sum",
              "right": "preproc text"
            },
            "mode": "binary",
            "reduce": {
              "reducer": "sum"
            }
          }
        },
        {
          "id": "calculateField",
          "options": {
            "alias": "gateway/worker network",
            "binary": {
              "left": "sending-gateway",
              "operator": "-",
              "reducer": "sum",
              "right": "receiving-exec"
            },
            "mode": "binary",
            "reduce": {
              "reducer": "sum"
            }
          }
        },
        {
          "id": "calculateField",
          "options": {
            "alias": "visual-model-inference",
            "binary": {
              "left": "processing-encode",
              "reducer": "sum",
              "right": "preproc image"
            },
            "mode": "binary",
            "reduce": {
              "reducer": "sum"
            }
          }
        }
      ],
      "type": "piechart"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 6,
        "x": 15,
        "y": 0
      },
      "id": 32,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.5.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_receiving_request_seconds_count{runtime_name=~\"gateway.*\"}",
          "instant": false,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Number of Request processed ",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 15,
        "x": 0,
        "y": 16
      },
      "id": 39,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single",
          "sort": "none"
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_receiving_request_seconds_sum / jina_receiving_request_seconds_count",
          "interval": "",
          "legendFormat": "{{runtime_name}}",
          "refId": "A"
        }
      ],
      "title": "jina_receiving_request_seconds_sum",
      "type": "timeseries"
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "PBFA97CFB590B2093"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 24
      },
      "id": 4,
      "panels": [],
      "repeat": "Executor",
      "title": "$Executor",
      "type": "row"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 5,
        "w": 8,
        "x": 0,
        "y": 25
      },
      "id": 2,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.5.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_document_processed_total{runtime_name=\"$Executor\"}",
          "instant": false,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{executor_endpoint}}",
          "refId": "A"
        }
      ],
      "title": "Number of Documents processed per endpoint",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 5,
        "w": 8,
        "x": 8,
        "y": 25
      },
      "id": 7,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.5.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_process_request_seconds_count{runtime_name=\"$Executor\"}",
          "instant": false,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{executor_endpoint}}",
          "refId": "A"
        }
      ],
      "title": "Number of requests per endpoint",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 6,
        "w": 18,
        "x": 0,
        "y": 30
      },
      "id": 12,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single",
          "sort": "none"
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_process_request_seconds_sum{runtime_name=\"$Executor\"} / jina_process_request_seconds_count{runtime_name=\"$Executor\"}",
          "interval": "",
          "legendFormat": "{{executor_endpoint}}-process",
          "refId": "A"
        }
      ],
      "title": "Time spend calling the Executor method link the to endpoint",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green"
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 6,
        "w": 18,
        "x": 0,
        "y": 36
      },
      "id": 17,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single",
          "sort": "none"
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "exemplar": true,
          "expr": "jina_receiving_request_seconds_sum{runtime_name=\"$Executor\"} / jina_receiving_request_seconds_count{runtime_name=\"$Executor\"}",
          "interval": "",
          "legendFormat": "{{executor_endpoint}}",
          "refId": "A"
        }
      ],
      "title": "Time spend calling between receiving and responding ",
      "type": "timeseries"
    }
  ],
  "refresh": "",
  "schemaVersion": 36,
  "style": "dark",
  "tags": [
    "clip",
    "jina"
  ],
  "templating": {
    "list": [
      {
        "current": {},
        "datasource": {
          "type": "prometheus",
          "uid": "${DS_PROMETHEUS}"
        },
        "definition": "label_values(jina_document_processed_created,executor_endpoint)\n",
        "description": "",
        "hide": 0,
        "includeAll": true,
        "multi": true,
        "name": "Endpoint",
        "options": [],
        "query": {
          "query": "label_values(jina_document_processed_created,executor_endpoint)\n",
          "refId": "StandardVariableQuery"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "type": "query"
      },
      {
        "current": {},
        "datasource": {
          "type": "prometheus",
          "uid": "${DS_PROMETHEUS}"
        },
        "definition": "label_values(jina_document_processed_created,runtime_name)\n",
        "description": "",
        "hide": 0,
        "includeAll": true,
        "multi": true,
        "name": "Executor",
        "options": [],
        "query": {
          "query": "label_values(jina_document_processed_created,runtime_name)\n",
          "refId": "StandardVariableQuery"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "type": "query"
      },
      {
        "current": {
          "selected": false,
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 0,
        "includeAll": false,
        "multi": false,
        "name": "datasource",
        "options": [],
        "query": "prometheus",
        "queryValue": "",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      }
    ]
  },
  "time": {
    "from": "now-5m",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "clip-as-service",
  "uid": "e_4RtOlnz",
  "version": 3,
  "weekStart": ""
}

================================================
FILE: docs/_static/demo-embed.html
================================================
<script src="https://cdn.jsdelivr.net/npm/vue@2/dist/vue.js"></script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.3/jquery.min.js'></script>

<div id="demo-embed">


    <table class="embeddingInput">

        <tr>
            <td>Input a sentence (or an image URL)</td>
            <td><textarea v-model="query"
                          placeholder="Input a sentence or an image URL"
                          style="width: 80%"
                          rows="6"
                          maxlength="1000">
            </textarea></td>
        </tr>

        <tr>
            <td>Click an image to select</td>
            <td>
                <div class="gallery">
                    <img @click="query='https://picsum.photos/id/'+image.id+'/50'" class="gallery-image"
                         :src="'https://picsum.photos/id/'+image.id+'/80'" v-for="image in images">
                </div>
            </td>
        </tr>

        <tr>
            <td>Upload image from local</td>
            <td><input type="file" @change="encodeImageFileAsURL" accept=".jpg, .jpeg, .png"/></td>
        </tr>

    </table>

    <div v-if="query && embedding" class="embeddingChart">

        <div>
            <p>Done in {{elapsed}}ms
            <img class="thumbnail" v-if="isUrl" :src="query" :alt="query + 'is not a valid image'">
            <code v-else>{{query}}</code>
            </p>
        </span>
        </div>

        <div class="embeddingBlock" v-for="value in embedding" v-bind:style="{opacity: normalize_value(value)}"
             v-bind:title="value">
            <span v-if="showValue">{{value.toString().charAt(3)}}</span>
        </div>

        <div>

            <input type="checkbox" id="checkbox" v-model="showValue"/>
            <label for="checkbox">Show embedding values (that visually make no sense but people like it as it was doing
                some real science stuff)</label>
        </div>

    </div>


</div>

<style>
    #demo-embed {
        font-family: var(--font-stack) !important;
    }

    .gallery-image:hover {
        opacity: 100%;
    }

    .gallery-image {
        opacity: 50%;
        transition: opacity 0.3s;
        -webkit-transition: opacity 0.3s;
        cursor: pointer;
    }

    .thumbnail {
        max-width: 64px;
        max-height: 64px;
    }

    .embeddingChart {
        margin-top: 30px;
        margin-bottom: 30px;
    }

    .embeddingBlock {
        width: 8px;
        height: 8px;
        display: inline-flex;
        background: green;
        border-style: solid;
        border-color: white;
        border-width: 1px;
        font-size: 1vmin;
        color: white;
        text-align: center;
        vertical-align: middle;
        justify-content: center;
        align-items: center;
        transition: opacity 0.3s;
        -webkit-transition: opacity 0.3s;
        cursor: pointer;
    }

    .embeddingBlock:hover {
        border-color: green;
    }


</style>

<script>
    function randomIntFromInterval(min, max) { // min and max included
        return Math.floor(Math.random() * (max - min + 1) + min)
    }

    var app = new Vue({
        el: '#demo-embed',
        data: {
            serverAddress: `https://api.clip.jina.ai:8443`,
            query: 'First do it, then do it right, then do it better',
            embedding: [1, 1, 1],
            max_embed_value: 0,
            min_embed_value: 0,
            elapsed: 0,
            showValue: true,
            images: []
        },
        computed: {
            isUrl: function () {
                let url;

                try {
                    url = new URL(this.query);
                } catch (_) {
                    return false;
                }

                return url.protocol === "http:" || url.protocol === "https:" || url.protocol === "data:";
            },
            // get only
            payload: function () {
                return {
                    data: [this.isUrl ? {
                        uri: this.query
                    } : {
                        text: this.query
                    }],
                    exec_endpoint: '/',
                }
            }
        },
        mounted: function () {
            this.$nextTick(function () {
                app.callJina();

                $.getJSON("https://picsum.photos/v2/list?page=" + randomIntFromInterval(1, 40) + "&limit=10", function (json) {
                    app.images = json
                });

            })
        },
        watch: {
            query: function (newQ, oldQ) {
                this.callJina()
            }
        },
        methods: {
            encodeImageFileAsURL(element) {
                var file = element.target.files[0];
                var reader = new FileReader();
                reader.onloadend = function () {
                    app.query = reader.result
                }
                reader.readAsDataURL(file);
            },
            normalize_value(val) {
                r = (val - this.min_embed_value) / (this.max_embed_value - this.min_embed_value)
                r = (r * 10).toFixed(0) / 10
                return r
            },
            callJina: function () {

                $.ajax({
                    headers: {
                        Authorization: "d28b93ccbd13367148d05fe3f7fbc680"
                    },
                    type: "POST",
                    url: this.serverAddress + "/post",
                    data: JSON.stringify(this.payload),
                    contentType: "application/json; charset=utf-8",
                    dataType: "json",
                }).success(function (data, textStatus, jqXHR) {
                    // data.data[0].embedding
                    app.embedding = data.data[0].embedding
                    app.max_embed_value = Math.max.apply(null, app.embedding)
                    app.min_embed_value = Math.min.apply(null, app.embedding)

                    date1 = new Date(data.routes[0].startTime)
                    date2 = new Date(data.routes[0].endTime)
                    app.elapsed = date2 - date1

                }).fail(function () {
                    console.error("bad connection!")
                });
            }
        }
    })

</script>

================================================
FILE: docs/_static/demo-text-rank.html
================================================
<script src="https://cdn.jsdelivr.net/npm/vue@2/dist/vue.js"></script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.3/jquery.min.js'></script>

<div id="demo-embed">
    <table class="embeddingInput">
        <tr>
            <td>Input an image URL</td>
            <td><input v-model="query"
                       placeholder="Input an image URL"
                       style="width: 80%"
                       maxlength="1000"></td>
        </tr>

        <tr>
            <td>Click an image to select</td>
            <td>
                <div class="gallery">
                    <img @click="query='https://picsum.photos/id/'+image.id+'/50'" class="gallery-image"
                         :src="'https://picsum.photos/id/'+image.id+'/80'" v-for="image in images">
                </div>
            </td>
        </tr>

        <tr>
            <td>Upload image from local</td>
            <td><input type="file" @change="encodeImageFileAsURL" accept=".jpg, .jpeg, .png"/></td>
        </tr>

    </table>

    <p>
        <button v-on:click="addPrompt">+ Add prompt</button>
        <button v-on:click="rmPrompt">- Remove prompt</button>
    </p>
    <li v-for="item in prompts" :key="item.id">
        <input v-model="item.text" placeholder="edit me" size="50">
    </li>

    <p>
        <div v-if="query && embedding" class="embeddingChart">

            <div>
                <p>Done in {{elapsed}}ms
                <span>
                    <img class="thumbnail" v-if="isUrl" :src="query" :alt="query + 'is not a valid image'">
                    <code v-else>{{query}}</code>
                </span>
                </p>
            </div>

        </div>
    </p>
    <p>
    Showing reasoning results (score in softmax):
    <table>
        <tr v-for="item in matches">
            <td style="width: 300px"
                :style="{background: 'linear-gradient(90deg, #00bfa5 ' +(item.scores['clip_score'].value * 100).toFixed(2) + '%, #b5f6e9 '+ (item.scores['clip_score'].value * 100).toFixed(2) + '%)'}">
                {{item.text}}
            </td>
            <td>{{(item.scores['clip_score'].value * 100).toFixed(2)}}</td>
        </tr>
    </table>
    </p>
</div>

<style>
    #demo-embed {
        font-family: var(--font-stack) !important;
    }

    .scorebar {
        background: #00bfa5;
    }

    .gallery-image:hover {
        opacity: 100%;
    }

    .gallery-image {
        opacity: 50%;
        transition: opacity 0.3s;
        -webkit-transition: opacity 0.3s;
        cursor: pointer;
    }

    .preview-image {
        width: 100px
    }

    .thumbnail {
        max-width: 64px;
        max-height: 64px;
    }

    .embeddingChart {
        margin-top: 30px;
        margin-bottom: 30px;
    }

    .embeddingBlock {
        width: 8px;
        height: 8px;
        display: inline-flex;
        background: green;
        border-style: solid;
        border-color: white;
        border-width: 1px;
        font-size: 1vmin;
        color: white;
        text-align: center;
        vertical-align: middle;
        justify-content: center;
        align-items: center;
        transition: opacity 0.3s;
        -webkit-transition: opacity 0.3s;
        cursor: pointer;
    }

    .embeddingBlock:hover {
        border-color: green;
    }


</style>

<script>
    function randomIntFromInterval(min, max) { // min and max included
        return Math.floor(Math.random() * (max - min + 1) + min)
    }

    var app = new Vue({
        el: '#demo-embed',
        data: {
            serverAddress: `https://api.clip.jina.ai:8443`,
            query: 'https://picsum.photos/300',
            embedding: [1, 1, 1],
            max_embed_value: 0,
            min_embed_value: 0,
            elapsed: 0,
            showValue: true,
            images: [],
            matches: [],
            prompts: [
                {"text": "This is a photo of natural scene"},
                {"text": "This is a photo of man-made object"},
                {"text": "This is a photo of an animal"},
                {"text": "This is a photo with human faces"},
                {"text": "This is a blurry photo"},
                {"text": "This is a black and white photo"},
                {"text": "This is a screenshot"},
            ]
        },
        computed: {
            isUrl: function () {
                let url;

                try {
                    url = new URL(this.query);
                } catch (_) {
                    return false;
                }

                return url.protocol === "http:" || url.protocol === "https:" || url.protocol === "data:";
            },
            // get only
            payload: function () {
                return {
                    data: [this.isUrl ? {
                        uri: this.query,
                        matches: this.prompts
                    } : {
                        text: this.query,
                        matches: this.prompts
                    }],
                    exec_endpoint: '/rank',
                }
            }
        },
        mounted: function () {
            this.$nextTick(function () {
                app.callJina();

                $.getJSON("https://picsum.photos/v2/list?page=" + randomIntFromInterval(1, 40) + "&limit=10", function (json) {
                    app.images = json
                });

            })
        },
        watch: {
            query: function (newQ, oldQ) {
                this.callJina()
            },
            prompts: {
                handler: function (newQ, oldQ) {
                    this.callJina()
                }, deep: true
            },
        },
        methods: {
            encodeImageFileAsURL(element) {
                var file = element.target.files[0];
                var reader = new FileReader();
                reader.onloadend = function () {
                    app.query = reader.result
                }
                reader.readAsDataURL(file);
            },
            addPrompt: function () {
                this.prompts.push({"text": "write your prompt here"})
            },
            rmPrompt: function () {
                this.prompts.pop()
            },
            normalize_value(val) {
                r = (val - this.min_embed_value) / (this.max_embed_value - this.min_embed_value)
                r = (r * 10).toFixed(0) / 10
                return r
            },
            callJina: function () {

                $.ajax({
                    headers: {
                        Authorization: "d28b93ccbd13367148d05fe3f7fbc680"
                    },
                    type: "POST",
                    url: this.serverAddress + "/post",
                    data: JSON.stringify(this.payload),
                    contentType: "application/json; charset=utf-8",
                    dataType: "json",
                }).success(function (data, textStatus, jqXHR) {
                    // data.data[0].embedding
                    app.matches = data.data[0].matches
                    date1 = new Date(data.routes[0].startTime)
                    date2 = new Date(data.routes[0].endTime)
                    app.elapsed = date2 - date1

                }).fail(function () {
                    console.error("bad connection!")
                });
            }
        }
    })

</script>

================================================
FILE: docs/_static/main.css
================================================
html.loaded-in-iframe #announcement,
html.loaded-in-iframe #sidebar-drawer,
html.loaded-in-iframe footer,
html.loaded-in-iframe #toc-drawer {
  display: none!important;
}

html.loaded-in-iframe .page .main {
    justify-content: center;
}

.sidebar-logo {
    max-width: 70%;
}


table.docutils {
    border: thin;
}

table.docutils td, table.docutils th {
    padding: 1rem 1rem;
}

.highlight {
    background: #f5f5f5;
}

h1, h2, h3 {
    margin-top: 3rem;
}

.highlight-console .highlight {
    background: #00232b !important;
    color: whitesmoke;
}

.highlight-text .highlight {
    background: #00232b !important;
    color: whitesmoke;
}

.highlight-json .highlight {
    background: #00232b !important;
    color: whitesmoke;
}

.highlight-shell .highlight {
    background: #00232b !important;
    color: whitesmoke;
}

.highlight-bash .highlight {
    background: #00232b !important;
    color: whitesmoke;
}

.tab-set > input:checked + label {
    border-color: var(--tabs--label-text--active);
}

.tab-set > input:checked + label:hover {
    border-color: var(--tabs--label-text--active);
}


table code {
    background: var(--color-inline-code-background);
    border: 1px solid var(--color-background-border);
    border-radius: .2em;
    font-size: var(--font-size--small--2);
    padding: .1em .2em;
}

.related-information {
    justify-content: space-between;
}

.social-btn {
    margin: 0 .3em;
}

.social-btn:hover {
    opacity: .5;
}

.social-btns {
    display: inline-block;
}

.announcement {
    background-color: var(--color-brand-primary);
    color: var(--color-background-primary) !important;
}

.announcement a {
    color: inherit;
    text-decoration: none;
}

.announcement a:hover {
    color: inherit;
    text-decoration: underline;
}

.sidebar-ecosys-logo {
    width: 1.2em;
    margin-right: .5em;
    vertical-align: middle
}


body[data-theme="dark"] .only-dark-line {
    display: inline-block !important;
}

body[data-theme="dark"] .only-light-line {
    display: none !important;
}

body[data-theme="light"] .only-light-line {
    display: inline-block !important;
}

body[data-theme="light"] .only-dark-line {
    display: none !important;
}

body[data-theme="auto"] .only-light-line {
    display: inline-block !important;
}

body[data-theme="auto"] .only-dark-line {
    display: none !important;
}

.color-gradient-card {
    background: linear-gradient(270deg, #22c1c3, #fdbb2d);
    background-size: 200% 200%;

    -webkit-animation: AnimationName 30s ease infinite;
    -moz-animation: AnimationName 30s ease infinite;
    animation: AnimationName 30s ease infinite;
}

@-webkit-keyframes AnimationName {
    0%{background-position:0% 50%}
    50%{background-position:100% 50%}
    100%{background-position:0% 50%}
}
@-moz-keyframes AnimationName {
    0%{background-position:0% 50%}
    50%{background-position:100% 50%}
    100%{background-position:0% 50%}
}
@keyframes AnimationName {
    0%{background-position:0% 50%}
    50%{background-position:100% 50%}
    100%{background-position:0% 50%}
}

.version-select {
    font-size: .7em;
    border-radius: 5px;
    cursor: pointer;
    background-color: #fff;
    background-image: linear-gradient(to top, #f9f9f9, #fff 33%);
    border-color: var(--color-background-border);
    height: 1.8em;
    line-height: 1.8em;
    outline: none;
    text-align: center;
    max-width: 7em;
    color: var(--color-foreground-muted);
}

================================================
FILE: docs/_templates/page.html
================================================
{% extends "base.html" %}

{% block body -%}
{{ super() }}
{% include "partials/icons.html" %}

<input type="checkbox" class="sidebar-toggle" name="__navigation" id="__navigation">
<input type="checkbox" class="sidebar-toggle" name="__toc" id="__toc">
<label class="overlay sidebar-overlay" for="__navigation">
    <div class="visually-hidden">Hide navigation sidebar</div>
</label>
<label class="overlay toc-overlay" for="__toc">
    <div class="visually-hidden">Hide table of contents sidebar</div>
</label>

{% if theme_announcement -%}
<div class="announcement">
    <aside class="announcement-content">
        {% block announcement %} {{ theme_announcement }} {% endblock announcement %}
    </aside>
</div>
{%- endif %}

<div class="page">
    <header class="mobile-header">
        <div class="header-left">
            <label class="nav-overlay-icon" for="__navigation">
                <div class="visually-hidden">Toggle site navigation sidebar</div>
                <i class="icon">
                    <svg>
                        <use href="#svg-menu"></use>
                    </svg>
                </i>
            </label>
        </div>
        <div class="header-center">
            <a href="{{ pathto(master_doc) }}">
                <div class="brand">{{ docstitle if docstitle else project }}</div>
            </a>
        </div>
        <div class="header-right">
            <div class="theme-toggle-container theme-toggle-header">
                <button class="theme-toggle">
                    <div class="visually-hidden">Toggle Light / Dark / Auto color theme</div>
                    <svg class="theme-icon-when-auto">
                        <use href="#svg-sun-half"></use>
                    </svg>
                    <svg class="theme-icon-when-dark">
                        <use href="#svg-moon"></use>
                    </svg>
                    <svg class="theme-icon-when-light">
                        <use href="#svg-sun"></use>
                    </svg>
                </button>
            </div>
            <label class="toc-overlay-icon toc-header-icon{% if furo_hide_toc %} no-toc{% endif %}" for="__toc">
                <div class="visually-hidden">Toggle table of contents sidebar</div>
                <i class="icon">
                    <svg>
                        <use href="#svg-toc"></use>
                    </svg>
                </i>
            </label>
        </div>
    </header>
    <aside class="sidebar-drawer">
        <div class="sidebar-container">
            {% block left_sidebar %}
            <div class="sidebar-sticky">
                {%- for sidebar_section in sidebars %}
                {%- include sidebar_section %}
                {%- endfor %}
            </div>
            {% endblock left_sidebar %}
        </div>
    </aside>
    <div class="main">
        <div class="content">
            <div class="article-container">
                <a href="#" class="back-to-top muted-link">
                    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                        <path d="M13 20h-2V8l-5.5 5.5-1.42-1.42L12 4.16l7.92 7.92-1.42 1.42L13 8v12z"></path>
                    </svg>
                    <span>{% trans %}Back to top{% endtrans %}</span>
                </a>
                <div class="content-icon-container">
                    {#- Edit this page, on GitHub -#}
                    {%- if READTHEDOCS and conf_py_path and page_source_suffix and github_user != "None" and github_repo
                    != "None" and github_version %}
                    <div class="edit-this-page">
                        <a class="muted-link"
                           href="https://github.com/{{ github_user }}/{{ github_repo }}/edit/{{ github_version }}{{ conf_py_path }}{{ pagename }}{{ page_source_suffix }}"
                           title="{{ _(" Edit this page") }}">
                        <svg aria-hidden="true" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" fill="none"
                             stroke-linecap="round" stroke-linejoin="round">
                            <path stroke="none" d="M0 0h24v24H0z" fill="none"/>
                            <path d="M4 20h4l10.5 -10.5a1.5 1.5 0 0 0 -4 -4l-10.5 10.5v4"/>
                            <line x1="13.5" y1="6.5" x2="17.5" y2="10.5"/>
                        </svg>
                        <span class="visually-hidden">{{ _("Edit this page") }}</span>
                        </a>
                    </div>
                    {% endif %}
                    {#- Theme toggle -#}
                    <div class="theme-toggle-container theme-toggle-content">
                        <button class="theme-toggle">
                            <div class="visually-hidden">Toggle Light / Dark / Auto color theme</div>
                            <svg class="theme-icon-when-auto">
                                <use href="#svg-sun-half"></use>
                            </svg>
                            <svg class="theme-icon-when-dark">
                                <use href="#svg-moon"></use>
                            </svg>
                            <svg class="theme-icon-when-light">
                                <use href="#svg-sun"></use>
                            </svg>
                        </button>
                    </div>
                    <label class="toc-overlay-icon toc-content-icon{% if furo_hide_toc %} no-toc{% endif %}"
                           for="__toc">
                        <div class="visually-hidden">Toggle table of contents sidebar</div>
                        <i class="icon">
                            <svg>
                                <use href="#svg-toc"></use>
                            </svg>
                        </i>
                    </label>
                </div>
                <article role="main">
                    {% block content %}{{ body }}{% endblock %}
                </article>
            </div>
            <footer>
                {% block footer %}
                <div class="related-pages">
                    {% if next -%}
                    <a class="next-page" href="{{ next.link }}">
                        <div class="page-info">
                            <div class="context">
                                <span>{{ _("Next") }}</span>
                            </div>
                            <div class="title">{{ next.title }}</div>
                        </div>
                        <svg class="furo-related-icon">
                            <use href="#svg-arrow-right"></use>
                        </svg>
                    </a>
                    {%- endif %}
                    {% if prev -%}
                    <a class="prev-page" href="{{ prev.link }}">
                        <svg class="furo-related-icon">
                            <use href="#svg-arrow-right"></use>
                        </svg>
                        <div class="page-info">
                            <div class="context">
                                <span>{{ _("Previous") }}</span>
                            </div>
                            {% if prev.link == pathto(master_doc) %}
                            <div class="title">{{ _("Home") }}</div>
                            {% else %}
                            <div class="title">{{ prev.title }}</div>
                            {% endif %}
                        </div>
                    </a>
                    {%- endif %}
                </div>
                <div class="bottom-of-page">
                    <div class="left-details">
                        {%- if show_copyright %}
                        <div class="copyright">
                            {%- if hasdoc('copyright') %}
                            {% trans path=pathto('copyright'), copyright=copyright|e -%}
                            <a href="{{ path }}">Copyright</a> &#169; {{ copyright }}
                            {%- endtrans %}
                            {%- else %}
                            {% trans copyright=copyright|e -%}
                            Copyright &#169; {{ copyright }}
                            {%- endtrans %}
                            {%- endif %}
                        </div>
                        {%- endif %}
                        {%- if last_updated -%}
                        <div class="last-updated">
                            {% trans last_updated=last_updated|e -%}
                            Last updated on {{ last_updated }}
                            {%- endtrans -%}
                        </div>
                        {%- endif %}
                    </div>
                    <div class="right-details">
                        <div class="social-btns">
                            <a class='social-btn' href="https://github.com/jina-ai/clip-as-service/" aria-label="GitHub"
                               target="_blank" rel="noreferrer"> <i class="fab fa-github"></i></a>
                            <a class='social-btn' href="https://discord.jina.ai" aria-label="Discord" target="_blank"
                               rel="noreferrer"> <i class="fab fa-discord"></i></a>
                            <a class='social-btn' href="https://youtube.com/c/jina-ai" aria-label="YouTube"
                               target="_blank" rel="noreferrer"> <i class="fab fa-youtube"></i></a>
                            <a class='social-btn' href="https://twitter.com/JinaAI_" aria-label="Twitter"
                               target="_blank" rel="noreferrer"> <i class="fab fa-twitter"></i></a>
                            <a class='social-btn' href="https://www.linkedin.com/company/jinaai/" aria-label="LinkedIn"
                               target="_blank" rel="noreferrer"> <i class="fab fa-linkedin"></i></a>
                        </div>
                    </div>
                </div>
                {% endblock footer %}
            </footer>
        </div>
        <aside class="toc-drawer{% if furo_hide_toc %} no-toc{% endif %}">
            {% block right_sidebar %}
            {% if not furo_hide_toc %}
            <div class="toc-sticky toc-scroll">
                <div class="toc-title-container">
          <span class="toc-title">
            {{ _("Contents") }}
          </span>
                </div>
                <div class="toc-tree-container">
                    <div class="toc-tree">
                        {{ toc }}
                    </div>
                </div>
            </div>
            {% endif %}
            {% endblock right_sidebar %}
        </aside>
    </div>
</div>
<img referrerpolicy="no-referrer-when-downgrade"
     src="https://static.scarf.sh/a.png?x-pxid=2823e771-0e1e-4320-8fde-48bc48e53262"/>
{%- endblock %}

================================================
FILE: docs/_templates/sidebar/brand.html
================================================
<a class="sidebar-brand{% if logo %} centered{% endif %}" href="{{ pathto(master_doc) }}">
  {% block brand_content %}
  {%- if logo_url %}
  <div class="sidebar-logo-container">
    <img class="sidebar-logo" src="{{ logo_url }}" alt="Logo" />
  </div>
  {%- endif %}
  {%- if theme_light_logo and theme_dark_logo %}
  <div class="sidebar-logo-container">
    <img class="sidebar-logo only-light" src="{{ pathto('_static/' + theme_light_logo, 1) }}" alt="Light Logo" />
    <img class="sidebar-logo only-dark" src="{{ pathto('_static/' + theme_dark_logo, 1) }}" alt="Dark Logo" />
  </div>
  {%- endif %}
  {% if not theme_sidebar_hide_name %}
  <span class="sidebar-brand-text">{{ docstitle if docstitle else project }}</span>
  {%- endif %}
  {% endblock brand_content %}
</a>
<div class="sd-d-flex-row sd-align-major-spaced">
  <a class="github-button" href="https://github.com/jina-ai/clip-as-service" data-icon="octicon-star" data-show-count="true" aria-label="Star jina-ai/jina on GitHub" style="opacity: 0;">Star</a>
  {% if versions %}
  <select onChange="window.location.href=this.value" class="version-select">
      {%- for item in versions|reverse %}
        {% if item.name == latest_jina_version %}
          {% set new_url = item.url if current_version.name == latest_jina_version else item.url | replace('/' + latest_jina_version, "") %}
          {% if current_version.version == item.version %}
            <option value="{{ new_url }}" selected="selected" >latest ({{ item.name }})</option>
          {% else %}
            <option value="{{ new_url }}" >latest({{ item.name }})</option>
          {% endif %}
        {% else %}
          {% if current_version.version == item.version %}
            <option value="{{ item.url }}" selected="selected" >{{ item.name }}</option>
          {% else %}
            <option value="{{ item.url }}" >{{ item.name }}</option>
          {% endif %}
        {% endif %}
      {%- endfor %}
  </select>
  {% endif %}
</div>


================================================
FILE: docs/_templates/sidebar/navigation.html
================================================
<div class="sidebar-tree">
    {{ furo_navigation_tree }}
    <p class="caption" role="heading"><span class="caption-text">Ecosystem</span></p>
    <ul>
        <li class="toctree-l1">
            <a class="reference external" href="https://docs.jina.ai">
                <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/search-light.svg', 1) }}">
                <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/search-dark.svg', 1) }}">
                Jina</a></li>
        <li class="toctree-l1"><a class="reference external" href="https://hub.jina.ai">
            <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/hub-light.svg', 1) }}">
            <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/hub-dark.svg', 1) }}">
            Jina Hub</a></li>
        <li class="toctree-l1"><a class="reference external" href="https://finetuner.jina.ai">
            <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/finetuner-light.svg', 1) }}">
            <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/finetuner-dark.svg', 1) }}">
            Finetuner</a></li>
        <li class="toctree-l1"><a class="reference external" href="https://docarray.jina.ai">
            <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/docarray-light.svg', 1) }}">
            <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/docarray-dark.svg', 1) }}">
            DocArray</a></li>
        <li class="toctree-l1"><a class="reference internal" href="#">
            <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/cas-light.svg', 1) }}">
            <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/cas-dark.svg', 1) }}">
            CLIP-as-service</a></li>
        <li class="toctree-l1"><a class="reference external" href="https://github.com/jina-ai/jcloud">
            <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/JCloud-light.svg', 1) }}">
            <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/JCloud-dark.svg', 1) }}">
            JCloud</a></li>
        <li class="toctree-l1"><a class="reference external" href="https://now.jina.ai">
            <img class="sidebar-ecosys-logo only-light-line" src="{{ pathto('_static/now-light.svg', 1) }}">
            <img class="sidebar-ecosys-logo only-dark-line" src="{{ pathto('_static/now-dark.svg', 1) }}">
            NOW</a></li>
    </ul>
</div>


================================================
FILE: docs/changelog/index.md
================================================
# Changelog

CLIP-as-service follows semantic versioning. However, before the project reach 1.0.0, any breaking change will only bump the minor version.  An automated release note is [generated on every release](https://github.com/jina-ai/clip-as-service/releases). The release note includes features, bugs, refactorings etc. 

This chapter only tracks the most important breaking changes and explain the rationale behind them.

## 0.4.0: rename `rerank` concept to `rank`

"Reranking" is a new feature introduced since 0.3.3. This feature allows user to rank and score `document.matches` in a cross-modal way. From 0.4.0, this feature as well as all related functions will refer it simply as "rank".

## 0.2.0: improve the service scalability with replicas

This change is mainly intended to improve the inference performance with replicas.

Here is the short benchmark summary of the improvement (`replicas=4`):

| batch_size  | before | after   |
|-------------|--------|---------|
| 1           | 23.74  | 18.89   |
| 8           | 58.88  | 30.38   |
| 16          | 14.96  | 91.86   |
| 32          | 14.78  | 101.75  |


================================================
FILE: docs/conf.py
================================================
import os
import re
import sys
from os import path

sys.path.insert(0, path.abspath('..'))

project = 'CLIP-as-service'
slug = re.sub(r'\W+', '-', project.lower())
author = 'Jina AI'
copyright = 'Jina AI Limited. All rights reserved.'
source_suffix = ['.rst', '.md']
master_doc = 'index'
language = 'en'
repo_dir = '../'

try:
    if 'CAS_VERSION' not in os.environ:
        libinfo_py = path.join(repo_dir, 'client/clip_client', '__init__.py')
        libinfo_content = open(libinfo_py, 'r').readlines()
        version_line = [
            l.strip() for l in libinfo_content if l.startswith('__version__')
        ][0]
        exec(version_line)
    else:
        __version__ = os.environ['CAS_VERSION']
except FileNotFoundError:
    __version__ = '0.0.0'

version = __version__
release = __version__

templates_path = ['_templates']
exclude_patterns = [
    '_build',
    'Thumbs.db',
    '.DS_Store',
    'tests',
    'page_templates',
    '.github',
]
pygments_style = 'rainbow_dash'
html_theme = 'furo'

base_url = '/'
html_baseurl = 'https://clip-as-service.jina.ai'
sitemap_url_scheme = '{link}'
sitemap_locales = [None]
sitemap_filename = "sitemap.xml"

html_theme_options = {
    'light_logo': 'logo-light.svg',
    'dark_logo': 'logo-dark.svg',
    "sidebar_hide_name": True,
    "light_css_variables": {
        "color-brand-primary": "#009191",
        "color-brand-content": "#009191",
    },
    "dark_css_variables": {
        "color-brand-primary": "#FBCB67",
        "color-brand-content": "#FBCB67",
    },
    # PLEASE DO NOT DELETE the empty line between `start-announce` and `end-announce`
    # PLEASE DO NOT DELETE `start-announce`/ `end-announce` it is used for our dev bot to inject announcement from GH
    # start-announce
    # end-announce
}

html_static_path = ['_static']
html_extra_path = ['html_extra']
html_css_files = [
    'main.css',
    'https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta2/css/all.min.css',
]
html_js_files = [
    'https://cdn.jsdelivr.net/npm/vue@2/dist/vue.min.js',
]
htmlhelp_basename = slug
html_show_sourcelink = False
html_favicon = '_static/favicon.png'

intersphinx_mapping = {'docarray': ('https://docarray.jina.ai/', None), 'finetuner': ('https://finetuner.jina.ai/', None)}

latex_documents = [(master_doc, f'{slug}.tex', project, author, 'manual')]
man_pages = [(master_doc, slug, project, [author], 1)]
texinfo_documents = [
    (master_doc, slug, project, author, slug, project, 'Miscellaneous')
]
epub_title = project
epub_exclude_files = ['search.html']

# -- Extension configuration -------------------------------------------------

extensions = [
    'sphinx.ext.autodoc',
    'sphinx_autodoc_typehints',
    'sphinx.ext.viewcode',
    'sphinx.ext.coverage',
    'sphinxcontrib.apidoc',
    'sphinxarg.ext',
    'sphinx_copybutton',
    'sphinx_sitemap',
    'sphinx.ext.intersphinx',
    'sphinxext.opengraph',
    'notfound.extension',
    'myst_parser',
    'sphinx_design',
    'sphinx_inline_tabs',
]

myst_enable_extensions = ['colon_fence', 'substitution', 'deflist']

# -- Custom 404 page

# sphinx-notfound-page
# https://github.com/readthedocs/sphinx-notfound-page
notfound_context = {
    'title': 'Page Not Found',
    'body': '''
<h1>Page Not Found</h1>
<p>Oops, we couldn't find that page. </p>
<p>You can try "asking our docs" on the right corner of the page to find answer.</p>
<p>Otherwise, <a href="https://github.com/jina-ai/clip-as-service/">please create a Github issue</a> and one of our team will respond.</p>

''',
}
notfound_no_urls_prefix = True

apidoc_module_dir = '../client'
apidoc_output_dir = 'api'
apidoc_excluded_paths = ['tests', 'legacy', 'hub', 'toy*', 'setup.py']
apidoc_separate_modules = True
apidoc_extra_args = ['-t', 'template/']
autodoc_member_order = 'bysource'
autodoc_mock_imports = ['argparse', 'numpy', 'np', 'tensorflow', 'torch', 'scipy']
autoclass_content = 'both'
set_type_checking_flag = False
html_last_updated_fmt = ''
nitpicky = True
nitpick_ignore = [('py:class', 'type')]
linkcheck_ignore = [
    # Avoid link check on local uri
    'http://0.0.0.0:*',
    'pods/encode.yml',
    'https://github.com/jina-ai/clip-as-service/commit/*',
    '.github/*',
    'extra-requirements.txt',
    'fastentrypoints.py' '../../101',
    '../../102',
    'http://www.twinsun.com/tz/tz-link.htm',  # Broken link from pytz library
    'https://urllib3.readthedocs.io/en/latest/contrib.html#google-app-engine',  # Broken link from urllib3 library
    'https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-20-04/',
    # This link works but gets 403 error on linkcheck
]
linkcheck_timeout = 20
linkcheck_retries = 2
linkcheck_anchors = False

ogp_site_url = 'https://clip-as-service.jina.ai/'
ogp_image = 'https://clip-as-service.jina.ai/_static/banner.png'
ogp_use_first_image = True
ogp_description_length = 300
ogp_type = 'website'
ogp_site_name = f'CLIP-as-service {os.environ.get("SPHINX_MULTIVERSION_VERSION", version)} Documentation'

ogp_custom_meta_tags = [
    '<meta name="twitter:card" content="summary_large_image">',
    '<meta name="twitter:site" content="@JinaAI_">',
    '<meta name="twitter:creator" content="@JinaAI_">',
    '<meta name="description" content="Embed images and sentences into fixed-length vectors via CLIP.">',
    '<meta property="og:description" content="CLIP-as-service is a low-latency high-scalability embedding service for images and texts. It can be easily integrated as a microservice into neural search solutions.">',
    '''

<script async src="https://www.googletagmanager.com/gtag/js?id=G-E63SXVNDXZ"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());

  gtag('config', 'G-E63SXVNDXZ');
</script>

<script async defer src="https://buttons.github.io/buttons.js"></script>
    ''',
]


def add_server_address(app):
    # This makes variable `server_address` available to docbot.js
    server_address = app.config['server_address']
    js_text = "var server_address = '%s';" % server_address
    app.add_js_file(None, body=js_text)


def setup(app):
    from sphinx.domains.python import PyField
    from sphinx.util.docfields import Field
    from sphinx.locale import _

    app.add_object_type(
        'confval',
        'confval',
        objname='configuration value',
        indextemplate='pair: %s; configuration value',
        doc_field_types=[
            PyField(
                'type',
                label=_('Type'),
                has_arg=False,
                names=('type',),
                bodyrolename='class',
            ),
            Field(
                'default',
                label=_('Default'),
                has_arg=False,
                names=('default',),
            ),
        ],
    )


================================================
FILE: docs/hosting/by-jina.md
================================================
# Hosted by Jina AI

```{include} ../../README.md
:start-after: <!-- start inference-banner -->
:end-before: <!-- end inference-banner -->
```

In today's dynamic business environment, enterprises face a multitude of challenges that require advanced solutions to 
maintain a competitive edge. 
From managing vast amounts of unstructured data to delivering personalized customer experiences, businesses need 
efficient tools to tackle these obstacles. 
Machine learning (ML) has emerged as a powerful tool for automating repetitive tasks, processing data effectively, and 
generating valuable insights from multimedia content. 
Jina AI's Inference offers a comprehensive solution to streamline access to curated, state-of-the-art ML models, 
eliminating traditional roadblocks such as costly and time-consuming MLOps steps and the distinction between public and 
custom neural network models.

## Getting started

To access the fastest and most performant CLIP models, [Jina AI's Inference](https://cloud.jina.ai/user/inference) is 
the go-to choice. 
Follow the steps below to get started:

1. Sign up for a free account at [Jina AI Cloud](https://cloud.jina.ai).
2. Once you have created an account, navigate to the Inference tab to create a new CLIP model.
3. The model can be accessed either through an HTTP endpoint or a gRPC endpoint.

## Obtaining a Personal Access Token

Before you begin using [Jina AI's Inference](https://cloud.jina.ai/user/inference), ensure that you have obtained a 
personal access token (PAT) from the [Jina AI Cloud](https://cloud.jina.ai) or through the command-line interface (CLI). 
Use the following guide to create a new PAT:

1. Access the [Jina AI Cloud](https://cloud.jina.ai) and log in to your account.
2. Navigate to the [**Access token**](https://cloud.jina.ai/settings/tokens) section in the **Settings** tab, or alternatively, create a PAT via the CLI using the command:

```bash
jina auth token create <name of PAT> -e <expiration days>
```

## Installing the Inference Client

To interact with the model created in Inference, you will need to install the `inference-client` Python package. 
Follow the steps below to install the package using pip:

```bash
pip install inference-client
```

## Interacting with the Model

Once you have your personal access token and the model name listed in the Inference detail page, you can start 
interacting with the model using the `inference-client` Python package. 
Follow the example code snippet below:

```python
from inference_client import Client

client = Client(token='<your auth token>')

model = client.get_model('<your model name>')
```

The CLIP models offer the following functionalities:

1. Encoding: Users can encode data by calling the `model.encode` method. For detailed instructions on using this method, refer to the [Encode documentation](https://jina.readme.io/docs/encode).
2. Ranking: Users can perform ranking by calling the `model.rank` method. Refer to the [Rank documentation](https://jina.readme.io/docs/rank) for detailed instructions on using this method.

For further details on usage and information about other tasks and models supported in Inference, as well as how to use 
`curl` to interact with the model, please consult the [Inference documentation](https://jina.readme.io/docs/inference).


================================================
FILE: docs/hosting/cas-on-colab.ipynb
================================================
{
 "nbformat": 4,
 "nbformat_minor": 0,
 "metadata": {
  "colab": {
   "name": "cas-on-colab.ipynb",
   "provenance": [],
   "collapsed_sections": []
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3"
  },
  "language_info": {
   "name": "python"
  },
  "accelerator": "GPU"
 },
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "# Hosting CLIP-as-service on Google Colab with TPU/GPU support\n",
    "\n",
    "This tutorial guides you on how to implement the following architecture:\n",
    "\n",
    "[![](https://mermaid.ink/img/pako:eNp1kEFrwzAMhf-K0bkh99xGVwpjh9Ctp7oMxVYTM8cOttwy2v732fMGgzFd9Hjvk0C6gvKaoIMx4DKJ5510IldMQzW23o-WxNpbHMRBlXasSCmF8S1SOFNommbb79vXfl9TcrqKh0OBlDXk-Cgydht3_bqdmJf2QkP06p349mtTHXvCM0YVzMJfMwX_C6kU7L8xrGCmMKPR-bprcSTwRDNJ6LLUdMJkWYJ094ymRSPTRhv2AboT2kgrwMT-5cMp6Dgk-oEeDebfzN_U_RP7v2yd)](https://mermaid.live/edit#pako:eNp1kEFrwzAMhf-K0bkh99xGVwpjh9Ctp7oMxVYTM8cOttwy2v732fMGgzFd9Hjvk0C6gvKaoIMx4DKJ5510IldMQzW23o-WxNpbHMRBlXasSCmF8S1SOFNommbb79vXfl9TcrqKh0OBlDXk-Cgydht3_bqdmJf2QkP06p349mtTHXvCM0YVzMJfMwX_C6kU7L8xrGCmMKPR-bprcSTwRDNJ6LLUdMJkWYJ094ymRSPTRhv2AboT2kgrwMT-5cMp6Dgk-oEeDebfzN_U_RP7v2yd)\n",
    "\n",
    "CLIP-as-service is powered by Jina, [there is another tutorial showing you how to host Jina service on Colab in general](https://colab.research.google.com/github/jina-ai/jina/blob/master/docs/Using_Jina_on_Colab.ipynb). Highly recommended!\n",
    "\n",
    "\n",
    "## 1. Change runtime type\n",
    "\n",
    "Go to menu `Runtime -> Change run time type -> GPU/TPU`\n",
    "\n",
    "\n",
    "## 2. Install Packages\n",
    "\n",
    "As we will run the client locally, we only need to install `clip_server` package on Colab.\n",
    "\n",
    "\n",
    "**⚠️ You will be asked to \"Restart Runtime\" after this step, please click the button and restart the runtime.**"
   ],
   "metadata": {
    "id": "lbUpcvs1p1CF",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "MRrB2If6kDfX",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!pip install clip_server pyngrok"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 3. Config Flow YAML\n",
    "\n",
    "\n",
    "Unlike classic entrypoint from CLI, here we need to start the Flow in Python. Let's load use Pytorch backend and write a Flow YAML. Note that we need to load the torch Python file from `clip_server` installation, hence you see `cas_path` below. More available options [can be found here](https://github.com/jina-ai/clip-as-service/tree/main/server/clip_server/executors)."
   ],
   "metadata": {
    "id": "q3bmGKIvx5S-",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [
    "import clip_server\n",
    "cas_path = clip_server.__path__[0]"
   ],
   "metadata": {
    "id": "nypR4g9EmgOj",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": 1,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "This YAML is directly [taken from this file](https://github.com/jina-ai/clip-as-service/blob/main/server/clip_server/torch-flow.yml). You can also customize it as you wish, [please check CLIP-as-service docs](https://clip-as-service.jina.ai/user-guides/server/#yaml-config)."
   ],
   "metadata": {
    "id": "5RVA1OD8ywOo",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [
    "flow_yaml = f'''\n",
    "jtype: Flow\n",
    "with:\n",
    "  port: 51000\n",
    "executors:\n",
    "  - name: clip_t\n",
    "    uses:\n",
    "      jtype: CLIPEncoder\n",
    "      metas:\n",
    "        py_modules:\n",
    "          - {cas_path}/executors/clip_torch.py\n",
    "'''"
   ],
   "metadata": {
    "id": "q1BXWnXVkIZ8",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": 2,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "flow_yaml"
   ],
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 53
    },
    "id": "Fb1PKf992rLj",
    "outputId": "a06b634a-5021-4b24-f3dc-a2c6b1d87524",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": 3,
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "'\\njtype: Flow\\nwith:\\n  port: 51000\\nexecutors:\\n  - name: clip_t\\n    uses:\\n      jtype: CLIPEncoder\\n      metas:\\n        py_modules:\\n          - /usr/local/lib/python3.7/dist-packages/clip_server/executors/clip_torch.py\\n'"
      ],
      "application/vnd.google.colaboratory.intrinsic+json": {
       "type": "string"
      }
     },
     "metadata": {},
     "execution_count": 3
    }
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 4. Start the Flow\n",
    "\n",
    "It may take a minute or so on the first start, as it will download the pretrained models. To select different pretrained models, [please check CLIP-as-service docs](https://clip-as-service.jina.ai/user-guides/server/#yaml-config)."
   ],
   "metadata": {
    "id": "GvAeaUf4y88e",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from jina import Flow\n",
    "\n",
    "f = Flow.load_config(flow_yaml)\n",
    "f.start()"
   ],
   "metadata": {
    "id": "4UubypFpl8-K",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "Remember to close it via `f.close()` when you don't use it. But let's keep it open for now."
   ],
   "metadata": {
    "id": "2BOYxmpd8YSE",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## 5. Set up forwarding\n",
    "\n",
    "By default Flow uses gRPC protocol, it is highly-efficient and feature-rich. So in this tutorial, we will use gRPC protocol and use `ngrok` for forwarding. It is possible and in fact slighly easier to set up when using `Flow(protocol='http')`, [please read the turorial here](https://colab.research.google.com/github/jina-ai/jina/blob/master/docs/Using_Jina_on_Colab.ipynb#scrollTo=0ASjGLBhXono) here I won't repeat again.\n",
    "\n",
    "\n",
    "You will need to first sign up at https://dashboard.ngrok.com/signup (http do not need register, that's why I said it is easier)\n",
    "\n",
    "After signing up, you can get a token. Then simply add your token via (replacing `YOUR_TOKEN_HERE`)"
   ],
   "metadata": {
    "id": "1lTqYEwezDTP",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [
    "!pip install pyngrok\n",
    "\n",
    "# remember to replace to your token! otherwise i can see your service, i mean i dont really have time to see it but nonetheless\n",
    "!ngrok authtoken 2ARsKtGKj47h7y4uXMQPrIeOinS_47Mkh6jkzNjFEJWuZYNEX"
   ],
   "metadata": {
    "id": "PYQPKek-oG1a",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "!ngrok tcp 51000 --log \"stdout\""
   ],
   "metadata": {
    "id": "2Hacpj4qn9nx",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "At the last line, you should see something like: \n",
    "\n",
    "```\n",
    "t=2022-06-11T20:29:11+0000 lvl=info msg=\"started tunnel\" obj=tunnels name=command_line addr=//localhost:54321 url=tcp://6.tcp.ngrok.io:18096\n",
    "```\n",
    "\n",
    "Grab the text after `url=tcp://` in my case it is `6.tcp.ngrok.io:18096`.\n",
    "\n",
    "Now build a client using this address from your local laptop/Python environment.\n",
    "\n",
    "Copy paste the code below to your local Python, remmeber to change your address.\n",
    "\n",
    "**Remember, if your last line is `url=tcp://6.tcp.ngrok.io:18096` then you should set `Client('grpc://6.tcp.ngrok.io:18096')`**\n",
    "\n",
    "### Try Embedding Task from Local\n",
    "\n",
    "```python\n",
    "# pip install clip-client\n",
    "from clip_client import Client\n",
    "\n",
    "c = Client('grpc://6.tcp.ngrok.io:18096')\n",
    "\n",
    "r = c.encode(\n",
    "    [\n",
    "        'First do it',\n",
    "        'then do it right',\n",
    "        'then do it better',\n",
    "        'https://picsum.photos/200',\n",
    "    ]\n",
    ")\n",
    "print(r)\n",
    "```\n",
    "\n",
    "And you will get \n",
    "\n",
    "```text\n",
    "[[ 0.03494263 -0.23510742  0.0104599  ... -0.5229492  -0.10021973\n",
    "  -0.08685303]\n",
    " [-0.06793213 -0.0032444   0.01506805 ... -0.50341797 -0.06143188\n",
    "  -0.08520508]\n",
    " [ 0.15063477 -0.07922363 -0.06530762 ... -0.46484375 -0.08526611\n",
    "   0.04324341]\n",
    " [-0.16088867  0.10552979 -0.20581055 ... -0.41381836  0.19543457\n",
    "   0.05718994]]\n",
    "```\n",
    "\n",
    "Showing the connection is success!\n",
    "\n",
    "\n",
    "### Try Ranking Task from Local\n",
    "\n",
    "```python\n",
    "from docarray import Document\n",
    "\n",
    "from clip_client import Client\n",
    "\n",
    "c = Client(server='grpc://6.tcp.ngrok.io:18096/rank')\n",
    "\n",
    "r = c.rank(\n",
    "    [\n",
    "        Document(\n",
    "            uri='https://picsum.photos/id/1/300/300',\n",
    "            matches=[\n",
    "                Document(text=f'a photo of a {p}')\n",
    "                for p in (\n",
    "                    'man',\n",
    "                    'woman',\n",
    "                )\n",
    "            ],\n",
    "        )\n",
    "    ]\n",
    ")\n",
    "\n",
    "print(r['@m', ['text', 'scores']])\n",
    "```\n",
    "\n",
    "```\n",
    "[['a photo of a man', 'a photo of a woman'], [defaultdict(<class 'docarray.score.NamedScore'>, {'clip_score': {'value': 0.5806832313537598, 'op_name': 'softmax'}, 'clip_score_cosine': {'value': 0.2178003191947937, 'op_name': 'cosine'}}), defaultdict(<class 'docarray.score.NamedScore'>, {'clip_score': {'value': 0.41931676864624023, 'op_name': 'softmax'}, 'clip_score_cosine': {'value': 0.21454453468322754, 'op_name': 'cosine'}})]]\n",
    "```\n",
    "\n",
    "\n",
    "Now enjoy the free GPU/TPU to build your awesome CAS applications!"
   ],
   "metadata": {
    "id": "Fzxt8j3Bz9Nu",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [
    "f.close()"
   ],
   "metadata": {
    "id": "wzj0pb7qo56c",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": 11,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "# Push to the Limit\n",
    "\n",
    "Now let's use the biggest `ViT-L/14-336px` and fully leverage all VRAM with 4 replicas, lets see if it works.\t"
   ],
   "metadata": {
    "id": "c6yNVg69-vaw",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [
    "flow_yaml = f'''\n",
    "jtype: Flow\n",
    "with:\n",
    "  port: 51000\n",
    "executors:\n",
    "  - name: clip_t\n",
    "    uses:\n",
    "      jtype: CLIPEncoder\n",
    "      metas:\n",
    "        py_modules:\n",
    "          - {cas_path}/executors/clip_torch.py\n",
    "    replicas: 4\n",
    "'''"
   ],
   "metadata": {
    "id": "uHHWk3WF_DaO",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": 12,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "from jina import Flow\n",
    "\n",
    "f = Flow.load_config(flow_yaml)\n",
    "f.start()"
   ],
   "metadata": {
    "id": "0AGcGasu_JIv",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "source": [
    "!ngrok tcp 51000 --log \"stdout\""
   ],
   "metadata": {
    "id": "DQzvwOF3_K6U",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "Yay it works!"
   ],
   "metadata": {
    "id": "8T2z6HXd_hKB",
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "source": [],
   "metadata": {
    "id": "4-y_vbHW_acV",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "execution_count": null,
   "outputs": []
  }
 ]
}

================================================
FILE: docs/hosting/colab.md
================================================
# Host on Google Colab

```{figure} https://clip-as-service.jina.ai/_images/colab-banner.png
:width: 0 %
:scale: 0 %
```

```{figure} colab-banner.png
:scale: 0 %
:width: 0 %
```


As [Jina is fully compatible to Google Colab](https://docs.jina.ai/how-to/google-colab/), CLIP-as-service can be run smoothly on Colab as well. One can host `clip_server` on Google Colab by leveraging its free GPU/TPU resources and open up to 4 replicas of `ViT-L/14-336px`. Then you can send request from local to the server for embedding, ranking and reasoning tasks. 

Specifically, the architecture is illustrated below:

```{figure} cas-on-colab.svg
:width: 70%
```

```{button-link} https://colab.research.google.com/github/jina-ai/clip-as-service/blob/main/docs/hosting/cas-on-colab.ipynb
:color: primary
:align: center

{octicon}`link-external` Open the notebook on Google Colab 
```

Please follow the walk-through there. Enjoy the free GPU/TPU to build your awesome Jina applications!


```{tip}
Hosing service on Google Colab is not recommended if you server aims to be long-live or permanent. It is often used for quick experiment, demonstration or leveraging its free GPU/TPU. For stable, please deploy the CLIP model on your own server.
```


================================================
FILE: docs/hosting/on-jcloud.md
================================================
# Host on JCloud

Essentially `clip_server` is a Jina [Flow](https://docs.jina.ai/fundamentals/flow/). Any Jina Flow can be hosted on [JCloud](https://docs.jina.ai/fundamentals/jcloud/), hence `clip_server` can be hosted on JCloud as well. Learn more about [JCloud here](https://docs.jina.ai/fundamentals/jcloud/).


First, you need a Flow YAML file for deploy. A minimum YAML file is as follows:

````{tab} torch-flow.yml

```yaml
jtype: Flow
executors:
  - uses: jinahub+docker://CLIPTorchEncoder
```

````
````{tab} onnx-flow.yml

```yaml
jtype: Flow
executors:
  - uses: jinahub+docker://CLIPOnnxEncoder
```

````

```{tip}
`port` is unnecessary here as JCloud will assign a new hostname and port for any deployed service. 
```

Executors must start with `jinahub+docker://` as it is required by JCloud. We currently provide containerized executors [`jinahub+docker://CLIPTorchEncoder`](https://cloud.jina.ai/executor/gzpbl8jh) and [`jinahub+docker://CLIPOnnxEncoder`](https://cloud.jina.ai/executor/2a7auwg2) on Jina Hub. They are automatically synced on the new release of `clip_server` module. 

To enable GPU on JCloud, you need to configure it in the YAML file and use prebuilt docker GPU images. For example,

```yaml
jtype: Flow
executors:
  - uses: jinahub+docker://CLIPTorchEncoder/latest-gpu
    jcloud:
      resources:
        gpu: shared
```

Please refer [here](https://docs.jina.ai/fundamentals/jcloud/yaml-spec/#gpu) for more details on using GPU in JCloud.
Notice that you must specify a docker image GPU tag for your executor to utilize the GPU. For example `latest-gpu`. 
See the 'Tag' section in [CLIPTorchEncoder](https://cloud.jina.ai/executor/gzpbl8jh) and [CLIPOnnxEncoder](https://cloud.jina.ai/executor/2a7auwg2) for docker image GPU tags.

To deploy,

````{tab} PyTorch-backed
```bash
jc deploy torch-flow.yml
```
````

````{tab} ONNX-backed
```bash
jc deploy onnx-flow.yml
```
````


If Flow is successfully deployed you will see:

```{figure} jc-deploy.png
:width: 60%
```

You can now connect to it via client by setting  `server` as the URL given by JCloud:

```python
from clip_client import Client

c = Client(
    'grpcs://174eb69ba3.wolf.jina.ai'
)  # This is the URL you get from previous step
c.profile()
```


================================================
FILE: docs/html_extra/robots.txt
================================================
User-agent: *
sitemap: https://clip-as-service.jina.ai/sitemap.xml

================================================
FILE: docs/index.md
================================================
# Welcome to CLIP-as-service!


```{include} ../README.md
:start-after: <!-- start elevator-pitch -->
:end-before: <!-- end elevator-pitch -->
```

## Try it!

## Install

![PyPI](https://img.shields.io/pypi/v/clip_client?color=%23ffffff&label=%20) is the latest version.

Make sure you are using Python 3.7+. You can install the client and server independently. It is **not required** to install both: e.g. you can install `clip_server` on a GPU machine and `clip_client` on a local laptop.

````{tab} Client

```bash
pip install clip-client
```

````

````{tab} Server (PyTorch)

```bash
pip install clip-server
```
````

````{tab} Server (ONNX)

```bash
pip install "clip_server[onnx]"
```

````


````{tab} Server (TensorRT)

```bash
pip install nvidia-pyindex 
pip install "clip_server[tensorrt]"
```
````

````{tab} Server on Google Colab

```{button-link} https://colab.research.google.com/github/jina-ai/clip-as-service/blob/main/docs/hosting/cas-on-colab.ipynb
:color: primary
:align: center

{octicon}`link-external` Open the notebook on Google Colab 
```

````


## Quick check

After installing, you can run the following commands for a quick connectivity check.

### Start the server

````{tab} Start PyTorch Server 
```bash
python -m clip_server
```
````

````{tab} Start ONNX Server 
```bash
python -m clip_server onnx-flow.yml
```
````

````{tab} Start TensorRT Server 
```bash
python -m clip_server tensorrt-flow.yml
```
````

At the first time starting the server, it will download the default pretrained model, which may take a while depending on your network speed. Then you will get the address information similar to the following: 

```text
╭────────────── 🔗 Endpoint ───────────────╮
│  🔗     Protocol                   GRPC  │
│  🏠        Local          0.0.0.0:51000  │
│  🔒      Private    192.168.31.62:51000  │
|  🌍       Public   87.105.159.191:51000  |
╰──────────────────────────────────────────╯  
```

This means the server is ready to serve. Note down the three addresses shown above, you will need them later.

### Connect from client

```{tip}
Depending on the location of the client and server. You may use different IP addresses:
- Client and server are on the same machine: use local address, e.g. `0.0.0.0`
- Client and server are connected to the same router: use private network address, e.g. `192.168.3.62`
- Server is in public network: use public network address, e.g. `87.105.159.191`
```

Run the following Python script:

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:51000')
c.profile()
```

will give you:

```text
 Roundtrip  16ms  100%
├──  Client-server network  8ms  49%
└──  Server  8ms  51%
    ├──  Gateway-CLIP network  2ms  25%
    └──  CLIP model  6ms  75%
{'Roundtrip': 15.684750003856607, 'Client-server network': 7.684750003856607, 'Server': 8, 'Gateway-CLIP network': 2, 'CLIP model': 6}
```

It means the client and the server are now connected. Well done!


```{include} ../README.md
:start-after: <!-- start support-pitch -->
:end-before: <!-- end support-pitch -->
```


```{toctree}
:caption: User Guides
:hidden:

user-guides/client
user-guides/server
user-guides/benchmark
user-guides/retriever
user-guides/faq
```

```{toctree}
:caption: Hosting
:hidden:


hosting/colab
```

```{toctree}
:caption: Playground
:hidden:

playground/embedding
playground/reasoning
playground/searching
```


```{toctree}
:caption: Developer References
:hidden:
:maxdepth: 1

api/clip_client
```


---
{ref}`genindex` | {ref}`modindex`


================================================
FILE: docs/makedoc.sh
================================================
#!/usr/bin/env bash

set -ex

rm -rf api && make clean

make dirhtml


================================================
FILE: docs/playground/embedding.md
================================================
# Text & Image Embedding

Embedding is a basic task in CLIP-as-service. It means converting your input sentence or image into a fixed-length vector. In this demo, you can choose a picture, input a sentence in the textbox, or copy-paste your image URL into the text box to get a rough feeling how CLIP-as-service works.

This is *not* a search task. The images are random stock images and are related to any search results, they are mainly for saving your time on finding some random internet cat pictures. 

The model is `ViT-L/14-336px` on one GPU.

<iframe frameborder="0" allowtransparency="true" scrolling="no" src="../../_static/demo-embed.html" style="overflow:hidden;overflow-x:hidden;overflow-y:hidden;height:100vh;width:100%"></iframe>

```{button-link} ../../_static/demo-embed.html
:color: primary
:align: center

{octicon}`link-external` Open this playground in a new window
```

================================================
FILE: docs/playground/reasoning.md
================================================
# Visual Reasoning

Visual reasoning is another basic task in CLIP-as-service. There are four basic visual reasoning skills: object recognition, object counting, color recognition, and spatial relation understanding. Despite how magic it sounds and looks, the idea is fairly simple: just input the reasoning texts as prompts, then {ref}`calling rank interface<rank-api>` of `clip_server`. The server will rank the prompts and return sorted prompts with scores.

In this demo, you can choose a picture, or copy-paste your image URL into the text box to get a rough feeling how visual reasoning works. Feel free to add or remove prompts and observe how it affects the ranking results.

The model is `ViT-L/14-336px` on one GPU.

<iframe frameborder="0" allowtransparency="true" scrolling="no" src="../../_static/demo-text-rank.html" style="overflow:hidden;overflow-x:hidden;overflow-y:hidden;height:100vh;width:100%"></iframe>


```{button-link} ../../_static/demo-text-rank.html
:color: primary
:align: center

{octicon}`link-external` Open this playground in a new window
```

================================================
FILE: docs/playground/searching.md
================================================
# Text & Image Searching

CLIP-as-service enables us to encode text and images into a common space. This is a powerful tool for many applications, such as cross-modality search.

[CLIP search](../user-guides/retriever.md) is a new feature provided by CLIP-as-service. It enables us to search for images based on text/image. It calculates the similarity score based on the embeddings of the text and image. The higher the score, the more similar they are.

This demo demonstrates the text-to-image and image-to-image searching in CLIP search. You can type text query or upload the local image as a query, and it will return the top 10 similar images for you.

In this demo, we use [``Open-Image-Dataset``](https://storage.googleapis.com/openimages/web/download.html) dataset (consist of 125,346 images) to demonstrate Text & Image retrieval.

<iframe frameborder="0" allowtransparency="true" scrolling="no" src="https://jemmyshin-laion5b-streamlit-streamlit-demo-rddbqz.streamlitapp.com?embedded=true" style="overflow:hidden;overflow-x:hidden;overflow-y:hidden;height:100vh;width:100%"></iframe>

```{button-link} https://jemmyshin-laion5b-streamlit-streamlit-demo-rddbqz.streamlitapp.com/
:color: primary
:align: center

{octicon}`link-external` Open this playground in a new window
```

================================================
FILE: docs/requirements.txt
================================================
# cf. https://github.com/ryanfox/sphinx-markdown-tables/issues/36
markdown<3.4.0
sphinx
sphinx-argparse==0.3.1
sphinxcontrib-apidoc==0.3.0
sphinx-autodoc-typehints==1.12.0
sphinx_markdown_tables==0.0.15
sphinx_copybutton==0.4.0
sphinx-notfound-page==0.7.1
gitpython==3.1.13
sphinx-sitemap==2.2.0
sphinxext-opengraph
furo
myst-parser==0.15.1
sphinx-design
sphinx-inline-tabs
# sphinx-multiversion
git+https://github.com/Holzhaus/sphinx-multiversion.git

================================================
FILE: docs/user-guides/benchmark.rst
================================================
Benchmark
=========

In order to understand the zero-shot performance of CLIP and its limitations, we conducted a benchmark
across a variety of computer vision datasets (the dataset details are in the appendix). Here, thanks for the
open-source `CLIP Benchmark toolkit <https://github.com/LAION-AI/CLIP_benchmark>`_, we can easily reproduce the results.

We hope that this benchmark can help you to better understand the performance of CLIP models and choose the best model for your application.


Select the right model
-----------------------

In general, you can select the best model for your application from different perspectives: disk usage, peak RAM and VRAM usages, QPS, and most importantly, the performance.

Based on our experiments, we recommend the ViT models over the RN models for most general applications.
More specifically, the ``ViT-H-14::laion2b_s32b_b79k`` model and ``ViT-g-14::laion2b_s12b_b42k`` model should be first considered since they have the best or close to the best performance in most cases.
However, if you are concerned about the encoding speed, you can consider other ViT models because they have higher QPS with decent performance.
Anyway, you should choose the model that best fits your requirements.
For example, if you are labeling images for diabetic retinopathy, you should probably select the ``ViT-B-32::laion2b_s34b_b79k`` model since it has the best top-1 accuracy of 0.734 on zero-shot classification of the Retinopathy dataset.
Or if you are dealing with histopathologic images, you should probably select the RN50::openai model since it has the best top-1 accuracy of 0.636 on zero-shot classification of the Patch Camelyon dataset.


The following sections show the performance of different models in details on different datasets and tasks.


Size and efficiency
-------------------------

We first present the model's size and efficiency in terms of query time and memory usage (including the peak RAM and VRAM usage).
All of the results are obtained on a single Nvidia TITAN RTX GPU (24GB VRAM) with default server settings.

+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| Model                                  | Disk Usage (MB)  | Peak RAM Usage (GB)  | Peak VRAM Usage (GB)  | Text QPS  | Image QPS  |
+========================================+==================+======================+=======================+===========+============+
| RN50::openai                           | 244              | 2.99                 | 1.36                  | 1019      | 269        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN50::yfcc15m                          | 389              | 2.86                 | 1.36                  | 1083      | 262        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN50::cc12m                            | 389              | 2.84                 | 1.36                  | 1064      | 264        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN101::openai                          | 278              | 3.05                 | 1.40                  | 1047      | 222        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN101::yfcc15m                         | 457              | 2.88                 | 1.40                  | 1107      | 223        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN50x4::openai                         | 402              | 3.23                 | 1.63                  | 1047      | 218        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN50x16::openai                        | 631              | 3.63                 | 2.02                  | 1038      | 121        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| RN50x64::openai                        | 1291             | 4.08                 | 2.98                  | 985       | 59         |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-32::openai                       | 338              | 3.20                 | 1.40                  | 1064      | 286        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-32::laion2b_e16                  | 577              | 2.93                 | 1.40                  | 1120      | 292        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-32::laion400m_e31                | 577              | 2.93                 | 1.40                  | 1080      | 287        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-32::laion400m_e32                | 577              | 2.94                 | 1.40                  | 1092      | 289        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-32::laion2b-s34b-b79k            | 577              | 2.94                 | 1.40                  | 1102      | 285        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-16::openai                       | 335              | 3.20                 | 1.44                  | 1064      | 260        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-16::laion400m_e31                | 571              | 2.93                 | 1.44                  | 1099      | 262        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-16::laion400m_e32                | 571              | 2.94                 | 1.44                  | 1082      | 268        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-16-plus-240::laion400m_e31       | 795              | 3.03                 | 1.59                  | 1059      | 235        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-B-16-plus-240::laion400m_e32       | 795              | 3.03                 | 1.59                  | 1043      | 239        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-L-14::openai                       | 890              | 3.66                 | 2.04                  | 1040      | 140        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-L-14::laion400m_e31                | 1631             | 3.43                 | 2.03                  | 1058      | 147        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-L-14::laion400m_e32                | 1631             | 3.42                 | 2.03                  | 1061      | 146        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-L-14::laion2b-s32b-b82k            | 1631             | 3.43                 | 2.03                  | 1069      | 147        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-L-14-336::openai                   | 891              | 3.74                 | 2.23                  | 1070      | 76         |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-H-14::laion2b-s32b-b79k            | 3762             | 4.45                 | 3.26                  | 642       | 91         |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| ViT-g-14::laion2b-s12b-b42k            | 5214             | 5.16                 | 4.00                  | 639       | 69         |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| M-CLIP/LABSE-Vit-L-14                  | 3609             | 4.30                 | 4.70                  | 646       | 284        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| M-CLIP/XLM-Roberta-Large-Vit-B-32      | 4284             | 5.37                 | 1.68                  | 656       | 139        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| M-CLIP/XLM-Roberta-Large-Vit-B-16Plus  | 4293             | 4.30                 | 4.13                  | 662       | 236        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+
| M-CLIP/XLM-Roberta-Large-Vit-L-14      | 4293             | 4.30                 | 4.97                  | 1027      | 139        |
+----------------------------------------+------------------+----------------------+-----------------------+-----------+------------+


Zero-shot performance
----------------------------

In this section, we will report the zero-shot performance of the models on classification and retrieval tasks across different datasets.
In the following tables, we will highlight the best results in bold for each dataset (higher is better).

Zero-shot retrieval
+++++++++++++++++++

In zero-shot retrieval benchmark, each model is evaluated on the following datasets: `COCO Caption <https://github.com/tylin/coco-caption>`_, `Flickr8k <http://hockenmaier.cs.illinois.edu/8k-pictures.html>`_ and `Flickr30k <https://shannon.cs.illinois.edu/DenotationGraph/>`_.
For the above datasets, there are five corresponding description sentences for each image written by humans.
The results are reported in terms of top-5 text-to-image retrieval recall, top-5 image-to-text retrieval recall and their averages.
More specifically, the top-5 text-to-image retrieval recall for each retrieved image is either 1 or 0.
It is 1 if the input text matches one of the image descriptions among the top-5.
The top-5 image-to-text retrieval recall for each image is the number of top-5 retrieved texts matching that image descriptions.

+----------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| Model                            | COCO Caption                              | Flickr 8k                                 | Flickr 30k                                |
|                                  +---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
|                                  | Text to image | Image to text | Average   | Text to image | Image to text | Average   | Text to image | Image to text | Average   |
+==================================+===============+===============+===========+===============+===============+===========+===============+===============+===========+
| RN50::openai                     | 0.529         | 0.728         | 0.629     | 0.504         | 0.690         | 0.597     | 0.392         | 0.621         | 0.506     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN50::yfcc15m                    | 0.361         | 0.534         | 0.447     | 0.238         | 0.394         | 0.316     | 0.146         | 0.278         | 0.212     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN50::cc12m                      | 0.446         | 0.607         | 0.527     | 0.302         | 0.435         | 0.369     | 0.204         | 0.316         | 0.260     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN101::openai                    | 0.555         | 0.745         | 0.650     | 0.523         | 0.694         | 0.608     | 0.415         | 0.629         | 0.522     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN101::yfcc15m                   | 0.376         | 0.549         | 0.463     | 0.251         | 0.417         | 0.334     | 0.156         | 0.296         | 0.226     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN50x4::openai                   | 0.581         | 0.767         | 0.674     | 0.558         | 0.729         | 0.643     | 0.451         | 0.671         | 0.561     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN50x16::openai                  | 0.600         | 0.787         | 0.693     | 0.597         | 0.768         | 0.682     | 0.496         | 0.713         | 0.604     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| RN50x64::openai                  | 0.599         | 0.803         | 0.701     | 0.629         | 0.790         | 0.709     | 0.534         | 0.756         | 0.645     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-32::openai                 | 0.560         | 0.749         | 0.654     | 0.532         | 0.699         | 0.616     | 0.413         | 0.629         | 0.521     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-32::laion2b_e16            | 0.647         | 0.795         | 0.721     | 0.622         | 0.760         | 0.691     | 0.507         | 0.687         | 0.597     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-32::laion400m_e31          | 0.600         | 0.763         | 0.682     | 0.562         | 0.736         | 0.649     | 0.438         | 0.633         | 0.536     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-32::laion400m_e32          | 0.600         | 0.765         | 0.682     | 0.562         | 0.736         | 0.649     | 0.437         | 0.634         | 0.536     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-32::laion2b_s34b_b79k      | 0.654         | 0.798         | 0.726     | 0.629         | 0.778         | 0.703     | 0.513         | 0.694         | 0.603     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-16::openai                 | 0.584         | 0.767         | 0.676     | 0.564         | 0.727         | 0.646     | 0.452         | 0.671         | 0.561     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-16::laion400m_e31          | 0.637         | 0.796         | 0.717     | 0.620         | 0.765         | 0.692     | 0.506         | 0.697         | 0.602     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-16::laion400m_e32          | 0.636         | 0.796         | 0.716     | 0.620         | 0.767         | 0.694     | 0.508         | 0.697         | 0.603     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-16-plus-240::laion400m_e31 | 0.660         | 0.809         | 0.735     | 0.642         | 0.788         | 0.715     | 0.533         | 0.725         | 0.629     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-B-16-plus-240::laion400m_e32 | 0.662         | 0.811         | 0.736     | 0.644         | 0.791         | 0.718     | 0.535         | 0.727         | 0.631     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-L-14::openai                 | 0.610         | 0.793         | 0.702     | 0.599         | 0.767         | 0.683     | 0.494         | 0.717         | 0.605     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-L-14::laion400m_e31          | 0.680         | 0.821         | 0.750     | 0.675         | 0.806         | 0.741     | 0.570         | 0.751         | 0.661     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-L-14::laion400m_e32          | 0.680         | 0.821         | 0.751     | 0.675         | 0.806         | 0.740     | 0.570         | 0.751         | 0.661     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-L-14::laion2b_s32b_b82k      | 0.711         | 0.840         | 0.775     | 0.712         | 0.824         | 0.768     | 0.620         | 0.789         | 0.704     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-L-14-336::openai             | 0.616         | 0.812         | 0.714     | 0.629         | 0.779         | 0.704     | 0.533         | 0.741         | 0.637     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-H-14::laion2b_s32b_b79k      | **0.734**     | **0.861**     | **0.797** | **0.746**     | **0.856**     | **0.801** | **0.657**     | **0.823**     | **0.740** |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+
| ViT-g-14::laion2b_s12b_b42k      | 0.724         | 0.853         | 0.788     | 0.730         | 0.846         | 0.788     | 0.639         | 0.806         | 0.722     |
+----------------------------------+---------------+---------------+-----------+---------------+---------------+-----------+---------------+---------------+-----------+

From the table, we observe that the ViT models outperform the RN models in general.
More specifically, the ``ViT-H-14::laion2b_s32b_b79k`` model and ``ViT-g-14::laion2b_s12b_b42k`` model achieve the best and second-best results on all zero-shot retrieval tasks.
For ViT models, the results of the same base model are better on those pre-trained with larger datasets (e.g., ``ViT-B-32::openai`` vs ``ViT-B-32::laion400m_e31`` vs ``ViT-B-32::laion2b-s34b-b79k``).

Zero-shot classification
++++++++++++++++++++++++

In zero-shot classification benchmark, each model is evaluated on the following datasets: `ImageNetV2 <https://github.com/modestyachts/ImageNetV2>`_, `VOC2007 <http://host.robots.ox.ac.uk/pascal/VOC/voc2007/>`_ and 19 `VTAB datasets <https://github.com/google-research/task_adaptation>`_.
The results are shown in the following table. 
For each dataset, we report the top-1 accuracy, which is whether the top-1 retrieved class of a image matches its true class.

+----------------------------------+------------+-----------+-------------------------------------------------------------------------------------+------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Model                            | ImageNetV2 | VOC2007   | VTAB natural                                                                        | VTAB specialized                                     | VTAB structured                                                                                                                                |
|                                  |            |           +------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
|                                  |            |           | Caltech101 | CIFAR-100 | DTD       | Flowers102 | Pets      | Sun397    | SVHN      | EuroSAT   | Resisc45  | Patch Camelyon | Retinopathy | Clevr/count | Clevr/distance | dSprites/location | dSprites/orientation | SmallNORB/azimuth | SmallNORB/elevation | DMLab     | KITTI/distance |
+==================================+============+===========+============+===========+===========+============+===========+===========+===========+===========+===========+================+=============+=============+================+===================+======================+===================+=====================+===========+================+
| RN50::openai                     | 0.529      | 0.650     | 0.772      | 0.403     | 0.415     | 0.660      | 0.857     | 0.894     | 0.303     | 0.408     | 0.453     | **0.636**      | 0.171       | 0.217       | 0.148          | 0.034             | 0.014                | 0.056             | 0.110               | 0.145     | 0.170          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN50::yfcc15m                    | 0.214      | 0.215     | 0.402      | 0.116     | 0.122     | 0.167      | 0.174     | 0.127     | 0.157     | 0.172     | 0.123     | 0.533          | 0.358       | 0.151       | 0.158          | 0.032             | 0.024                | 0.053             | 0.120               | 0.160     | **0.336**      |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN50::cc12m                      | 0.224      | 0.438     | 0.582      | 0.178     | 0.135     | 0.095      | 0.331     | 0.123     | 0.102     | 0.148     | 0.117     | 0.535          | 0.293       | 0.184       | 0.222          | 0.031             | 0.025                | 0.047             | 0.096               | 0.161     | 0.155          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN101::openai                    | 0.561      | 0.651     | 0.780      | 0.476     | 0.432     | 0.652      | 0.869     | 0.887     | 0.226     | 0.314     | 0.547     | 0.583          | 0.280       | 0.242       | 0.130          | 0.031             | 0.021                | 0.054             | 0.111               | 0.139     | 0.263          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN101::yfcc15m                   | 0.221      | 0.243     | 0.469      | 0.125     | 0.117     | 0.210      | 0.177     | 0.128     | 0.137     | 0.151     | 0.099     | 0.479          | 0.584       | 0.109       | 0.159          | 0.031             | 0.019                | 0.055             | 0.097               | 0.153     | 0.252          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN50x4::openai                   | 0.594      | 0.682     | 0.781      | 0.451     | 0.486     | 0.698      | 0.887     | 0.908     | 0.367     | 0.335     | 0.532     | 0.569          | 0.318       | 0.205       | 0.082          | 0.031             | 0.026                | 0.056             | 0.108               | 0.162     | 0.233          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN50x16::openai                  | 0.643      | 0.680     | 0.810      | 0.522     | 0.524     | 0.724      | 0.898     | 0.917     | 0.409     | 0.433     | 0.589     | 0.625          | 0.715       | 0.195       | 0.213          | 0.030             | 0.026                | 0.050             | 0.116               | 0.146     | 0.229          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| RN50x64::openai                  | 0.670      | 0.740     | 0.834      | 0.598     | 0.531     | 0.788      | 0.936     | 0.931     | 0.481     | 0.577     | 0.628     | 0.539          | 0.073       | 0.227       | 0.200          | 0.034             | 0.025                | 0.056             | 0.125               | 0.158     | 0.311          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-32::openai                 | 0.559      | 0.764     | 0.815      | 0.643     | 0.443     | 0.664      | 0.873     | 0.913     | 0.135     | 0.504     | 0.537     | 0.623          | 0.447       | 0.232       | 0.164          | 0.037             | 0.024                | 0.061             | **0.127**           | 0.193     | 0.274          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-32::laion2b_e16            | 0.573      | 0.788     | 0.831      | 0.754     | 0.539     | 0.691      | 0.893     | 0.933     | 0.388     | 0.503     | 0.619     | 0.506          | 0.195       | 0.192       | 0.167          | 0.031             | 0.024                | 0.052             | 0.110               | 0.189     | 0.176          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-32::laion400m_e31          | 0.523      | 0.731     | 0.818      | 0.678     | 0.521     | 0.659      | 0.856     | 0.918     | 0.220     | 0.470     | 0.510     | 0.549          | 0.259       | 0.155       | 0.161          | 0.033             | 0.021                | 0.053             | 0.117               | 0.173     | 0.122          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-32::laion400m_e32          | 0.523      | 0.733     | 0.817      | 0.677     | 0.523     | 0.658      | 0.854     | 0.917     | 0.223     | 0.476     | 0.510     | 0.548          | 0.240       | 0.153       | 0.161          | 0.033             | 0.021                | 0.054             | 0.117               | 0.173     | 0.118          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-32::laion2b_s34b_b79k      | 0.581      | 0.791     | 0.839      | 0.755     | 0.557     | 0.716      | 0.909     | 0.937     | 0.410     | 0.482     | 0.610     | 0.598          | **0.734**   | 0.153       | 0.189          | 0.029             | **0.034**            | **0.062**         | 0.113               | 0.159     | 0.262          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-16::openai                 | 0.619      | 0.783     | 0.819      | 0.669     | 0.449     | 0.712      | 0.890     | 0.924     | 0.313     | 0.559     | 0.582     | 0.507          | 0.036       | 0.209       | 0.158          | 0.030             | 0.023                | 0.053             | 0.122               | 0.155     | 0.263          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-16::laion400m_e31          | 0.594      | 0.767     | 0.838      | 0.712     | 0.513     | 0.694      | 0.892     | 0.939     | 0.380     | 0.503     | 0.585     | 0.593          | 0.062       | 0.289       | **0.245**      | 0.031             | 0.030                | 0.059             | 0.100               | 0.152     | 0.200          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-16::laion400m_e32          | 0.597      | 0.768     | 0.837      | 0.712     | 0.513     | 0.692      | 0.892     | 0.939     | 0.385     | 0.501     | 0.585     | 0.598          | 0.077       | 0.287       | **0.245**      | 0.032             | 0.029                | 0.060             | 0.099               | 0.151     | 0.183          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-16-plus-240::laion400m_e31 | 0.614      | 0.764     | 0.832      | 0.733     | 0.555     | 0.706      | 0.904     | 0.940     | 0.355     | 0.569     | 0.615     | 0.551          | 0.093       | 0.240       | 0.159          | 0.041             | 0.026                | 0.056             | 0.111               | 0.149     | 0.280          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-B-16-plus-240::laion400m_e32 | 0.615      | 0.764     | 0.833      | 0.738     | 0.555     | 0.711      | 0.902     | 0.940     | 0.362     | 0.581     | 0.613     | 0.551          | 0.095       | 0.238       | 0.160          | **0.043**         | 0.027                | 0.054             | 0.110               | 0.148     | 0.281          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-L-14::openai                 | 0.698      | 0.783     | 0.835      | 0.758     | 0.554     | 0.792      | 0.932     | 0.937     | 0.571     | 0.626     | 0.633     | 0.520          | 0.733       | 0.194       | 0.161          | 0.032             | 0.023                | 0.045             | 0.115               | 0.163     | 0.218          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-L-14::laion400m_e31          | 0.654      | 0.758     | 0.839      | 0.774     | 0.598     | 0.757      | 0.917     | 0.950     | 0.378     | 0.632     | 0.671     | 0.487          | 0.058       | 0.242       | 0.149          | 0.030             | 0.026                | 0.053             | 0.109               | 0.186     | 0.200          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-L-14::laion400m_e32          | 0.654      | 0.756     | 0.839      | 0.774     | 0.605     | 0.756      | 0.919     | 0.950     | 0.380     | 0.622     | 0.675     | 0.493          | 0.061       | 0.243       | 0.149          | 0.030             | 0.026                | 0.053             | 0.110               | 0.186     | 0.203          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-L-14::laion2b_s32b_b82k      | 0.677      | 0.805     | **0.851**  | 0.833     | 0.629     | 0.758      | 0.932     | 0.958     | 0.459     | 0.646     | 0.668     | 0.563          | 0.116       | 0.312       | 0.161          | 0.032             | 0.020                | 0.056             | 0.108               | **0.224** | 0.229          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-L-14-336::openai             | **0.709**  | 0.781     | 0.837      | 0.744     | 0.556     | 0.783      | 0.937     | 0.940     | 0.560     | 0.615     | 0.638     | 0.608          | 0.733       | 0.200       | 0.158          | 0.032             | 0.024                | 0.046             | 0.113               | 0.158     | 0.262          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-H-14::laion2b_s32b_b79k      | **0.709**  | 0.777     | 0.850      | **0.847** | 0.678     | **0.801**  | **0.945** | 0.961     | 0.563     | **0.726** | 0.699     | 0.542          | 0.297       | 0.268       | 0.169          | 0.032             | 0.027                | 0.054             | 0.111               | 0.140     | 0.110          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+
| ViT-g-14::laion2b_s12b_b42k      | 0.696      | **0.811** | **0.851**  | 0.839     | **0.682** | 0.776      | 0.943     | **0.962** | **0.603** | 0.648     | 0.718     | 0.560          | 0.580       | **0.332**   | 0.175          | 0.036             | 0.031                | 0.060             | 0.115               | 0.190     | 0.138          |
+----------------------------------+------------+-----------+------------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+----------------+-------------+-------------+----------------+-------------------+----------------------+-------------------+---------------------+-----------+----------------+

From the table, we observe that the ViT models still outperform the RN models in most tasks, except for the Patch Camelyon dataset where ``RN50::openai`` has the best top-1 accuracy of 0.636, and the KITTI/distance dataset where ``RN50::yfcc15m`` has the best result of 0.336.
Similar to retrieval results, the ``ViT-H-14::laion2b_s32b_b79k`` model and ``ViT-g-14::laion2b_s12b_b42k`` model still have the best or close to the best results on 12/21 zero-shot classification tasks.
All models tend to perform well on ImageNetV2, VOC2007, VTAB natural and VTAB specialized (except for Retinopathy) datasets, whereas they perform poorly on VTAB structured datasets.
We do not observe any significant difference between the ViT models of the same base model. 

Appendix: Datasets description
------------------------------

* **COCO Caption** [1]_: The dataset contains over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions are provided.

* **Flickr 8k** [2]_: The dataset consists of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events. The images were chosen from six different Flickr groups, and tend not to contain any well-known people or locations, but were manually selected to depict a variety of scenes and situations.

* **Flickr 30k** [3]_: The dataset is an extension of the Flickr 8k Dataset. It consists of 158,915 crowd-sourced captions describing 31,783 images.

* **ImageNetV2** [4]_: ImageNetV2 contains three test sets with 10,000 new images each. Importantly, these test sets were sampled after a decade of progress on the original ImageNet dataset. This makes the new test data independent of existing models and guarantees that the accuracy scores are not affected by adaptive overfitting.

* **VOC2007** [5]_: The training data provided consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object in one of the twenty classes present in the image. Note that multiple objects from multiple classes may be present in the same image.

* **VTAB natural group** [6]_: The natural group represents classical vision problems. These tasks contain natural images captured using standard cameras. The classes may represent generic, fine-grained, or abstract objects.

  * **Caltech101**: The task consists in classifying pictures of objects (101 classes plus a background clutter class), including animals, airplanes, chairs, or scissors. The image size varies, but it typically ranges from 200-300 pixels per edge.

  * **CIFAR-100**: The task consists in classifying natural images (100 classes, with 500 training images each). Some examples include apples, bottles, dinosaurs, and bicycles. The image size is 32x32.

  * **DTD**: The task consists in classifying images of textural patterns (47 classes, with 120 training images each). Some of the textures are banded, bubbly, meshed, lined, or porous. The image size ranges between 300x300 and 640x640 pixels.

  * **Flowers102**: The task consists in classifying images of flowers present in the UK (102 classes, with between 40 and 248 training images per class). Azalea, Californian Poppy, Sunflower, or Petunia are some examples. Each image dimension has at least 500 pixels.

  * **Pets**: The task consists in classifying pictures of cat and dog breeds (37 classes with around 200 images each), including Persian cat, Chihuahua dog, English Setter dog, or Bengal cat. Images dimensions are typically 200 pixels or larger.

  * **Sun397**: The Sun397 task is a scenery benchmark with 397 classes and, at least, 100 images per class. Classes have a hierarchy structure, and include cathedral, staircase, shelter, river, or archipelago. The images are (colour) 200x200 pixels or larger.

  * **SVHN**: This task consists in classifying images of Google's street-view house numbers (10 classes, with more than 1000 training images each). The image size is 32x32 pixels.

* **VTAB specialized group**: The specialized group also contains images of the world, but captured through specialist equipment. These images have different invariances to those in the specialized tasks. Nonetheless, humans recognize the structures therein, thus generic visual representations should also capture the visual concepts. It two sub-groups: remote sensing, and medical.

  * **EuroSAT**: The task consists in classifying Sentinel-2 satellite images into 10 different types of land use (Residential, Industrial, River, Highway, etc). The spatial resolution corresponds to 10 meters per pixel, and the image size is 64x64 pixels.

  * **Resisc45**: The Remote Sensing Image Scene Classification (RESISC) dataset is a scene classification task from remote sensing images. There are 45 classes, containing 700 images each, including tennis court, ship, island, lake, parking lot, sparse residential, or stadium. The image size is RGB 256x256 pixels.

  * **Patch Camelyon**: The Patch Camelyon dataset contains 327,680 images of histopathologic scans of lymph node sections. The classification task consists in predicting the presence of metastatic tissue in given image (i.e., two classes). All images are 96x96 pixels.

  * **Retinopathy**: The Diabetic Retinopathy dataset consists of image-label pairs with high-resolution retina images, and labels that indicate the presence of Diabetic Retinopahy (DR) in a 0-4 scale (No DR, Mild, Moderate, Severe, or Proliferative DR).

* **VTAB structured group**: The structured group assesses comprehension of the structure of a scene, for example, object counting, or 3D depth prediction. Most of these tasks are generated from simulated environments, whose structure is easy for a human to determine, but whose domain differs greatly to datasets like ImageNet. These tasks are intended as a step towards useful representations for perceptual control.

  * **Clevr/count**: CLEVR is a visual question and answer dataset designed to evaluate algorithmic visual reasoning. We use just the images from this dataset, and create a synthetic task by setting the label equal to the number of objects in the images.

  * **Clevr/distance**: Another synthetic task we create from CLEVR consists of predicting the depth of the closest object in the image from the camera. The depths are bucketed into size bins.

  * **dSprites/location**: The dSprites dataset was originally designed to asses disentanglement properties of unsupervised learning algorithms. In particular, each image is a 2D shape where six factors are controlled: color, shape, scale, rotation, and (x,y) center coordinates. Images have 64x64 black-and-white pixels. This task consists in predicting the x (horizontal) coordinate of the object. The locations are bucketed into 16 bins.

  * **dSprites/orientation**: We create another task from dSprites consists in predicting the orientation of each object, bucketed into 16 bins.

  * **SmallNORB/azimuth**: The Small NORB dataset contains images of 3D-toys from 50 classes, including animals, human figures, airplanes, trucks, and cars. The image size is 640x480 pixels. In this case, we define labels depending on the azimuth (angle of horizontal deviation), in intervals of 20 degrees (18 classes).

  * **SmallNORB/elevation**: Another synthetic task we create from Small NORB consists in predicting the elevation in the image. There are 9 classes, corresponding to 9 different elevations ranging from 30 to 70 degrees, in intervals of 5 degrees.

  * **DMLab**: The DMLab (DeepMind Lab) is a set of control environments focused on 3D navigation and puzzle-solving tasks. The Dmlab dataset contains frames observed by the agent acting in the DeepMind Lab environment, which are annotated by the distance between the agent and various objects present in the environment. The goal is to evaluate the ability of a visual model to reason about distances from the visual input in 3D environments. The Dmlab dataset consists of 360x480 color images in 6 classes. The classes are {close, far, very far} x {positive reward, negative reward} respectively.

  * **KITTI-Dist**: The KITTI task consists in predicting the (binned) depth to the vehicle (car, van, or truck) in the image. There are 4 bins / classes.

.. [1] https://arxiv.org/pdf/1504.00325.pdf
.. [2] https://www.kaggle.com/datasets/adityajn105/flickr8k
.. [3] https://shannon.cs.illinois.edu/DenotationGraph/
.. [4] https://github.com/modestyachts/ImageNetV2
.. [5] http://host.robots.ox.ac.uk/pascal/VOC/voc2007/
.. [6] https://arxiv.org/pdf/1910.04867.pdf


================================================
FILE: docs/user-guides/client.md
================================================
# Client API

CLIP-as-service is designed in a client-server architecture. You can use `clip_client` to send images and texts to the server and receive the responses from the server. Right now, `clip_client` provides encoding, ranking, indexing, and searching functionalities. Additionally, it has many nice designs for speeding up the processing of a large amount of data:

- Streaming: request sending is *not* blocked by the response receiving. Sending and receiving are two separate streams that run in parallel. Both are independent and each have separate internal buffer.
- Batching: large requests are segmented into small batches and send in a stream. 
- Low memory footprint: only load data when needed.
- Sync/async interface: provide `async` interface that can be easily integrated into other asynchronous system.
- Auto-detect images and text input.
- Support gRPC, HTTP, Websocket protocols with their TLS counterparts. 


```{tip}
You will need to install `clip_client` first in Python 3.7+: `pip install clip-client`.
```

(construct-client)=
## Construct client

To use `clip_client`, you need to first construct a Client object, e.g.:

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:23456')
```

The URL-like scheme `grpc://0.0.0.0:23456` is what you get after {ref}`running the server<server-address>`. The scheme follows the format `scheme://netloc:port`:


| Field    | Description                                                                                                                                                                                 | Example       |
| -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| `scheme` | The protocol of the server, must be one of `grpc`, `websocket`, `http`, `grpcs`, `websockets`, `https`. Protocols end with `s` are TLS encrypted. This must match with the server protocol. | `grpc`        |
| `netloc` | The server's IP address or hostname                                                                                                                                                         | `192.168.0.3` |
| `port`   | The public port of the server                                                                                                                                                               | `51234`       |
            

## Encoding

`clip_client` provides {func}`~clip_client.client.Client.encode` function that allows you to send sentences, images to the server in a streaming and sync/async manner. Encoding here means getting the fixed-length vector representation of a text or image.

{func}`~clip_client.client.Client.encode` supports two basic input types:

- **An iterable of `str`**, e.g. `List[str]`, `Tuple[str]`, `Generator[str]` are all acceptable.
- **An iterable of {class}`~docarray.document.Document`**, e.g. `List[Document]`, {class}`~docarray.array.document.DocumentArray`, `Generator[Document]` are all acceptable.

Depending on the input, the output of {func}`~clip_client.client.Client.encode` is different:

- If the input is an iterable of `str`, then the output will be a `numpy.ndarray`.
- If the input is an iterable of `Document`, then the output will be a `DocumentArray`.

Now let's look at these two cases in details.

### Input as iterable of strings

- Input: each string element is auto-detected as a sentence or an image.
- Output: a `[N, D]` shape `numpy.ndarray`, where `N` is the length of the input and `D` is the CLIP embedding size. Each row corresponds to the embedding of the input object.

Any URI-like string, including relative, absolute file path, http/https path, data URI string will be considered as an image. Otherwise, it will be considered as a sentence. 

For example,

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:23456')

r = c.encode(
    [
        'she smiled, with pain',
        'apple.png',
        'https://clip-as-service.jina.ai/_static/favicon.png',
        'data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7',
    ]
)
print(r)
```

gives you

```text
[[-0.09136295  0.42720157 -0.05784469 ... -0.42873043  0.04472527
   0.4437953 ]
 [ 0.43152636  0.1563695  -0.09363698 ... -0.11514216  0.1865044
   0.15025651]
 [ 0.42862126  0.17757078  0.08584607 ...  0.23284511 -0.00929402
   0.10993651]
 [ 0.4706376  -0.01384148  0.3877237  ...  0.1995864  -0.22621225
  -0.4837676 ]]
```

### Input as iterable of Documents

```{tip}
This feature uses [DocArray](https://docarray.jina.ai), which is installed together with `clip_client` as an upstream dependency. You do not need to install DocArray separately.
```

If auto-detection on a list of raw string is too "sci-fi" to you, then you may use `docarray.Document` to make the input more explicit and organized. `Document` can be used as a container to easily represent a sentence or an image.

- Input: each `Document` must be filled with `.text` or `.uri` or `.blob` or `.tensor` attribute. 
  - `Document` filled with `.text` is considered as sentence;
  - `Document` filled with `.uri` or `.blob` or `.tensor` is considered as image. If `.tensor` is filled, then its shape must be in `[H, W, C]` format.
- Output: a `DocumentArray` of the same input length. Each `Document` object in it is the same one from the input and is now filled with `.embedding` attribute. The order of the output is the same as the input.

```{note}
If the input `Document` is filled with both `.text` and `.uri`, then `.text` will be used.
```

```{caution}
The correctness of result and the order of output rely on the uniqueness of id of the input `Document`. The id will be implicitly generated if not provided. If you set the id manually, then you must make sure the id is unique, otherwise the results will not be complete.
```

The explicitness comes from now you have to put the content into the `Document` attributes. For example, we can rewrite the above example as below:

```python
from clip_client import Client
from docarray import Document

c = Client('grpc://0.0.0.0:23456')

da = [
    Document(text='she smiled, with pain'),
    Document(uri='apple.png'),
    Document(uri='apple.png').load_uri_to_image_tensor(),
    Document(blob=open('apple.png', 'rb').read()),
    Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    Document(
        uri='data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7'
    ),
]

r = c.encode(da)
```

Instead of sending a list of `Document`, you can also wrap it with a `DocumentArray` and then send it:

```python
r = c.encode(DocumentArray(da))
```

Now that the return result is a `DocumentArray`, we can get a summary of it using `r.summary()`.

```text
╭──────────────────────────── Documents Summary ─────────────────────────────╮
│                                                                            │
│   Length                        6                                          │
│   Homogenous Documents          False                                      │
│   4 Documents have attributes   ('id', 'mime_type', 'uri', 'embedding')    │
│   1 Document has attributes     ('id', 'mime_type', 'text', 'embedding')   │
│   1 Document has attributes     ('id', 'embedding')                        │
│                                                                            │
╰────────────────────────────────────────────────────────────────────────────╯
╭────────────────────── Attributes Summary ───────────────────────╮
│                                                                 │
│   Attribute   Data type      #Unique values   Has empty value   │
│  ─────────────────────────────────────────────────────────────  │
│   embedding   ('ndarray',)   6                False             │
│   id          ('str',)       6                False             │
│   mime_type   ('str',)       5                False             │
│   text        ('str',)       2                False             │
│   uri         ('str',)       4                False             │
│                                                                 │
╰─────────────────────────────────────────────────────────────────╯
```

To get the embedding of all Documents, simply call `r.embeddings`:

```text
[[-0.09136295  0.42720157 -0.05784469 ... -0.42873043  0.04472527
   0.4437953 ]
 [ 0.43152636  0.1563695  -0.09363698 ... -0.11514216  0.1865044
   0.15025651]
 [ 0.43152636  0.1563695  -0.09363698 ... -0.11514216  0.1865044
   0.15025651]
 [ 0.42862126  0.17757078  0.08584607 ...  0.23284511 -0.00929402
   0.10993651]
 [ 0.4706376  -0.01384148  0.3877237  ...  0.1995864  -0.22621225
  -0.4837676 ]]
```

```{tip}
Reading an image file into bytes and put into `.blob` is possible as shown above. However, it is often unnecessary. Especially if you have a lot of images, loading all of them into memory is not a good idea. Rule of thumb, always use `.uri` and trust `clip_client` to handle it well. 
```

### Async encoding

To encode `Document` in an asynchronous manner, one can use {func}`~clip_client.client.Client.aencode`.

```{tip}
Despite the sexy word "async", many data scientists have misconceptions about asynchronous behavior. And their motivation of using async function is often wrong. _Async is not a silver bullet._ In a simple language, you will only need `.aencode()` when there is another concurrent task that is also async. Then you want to "overlap" the time spending of these two tasks.

If your system is sync by design, there is nothing wrong about it. Go with `encode()` until you see a clear advantage of using `aencode()`, or until your boss tell you to do so.   
```

In the following example, there is another job `another_heavylifting_job` to represent a job like writing to database, downloading large file.

```python
import asyncio
from clip_client import Client

c = Client('grpc://0.0.0.0:23456')


async def another_heavylifting_job():
    # can be writing to database, downloading large file
    # big IO ops
    await asyncio.sleep(3)


async def main():
    t1 = asyncio.create_task(another_heavylifting_job())
    t2 = asyncio.create_task(c.aencode(['hello world'] * 100))
    await asyncio.gather(t1, t2)


asyncio.run(main())
```

The final time cost will be less than `3s + time(t2)`.

(rank-api)=
## Ranking

```{tip}
This feature is only available with `clip_server>=0.3.0`.
```

One can also rank cross-modal matches via {meth}`~clip_client.client.Client.rank` or {meth}`~clip_client.client.Client.arank`. First construct a cross-modal `Document` where the root contains an image and `.matches` contain sentences to rerank. One can also construct text-to-image rerank as below:

````{tab} Given image, rank sentences

```python
from docarray import Document

d = Document(
    uri='.github/README-img/rerank.png',
    matches=[
        Document(text=f'a photo of a {p}')
        for p in (
            'control room',
            'lecture room',
            'conference room',
            'podium indoor',
            'television studio',
        )
    ],
)
```

````

````{tab} Given sentence, rank images

```python
from docarray import Document

d = Document(
    text='a photo of conference room',
    matches=[
        Document(uri='.github/README-img/4.png'),
        Document(uri='.github/README-img/9.png'),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    ],
)
```

````


Then call `rank`, you can feed it with multiple Documents as a list:

```python
from clip_client import Client

c = Client(server='grpc://0.0.0.0:23456')
r = c.rank([d])

print(r['@m', ['text', 'scores__clip_score__value']])
```

Finally, in the return you can observe the matches are re-ranked according to `.scores['clip_score']`:

```text
[['a photo of a television studio', 'a photo of a conference room', 'a photo of a lecture room', 'a photo of a control room', 'a photo of a podium indoor'], 
[0.9920725226402283, 0.006038925610482693, 0.0009973491542041302, 0.00078492151806131, 0.00010626466246321797]]
```

(indexing)=
## Indexing

```{tip}
This feature is only available with `clip_client>=0.7.0`, and the server is running with 
a FLOW consisting of encoder and indexer.
``` 

You can index Documents via {func}`~clip_client.client.Client.index` or {func}`~clip_client.client.Client.aindex`. 

```python
from clip_client import Client
from docarray import Document

c = Client('grpc://0.0.0.0:23456')

da = [
    Document(text='she smiled, with pain'),
    Document(uri='apple.png'),
    Document(uri='apple.png').load_uri_to_image_tensor(),
    Document(blob=open('apple.png', 'rb').read()),
    Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    Document(
        uri='data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7'
    ),
]

r = c.index(da)
```
Now that the return result is a DocumentArray, we can get a summary of it.

```text
╭──────────────────────────── Documents Summary ─────────────────────────────╮
│                                                                            │
│   Length                        6                                          │
│   Homogenous Documents          False                                      │
│   4 Documents have attributes   ('id', 'mime_type', 'uri', 'embedding')    │
│   1 Document has attributes     ('id', 'mime_type', 'text', 'embedding')   │
│   1 Document has attributes     ('id', 'embedding')                        │
│                                                                            │
╰────────────────────────────────────────────────────────────────────────────╯
╭────────────────────── Attributes Summary ───────────────────────╮
│                                                                 │
│   Attribute   Data type      #Unique values   Has empty value   │
│  ─────────────────────────────────────────────────────────────  │
│   embedding   ('ndarray',)   6                False             │
│   id          ('str',)       6                False             │
│   mime_type   ('str',)       5                False             │
│   text        ('str',)       2                False             │
│   uri         ('str',)       4                False             │
│                                                                 │
╰─────────────────────────────────────────────────────────────────╯
```

The `embedding` is the output of the encoder, which is a 512-dim vector. 
Now we can use the indexer to search for the indexed Documents.


(searching)=
## Searching

```{tip}
This feature is only available with `clip_client>=0.7.0`, and the server is running with 
a FLOW consisting of encoder and indexer.
``` 

You can use {func}`~clip_client.client.Client.search` or {func}`~clip_client.client.Client.asearch`
to search for relevant Documents in the index for a given query.

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:23456')

result = c.search(['smile'], limit=2)

print(result['@m', ['text', 'scores__cosine']])
```

The results will look like this, the most relevant doc is "she smiled, with pain" with the cosine distance of 0.096. And the apple image has the cosine distance of 0.799.
```text
[['she smiled, with pain', ''], [{'value': 0.09604918956756592}, {'value': 0.7994111776351929}]]
```
You can set the `limit` parameter (default is `10`) to control the number of the most similar documents to be retrieved.


(profiling)=
## Profiling

You can use {func}`~clip_client.client.Client.profile` to give a quick test on the server to make sure everything is good. 

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:23456')

c.profile()
```

This give you a tree-like table showing the latency and percentage.

```text
 Roundtrip  16ms  100%                                                          
├──  Client-server network  12ms  75%                                           
└──  Server  4ms  25%                                                           
    ├──  Gateway-CLIP network  0ms  0%                                          
    └──  CLIP model  4ms  100%      
```

Under the hood, `.profile()` sends a single empty Document to the CLIP-server for encoding and calculates a summary of latency. The above tree can be read as follows:  

- From calling `client.encode()` to returning the results, everything counted, takes 16ms to finish.
- Among them the time spent on the server is 4ms, the remaining 12ms is spent on the client-server communication, request packing, response unpacking.
- During the 4ms server processing time, CLIP model takes 4ms, whereas the [Gateway](https://docs.jina.ai/fundamentals/architecture-overview/#architecture-overview) to CLIP communication takes no time.

`.profile()` can also take a string argument and asks CLIP-server to encode it. This string can be a sentence, local/remote image file URI. For example:

```python
c.profile('hello, world')
c.profile('apple.png')
c.profile('https://docarray.jina.ai/_static/favicon.png')
```


Single query latency is often very fluctuated. Running `.profile()` multiple times may give you different results. Nonetheless, it helps you understand who to blame if CLIP-as-service is running slow for you: the network? the computation? But certainly not this software itself.


## Best practices

In this section, we will show you some best practices for using this client. We will use encoding as an example. The same applies to all other methods.

### Control batch size

You can specify `.encode(..., batch_size=8)` to control how many `Document`s are sent in each request. You can play this number and find the perfect balance between network transmission and GPU utilization. 

Intuitively, setting `batch_size=1024` should result in very high GPU utilization on each request. However, a large batch size like this also means sending each request would take longer. Given that `clip-client` is designed with request and response streaming, large batch size would not benefit from the time overlapping between request streaming and response streaming.

### Control prefetch size

To control the number of in-flight batches, you can use the `.encode(..., prefetch=100)` option. 
The way this works is that when you send a large request, the outgoing request stream will usually finish before the incoming response stream due to the asynchronous design. 
This is because the request handling is typically time-consuming, which can prevent the server from sending back the response and may cause it to close the connection as it thinks the incoming channel is idle. 
By default, the client is set to a prefetch value of 100. However, it is recommended to use a lower value for expensive operations and a higher value for faster response times.

For more information about client prefetching, please refer to [Rate Limit](https://docs.jina.ai/concepts/client/rate-limit/) in Jina documentation.

### Show progressbar

You can use `.encode(..., show_progress=True)` to turn on the progress bar.

```{figure} images/client-pgbar.gif
:width: 80%
```

```{hint}
Progress bar may not show up in the PyCharm debug terminal. This is an upstream issue of `rich` package.
```

### Processing large number of Documents

Here are some suggestions when encoding a large number of `Document`s:

1. Use `Generator` as input to load data on-demand. You can put your data into a Generator and feed to `.encode`:
    ```python
    def data_gen():
        for _ in range(100_000):
            yield Document(uri=...)


    c = Client(...)
    c.encode(data_gen())
    ```
    Yield raw strings is also acceptable, e.g. to encode all images under a directory, you can simply do:
    ```python
    from glob import iglob

    c.encode(iglob('**/*.png'))
    ```
2. Adjust the `batch_size` parameters.
3. Adjust the `prefetch` parameters.
4. Turn on the progressbar.

````{danger}
In any case, avoiding the following coding:

```python
for d in big_list:
    c.encode([d])
```

This is extremely slow as only one document is encoded at a time, it is a bad utilization of the network and not leveraging any duplex streaming.
````

### Custom callback

`clip_client` by default collects all the results and returns them to users. However, if you want to process the results on-the-fly, you can also pass a callback function when sending the request. For example, you can use the callback to save the results to a database, or render the results to a webpage. Specifically, you can specify any of the three callback functions: `on_done`, `on_error`, and `on_always`.

- `on_done` is executed while streaming, after successful completion of each request
- `on_error` is executed while streaming, whenever an error occurs in each request
- `on_always` is always performed while streaming, no matter the success or failure of each request

Note that these callbacks only work for requests (and failures) inside the stream. For `on_error`, if the failure is due to an error happening outside of streaming, then it will not be triggered. For example, a `SIGKILL` from the client OS during the handling of the request, or a networking issue, will not trigger the callback. Learn more about [handling exceptions in `on_error`](https://docs.jina.ai/concepts/client/callbacks/#handle-exceptions-in-callbacks).

Callback functions take a `Response` of the type DataRequest, which contains resulting Documents, parameters, and other information. Learn more about [handling `DataRequest` in callbacks](https://docs.jina.ai/concepts/client/callbacks/#handle-datarequest-in-callbacks).

In the following example, we will use `on_done` to save the results to a database. We use a simple `dict` to simulate the database. The error is saved to log file using `on_error`. `on_always` will print the number of documents processed in each request.

```python
from clip_client import Client

db = {}


def my_on_done(resp):
    for doc in resp.docs:
        db[doc.id] = doc


def my_on_error(resp):
    with open('error.log', 'a') as f:
        f.write(resp)


def my_on_always(resp):
    print(f'{len(resp.docs)} docs processed')


c = Client('grpc://0.0.0.0:12345')
c.encode(
    ['hello', 'world'], on_done=my_on_done, on_error=my_on_error, on_always=my_on_always
)
```

```{note}
If either `on_done` or `on_always` is specified, the default behavior of returning the results is disabled. You need to handle the results yourself.
```

### Client parallelism

In case you instanciate a `clip_client` object using the `grpc` protocol, keep in mind that `grpc` clients cannot be used in a multi-threaded environment (check [this gRPC issue](https://github.com/grpc/grpc/issues/25364) for reference).
What you should do, is to rely on asynchronous programming or multi-processing rather than multi-threading.

To use `clip_client` in a Flask application, you can introduce multi-processing based parallelism to your app using `gunicorn`: 

```bash
gunicorn -w 4 -b 127.0.0.1:4000 myproject:app
```

To use `clip_client` in a FastAPI application, you have to manually restrict the thread number to 1 at the starting state of the app:

```python   
import uvicorn
from fastapi import FastAPI
from clip_client import Client
from anyio.lowlevel import RunVar
from anyio import CapacityLimiter

c = Client('grpc://0.0.0.0:51001')
app = FastAPI()

@app.on_event("startup")
def startup():
    print("start")
    RunVar("_default_thread_limiter").set(CapacityLimiter(1))

@app.post("/")
def encode():
    r =  c.encode(['Hello world', 'Hello Jina'])
    print(r)
```

Then it can run with multiprocessing using

```bash
gunicorn myproject:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:4000
```

## Appendix: Plain HTTP request via `curl`

```{tip}
Sending large embeddings over plain HTTP is often not the best idea. Websocket is often a better choice, allows one to call clip-server from Javascript with much better performance.   
```

If your {ref}`server is spawned<flow-config>` with `protocol: http` and `cors: True`, then you do not need to call the server via Python client. You can simply do it via `curl` or Javascript by sending a JSON to `http://address:port/post`. Notice, the `/post` endpoint at the end. For example,

To encode sentences:

```{code-block} bash
---
emphasize-lines: 3
---
curl -X POST http://0.0.0.0:51000/post \ 
     -H 'Content-Type: application/json' \
     -d '{"data":[{"text": "First do it"}, {"text": "then do it right"}, {"text": "then do it better"}], "execEndpoint":"/"}'
```

To encode a local image, you need to load it as base64 string and put into the `blob` field, and be careful with the quotes there:

```{code-block} bash
---
emphasize-lines: 3
---
curl -X POST http://0.0.0.0:51000/post \ 
     -H 'Content-Type: application/json' \
     -d '{"data":[{"text": "First do it"}, {"blob":"'"$( base64 test-1.jpeg)"'" }], "execEndpoint":"/"}'
```

To encode a remote image, you can simply put its address into `uri` field:

```{code-block} bash
---
emphasize-lines: 3
---
curl -X POST http://0.0.0.0:51000/post \ 
     -H 'Content-Type: application/json' \
     -d '{"data":[{"text": "First do it"}, {"uri": "https://clip-as-service.jina.ai/_static/favicon.png"}], "execEndpoint":"/"}'
```

Run it, you will get:

```json
{"header":{"requestId":"8b1f4b419bc54e95ab4b63cc086233c9","status":null,"execEndpoint":"/","targetExecutor":""},"parameters":null,"routes":[{"executor":"gateway","startTime":"2022-04-01T15:24:28.267003+00:00","endTime":"2022-04-01T15:24:28.328868+00:00","status":null},{"executor":"clip_t","startTime":"2022-04-01T15:24:28.267189+00:00","endTime":"2022-04-01T15:24:28.328748+00:00","status":null}],"data":[{"id":"b15331b8281ffde1e9fb64005af28ffd","parent_id":null,"granularity":null,"adjacency":null,"blob":null,"tensor":null,"mime_type":"text/plain","text":"hello, world!","weight":null,"uri":null,"tags":null,"offset":null,"location":null,"embedding":[-0.022064208984375,0.1044921875, ..., -0.1363525390625,-0.447509765625],"modality":null,"evaluations":null,"scores":null,"chunks":null,"matches":null}]}
```

The embedding is inside `.data[].embedding`. If you have [jq](https://stedolan.github.io/jq/) installed, you can easily filter the embeddings out via:

```{code-block} bash
---
emphasize-lines: 4
---
curl -X POST http://0.0.0.0:51000/post \
     -H 'Content-Type: application/json' \
     -d '{"data":[{"text": "hello, world!"}, {"blob":"'"$( base64 test-1.jpeg)"'" }], "execEndpoint":"/"}' | \
     jq -c '.data[] | .embedding'
```

```text
[-0.022064208984375,0.1044921875,...]
[-0.0750732421875,-0.166015625,...]
```


================================================
FILE: docs/user-guides/faq.md
================================================
# FAQ

This is a list of Frequently Asked Questions about CLIP-as-service. Feel free to suggest new entries!


What is CLIP model?
: Developed by OpenAI, CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. The original CLIP Github repository [is here](https://github.com/openai/CLIP). The introduction of the CLIP model can [be found here](https://openai.com/blog/clip/).

Do I need to install `clip-server` and `clip-client` together?
: No. You can install them separately on different machines. For example, on a GPU server, you just need `clip-server`; on your laptop, you just need `clip-client`.

What is CLIP-as-service based on? The codebase seems quite small
: CLIP-as-service leverages features from [Jina](https://github.com/jina-ai/jina), which itself utilizes [DocArray](https://github.com/jina-ai/docarray). Thanks to them CLIP-as-service can be quickly built with solid infrastructure and rich features.

I had this AioRpcError, what should I do?
: If you encounter the following errors, it means you client can not connect to the server.

  ```text
     GRPCClient@28632[E]:gRPC error: StatusCode.UNAVAILABLE failed to connect to all addresses
  the ongoing request is terminated as the server is not available or closed already
  ```

  ```text
    AioRpcError: <AioRpcError of RPC that terminated with:
            status = StatusCode.UNAVAILABLE
            details = "failed to connect to all addresses"
            debug_error_string =
    "{"created":"@1648074480.571952000","description":"Failed to pick subchannel",
    "file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":312
    9,"referenced_errors":[{"created":"@1648074480.571952000","description":"faile
    d to connect to all addresses","file":"src/core/lib/transport/error_utils.cc",
    "file_line":163,"grpc_status":14}]}"
  ```

  You can try `.profile()` to {ref}`confirm it<profiling>`. If it still throws the same error, then your connection is broken.

  While it is hard to pinpoint a network problem, also out of the scope of CLIP-as-service, we here provide you a checklist that may help you to diagnose the problem: 
  - Are the IP address, port, and protocol all correct?
  - Is client and server under the same network, or a different network?
  - Is your server down?
  - Is server's port open to public?
  - Is there a firewall on the server side that restricts the port?
  - Is there a firewall on the client side that restricts the port?
  - Is the security group (on Cloud providers) correctly configured?

Why "CLIP-as-service" why not "CLIP-as-a-service"
: Kind of pay homage to BERT-as-service. It is not about grammatically correct anyhow.

What happened to the BERT-as-service.
: There has been no maintenance of BERT-as-service since Feb. 2019.

  CLIP-as-service is a huge upgrade of BERT-as-service, with more powerful universal embedding models that can handle both images and texts; and more solid and efficient microservice infrastructure developed in the last 2 years by Jina AI. The high-level API, especially the client side, is a drop-in replacement of the old BERT-as-service.

Where can I find the old codebase of BERT-as-service.
: In the [`bert-as-service` branch](https://github.com/jina-ai/clip-as-service/tree/bert-as-service) of the repository.

================================================
FILE: docs/user-guides/finetuner.md
================================================
(Finetuner)=
# Fine-tune Models

Although CLIP-as-service has provided you a list of pre-trained models, you can also fine-tune your models. 
This guide will show you how to use [Finetuner](https://finetuner.jina.ai) to fine-tune models and use them in CLIP-as-service.

For installation and basic usage of Finetuner, please refer to [Finetuner documentation](https://finetuner.jina.ai).
You can also [learn more details about fine-tuning CLIP](https://finetuner.jina.ai/tasks/text-to-image/).

This tutorial requires `finetuner >=v0.6.4`, `clip_server >=v0.6.0`.

## Prepare Training Data

Finetuner accepts training data and evaluation data in the form of {class}`~docarray.array.document.DocumentArray`.
The training data for CLIP is a list of (text, image) pairs.
Each pair is stored in a {class}`~docarray.document.Document` which wraps two [`chunks`](https://docarray.jina.ai/fundamentals/document/nested/) with `image` and `text` modality respectively.
You can push the resulting {class}`~docarray.array.document.DocumentArray` to the cloud using the {meth}`~docarray.array.document.DocumentArray.push` method.

We use [fashion captioning dataset](https://github.com/xuewyang/Fashion_Captioning) as a sample dataset in this tutorial.
The following are examples of descriptions and image urls from the dataset.
We also include a preview of each image.

| Description                                                                                                                           | Image URL                                                                                                                                                           | Preview                                                                                                        |
|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
| subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link                                 | [https://n.nordstrommedia.com/id/sr3/<br/>58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg](https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg?raw=true" width=100px> |
| high quality leather construction defines a hearty boot one-piece on a tough lug sole                                                 | [https://n.nordstrommedia.com/id/sr3/<br/>21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg](https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg?raw=true" width=100px> |
| this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line | [https://n.nordstrommedia.com/id/sr3/<br/>1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg](https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg) | <img src="https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg?raw=true" width=100px> |
| ...                                                                                                                                   | ...                                                                                                                                                                 | ...                                                                                                            |

You can use the following script to transform the first three entries of the dataset to a {class}`~docarray.array.document.DocumentArray` and push it to the cloud using the name `fashion-sample`.

```python
from docarray import Document, DocumentArray

train_da = DocumentArray(
    [
        Document(
            chunks=[
                Document(
                    content='subtly futuristic and edgy this liquid metal cuff bracelet is shaped from sculptural rectangular link',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='high quality leather construction defines a hearty boot one-piece on a tough lug sole',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/21e7a67c-0a54-4d09-a4a4-6a0e0840540b.jpeg',
                    modality='image',
                ),
            ],
        ),
        Document(
            chunks=[
                Document(
                    content='this shimmering tricot knit tote is traced with decorative whipstitching and diamond cut chain the two hallmark of the falabella line',
                    modality='text',
                ),
                Document(
                    uri='https://n.nordstrommedia.com/id/sr3/1d8dd635-6342-444d-a1d3-4f91a9cf222b.jpeg',
                    modality='image',
                ),
            ],
        ),
    ]
)
train_da.push('fashion-sample')
```

The full dataset has been converted to `clip-fashion-train-data` and `clip-fashion-eval-data` and pushed to the cloud which can be directly used in Finetuner.

## Start Finetuner

You may now create and run a fine-tuning job after login to Jina ecosystem.

```python
import finetuner

finetuner.login()
run = finetuner.fit(
    model='ViT-B-32::openai',
    run_name='clip-fashion',
    train_data='clip-fashion-train-data',
    eval_data='clip-fashion-eval-data',  # optional
    epochs=5,
    learning_rate=1e-5,
    loss='CLIPLoss',
    to_onnx=True,
)
```

After the job started, you may use {meth}`~finetuner.run.Run.status` to check the status of the job.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
print(run.status())
```

When the status is `FINISHED`, you can download the tuned model to your local machine.

```python
import finetuner

finetuner.login()
run = finetuner.get_run('clip-fashion')
run.save_artifact('clip-model')
```

You should now get a zip file containing the tuned model named `clip-fashion.zip` under the folder `clip-model`.

## Use the Model

After unzipping the model you get from the previous step, a folder with the following structure will be generated:

```text
.
└── clip-fashion/
    ├── config.yml
    ├── metadata.yml
    ├── metrics.yml
    └── models/
        ├── clip-text/
        │   ├── metadata.yml
        │   └── model.onnx
        ├── clip-vision/
        │   ├── metadata.yml
        │   └── model.onnx
        └── input-map.yml
```

Since the tuned model generated from Finetuner contains richer information such as metadata and config, we now transform it to simpler structure used by CLIP-as-service.

* Firstly, create a new folder named `clip-fashion-cas` or name of your choice. This will be the storage of the models to use in CLIP-as-service.

* Secondly, copy the textual model `clip-fashion/models/clip-text/model.onnx` into the folder `clip-fashion-cas` and rename the model to `textual.onnx`.

* Similarly, copy the visual model `clip-fashion/models/clip-vision/model.onnx` into the folder `clip-fashion-cas` and rename the model to `visual.onnx`.

This is the expected structure of `clip-fashion-cas`:

```text
.
└── clip-fashion-cas/
    ├── textual.onnx
    └── visual.onnx
```

In order to use the fine-tuned model, create a custom YAML file `finetuned_clip.yml` like below. Learn more about [Flow YAML configuration](https://docs.jina.ai/fundamentals/flow/yaml-spec/) and [`clip_server` YAML configuration](https://clip-as-service.jina.ai/user-guides/server/#yaml-config).

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
      with:
        name: ViT-B-32::openai
        model_path: 'clip-fashion-cas' # path to clip-fashion-cas
    replicas: 1
```

You can use `finetuner.describe_models()` to check the supported models in `finetuner`, you should see:
```bash
                                                                Finetuner backbones                                                                      
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                             name ┃           task ┃ output_dim ┃ architecture ┃                                                description ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│                                  bert-base-cased │   text-to-text │        768 │  transformer │ BERT model pre-trained on BookCorpus and English Wikipedia │
│                     openai/clip-vit-base-patch16 │  text-to-image │        512 │  transformer │                         CLIP base model with patch size 16 │
│                     openai/clip-vit-base-patch32 │  text-to-image │        512 │  transformer │                                            CLIP base model │
│                openai/clip-vit-large-patch14-336 │  text-to-image │        768 │  transformer │                        CLIP large model for 336x336 images │
│                    openai/clip-vit-large-patch14 │  text-to-image │       1024 │  transformer │                        CLIP large model with patch size 14 │
│                                  efficientnet_b0 │ image-to-image │       1280 │          cnn │                    EfficientNet B0 pre-trained on ImageNet │
│                                  efficientnet_b4 │ image-to-image │       1792 │          cnn │                    EfficientNet B4 pre-trained on ImageNet │
│                                    RN101::openai │  text-to-image │        512 │  transformer │                            Open CLIP "RN101::openai" model │
│                          RN101-quickgelu::openai │  text-to-image │        512 │  transformer │                  Open CLIP "RN101-quickgelu::openai" model │
│                         RN101-quickgelu::yfcc15m │  text-to-image │        512 │  transformer │                 Open CLIP "RN101-quickgelu::yfcc15m" model │
│                                   RN101::yfcc15m │  text-to-image │        512 │  transformer │                           Open CLIP "RN101::yfcc15m" model │
│                                      RN50::cc12m │  text-to-image │       1024 │  transformer │                              Open CLIP "RN50::cc12m" model │
│                                     RN50::openai │  text-to-image │       1024 │  transformer │                             Open CLIP "RN50::openai" model │
│                            RN50-quickgelu::cc12m │  text-to-image │       1024 │  transformer │                    Open CLIP "RN50-quickgelu::cc12m" model │
│                           RN50-quickgelu::openai │  text-to-image │       1024 │  transformer │                   Open CLIP "RN50-quickgelu::openai" model │
│                          RN50-quickgelu::yfcc15m │  text-to-image │       1024 │  transformer │                  Open CLIP "RN50-quickgelu::yfcc15m" model │
│                                  RN50x16::openai │  text-to-image │        768 │  transformer │                          Open CLIP "RN50x16::openai" model │
│                                   RN50x4::openai │  text-to-image │        640 │  transformer │                           Open CLIP "RN50x4::openai" model │
│                                  RN50x64::openai │  text-to-image │       1024 │  transformer │                          Open CLIP "RN50x64::openai" model │
│                                    RN50::yfcc15m │  text-to-image │       1024 │  transformer │                            Open CLIP "RN50::yfcc15m" model │
│                          ViT-B-16::laion400m_e31 │  text-to-image │        512 │  transformer │                  Open CLIP "ViT-B-16::laion400m_e31" model │
│                          ViT-B-16::laion400m_e32 │  text-to-image │        512 │  transformer │                  Open CLIP "ViT-B-16::laion400m_e32" model │
│                                 ViT-B-16::openai │  text-to-image │        512 │  transformer │                         Open CLIP "ViT-B-16::openai" model │
│                 ViT-B-16-plus-240::laion400m_e31 │  text-to-image │        640 │  transformer │         Open CLIP "ViT-B-16-plus-240::laion400m_e31" model │
│                 ViT-B-16-plus-240::laion400m_e32 │  text-to-image │        640 │  transformer │         Open CLIP "ViT-B-16-plus-240::laion400m_e32" model │
│                            ViT-B-32::laion2b_e16 │  text-to-image │        512 │  transformer │                    Open CLIP "ViT-B-32::laion2b_e16" model │
│                          ViT-B-32::laion400m_e31 │  text-to-image │        512 │  transformer │                  Open CLIP "ViT-B-32::laion400m_e31" model │
│                          ViT-B-32::laion400m_e32 │  text-to-image │        512 │  transformer │                  Open CLIP "ViT-B-32::laion400m_e32" model │
│                                 ViT-B-32::openai │  text-to-image │        512 │  transformer │                         Open CLIP "ViT-B-32::openai" model │
│                ViT-B-32-quickgelu::laion400m_e31 │  text-to-image │        512 │  transformer │        Open CLIP "ViT-B-32-quickgelu::laion400m_e31" model │
│                ViT-B-32-quickgelu::laion400m_e32 │  text-to-image │        512 │  transformer │        Open CLIP "ViT-B-32-quickgelu::laion400m_e32" model │
│                       ViT-B-32-quickgelu::openai │  text-to-image │        512 │  transformer │               Open CLIP "ViT-B-32-quickgelu::openai" model │
│                             ViT-L-14-336::openai │  text-to-image │        768 │  transformer │                     Open CLIP "ViT-L-14-336::openai" model │
│                                 ViT-L-14::openai │  text-to-image │        768 │  transformer │                         Open CLIP "ViT-L-14::openai" model │
│                                        resnet152 │ image-to-image │       2048 │          cnn │                          ResNet152 pre-trained on ImageNet │
│                                         resnet50 │ image-to-image │       2048 │          cnn │                           ResNet50 pre-trained on ImageNet │
│ sentence-transformers/msmarco-distilbert-base-v3 │   text-to-text │        768 │  transformer │                    Pretrained BERT, fine-tuned on MS Marco │
└──────────────────────────────────────────────────┴────────────────┴────────────┴──────────────┴───────────────────────────────────────────────────────────
```


You can now start the `clip_server` using fine-tuned model to get a performance boost:

```bash
python -m clip_server finetuned_clip.yml
```

That's it, enjoy 🚀


================================================
FILE: docs/user-guides/retriever.md
================================================
# CLIP Search


CLIP Search is a search paradigm that uses the CLIP model to encode the text and image documents into a common vector space. 
The search results are then retrieved by computing the cosine similarity between the query and the indexed documents.
Technically, CLIP search can be designed as a two-stage process: *encoding* and *indexing*.

```{figure} images/retreival.png
:width: 80%
```

At the encoding stage, the text and image documents can be encoded into a common vector space by the CLIP model. 
It enables us to achieve cross-modal search, i.e., we can search for images given a text query, or search for text given an image query. 
At the indexing stage, we use the encoded vectors to build an index, which is a data structure that can be used to efficiently retrieve the most relevant documents.
Specifically, we use the [Annlite](https://github.com/jina-ai/annlite) indexer executor to build the index.

This chapter will walk you through the process of building a CLIP search system.


```{tip}
You will need to install server first in Python 3.7+: `pip install clip-server[search]>=0.7.0`.
```

## Start the server

To start the server, you can use the following command:

```bash
python -m clip_server search_flow.yml
```

The `search_flow.yml` is the yaml configuration file for the search flow. It defines a [Jina Flow](https://docs.jina.ai/fundamentals/flow/) to implement the CLIP search system.
Below is an example of the Flow YAML file, we can put it into two subsections as below:

````{tab} CLIP model config

```{code-block} yaml
---
emphasize-lines: 9
---

jtype: Flow
version: '1'
with:
  port: 61000
executors:
  - name: encoder
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
    
  - name: indexer
    uses:
      jtype: AnnLiteIndexer
      with:
        n_dim: 512
      metas:
        py_modules:
          - annlite.executor
    workspace: './workspace'
```

````

````{tab} Annlite indexer config

```{code-block} yaml
---
emphasize-lines: 17,18,19
---

jtype: Flow
version: '1'
with:
  port: 61000
executors:
  - name: encoder
    uses:
      jtype: CLIPEncoder
      with:
      metas:
        py_modules:
          - clip_server.executors.clip_torch
          
  - name: indexer
    uses:
      jtype: AnnLiteIndexer
      with:
        n_dim: 512
        limit: 10
      metas:
        py_modules:
          - annlite.executor
    workspace: './workspace'
```

````

The first part defines the CLIP model config, which is explained [here](https://clip-as-service.jina.ai/user-guides/server/#clip-model-config).
And the second part defines the Annlite indexer config, you can set the following parameters:

| Parameter | Description                                                                                  |
|-----------|----------------------------------------------------------------------------------------------|
| `n_dim`   | The dimension of the vector space. It should be the same as the dimension of the CLIP model. |
| `limit`   | The number of the most relevant documents to be retrieved. The default value is 10.          |

And the `workspace` parameter is the path to the workspace directory, which is used to store the index files.

## Index and search documents

```{tip}
You will need to install client first in Python 3.7+: `pip install clip-client>=0.7.0`.
```

### Index Documents

To index image or text documents in the CLIP search server, you can use the client function {func}`~clip_client.Client.index`:

```python
from clip_client import Client
from docarray import Document

client = Client('grpc://0.0.0.0:61000')

client.index(
    [
        Document(text='she smiled, with pain'),
        Document(uri='apple.png'),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    ]
)
```

You don't need to call `client.encode()` explicitly since `client.index()` will handle this for you.


### Search Documents

Then, you can use the client function {func}`~clip_client.Client.search` to search for similar documents:

```python
result = client.search(['smile'], limit=2)

print(result['@m', ['text', 'scores__cosine']])
```

The results will look like this, the most relevant doc is "she smiled, with pain" with the cosine distance of 0.096. And the apple image has the cosine distance of 0.799.
```text
[['she smiled, with pain', ''], [{'value': 0.09604918956756592}, {'value': 0.7994111776351929}]]
```
You can set the `limit` parameter (default is `10`) to control the number of the most similar documents to be retrieved.


### Memory Estimation

Here, we will show how to estimate the memory usage of `AnnLite` indexer.
This is useful for determining the amount of memory required for indexing and querying.

In `AnnLite`, the memory usage is determined by the following two components:

- `HNSW` indexer: N * 1.1 * (4 bytes * `dimension` + 8 bytes * `max_connection`), where N is the number of embedding vectors, `dimension` is the dimension of the embedding vectors, and `max_connection` is the maximum number of connections in the graph. 
- `cell_table`: it's almost linear to the number of columns and number of data. If the default setting is used (no columns used for filtering), the memory usage of `cell_table` is 0.12GB per million data.
Columns used for filtering are stored in string type so the memory usage is depended on the length of the string.

```{Notice}
If you use `AnnLiteIndexer` in your Jina Flow, the memory usage will be slightly higher since we keep a `SQLite` table in memory in order to indexing in `DocumentArray`.
```


## Support large-scale dataset

When we want to index a large number of documents, for example, 100 million data or even 1 billion data, 
it's not possible to implement index operations on a single machine. **Sharding**, 
a type of partitioning that separates a large dataset into smaller, faster, more easily managed parts, is needed in this case.

You need to specify the `shards` and `polling` in the YAML config:

```yaml
jtype: Flow
version: '1'
with:
  port: 61000
executors:
  - name: encoder
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
          
  - name: indexer
    uses:
      jtype: AnnLiteIndexer
      with:
        n_dim: 512
      metas:
        py_modules:
          - annlite.executor
    workspace: './workspace'
    shards: 5
    polling: {'/index': 'ANY', '/search': 'ALL', '/update': 'ALL',
              '/delete': 'ALL', '/status': 'ALL'}
```

| Parameter   | Description                                 |
|-------------|---------------------------------------------|
| `shards`    | Number of shardings.                        |
| `polling`   | Polling strategies for different endpoints. |

Then you can perform exactly the same operations as we do on a single machine.(`/encode`, `/index` and `/search`)

**Why different [polling strategies](https://docs.jina.ai/how-to/scale-out/?highlight=polling#different-polling-strategies) are needed for different endpoints?**

Differences between `ANY` and `ALL`:
- `ANY`: requests are sent to one of the executors.
- `ALL`: requests are sent to all executors.

```{figure} images/polling_stratey.png
:width: 80%

```

Since one data point only needs to be indexed once, there will only be one indexer executor that will handle this data point. Thus, `ANY` is used for `/index`. On the contrary, we use `ALL` in for `/search` since we don't know which executor stores the perfectly matched result, so the search request should be handled by all indexer executors. (The same reason for using `ALL` in `/update`, `/delete`, `/status`)

```{Warning}
Increasing the number of shardings will definitely alleviate the memory issue, but it will increase the latency since there will be more network connections between different shards.
```


================================================
FILE: docs/user-guides/server.md
================================================
# Server API

CLIP-as-service is designed in a client-server architecture. A server is a long-running program that receives raw sentences and images from clients, and returns CLIP embeddings to the client. Additionally, `clip_server` is optimized for speed, low memory footprint and scalability.
- Horizontal scaling: adding more replicas easily with one argument. 
- Vertical scaling: using PyTorch JIT, ONNX or TensorRT runtime to speedup single GPU inference.
- Supporting gRPC, HTTP, Websocket protocols with their TLS counterparts, w/o compressions.

This chapter introduces the API of the server. 

```{tip}
You will need to install server first in Python 3.7+: `pip install clip-server`.
```

(server-address)=
## Start server


### Start a PyTorch-backed server

Unlike the client, server only has a CLI entrypoint. To start a server, run the following in the terminal:

```bash
python -m clip_server
```

Note that it is underscore `_` not the dash `-`.

First time running will download the pretrained model (Pytorch `ViT-B/32` by default), load the model, and finally you will get the address information of the server. This information will {ref}`then be used in clients<construct-client>`.

```{figure} images/server-start.gif
:width: 70%

```

### Start a ONNX-backed server

To use ONNX runtime for CLIP, you can run:

```bash
pip install "clip_server[onnx]"

python -m clip_server onnx-flow.yml
```


### Start a TensorRT-backed server

`nvidia-pyindex` package needs to be installed first. It allows your `pip` to fetch additional Python modules from the NVIDIA NGC™ PyPI repo:

```bash
pip install nvidia-pyindex
pip install "clip_server[tensorrt]"

python -m clip_server tensorrt-flow.yml
```

One may wonder where is this `onnx-flow.yml` or `tensorrt-flow.yml` come from. Must be a typo? Believe me, just run it. It should just work. I will explain this YAML file in the next section. 

The procedure and UI of ONNX and TensorRT runtime would look the same as Pytorch runtime.

## Model support

The various `CLIP` models implemented in the [OpenAI](https://github.com/openai/CLIP), [OpenCLIP](https://github.com/mlfoundations/open_clip), and [MultilingualCLIP](https://github.com/FreddeFrallan/Multilingual-CLIP) are supported. 
`ViT-B-32::openai` is used as the default model in all runtimes. 
Due to the limitation of some runtimes, not every runtime supports all models. 
Please also note that **different models give different sizes of output dimensions**. This will affect your downstream applications. For example, switching the model from one to another make your embedding incomparable, which breaks the downstream applications. Below is a list of supported models of each runtime and its corresponding size.

For more details about the models and how to select the best model for your application, please refer to the [CLIP benchmark page](benchmark.rst).

| Model                                 | PyTorch | ONNX | TensorRT | Output Dimension |
| ------------------------------------- | ------- | ---- | -------- | ---------------- |
| RN50::openai                          | ✅       | ✅    | ✅        | 1024             |
| RN50::yfcc15m                         | ✅       | ✅    | ✅        | 1024             |
| RN50::cc12m                           | ✅       | ✅    | ✅        | 1024             |
| RN101::openai                         | ✅       | ✅    | ✅        | 512              |
| RN101::yfcc15m                        | ✅       | ✅    | ✅        | 512              |
| RN50x4::openai                        | ✅       | ✅    | ✅        | 640              |
| RN50x16::openai                       | ✅       | ✅    | ❌        | 768              |
| RN50x64::openai                       | ✅       | ✅    | ❌        | 1024             |
| ViT-B-32::openai                      | ✅       | ✅    | ✅        | 512              |
| ViT-B-32::laion2b_e16                 | ✅       | ✅    | ✅        | 512              |
| ViT-B-32::laion400m_e31               | ✅       | ✅    | ✅        | 512              |
| ViT-B-32::laion400m_e32               | ✅       | ✅    | ✅        | 512              |
| ViT-B-32::laion2b-s34b-b79k           | ✅       | ✅    | ❌        | 512              |
| ViT-B-16::openai                      | ✅       | ✅    | ✅        | 512              |
| ViT-B-16::laion400m_e31               | ✅       | ✅    | ✅        | 512              |
| ViT-B-16::laion400m_e32               | ✅       | ✅    | ✅        | 512              |
| ViT-B-16-plus-240::laion400m_e31      | ✅       | ✅    | 🚧        | 640              |
| ViT-B-16-plus-240::laion400m_e32      | ✅       | ✅    | 🚧        | 640              |
| ViT-L-14::openai                      | ✅       | ✅    | ❌        | 768              |
| ViT-L-14::laion400m_e31               | ✅       | ✅    | ❌        | 768              |
| ViT-L-14::laion400m_e32               | ✅       | ✅    | ❌        | 768              |
| ViT-L-14::laion2b-s32b-b82k           | ✅       | ✅    | ❌        | 768              |
| ViT-L-14-336::openai                  | ✅       | ✅    | ❌        | 768              |
| ViT-H-14::laion2b-s32b-b79k           | ✅       | ✅    | ❌        | 1024             |
| ViT-g-14::laion2b-s12b-b42k           | ✅       | ✅    | ❌        | 1024             |
| M-CLIP/LABSE-Vit-L-14                 | ✅       | ✅    | ❌        | 768              |
| M-CLIP/XLM-Roberta-Large-Vit-B-32     | ✅       | ✅    | 🚧        | 512              |
| M-CLIP/XLM-Roberta-Large-Vit-B-16Plus | ✅       | ✅    | 🚧        | 640              |
| M-CLIP/XLM-Roberta-Large-Vit-L-14     | ✅       | ✅    | ❌        | 768              |

✅ = Supported — 🚧 = Working in progress — ❌ = Not supported

### Use custom model for onnx
You can also use your own model in ONNX runtime by specifying the model name and the path to ONNX model directory in YAML file.
The model directory should have the same structure as below:

```text
.
└── custom-model/
    ├── textual.onnx
    └── visual.onnx
```

One may wonder how to produce the model as described above. 
Fortunately, you can simply use the [Finetuner](https://finetuner.jina.ai) to fine-tune your model based on custom dataset.
[Finetuner](https://finetuner.jina.ai) is a cloud service that makes fine-tuning simple and fast. 
Moving the process into the cloud, [Finetuner](https://finetuner.jina.ai) handles all related complexity and infrastructure, making models performant and production ready.
{ref}`Click here for detail instructions<Finetuner>`.

## YAML config

You may notice that there is a YAML file in our last ONNX example. All configurations are stored in this file. In fact, `python -m clip_server` does **not support** any other argument besides a YAML file. So it is the only source of the truth of your configs. 

To load a YAML config from `my.yml`, simply do

```bash
python -m clip_server my.yml
```

Or one can also pipe the config via stdin:

```bash
cat my.yml | python -m clip_server -i
```

This can be very useful when using `clip_server` in a Docker container.

And to answer your doubt, `clip_server` has three built-in YAML configs as a part of the package resources. When you do `python -m clip_server` it loads the Pytorch config, and when you do `python -m clip_server onnx-flow.yml` it loads the ONNX config.
In the same way, when you do `python -m clip_server tensorrt-flow.yml` it loads the TensorRT config.

Let's look at these three built-in YAML configs:

````{tab} torch-flow.yml

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```
````

````{tab} onnx-flow.yml

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
```
````


````{tab} tensorrt-flow.yml

```yaml
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_r
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_tensorrt
```
````

Basically, each YAML file defines a [Jina Flow](https://docs.jina.ai/fundamentals/flow/). The complete Jina Flow YAML syntax [can be found here](https://docs.jina.ai/fundamentals/flow/yaml-spec/). General parameters of the Flow and Executor can be used here as well. But now we only highlight the most important parameters.

Looking at the YAML file again, we can put it into three subsections as below:


````{tab} CLIP model config

```{code-block} yaml
---
emphasize-lines: 9
---

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with:
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

````

````{tab} Executor config

```{code-block} yaml
---
emphasize-lines: 6
---

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with: 
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

````

````{tab} Flow config

```{code-block} yaml
---
emphasize-lines: 3,4
---

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with: 
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

````

### CLIP model config

For all backends, you can set the following parameters via `with`:

| Parameter               | Description                                                                                                                  |
| ----------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `name`                  | The name of the model to be used. Default 'ViT-B-32::openai'. A list of available models can be found [here](#model-support) |
| `num_worker_preprocess` | The number of CPU workers to preprocess images and texts. Default is 4.                                                      |
| `minibatch_size`        | The size of the minibatch for preprocessing and encoding. Default is 32. Reduce this number if you encounter OOM errors.     |

There are also runtime-specific parameters listed below:

````{tab} PyTorch

| Parameter | Description                                                      |
| --------- | ---------------------------------------------------------------- |
| `device`  | 'cpu' or 'cuda'. Default is None, which auto-detects the device. |
| `jit`     | Whether to use JIT compilation. Default is False.                |

````

````{tab} ONNX

| Parameter    | Description                                                                                                                                                                                     |
| ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `device`     | 'cpu' or 'cuda'. Default is None, which auto-detects the device.                                                                                                                                |
| `model_path` | The path to the model to be used. If not specified, the model will be downloaded or loaded from the local cache. See [here](#use-custom-model-for-onnx) to learn how to finetune custom models. |

````

For example, to turn on JIT and force PyTorch running on CPU, one can do:

```{code-block} yaml
---
emphasize-lines: 9-11
---

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with: 
        jit: True
        device: cpu
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

To use custom model in ONNX runtime, one can do:

```{code-block} yaml
---
emphasize-lines: 9-11
---

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-B/32
        model_path: 'custom-model'
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
```

```{warning}
The model name should match the fine-tuned model, or you will get incorrect output.
```

### Executor config

The full list of configs for Executor can be found via `jina executor --help`. The most important one is probably `replicas`, which **allows you to run multiple CLIP models in parallel** to achieve horizontal scaling.

To scale to 4 CLIP replicas, simply adding `replicas: 4` under `uses:`:

```{code-block} yaml
---
emphasize-lines: 7
---
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    replicas: 4
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

(flow-config)=
### Flow config

Flow configs are the ones under top-level `with:`. We can see the `port: 51000` is configured there. Besides `port`, there are some common parameters you might need.

| Parameter  | Description                                                                                                                                                                                                           |
| ---------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `protocol` | Communication protocol between server and client.  Can be `grpc`, `http`, `websocket`.                                                                                                                                |
| `cors`     | Only effective when `protocol=http`. If set, a CORS middleware is added to FastAPI frontend to allow cross-origin access.                                                                                             |
| `prefetch` | Control the maximum streamed request inside the Flow at any given time, default is `None`, means no limit. Setting `prefetch` to a small number helps solving the OOM problem, but may slow down the streaming a bit. |


As an example, to set `protocol` and `prefetch`, one can modify the YAML as follows:

```{code-block} yaml
---
emphasize-lines: 5,6
---

jtype: Flow
version: '1'
with:
  port: 51000
  protocol: websocket
  prefetch: 10
executors:
  - name: clip_t
    replicas: 4
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

## Environment variables


To start a server with more verbose logging,

```bash
JINA_LOG_LEVEL=DEBUG python -m clip_server
```

```{figure} images/server-log.gif
:width: 70%

```

To run CLIP-server on 3rd GPU,

```bash
CUDA_VISIBLE_DEVICES=2 python -m clip_server
```

### Serve on Multiple GPUs

If you have multiple GPU devices, you can leverage them via `CUDA_VISIBLE_DEVICES=RR`. For example, if you have 3 GPUs and your Flow YAML says `replicas: 5`, then 

```bash
CUDA_VISIBLE_DEVICES=RR python -m clip_server
```

Will assign GPU devices to the following round-robin fashion:

| GPU device | Replica ID |
| ---------- | ---------- |
| 0          | 0          |
| 1          | 1          |
| 2          | 2          |
| 0          | 3          |
| 1          | 4          |


You can also restrict the visible devices in round-robin assigment by `CUDA_VISIBLE_DEVICES=RR0:2`, where `0:2` has the same meaning as Python slice. This will create the following assigment:

| GPU device | Replica ID |
| ---------- | ---------- |
| 0          | 0          |
| 1          | 1          |
| 0          | 2          |
| 1          | 3          |
| 0          | 4          |


```{tip}
In pratice, we found it is unnecessary to run `clip_server` on multiple GPUs for two reasons:
- A single replica even with largest `ViT-L/14-336px` takes only 3.5GB VRAM.
- Real network traffic never utilizes GPU in 100%.

Based on these two points, it makes more sense to have multiple replicas on a single GPU comparing to have multiple replicas on different GPU, which is kind of waste of resources. `clip_server` scales pretty well by interleaving the GPU time with mulitple replicas.
```

## Monitor with Prometheus and Grafana

To monitor the performance of the service, you can enable the Prometheus metrics in the Flow YAML:

```{code-block} yaml
---
emphasize-lines: 5,6,14,15
---

jtype: Flow
version: '1'
with:
  port: 51000
  monitoring: True
  port_monitoring: 9090
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
    monitoring: true
    port_monitoring: 9091
```

This enables Prometheus metrics on both Gateway and the CLIP Executor.

Running it gives you:

```{figure} images/server-start-monitoring.gif
:width: 80%

```

which exposes two additional endpoints:
- `http://localhost:9090`  for the Gateway
- `http://localhost:9091`  for the CLIP Executor


To visualize the metrics in Grafana, you can import this [JSON file of an example dashboard](https://clip-as-service.jina.ai/_static/cas-grafana.json). You will get something as follows:

```{figure} images/grafana-dashboard.png
:width: 80%

```


For more information on monitoring a Flow, [please read here](https://docs.jina.ai/fundamentals/flow/monitoring-flow/). 

## Serve with TLS

You can turn on TLS for HTTP and gRPC protocols. Your Flow YAML should be changed to the following:

```{code-block} yaml
---
emphasize-lines: 4,5,7-10
---
jtype: Flow
version: '1'
with:
  port: 8443
  protocol: http
  cors: true
  uvicorn_kwargs:
    ssl_keyfile_password: blahblah
  ssl_certfile: cert.pem
  ssl_keyfile: key.pem
```

Here, `protocol` can be either `http` or `grpc`; `cert.pem` or `key.pem` represent both parts of a certificate, key being the private key to the certificate and crt being the signed certificate. You can run the following command in terminal:

```bash
openssl req -newkey rsa:4096 -nodes -sha512 -x509 -days 3650 -nodes -out cert.pem -keyout key.pem -subj "/CN=<your.clip.address>"
```

Note that if you are using `protocol: grpc` then `/CN=<your.clip.address>` must strictly follow the IP address or the domain name of your server. Mismatch IP or domain name would throw an exception.

Certificate and keys can be also generated via [letsencrypt.org](https://letsencrypt.org/), which is a free SSL provider.

```{warning}
Note that note every port support HTTPS. Commonly support ports are: `443`, `2053`, `2083`, `2087`, `2096`, `8443`.
```

```{warning}
If you are using Cloudflare proxied DNS, please be aware:
- you need to turn on gRPC support manually, [please follow the guide here](https://support.cloudflare.com/hc/en-us/articles/360050483011-Understanding-Cloudflare-gRPC-support);
- the free tier of Cloudflare has 100s hard limit on the timeout, meaning sending big batch to a CPU server may throw 524 to the client-side.
```

When the server is successfully running, you can connect to it via client by setting `server` to `https://` or `grpcs://` as follows:

```python
from clip_client import Client

c = Client('grpcs://<your.clip.address>:2096')

r = c.encode(
    [
        'First do it',
        'then do it right',
        'then do it better',
        'https://picsum.photos/200',
    ]
)
```

## Serve in Docker Container

You can run the server inside a Docker container. We provide a Dockerfile in the repository, which is CUDA-enabled with optimized package installation. 

### Build

We have a list of {ref}`pre-built images available on Docker Hub<prebuild-images>`. If they are too big for you to download, you may consider built it yourself as follows:

```bash
git clone https://github.com/jina-ai/clip-as-service.git
docker build . -f Dockerfiles/server.Dockerfile  --build-arg GROUP_ID=$(id -g ${USER}) --build-arg USER_ID=$(id -u ${USER}) -t jinaai/clip-server
```

```{tip}
The build argument `--build-arg GROUP_ID=$(id -g ${USER}) --build-arg USER_ID=$(id -u ${USER})` is optional, but having them is highly recommended as it allows you to reuse host's cache with the correct access.
```


### Run

````{tab} PyTorch
```bash
docker run -p 51009:51000 -v $HOME/.cache:/home/cas/.cache --gpus all jinaai/clip-server
```
````
````{tab} ONNX
```bash
docker run -p 51009:51000 -v $HOME/.cache:/home/cas/.cache --gpus all jinaai/clip-server:master-onnx onnx-flow.yml
```
````
````{tab} TensorRT
```bash
docker run -p 51009:51000 -v $HOME/.cache:/home/cas/.cache --gpus all jinaai/clip-server:master-tensorrt tensorrt-flow.yml
```
````

Here, `51009` is the public port on the host and `51000` is the {ref}`in-container port defined inside YAML<flow-config>`. The argument `-v $HOME/.cache:/home/cas/.cache` leverages host's cache and prevents you to download the same model next time on start. 

Due to the limitation of the terminal inside Docker container, you will **not** see the classic Jina progress bar on start. Instead, you will face a few minutes awkward silent while model downloading and then see "Flow is ready to serve" dialog.

To pass a YAML config from the host, one can do:

````{tab} PyTorch
```bash
cat my.yml | docker run -i -p 51009:51000 -v $HOME/.cache:/home/cas/.cache --gpus all jinaai/clip-server -i
```
````
````{tab} ONNX
```bash
cat my.yml | docker run -i -p 51009:51000 -v $HOME/.cache:/home/cas/.cache --gpus all jinaai/clip-server:master-onnx -i
```
````
````{tab} TensorRT
```bash
cat my.yml | docker run -i -p 51009:51000 -v $HOME/.cache:/home/cas/.cache --gpus all jinaai/clip-server:master-tensorrt -i
```
````

The CLI usage is the same {ref}`as described here <server-address>`.

```{tip}
You can enable debug logging via: `docker run --env JINA_LOG_LEVEL=debug ...`
```

(prebuild-images)=
### Pre-built images

We have prebuilt images with CUDA support.

The Docker image name always starts with `jinaai/clip-server` followed by a tag composed of three parts:

```text
jinaai/clip-server:{version}{extra}
```

- `{version}`: The version of Jina. Possible values:
    - `latest`: the last release;
    - `master`: the master branch of `jina-ai/jina` repository;
    - `x.y.z`: the release of a particular version;
    - `x.y` and `x`: the alias to the last `x.y.z` patch release, i.e. `x.y` = `x.y.max(z)`;
- `{extra}`: the extra dependency installed along with `clip_server`. Possible values:
    - ` `: Pytorch backend;
    - `-onnx`: ONNX backend; 
    - `-tensorrt`: TensorRT backend;

#### Image alias and updates

| Event                | Updated images                     | Aliases                                                                                                                                            |
| -------------------- | ---------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| On merge into `main` | `jinaai/clip-server:master{extra}` |                                                                                                                                                    |
| On `x.y.z` release   | `jinaai/clip-server:x.y.z{extra}`  | `jinaai/clip-server:latest{python_version}{extra}`, `jinaai/clip-server:x.y{python_version}{extra}`, `jinaai/clip-server:x{python_version}{extra}` |

3 images are built on the event listed above, i.e. taking the combination of:
  - `{extra} = ["", "-onnx", "-tensorrt"]`

#### Image size on different tags

```{warning}
[Due to a known bug in shields.io/Docker Hub API](https://github.com/badges/shields/issues/7583), the following badge may show "invalid" status randomly.
```

| Image Size                                                                                                                                |
| ----------------------------------------------------------------------------------------------------------------------------------------- |
| ![](https://img.shields.io/docker/image-size/jinaai/clip-server/latest?label=jinaai%2Fclip-server%3Alatest&logo=docker)                   |
| ![](https://img.shields.io/docker/image-size/jinaai/clip-server/latest-onnx?label=jinaai%2Fclip-server%3Alatest-onnx&logo=docker)         |
| ![](https://img.shields.io/docker/image-size/jinaai/clip-server/latest-tensorrt?label=jinaai%2Fclip-server%3Alatest-tensorrt&logo=docker) |
| ![](https://img.shields.io/docker/image-size/jinaai/clip-server/master?label=jinaai%2Fclip-server%3Amaster&logo=docker)                   |
| ![](https://img.shields.io/docker/image-size/jinaai/clip-server/master-onnx?label=jinaai%2Fclip-server%3Amaster-onnx&logo=docker)         |
| ![](https://img.shields.io/docker/image-size/jinaai/clip-server/master-tensorrt?label=jinaai%2Fclip-server%3Amaster-tensorrt&logo=docker) |


================================================
FILE: scripts/MANIFEST.in
================================================
include LICENSE
prune tests/
prune **/tests/


================================================
FILE: scripts/benchmark.py
================================================
import random
import time
from typing import Optional
import threading
import click
import numpy as np
from docarray import Document, DocumentArray


def warn(*args, **kwargs):
    pass


import warnings

warnings.warn = warn


np.random.seed(123)


class BenchmarkClient(threading.Thread):
    def __init__(
        self,
        server: str,
        batch_size: int = 1,
        modality: str = 'text',
        num_iter: Optional[int] = 100,
        image_sample: str = None,
        **kwargs,
    ):
        """
        @param server: the CLIP-as-service server URI
        @param batch_size: number of batch sample
        @param num_iter: number of repeat run per experiment
        @param image_sample: uri of the test image
        """
        assert num_iter > 2, 'num_iter must be greater than 2'
        super().__init__()
        self.server = server
        self.batch_size = batch_size
        self.modality = modality
        self.image_sample = image_sample
        self.num_iter = num_iter
        self.avg_time = 0

    def run(self):
        try:
            from clip_client import Client
        except ImportError:
            raise ImportError(
                'clip_client module is not available. it is required for benchmarking.'
                'Please use ""pip install clip-client" to install it.'
            )

        if self.modality == 'text':
            from clip_server.model.simple_tokenizer import SimpleTokenizer

            tokenizer = SimpleTokenizer()
            vocab = list(tokenizer.encoder.keys())
            batch = DocumentArray(
                [
                    Document(text=' '.join(random.choices(vocab, k=78)))
                    for _ in range(self.batch_size)
                ]
            )
        elif self.modality == 'image':
            batch = DocumentArray(
                [
                    Document(blob=open(self.image_sample, 'rb').read())
                    for _ in range(self.batch_size)
                ]
            )
        else:
            raise ValueError(f'The modality "{self.modality}" is unsupported')

        client = Client(self.server)

        time_costs = []
        for _ in range(self.num_iter):
            start = time.perf_counter()
            r = client.encode(batch, batch_size=self.batch_size)
            time_costs.append(time.perf_counter() - start)
        self.avg_time = np.mean(time_costs[2:])


@click.command(name='clip-as-service benchmark')
@click.argument('server')
@click.option(
    '--batch_sizes',
    multiple=True,
    type=int,
    default=[1, 8, 16, 32, 64],
    help='number of batch',
)
@click.option(
    '--num_iter', default=10, help='number of repeat run per experiment (must > 2)'
)
@click.option(
    "--concurrent_clients",
    multiple=True,
    type=int,
    default=[1, 4, 16, 32, 64],
    help='number of concurrent clients per experiment',
)
@click.option("--image_sample", help='path to the image sample file')
def main(server, batch_sizes, num_iter, concurrent_clients, image_sample):
    # wait until the server is ready
    for batch_size in batch_sizes:
        for num_client in concurrent_clients:
            all_clients = [
                BenchmarkClient(
                    server,
                    batch_size=batch_size,
                    num_iter=num_iter,
                    modality='image' if (image_sample is not None) else 'text',
                    image_sample=image_sample,
                )
                for _ in range(num_client)
            ]

            for bc in all_clients:
                bc.start()

            clients_speed = []
            for bc in all_clients:
                bc.join()
                clients_speed.append(batch_size / bc.avg_time)

            max_speed, min_speed, avg_speed = (
                max(clients_speed),
                min(clients_speed),
                np.mean(clients_speed),
            )

            print(
                '(concurrent client=%d, batch_size=%d) avg speed: %.3f\tmax speed: %.3f\tmin speed: %.3f'
                % (num_client, batch_size, avg_speed, max_speed, min_speed),
                flush=True,
            )


if __name__ == '__main__':
    main()


================================================
FILE: scripts/black.sh
================================================
#!/bin/bash
pip install black==22.3.0
arrVar=()
echo we ignore non-*.py files and files generated from protobuf
excluded_files=(
   docarray/proto/docarray_pb2.py
   docs/conf.py
)
for changed_file in $CHANGED_FILES; do
  if [[ ${changed_file} == *.py ]] && ! [[ " ${excluded_files[@]} " =~ " ${changed_file} " ]]; then
    echo checking ${changed_file}
    arrVar+=(${changed_file})
  fi
done
if [ ${#arrVar[@]} -ne 0 ]; then
  black -S --check "${arrVar[@]}"
fi


================================================
FILE: scripts/docstrings_lint.sh
================================================
#!/bin/bash
# required in order to get the status of all the files at once
pip install darglint==1.6.0
pip install pydocstyle==5.1.1
echo ====================================================================================
echo DOCSTRINGS LINT: checking $CHANGED_FILES
echo ------------------------------------------------------------------------------------
echo 'removing files under /tests...'
arrVar=()
# we ignore tests files
for changed_file in $CHANGED_FILES; do
  case ${changed_file} in
    tests/* | \
    .github/* | \
    scripts/* | \
    docarray/resources/* | \
    docs/* | \
    setup.py | \
    fastentrypoints.py)
    ;;*)
      echo keeping ${changed_file}
      arrVar+=(${changed_file})
    ;;
  esac
done

# if array is empty
if [ ${#arrVar[@]} -eq 0 ]; then
  echo 'nothing to check'
  exit 0
fi

DARGLINT_OUTPUT=$(darglint -v 2 -s sphinx "${arrVar[@]}"); PYDOCSTYLE_OUTPUT=$(pydocstyle --select=D101,D102,D103 "${arrVar[@]}")
# status captured here
if [[ -z "$PYDOCSTYLE_OUTPUT" ]] && [[ -z "$DARGLINT_OUTPUT" ]]; then
  echo 'OK'
  exit 0
else
  echo 'failure. make sure to check the guide for docstrings: https://docarray.jina.ai/chapters/docstring.html'
  echo $DARGLINT_OUTPUT
  echo $PYDOCSTYLE_OUTPUT
  exit 1
fi
echo ====================================================================================


================================================
FILE: scripts/get-all-test-paths.sh
================================================
#!/usr/bin/env bash

set -ex

BATCH_SIZE=3
#declare -a array1=( "tests/unit/test_*.py" )
#declare -a array2=( $(ls -d tests/unit/*/ | grep -v '__pycache__' | grep -v 'array') )
#declare -a array3=( "tests/unit/array/*.py" )
declare -a mixins=( $(find tests -name "test_*.py" | grep -v 'test_tensorrt.py') )
declare -a array4=( "$(echo "${mixins[@]}" | xargs -n$BATCH_SIZE)" )
# array5 is currently empty because in the array/ directory, mixins is the only directory
# but add the following in case new directories are created in array/
declare -a array5=( $(ls -d tests/unit/array/*/ | grep -v '__pycache__' | grep -v 'mixins') )
dest=( "${array1[@]}" "${array2[@]}" "${array3[@]}" "${array4[@]}" "${array5[@]}" )

printf '%s\n' "${dest[@]}" | jq -R . | jq -cs .


================================================
FILE: scripts/get-last-release-note.py
================================================
## under jina root dir
# python scripts/get-last-release-note.py
## result in root/tmp.md

with open('CHANGELOG.md') as fp:
    n = []
    for v in fp:
        if v.startswith('## Release Note'):
            n.clear()
        n.append(v)

with open('tmp.md', 'w') as fp:
    fp.writelines(n)


================================================
FILE: scripts/get-requirements.py
================================================
## under clip-as-service root dir
# python scripts/get-requirments.py $PIP_TAG /path/to/requirements.txt

import sys
from distutils.core import run_setup

result = run_setup("./server/setup.py", stop_after="init")

with open(sys.argv[2], 'w') as fp:
    fp.write('\n'.join(result.install_requires) + '\n')
    if sys.argv[1]:
        fp.write('\n'.join(result.extras_require[sys.argv[1]]) + '\n')


================================================
FILE: scripts/onnx_helper.py
================================================
def convert_float_to_float16(model_path: str, output_model_path: str):
    import onnx
    from onnxmltools.utils.float16_converter import (
        convert_float_to_float16_model_path,
    )

    new_onnx_model = convert_float_to_float16_model_path(model_path)

    onnx.save(new_onnx_model, output_model_path)

    # Alternate approach
    # from onnx import load_model
    # from onnxruntime.transformers import optimizer, onnx_model
    #
    # # optimized_model = optimizer.optimize_model(model_path, model_type='bert')
    #
    # model = load_model(model_path)
    # optimized_model = onnx_model.OnnxModel(model)
    #
    # if hasattr(optimized_model, 'convert_float32_to_float16'):
    #     optimized_model.convert_float_to_float16()
    # else:
    #     optimized_model.convert_model_float32_to_float16()
    #
    # self._textual_path = f'{self._textual_path[:-5]}_optimized.onnx'
    # optimized_model.save_model_to_file(output_model_path)


def quantize(model_path: str, output_model_path: str):
    """
    Quantize the weights of the model from float32 to in8 to allow very efficient inference on modern CPU
    Uses unsigned ints for activation values, signed ints for weights, per
    https://onnxruntime.ai/docs/performance/quantization.html#data-type-selection
    it is faster on most CPU architectures
    Args:
        onnx_model_path: Path to location the exported ONNX model is stored
    Returns: The Path generated for the quantized
    """
    from onnxruntime.quantization import quantize_dynamic, QuantType

    quantize_dynamic(
        model_input=model_path,
        model_output=output_model_path,
        per_channel=True,
        reduce_range=True,  # should be the same as per_channel
        activation_type=QuantType.QUInt8,
        weight_type=QuantType.QInt8,  # per docs, signed is faster on most CPUs
        optimize_model=True,
        op_types_to_quantize=["MatMul", "Attention", "Mul", "Add"],
        extra_options={"WeightSymmetric": False, "MatMulConstBOnly": True},
    )  # op_types_to_quantize=['MatMul', 'Relu', 'Add', 'Mul' ],


================================================
FILE: scripts/release.sh
================================================
#!/usr/bin/env bash

# Requirements
# brew install hub
# npm install -g git-release-notes
# pip install twine wheel

set -ex

INIT_FILE='client/clip_client/__init__.py'
VER_TAG='__version__ = '
RELEASENOTE='./node_modules/.bin/git-release-notes'

function escape_slashes {
    sed 's/\//\\\//g'
}

function update_ver_line {
    local OLD_LINE_PATTERN=$1
    local NEW_LINE=$2
    local FILE=$3

    local NEW=$(echo "${NEW_LINE}" | escape_slashes)
    sed -i '/'"${OLD_LINE_PATTERN}"'/s/.*/'"${NEW}"'/' "${FILE}"
    head -n10 ${FILE}
}


function clean_build {
    rm -rf dist
    rm -rf *.egg-info
    rm -rf build
}

function pub_pypi {
    # publish to pypi
    cd $1
    clean_build
    python setup.py sdist
    twine upload dist/*
    clean_build
    cd -
}

function git_commit {
    git config --local user.email "dev-bot@jina.ai"
    git config --local user.name "Jina Dev Bot"
    git tag "v$RELEASE_VER" -m "$(cat ./CHANGELOG.tmp)"
    git add client/clip_client/__init__.py server/clip_server/__init__.py ./CHANGELOG.md
    git commit -m "chore(version): the next version will be $NEXT_VER" -m "build($RELEASE_ACTOR): $RELEASE_REASON"
}


function make_release_note {
    ${RELEASENOTE} ${LAST_VER}..HEAD .github/release-template.ejs > ./CHANGELOG.tmp
    head -n10 ./CHANGELOG.tmp
    printf '\n%s\n\n%s\n%s\n\n%s\n\n%s\n\n' "$(cat ./CHANGELOG.md)" "<a name="release-note-${RELEASE_VER//\./-}"></a>" "## Release Note (\`${RELEASE_VER}\`)" "> Release time: $(date +'%Y-%m-%d %H:%M:%S')" "$(cat ./CHANGELOG.tmp)" > ./CHANGELOG.md
}

BRANCH=$(git rev-parse --abbrev-ref HEAD)

if [[ "$BRANCH" != "main" ]]; then
  printf "You are not at main branch, exit\n";
  exit 1;
fi

LAST_UPDATE=`git show --no-notes --format=format:"%H" $BRANCH | head -n 1`
LAST_COMMIT=`git show --no-notes --format=format:"%H" origin/$BRANCH | head -n 1`

if [ $LAST_COMMIT != $LAST_UPDATE ]; then
    printf "Your local $BRANCH is behind the remote master, exit\n"
    exit 1;
fi

# release the current version
export RELEASE_VER=$(sed -n '/^__version__/p' $INIT_FILE | cut -d \' -f2)
LAST_VER=$(git tag -l | sort -V | tail -n1)
printf "last version: \e[1;32m$LAST_VER\e[0m\n"

if [[ $1 == "final" ]]; then
  printf "this will be a final release: \e[1;33m$RELEASE_VER\e[0m\n"

  NEXT_VER=$(echo $RELEASE_VER | awk -F. -v OFS=. 'NF==1{print ++$NF}; NF>1{$NF=sprintf("%0*d", length($NF), ($NF+1)); print}')
  printf "bump master version to: \e[1;32m$NEXT_VER\e[0m\n"

  make_release_note

  pub_pypi client
  pub_pypi server
  cp scripts/MANIFEST.in ./
  cp scripts/setup.py ./
  pub_pypi "."

  VER_TAG_NEXT=$VER_TAG\'${NEXT_VER}\'
  update_ver_line "$VER_TAG" "$VER_TAG_NEXT" 'client/clip_client/__init__.py'
  update_ver_line "$VER_TAG" "$VER_TAG_NEXT" 'server/clip_server/__init__.py'
  RELEASE_REASON="$2"
  RELEASE_ACTOR="$3"
  git_commit
elif [[ $1 == 'rc' ]]; then
  printf "this will be a release candidate: \e[1;33m$RELEASE_VER\e[0m\n"
  DOT_RELEASE_VER=$(echo $RELEASE_VER | sed "s/rc/\./")
  NEXT_VER=$(echo $DOT_RELEASE_VER | awk -F. -v OFS=. 'NF==1{print ++$NF}; NF>1{$NF=sprintf("%0*d", length($NF), ($NF+1)); print}')
  NEXT_VER=$(echo $NEXT_VER | sed "s/\.\([^.]*\)$/rc\1/")
  printf "bump master version to: \e[1;32m$NEXT_VER\e[0m, this will be the next version\n"

  make_release_note

  pub_pypi client
  pub_pypi server
  cp scripts/MANIFEST.in ./
  cp scripts/setup.py ./
  pub_pypi "."

  VER_TAG_NEXT=$VER_TAG\'${NEXT_VER}\'
  update_ver_line "$VER_TAG" "$VER_TAG_NEXT" 'client/clip_client/__init__.py'
  update_ver_line "$VER_TAG" "$VER_TAG_NEXT" 'server/clip_server/__init__.py'
  RELEASE_REASON="$2"
  RELEASE_ACTOR="$3"
  git_commit
else
  # as a prerelease, pypi update only, no back commit etc.
  COMMITS_SINCE_LAST_VER=$(git rev-list $LAST_VER..HEAD --count)
  NEXT_VER=$RELEASE_VER".dev"$COMMITS_SINCE_LAST_VER
  printf "this will be a developmental release: \e[1;33m$NEXT_VER\e[0m\n"

  VER_TAG_NEXT=$VER_TAG\'${NEXT_VER}\'
  update_ver_line "$VER_TAG" "$VER_TAG_NEXT" 'client/clip_client/__init__.py'
  update_ver_line "$VER_TAG" "$VER_TAG_NEXT" 'server/clip_server/__init__.py'

  pub_pypi client
  pub_pypi server
  cp scripts/MANIFEST.in ./
  cp scripts/setup.py ./
  pub_pypi "."
fi


================================================
FILE: scripts/setup.py
================================================
import sys
from os import path

from setuptools import find_packages
from setuptools import setup

if sys.version_info < (3, 7, 0):
    raise OSError(f'Clip-as-service requires Python >=3.7, but yours is {sys.version}')

try:
    pkg_name = 'clip-as-service'
    libinfo_py = path.join('server/clip_server/__init__.py')
    libinfo_content = open(libinfo_py, 'r', encoding='utf8').readlines()
    version_line = [l.strip() for l in libinfo_content if l.startswith('__version__')][
        0
    ]
    exec(version_line)  # gives __version__
except FileNotFoundError:
    __version__ = '0.0.0'

try:
    with open('README.md', encoding='utf8') as fp:
        _long_description = fp.read()
except FileNotFoundError:
    _long_description = ''

setup(
    name=pkg_name,
    packages=find_packages(),
    version=__version__,
    include_package_data=True,
    description='Embed images and sentences into fixed-length vectors via CLIP',
    author='Jina AI',
    author_email='hello@jina.ai',
    license='Apache 2.0',
    url='https://github.com/jina-ai/clip-as-service',
    download_url='https://github.com/jina-ai/clip-as-service/tags',
    long_description=_long_description,
    long_description_content_type='text/markdown',
    zip_safe=False,
    setup_requires=['setuptools>=18.0', 'wheel'],
    install_requires=['clip-server', 'clip-client'],
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'Intended Audience :: Education',
        'Intended Audience :: Science/Research',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Unix Shell',
        'Environment :: Console',
        'License :: OSI Approved :: Apache Software License',
        'Operating System :: OS Independent',
        'Topic :: Database :: Database Engines/Servers',
        'Topic :: Scientific/Engineering :: Artificial Intelligence',
        'Topic :: Internet :: WWW/HTTP :: Indexing/Search',
        'Topic :: Scientific/Engineering :: Image Recognition',
        'Topic :: Multimedia :: Video',
        'Topic :: Scientific/Engineering',
        'Topic :: Scientific/Engineering :: Mathematics',
        'Topic :: Software Development',
        'Topic :: Software Development :: Libraries',
        'Topic :: Software Development :: Libraries :: Python Modules',
    ],
    project_urls={
        'Documentation': 'https://clip-as-service.jina.ai/',
        'Source': 'https://github.com/jina-ai/clip-as-service',
        'Tracker': 'https://github.com/jina-ai/clip-as-service/issues',
    },
    keywords='jina openai clip deep-learning cross-modal multi-modal neural-search',
)


================================================
FILE: server/MANIFEST.in
================================================
recursive-include clip_server/resources *
include clip_server/*.yml


================================================
FILE: server/clip_server/__init__.py
================================================
__version__ = '0.8.4'


================================================
FILE: server/clip_server/__main__.py
================================================
import inspect
import os
import sys

if __name__ == '__main__':
    if 'NO_VERSION_CHECK' not in os.environ:
        from clip_server.helper import is_latest_version

        is_latest_version(github_repo='clip-as-service')

    from jina import Flow

    if len(sys.argv) > 1:
        if sys.argv[1] == '-i':
            _input = sys.stdin.read()
        else:
            _input = sys.argv[1]
    else:
        _input = 'torch-flow.yml'

    f = Flow.load_config(
        _input,
        extra_search_paths=[os.path.dirname(inspect.getfile(inspect.currentframe()))],
    )
    with f:
        f.block()


================================================
FILE: server/clip_server/executors/__init__.py
================================================


================================================
FILE: server/clip_server/executors/clip_onnx.py
================================================
import os
import warnings
from functools import partial
from multiprocessing.pool import ThreadPool
from typing import Dict, Optional

import onnxruntime as ort
from clip_server.executors.helper import (
    preproc_image,
    preproc_text,
    set_rank,
    split_img_txt_da,
)
from clip_server.model import clip
from clip_server.model.clip_onnx import CLIPOnnxModel
from clip_server.model.tokenization import Tokenizer
from jina import DocumentArray, Executor, requests
from opentelemetry.trace import NoOpTracer, Span


class CLIPEncoder(Executor):
    def __init__(
        self,
        name: str = 'ViT-B-32::openai',
        device: Optional[str] = None,
        num_worker_preprocess: int = 4,
        minibatch_size: int = 32,
        access_paths: str = '@r',
        model_path: Optional[str] = None,
        dtype: Optional[str] = None,
        **kwargs,
    ):
        """
        :param name: The name of the model to be used. Default 'ViT-B-32::openai'. A list of available models can be
            found at https://clip-as-service.jina.ai/user-guides/server/#model-support
        :param device: 'cpu' or 'cuda'. Default is None, which auto-detects the device.
        :param num_worker_preprocess: The number of CPU workers to preprocess images and texts. Default is 4.
        :param minibatch_size: The size of the minibatch for preprocessing and encoding. Default is 32. Reduce this
            number if you encounter OOM errors.
        :param access_paths: The access paths to traverse on the input documents to get the images and texts to be
            processed. Visit https://docarray.jina.ai/fundamentals/documentarray/access-elements for more details.
        :param model_path: The path to the model to be used. If not specified, the model will be downloaded or loaded
            from the local cache. Visit https://clip-as-service.jina.ai/user-guides/server/#use-custom-model-for-onnx
            to learn how to finetune custom models.
        :param dtype: inference data type, if None defaults to 'fp32' if device == 'cpu' else 'fp16'.
        """
        super().__init__(**kwargs)
        import torch

        if not device:
            device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self._device = device
        if not dtype:
            dtype = 'fp32' if self._device in ('cpu', torch.device('cpu')) else 'fp16'
        self._dtype = dtype

        self._minibatch_size = minibatch_size
        self._access_paths = access_paths
        if 'traversal_paths' in kwargs:
            warnings.warn(
                f'`traversal_paths` is deprecated. Use `access_paths` instead.'
            )
            self._access_paths = kwargs['traversal_paths']

        self._num_worker_preprocess = num_worker_preprocess
        self._pool = ThreadPool(processes=num_worker_preprocess)

        self._model = CLIPOnnxModel(name, model_path, dtype)
        self._tokenizer = Tokenizer(name)

        self._image_transform = clip._transform_blob(self._model.image_size)

        # define the priority order for the execution providers
        providers = ['CPUExecutionProvider']

        # prefer CUDA Execution Provider over CPU Execution Provider
        if self._device.startswith('cuda'):
            providers.insert(0, 'CUDAExecutionProvider')

        sess_options = ort.SessionOptions()

        # Enables all available optimizations including layout optimizations
        sess_options.graph_optimization_level = (
            ort.GraphOptimizationLevel.ORT_ENABLE_ALL
        )

        if not self._device.startswith('cuda') and (
            'OMP_NUM_THREADS' not in os.environ
            and hasattr(self.runtime_args, 'replicas')
        ):
            replicas = getattr(self.runtime_args, 'replicas', 1)
            num_threads = max(1, torch.get_num_threads() * 2 // replicas)
            if num_threads < 2:
                warnings.warn(
                    f'Too many replicas ({replicas}) vs too few threads {num_threads} may result in '
                    f'sub-optimal performance.'
                )

            # Run the operators in the graph in parallel (not support the CUDA Execution Provider)
            sess_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL

            # The number of threads used to parallelize the execution of the graph (across nodes)
            sess_options.inter_op_num_threads = 1
            sess_options.intra_op_num_threads = max(num_threads, 1)

        self._model.start_sessions(
            sess_options=sess_options, providers=providers, dtype=dtype
        )

        if not self.tracer:
            self.tracer = NoOpTracer()

    def _preproc_images(self, docs: 'DocumentArray', drop_image_content: bool):
        with self.monitor(
            name='preprocess_images_seconds',
            documentation='images preprocess time in seconds',
        ):
            with self.tracer.start_as_current_span('preprocess_images'):
                return preproc_image(
                    docs,
                    preprocess_fn=self._image_transform,
                    return_np=True,
                    drop_image_content=drop_image_content,
                    dtype=self._dtype,
                )

    def _preproc_texts(self, docs: 'DocumentArray'):
        with self.monitor(
            name='preprocess_texts_seconds',
            documentation='texts preprocess time in seconds',
        ):
            with self.tracer.start_as_current_span('preprocess_images'):
                return preproc_text(docs, tokenizer=self._tokenizer, return_np=True)

    @requests(on='/rank')
    async def rank(self, docs: 'DocumentArray', parameters: Dict, **kwargs):
        _drop_image_content = parameters.get('drop_image_content', False)
        await self.encode(docs['@r,m'], drop_image_content=_drop_image_content)

        set_rank(docs)

    @requests
    async def encode(
        self,
        docs: 'DocumentArray',
        tracing_context=None,
        parameters: Dict = {},
        **kwargs,
    ):
        with self.tracer.start_as_current_span(
            'encode', context=tracing_context
        ) as span:
            span.set_attribute('device', self._device)
            span.set_attribute('runtime', 'onnx')
            access_paths = parameters.get('access_paths', self._access_paths)
            if 'traversal_paths' in parameters:
                warnings.warn(
                    f'`traversal_paths` is deprecated. Use `access_paths` instead.'
                )
                access_paths = parameters['traversal_paths']
            _drop_image_content = parameters.get('drop_image_content', False)

            _img_da = DocumentArray()
            _txt_da = DocumentArray()
            for d in docs[access_paths]:
                split_img_txt_da(d, _img_da, _txt_da)

            with self.tracer.start_as_current_span('inference') as inference_span:
                inference_span.set_attribute('drop_image_content', _drop_image_content)
                inference_span.set_attribute('minibatch_size', self._minibatch_size)
                inference_span.set_attribute('has_img_da', True if _img_da else False)
                inference_span.set_attribute('has_txt_da', True if _txt_da else False)
                # for image
                if _img_da:
                    with self.tracer.start_as_current_span(
                        'img_minibatch_encoding'
                    ) as img_encode_span:
                        for minibatch, batch_data in _img_da.map_batch(
                            partial(
                                self._preproc_images,
                                drop_image_content=_drop_image_content,
                            ),
                            batch_size=self._minibatch_size,
                            pool=self._pool,
                        ):
                            with self.monitor(
                                name='encode_images_seconds',
                                documentation='images encode time in seconds',
                            ):
                                minibatch.embeddings = self._model.encode_image(
                                    batch_data
                                )

                # for text
                if _txt_da:
                    with self.tracer.start_as_current_span(
                        'txt_minibatch_encoding'
                    ) as txt_encode_span:
                        for minibatch, batch_data in _txt_da.map_batch(
                            self._preproc_texts,
                            batch_size=self._minibatch_size,
                            pool=self._pool,
                        ):
                            with self.monitor(
                                name='encode_texts_seconds',
                                documentation='texts encode time in seconds',
                            ):
                                minibatch.embeddings = self._model.encode_text(
                                    batch_data
                                )

        return docs


================================================
FILE: server/clip_server/executors/clip_tensorrt.py
================================================
import warnings
from functools import partial
from multiprocessing.pool import ThreadPool
from typing import Dict, Optional

import numpy as np
from clip_server.executors.helper import (
    preproc_image,
    preproc_text,
    set_rank,
    split_img_txt_da,
)
from clip_server.model import clip
from clip_server.model.clip_trt import CLIPTensorRTModel
from clip_server.model.tokenization import Tokenizer
from jina import DocumentArray, Executor, requests
from opentelemetry.trace import NoOpTracer, Span


class CLIPEncoder(Executor):
    def __init__(
        self,
        name: str = 'ViT-B-32::openai',
        device: str = 'cuda',
        num_worker_preprocess: int = 4,
        minibatch_size: int = 32,
        access_paths: str = '@r',
        **kwargs,
    ):
        """
        :param name: The name of the model to be used. Default 'ViT-B-32::openai'. A list of available models can be
            found at https://clip-as-service.jina.ai/user-guides/server/#model-support
        :param device: 'cpu' or 'cuda'. Default is 'cuda' since TensorRT is only supported on CUDA.
        :param num_worker_preprocess: The number of CPU workers to preprocess images and texts. Default is 4.
        :param minibatch_size: The size of the minibatch for preprocessing and encoding. Default is 32. Reduce this
            number if you encounter OOM errors.
        :param access_paths: The access paths to traverse on the input documents to get the images and texts to be
            processed. Visit https://docarray.jina.ai/fundamentals/documentarray/access-elements for more details.
        """
        super().__init__(**kwargs)

        self._num_worker_preprocess = num_worker_preprocess
        self._pool = ThreadPool(processes=num_worker_preprocess)

        self._minibatch_size = minibatch_size
        self._access_paths = access_paths
        if 'traversal_paths' in kwargs:
            warnings.warn(
                f'`traversal_paths` is deprecated. Use `access_paths` instead.'
            )
            self._access_paths = kwargs['traversal_paths']

        self._device = device

        import torch

        assert self._device.startswith('cuda'), (
            f'can not perform inference on {self._device}'
            f' with Nvidia TensorRT as backend'
        )

        assert (
            torch.cuda.is_available()
        ), "CUDA/GPU is not available on Pytorch. Please check your CUDA installation"

        self._model = CLIPTensorRTModel(name)

        self._model.start_engines()

        self._tokenizer = Tokenizer(name)
        self._image_transform = clip._transform_blob(self._model.image_size)

        if not self.tracer:
            self.tracer = NoOpTracer()

    def _preproc_images(self, docs: 'DocumentArray', drop_image_content: bool):
        with self.monitor(
            name='preprocess_images_seconds',
            documentation='images preprocess time in seconds',
        ):
            with self.tracer.start_as_current_span('preprocess_images'):
                return preproc_image(
                    docs,
                    preprocess_fn=self._image_transform,
                    device=self._device,
                    return_np=False,
                    drop_image_content=drop_image_content,
                )

    def _preproc_texts(self, docs: 'DocumentArray'):
        with self.monitor(
            name='preprocess_texts_seconds',
            documentation='texts preprocess time in seconds',
        ):
            with self.tracer.start_as_current_span('preprocess_images'):
                return preproc_text(
                    docs,
                    tokenizer=self._tokenizer,
                    device=self._device,
                    return_np=False,
                )

    @requests(on='/rank')
    async def rank(self, docs: 'DocumentArray', parameters: Dict, **kwargs):
        _drop_image_content = parameters.get('drop_image_content', False)
        await self.encode(docs['@r,m'], drop_image_content=_drop_image_content)

        set_rank(docs)

    @requests
    async def encode(
        self,
        docs: 'DocumentArray',
        tracing_context=None,
        parameters: Dict = {},
        **kwargs,
    ):
        with self.tracer.start_as_current_span(
            'encode', context=tracing_context
        ) as span:
            span.set_attribute('device', self._device)
            span.set_attribute('runtime', 'tensorrt')
            access_paths = parameters.get('access_paths', self._access_paths)
            if 'traversal_paths' in parameters:
                warnings.warn(
                    f'`traversal_paths` is deprecated. Use `access_paths` instead.'
                )
                access_paths = parameters['traversal_paths']
            _drop_image_content = parameters.get('drop_image_content', False)

            _img_da = DocumentArray()
            _txt_da = DocumentArray()
            for d in docs[access_paths]:
                split_img_txt_da(d, _img_da, _txt_da)

            with self.tracer.start_as_current_span('inference') as inference_span:
                inference_span.set_attribute('drop_image_content', _drop_image_content)
                inference_span.set_attribute('minibatch_size', self._minibatch_size)
                inference_span.set_attribute('has_img_da', True if _img_da else False)
                inference_span.set_attribute('has_txt_da', True if _txt_da else False)
                # for image
                if _img_da:
                    with self.tracer.start_as_current_span(
                        'img_minibatch_encoding'
                    ) as img_encode_span:
                        for minibatch, batch_data in _img_da.map_batch(
                            partial(
                                self._preproc_images,
                                drop_image_content=_drop_image_content,
                            ),
                            batch_size=self._minibatch_size,
                            pool=self._pool,
                        ):
                            with self.monitor(
                                name='encode_images_seconds',
                                documentation='images encode time in seconds',
                            ):
                                minibatch.embeddings = (
                                    self._model.encode_image(batch_data)
                                    .detach()
                                    .cpu()
                                    .numpy()
                                    .astype(np.float32)
                                )

                # for text
                if _txt_da:
                    with self.tracer.start_as_current_span(
                        'txt_minibatch_encoding'
                    ) as txt_encode_span:
                        for minibatch, batch_data in _txt_da.map_batch(
                            self._preproc_texts,
                            batch_size=self._minibatch_size,
                            pool=self._pool,
                        ):
                            with self.monitor(
                                name='encode_texts_seconds',
                                documentation='texts encode time in seconds',
                            ):
                                minibatch.embeddings = (
                                    self._model.encode_text(batch_data)
                                    .detach()
                                    .cpu()
                                    .numpy()
                                    .astype(np.float32)
                                )

        return docs


================================================
FILE: server/clip_server/executors/clip_torch.py
================================================
import os
import warnings
from functools import partial
from multiprocessing.pool import ThreadPool
from typing import Dict, Union, Optional

import numpy as np
import torch
from clip_server.executors.helper import (
    preproc_image,
    preproc_text,
    set_rank,
    split_img_txt_da,
)
from clip_server.helper import __cast_dtype__
from clip_server.model import clip
from clip_server.model.clip_model import CLIPModel
from clip_server.model.tokenization import Tokenizer
from jina import DocumentArray, Executor, requests
from opentelemetry.trace import NoOpTracer, Span


class CLIPEncoder(Executor):
    def __init__(
        self,
        name: str = 'ViT-B-32::openai',
        device: Optional[str] = None,
        jit: bool = False,
        num_worker_preprocess: int = 4,
        minibatch_size: int = 32,
        access_paths: str = '@r',
        dtype: Optional[Union[str, torch.dtype]] = None,
        **kwargs,
    ):
        """
        :param name: The name of the model to be used. Default 'ViT-B-32::openai'. A list of available models can be
            found at https://clip-as-service.jina.ai/user-guides/server/#model-support
        :param device: 'cpu' or 'cuda'. Default is None, which auto-detects the device.
        :param jit: Whether to use JIT compilation. Default is False.
        :param num_worker_preprocess: The number of CPU workers to preprocess images and texts. Default is 4.
        :param minibatch_size: The size of the minibatch for preprocessing and encoding. Default is 32. Reduce this
            number if you encounter OOM errors.
        :param access_paths: The access paths to traverse on the input documents to get the images and texts to be
            processed. Visit https://docarray.jina.ai/fundamentals/documentarray/access-elements for more details.
        :param dtype: inference data type, if None defaults to torch.float32 if device == 'cpu' else torch.float16.
        """
        super().__init__(**kwargs)

        self._minibatch_size = minibatch_size
        self._access_paths = access_paths
        if 'traversal_paths' in kwargs:
            warnings.warn(
                f'`traversal_paths` is deprecated. Use `access_paths` instead.'
            )
            self._access_paths = kwargs['traversal_paths']

        if not device:
            device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self._device = device
        if isinstance(dtype, str):
            dtype = __cast_dtype__.get(dtype)
        elif not dtype:
            dtype = (
                torch.float32
                if self._device in ('cpu', torch.device('cpu'))
                else torch.float16
            )
        self._dtype = dtype

        if not self._device.startswith('cuda') and (
            'OMP_NUM_THREADS' not in os.environ
            and hasattr(self.runtime_args, 'replicas')
        ):
            replicas = getattr(self.runtime_args, 'replicas', 1)
            num_threads = max(1, torch.get_num_threads() // replicas)
            if num_threads < 2:
                warnings.warn(
                    f'Too many replicas ({replicas}) vs too few threads {num_threads} may result in '
                    f'sub-optimal performance.'
                )

            # NOTE: make sure to set the threads right after the torch import,
            # and `torch.set_num_threads` always take precedence over environment variables `OMP_NUM_THREADS`.
            # For more details, please see https://pytorch.org/docs/stable/generated/torch.set_num_threads.html
            torch.set_num_threads(max(num_threads, 1))
            torch.set_num_interop_threads(1)

        self._num_worker_preprocess = num_worker_preprocess
        self._pool = ThreadPool(processes=num_worker_preprocess)

        self._model = CLIPModel(
            name, device=self._device, jit=jit, dtype=dtype, **kwargs
        )
        self._tokenizer = Tokenizer(name)
        self._image_transform = clip._transform_blob(self._model.image_size)

        if not self.tracer:
            self.tracer = NoOpTracer()

    def _preproc_images(self, docs: 'DocumentArray', drop_image_content: bool):
        with self.monitor(
            name='preprocess_images_seconds',
            documentation='images preprocess time in seconds',
        ):
            with self.tracer.start_as_current_span('preprocess_images'):
                return preproc_image(
                    docs,
                    preprocess_fn=self._image_transform,
                    device=self._device,
                    return_np=False,
                    drop_image_content=drop_image_content,
                    dtype=self._dtype,
                )

    def _preproc_texts(self, docs: 'DocumentArray'):
        with self.monitor(
            name='preprocess_texts_seconds',
            documentation='texts preprocess time in seconds',
        ):
            with self.tracer.start_as_current_span('preprocess_images'):
                return preproc_text(
                    docs,
                    tokenizer=self._tokenizer,
                    device=self._device,
                    return_np=False,
                )

    @requests(on='/rank')
    async def rank(self, docs: 'DocumentArray', parameters: Dict, **kwargs):
        _drop_image_content = parameters.get('drop_image_content', False)
        await self.encode(docs['@r,m'], drop_image_content=_drop_image_content)

        set_rank(docs)

    @requests
    async def encode(
        self,
        docs: 'DocumentArray',
        tracing_context=None,
        parameters: Dict = {},
        **kwargs,
    ):
        with self.tracer.start_as_current_span(
            'encode', context=tracing_context
        ) as span:
            span.set_attribute('device', self._device)
            span.set_attribute('runtime', 'torch')
            access_paths = parameters.get('access_paths', self._access_paths)
            if 'traversal_paths' in parameters:
                warnings.warn(
                    f'`traversal_paths` is deprecated. Use `access_paths` instead.'
                )
                access_paths = parameters['traversal_paths']
            _drop_image_content = parameters.get('drop_image_content', False)

            _img_da = DocumentArray()
            _txt_da = DocumentArray()
            for d in docs[access_paths]:
                split_img_txt_da(d, _img_da, _txt_da)

            with self.tracer.start_as_current_span('inference') as inference_span:
                with torch.inference_mode():
                    inference_span.set_attribute(
                        'drop_image_content', _drop_image_content
                    )
                    inference_span.set_attribute('minibatch_size', self._minibatch_size)
                    inference_span.set_attribute(
                        'has_img_da', True if _img_da else False
                    )
                    inference_span.set_attribute(
                        'has_txt_da', True if _txt_da else False
                    )
                    # for image
                    if _img_da:
                        with self.tracer.start_as_current_span(
                            'img_minibatch_encoding'
                        ) as img_encode_span:
                            img_encode_span.set_attribute(
                                'num_pool_workers', self._num_worker_preprocess
                            )
                            for minibatch, batch_data in _img_da.map_batch(
                                partial(
                                    self._preproc_images,
                                    drop_image_content=_drop_image_content,
                                ),
                                batch_size=self._minibatch_size,
                                pool=self._pool,
                            ):
                                with self.monitor(
                                    name='encode_images_seconds',
                                    documentation='images encode time in seconds',
                                ):
                                    minibatch.embeddings = (
                                        self._model.encode_image(**batch_data)
                                        .cpu()
                                        .numpy()
                                        .astype(np.float32)
                                    )

                    # for text
                    if _txt_da:
                        with self.tracer.start_as_current_span(
                            'txt_minibatch_encoding'
                        ) as txt_encode_span:
                            txt_encode_span.set_attribute(
                                'num_pool_workers', self._num_worker_preprocess
                            )
                            for minibatch, batch_data in _txt_da.map_batch(
                                self._preproc_texts,
                                batch_size=self._minibatch_size,
                                pool=self._pool,
                            ):
                                with self.monitor(
                                    name='encode_texts_seconds',
                                    documentation='texts encode time in seconds',
                                ):
                                    minibatch.embeddings = (
                                        self._model.encode_text(**batch_data)
                                        .cpu()
                                        .numpy()
                                        .astype(np.float32)
                                    )

        return docs


================================================
FILE: server/clip_server/executors/helper.py
================================================
from typing import Tuple, List, Callable, Any, Dict, Union
import torch
import numpy as np
from docarray import Document, DocumentArray
from docarray.math.distance.numpy import cosine
from clip_server.helper import __cast_dtype__


from clip_server.model.tokenization import Tokenizer


def numpy_softmax(x: 'np.ndarray', axis: int = -1) -> 'np.ndarray':
    max = np.max(x, axis=axis, keepdims=True)
    e_x = np.exp(x - max)
    div = np.sum(e_x, axis=axis, keepdims=True)
    f_x = e_x / div
    return f_x


def preproc_image(
    da: 'DocumentArray',
    preprocess_fn: Callable,
    device: str = 'cpu',
    return_np: bool = False,
    drop_image_content: bool = False,
    dtype: Union[str, torch.dtype] = torch.float32,
) -> Tuple['DocumentArray', Dict]:

    if isinstance(dtype, str):
        dtype = __cast_dtype__.get(dtype)

    tensors_batch = []

    for d in da:
        content = d.content
        if d.tensor is not None:
            d.convert_image_tensor_to_blob()
        elif d.content_type != 'blob' and d.uri:
            # in case user uses HTTP protocol and send data via curl not using .blob (base64), but in .uri
            d.load_uri_to_blob()

        tensors_batch.append(preprocess_fn(d.blob).detach())

        # recover doc content
        d.content = content
        if drop_image_content:
            d.pop('blob', 'tensor')

    tensors_batch = torch.stack(tensors_batch).type(dtype)

    if return_np:
        tensors_batch = tensors_batch.cpu().numpy()
    else:
        tensors_batch = tensors_batch.to(device)

    return da, {'pixel_values': tensors_batch}


def preproc_text(
    da: 'DocumentArray',
    tokenizer: 'Tokenizer',
    device: str = 'cpu',
    return_np: bool = False,
) -> Tuple['DocumentArray', Dict]:

    inputs = tokenizer(da.texts)
    inputs['input_ids'] = inputs['input_ids'].detach()

    if return_np:
        inputs['input_ids'] = inputs['input_ids'].cpu().numpy().astype(np.int32)
        inputs['attention_mask'] = (
            inputs['attention_mask'].cpu().numpy().astype(np.int32)
        )
    else:
        inputs['input_ids'] = inputs['input_ids'].to(device)
        inputs['attention_mask'] = inputs['attention_mask'].to(device)

    da[:, 'mime_type'] = 'text'
    return da, inputs


def split_img_txt_da(doc: 'Document', img_da: 'DocumentArray', txt_da: 'DocumentArray'):
    if doc.text:
        txt_da.append(doc)
    elif doc.blob or (doc.tensor is not None) or doc.uri:
        img_da.append(doc)


def set_rank(docs, _logit_scale=np.exp(4.60517)):
    queries = docs
    candidates = docs['@m']

    query_embeddings = queries.embeddings  # Q X D
    candidate_embeddings = candidates.embeddings  # C = Sum(C_q1, C_q2, C_q3,...) x D
    cosine_scores = 1 - cosine(
        query_embeddings, candidate_embeddings
    )  # Q x C Block matix
    start_idx = 0
    for q, _cosine_scores in zip(docs, cosine_scores):

        _candidates = q.matches

        end_idx = start_idx + len(_candidates)

        _candidate_cosines = _cosine_scores[start_idx:end_idx]
        _candidate_softmaxs = numpy_softmax(_logit_scale * _candidate_cosines)
        for c, _c_score, _s_score in zip(
            _candidates, _candidate_cosines, _candidate_softmaxs
        ):
            c.scores['clip_score'].value = _s_score
            c.scores['clip_score'].op_name = 'softmax'

            c.scores['clip_score_cosine'].value = _c_score
            c.scores['clip_score_cosine'].op_name = 'cosine'

        start_idx = end_idx

        _candidates.embeddings = None  # remove embedding to save bandwidth

        final = sorted(
            _candidates, key=lambda _m: _m.scores['clip_score'].value, reverse=True
        )

        q.matches = final


def get_image_size(name: str):
    from clip_server.model.pretrained_models import _VISUAL_MODEL_IMAGE_SIZE

    return _VISUAL_MODEL_IMAGE_SIZE[name]


================================================
FILE: server/clip_server/helper.py
================================================
import json
import os
import sys
import threading
import torch
from packaging.version import Version
from urllib.request import Request, urlopen

import pkg_resources
from rich import print
from rich.panel import Panel

__resources_path__ = os.path.join(
    os.path.dirname(
        sys.modules.get('clip_server').__file__
        if 'clip_server' in sys.modules
        else __file__
    ),
    'resources',
)


__cast_dtype__ = {'fp16': torch.float16, 'fp32': torch.float32, 'bf16': torch.bfloat16}


def _version_check(package: str = None, github_repo: str = None):
    try:

        if not package:
            package = vars(sys.modules[__name__])['__package__']
        if not github_repo:
            github_repo = package

        cur_ver = Version(pkg_resources.get_distribution(package).version)
        req = Request(
            f'https://pypi.python.org/pypi/{package}/json',
            headers={'User-Agent': 'Mozilla/5.0'},
        )
        with urlopen(
            req, timeout=1
        ) as resp:  # 'with' is important to close the resource after use
            j = json.load(resp)
            releases = j.get('releases', {})
            latest_release_ver = max(
                Version(v) for v in releases.keys() if '.dev' not in v
            )
            if cur_ver < latest_release_ver:
                print(
                    Panel(
                        f'You are using [b]{package} {cur_ver}[/b], but [bold green]{latest_release_ver}[/] is available. '
                        f'You may upgrade it via [b]pip install -U {package}[/b]. [link=https://github.com/jina-ai/{github_repo}/releases]Read Changelog here[/link].',
                        title=':new: New version available!',
                        width=50,
                    )
                )
    except:
        # no network, too slow, PyPi is down
        pass


def is_latest_version(package: str = None, github_repo: str = None) -> None:
    """Check if there is a latest version from Pypi, set env `NO_VERSION_CHECK` to disable it.

    :param package: package name if none auto-detected
    :param github_repo: repo name that contains CHANGELOG if none then the same as package name
    """

    threading.Thread(target=_version_check, args=(package, github_repo)).start()


================================================
FILE: server/clip_server/model/__init__.py
================================================


================================================
FILE: server/clip_server/model/clip.py
================================================
# Originally from https://github.com/openai/CLIP. MIT License, Copyright (c) 2021 OpenAI

import io

import pillow_avif
from PIL import Image
from torchvision.transforms import Compose, Resize, CenterCrop, ToTensor, Normalize

try:
    from torchvision.transforms import InterpolationMode

    BICUBIC = InterpolationMode.BICUBIC
except ImportError:
    BICUBIC = Image.BICUBIC


def _convert_image_to_rgb(image):
    return image.convert('RGB')


def _blob2image(blob):
    return Image.open(io.BytesIO(blob))


def _transform_blob(n_px):
    return Compose(
        [
            _blob2image,
            Resize(n_px, interpolation=BICUBIC),
            CenterCrop(n_px),
            _convert_image_to_rgb,
            ToTensor(),
            Normalize(
                (0.48145466, 0.4578275, 0.40821073),
                (0.26862954, 0.26130258, 0.27577711),
            ),
        ]
    )


def _transform_ndarray(n_px):
    return Compose(
        [
            ToTensor(),
            Resize(n_px, interpolation=BICUBIC),
            CenterCrop(n_px),
            Normalize(
                (0.48145466, 0.4578275, 0.40821073),
                (0.26862954, 0.26130258, 0.27577711),
            ),
        ]
    )


================================================
FILE: server/clip_server/model/clip_model.py
================================================
from clip_server.model.pretrained_models import (
    _OPENCLIP_MODELS,
    _MULTILINGUALCLIP_MODELS,
    _CNCLIP_MODELS,
    _VISUAL_MODEL_IMAGE_SIZE,
)


class BaseCLIPModel:
    def __init__(self, name: str, **kwargs):
        super().__init__()
        self._name = name

    @staticmethod
    def get_model_name(name: str):
        return name

    @property
    def model_name(self):
        return self.__class__.get_model_name(self._name)

    @property
    def image_size(self):
        return _VISUAL_MODEL_IMAGE_SIZE.get(self.model_name, None)


class CLIPModel(BaseCLIPModel):
    def __new__(cls, name: str, **kwargs):
        if cls is CLIPModel:
            if name in _OPENCLIP_MODELS:
                from clip_server.model.openclip_model import OpenCLIPModel

                instance = super().__new__(OpenCLIPModel)
            elif name in _MULTILINGUALCLIP_MODELS:
                from clip_server.model.mclip_model import MultilingualCLIPModel

                instance = super().__new__(MultilingualCLIPModel)
            elif name in _CNCLIP_MODELS:
                from clip_server.model.cnclip_model import CNClipModel

                instance = super().__new__(CNClipModel)
            else:
                raise ValueError(
                    'CLIP model {} not found; below is a list of all available models:\n{}'.format(
                        name,
                        ''.join(
                            [
                                '\t- {}\n'.format(i)
                                for i in list(_OPENCLIP_MODELS.keys())
                                + list(_MULTILINGUALCLIP_MODELS.keys())
                                + list(_CNCLIP_MODELS.keys())
                            ]
                        ),
                    )
                )
        else:
            instance = super().__new__(cls)
        return instance


================================================
FILE: server/clip_server/model/clip_onnx.py
================================================
import os
from typing import Dict, Optional

from clip_server.model.pretrained_models import (
    download_model,
    _OPENCLIP_MODELS,
    _MULTILINGUALCLIP_MODELS,
)
from clip_server.model.clip_model import BaseCLIPModel

_S3_BUCKET = (
    'https://clip-as-service.s3.us-east-2.amazonaws.com/models/onnx/'  # Deprecated
)
_S3_BUCKET_V2 = 'https://clip-as-service.s3.us-east-2.amazonaws.com/models-436c69702d61732d53657276696365/onnx/'
_MODELS = {
    'RN50::openai': (
        ('RN50/textual.onnx', '722418bfe47a1f5c79d1f44884bb3103'),
        ('RN50/visual.onnx', '5761475db01c3abb68a5a805662dcd10'),
    ),
    'RN50::yfcc15m': (
        ('RN50-yfcc15m/textual.onnx', '4ff2ea7228b9d2337b5440d1955c2108'),
        ('RN50-yfcc15m/visual.onnx', '87daa9b4a67449b5390a9a73b8c15772'),
    ),
    'RN50::cc12m': (
        ('RN50-cc12m/textual.onnx', '78fa0ae0ea47aca4b8864f709c48dcec'),
        ('RN50-cc12m/visual.onnx', '0e04bf92f3c181deea2944e322ebee77'),
    ),
    'RN101::openai': (
        ('RN101/textual.onnx', '2d9efb7d184c0d68a369024cedfa97af'),
        ('RN101/visual.onnx', '0297ebc773af312faab54f8b5a622d71'),
    ),
    'RN101::yfcc15m': (
        ('RN101-yfcc15m/textual.onnx', '7aa2a4e3d5b960998a397a6712389f08'),
        ('RN101-yfcc15m/visual.onnx', '681a72dd91c9c79464947bf29b623cb4'),
    ),
    'RN50x4::openai': (
        ('RN50x4/textual.onnx', 'd9d63d3fe35fb14d4affaa2c4e284005'),
        ('RN50x4/visual.onnx', '16afe1e35b85ad862e8bbdb12265c9cb'),
    ),
    'RN50x16::openai': (
        ('RN50x16/textual.onnx', '1525785494ff5307cadc6bfa56db6274'),
        ('RN50x16/visual.onnx', '2a293d9c3582f8abe29c9999e47d1091'),
    ),
    'RN50x64::openai': (
        ('RN50x64/textual.onnx', '3ae8ade74578eb7a77506c11bfbfaf2c'),
        ('RN50x64/visual.onnx', '1341f10b50b3aca6d2d5d13982cabcfc'),
    ),
    'ViT-B-32::openai': (
        ('ViT-B-32/textual.onnx', 'bd6d7871e8bb95f3cc83aff3398d7390'),
        ('ViT-B-32/visual.onnx', '88c6f38e522269d6c04a85df18e6370c'),
    ),
    'ViT-B-32::laion2b_e16': (
        ('ViT-B-32-laion2b_e16/textual.onnx', 'aa6eac88fe77d21f337e806417957497'),
        ('ViT-B-32-laion2b_e16/visual.onnx', '0cdc00a9dfad560153d40aced9df0c8f'),
    ),
    'ViT-B-32::laion400m_e31': (
        ('ViT-B-32-laion400m_e31/textual.onnx', '832f417bf1b3f1ced8f9958eda71665c'),
        ('ViT-B-32-laion400m_e31/visual.onnx', '62326b925ae342313d4cc99c2741b313'),
    ),
    'ViT-B-32::laion400m_e32': (
        ('ViT-B-32-laion400m_e32/textual.onnx', '93284915937ba42a2b52ae8d3e5283a0'),
        ('ViT-B-32-laion400m_e32/visual.onnx', 'db220821a31fe9795fd8c2ba419078c5'),
    ),
    'ViT-B-32::laion2b-s34b-b79k': (
        ('ViT-B-32-laion2b-s34b-b79k/textual.onnx', '84af5ae53da56464c76e67fe50fddbe9'),
        ('ViT-B-32-laion2b-s34b-b79k/visual.onnx', 'a2d4cbd1cf2632cd09ffce9b40bfd8bd'),
    ),
    'ViT-B-16::openai': (
        ('ViT-B-16/textual.onnx', '6f0976629a446f95c0c8767658f12ebe'),
        ('ViT-B-16/visual.onnx', 'd5c03bfeef1abbd9bede54a8f6e1eaad'),
    ),
    'ViT-B-16::laion400m_e31': (
        ('ViT-B-16-laion400m_e31/textual.onnx', '5db27763c06c06c727c90240264bf4f7'),
        ('ViT-B-16-laion400m_e31/visual.onnx', '04a6a780d855a36eee03abca64cd5361'),
    ),
    'ViT-B-16::laion400m_e32': (
        ('ViT-B-16-laion400m_e32/textual.onnx', '9abe000a51b6f1cbaac8fde601b16725'),
        ('ViT-B-16-laion400m_e32/visual.onnx', 'd38c144ac3ad7fbc1966f88ff8fa522f'),
    ),
    'ViT-B-16-plus-240::laion400m_e31': (
        (
            'ViT-B-16-plus-240-laion400m_e31/textual.onnx',
            '2b524e7a530a98010cc7e57756937c5c',
        ),
        (
            'ViT-B-16-plus-240-laion400m_e31/visual.onnx',
            'a78989da3300fd0c398a9877dd26a9f1',
        ),
    ),
    'ViT-B-16-plus-240::laion400m_e32': (
        (
            'ViT-B-16-plus-240-laion400m_e32/textual.onnx',
            '53c8d26726b386ca0749207876482907',
        ),
        (
            'ViT-B-16-plus-240-laion400m_e32/visual.onnx',
            '7a32c4272c1ee46f734486570d81584b',
        ),
    ),
    'ViT-L-14::openai': (
        ('ViT-L-14/textual.onnx', '325380b31af4837c2e0d9aba2fad8e1b'),
        ('ViT-L-14/visual.onnx', '53f5b319d3dc5d42572adea884e31056'),
    ),
    'ViT-L-14::laion400m_e31': (
        ('ViT-L-14-laion400m_e31/textual.onnx', '36216b85e32668ea849730a54e1e09a4'),
        ('ViT-L-14-laion400m_e31/visual.onnx', '15fa5a24916e2a58325c5cf70350c300'),
    ),
    'ViT-L-14::laion400m_e32': (
        ('ViT-L-14-laion400m_e32/textual.onnx', '8ba5b76ba71992923470c0261b10a67c'),
        ('ViT-L-14-laion400m_e32/visual.onnx', '49db3ba92bd816001e932530ad92d76c'),
    ),
    'ViT-L-14::laion2b-s32b-b82k': (
        ('ViT-L-14-laion2b-s32b-b82k/textual.onnx', 'da36a6cbed4f56abf576fdea8b6fe2ee'),
        ('ViT-L-14-laion2b-s32b-b82k/visual.onnx', '1e337a190abba6a8650237dfae4740b7'),
    ),
    'ViT-L-14-336::openai': (
        ('ViT-L-14@336px/textual.onnx', '78fab479f136403eed0db46f3e9e7ed2'),
        ('ViT-L-14@336px/visual.onnx', 'f3b1f5d55ca08d43d749e11f7e4ba27e'),
    ),
    'ViT-H-14::laion2b-s32b-b79k': (
        ('ViT-H-14-laion2b-s32b-b79k/textual.onnx', '41e73c0c871d0e8e5d5e236f917f1ec3'),
        ('ViT-H-14-laion2b-s32b-b79k/visual.zip', '38151ea5985d73de94520efef38db4e7'),
    ),
    'ViT-g-14::laion2b-s12b-b42k': (
        ('ViT-g-14-laion2b-s12b-b42k/textual.onnx', 'e597b7ab4414ecd92f715d47e79a033f'),
        ('ViT-g-14-laion2b-s12b-b42k/visual.zip', '6d0ac4329de9b02474f4752a5d16ba82'),
    ),
    # older version name format
    'RN50': (
        ('RN50/textual.onnx', '722418bfe47a1f5c79d1f44884bb3103'),
        ('RN50/visual.onnx', '5761475db01c3abb68a5a805662dcd10'),
    ),
    'RN101': (
        ('RN101/textual.onnx', '2d9efb7d184c0d68a369024cedfa97af'),
        ('RN101/visual.onnx', '0297ebc773af312faab54f8b5a622d71'),
    ),
    'RN50x4': (
        ('RN50x4/textual.onnx', 'd9d63d3fe35fb14d4affaa2c4e284005'),
        ('RN50x4/visual.onnx', '16afe1e35b85ad862e8bbdb12265c9cb'),
    ),
    'RN50x16': (
        ('RN50x16/textual.onnx', '1525785494ff5307cadc6bfa56db6274'),
        ('RN50x16/visual.onnx', '2a293d9c3582f8abe29c9999e47d1091'),
    ),
    'RN50x64': (
        ('RN50x64/textual.onnx', '3ae8ade74578eb7a77506c11bfbfaf2c'),
        ('RN50x64/visual.onnx', '1341f10b50b3aca6d2d5d13982cabcfc'),
    ),
    'ViT-B/32': (
        ('ViT-B-32/textual.onnx', 'bd6d7871e8bb95f3cc83aff3398d7390'),
        ('ViT-B-32/visual.onnx', '88c6f38e522269d6c04a85df18e6370c'),
    ),
    'ViT-B/16': (
        ('ViT-B-16/textual.onnx', '6f0976629a446f95c0c8767658f12ebe'),
        ('ViT-B-16/visual.onnx', 'd5c03bfeef1abbd9bede54a8f6e1eaad'),
    ),
    'ViT-L/14': (
        ('ViT-L-14/textual.onnx', '325380b31af4837c2e0d9aba2fad8e1b'),
        ('ViT-L-14/visual.onnx', '53f5b319d3dc5d42572adea884e31056'),
    ),
    'ViT-L/14@336px': (
        ('ViT-L-14@336px/textual.onnx', '78fab479f136403eed0db46f3e9e7ed2'),
        ('ViT-L-14@336px/visual.onnx', 'f3b1f5d55ca08d43d749e11f7e4ba27e'),
    ),
    # MultilingualCLIP models
    'M-CLIP/LABSE-Vit-L-14': (
        ('M-CLIP-LABSE-Vit-L-14/textual.onnx', '03727820116e63c7d19c72bb5d839488'),
        ('M-CLIP-LABSE-Vit-L-14/visual.onnx', 'a78028eab30084c3913edfb0c8411f15'),
    ),
    'M-CLIP/XLM-Roberta-Large-Vit-B-32': (
        (
            'M-CLIP-XLM-Roberta-Large-Vit-B-32/textual.zip',
            '41f51ec9af4754d11c7b7929e2caf5b9',
        ),
        (
            'M-CLIP-XLM-Roberta-Large-Vit-B-32/visual.onnx',
            '5f18f68ac94e294863bfd1f695c8c5ca',
        ),
    ),
    'M-CLIP/XLM-Roberta-Large-Vit-B-16Plus': (
        (
            'M-CLIP-XLM-Roberta-Large-Vit-B-16Plus/textual.zip',
            '6c3e55f7d2d6c12f2c1f1dd36fdec607',
        ),
        (
            'M-CLIP-XLM-Roberta-Large-Vit-B-16Plus/visual.onnx',
            '467a3ef3e5f50abcf850c3db9e705f8e',
        ),
    ),
    'M-CLIP/XLM-Roberta-Large-Vit-L-14': (
        (
            'M-CLIP-XLM-Roberta-Large-Vit-L-14/textual.zip',
            '3dff00335dc3093acb726dab975ae57d',
        ),
        (
            'M-CLIP-XLM-Roberta-Large-Vit-L-14/visual.onnx',
            'a78028eab30084c3913edfb0c8411f15',
        ),
    ),
}


class CLIPOnnxModel(BaseCLIPModel):
    def __init__(
        self, name: str, model_path: str = None, dtype: Optional[str] = 'fp32'
    ):
        super().__init__(name)
        self._dtype = dtype
        if name in _MODELS:
            if not model_path:
                cache_dir = os.path.expanduser(
                    f'~/.cache/clip/{name.replace("/", "-").replace("::", "-")}'
                )
                textual_model_name, textual_model_md5 = _MODELS[name][0]
                self._textual_path = download_model(
                    url=_S3_BUCKET_V2 + textual_model_name,
                    target_folder=cache_dir,
                    md5sum=textual_model_md5,
                    with_resume=True,
                )
                visual_model_name, visual_model_md5 = _MODELS[name][1]
                self._visual_path = download_model(
                    url=_S3_BUCKET_V2 + visual_model_name,
                    target_folder=cache_dir,
                    md5sum=visual_model_md5,
                    with_resume=True,
                )
            else:
                if os.path.isdir(model_path):
                    self._textual_path = os.path.join(model_path, 'textual.onnx')
                    self._visual_path = os.path.join(model_path, 'visual.onnx')
                    if not os.path.isfile(self._textual_path) or not os.path.isfile(
                        self._visual_path
                    ):
                        raise RuntimeError(
                            f'The given model path {model_path} does not contain `textual.onnx` and `visual.onnx`'
                        )
                else:
                    raise RuntimeError(
                        f'The given model path {model_path} should be a folder containing both '
                        f'`textual.onnx` and `visual.onnx`.'
                    )
        else:
            raise RuntimeError(
                'CLIP model {} not found or not supports ONNX backend; below is a list of all available models:\n{}'.format(
                    name,
                    ''.join(['\t- {}\n'.format(i) for i in list(_MODELS.keys())]),
                )
            )

    @staticmethod
    def get_model_name(name: str):
        if name in _OPENCLIP_MODELS:
            from clip_server.model.openclip_model import OpenCLIPModel

            return OpenCLIPModel.get_model_name(name)
        elif name in _MULTILINGUALCLIP_MODELS:
            from clip_server.model.mclip_model import MultilingualCLIPModel

            return MultilingualCLIPModel.get_model_name(name)

        return name

    def start_sessions(
        self,
        dtype,
        **kwargs,
    ):
        import onnxruntime as ort

        def _load_session(model_path: str, model_type: str, dtype: str):
            if model_path.endswith('.zip') or dtype == 'fp16':
                import tempfile

                with tempfile.TemporaryDirectory() as tmp_dir:
                    tmp_model_path = tmp_dir + f'/{model_type}.onnx'
                    if model_path.endswith('.zip'):
                        import zipfile

                        with zipfile.ZipFile(model_path, 'r') as zip_ref:
                            zip_ref.extractall(tmp_dir)
                            model_path = tmp_model_path
                    if dtype == 'fp16':
                        import onnx
                        from onnxmltools.utils import float16_converter

                        model_fp16 = (
                            float16_converter.convert_float_to_float16_model_path(
                                model_path
                            )
                        )
                        onnx.save_model(model_fp16, tmp_model_path)
                    return ort.InferenceSession(tmp_model_path, **kwargs)
            return ort.InferenceSession(model_path, **kwargs)

        self._visual_session = _load_session(self._visual_path, 'visual', dtype)
        self._textual_session = _load_session(self._textual_path, 'textual', dtype)
        self._visual_session.disable_fallback()
        self._textual_session.disable_fallback()

    def encode_image(self, image_input: Dict):
        (visual_output,) = self._visual_session.run(None, image_input)
        return visual_output

    def encode_text(self, text_input: Dict):
        (textual_output,) = self._textual_session.run(None, text_input)
        return textual_output


================================================
FILE: server/clip_server/model/clip_trt.py
================================================
import os
from typing import Dict

try:
    import tensorrt as trt
    from tensorrt.tensorrt import Logger, Runtime

    from clip_server.model.trt_utils import load_engine, build_engine, save_engine
except ImportError:
    raise ImportError(
        "It seems that TensorRT is not yet installed. "
        "It is required when you declare TensorRT backend."
        "Please find installation instruction on "
        "https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html"
    )
from clip_server.model.pretrained_models import (
    _OPENCLIP_MODELS,
    _MULTILINGUALCLIP_MODELS,
)
from clip_server.model.clip_model import BaseCLIPModel
from clip_server.model.clip_onnx import _MODELS as ONNX_MODELS

_MODELS = [
    'RN50::openai',
    'RN50::yfcc15m',
    'RN50::cc12m',
    'RN101::openai',
    'RN101::yfcc15m',
    'RN50x4::openai',
    'ViT-B-32::openai',
    'ViT-B-32::laion2b_e16',
    'ViT-B-32::laion400m_e31',
    'ViT-B-32::laion400m_e32',
    'ViT-B-16::openai',
    'ViT-B-16::laion400m_e31',
    'ViT-B-16::laion400m_e32',
    # older version name format
    'RN50',
    'RN101',
    'RN50x4',
    # 'RN50x16',
    # 'RN50x64',
    'ViT-B/32',
    'ViT-B/16',
    # 'ViT-L/14',
    # 'ViT-L/14@336px',
]


class CLIPTensorRTModel(BaseCLIPModel):
    def __init__(
        self,
        name: str,
    ):
        super().__init__(name)

        if name in _MODELS:
            cache_dir = os.path.expanduser(
                f'~/.cache/clip/{name.replace("/", "-").replace("::", "-")}'
            )

            self._textual_path = os.path.join(
                cache_dir,
                f'textual.{ONNX_MODELS[name][0][1]}.trt',
            )
            self._visual_path = os.path.join(
                cache_dir,
                f'visual.{ONNX_MODELS[name][1][1]}.trt',
            )

            if not os.path.exists(self._textual_path) or not os.path.exists(
                self._visual_path
            ):
                from clip_server.model.clip_onnx import CLIPOnnxModel

                trt_logger: Logger = trt.Logger(trt.Logger.ERROR)
                runtime: Runtime = trt.Runtime(trt_logger)
                onnx_model = CLIPOnnxModel(name)

                visual_engine = build_engine(
                    runtime=runtime,
                    onnx_file_path=onnx_model._visual_path,
                    logger=trt_logger,
                    min_shape=(1, 3, onnx_model.image_size, onnx_model.image_size),
                    optimal_shape=(
                        768,
                        3,
                        onnx_model.image_size,
                        onnx_model.image_size,
                    ),
                    max_shape=(
                        1024,
                        3,
                        onnx_model.image_size,
                        onnx_model.image_size,
                    ),
                    workspace_size=10000 * 1024 * 1024,
                    fp16=False,
                    int8=False,
                )
                save_engine(visual_engine, self._visual_path)

                text_engine = build_engine(
                    runtime=runtime,
                    onnx_file_path=onnx_model._textual_path,
                    logger=trt_logger,
                    min_shape=(1, 77),
                    optimal_shape=(768, 77),
                    max_shape=(1024, 77),
                    workspace_size=10000 * 1024 * 1024,
                    fp16=False,
                    int8=False,
                )
                save_engine(text_engine, self._textual_path)
        else:
            raise RuntimeError(
                'CLIP model {} not found or not supports Nvidia TensorRT backend; below is a list of all available models:\n{}'.format(
                    name,
                    ''.join(['\t- {}\n'.format(i) for i in list(_MODELS.keys())]),
                )
            )

    @staticmethod
    def get_model_name(name: str):
        if name in _OPENCLIP_MODELS:
            from clip_server.model.openclip_model import OpenCLIPModel

            return OpenCLIPModel.get_model_name(name)
        elif name in _MULTILINGUALCLIP_MODELS:
            from clip_server.model.mclip_model import MultilingualCLIPModel

            return MultilingualCLIPModel.get_model_name(name)

        return name

    def start_engines(self):
        trt_logger: Logger = trt.Logger(trt.Logger.ERROR)
        runtime: Runtime = trt.Runtime(trt_logger)
        self._textual_engine = load_engine(runtime, self._textual_path)
        self._visual_engine = load_engine(runtime, self._visual_path)

    def encode_image(self, image_input: Dict):
        (visual_output,) = self._visual_engine(image_input)
        return visual_output

    def encode_text(self, text_input: Dict):
        (textual_output,) = self._textual_engine(text_input)
        return textual_output


================================================
FILE: server/clip_server/model/cnclip_model.py
================================================
# Originally from https://github.com/OFA-Sys/Chinese-CLIP. MIT License.

import torch

from clip_server.model.clip_model import CLIPModel
from clip_server.model.pretrained_models import _VISUAL_MODEL_IMAGE_SIZE
from cn_clip.clip import load_from_name

_CNCLIP_MODEL_MAPS = {
    'CN-CLIP/ViT-B-16': 'ViT-B-16',
    'CN-CLIP/ViT-L-14': 'ViT-L-14',
    'CN-CLIP/ViT-L-14-336': 'ViT-L-14-336',
    'CN-CLIP/ViT-H-14': 'ViT-H-14',
    'CN-CLIP/RN50': 'RN50',
}


class CNClipModel(CLIPModel):
    def __init__(
        self,
        name: str,
        device: str = 'cpu',
        jit: bool = False,
        dtype: str = None,
        **kwargs
    ):
        super().__init__(name, **kwargs)
        self._name = _CNCLIP_MODEL_MAPS[name]

        self._model, self._preprocess = load_from_name(
            _CNCLIP_MODEL_MAPS[name], device=device
        )
        self._model.eval()

    @staticmethod
    def get_model_name(name: str):
        return _CNCLIP_MODEL_MAPS[name]

    def encode_text(self, input_ids: 'torch.Tensor', **kwargs):
        return self._model.encode_text(input_ids).detach()

    def encode_image(self, pixel_values: 'torch.Tensor', **kwargs):
        return self._model.encode_image(pixel_values).detach()

    @property
    def model_name(self):
        return self.__class__.get_model_name(self._name)

    @property
    def image_size(self):
        return _VISUAL_MODEL_IMAGE_SIZE.get(self._name, None)


================================================
FILE: server/clip_server/model/flash_attention.py
================================================
import torch
import torch.nn as nn
from torch import Tensor
from typing import Optional, Tuple

from torch.nn.functional import linear
from flash_attn.flash_attn_interface import flash_attn_unpadded_func


class MultiheadAttention(nn.MultiheadAttention):
    def __init__(
        self,
        embed_dim,
        num_heads,
        dropout=0,
        bias=True,
        add_bias_kv=False,
        add_zero_attn=False,
        kdim=None,
        vdim=None,
        batch_first=False,
        device=None,
        dtype=None,
    ) -> None:
        super().__init__(
            embed_dim,
            num_heads,
            dropout,
            bias,
            add_bias_kv,
            add_zero_attn,
            kdim,
            vdim,
            batch_first,
            device,
            dtype,
        )

    def attention(
        self,
        q,
        k,
        v,
        batch_size=1,
        seqlen=77,
        softmax_scale=None,
        attention_dropout=0.0,
        causal=False,
        cu_seqlens=None,
        max_s=None,
        need_weights=False,
    ):
        """Implements the multihead softmax attention.
        Arguments
        ---------
            q,k,v: The tensor containing the query, key, and value. each of (B*S, H, D)
            key_padding_mask: a bool tensor of shape (B, S)

        """

        if cu_seqlens is None:
            max_s = seqlen
            cu_seqlens = torch.arange(
                0,
                (batch_size + 1) * seqlen,
                step=seqlen,
                dtype=torch.int32,
                device=q.device,
            )
            output = flash_attn_unpadded_func(
                q,
                k,
                v,
                cu_seqlens,
                cu_seqlens,
                max_s,
                max_s,
                attention_dropout,
                softmax_scale=softmax_scale,
                causal=causal,
            )

        return output

    def forward(
        self,
        query: Tensor,
        key: Tensor,
        value: Tensor,
        key_padding_mask: Optional[Tensor] = None,
        need_weights: bool = False,
        attn_mask: Optional[Tensor] = None,
        average_attn_weights: bool = True,
    ) -> Tuple[Tensor, Optional[Tensor]]:
        # set up shape vars
        seqlen, batch_size, embed_dim = query.shape

        # in-projection and rearrange `b s (h d) -> (b s) h d`
        q, k, v = linear(query, self.in_proj_weight, self.in_proj_bias).chunk(3, dim=-1)
        q = (
            q.transpose(0, 1)
            .contiguous()
            .view(batch_size * seqlen, self.num_heads, self.head_dim)
        )
        k = (
            k.transpose(0, 1)
            .contiguous()
            .view(batch_size * seqlen, self.num_heads, self.head_dim)
        )
        v = (
            v.transpose(0, 1)
            .contiguous()
            .view(batch_size * seqlen, self.num_heads, self.head_dim)
        )

        # flash attention (use causal mask)
        causal = attn_mask is not None
        attn_output = self.attention(q, k, v, batch_size, seqlen, causal=causal)

        # out-projection
        # `(b s) h d -> s b (h d)`
        attn_output = attn_output.contiguous().view(
            batch_size, seqlen, self.num_heads, self.head_dim
        )
        attn_output = (
            attn_output.transpose(0, 1).contiguous().view(seqlen, batch_size, embed_dim)
        )
        attn_output = linear(attn_output, self.out_proj.weight, self.out_proj.bias)

        return attn_output, None


================================================
FILE: server/clip_server/model/mclip_model.py
================================================
# Originally from https://github.com/FreddeFrallan/Multilingual-CLIP. MIT License, Copyright (c) 2022 Multilingual-CLIP

import transformers
import torch

from clip_server.model.clip_model import CLIPModel
from clip_server.model.openclip_model import OpenCLIPModel

_CLIP_MODEL_MAPS = {
    'M-CLIP/XLM-Roberta-Large-Vit-B-32': 'ViT-B-32::openai',
    'M-CLIP/XLM-Roberta-Large-Vit-L-14': 'ViT-L-14::openai',
    'M-CLIP/XLM-Roberta-Large-Vit-B-16Plus': 'ViT-B-16-plus-240::laion400m_e31',
    'M-CLIP/LABSE-Vit-L-14': 'ViT-L-14::openai',
}


class MCLIPConfig(transformers.PretrainedConfig):
    model_type = "M-CLIP"

    def __init__(
        self,
        modelBase: str = 'xlm-roberta-large',
        transformerDimSize: int = 1024,
        imageDimSize: int = 768,
        **kwargs
    ):
        self.transformerDimensions = transformerDimSize
        self.numDims = imageDimSize
        self.modelBase = modelBase
        super().__init__(**kwargs)


class MultilingualCLIP(transformers.PreTrainedModel):
    config_class = MCLIPConfig

    def __init__(self, config, *args, **kwargs):
        super().__init__(config, *args, **kwargs)
        self.transformer = transformers.AutoModel.from_pretrained(config.modelBase)
        self.LinearTransformation = torch.nn.Linear(
            in_features=config.transformerDimensions, out_features=config.numDims
        )

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor, **kwargs):
        embs = self.transformer(
            input_ids=input_ids, attention_mask=attention_mask, **kwargs
        )[0]
        embs = (embs * attention_mask.unsqueeze(2)).sum(dim=1) / attention_mask.sum(
            dim=1
        )[:, None]
        return self.LinearTransformation(embs)


class MultilingualCLIPModel(CLIPModel):
    def __init__(self, name: str, device: str = 'cpu', jit: bool = False, **kwargs):
        super().__init__(name, **kwargs)
        self._mclip_model = MultilingualCLIP.from_pretrained(name)
        self._mclip_model.to(device=device)
        self._mclip_model.eval()
        self._model = OpenCLIPModel(_CLIP_MODEL_MAPS[name], device=device, jit=jit)

    @staticmethod
    def get_model_name(name: str):
        return _CLIP_MODEL_MAPS[name].split('::')[0]

    def encode_text(
        self, input_ids: 'torch.Tensor', attention_mask: 'torch.Tensor', **kwargs
    ):
        return self._mclip_model(
            input_ids=input_ids, attention_mask=attention_mask, **kwargs
        )

    def encode_image(self, pixel_values: torch.Tensor):
        return self._model.encode_image(pixel_values)


================================================
FILE: server/clip_server/model/model.py
================================================
""" CLIP Model

Adapted from https://github.com/mlfoundations/open_clip.

Originally MIT License, Copyright (c) 2012-2021 Gabriel Ilharco, Mitchell Wortsman,
Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar,
John Miller, Hongseok Namkoong, Hannaneh Hajishirzi, Ali Farhadi,
Ludwig Schmidt
"""

import warnings
import torch
import numpy as np
from torch import nn
from dataclasses import dataclass
from typing import Tuple, Union, Optional
from copy import deepcopy
from clip_server.helper import __cast_dtype__
from open_clip.transformer import QuickGELU, LayerNorm, LayerNormFp32, Attention
from open_clip.timm_model import TimmModel
from open_clip.factory import _MODEL_CONFIGS
from open_clip.hf_model import HFTextEncoder
from open_clip.transformer import ResidualAttentionBlock as _ResidualAttentionBlock
from open_clip.transformer import Transformer as _Transformer
from open_clip.transformer import VisionTransformer as _VisionTransformer
from open_clip.transformer import TextTransformer as _TextTransformer
from open_clip.modified_resnet import ModifiedResNet as _ModifiedResNet
from open_clip.model import CustomTextCLIP as _CustomTextCLIP
from open_clip.model import CLIP as _CLIP

# Use flash attention
try:
    from clip_server.model.flash_attention import MultiheadAttention

    FLASH_ATTENTION_AVAILABLE = True
except:
    FLASH_ATTENTION_AVAILABLE = False


class ModifiedResNet(_ModifiedResNet):
    def forward(self, x):
        # To handle fp16 inference
        x = x.type(self.conv1.weight.dtype)
        return super().forward(x)


class ResidualAttentionBlock(_ResidualAttentionBlock):
    def __init__(
        self, width: int, heads: int, dtype: torch.dtype = torch.float32, **kwargs
    ):
        super().__init__(width, heads, **kwargs)
        head_dim = width // heads
        flash_attention = head_dim % 8 == 0 and head_dim <= 128

        self.attn = (
            MultiheadAttention(width, heads)
            if FLASH_ATTENTION_AVAILABLE
            and torch.cuda.is_available()
            and dtype in (torch.float16, torch.bfloat16)
            and flash_attention
            else nn.MultiheadAttention(width, heads)
        )


class Transformer(_Transformer):
    def __init__(self, layers: int, dtype: torch.dtype = torch.float32, **kwargs):
        super().__init__(layers=layers, **kwargs)
        self.resblocks = nn.ModuleList(
            [ResidualAttentionBlock(dtype=dtype, **kwargs) for _ in range(layers)]
        )


class VisionTransformer(_VisionTransformer):
    def __init__(
        self,
        image_size: int,
        patch_size: int,
        global_average_pool: bool,
        output_dim: int,
        dtype: torch.dtype = torch.float32,
        **kwargs,
    ):
        super().__init__(
            image_size,
            patch_size,
            global_average_pool=global_average_pool,
            output_dim=output_dim,
            **kwargs,
        )
        self.transformer = Transformer(dtype=dtype, **kwargs)

    def forward(self, x: torch.Tensor):
        dtype = self.transformer.get_cast_dtype()
        x = x.to(dtype)
        return super().forward(x)


class TextTransformer(_TextTransformer):
    def __init__(
        self,
        context_length: int,
        vocab_size: int,
        output_dim: int,
        dtype: torch.dtype = torch.float32,
        **kwargs,
    ):
        super().__init__(context_length, vocab_size, output_dim=output_dim, **kwargs)
        self.transformer = Transformer(dtype=dtype, **kwargs)
        self.init_parameters()


@dataclass
class CLIPVisionCfg:
    layers: Union[Tuple[int, int, int, int], int] = 12
    width: int = 768
    head_width: int = 64
    mlp_ratio: float = 4.0
    patch_size: int = 16
    image_size: Union[Tuple[int, int], int] = 224
    ls_init_value: Optional[float] = None  # layer scale initial value
    global_average_pool: bool = False  # whether to global average pool the last embedding layer, instead of using CLS token (https://arxiv.org/abs/2205.01580)
    timm_model_name: str = (
        None  # a valid model name overrides layers, width, patch_size
    )
    timm_model_pretrained: bool = (
        False  # use (imagenet) pretrained weights for named model
    )
    timm_pool: str = (
        'avg'  # feature pooling for timm model ('abs_attn', 'rot_attn', 'avg', '')
    )
    timm_proj: str = (
        'linear'  # linear projection for timm model output ('linear', 'mlp', '')
    )
    timm_proj_bias: bool = False  # enable bias final projection


@dataclass
class CLIPTextCfg:
    context_length: int = 77
    vocab_size: int = 49408
    width: int = 512
    heads: int = 8
    layers: int = 12
    ls_init_value: Optional[float] = None  # layer scale initial value
    hf_model_name: str = None
    hf_tokenizer_name: str = None
    hf_model_pretrained: bool = True
    proj: str = 'mlp'
    pooler_type: str = 'mean_pooler'


def _build_vision_tower(
    embed_dim: int,
    vision_cfg: CLIPVisionCfg,
    quick_gelu: bool = False,
    dtype: Optional[torch.dtype] = torch.float32,
):
    if isinstance(vision_cfg, dict):
        vision_cfg = CLIPVisionCfg(**vision_cfg)

    # OpenAI models are pretrained w/ QuickGELU but native nn.GELU is both faster and more
    # memory efficient in recent PyTorch releases (>= 1.10).
    # NOTE: timm models always use native GELU regardless of quick_gelu flag.
    act_layer = QuickGELU if quick_gelu else nn.GELU

    if vision_cfg.timm_model_name:
        visual = TimmModel(
            model_name=vision_cfg.timm_model_name,
            pretrained=vision_cfg.timm_model_pretrained,
            pool=vision_cfg.timm_pool,
            proj=vision_cfg.timm_proj,
            proj_bias=vision_cfg.timm_proj_bias,
            embed_dim=embed_dim,
            image_size=vision_cfg.image_size,
        )
        act_layer = (
            nn.GELU
        )  # so that text transformer doesn't use QuickGELU w/ timm models
    elif isinstance(vision_cfg.layers, (tuple, list)):
        vision_heads = vision_cfg.width * 32 // vision_cfg.head_width
        visual = ModifiedResNet(
            layers=vision_cfg.layers,
            output_dim=embed_dim,
            heads=vision_heads,
            image_size=vision_cfg.image_size,
            width=vision_cfg.width,
        )
    else:
        vision_heads = vision_cfg.width // vision_cfg.head_width
        norm_layer = (
            LayerNormFp32 if dtype in (torch.float16, torch.bfloat16) else LayerNorm
        )
        visual = VisionTransformer(
            image_size=vision_cfg.image_size,
            patch_size=vision_cfg.patch_size,
            width=vision_cfg.width,
            layers=vision_cfg.layers,
            heads=vision_heads,
            mlp_ratio=vision_cfg.mlp_ratio,
            ls_init_value=vision_cfg.ls_init_value,
            global_average_pool=vision_cfg.global_average_pool,
            output_dim=embed_dim,
            act_layer=act_layer,
            norm_layer=norm_layer,
            dtype=dtype,
        )

    return visual


def _build_text_tower(
    embed_dim: int,
    text_cfg: CLIPTextCfg,
    quick_gelu: bool = False,
    dtype: Optional[torch.dtype] = torch.float32,
):
    if isinstance(text_cfg, dict):
        text_cfg = CLIPTextCfg(**text_cfg)

    if text_cfg.hf_model_name:
        text = HFTextEncoder(
            text_cfg.hf_model_name,
            output_dim=embed_dim,
            proj=text_cfg.proj,
            pooler_type=text_cfg.pooler_type,
            pretrained=text_cfg.hf_model_pretrained,
        )
    else:
        act_layer = QuickGELU if quick_gelu else nn.GELU
        norm_layer = (
            LayerNormFp32 if dtype in (torch.float16, torch.bfloat16) else LayerNorm
        )

        text = TextTransformer(
            context_length=text_cfg.context_length,
            vocab_size=text_cfg.vocab_size,
            width=text_cfg.width,
            heads=text_cfg.heads,
            layers=text_cfg.layers,
            ls_init_value=text_cfg.ls_init_value,
            output_dim=embed_dim,
            act_layer=act_layer,
            norm_layer=norm_layer,
            dtype=dtype,
        )
    return text


class CustomTextCLIP(_CustomTextCLIP):
    def __init__(
        self,
        embed_dim: int,
        vision_cfg: CLIPVisionCfg,
        text_cfg: CLIPTextCfg,
        quick_gelu: bool = False,
        dtype: Optional[torch.dtype] = torch.float32,
    ):
        super().__init__(embed_dim, vision_cfg, text_cfg, quick_gelu, dtype)
        self.visual = _build_vision_tower(
            embed_dim=embed_dim,
            vision_cfg=vision_cfg,
            quick_gelu=quick_gelu,
            dtype=dtype,
        )
        self.text = _build_text_tower(
            embed_dim=embed_dim, text_cfg=text_cfg, quick_gelu=quick_gelu, dtype=dtype
        )


class CLIP(_CLIP):
    def __init__(
        self,
        embed_dim: int,
        vision_cfg: CLIPVisionCfg,
        text_cfg: CLIPTextCfg,
        quick_gelu: bool = False,
        dtype: Optional[torch.dtype] = torch.float32,
    ):
        nn.Module.__init__(self)

        self.visual = _build_vision_tower(
            embed_dim=embed_dim,
            vision_cfg=vision_cfg,
            quick_gelu=quick_gelu,
            dtype=dtype,
        )
        text = _build_text_tower(
            embed_dim=embed_dim, text_cfg=text_cfg, quick_gelu=quick_gelu, dtype=dtype
        )
        self.transformer = text.transformer
        self.vocab_size = text.vocab_size
        self.token_embedding = text.token_embedding
        self.positional_embedding = text.positional_embedding
        self.ln_final = text.ln_final
        self.text_projection = text.text_projection
        self.register_buffer('attn_mask', text.attn_mask, persistent=False)
        self.logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / 0.07))


def convert_weights_to_lp(model: nn.Module, dtype=torch.float16):
    """Convert applicable model parameters to low-precision (bf16 or fp16)"""

    def _convert_weights(l):
        if isinstance(l, (nn.Conv1d, nn.Conv2d, nn.Linear)):
            l.weight.data = l.weight.data.to(dtype)
            if l.bias is not None:
                l.bias.data = l.bias.data.to(dtype)

        if isinstance(l, (nn.MultiheadAttention, Attention)):
            for attr in [
                *[f"{s}_proj_weight" for s in ["in", "q", "k", "v"]],
                "in_proj_bias",
                "bias_k",
                "bias_v",
            ]:
                tensor = getattr(l, attr)
                if tensor is not None:
                    tensor.data = tensor.data.to(dtype)

        for name in ["text_projection", "proj"]:
            if hasattr(l, name):
                attr = getattr(l, name)
                if attr is not None:
                    attr.data = attr.data.to(dtype)

    model.apply(_convert_weights)


convert_weights_to_fp16 = convert_weights_to_lp  # backwards compat


def load_state_dict(checkpoint_path: str, map_location='cpu'):
    checkpoint = torch.load(checkpoint_path, map_location=map_location)
    if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
        state_dict = checkpoint['state_dict']
    else:
        state_dict = checkpoint
    if next(iter(state_dict.items()))[0].startswith('module'):
        state_dict = {k[7:]: v for k, v in state_dict.items()}
    return state_dict


def build_model_from_openai_state_dict(
    state_dict: dict,
    quick_gelu: bool = False,
    dtype: torch.dtype = torch.float16,
):
    vit = "visual.proj" in state_dict

    if vit:
        vision_width = state_dict["visual.conv1.weight"].shape[0]
        vision_layers = len(
            [
                k
                for k in state_dict.keys()
                if k.startswith("visual.") and k.endswith(".attn.in_proj_weight")
            ]
        )
        vision_patch_size = state_dict["visual.conv1.weight"].shape[-1]
        grid_size = round(
            (state_dict["visual.positional_embedding"].shape[0] - 1) ** 0.5
        )
        image_size = vision_patch_size * grid_size
    else:
        counts: list = [
            len(
                set(
                    k.split(".")[2]
                    for k in state_dict
                    if k.startswith(f"visual.layer{b}")
                )
            )
            for b in [1, 2, 3, 4]
        ]
        vision_layers = tuple(counts)
        vision_width = state_dict["visual.layer1.0.conv1.weight"].shape[0]
        output_width = round(
            (state_dict["visual.attnpool.positional_embedding"].shape[0] - 1) ** 0.5
        )
        vision_patch_size = None
        assert (
            output_width**2 + 1
            == state_dict["visual.attnpool.positional_embedding"].shape[0]
        )
        image_size = output_width * 32

    embed_dim = state_dict["text_projection"].shape[1]
    context_length = state_dict["positional_embedding"].shape[0]
    vocab_size = state_dict["token_embedding.weight"].shape[0]
    transformer_width = state_dict["ln_final.weight"].shape[0]
    transformer_heads = transformer_width // 64
    transformer_layers = len(
        set(
            k.split(".")[2]
            for k in state_dict
            if k.startswith(f"transformer.resblocks")
        )
    )

    vision_cfg = CLIPVisionCfg(
        layers=vision_layers,
        width=vision_width,
        patch_size=vision_patch_size,
        image_size=image_size,
    )
    text_cfg = CLIPTextCfg(
        context_length=context_length,
        vocab_size=vocab_size,
        width=transformer_width,
        heads=transformer_heads,
        layers=transformer_layers,
    )
    model = CLIP(
        embed_dim=embed_dim,
        vision_cfg=vision_cfg,
        text_cfg=text_cfg,
        quick_gelu=quick_gelu,  # OpenAI models were trained with QuickGELU
        dtype=dtype,
    )

    for key in ["input_resolution", "context_length", "vocab_size"]:
        state_dict.pop(key, None)

    convert_weights_to_fp16(model)
    model.load_state_dict(state_dict)

    return model.eval()


def load_openai_model(
    model_path: str,
    device: Union[str, torch.device] = 'cuda' if torch.cuda.is_available() else 'cpu',
    dtype: Optional[Union[str, torch.dtype]] = None,
    jit: bool = True,
):
    """Load a CLIP model

    Parameters
    ----------
    model_path : str
        The path to a model checkpoint containing the state_dict
    dtype: str
        Model precision, if None defaults to 'fp32' if device == 'cpu' else 'fp16'.
    device : Union[str, torch.device]
        The device to put the loaded model
    jit : bool
        Whether to load the optimized JIT model (default) or more hackable non-JIT model.
    Returns
    -------
    model : torch.nn.Module
        The CLIP model
    preprocess : Callable[[PIL.Image], torch.Tensor]
        A torchvision transform that converts a PIL image into a tensor that the returned model can take as its input
    """
    if isinstance(dtype, str):
        dtype = __cast_dtype__.get(dtype, 'amp')
    elif dtype is None:
        dtype = (
            torch.float32 if device in ('cpu', torch.device('cpu')) else torch.float16
        )
    try:
        # loading JIT archive
        model = torch.jit.load(model_path, map_location=device if jit else "cpu").eval()
        state_dict = None
    except RuntimeError:
        # loading saved state dict
        if jit:
            warnings.warn(
                f"File {model_path} is not a JIT archive. Loading as a state dict instead"
            )
            jit = False
        state_dict = torch.load(model_path, map_location="cpu")

    if not jit:
        # Build a non-jit model from the OpenAI jitted model state dict
        try:
            model = build_model_from_openai_state_dict(
                state_dict or model.state_dict(), dtype=dtype
            )
        except KeyError:
            sd = {k[7:]: v for k, v in state_dict["state_dict"].items()}
            model = build_model_from_openai_state_dict(sd, dtype=dtype)

        # model from OpenAI state dict is in manually cast fp16 mode, must be converted for AMP/fp32/bf16 use
        model = model.to(device)
        if dtype == torch.float32 or (
            isinstance(dtype, str) and dtype.startswith('amp')
        ):
            model.float()
        elif dtype == torch.bfloat16:
            convert_weights_to_lp(model, dtype=torch.bfloat16)

        return model

    # patch the device names
    device_holder = torch.jit.trace(
        lambda: torch.ones([]).to(torch.device(device)), example_inputs=[]
    )
    device_node = [
        n
        for n in device_holder.graph.findAllNodes("prim::Constant")
        if "Device" in repr(n)
    ][-1]

    def patch_device(module):
        try:
            graphs = [module.graph] if hasattr(module, "graph") else []
        except RuntimeError:
            graphs = []

        if hasattr(module, "forward1"):
            graphs.append(module.forward1.graph)

        for graph in graphs:
            for node in graph.findAllNodes("prim::Constant"):
                if "value" in node.attributeNames() and str(node["value"]).startswith(
                    "cuda"
                ):
                    node.copyAttributes(device_node)

    model.apply(patch_device)
    patch_device(model.encode_image)
    patch_device(model.encode_text)

    # patch dtype to float32 (typically for CPU)
    if dtype == torch.float32:
        float_holder = torch.jit.trace(
            lambda: torch.ones([]).float(), example_inputs=[]
        )
        float_input = list(float_holder.graph.findNode("aten::to").inputs())[1]
        float_node = float_input.node()

        def patch_float(module):
            try:
                graphs = [module.graph] if hasattr(module, "graph") else []
            except RuntimeError:
                graphs = []

            if hasattr(module, "forward1"):
                graphs.append(module.forward1.graph)

            for graph in graphs:
                for node in graph.findAllNodes("aten::to"):
                    inputs = list(node.inputs())
                    for i in [
                        1,
                        2,
                    ]:  # dtype can be the second or third argument to aten::to()
                        if inputs[i].node()["value"] == 5:
                            inputs[i].node().copyAttributes(float_node)

        model.apply(patch_float)
        patch_float(model.encode_image)
        patch_float(model.encode_text)
        model.float()

    # ensure image_size attr available at consistent location for both jit and non-jit
    model.visual.image_size = model.input_resolution.item()
    return model


def load_openclip_model(
    model_name: str,
    model_path: str,
    device: Union[str, torch.device] = 'cpu',
    jit: bool = False,
    force_quick_gelu: bool = False,
    force_custom_text: bool = False,
    pretrained_image: bool = False,
    dtype: Optional[Union[str, torch.dtype]] = None,
):
    if isinstance(dtype, str):
        dtype = __cast_dtype__.get(dtype)
    elif dtype is None:
        dtype = (
            torch.float32 if device in ('cpu', torch.device('cpu')) else torch.float16
        )

    model_name = model_name.replace(
        '/', '-'
    )  # for callers using old naming with / in ViT names

    if model_name in _MODEL_CONFIGS:
        model_cfg = deepcopy(_MODEL_CONFIGS[model_name])
    else:
        raise RuntimeError(f'Model config for {model_name} not found.')

    if force_quick_gelu:
        # override for use of QuickGELU on non-OpenAI transformer models
        model_cfg["quick_gelu"] = True

    if pretrained_image:
        if 'timm_model_name' in model_cfg.get('vision_cfg', {}):
            # pretrained weight loading for timm models set via vision_cfg
            model_cfg['vision_cfg']['timm_model_pretrained'] = True
        else:
            assert (
                False
            ), 'pretrained image towers currently only supported for timm models'

    custom_text = (
        model_cfg.pop('custom_text', False)
        or force_custom_text
        or ('hf_model_name' in model_cfg['text_cfg'])
    )

    if custom_text:
        model = CustomTextCLIP(**model_cfg, dtype=dtype)
    else:
        model = CLIP(**model_cfg, dtype=dtype)

    model.eval()
    model.load_state_dict(load_state_dict(model_path))
    model.to(device=device)

    if dtype in (torch.float16, torch.bfloat16):
        convert_weights_to_lp(model, dtype=dtype)

    if jit:
        model = torch.jit.script(model)

    return model


================================================
FILE: server/clip_server/model/openclip_model.py
================================================
# Originally from https://github.com/mlfoundations/open_clip.
#
# Copyright (c) 2012-2021 Gabriel Ilharco, Mitchell Wortsman,
# Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar,
# John Miller, Hongseok Namkoong, Hannaneh Hajishirzi, Ali Farhadi,
# Ludwig Schmidt

from clip_server.model.clip_model import CLIPModel
from clip_server.model.pretrained_models import get_model_url_md5, download_model
from clip_server.model.model import load_openai_model, load_openclip_model

import torch


class OpenCLIPModel(CLIPModel):
    def __init__(
        self,
        name: str,
        device: str = 'cpu',
        jit: bool = False,
        dtype: str = None,
        **kwargs
    ):
        super().__init__(name, **kwargs)

        if '::' in name:
            model_name, pretrained = name.split('::')
        else:
            model_name = name
            pretrained = 'openai'

        self._model_name = model_name

        model_url, md5sum = get_model_url_md5(name)
        model_path = download_model(model_url, md5sum=md5sum)

        if pretrained == 'openai':
            self._model = load_openai_model(
                model_path=model_path, device=device, jit=jit, dtype=dtype
            )
        else:
            self._model = load_openclip_model(
                model_name=self._model_name,
                model_path=model_path,
                device=device,
                jit=jit,
                dtype=dtype,
            )

    @staticmethod
    def get_model_name(name: str):
        if '::' in name:
            model_name, pretrained = name.split('::')
        else:
            model_name = name
        if model_name == 'ViT-L/14@336px':
            return 'ViT-L-14-336'
        return model_name.replace('/', '-')

    def encode_text(self, input_ids: 'torch.Tensor', **kwargs):
        return self._model.encode_text(input_ids)

    def encode_image(self, pixel_values: 'torch.Tensor', **kwargs):
        return self._model.encode_image(pixel_values)


================================================
FILE: server/clip_server/model/pretrained_models.py
================================================
import os
import hashlib
import shutil
import urllib


_OPENCLIP_S3_BUCKET = 'https://clip-as-service.s3.us-east-2.amazonaws.com/models/torch'
_OPENCLIP_MODELS = {
    'RN50::openai': ('RN50.pt', '9140964eaaf9f68c95aa8df6ca13777c'),
    'RN50::yfcc15m': ('RN50-yfcc15m.pt', 'e9c564f91ae7dc754d9043fdcd2a9f22'),
    'RN50::cc12m': ('RN50-cc12m.pt', '37cb01eb52bb6efe7666b1ff2d7311b5'),
    'RN101::openai': ('RN101.pt', 'fa9d5f64ebf152bc56a18db245071014'),
    'RN101::yfcc15m': ('RN101-yfcc15m.pt', '48f7448879ce25e355804f6bb7928cb8'),
    'RN50x4::openai': ('RN50x4.pt', '03830990bc768e82f7fb684cde7e5654'),
    'RN50x16::openai': ('RN50x16.pt', '83d63878a818c65d0fb417e5fab1e8fe'),
    'RN50x64::openai': ('RN50x64.pt', 'a6631a0de003c4075d286140fc6dd637'),
    'ViT-B-32::openai': ('ViT-B-32.pt', '3ba34e387b24dfe590eeb1ae6a8a122b'),
    'ViT-B-32::laion2b_e16': (
        'ViT-B-32-laion2b_e16.pt',
        'df08de3d9f2dc53c71ea26e184633902',
    ),
    'ViT-B-32::laion400m_e31': (
        'ViT-B-32-laion400m_e31.pt',
        'ca8015f98ab0f8780510710681d7b73e',
    ),
    'ViT-B-32::laion400m_e32': (
        'ViT-B-32-laion400m_e32.pt',
        '359e0dba4a419f175599ee0c63a110d8',
    ),
    'ViT-B-32::laion2b-s34b-b79k': (
        'ViT-B-32-laion2b-s34b-b79k.bin',
        '2fc036aea9cd7306f5ce7ce6abb8d0bf',
    ),
    'ViT-B-16::openai': ('ViT-B-16.pt', '44c3d804ecac03d9545ac1a3adbca3a6'),
    'ViT-B-16::laion400m_e31': (
        'ViT-B-16-laion400m_e31.pt',
        '31306a44224cc46fec1bc3b82fd0c4e6',
    ),
    'ViT-B-16::laion400m_e32': (
        'ViT-B-16-laion400m_e32.pt',
        '07283adc5c17899f2ed22d82b563c54b',
    ),
    'ViT-B-16-plus-240::laion400m_e31': (
        'ViT-B-16-plus-240-laion400m_e31.pt',
        'c88f453644a998ecb094d878a2f0738d',
    ),
    'ViT-B-16-plus-240::laion400m_e32': (
        'ViT-B-16-plus-240-laion400m_e32.pt',
        'e573af3cef888441241e35022f30cc95',
    ),
    'ViT-L-14::openai': ('ViT-L-14.pt', '096db1af569b284eb76b3881534822d9'),
    'ViT-L-14::laion400m_e31': (
        'ViT-L-14-laion400m_e31.pt',
        '09d223a6d41d2c5c201a9da618d833aa',
    ),
    'ViT-L-14::laion400m_e32': (
        'ViT-L-14-laion400m_e32.pt',
        'a76cde1bc744ca38c6036b920c847a89',
    ),
    'ViT-L-14::laion2b-s32b-b82k': (
        'ViT-L-14-laion2b-s32b-b82k.bin',
        '4d2275fc7b2d7ee9db174f9b57ddecbd',
    ),
    'ViT-L-14-336::openai': ('ViT-L-14-336px.pt', 'b311058cae50cb10fbfa2a44231c9473'),
    'ViT-H-14::laion2b-s32b-b79k': (
        'ViT-H-14-laion2b-s32b-b79k.bin',
        '2aa6c46521b165a0daeb8cdc6668c7d3',
    ),
    'ViT-g-14::laion2b-s12b-b42k': (
        'ViT-g-14-laion2b-s12b-b42k.bin',
        '3bf99353f6f1829faac0bb155be4382a',
    ),
    'roberta-ViT-B-32::laion2b-s12b-b32k': (
        'roberta-ViT-B-32-laion2b-s12b-b32k.bin',
        '76d4c9d13774cc15fa0e2b1b94a8402c',
    ),
    'xlm-roberta-base-ViT-B-32::laion5b-s13b-b90k': (
        'xlm-roberta-base-ViT-B-32-laion5b-s13b-b90k.bin',
        'f68abc07ef349720f1f880180803142d',
    ),
    'xlm-roberta-large-ViT-H-14::frozen_laion5b_s13b_b90k': (
        'xlm-roberta-large-ViT-H-14-frozen_laion5b_s13b_b90k.bin',
        'b49991239a419d704fdba59c42d5536d',
    ),
    # older version name format
    'RN50': ('RN50.pt', '9140964eaaf9f68c95aa8df6ca13777c'),
    'RN101': ('RN101.pt', 'fa9d5f64ebf152bc56a18db245071014'),
    'RN50x4': ('RN50x4.pt', '03830990bc768e82f7fb684cde7e5654'),
    'RN50x16': ('RN50x16.pt', '83d63878a818c65d0fb417e5fab1e8fe'),
    'RN50x64': ('RN50x64.pt', 'a6631a0de003c4075d286140fc6dd637'),
    'ViT-B/32': ('ViT-B-32.pt', '3ba34e387b24dfe590eeb1ae6a8a122b'),
    'ViT-B/16': ('ViT-B-16.pt', '44c3d804ecac03d9545ac1a3adbca3a6'),
    'ViT-L/14': ('ViT-L-14.pt', '096db1af569b284eb76b3881534822d9'),
    'ViT-L/14@336px': ('ViT-L-14-336px.pt', 'b311058cae50cb10fbfa2a44231c9473'),
}

_MULTILINGUALCLIP_MODELS = {
    'M-CLIP/XLM-Roberta-Large-Vit-B-32': (),
    'M-CLIP/XLM-Roberta-Large-Vit-L-14': (),
    'M-CLIP/XLM-Roberta-Large-Vit-B-16Plus': (),
    'M-CLIP/LABSE-Vit-L-14': (),
}

_CNCLIP_MODELS = {
    'CN-CLIP/ViT-B-16': (),
    'CN-CLIP/ViT-L-14': (),
    'CN-CLIP/ViT-L-14-336': (),
    'CN-CLIP/ViT-H-14': (),
    'CN-CLIP/RN50': (),
}

_VISUAL_MODEL_IMAGE_SIZE = {
    'RN50': 224,
    'RN101': 224,
    'RN50x4': 288,
    'RN50x16': 384,
    'RN50x64': 448,
    'ViT-B-32': 224,
    'roberta-ViT-B-32': 224,
    'xlm-roberta-base-ViT-B-32': 224,
    'ViT-B-16': 224,
    'Vit-B-16Plus': 240,
    'ViT-B-16-plus-240': 240,
    'ViT-L-14': 224,
    'ViT-L-14-336': 336,
    'ViT-H-14': 224,
    'xlm-roberta-large-ViT-H-14': 224,
    'ViT-g-14': 224,
}


def md5file(filename: str):
    hash_md5 = hashlib.md5()
    with open(filename, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)

    return hash_md5.hexdigest()


def get_model_url_md5(name: str):
    model_pretrained = _OPENCLIP_MODELS[name]
    if len(model_pretrained) == 0:  # not on s3
        return None, None
    else:
        return (_OPENCLIP_S3_BUCKET + '/' + model_pretrained[0], model_pretrained[1])


def download_model(
    url: str,
    target_folder: str = os.path.expanduser("~/.cache/clip"),
    md5sum: str = None,
    with_resume: bool = True,
    max_attempts: int = 3,
) -> str:
    os.makedirs(target_folder, exist_ok=True)
    filename = os.path.basename(url)

    download_target = os.path.join(target_folder, filename)

    if os.path.exists(download_target):
        if not os.path.isfile(download_target):
            raise FileExistsError(f'{download_target} exists and is not a regular file')

        actual_md5sum = md5file(download_target)
        if (not md5sum) or actual_md5sum == md5sum:
            return download_target

    from rich.progress import (
        DownloadColumn,
        Progress,
        TextColumn,
        TimeRemainingColumn,
        TransferSpeedColumn,
    )

    progress = Progress(
        " \n",  # divide this bar from Flow's bar
        TextColumn("[bold blue]{task.fields[filename]}", justify="right"),
        "[progress.percentage]{task.percentage:>3.1f}%",
        "•",
        DownloadColumn(),
        "•",
        TransferSpeedColumn(),
        "•",
        TimeRemainingColumn(),
    )

    with progress:
        task = progress.add_task('download', filename=filename, start=False)

        for _ in range(max_attempts):
            tmp_file_path = download_target + '.part'
            resume_byte_pos = (
                os.path.getsize(tmp_file_path) if os.path.exists(tmp_file_path) else 0
            )

            try:
                # resolve the 403 error by passing a valid user-agent
                req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
                total_bytes = int(
                    urllib.request.urlopen(req).info().get('Content-Length', -1)
                )
                mode = 'ab' if (with_resume and resume_byte_pos) else 'wb'

                with open(tmp_file_path, mode) as output:
                    progress.update(task, total=total_bytes)
                    progress.start_task(task)

                    if resume_byte_pos and with_resume:
                        progress.update(task, advance=resume_byte_pos)
                        req.headers['Range'] = f'bytes={resume_byte_pos}-'

                    with urllib.request.urlopen(req) as source:
                        while True:
                            buffer = source.read(8192)
                            if not buffer:
                                break

                            output.write(buffer)
                            progress.update(task, advance=len(buffer))

                actual_md5 = md5file(tmp_file_path)
                if (md5sum and actual_md5 == md5sum) or (not md5sum):
                    shutil.move(tmp_file_path, download_target)
                    return download_target
                else:
                    os.remove(tmp_file_path)
                    raise RuntimeError(
                        f'MD5 mismatch: expected {md5sum}, got {actual_md5}'
                    )

            except Exception as ex:
                progress.console.print(
                    f'Failed to download {url} with {ex!r} at the {_}th attempt'
                )
                progress.reset(task)

        raise RuntimeError(
            f'Failed to download {url} within retry limit {max_attempts}'
        )


================================================
FILE: server/clip_server/model/simple_tokenizer.py
================================================
# Originally from https://github.com/openai/CLIP. MIT License, Copyright (c) 2021 OpenAI

import gzip
import html
import os
import regex as re
from functools import lru_cache

import ftfy

from clip_server.helper import __resources_path__


@lru_cache()
def default_bpe():
    return os.path.join(__resources_path__, 'bpe_simple_vocab_16e6.txt.gz')


@lru_cache()
def bytes_to_unicode():
    """
    Returns list of utf-8 byte and a corresponding list of unicode strings.
    The reversible bpe codes work on unicode strings.
    This means you need a large # of unicode characters in your vocab if you want to avoid UNKs.
    When you're at something like a 10B token dataset you end up needing around 5K for decent coverage.
    This is a signficant percentage of your normal, say, 32K bpe vocab.
    To avoid that, we want lookup tables between utf-8 bytes and unicode strings.
    And avoids mapping to whitespace/control characters the bpe code barfs on.
    """
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(2**8):
        if b not in bs:
            bs.append(b)
            cs.append(2**8 + n)
            n += 1
    cs = [chr(n) for n in cs]
    return dict(zip(bs, cs))


def get_pairs(word):
    """Return set of symbol pairs in a word.
    Word is represented as tuple of symbols (symbols being variable-length strings).
    """
    pairs = set()
    prev_char = word[0]
    for char in word[1:]:
        pairs.add((prev_char, char))
        prev_char = char
    return pairs


def basic_clean(text):
    text = ftfy.fix_text(text)
    text = html.unescape(html.unescape(text))
    return text.strip()


def whitespace_clean(text):
    text = re.sub(r'\s+', ' ', text)
    text = text.strip()
    return text


class SimpleTokenizer(object):
    def __init__(self, bpe_path: str = default_bpe()):
        self.byte_encoder = bytes_to_unicode()
        self.byte_decoder = {v: k for k, v in self.byte_encoder.items()}
        merges = gzip.open(bpe_path).read().decode("utf-8").split('\n')
        merges = merges[1 : 49152 - 256 - 2 + 1]
        merges = [tuple(merge.split()) for merge in merges]
        vocab = list(bytes_to_unicode().values())
        vocab = vocab + [v + '</w>' for v in vocab]
        for merge in merges:
            vocab.append(''.join(merge))
        vocab.extend(['<|startoftext|>', '<|endoftext|>'])
        self.encoder = dict(zip(vocab, range(len(vocab))))
        self.decoder = {v: k for k, v in self.encoder.items()}
        self.bpe_ranks = dict(zip(merges, range(len(merges))))
        self.cache = {
            '<|startoftext|>': '<|startoftext|>',
            '<|endoftext|>': '<|endoftext|>',
        }
        self.pat = re.compile(
            r"""<\|startoftext\|>|<\|endoftext\|>|'s|'t|'re|'ve|'m|'ll|'d|[\p{L}]+|[\p{N}]|[^\s\p{L}\p{N}]+""",
            re.IGNORECASE,
        )

    def bpe(self, token):
        if token in self.cache:
            return self.cache[token]
        word = tuple(token[:-1]) + (token[-1] + '</w>',)
        pairs = get_pairs(word)

        if not pairs:
            return token + '</w>'

        while True:
            bigram = min(pairs, key=lambda pair: self.bpe_ranks.get(pair, float('inf')))
            if bigram not in self.bpe_ranks:
                break
            first, second = bigram
            new_word = []
            i = 0
            while i < len(word):
                try:
                    j = word.index(first, i)
                    new_word.extend(word[i:j])
                    i = j
                except:
                    new_word.extend(word[i:])
                    break

                if word[i] == first and i < len(word) - 1 and word[i + 1] == second:
                    new_word.append(first + second)
                    i += 2
                else:
                    new_word.append(word[i])
                    i += 1
            new_word = tuple(new_word)
            word = new_word
            if len(word) == 1:
                break
            else:
                pairs = get_pairs(word)
        word = ' '.join(word)
        self.cache[token] = word
        return word

    def encode(self, text):
        bpe_tokens = []
        text = whitespace_clean(basic_clean(text)).lower()
        for token in re.findall(self.pat, text):
            token = ''.join(self.byte_encoder[b] for b in token.encode('utf-8'))
            bpe_tokens.extend(
                self.encoder[bpe_token] for bpe_token in self.bpe(token).split(' ')
            )
        return bpe_tokens

    def decode(self, tokens):
        text = ''.join([self.decoder[token] for token in tokens])
        text = (
            bytearray([self.byte_decoder[c] for c in text])
            .decode('utf-8', errors='replace')
            .replace('</w>', ' ')
        )
        return text


================================================
FILE: server/clip_server/model/tokenization.py
================================================
import torch
from typing import List, Union
from clip_server.model.pretrained_models import (
    _MULTILINGUALCLIP_MODELS,
    _CNCLIP_MODELS,
)


class Tokenizer:
    def __init__(self, name: str, **kwargs):
        self._name = name
        if name in _MULTILINGUALCLIP_MODELS:
            import transformers

            self._tokenizer = transformers.AutoTokenizer.from_pretrained(name)
        elif name in _CNCLIP_MODELS:
            import cn_clip.clip as cnclip

            self._tokenizer = cnclip
        else:
            from clip_server.model.simple_tokenizer import SimpleTokenizer

            self._tokenizer = SimpleTokenizer()

    def __call__(
        self,
        texts: Union[str, List[str]],
        context_length: int = 77,
        truncate: bool = True,
    ):
        """
        :param texts: An input string or a list of input strings to tokenize
        :param context_length: The context length to use; all English CLIP models use 77 as the context length.
            for Chinese CLIP models, context_length = 52, if the number of characters is bigger than 50, sentence will be truncate and omit the part left
        :param truncate: Whether to truncate the text in case its encoding is longer than the context length.

        :return: A dict of tokenized representations of the input strings and their corresponding attention masks with both
            shape = [batch size, context_length]
        """
        if self._name in _CNCLIP_MODELS:
            return self._tokenize(texts, context_length=52)
        else:
            return self._tokenize(
                texts, context_length=context_length, truncate=truncate
            )

    def _tokenize(
        self,
        texts: Union[str, List[str]],
        context_length: int = 77,
        truncate: bool = True,
    ) -> dict:
        if isinstance(texts, str):
            texts = [texts]
        if self._name in _MULTILINGUALCLIP_MODELS:
            result = self._tokenizer(
                texts,
                max_length=context_length,
                return_attention_mask=True,
                return_tensors='pt',
                padding=True,
                truncation=True,
            )
            return {
                'input_ids': result['input_ids'],
                'attention_mask': result['attention_mask'],
            }
        elif self._name in _CNCLIP_MODELS:
            result = self._tokenizer.tokenize(
                texts=texts,
                context_length=52,  # in all cnclip baseline model context length is 52
            )
            attn_mask = result.clone()
            attn_mask[result != 0] = 1
            return {
                "input_ids": result,
                "attention_mask": attn_mask,
            }
        else:
            sot_token = self._tokenizer.encoder['<|startoftext|>']
            eot_token = self._tokenizer.encoder['<|endoftext|>']
            all_tokens = [
                [sot_token] + self._tokenizer.encode(text) + [eot_token]
                for text in texts
            ]

            input_ids = torch.zeros(len(all_tokens), context_length, dtype=torch.long)
            attention_mask = torch.zeros(
                len(all_tokens), context_length, dtype=torch.long
            )

            for i, tokens in enumerate(all_tokens):
                if len(tokens) > context_length:
                    if truncate:
                        tokens = tokens[:context_length]
                        tokens[-1] = eot_token
                    else:
                        raise RuntimeError(
                            f'Input {texts[i]} is too long for context length {context_length}'
                        )
                input_ids[i, : len(tokens)] = torch.tensor(tokens)
                attention_mask[i, : len(tokens)] = 1

            return {'input_ids': input_ids, 'attention_mask': attention_mask}


================================================
FILE: server/clip_server/model/trt_utils.py
================================================
# Originally from https://github.com/ELS-RD/transformer-deploy.
# Apache License, Version 2.0, Copyright (c) 2022 Lefebvre Dalloz Services

from typing import Callable, Dict, List, OrderedDict, Tuple

import tensorrt as trt
import torch
from tensorrt import ICudaEngine, IExecutionContext
from tensorrt.tensorrt import (
    Builder,
    IBuilderConfig,
    IElementWiseLayer,
    ILayer,
    INetworkDefinition,
    IOptimizationProfile,
    IReduceLayer,
    Logger,
    OnnxParser,
    Runtime,
)


"""
All the tooling to ease TensorRT usage.
"""


def fix_fp16_network(network_definition: INetworkDefinition) -> INetworkDefinition:
    """
    Mixed precision on TensorRT can generate scores very far from Pytorch because of some operator being saturated.
    Indeed, FP16 can't store very large and very small numbers like FP32.
    Here, we search for some patterns of operators to keep in FP32, in most cases, it is enough to fix the inference
    and don't hurt performances.
    :param network_definition: graph generated by TensorRT after parsing ONNX file (during the model building)
    :return: patched network definition
    """
    # search for patterns which may overflow in FP16 precision, we force FP32 precisions for those nodes
    for layer_index in range(network_definition.num_layers - 1):
        layer: ILayer = network_definition.get_layer(layer_index)
        next_layer: ILayer = network_definition.get_layer(layer_index + 1)
        # POW operation usually followed by mean reduce
        if (
            layer.type == trt.LayerType.ELEMENTWISE
            and next_layer.type == trt.LayerType.REDUCE
        ):
            # casting to get access to op attribute
            layer.__class__ = IElementWiseLayer
            next_layer.__class__ = IReduceLayer
            if layer.op == trt.ElementWiseOperation.POW:
                layer.precision = trt.DataType.FLOAT
                next_layer.precision = trt.DataType.FLOAT
            layer.set_output_type(index=0, dtype=trt.DataType.FLOAT)
            next_layer.set_output_type(index=0, dtype=trt.DataType.FLOAT)
    return network_definition


def build_engine(
    runtime: Runtime,
    onnx_file_path: str,
    logger: Logger,
    min_shape: Tuple[int, int],
    optimal_shape: Tuple[int, int],
    max_shape: Tuple[int, int],
    workspace_size: int,
    fp16: bool,
    int8: bool,
) -> ICudaEngine:
    """
    Convert ONNX file to TensorRT engine.
    It supports dynamic shape, however it's advised to keep sequence length fix as it hurts performance otherwise.
    Dynamic batch size don't hurt performance and is highly advised.
    :param runtime: global variable shared accross inference call / model building
    :param onnx_file_path: path to the ONNX file
    :param logger: specific logger to TensorRT
    :param min_shape: the minimal shape of input tensors. It's advised to set first dimension (batch size) to 1
    :param optimal_shape: input tensor shape used for optimizations
    :param max_shape: maximal input tensor shape
    :param workspace_size: GPU memory to use during the building, more is always better. If there is not enough memory,
    some optimization may fail, and the whole conversion process will crash.
    :param fp16: enable FP16 precision, it usually provide a 20-30% boost compared to ONNX Runtime.
    :param int8: enable INT-8 quantization, best performance but model should have been quantized.
    :return: TensorRT engine to use during inference
    """
    with trt.Builder(logger) as builder:  # type: Builder
        with builder.create_network(
            flags=1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
        ) as network_definition:  # type: INetworkDefinition
            with trt.OnnxParser(
                network_definition, logger
            ) as parser:  # type: OnnxParser
                builder.max_batch_size = max_shape[0]  # max batch size
                config: IBuilderConfig = builder.create_builder_config()
                config.max_workspace_size = workspace_size
                # to enable complete trt inspector debugging, only for TensorRT >= 8.2
                config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED
                # disable CUDNN optimizations
                config.set_tactic_sources(
                    tactic_sources=1 << int(trt.TacticSource.CUBLAS)
                    | 1 << int(trt.TacticSource.CUBLAS_LT)
                )
                if int8:
                    config.set_flag(trt.BuilderFlag.INT8)
                if fp16:
                    config.set_flag(trt.BuilderFlag.FP16)
                config.set_flag(trt.BuilderFlag.DISABLE_TIMING_CACHE)
                # https://github.com/NVIDIA/TensorRT/issues/1196 (sometimes big diff in output when using FP16)
                config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
                with open(onnx_file_path, "rb") as f:
                    parser.parse(f.read())
                profile: IOptimizationProfile = builder.create_optimization_profile()
                for num_input in range(network_definition.num_inputs):
                    profile.set_shape(
                        input=network_definition.get_input(num_input).name,
                        min=min_shape,
                        opt=optimal_shape,
                        max=max_shape,
                    )
                config.add_optimization_profile(profile)
                if fp16:
                    network_definition = fix_fp16_network(network_definition)
                trt_engine = builder.build_serialized_network(
                    network_definition, config
                )
                engine: ICudaEngine = runtime.deserialize_cuda_engine(trt_engine)
                assert (
                    engine is not None
                ), "error during engine generation, check error messages above :-("
                return engine


def get_output_tensors(
    context: trt.IExecutionContext,
    host_inputs: List[torch.Tensor],
    input_binding_idxs: List[int],
    output_binding_idxs: List[int],
) -> List[torch.Tensor]:
    """
    Reserve memory in GPU for input and output tensors.
    :param context: TensorRT context shared accross inference steps
    :param host_inputs: input tensor
    :param input_binding_idxs: indexes of each input vector (should be the same than during building)
    :param output_binding_idxs: indexes of each output vector (should be the same than during building)
    :return: tensors where output will be stored
    """
    # explicitly set dynamic input shapes, so dynamic output shapes can be computed internally
    for host_input, binding_index in zip(host_inputs, input_binding_idxs):
        context.set_binding_shape(binding_index, tuple(host_input.shape))
    assert context.all_binding_shapes_specified
    device_outputs: List[torch.Tensor] = []
    for binding_index in output_binding_idxs:
        # TensorRT computes output shape based on input shape provided above
        output_shape = context.get_binding_shape(binding_index)
        # allocate buffers to hold output results
        output = torch.empty(tuple(output_shape), device="cuda")
        device_outputs.append(output)
    return device_outputs


def infer_tensorrt(
    context: IExecutionContext,
    host_inputs: OrderedDict[str, torch.Tensor],
    input_binding_idxs: List[int],
    output_binding_idxs: List[int],
) -> List[torch.Tensor]:
    """
    Perform inference with TensorRT.
    :param context: shared variable
    :param host_inputs: input tensor
    :param input_binding_idxs: input tensor indexes
    :param output_binding_idxs: output tensor indexes
    :return: output tensor
    """
    input_tensors: List[torch.Tensor] = list()
    for tensor in host_inputs.values():
        assert isinstance(
            tensor, torch.Tensor
        ), f"unexpected tensor type: {tensor.dtype}"

        if tensor.dtype == torch.int64:
            # warning: small changes in output if int64 is used instead of int32
            tensor = tensor.type(torch.int32)
            # tensor = tensor.to("cuda")
        input_tensors.append(tensor)
    # calculate input shape, bind it, allocate GPU memory for the output
    output_tensors: List[torch.Tensor] = get_output_tensors(
        context, input_tensors, input_binding_idxs, output_binding_idxs
    )
    bindings = [int(i.data_ptr()) for i in input_tensors + output_tensors]
    assert context.execute_async_v2(
        bindings, torch.cuda.current_stream().cuda_stream
    ), "failure during execution of inference"
    torch.cuda.current_stream().synchronize()  # sync all CUDA ops
    return output_tensors


def load_engine(
    runtime: Runtime, engine_file_path: str, profile_index: int = 0
) -> Callable[[Dict[str, torch.Tensor]], torch.Tensor]:
    """
    Load serialized TensorRT engine.
    :param runtime: shared variable
    :param engine_file_path: path to the serialized engine
    :param profile_index: which profile to load, 0 if you have not used multiple profiles
    :return: A function to perform inference
    """
    with open(file=engine_file_path, mode="rb") as f:
        engine: ICudaEngine = runtime.deserialize_cuda_engine(f.read())
        stream: int = torch.cuda.current_stream().cuda_stream
        context: IExecutionContext = engine.create_execution_context()
        context.set_optimization_profile_async(
            profile_index=profile_index, stream_handle=stream
        )
        # retrieve input/output IDs
        input_binding_idxs, output_binding_idxs = get_binding_idxs(
            engine, profile_index
        )  # type: List[int], List[int]

        def tensorrt_model(inputs: Dict[str, torch.Tensor]) -> torch.Tensor:
            return infer_tensorrt(
                context=context,
                host_inputs=inputs,
                input_binding_idxs=input_binding_idxs,
                output_binding_idxs=output_binding_idxs,
            )

        return tensorrt_model


def save_engine(engine: ICudaEngine, engine_file_path: str) -> None:
    """
    Serialize TensorRT engine to file.
    :param engine: TensorRT engine
    :param engine_file_path: output path
    """
    with open(engine_file_path, "wb") as f:
        f.write(engine.serialize())


def get_binding_idxs(engine: trt.ICudaEngine, profile_index: int):
    """
    Calculate start/end binding indices for current context's profile
    https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#opt_profiles_bindings
    :param engine: TensorRT engine generated during the model building
    :param profile_index: profile to use (several profiles can be set during building)
    :return: input and output tensor indexes
    """
    num_bindings_per_profile = engine.num_bindings // engine.num_optimization_profiles
    start_binding = profile_index * num_bindings_per_profile
    end_binding = (
        start_binding + num_bindings_per_profile
    )  # Separate input and output binding indices for convenience
    input_binding_idxs: List[int] = []
    output_binding_idxs: List[int] = []
    for binding_index in range(start_binding, end_binding):
        if engine.binding_is_input(binding_index):
            input_binding_idxs.append(binding_index)
        else:
            output_binding_idxs.append(binding_index)
    return input_binding_idxs, output_binding_idxs


================================================
FILE: server/clip_server/onnx-flow.yml
================================================
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
    timeout_ready: 3000000
    replicas: 1

================================================
FILE: server/clip_server/tensorrt-flow.yml
================================================
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_r
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_tensorrt
    timeout_ready: 3000000
    replicas: 1

================================================
FILE: server/clip_server/torch-flow.yml
================================================
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
    timeout_ready: 3000000
    replicas: 1

================================================
FILE: server/setup.py
================================================
import sys
from os import path

from setuptools import find_packages, setup

if sys.version_info < (3, 7, 0):
    raise OSError(f'CLIP-as-service requires Python >=3.7, but yours is {sys.version}')

try:
    pkg_name = 'clip-server'
    libinfo_py = path.join(
        path.dirname(__file__), pkg_name.replace('-', '_'), '__init__.py'
    )
    libinfo_content = open(libinfo_py, 'r', encoding='utf8').readlines()
    version_line = [l.strip() for l in libinfo_content if l.startswith('__version__')][
        0
    ]
    exec(version_line)  # gives __version__
except FileNotFoundError:
    __version__ = '0.0.0'

try:
    with open('../README.md', encoding='utf8') as fp:
        _long_description = fp.read()
except FileNotFoundError:
    _long_description = ''

setup(
    name=pkg_name,
    packages=find_packages(),
    version=__version__,
    include_package_data=True,
    description='Embed images and sentences into fixed-length vectors via CLIP',
    author='Jina AI',
    author_email='hello@jina.ai',
    license='Apache 2.0',
    url='https://github.com/jina-ai/clip-as-service',
    download_url='https://github.com/jina-ai/clip-as-service/tags',
    long_description=_long_description,
    long_description_content_type='text/markdown',
    zip_safe=False,
    setup_requires=['setuptools>=18.0', 'wheel'],
    install_requires=[
        'ftfy',
        'torch',
        'regex',
        'torchvision<=0.13.0' if sys.version_info <= (3, 7, 2) else 'torchvision',
        'jina>=3.12.0',
        'docarray==0.21.0',
        'prometheus-client',
        'open_clip_torch>=2.8.0,<2.9.0',
        'pillow-avif-plugin',
    ],
    extras_require={
        'onnx': [
            'onnx',
            'onnxmltools<1.12.0',
        ]
        + (
            ['onnxruntime-gpu<=1.13.1']
            if sys.platform != 'darwin'
            else ['onnxruntime<=1.13.1']
        ),
        'tensorrt': [
            'nvidia-tensorrt==8.4.1.5',
        ],
        'transformers': ['transformers>=4.16.2'],
        'search': ['annlite>=0.3.10'],
        'flash-attn': ['flash-attn'],
        'cn_clip': ['cn_clip'],
    },
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'Intended Audience :: Education',
        'Intended Audience :: Science/Research',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: 3.10',
        'Programming Language :: Unix Shell',
        'Environment :: Console',
        'License :: OSI Approved :: Apache Software License',
        'Operating System :: OS Independent',
        'Topic :: Database :: Database Engines/Servers',
        'Topic :: Scientific/Engineering :: Artificial Intelligence',
        'Topic :: Internet :: WWW/HTTP :: Indexing/Search',
        'Topic :: Scientific/Engineering :: Image Recognition',
        'Topic :: Multimedia :: Video',
        'Topic :: Scientific/Engineering',
        'Topic :: Scientific/Engineering :: Mathematics',
        'Topic :: Software Development',
        'Topic :: Software Development :: Libraries',
        'Topic :: Software Development :: Libraries :: Python Modules',
    ],
    project_urls={
        'Documentation': 'https://clip-as-service.jina.ai',
        'Source': 'https://github.com/jina-ai/clip-as-service/',
        'Tracker': 'https://github.com/jina-ai/clip-as-service/issues',
    },
    keywords='jina openai clip deep-learning cross-modal multi-modal neural-search',
)


================================================
FILE: tests/__init__.py
================================================
import os

os.environ['OMP_NUM_THREADS'] = '1'


================================================
FILE: tests/conftest.py
================================================
import pytest
from jina import helper, Flow


@pytest.fixture(scope='session')
def port_generator():
    generated_ports = set()

    def random_port():
        port = helper.random_port()
        while port in generated_ports:
            port = helper.random_port()
        generated_ports.add(port)
        return port

    return random_port


@pytest.fixture(scope='session', params=['onnx', 'torch', 'onnx_custom'])
def make_flow(port_generator, request):
    if request.param != 'onnx_custom':
        if request.param == 'onnx':
            from clip_server.executors.clip_onnx import CLIPEncoder
        else:
            from clip_server.executors.clip_torch import CLIPEncoder

        f = Flow(port=port_generator()).add(name=request.param, uses=CLIPEncoder)
    else:
        import os
        from clip_server.executors.clip_onnx import CLIPEncoder

        f = Flow(port=port_generator()).add(
            name=request.param,
            uses=CLIPEncoder,
            uses_with={
                'model_path': os.path.expanduser('~/.cache/clip/ViT-B-32-openai')
            },
        )
    with f:
        yield f


@pytest.fixture(scope='session', params=['torch'])
def make_torch_flow(port_generator, request):
    from clip_server.executors.clip_torch import CLIPEncoder

    f = Flow(port=port_generator()).add(name=request.param, uses=CLIPEncoder)
    with f:
        yield f


@pytest.fixture(scope='session', params=['tensorrt'])
def make_trt_flow(port_generator, request):
    from clip_server.executors.clip_tensorrt import CLIPEncoder

    f = Flow(port=port_generator()).add(name=request.param, uses=CLIPEncoder)
    with f:
        yield f


@pytest.fixture(params=['torch'])
def make_search_flow(tmpdir, port_generator, request):
    from clip_server.executors.clip_torch import CLIPEncoder
    from annlite.executor import AnnLiteIndexer

    f = (
        Flow(port=port_generator())
        .add(name=request.param, uses=CLIPEncoder)
        .add(
            name='annlite',
            uses=AnnLiteIndexer,
            workspace=tmpdir,
            uses_with={'n_dim': 512},
        )
    )
    with f:
        yield f


================================================
FILE: tests/test_asyncio.py
================================================
import asyncio
import os
import pytest

from clip_client import Client
from docarray import Document, DocumentArray


async def another_heavylifting_job():
    await asyncio.sleep(3)


@pytest.mark.asyncio
async def test_async_encode(make_flow):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    t1 = asyncio.create_task(another_heavylifting_job())
    t2 = asyncio.create_task(c.aencode(['hello world'] * 10))
    await asyncio.gather(t1, t2)
    assert t2.result().shape


@pytest.mark.parametrize(
    'inputs',
    [
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        DocumentArray(
            [
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    text='hello, world',
                ),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
@pytest.mark.asyncio
async def test_async_docarray_preserve_original_inputs(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    t1 = asyncio.create_task(another_heavylifting_job())
    t2 = asyncio.create_task(c.aencode(inputs if not callable(inputs) else inputs()))
    await asyncio.gather(t1, t2)
    assert isinstance(t2.result(), DocumentArray)
    assert inputs[0] is t2.result()[0]
    assert t2.result().embeddings.shape
    assert t2.result().contents == inputs.contents
    assert not t2.result()[0].tensor
    assert inputs[0] is t2.result()[0]


@pytest.mark.parametrize(
    'inputs',
    [
        [Document(id=str(i), text='hello, world') for i in range(20)],
        DocumentArray([Document(id=str(i), text='hello, world') for i in range(20)]),
    ],
)
@pytest.mark.asyncio
async def test_async_docarray_preserve_original_order(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    t1 = asyncio.create_task(another_heavylifting_job())
    t2 = asyncio.create_task(
        c.aencode(inputs if not callable(inputs) else inputs(), batch_size=1)
    )
    await asyncio.gather(t1, t2)
    assert isinstance(t2.result(), DocumentArray)
    for i in range(len(inputs)):
        assert inputs[i] is t2.result()[i]
        assert inputs[i].id == str(i)


================================================
FILE: tests/test_client.py
================================================
import os
import random
import time
import pytest
import numpy as np
from docarray import Document, DocumentArray
from jina import Flow, Executor, requests


class Exec1(Executor):
    @requests
    async def aencode(self, docs, **kwargs):
        time.sleep(random.random() * 1)
        docs.embeddings = np.random.rand(len(docs), 10)


class Exec2(Executor):
    def __init__(self, server_host: str = '', **kwargs):
        super().__init__(**kwargs)
        from clip_client.client import Client

        self._client = Client(server=server_host)

    @requests
    async def process(self, docs, **kwargs):
        results = await self._client.aencode(docs, batch_size=2)
        return results


class ErrorExec(Executor):
    @requests
    def foo(self, docs, **kwargs):
        raise NotImplementedError


def test_client_concurrent_requests(port_generator):

    f1 = Flow(port=port_generator()).add(uses=Exec1)

    f2 = Flow(protocol='http').add(
        uses=Exec2, uses_with={'server_host': f'grpc://0.0.0.0:{f1.port}'}
    )

    with f1, f2:
        import jina
        from multiprocessing.pool import ThreadPool

        def run_post(docs):
            c = jina.clients.Client(port=f2.port, protocol='http')
            results = c.post(on='/', inputs=docs, request_size=2)
            # assert set([d.id for d in results]) != set([d.id for d in docs])
            return results

        def generate_docs(tag):
            return DocumentArray(
                [Document(id=f'{tag}_{i}', text='hello') for i in range(20)]
            )

        with ThreadPool(5) as p:
            results = p.map(run_post, [generate_docs(f't{k}') for k in range(5)])

        for r in results:
            assert len(set([d.id[:2] for d in r])) == 1


def test_client_large_input(make_torch_flow):
    from clip_client.client import Client

    inputs = ['hello' for _ in range(600)]

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')
    with pytest.warns(UserWarning):
        c.encode(inputs if not callable(inputs) else inputs())


@pytest.mark.parametrize(
    'inputs',
    [
        [],
        DocumentArray(),
    ],
)
@pytest.mark.parametrize('endpoint', ['encode', 'rank', 'index', 'search'])
@pytest.mark.asyncio
def test_empty_input(make_torch_flow, inputs, endpoint):
    from clip_client.client import Client

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')

    r = getattr(c, endpoint)(inputs if not callable(inputs) else inputs())
    if endpoint == 'encode':
        if isinstance(inputs, DocumentArray):
            assert isinstance(r, DocumentArray)
        else:
            assert isinstance(r, list)
    else:
        assert isinstance(r, DocumentArray)
    assert len(r) == 0


@pytest.mark.parametrize(
    'inputs',
    [
        [],
        DocumentArray(),
    ],
)
@pytest.mark.parametrize('endpoint', ['aencode', 'arank', 'aindex', 'asearch'])
@pytest.mark.asyncio
async def test_async_empty_input(make_torch_flow, inputs, endpoint):
    from clip_client.client import Client

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')

    r = await getattr(c, endpoint)(inputs if not callable(inputs) else inputs())
    if endpoint == 'aencode':
        if isinstance(inputs, DocumentArray):
            assert isinstance(r, DocumentArray)
        else:
            assert isinstance(r, list)
    else:
        assert isinstance(r, DocumentArray)
    assert len(r) == 0


@pytest.mark.parametrize('endpoint', ['encode', 'rank', 'index', 'search'])
def test_wrong_input_type(make_torch_flow, endpoint):
    from clip_client.client import Client

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')

    with pytest.raises(Exception):
        getattr(c, endpoint)('hello')


@pytest.mark.parametrize('endpoint', ['aencode', 'arank', 'aindex', 'asearch'])
@pytest.mark.asyncio
async def test_wrong_input_type(make_torch_flow, endpoint):
    from clip_client.client import Client

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')

    with pytest.raises(Exception):
        await getattr(c, endpoint)('hello')


@pytest.mark.parametrize('endpoint', ['encode', 'rank', 'index', 'search'])
@pytest.mark.slow
def test_custom_on_done(make_torch_flow, mocker, endpoint):
    from clip_client.client import Client

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')

    on_done_mock = mocker.Mock()
    on_error_mock = mocker.Mock()
    on_always_mock = mocker.Mock()

    r = getattr(c, endpoint)(
        DocumentArray(
            [Document(text='hello', matches=DocumentArray([Document(text='jina')]))]
        ),
        on_done=on_done_mock,
        on_error=on_error_mock,
        on_always=on_always_mock,
    )
    assert r is None
    on_done_mock.assert_called_once()
    on_error_mock.assert_not_called()
    on_always_mock.assert_called_once()


@pytest.mark.parametrize('endpoint', ['aencode', 'arank', 'aindex', 'asearch'])
@pytest.mark.slow
@pytest.mark.asyncio
async def test_async_custom_on_done(make_torch_flow, mocker, endpoint):
    from clip_client.client import Client

    c = Client(server=f'grpc://0.0.0.0:{make_torch_flow.port}')

    on_done_mock = mocker.Mock()
    on_error_mock = mocker.Mock()
    on_always_mock = mocker.Mock()

    r = await getattr(c, endpoint)(
        DocumentArray(
            [Document(text='hello', matches=DocumentArray([Document(text='jina')]))]
        ),
        on_done=on_done_mock,
        on_error=on_error_mock,
        on_always=on_always_mock,
    )
    assert r is None
    on_done_mock.assert_called_once()
    on_error_mock.assert_not_called()
    on_always_mock.assert_called_once()


@pytest.mark.parametrize('endpoint', ['encode', 'rank', 'index', 'search'])
@pytest.mark.slow
def test_custom_on_error(port_generator, mocker, endpoint):
    from clip_client.client import Client

    f = Flow(port=port_generator()).add(uses=ErrorExec)

    with f:
        c = Client(server=f'grpc://0.0.0.0:{f.port}')

        on_done_mock = mocker.Mock()
        on_error_mock = mocker.Mock()
        on_always_mock = mocker.Mock()

        r = getattr(c, endpoint)(
            DocumentArray(
                [Document(text='hello', matches=DocumentArray([Document(text='jina')]))]
            ),
            on_done=on_done_mock,
            on_error=on_error_mock,
            on_always=on_always_mock,
        )
        assert r is None
        on_done_mock.assert_not_called()
        on_error_mock.assert_called_once()
        on_always_mock.assert_called_once()


@pytest.mark.parametrize('endpoint', ['aencode', 'arank', 'aindex', 'asearch'])
@pytest.mark.slow
@pytest.mark.asyncio
async def test_async_custom_on_error(port_generator, mocker, endpoint):
    from clip_client.client import Client

    f = Flow(port=port_generator()).add(uses=ErrorExec)

    with f:
        c = Client(server=f'grpc://0.0.0.0:{f.port}')

        on_done_mock = mocker.Mock()
        on_error_mock = mocker.Mock()
        on_always_mock = mocker.Mock()

        r = await getattr(c, endpoint)(
            DocumentArray(
                [Document(text='hello', matches=DocumentArray([Document(text='jina')]))]
            ),
            on_done=on_done_mock,
            on_error=on_error_mock,
            on_always=on_always_mock,
        )
        assert r is None
        on_done_mock.assert_not_called()
        on_error_mock.assert_called_once()
        on_always_mock.assert_called_once()


================================================
FILE: tests/test_helper.py
================================================
import pytest
import numpy as np
from clip_server.executors.helper import numpy_softmax
from clip_server.executors.helper import split_img_txt_da
from clip_server.executors.helper import preproc_image
from docarray import Document, DocumentArray


@pytest.mark.parametrize('shape', [(5, 10), (5, 10, 10)])
@pytest.mark.parametrize('axis', [-1, 1, 0])
def test_numpy_softmax(shape, axis):
    import torch

    logits = np.random.random(shape)

    np_softmax = numpy_softmax(logits, axis=axis)
    torch_softmax = torch.from_numpy(logits).softmax(dim=axis).numpy()
    np.testing.assert_array_almost_equal(np_softmax, torch_softmax)

    np_softmax = numpy_softmax(logits, axis=axis)
    torch_softmax = torch.from_numpy(logits).softmax(dim=axis).numpy()
    np.testing.assert_array_almost_equal(np_softmax, torch_softmax)


@pytest.mark.parametrize(
    'inputs',
    [
        (
            DocumentArray(
                [
                    Document(text='hello, world'),
                    Document(text='goodbye, world'),
                    Document(
                        text='hello, world',
                        uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    ),
                    Document(
                        uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    ),
                ]
            ),
            (3, 1),
        ),
        (
            DocumentArray(
                [
                    Document(text='hello, world'),
                    Document(tensor=np.array([0, 1, 2])),
                    Document(
                        uri='https://clip-as-service.jina.ai/_static/favicon.png'
                    ).load_uri_to_blob(),
                    Document(
                        tensor=np.array([0, 1, 2]),
                        uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    ),
                    Document(
                        uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    ),
                ]
            ),
            (1, 4),
        ),
        (
            DocumentArray(
                [
                    Document(text='hello, world'),
                    Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
                ]
            ),
            (1, 1),
        ),
    ],
)
def test_split_img_txt_da(inputs):
    txt_da = DocumentArray()
    img_da = DocumentArray()
    for doc in inputs[0]:
        split_img_txt_da(doc, img_da, txt_da)
    assert len(txt_da) == inputs[1][0]
    assert len(img_da) == inputs[1][1]


@pytest.mark.parametrize(
    'inputs',
    [
        DocumentArray(
            [
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                ).load_uri_to_image_tensor(),
            ]
        )
    ],
)
def test_preproc_image(inputs):
    from clip_server.model import clip

    preprocess_fn = clip._transform_blob(224)
    da, pixel_values = preproc_image(inputs, preprocess_fn, drop_image_content=True)
    assert len(da) == 1
    assert not da[0].blob
    assert not da[0].tensor
    assert pixel_values.get('pixel_values') is not None


================================================
FILE: tests/test_model.py
================================================
import pytest
from clip_server.model.clip_model import CLIPModel
from clip_server.model.clip_onnx import CLIPOnnxModel
from clip_server.model.openclip_model import OpenCLIPModel
from clip_server.model.mclip_model import MultilingualCLIPModel
from clip_server.model.cnclip_model import CNClipModel


@pytest.mark.parametrize(
    'name, model_cls',
    [
        ('ViT-L/14@336px', OpenCLIPModel),
        ('RN50::openai', OpenCLIPModel),
        ('roberta-ViT-B-32::laion2b-s12b-b32k', OpenCLIPModel),
        ('M-CLIP/LABSE-Vit-L-14', MultilingualCLIPModel),
        ('CN-CLIP/ViT-B-16', CNClipModel),
    ],
)
def test_torch_model(name, model_cls):
    model = CLIPModel(name)
    assert model.__class__ == model_cls


@pytest.mark.parametrize(
    'name',
    [
        'RN50::openai',
        'ViT-H-14::laion2b-s32b-b79k',
        'M-CLIP/LABSE-Vit-L-14',
    ],
)
def test_onnx_model(name):
    CLIPOnnxModel(name)


@pytest.mark.gpu
@pytest.mark.parametrize(
    'name',
    ['ViT-H-14::laion2b-s32b-b79k'],
)
def test_large_onnx_model_fp16(name):
    from clip_server.executors.clip_onnx import CLIPEncoder

    CLIPEncoder(name, dtype='fp16')


================================================
FILE: tests/test_ranker.py
================================================
import os

import numpy as np
import pytest
from docarray import DocumentArray, Document

from clip_client import Client
from clip_server.executors.clip_onnx import CLIPEncoder as ONNXCLILPEncoder
from clip_server.executors.clip_torch import CLIPEncoder as TorchCLIPEncoder


@pytest.mark.asyncio
@pytest.mark.parametrize('encoder_class', [TorchCLIPEncoder, ONNXCLILPEncoder])
async def test_torch_executor_rank_img2texts(encoder_class):
    ce = encoder_class()

    da = DocumentArray.from_files(
        f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
    )
    for d in da:
        d.matches.append(Document(text='hello, world!'))
        d.matches.append(Document(text='goodbye, world!'))
        d.matches.append(Document(text='goodbye,!'))
        d.matches.append(Document(text='good world!'))
        d.matches.append(Document(text='good!'))
        d.matches.append(Document(text='world!'))

    await ce.rank(da, {})
    print(da['@m', 'scores__clip_score__value'])
    for d in da:
        for c in d.matches:
            assert c.scores['clip_score'].value is not None
            assert not c.tensor
        org_score = d.matches[:, 'scores__clip_score__value']
        assert org_score == list(sorted(org_score, reverse=True))
        assert not d.tensor


@pytest.mark.asyncio
@pytest.mark.parametrize('encoder_class', [TorchCLIPEncoder, ONNXCLILPEncoder])
async def test_torch_executor_rank_text2imgs(encoder_class):
    ce = encoder_class()
    db = DocumentArray(
        [Document(text='hello, world!'), Document(text='goodbye, world!')]
    )
    for d in db:
        d.matches.extend(
            DocumentArray.from_files(
                f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
            )
        )
    await ce.rank(db, {})
    print(db['@m', 'scores__clip_score__value'])
    for d in db:
        for c in d.matches:
            assert c.scores['clip_score'].value is not None
            assert c.scores['clip_score_cosine'].value is not None
            assert not c.tensor
        np.testing.assert_almost_equal(
            sum(c.scores['clip_score'].value for c in d.matches), 1
        )
        assert not d.tensor
        assert not d.blob


@pytest.mark.parametrize(
    'inputs',
    [
        [
            Document(
                uri='https://clip-as-service.jina.ai/_static/favicon.png',
                matches=[
                    Document(text='hello, world'),
                    Document(text='goodbye, world'),
                ],
            ),
            Document(
                uri='https://clip-as-service.jina.ai/_static/favicon.png',
                matches=[
                    Document(text='hello, world'),
                    Document(text='goodbye, world'),
                ],
            ),
        ],
        DocumentArray(
            [
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    matches=[
                        Document(text='hello, world'),
                        Document(text='goodbye, world'),
                    ],
                ),
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    matches=[
                        Document(text='hello, world'),
                        Document(text='goodbye, world'),
                    ],
                ),
            ]
        ),
        lambda: (
            Document(
                uri='https://clip-as-service.jina.ai/_static/favicon.png',
                matches=[
                    Document(text='hello, world'),
                    Document(text='goodbye, world'),
                ],
            )
            for _ in range(10)
        ),
        DocumentArray(
            [
                Document(
                    text='hello, world',
                    matches=[
                        Document(
                            uri='https://clip-as-service.jina.ai/_static/favicon.png'
                        ),
                        Document(
                            uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                        ),
                    ],
                )
            ]
        ),
    ],
)
def test_docarray_inputs(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = c.rank(inputs if not callable(inputs) else inputs())
    assert not r[0].tensor
    assert isinstance(r, DocumentArray)
    rv1 = r['@m', 'scores__clip_score__value']
    rv2 = r['@m', 'scores__clip_score_cosine__value']
    for v1, v2 in zip(rv1, rv2):
        assert v1 is not None
        assert v1 > 0
        assert v2 is not None
        assert v2 > 0


@pytest.mark.parametrize(
    'inputs',
    [
        [
            Document(
                uri='https://clip-as-service.jina.ai/_static/favicon.png',
                matches=[
                    Document(text='hello, world'),
                    Document(text='goodbye, world'),
                ],
            ),
        ],
        DocumentArray(
            [
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    matches=[
                        Document(text='hello, world'),
                        Document(text='goodbye, world'),
                    ],
                ),
            ]
        ),
        lambda: (
            Document(
                uri='https://clip-as-service.jina.ai/_static/favicon.png',
                matches=[
                    Document(text='hello, world'),
                    Document(text='goodbye, world'),
                ],
            )
            for _ in range(1)
        ),
        DocumentArray(
            [
                Document(
                    text='hello, world',
                    matches=[
                        Document(
                            uri='https://clip-as-service.jina.ai/_static/favicon.png'
                        ),
                        Document(
                            uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                        ),
                    ],
                )
            ]
        ),
    ],
)
@pytest.mark.asyncio
async def test_async_arank(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = await c.arank(inputs if not callable(inputs) else inputs())
    assert not r[0].tensor
    assert isinstance(r, DocumentArray)
    rv = r['@m', 'scores__clip_score__value']
    for v in rv:
        assert v is not None
        assert v > 0
    np.testing.assert_almost_equal(sum(rv), 1.0)

    rv = r['@m', 'scores__clip_score_cosine__value']
    for v in rv:
        assert v is not None
        assert -1.0 <= v <= 1.0


@pytest.mark.parametrize(
    'inputs',
    [
        [
            Document(
                id=str(i), text='A', matches=[Document(text='B'), Document(text='C')]
            )
            for i in range(20)
        ],
        DocumentArray(
            [
                Document(
                    id=str(i),
                    text='A',
                    matches=[Document(text='B'), Document(text='C')],
                )
                for i in range(20)
            ]
        ),
    ],
)
def test_docarray_preserve_original_order(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = c.rank(inputs, batch_size=1)
    assert isinstance(r, DocumentArray)
    for i in range(len(inputs)):
        assert inputs[i] is r[i]
        assert inputs[i].id == str(i)


@pytest.mark.parametrize(
    'inputs',
    [
        [
            Document(
                id=str(i), text='A', matches=[Document(text='B'), Document(text='C')]
            )
            for i in range(20)
        ],
        DocumentArray(
            [
                Document(
                    id=str(i),
                    text='A',
                    matches=[Document(text='B'), Document(text='C')],
                )
                for i in range(20)
            ]
        ),
    ],
)
@pytest.mark.asyncio
async def test_async_docarray_preserve_original_order(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = await c.arank(inputs, batch_size=1)
    assert isinstance(r, DocumentArray)
    for i in range(len(inputs)):
        assert inputs[i] is r[i]
        assert inputs[i].id == str(i)


================================================
FILE: tests/test_search.py
================================================
import os

import numpy as np
import pytest
from docarray import DocumentArray, Document

from clip_client import Client


@pytest.mark.parametrize(
    'inputs',
    [
        [Document(text='hello, world'), Document(text='goodbye, world')],
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        lambda: (Document(text='hello, world') for _ in range(10)),
        DocumentArray(
            [
                Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ),
                Document(text='hello, world'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ).load_uri_to_image_tensor(),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
@pytest.mark.parametrize('limit', [1, 2])
def test_index_search(make_search_flow, inputs, limit):
    c = Client(server=f'grpc://0.0.0.0:{make_search_flow.port}')

    r = c.index(inputs if not callable(inputs) else inputs())
    assert isinstance(r, DocumentArray)
    assert r.embeddings.shape[1] == 512

    r = c.search(inputs if not callable(inputs) else inputs(), limit=limit)
    assert isinstance(r, DocumentArray)
    for d in r:
        assert len(d.matches) == limit


@pytest.mark.parametrize(
    'inputs',
    [
        [Document(text='hello, world'), Document(text='goodbye, world')],
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        lambda: (Document(text='hello, world') for _ in range(10)),
        DocumentArray(
            [
                Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ),
                Document(text='hello, world'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ).load_uri_to_image_tensor(),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
@pytest.mark.parametrize('limit', [1, 2])
@pytest.mark.asyncio
async def test_async_index_search(make_search_flow, inputs, limit):
    c = Client(server=f'grpc://0.0.0.0:{make_search_flow.port}')
    r = await c.aindex(inputs if not callable(inputs) else inputs())
    assert isinstance(r, DocumentArray)

    assert r.embeddings.shape[1] == 512

    r = await c.asearch(inputs if not callable(inputs) else inputs(), limit=limit)
    assert isinstance(r, DocumentArray)
    for d in r:
        assert len(d.matches) == limit


================================================
FILE: tests/test_server.py
================================================
import os

import pytest
from clip_server.model.clip import _transform_ndarray, _transform_blob
from clip_server.model.pretrained_models import download_model
from docarray import Document
from jina import Flow
import numpy as np


def test_server_download(tmpdir):
    download_model(
        url='https://clip-as-service.jina.ai/_static/favicon.png',
        target_folder=tmpdir,
        md5sum='43104e468ddd23c55bc662d84c87a7f8',
        with_resume=False,
    )
    target_path = os.path.join(tmpdir, 'favicon.png')
    file_size = os.path.getsize(target_path)
    assert file_size > 0

    part_path = target_path + '.part'
    with open(target_path, 'rb') as source, open(part_path, 'wb') as part_out:
        buf = source.read(10)
        part_out.write(buf)

    os.remove(target_path)

    download_model(
        url='https://clip-as-service.jina.ai/_static/favicon.png',
        target_folder=tmpdir,
        md5sum='43104e468ddd23c55bc662d84c87a7f8',
        with_resume=True,
    )
    assert os.path.getsize(target_path) == file_size
    assert not os.path.exists(part_path)


@pytest.mark.parametrize('md5', ['ABC', None, '43104e468ddd23c55bc662d84c87a7f8'])
def test_server_download_md5(tmpdir, md5):
    if md5 != 'ABC':
        download_model(
            url='https://clip-as-service.jina.ai/_static/favicon.png',
            target_folder=tmpdir,
            md5sum=md5,
            with_resume=False,
        )
    else:
        with pytest.raises(Exception):
            download_model(
                url='https://clip-as-service.jina.ai/_static/favicon.png',
                target_folder=tmpdir,
                md5sum=md5,
                with_resume=False,
            )


def test_server_download_not_regular_file(tmpdir):
    with pytest.raises(Exception):
        download_model(
            url='https://clip-as-service.jina.ai/_static/favicon.png',
            target_folder=tmpdir,
            md5sum='',
            with_resume=False,
        )
        download_model(
            url='https://docarray.jina.ai/_static/',
            target_folder=tmpdir,
            md5sum='',
            with_resume=False,
        )


def test_make_onnx_flow_wrong_name_path():
    from clip_server.executors.clip_onnx import CLIPEncoder

    with pytest.raises(Exception):
        encoder = CLIPEncoder(
            'ABC', model_path=os.path.expanduser('~/.cache/clip/ViT-B-32')
        )

    with pytest.raises(Exception) as info:
        encoder = CLIPEncoder('ViT-B/32', model_path='~/.cache/')


@pytest.mark.parametrize(
    'image_uri',
    [
        f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg',
        'https://clip-as-service.jina.ai/_static/favicon.png',
    ],
)
@pytest.mark.parametrize('size', [224, 288, 384, 448])
def test_server_preprocess_ndarray_image(image_uri, size):
    d1 = Document(uri=image_uri)
    d1.load_uri_to_blob()
    d2 = Document(uri=image_uri)
    d2.load_uri_to_image_tensor()

    t1 = _transform_blob(size)(d1.blob).numpy()
    t2 = _transform_ndarray(size)(d2.tensor).numpy()
    assert t1.shape == t2.shape


@pytest.mark.parametrize(
    'tensor',
    [
        np.random.random([100, 100, 3]),
        np.random.random([1, 1, 3]),
        np.random.random([5, 50, 3]),
    ],
)
def test_transform_arbitrary_tensor(tensor):
    d = Document(tensor=tensor)
    assert _transform_ndarray(224)(d.tensor).numpy().shape == (3, 224, 224)


================================================
FILE: tests/test_simple.py
================================================
import os

import pytest
from docarray import Document, DocumentArray
from jina import Flow

from clip_client.client import Client


@pytest.mark.parametrize('protocol', ['grpc', 'http', 'websocket', 'other'])
@pytest.mark.parametrize('jit', [True, False])
def test_protocols(port_generator, protocol, jit, pytestconfig):
    from clip_server.executors.clip_torch import CLIPEncoder

    if protocol == 'other':
        with pytest.raises(ValueError):
            Client(server=f'{protocol}://0.0.0.0:8000')
        return

    f = Flow(port=port_generator(), protocol=protocol).add(
        uses=CLIPEncoder, uses_with={'jit': jit}
    )
    with f:
        c = Client(server=f'{protocol}://0.0.0.0:{f.port}')
        c.profile()
        c.profile(content='hello world')
        c.profile(content=f'{pytestconfig.rootdir}/tests/img/00000.jpg')


@pytest.mark.gpu
@pytest.mark.parametrize(
    'inputs',
    [
        ['hello, world', 'goodbye, world'],
        ('hello, world', 'goodbye, world'),
        lambda: ('hello, world' for _ in range(10)),
        [
            'https://clip-as-service.jina.ai/_static/favicon.png',
            f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg',
            'hello, world',
        ],
    ],
)
def test_plain_inputs(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = c.encode(inputs if not callable(inputs) else inputs())
    assert (
        r.shape[0] == len(list(inputs)) if not callable(inputs) else len(list(inputs()))
    )


@pytest.mark.gpu
@pytest.mark.parametrize(
    'inputs',
    [
        [Document(text='hello, world'), Document(text='goodbye, world')],
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        lambda: (Document(text='hello, world') for _ in range(10)),
        DocumentArray(
            [
                Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ),
                Document(text='hello, world'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ).load_uri_to_image_tensor(),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
def test_docarray_inputs(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = c.encode(inputs if not callable(inputs) else inputs())
    assert isinstance(r, DocumentArray)
    assert r.embeddings.shape
    assert not r[0].tensor
    if hasattr(inputs, '__len__'):
        assert inputs[0] is r[0]


@pytest.mark.parametrize(
    'inputs',
    [
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        DocumentArray(
            [
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    text='hello, world',
                ),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
def test_docarray_preserve_original_inputs(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = c.encode(inputs if not callable(inputs) else inputs())
    assert isinstance(r, DocumentArray)
    assert r.embeddings.shape
    assert r.contents == inputs.contents
    assert not r[0].tensor
    assert inputs[0] is r[0]


@pytest.mark.parametrize(
    'inputs',
    [
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        DocumentArray(
            [
                Document(
                    uri='https://clip-as-service.jina.ai/_static/favicon.png',
                    text='hello, world',
                ),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
def test_docarray_traversal(make_flow, inputs):
    from jina import Client as _Client

    da = DocumentArray.empty(1)
    da[0].chunks = inputs

    c = _Client(host=f'grpc://0.0.0.0', port=make_flow.port)
    r1 = c.post(on='/', inputs=da, parameters={'traversal_paths': '@c'})
    assert isinstance(r1, DocumentArray)
    assert r1[0].chunks.embeddings.shape[0] == len(inputs)
    assert not r1[0].tensor
    assert not r1[0].blob
    assert not r1[0].chunks[0].tensor
    assert not r1[0].chunks[0].blob

    r2 = c.post(on='/', inputs=da, parameters={'access_paths': '@c'})
    assert isinstance(r2, DocumentArray)
    assert r2[0].chunks.embeddings.shape[0] == len(inputs)
    assert not r2[0].tensor
    assert not r2[0].blob
    assert not r2[0].chunks[0].tensor
    assert not r2[0].chunks[0].blob


@pytest.mark.parametrize(
    'inputs',
    [
        [Document(id=str(i), text='hello, world') for i in range(20)],
        DocumentArray([Document(id=str(i), text='hello, world') for i in range(20)]),
    ],
)
def test_docarray_preserve_original_order(make_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_flow.port}')
    r = c.encode(inputs if not callable(inputs) else inputs(), batch_size=1)
    assert isinstance(r, DocumentArray)
    for i in range(len(inputs)):
        assert inputs[i] is r[i]
        assert inputs[i].id == str(i)


================================================
FILE: tests/test_tensorrt.py
================================================
import os

import pytest
import numpy as np
from docarray import Document, DocumentArray
from jina import Flow

from clip_client.client import Client


@pytest.mark.gpu
@pytest.mark.parametrize(
    'inputs',
    [
        [Document(text='hello, world'), Document(text='goodbye, world')],
        DocumentArray([Document(text='hello, world'), Document(text='goodbye, world')]),
        lambda: (Document(text='hello, world') for _ in range(10)),
        DocumentArray(
            [
                Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ),
                Document(text='hello, world'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ).load_uri_to_image_tensor(),
            ]
        ),
        DocumentArray.from_files(
            f'{os.path.dirname(os.path.abspath(__file__))}/**/*.jpg'
        ),
    ],
)
def test_docarray_inputs(make_trt_flow, inputs):
    c = Client(server=f'grpc://0.0.0.0:{make_trt_flow.port}')
    r = c.encode(inputs if not callable(inputs) else inputs())
    assert isinstance(r, DocumentArray)
    assert r.embeddings.shape
    if hasattr(inputs, '__len__'):
        assert inputs[0] is r[0]


@pytest.mark.gpu
@pytest.mark.asyncio
@pytest.mark.parametrize(
    'd',
    [
        Document(
            uri='https://clip-as-service.jina.ai/_static/favicon.png',
            matches=[Document(text='hello, world'), Document(text='goodbye, world')],
        ),
        Document(
            text='hello, world',
            matches=[
                Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
                Document(
                    uri=f'{os.path.dirname(os.path.abspath(__file__))}/img/00000.jpg'
                ),
            ],
        ),
    ],
)
async def test_async_arank(make_trt_flow, d):
    c = Client(server=f'grpc://0.0.0.0:{make_trt_flow.port}')
    r = await c.arank([d])
    assert isinstance(r, DocumentArray)
    assert d is r[0]
    rv = r['@m', 'scores__clip_score__value']
    for v in rv:
        assert v is not None
        assert v > 0
    np.testing.assert_almost_equal(sum(rv), 1.0)

    rv = r['@m', 'scores__clip_score_cosine__value']
    for v in rv:
        assert v is not None
        assert -1.0 <= v <= 1.0


================================================
FILE: tests/test_tokenization.py
================================================
import pytest
from clip_server.model.tokenization import Tokenizer


@pytest.mark.parametrize(
    'name', ['ViT-L/14@336px', 'M-CLIP/XLM-Roberta-Large-Vit-B-32']
)
def test_tokenizer_name(name):
    tokenizer = Tokenizer(name)

    result = tokenizer('hello world')
    assert result['input_ids'].shape == result['attention_mask'].shape
    assert result['input_ids'].shape[0] == 1

    result = tokenizer(['hello world', 'welcome to the world'])
    assert result['input_ids'].shape == result['attention_mask'].shape
    assert result['input_ids'].shape[0] == 2