Repository: HaonanGuo/Remote-Sensing-ChatGPT
Branch: main
Commit: 96adff321ce1
Files: 91
Total size: 456.0 KB

Directory structure:
gitextract_ppmvimf1/

├── Prefix/
│   └── __init__.py
├── README.md
├── RSChatGPT-shell.py
├── RStask/
│   ├── EdgeDetection/
│   │   ├── Canny.py
│   │   ├── __init__.py
│   │   └── test.py
│   ├── ImageCaptioning/
│   │   ├── __init__.py
│   │   ├── blip.py
│   │   └── test.py
│   ├── InstanceSegmentation/
│   │   ├── SwinUpper.py
│   │   ├── __init__.py
│   │   ├── model.py
│   │   ├── swin.py
│   │   ├── test.py
│   │   └── uper.py
│   ├── LanduseSegmentation/
│   │   ├── __init__.py
│   │   ├── seg_hrnet.py
│   │   └── test.py
│   ├── ObjectCounting/
│   │   ├── Yolocounting.py
│   │   ├── __init__.py
│   │   └── test.py
│   ├── ObjectDetection/
│   │   ├── YOLOv5.py
│   │   ├── __init__.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── experimental.py
│   │   │   ├── hub/
│   │   │   │   ├── anchors.yaml
│   │   │   │   ├── yolov3-spp.yaml
│   │   │   │   ├── yolov3-tiny.yaml
│   │   │   │   ├── yolov3.yaml
│   │   │   │   ├── yolov5-bifpn.yaml
│   │   │   │   ├── yolov5-fpn.yaml
│   │   │   │   ├── yolov5-p2.yaml
│   │   │   │   ├── yolov5-p34.yaml
│   │   │   │   ├── yolov5-p6.yaml
│   │   │   │   ├── yolov5-p7.yaml
│   │   │   │   ├── yolov5-panet.yaml
│   │   │   │   ├── yolov5l6.yaml
│   │   │   │   ├── yolov5m6.yaml
│   │   │   │   ├── yolov5n6.yaml
│   │   │   │   ├── yolov5s-ghost.yaml
│   │   │   │   ├── yolov5s-transformer.yaml
│   │   │   │   ├── yolov5s6.yaml
│   │   │   │   └── yolov5x6.yaml
│   │   │   ├── tf.py
│   │   │   ├── yolo.py
│   │   │   ├── yolov5l.yaml
│   │   │   ├── yolov5m.yaml
│   │   │   ├── yolov5n.yaml
│   │   │   ├── yolov5s.yaml
│   │   │   └── yolov5x.yaml
│   │   ├── test.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── activations.py
│   │       ├── augmentations.py
│   │       ├── autoanchor.py
│   │       ├── autobatch.py
│   │       ├── aws/
│   │       │   ├── __init__.py
│   │       │   ├── mime.sh
│   │       │   ├── resume.py
│   │       │   └── userdata.sh
│   │       ├── benchmarks.py
│   │       ├── callbacks.py
│   │       ├── datasets.py
│   │       ├── docker/
│   │       │   ├── Dockerfile
│   │       │   └── Dockerfile-cpu
│   │       ├── downloads.py
│   │       ├── flask_rest_api/
│   │       │   ├── README.md
│   │       │   ├── example_request.py
│   │       │   └── restapi.py
│   │       ├── general.py
│   │       ├── google_app_engine/
│   │       │   ├── Dockerfile
│   │       │   ├── additional_requirements.txt
│   │       │   └── app.yaml
│   │       ├── loggers/
│   │       │   ├── __init__.py
│   │       │   └── wandb/
│   │       │       ├── README.md
│   │       │       ├── __init__.py
│   │       │       ├── log_dataset.py
│   │       │       ├── sweep.py
│   │       │       ├── sweep.yaml
│   │       │       └── wandb_utils.py
│   │       ├── loss.py
│   │       ├── metrics.py
│   │       ├── plots.py
│   │       └── torch_utils.py
│   ├── SceneClassification/
│   │   ├── ResNetScene.py
│   │   ├── __init__.py
│   │   └── test.py
│   └── __init__.py
├── checkpoints/
│   └── __init__.py
└── requirements.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: Prefix/__init__.py
================================================
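# Prompt templates for the LangChain agent. RSChatGPT-shell.py passes these strings to
# initialize_agent via agent_kwargs={'prefix': RS_CHATGPT_PREFIX, 'format_instructions': RS_CHATGPT_FORMAT_INSTRUCTIONS, 'suffix': RS_CHATGPT_SUFFIX}.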
RS_CHATGPT_PREFIX = """Remote Sensing ChatGPT is designed to assist with a wide range of remote sensing image related tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of remote sensing applications. Remote Sensing ChatGPT is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Remote Sensing ChatGPT can process and understand large amounts of remote sensing images, knowledge, and text. As a specialized language model, Remote Sensing ChatGPT cannot directly read remote sensing images, but it has a list of tools to finish different remote sensing tasks. Each input remote sensing image will have a file name formed as "image/xxx.png", and Remote Sensing ChatGPT can invoke different tools to indirectly understand the remote sensing image. When talking about images, Remote Sensing ChatGPT is very strict to the file name and will never fabricate nonexistent files. When using tools to generate new image files, Remote Sensing ChatGPT also knows that the image may not be the same as the user's demand, and will use other visual question answering tools or description tools to observe the real image. Remote Sensing ChatGPT is able to use tools in a sequence, and is loyal to the tool observation outputs rather than faking the image content and image file name. It will remember to provide the file name from the last tool observation, if a new image is generated.

Human may provide new remote sensing images to Remote Sensing ChatGPT with a description. The description helps Remote Sensing ChatGPT to understand this image, but Remote Sensing ChatGPT should use tools to finish the following tasks, rather than directly imagining from the description.

Overall, Remote Sensing ChatGPT is a powerful visual dialogue assistant tool that can help with a wide range of remote sensing tasks and provide valuable insights and information on many remote sensing applications.


TOOLS:
------

Remote Sensing ChatGPT has access to the following tools:"""

RS_CHATGPT_FORMAT_INSTRUCTIONS = """

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:

```
Thought: Do I need to use a tool? No
{ai_prefix}: [your response here]
```

To use a tool, you MUST use the following format:

```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
```
"""

RS_CHATGPT_SUFFIX = """You are very strict to the filename correctness and will never fake a file name if it does not exist.
You will remember to provide the image file name loyally if it's provided in the last tool observation.

Begin!

Previous conversation history:
{chat_history}

New input: {input}
Since Remote Sensing ChatGPT is a text language model, Remote Sensing ChatGPT must use tools to observe remote sensing images rather than imagination.
The thoughts and observations are only visible for Remote Sensing ChatGPT, Remote Sensing ChatGPT should remember to repeat important information in the final response for Human. 
Thought: Do I need to use a tool? {agent_scratchpad} Let's think step by step.

"""


================================================
FILE: README.md
================================================
# RS-ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models
Introduction
----
Remote Sensing ChatGPT (RS-ChatGPT) is an open-source tool for solving remote sensing tasks with ChatGPT in an interactive way. ChatGPT acts as an expert that responds to users' linguistic requests based on the input remote sensing image. It supports various interpretation tasks whose models are trained on remote sensing datasets. To help ChatGPT better understand remote sensing knowledge, remote sensing image captioning is used as a cue for understanding the image. We will keep updating RS-ChatGPT~ Please comment via Issues or send me an email if you have any suggestions! Thanks in advance!

Generally, RS-ChatGPT includes four steps in implementation:
* Prompt Template Generation
* Task Planning
* Task Execution
* Response Generation

Our article is now available on [arXiv](https://arxiv.org/abs/2401.09083)! See you at IGARSS 2024, Greece!<br/>
My personal website is also available: [www.poleguo98.top](https://www.poleguo98.top)


Updates
----
* TODO1: An online website is coming! Let's use RS-ChatGPT Online!<br/>
* TODO2: What about asking RS-ChatGPT to choose different models within the given task?<br/>
* TODO3: Make all the models available on Hugging Face<br/>

2024.01.30: <br/>
* Remote Sensing ChatGPT is updated, with bugs fixed, and is now more readable and more scalable~<br/>

2023.09.19: <br/>
* Remote Sensing ChatGPT now supports GPT-4 and multi-round chatting! An interactive interface is now available (see interface.py)~<br/>

2023.08.23:<br/>
* Initial release

  
The code
----
### Requirements
Please refer to [requirements.txt](https://github.com/HaonanGuo/Remote-Sensing-ChatGPT/blob/main/requirements.txt).
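For example, in a fresh Python environment:
```
pip install -r requirements.txt
```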

### Usage
* Clone the repository: `git clone https://github.com/HaonanGuo/Remote-Sensing-ChatGPT`<br/>
* Download the model weights listed below and place them in the `checkpoints` folder<br/>
* Run [RSChatGPT-shell.py](https://github.com/HaonanGuo/Remote-Sensing-ChatGPT/blob/main/RSChatGPT-shell.py) as shown below
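
A typical invocation, once the weights are in place, looks like this (YOUR_OPENAI_KEY and the image path are placeholders):
```
python RSChatGPT-shell.py --openai_key YOUR_OPENAI_KEY --image_dir path/to/your_image.tif --gpt_name gpt-3.5-turbo
```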

### Supported Functions
| Function |    Description  | Method | Pretrain Dataset     | Model Weights     |
| :--------: | :--------: | :--------: | :--------: | :--------: |
| Image Captioning | Describe the remote sensing image | [BLIP](https://icml.cc/virtual/2022/spotlight/16016) | [BLIP Dataset](https://icml.cc/virtual/2022/spotlight/16016)| [weight(github)](https://github.com/salesforce/BLIP) |
| Scene Classification | Classify the type of scene | [ResNet](https://arxiv.org/abs/1512.03385) | [AID Dataset](http://www.captain-whu.com/project/AID/)|[weight(Google)](https://drive.google.com/file/d/1f-WES6fTGGa5W9BcDPMVhGk3Foc4p9Or/view?usp=drive_link) [weight(Baidu)](https://pan.baidu.com/s/1yNgUQKieZBEJZ0axzN4tiw?pwd=RSGP) |
| Object Detection | Detect RS object from image | [YOLO v5](https://zenodo.org/badge/latestdoi/264818686) | [DOTA](http://captain.whu.edu.cn/DOTAweb)| [weight(Google)](https://drive.google.com/file/d/1Hb7XA6gZxNam8y8nxs2p6EqJ-XaG1o5Y/view?usp=drive_link) [weight(Baidu)](https://pan.baidu.com/s/1XTG-MLxx5_D0OO6M80OP1A?pwd=RSGP) |
| Instance Segmentation | Extract Instance Mask of certain object | [SwinTransformer+UperNet](https://github.com/open-mmlab/mmsegmentation) | [iSAID](https://captain-whu.github.io/iSAID/index)| [weight(Google)](https://drive.google.com/file/d/165jeD0oi6fSpvWrpgfVBbzUOsyHN0xEq/view?usp=drive_link) [weight(Baidu)](https://pan.baidu.com/s/1Tv6BCt68L2deY_wMVZizgg?pwd=RSGP)|
| Landuse Classification | Extract Pixel-wise Landuse Classification | [HRNet](https://github.com/HRNet) | [LoveDA](https://github.com/Junjue-Wang/LoveDA)| [weight(Google)](https://drive.google.com/file/d/1fRyEpb7344S4Y5F2Q4EBO3fXVT4kXaft/view?usp=drive_link) [weight(Baidu)](https://pan.baidu.com/s/1m6yOXbT6cKGqJ64z86u7fQ?pwd=RSGP) |
| Object Counting | Count the number of certain object in an image | [YOLO v5](https://zenodo.org/badge/latestdoi/264818686) | [DOTA](http://captain.whu.edu.cn/DOTAweb)| Same as Object Detection |
| Edge Detection | Extract edge of remote sensing image | Canny |None| None |

More functions to be updated~

### Citation

Please cite the following if you use the data or code in this repo.

```
@article{RSChatGPT,
	title = {Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models},
	shorttitle = {Remote Sensing ChatGPT},
	doi = {10.48550/ARXIV.2401.09083},
	author = {Guo, Haonan and Su, Xin and Wu, Chen and Du, Bo and Zhang, Liangpei and Li, Deren},
	year = {2024},
}

```

## Acknowledgments
- [Visual ChatGPT](https://github.com/microsoft/TaskMatrix)
- [YOLOv5](https://github.com/hukaixuan19970627/yolov5_obb)
- [BLIP](https://github.com/salesforce/BLIP)
  
Help
----
Remote Sensing ChatGPT is an open source project that welcomes any contribution and feedback. Please contact us at: haonan.guo@whu.edu.cn


================================================
FILE: RSChatGPT-shell.py
================================================
import os

import re
import uuid
from skimage import io
import argparse
import inspect
from langchain.chat_models import ChatOpenAI
from langchain.agents.initialize import initialize_agent
from langchain.agents.tools import Tool
from langchain.chains.conversation.memory import ConversationBufferMemory
import numpy as np
from Prefix import  RS_CHATGPT_PREFIX, RS_CHATGPT_FORMAT_INSTRUCTIONS, RS_CHATGPT_SUFFIX
from RStask import ImageEdgeFunction,CaptionFunction,LanduseFunction,DetectionFunction,CountingFuncnction,SceneFunction,InstanceFunction



os.makedirs('image', exist_ok=True)
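# Decorator that attaches a tool name and description to an inference method;
# RSChatGPT later reads func.name / func.description when wrapping each method as a LangChain Tool.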
def prompts(name, description):
    def decorator(func):
        func.name = name
        func.description = description
        return func
    return decorator
def get_new_image_name(org_img_name, func_name="update"):
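    # Build a name like "<4-char uuid>_<func_name>_<first token of the original stem>.png" in the same directory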
    head_tail = os.path.split(org_img_name)
    head = head_tail[0]
    tail = head_tail[1]
    name_split = tail.split('.')[0].split('_')
    this_new_uuid = str(uuid.uuid4())[:4]
    recent_prev_file_name = name_split[0]
    new_file_name = f'{this_new_uuid}_{func_name}_{recent_prev_file_name}.png'.replace('__','_')
    return os.path.join(head, new_file_name)

class EdgeDetection:
    def __init__(self, device):
        print("Initializing Edge Detection Function....")
        self.func = ImageEdgeFunction()
    @prompts(name="Edge Detection On Image",
             description="useful when you want to detect the edge of the remote sensing image. "
                         "like: detect the edges of this image, or canny detection on image, "
                         "or perform edge detection on this image, or detect the  edge of this image. "
                         "The input to this tool should be a string, representing the image_path")
    def inference(self, inputs):
        updated_image_path=get_new_image_name(inputs, func_name="edge")
        self.func.inference(inputs,updated_image_path)
        return updated_image_path

class ObjectCounting:
    def __init__(self, device):
        self.func=CountingFuncnction(device)
    @prompts(name="Count object",
             description="useful when you want to count the number of the  object in the image. "
                         "like: how many planes are there in the image? or count the number of bridges"
                         "The input to this tool should be a comma separated string of two, "
                         "representing the image_path, the text description of the object to be counted")
    def inference(self, inputs):
        image_path, det_prompt = inputs.split(",")
        log_text=self.func.inference(image_path,det_prompt)
        return log_text


class InstanceSegmentation:
    def __init__(self, device):
        print("Initializing InstanceSegmentation")
        self.func=InstanceFunction(device)
    @prompts(name="Instance Segmentation for Remote Sensing Image",
             description="useful when you want to apply man-made instance segmentation for the image. The expected input category include plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, vehicle, helicopter, roundabout, soccer ball field, and swimming pool."
                         "like: extract plane from this image, "
                         "or predict the ship in this image, or extract tennis court from this image, segment harbor from this image, Extract the vehicle in the image. "
                         "The input to this tool should be a comma separated string of two, "
                         "representing the image_path, the text of the category,selected from plane, or ship, or storage tank, or baseball diamond, or tennis court, or basketball court, or ground track field, or harbor, or bridge, or vehicle, or helicopter, or roundabout, or soccer ball field, or  swimming pool. ")
    def inference(self, inputs):
        image_path, det_prompt = inputs.split(",")
        updated_image_path = get_new_image_name(image_path, func_name="instance_" + det_prompt)
        text=self.func.inference(image_path, det_prompt,updated_image_path)
        return text

class SceneClassification:
    def __init__(self, device):
        print("Initializing SceneClassification")
        self.func=SceneFunction(device)
    @prompts(name="Scene Classification for Remote Sensing Image",
             description="useful when you want to know the type of scene or function for the image. "
                         "like: what is the category of this image?, "
                         "or classify the scene of this image, or predict the scene category of this image, or what is the function of this image. "
                         "The input to this tool should be a string, representing the image_path. ")
    def inference(self, inputs):
        output_txt=self.func.inference(inputs)
        return output_txt


class LandUseSegmentation:
    def __init__(self, device):
        print("Initializing LandUseSegmentation")
        self.func=LanduseFunction(device)

    @prompts(name="Land Use Segmentation for Remote Sensing Image",
             description="useful when you want to apply land use segmentation for the image. The expected input categories include Building, Road, Water, Barren, Forest, Farmland, Landuse."
                         "like: generate landuse map from this image, "
                         "or predict the landuse on this image, or extract building from this image, segment roads from this image, Extract the water bodies in the image. "
                         "The input to this tool should be a comma separated string of two, "
                         "representing the image_path, the text of the category, selected from Land Use, or Building, or Road, or Water, or Barren, or Forest, or Farmland, or Landuse.")
    def inference(self, inputs):
        image_path, det_prompt = inputs.split(",")
        updated_image_path = get_new_image_name(image_path, func_name="landuse")
        text=self.func.inference(image_path, det_prompt,updated_image_path)
        return text

class ObjectDetection:
    def __init__(self, device):
        self.func=DetectionFunction(device)


    @prompts(name="Detect the given object",
             description="useful when you only want to detect the bounding box of the certain objects in the picture according to the given text."
                         "like: detect the plane, or can you locate an object for me."
                         "The input to this tool should be a comma separated string of two, "
                         "representing the image_path, the text description of the object to be found")

    def inference(self, inputs):
        image_path, det_prompt = inputs.split(",")
        updated_image_path = get_new_image_name(image_path, func_name="detection_" + det_prompt.replace(' ', '_'))
        log_text=self.func.inference(image_path, det_prompt,updated_image_path)
        return log_text

class ImageCaptioning:
    def __init__(self, device):
        print(f"Initializing ImageCaptioning to {device}")
        self.device = device
        self.func=CaptionFunction(device)
    @prompts(name="Get Photo Description",
             description="useful when you want to know what is inside the photo. receives image_path as input. "
                         "The input to this tool should be a string, representing the image_path. ")
    def inference(self, image_path):
        captions = self.func.inference(image_path)
        print(f"\nProcessed ImageCaptioning, Input Image: {image_path}, Output Text: {captions}")
        return captions

class RSChatGPT:
    def __init__(self, gpt_name,load_dict,openai_key,proxy_url):
        print(f"Initializing RSChatGPT, load_dict={load_dict}")
        if 'ImageCaptioning' not in load_dict:
            raise ValueError("You have to load ImageCaptioning as a basic function for RSChatGPT")
        self.models = {}
        # Load Basic Foundation Models
        for class_name, device in load_dict.items():
            self.models[class_name] = globals()[class_name](device=device)
        # Load Template Foundation Models
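        # (classes that set template_model=True are instantiated from already-loaded models named in their __init__ signature)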
        for class_name, module in globals().items():
            if getattr(module, 'template_model', False):
                template_required_names = {k for k in inspect.signature(module.__init__).parameters.keys() if
                                           k != 'self'}
                loaded_names = set([type(e).__name__ for e in self.models.values()])
                if template_required_names.issubset(loaded_names):
                    self.models[class_name] = globals()[class_name](
                        **{name: self.models[name] for name in template_required_names})

        print(f"All the Available Functions: {self.models}")

        self.tools = []
        for instance in self.models.values():
            for e in dir(instance):
                if e.startswith('inference'):
                    func = getattr(instance, e)
                    self.tools.append(Tool(name=func.name, description=func.description, func=func))

        self.llm = ChatOpenAI(api_key=openai_key, base_url=proxy_url, model_name=gpt_name,temperature=0)
        self.memory = ConversationBufferMemory(memory_key="chat_history", output_key='output')

    def initialize(self):
        self.memory.clear() #clear previous history
        PREFIX, FORMAT_INSTRUCTIONS, SUFFIX = RS_CHATGPT_PREFIX, RS_CHATGPT_FORMAT_INSTRUCTIONS, RS_CHATGPT_SUFFIX
        self.agent = initialize_agent(
            self.tools,
            self.llm,
            agent="conversational-react-description",
            verbose=True,
            memory=self.memory,
            return_intermediate_steps=True,stop=["\nObservation:", "\n\tObservation:"],
            agent_kwargs={'prefix': PREFIX, 'format_instructions': FORMAT_INSTRUCTIONS,'suffix': SUFFIX}, )

    def run_text(self, text, state):
        res = self.agent({"input": text.strip()})
        res['output'] = res['output'].replace("\\", "/")
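        # Wrap any "image/xxxx.png" path in Markdown so the front end can render the image inline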
        response = re.sub(r'(image/[-\w]*\.png)', lambda m: f'![](file={m.group(0)})*{m.group(0)}*', res['output'])
        state = state + [(text, response)]
        print(f"\nProcessed run_text, Input text: {text}\nCurrent state: {state}\n"
              f"Current Memory: {self.agent.memory.buffer}")
        return state
    def run_image(self, image_dir, state, txt=None):
        image_filename = os.path.join('image', f"{str(uuid.uuid4())[:8]}.png")
        img = io.imread(image_dir)
        # width, height = img.shape[1],img.shape[0]
        # ratio = min(640 / width, 640 / height)
        # if ratio<1:
        #     width_new, height_new = (round(width * ratio), round(height * ratio))
        # else:
        #     width_new, height_new =width,height
        # width_new = int(np.round(width_new / 64.0)) * 64
        # height_new = int(np.round(height_new / 64.0)) * 64
        #
        # if width_new!=width or height_new!=height:
        #     img = cv2.resize(img,(width_new, height_new))
        #     print(f"======>Auto Resizing Image from {height,width} to {height_new,width_new}...")
        # else:
        #     print(f"======>Auto Renaming Image...")
        io.imsave(image_filename, img.astype(np.uint8))
        description = self.models['ImageCaptioning'].inference(image_filename)
        Human_prompt = f' Provide a remote sensing image named {image_filename}. The description is: {description}. This information helps you to understand this image, but you should use tools to finish following tasks, rather than directly imagine from my description. If you understand, say \"Received\".'
        AI_prompt = "Received."
        self.memory.chat_memory.add_user_message(Human_prompt)
        self.memory.chat_memory.add_ai_message(AI_prompt)

        state = state + [(f"![](file={image_filename})*{image_filename}*", AI_prompt)]
        print(f"\nProcessed run_image, Input image: {image_filename}\nCurrent state: {state}\n"
              f"Current Memory: {self.agent.memory.buffer}")
        state=self.run_text(f'{txt} {image_filename} ', state)
        return state



if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--openai_key', type=str,required=True)
    parser.add_argument('--image_dir', type=str,required=True)
    parser.add_argument('--gpt_name', type=str, default="gpt-3.5-turbo",choices=['gpt-3.5-turbo-1106','gpt-3.5-turbo','gpt-4','gpt-4-0125-preview','gpt-4-turbo-preview','gpt-4-1106-preview'])
    parser.add_argument('--proxy_url', type=str, default=None)
    parser.add_argument('--load', type=str,help='Image Captioning is the basic model that is required. You can select from [ImageCaptioning,ObjectDetection,LandUseSegmentation,InstanceSegmentation,ObjectCounting,SceneClassification,EdgeDetection]',
                        default="ImageCaptioning_cuda:0,SceneClassification_cuda:0,ObjectDetection_cuda:0,LandUseSegmentation_cuda:0,InstanceSegmentation_cuda:0,ObjectCounting_cuda:0,EdgeDetection_cpu")
    args = parser.parse_args()
    state = []
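    # Parse each --load entry such as "ImageCaptioning_cuda:0" into {class_name: device}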
    load_dict = {e.split('_')[0].strip(): e.split('_')[1].strip() for e in args.load.split(',')}
    bot = RSChatGPT(gpt_name=args.gpt_name,load_dict=load_dict,openai_key=args.openai_key,proxy_url=args.proxy_url)
    bot.initialize()
    print('RSChatGPT initialization done, you can now chat with RSChatGPT~')
    txt='Count the number of planes in the image.'
    state=bot.run_image(args.image_dir, [], txt)

    while 1:
        txt = input('You can now input your question.(e.g. Extract buildings from the image)\n')
        state = bot.run_image(args.image_dir, state, txt)




================================================
FILE: RStask/EdgeDetection/Canny.py
================================================
from PIL import Image
import cv2
import numpy as np
class Image2Canny:
    def __init__(self):
        print("Initializing Image2Canny")
        self.low_threshold = 100
        self.high_threshold = 200

    def inference(self, inputs,new_image_name):
        image = Image.open(inputs)
        image = np.array(image)
        canny = cv2.Canny(image, self.low_threshold, self.high_threshold)
        canny = canny[:, :, None]
        canny = np.concatenate([canny, canny, canny], axis=2)
        canny = Image.fromarray(canny)
        updated_image_path = new_image_name
        canny.save(updated_image_path)
        print(f"\nProcessed Image2Canny, Input Image: {inputs}, Output Text: {updated_image_path}")
        return None

================================================
FILE: RStask/EdgeDetection/__init__.py
================================================


================================================
FILE: RStask/EdgeDetection/test.py
================================================
from RStask.EdgeDetection.Canny import Image2Canny
model=Image2Canny()
model.inference('/data/haonan.guo/RSChatGPT/test.tif','/data/haonan.guo/RSChatGPT/output.png')

================================================
FILE: RStask/ImageCaptioning/__init__.py
================================================
from RStask.ImageCaptioning.blip import BLIP as CaptionFunction

================================================
FILE: RStask/ImageCaptioning/blip.py
================================================
import torch
from PIL import Image
from transformers import  BlipProcessor, BlipForConditionalGeneration

class BLIP:
    def __init__(self, device):
        self.device = device
        self.torch_dtype = torch.float16 if 'cuda' in device else torch.float32
        self.processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
        self.model = BlipForConditionalGeneration.from_pretrained(
            "Salesforce/blip-image-captioning-base", torch_dtype=self.torch_dtype).to(self.device)
    def inference(self, image_path):
        inputs = self.processor(Image.open(image_path), return_tensors="pt").to(self.device, self.torch_dtype)
        out = self.model.generate(**inputs)
        captions = 'A satellite image of ' + self.processor.decode(out[0], skip_special_tokens=True)
        print(f"\nProcessed ImageCaptioning, Input Image: {image_path}, Output Text: {captions}")
        return captions

================================================
FILE: RStask/ImageCaptioning/test.py
================================================
from RStask import CaptionFunction
model=CaptionFunction('cuda:0')
model.inference('/data/haonan.guo/LoveDA/Train/Urban/images_png/1367.png')

================================================
FILE: RStask/InstanceSegmentation/SwinUpper.py
================================================
from RStask.InstanceSegmentation.model import SwinUPer
import torch
from skimage import io
from PIL import Image
import numpy as np
class SwinInstance:
    def __init__(self, device):
        print("Initializing InstanceSegmentation")
        self.model = SwinUPer()
        self.device = device
        try:
            trained = torch.load('./checkpoints/last_swint_upernet_finetune.pth', map_location='cpu')
        except FileNotFoundError:
            trained = torch.load('../../checkpoints/last_swint_upernet_finetune.pth', map_location='cpu')
        self.model.load_state_dict(trained["state_dict"])
        self.model = self.model.to(device)
        self.model.eval()
        self.mean, self.std = torch.tensor([123.675, 116.28, 103.53]).reshape((1, 3, 1, 1)), torch.tensor(
            [58.395, 57.12, 57.375]).reshape((1, 3, 1, 1))
        self.all_dict = {'plane': 1, 'ship': 2, 'storage tank': 3, 'baseball diamond': 4, 'tennis court': 5,
                         'basketball court': 6, 'ground track field': 7, 'harbor': 8, 'bridge': 9,
                         'large vehicle': 10, 'small vehicle': 11, 'helicopter': 12, 'roundabout': 13,
                         'soccer ball field': 14, 'swimming pool': 15}
    def inference(self, image_path, det_prompt ,updated_image_path):
        image = torch.from_numpy(io.imread(image_path))
        image = (image.permute(2, 0, 1).unsqueeze(0) - self.mean) / self.std
        with torch.no_grad():
            pred = self.model(image.to(self.device))
        pred = pred.argmax(1).cpu().squeeze().int().numpy()

        if det_prompt.strip().lower() in [i.strip().lower()  for i in self.all_dict.keys()]:
            idx=[i.strip().lower() for i in self.all_dict.keys()].index(det_prompt.strip().lower())+1
            pred=(pred==idx)*255
            pred = Image.fromarray(np.stack([pred, pred, pred], -1).astype(np.uint8))
            pred.save(updated_image_path)
            print(f"\nProcessed Instance Segmentation, Input Image: {image_path + ',' + det_prompt}, Output SegMap: {updated_image_path}")
            return updated_image_path
        else:
            print(f"\nCategory: { det_prompt} is not supported. Please use other tools.")
            return f"Category {det_prompt} is not supported. Please use other tools."





================================================
FILE: RStask/InstanceSegmentation/__init__.py
================================================


================================================
FILE: RStask/InstanceSegmentation/model.py
================================================
import torch
import torch.nn as nn
from RStask.InstanceSegmentation.swin import swin
from RStask.InstanceSegmentation.uper import UPerHead

Activation=torch.nn.ReLU
class SegmentationHead(nn.Sequential):
    def __init__(self, in_channels, out_channels, kernel_size=3, activation=None, upsampling=1):
        conv2d = nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, padding=kernel_size // 2)
        upsampling = nn.UpsamplingBilinear2d(scale_factor=upsampling) if upsampling > 1 else nn.Identity()
        activation = Activation(activation)
        super().__init__(conv2d, upsampling, activation)


class ClassificationHead(nn.Sequential):
    def __init__(self, in_channels, classes, pooling="avg", dropout=0.2, activation=None):
        if pooling not in ("max", "avg"):
            raise ValueError("Pooling should be one of ('max', 'avg'), got {}.".format(pooling))
        pool = nn.AdaptiveAvgPool2d(1) if pooling == "avg" else nn.AdaptiveMaxPool2d(1)
        flatten = nn.Flatten()
        dropout = nn.Dropout(p=dropout, inplace=True) if dropout else nn.Identity()
        linear = nn.Linear(in_channels, classes, bias=True)
        activation = Activation(activation)
        super().__init__(pool, flatten, dropout, linear, activation)


def initialize_decoder(module):
    for m in module.modules():

        if isinstance(m, nn.Conv2d):
            nn.init.kaiming_uniform_(m.weight, mode="fan_in", nonlinearity="relu")
            if m.bias is not None:
                nn.init.constant_(m.bias, 0)

        elif isinstance(m, nn.BatchNorm2d):
            nn.init.constant_(m.weight, 1)
            nn.init.constant_(m.bias, 0)

        elif isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.constant_(m.bias, 0)


def initialize_head(module):
    for m in module.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.constant_(m.bias, 0)





class SwinUPer(torch.nn.Module):
    def __init__(self, classes: int = 16):
        super(SwinUPer, self).__init__()
        # encoder
        self.encoder = swin(embed_dim=96,depths=[2, 2, 6, 2],num_heads=[3, 6, 12, 24],
                            window_size=7,ape=False,drop_path_rate=0.3,patch_norm=True)
        # decoder
        self.decoder = UPerHead(
            in_channels = self.encoder.out_channels[1:],
            channels = self.encoder.out_channels[2],
            in_index = (0, 1, 2, 3),dropout_ratio = 0.1,
            norm_cfg = dict(type='SyncBN', requires_grad=True)
        )

        self.semseghead = nn.Sequential(
            nn.Dropout2d(0.1),
            nn.Conv2d(self.encoder.out_channels[2], classes, kernel_size=1)
        )

        self.initialize()

    def initialize(self):
        initialize_decoder(self.decoder)
        initialize_head(self.semseghead)
    def forward(self, x):
        features = self.encoder(x)
        output = self.decoder(*features)
        output = self.semseghead(output)
        return output











================================================
FILE: RStask/InstanceSegmentation/swin.py
================================================
# --------------------------------------------------------
# Swin Transformer
# Copyright (c) 2021 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ze Liu, Yutong Lin, Yixuan Wei
# --------------------------------------------------------

import warnings
from collections import OrderedDict
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.checkpoint as checkpoint
import numpy as np
from timm.models.layers import DropPath, to_2tuple, trunc_normal_

#from custom import load_checkpoint
#from mmseg.utils import get_root_logger
#from ..builder import BACKBONES

#from mmcv.cnn.utils.weight_init import (constant_init, trunc_normal_,
#                                        trunc_normal_init)
#from mmcv.runner import BaseModule, ModuleList, _load_checkpoint


class Mlp(nn.Module):
    """ Multilayer perceptron."""

    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = nn.Linear(in_features, hidden_features)
        self.act = act_layer()
        self.fc2 = nn.Linear(hidden_features, out_features)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


def window_partition(x, window_size):
    """
    Args:
        x: (B, H, W, C)
        window_size (int): window size

    Returns:
        windows: (num_windows*B, window_size, window_size, C)
    """
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)
    return windows


def window_reverse(windows, window_size, H, W):
    """
    Args:
        windows: (num_windows*B, window_size, window_size, C)
        window_size (int): Window size
        H (int): Height of image
        W (int): Width of image

    Returns:
        x: (B, H, W, C)
    """
    B = int(windows.shape[0] / (H * W / window_size / window_size))
    x = windows.view(B, H // window_size, W // window_size, window_size, window_size, -1)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(B, H, W, -1)
    return x


class WindowAttention(nn.Module):
    """ Window based multi-head self attention (W-MSA) module with relative position bias.
    It supports both of shifted and non-shifted window.

    Args:
        dim (int): Number of input channels.
        window_size (tuple[int]): The height and width of the window.
        num_heads (int): Number of attention heads.
        qkv_bias (bool, optional):  If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set
        attn_drop (float, optional): Dropout ratio of attention weight. Default: 0.0
        proj_drop (float, optional): Dropout ratio of output. Default: 0.0
    """

    def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0., proj_drop=0.):

        super().__init__()
        self.dim = dim
        self.window_size = window_size  # Wh, Ww
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = qk_scale or head_dim ** -0.5

        # define a parameter table of relative position bias
        self.relative_position_bias_table = nn.Parameter(
            torch.zeros((2 * window_size[0] - 1) * (2 * window_size[1] - 1), num_heads))  # 2*Wh-1 * 2*Ww-1, nH

        # get pair-wise relative position index for each token inside the window
        coords_h = torch.arange(self.window_size[0])
        coords_w = torch.arange(self.window_size[1])
        coords = torch.stack(torch.meshgrid([coords_h, coords_w]))  # 2, Wh, Ww
        coords_flatten = torch.flatten(coords, 1)  # 2, Wh*Ww
        relative_coords = coords_flatten[:, :, None] - coords_flatten[:, None, :]  # 2, Wh*Ww, Wh*Ww
        relative_coords = relative_coords.permute(1, 2, 0).contiguous()  # Wh*Ww, Wh*Ww, 2
        relative_coords[:, :, 0] += self.window_size[0] - 1  # shift to start from 0
        relative_coords[:, :, 1] += self.window_size[1] - 1
        relative_coords[:, :, 0] *= 2 * self.window_size[1] - 1
        relative_position_index = relative_coords.sum(-1)  # Wh*Ww, Wh*Ww
        self.register_buffer("relative_position_index", relative_position_index)

        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop)
        self.proj = nn.Linear(dim, dim)
        self.proj_drop = nn.Dropout(proj_drop)

        trunc_normal_(self.relative_position_bias_table, std=.02)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x, mask=None):
        """ Forward function.

        Args:
            x: input features with shape of (num_windows*B, N, C)
            mask: (0/-inf) mask with shape of (num_windows, Wh*Ww, Wh*Ww) or None
        """
        B_, N, C = x.shape
        qkv = self.qkv(x).reshape(B_, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
        q, k, v = qkv[0], qkv[1], qkv[2]  # make torchscript happy (cannot use tensor as tuple)

        q = q * self.scale
        attn = (q @ k.transpose(-2, -1))

        relative_position_bias = self.relative_position_bias_table[self.relative_position_index.view(-1)].view(
            self.window_size[0] * self.window_size[1], self.window_size[0] * self.window_size[1], -1)  # Wh*Ww,Wh*Ww,nH
        relative_position_bias = relative_position_bias.permute(2, 0, 1).contiguous()  # nH, Wh*Ww, Wh*Ww
        attn = attn + relative_position_bias.unsqueeze(0)

        if mask is not None:
            nW = mask.shape[0]
            attn = attn.view(B_ // nW, nW, self.num_heads, N, N) + mask.unsqueeze(1).unsqueeze(0)
            attn = attn.view(-1, self.num_heads, N, N)
            attn = self.softmax(attn)
        else:
            attn = self.softmax(attn)

        attn = self.attn_drop(attn)

        x = (attn @ v).transpose(1, 2).reshape(B_, N, C)
        x = self.proj(x)
        x = self.proj_drop(x)
        return x


class SwinTransformerBlock(nn.Module):
    """ Swin Transformer Block.

    Args:
        dim (int): Number of input channels.
        num_heads (int): Number of attention heads.
        window_size (int): Window size.
        shift_size (int): Shift size for SW-MSA.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float, optional): Stochastic depth rate. Default: 0.0
        act_layer (nn.Module, optional): Activation layer. Default: nn.GELU
        norm_layer (nn.Module, optional): Normalization layer.  Default: nn.LayerNorm
    """

    def __init__(self, dim, num_heads, window_size=7, shift_size=0,
                 mlp_ratio=4., qkv_bias=True, qk_scale=None, drop=0., attn_drop=0., drop_path=0.,
                 act_layer=nn.GELU, norm_layer=nn.LayerNorm):
        super().__init__()
        self.dim = dim
        self.num_heads = num_heads
        self.window_size = window_size
        self.shift_size = shift_size
        self.mlp_ratio = mlp_ratio
        assert 0 <= self.shift_size < self.window_size, "shift_size must in 0-window_size"

        self.norm1 = norm_layer(dim)
        self.attn = WindowAttention(
            dim, window_size=to_2tuple(self.window_size), num_heads=num_heads,
            qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop)

        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop)

        self.H = None
        self.W = None

    def forward(self, x, mask_matrix):
        """ Forward function.

        Args:
            x: Input feature, tensor size (B, H*W, C).
            H, W: Spatial resolution of the input feature.
            mask_matrix: Attention mask for cyclic shift.
        """
        B, L, C = x.shape
        H, W = self.H, self.W
        assert L == H * W, "input feature has wrong size"

        shortcut = x
        x = self.norm1(x)
        x = x.view(B, H, W, C)

        # pad feature maps to multiples of window size
        pad_l = pad_t = 0
        pad_r = (self.window_size - W % self.window_size) % self.window_size
        pad_b = (self.window_size - H % self.window_size) % self.window_size
        x = F.pad(x, (0, 0, pad_l, pad_r, pad_t, pad_b))
        _, Hp, Wp, _ = x.shape

        # cyclic shift
        if self.shift_size > 0:
            shifted_x = torch.roll(x, shifts=(-self.shift_size, -self.shift_size), dims=(1, 2))
            attn_mask = mask_matrix
        else:
            shifted_x = x
            attn_mask = None

        # partition windows
        x_windows = window_partition(shifted_x, self.window_size)  # nW*B, window_size, window_size, C
        x_windows = x_windows.view(-1, self.window_size * self.window_size, C)  # nW*B, window_size*window_size, C

        # W-MSA/SW-MSA
        attn_windows = self.attn(x_windows, mask=attn_mask)  # nW*B, window_size*window_size, C

        # merge windows
        attn_windows = attn_windows.view(-1, self.window_size, self.window_size, C)
        shifted_x = window_reverse(attn_windows, self.window_size, Hp, Wp)  # B H' W' C

        # reverse cyclic shift
        if self.shift_size > 0:
            x = torch.roll(shifted_x, shifts=(self.shift_size, self.shift_size), dims=(1, 2))
        else:
            x = shifted_x

        if pad_r > 0 or pad_b > 0:
            x = x[:, :H, :W, :].contiguous()

        x = x.view(B, H * W, C)

        # FFN
        x = shortcut + self.drop_path(x)
        x = x + self.drop_path(self.mlp(self.norm2(x)))

        return x


class PatchMerging(nn.Module):
    """ Patch Merging Layer

    Args:
        dim (int): Number of input channels.
        norm_layer (nn.Module, optional): Normalization layer.  Default: nn.LayerNorm
    """
    def __init__(self, dim, norm_layer=nn.LayerNorm):
        super().__init__()
        self.dim = dim
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)
        self.norm = norm_layer(4 * dim)

    def forward(self, x, H, W):
        """ Forward function.

        Args:
            x: Input feature, tensor size (B, H*W, C).
            H, W: Spatial resolution of the input feature.
        """
        B, L, C = x.shape
        assert L == H * W, "input feature has wrong size"

        x = x.view(B, H, W, C)

        # padding
        pad_input = (H % 2 == 1) or (W % 2 == 1)
        if pad_input:
            x = F.pad(x, (0, 0, 0, W % 2, 0, H % 2))

        x0 = x[:, 0::2, 0::2, :]  # B H/2 W/2 C
        x1 = x[:, 1::2, 0::2, :]  # B H/2 W/2 C
        x2 = x[:, 0::2, 1::2, :]  # B H/2 W/2 C
        x3 = x[:, 1::2, 1::2, :]  # B H/2 W/2 C
        x = torch.cat([x0, x1, x2, x3], -1)  # B H/2 W/2 4*C
        x = x.view(B, -1, 4 * C)  # B H/2*W/2 4*C

        x = self.norm(x)
        x = self.reduction(x)

        return x


class BasicLayer(nn.Module):
    """ A basic Swin Transformer layer for one stage.

    Args:
        dim (int): Number of feature channels
        depth (int): Depths of this stage.
        num_heads (int): Number of attention head.
        window_size (int): Local window size. Default: 7.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim. Default: 4.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float | tuple[float], optional): Stochastic depth rate. Default: 0.0
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
        downsample (nn.Module | None, optional): Downsample layer at the end of the layer. Default: None
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False.
    """

    def __init__(self,
                 dim,
                 depth,
                 num_heads,
                 window_size=7,
                 mlp_ratio=4.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop=0.,
                 attn_drop=0.,
                 drop_path=0.,
                 norm_layer=nn.LayerNorm,
                 downsample=None,
                 use_checkpoint=False):
        super().__init__()
        self.window_size = window_size
        self.shift_size = window_size // 2
        self.depth = depth
        self.use_checkpoint = use_checkpoint

        # build blocks
        self.blocks = nn.ModuleList([
            SwinTransformerBlock(
                dim=dim,
                num_heads=num_heads,
                window_size=window_size,
                shift_size=0 if (i % 2 == 0) else window_size // 2,
                mlp_ratio=mlp_ratio,
                qkv_bias=qkv_bias,
                qk_scale=qk_scale,
                drop=drop,
                attn_drop=attn_drop,
                drop_path=drop_path[i] if isinstance(drop_path, list) else drop_path,
                norm_layer=norm_layer)
            for i in range(depth)])

        # patch merging layer
        if downsample is not None:
            self.downsample = downsample(dim=dim, norm_layer=norm_layer)
        else:
            self.downsample = None

    def forward(self, x, H, W):
        """ Forward function.

        Args:
            x: Input feature, tensor size (B, H*W, C).
            H, W: Spatial resolution of the input feature.
        """

        # calculate attention mask for SW-MSA
        Hp = int(np.ceil(H / self.window_size)) * self.window_size
        Wp = int(np.ceil(W / self.window_size)) * self.window_size
        img_mask = torch.zeros((1, Hp, Wp, 1), device=x.device)  # 1 Hp Wp 1
        h_slices = (slice(0, -self.window_size),
                    slice(-self.window_size, -self.shift_size),
                    slice(-self.shift_size, None))
        w_slices = (slice(0, -self.window_size),
                    slice(-self.window_size, -self.shift_size),
                    slice(-self.shift_size, None))
        cnt = 0
        for h in h_slices:
            for w in w_slices:
                img_mask[:, h, w, :] = cnt
                cnt += 1

        mask_windows = window_partition(img_mask, self.window_size)  # nW, window_size, window_size, 1
        mask_windows = mask_windows.view(-1, self.window_size * self.window_size)
        attn_mask = mask_windows.unsqueeze(1) - mask_windows.unsqueeze(2)
        attn_mask = attn_mask.masked_fill(attn_mask != 0, float(-100.0)).masked_fill(attn_mask == 0, float(0.0))

        for blk in self.blocks:
            blk.H, blk.W = H, W
            if self.use_checkpoint:
                x = checkpoint.checkpoint(blk, x, attn_mask)
            else:
                x = blk(x, attn_mask)
        if self.downsample is not None:
            x_down = self.downsample(x, H, W)
            Wh, Ww = (H + 1) // 2, (W + 1) // 2
            return x, H, W, x_down, Wh, Ww
        else:
            return x, H, W, x, H, W


class PatchEmbed(nn.Module):
    """ Image to Patch Embedding

    Args:
        patch_size (int): Patch token size. Default: 4.
        in_chans (int): Number of input image channels. Default: 3.
        embed_dim (int): Number of linear projection output channels. Default: 96.
        norm_layer (nn.Module, optional): Normalization layer. Default: None
    """

    def __init__(self, patch_size=4, in_chans=3, embed_dim=96, norm_layer=None):
        super().__init__()
        patch_size = to_2tuple(patch_size)
        self.patch_size = patch_size

        self.in_chans = in_chans
        self.embed_dim = embed_dim

        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)
        if norm_layer is not None:
            self.norm = norm_layer(embed_dim)
        else:
            self.norm = None

    def forward(self, x):
        """Forward function."""
        # padding
        _, _, H, W = x.size()
        if W % self.patch_size[1] != 0:
            x = F.pad(x, (0, self.patch_size[1] - W % self.patch_size[1]))
        if H % self.patch_size[0] != 0:
            x = F.pad(x, (0, 0, 0, self.patch_size[0] - H % self.patch_size[0]))

        x = self.proj(x)  # B C Wh Ww
        if self.norm is not None:
            Wh, Ww = x.size(2), x.size(3)
            x = x.flatten(2).transpose(1, 2)
            x = self.norm(x)
            x = x.transpose(1, 2).view(-1, self.embed_dim, Wh, Ww)

        return x


#@BACKBONES.register_module()
class swin(nn.Module):
    """ Swin Transformer backbone.
        A PyTorch impl of : `Swin Transformer: Hierarchical Vision Transformer using Shifted Windows`  -
          https://arxiv.org/pdf/2103.14030

    Args:
        pretrain_img_size (int): Input image size for training the pretrained model,
            used in absolute postion embedding. Default 224.
        patch_size (int | tuple(int)): Patch size. Default: 4.
        in_chans (int): Number of input image channels. Default: 3.
        embed_dim (int): Number of linear projection output channels. Default: 96.
        depths (tuple[int]): Depths of each Swin Transformer stage.
        num_heads (tuple[int]): Number of attention head of each stage.
        window_size (int): Window size. Default: 7.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim. Default: 4.
        qkv_bias (bool): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float): Override default qk scale of head_dim ** -0.5 if set.
        drop_rate (float): Dropout rate.
        attn_drop_rate (float): Attention dropout rate. Default: 0.
        drop_path_rate (float): Stochastic depth rate. Default: 0.2.
        norm_layer (nn.Module): Normalization layer. Default: nn.LayerNorm.
        ape (bool): If True, add absolute position embedding to the patch embedding. Default: False.
        patch_norm (bool): If True, add normalization after patch embedding. Default: True.
        out_indices (Sequence[int]): Output from which stages.
        frozen_stages (int): Stages to be frozen (stop grad and set eval mode).
            -1 means not freezing any parameters.
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False.
    """

    def __init__(self,
                 pretrain_img_size=224,
                 patch_size=4,
                 in_chans=3,
                 embed_dim=96,
                 depths=[2, 2, 6, 2],
                 num_heads=[3, 6, 12, 24],
                 window_size=7,
                 mlp_ratio=4.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop_rate=0.,
                 attn_drop_rate=0.,
                 drop_path_rate=0.2,
                 norm_layer=nn.LayerNorm,
                 ape=False,
                 patch_norm=True,
                 out_indices=(0, 1, 2, 3),
                 frozen_stages=-1,
                 use_checkpoint=False,
                 pretrained=None,
                 init_cfg=None):

        assert not (init_cfg and pretrained), \
            'init_cfg and pretrained cannot be specified at the same time'
        if isinstance(pretrained, str):
            warnings.warn('DeprecationWarning: pretrained is deprecated, '
                          'please use "init_cfg" instead')
            init_cfg = dict(type='Pretrained', checkpoint=pretrained)
        elif pretrained is None:
            init_cfg = init_cfg
        else:
            raise TypeError('pretrained must be a str or None')

        super().__init__()

        self.pretrain_img_size = pretrain_img_size
        self.num_layers = len(depths)
        self.embed_dim = embed_dim
        self.ape = ape
        self.patch_norm = patch_norm
        self.out_indices = out_indices
        self.frozen_stages = frozen_stages
        self.init_cfg = init_cfg

        # split image into non-overlapping patches
        self.patch_embed = PatchEmbed(
            patch_size=patch_size, in_chans=in_chans, embed_dim=embed_dim,
            norm_layer=norm_layer if self.patch_norm else None)

        # absolute position embedding
        if self.ape:
            pretrain_img_size = to_2tuple(pretrain_img_size)
            patch_size = to_2tuple(patch_size)
            patches_resolution = [pretrain_img_size[0] // patch_size[0], pretrain_img_size[1] // patch_size[1]]

            self.absolute_pos_embed = nn.Parameter(torch.zeros(1, embed_dim, patches_resolution[0], patches_resolution[1]))
            trunc_normal_(self.absolute_pos_embed, std=.02)

        self.pos_drop = nn.Dropout(p=drop_rate)

        # stochastic depth
        dpr = [x.item() for x in torch.linspace(0, drop_path_rate, sum(depths))]  # stochastic depth decay rule

        # build layers
        self.layers = nn.ModuleList()
        for i_layer in range(self.num_layers):
            layer = BasicLayer(
                dim=int(embed_dim * 2 ** i_layer),
                depth=depths[i_layer],
                num_heads=num_heads[i_layer],
                window_size=window_size,
                mlp_ratio=mlp_ratio,
                qkv_bias=qkv_bias,
                qk_scale=qk_scale,
                drop=drop_rate,
                attn_drop=attn_drop_rate,
                drop_path=dpr[sum(depths[:i_layer]):sum(depths[:i_layer + 1])],
                norm_layer=norm_layer,
                downsample=PatchMerging if (i_layer < self.num_layers - 1) else None,
                use_checkpoint=use_checkpoint)
            self.layers.append(layer)

        num_features = [int(embed_dim * 2 ** i) for i in range(self.num_layers)]
        self.num_features = num_features

        # add a norm layer for each output
        for i_layer in out_indices:
            layer = norm_layer(num_features[i_layer])
            layer_name = f'norm{i_layer}'
            self.add_module(layer_name, layer)

        self._freeze_stages()

        self.out_channels = (3, 96, 192, 384, 768)

    def _freeze_stages(self):
        if self.frozen_stages >= 0:
            self.patch_embed.eval()
            for param in self.patch_embed.parameters():
                param.requires_grad = False

        if self.frozen_stages >= 1 and self.ape:
            self.absolute_pos_embed.requires_grad = False

        if self.frozen_stages >= 2:
            self.pos_drop.eval()
            for i in range(0, self.frozen_stages - 1):
                m = self.layers[i]
                m.eval()
                for param in m.parameters():
                    param.requires_grad = False

    def init_weights(self, pretrained):

        ckpt = torch.load(pretrained, map_location='cpu')
        
        if 'state_dict' in ckpt:
            _state_dict = ckpt['state_dict']
        elif 'model' in ckpt:
            _state_dict = ckpt['model']
        else:
            _state_dict = ckpt

        state_dict = OrderedDict()
        for k, v in _state_dict.items():
            if k.startswith('backbone.'):
                state_dict[k[9:]] = v
            else:
                state_dict[k] = v

        # strip prefix of state_dict
        if list(state_dict.keys())[0].startswith('module.'):
            state_dict = {k[7:]: v for k, v in state_dict.items()}

        # reshape absolute position embedding
        if state_dict.get('absolute_pos_embed') is not None:
            absolute_pos_embed = state_dict['absolute_pos_embed']
            N1, L, C1 = absolute_pos_embed.size()
            N2, C2, H, W = self.absolute_pos_embed.size()
            if N1 != N2 or C1 != C2 or L != H * W:
                warnings.warn('Error in loading absolute_pos_embed, pass')
            else:
                state_dict['absolute_pos_embed'] = absolute_pos_embed.view(
                    N2, H, W, C2).permute(0, 3, 1, 2).contiguous()

        # interpolate position bias table if needed
        relative_position_bias_table_keys = [
            k for k in state_dict.keys()
            if 'relative_position_bias_table' in k
        ]

        for table_key in relative_position_bias_table_keys:
            table_pretrained = state_dict[table_key]
            table_current = self.state_dict()[table_key]
            L1, nH1 = table_pretrained.size()
            L2, nH2 = table_current.size()
            if nH1 != nH2:
                warnings.warn(f'Error in loading {table_key}, pass')
            elif L1 != L2:
                S1 = int(L1**0.5)
                S2 = int(L2**0.5)
                table_pretrained_resized = F.interpolate(
                    table_pretrained.permute(1, 0).reshape(1, nH1, S1, S1),
                    size=(S2, S2),
                    mode='bicubic')
                state_dict[table_key] = table_pretrained_resized.view(
                    nH2, L2).permute(1, 0).contiguous()

        # print('##############')

        # print(self.state_dict().keys())

        # print('$$$$$$$$$$$$$$')

        # print(state_dict.keys())

        # load state_dict
        msg = self.load_state_dict(state_dict, False)
        print(msg)

    def forward(self, x):
        """Forward function."""
        img = [x]

        x = self.patch_embed(x)

        Wh, Ww = x.size(2), x.size(3)
        if self.ape:
            # interpolate the position embedding to the corresponding size
            absolute_pos_embed = F.interpolate(self.absolute_pos_embed, size=(Wh, Ww), mode='bicubic')
            x = (x + absolute_pos_embed).flatten(2).transpose(1, 2)  # B Wh*Ww C
        else:
            x = x.flatten(2).transpose(1, 2)
        x = self.pos_drop(x)

        outs = []
        for i in range(self.num_layers):
            layer = self.layers[i]
            x_out, H, W, x, Wh, Ww = layer(x, Wh, Ww)

            if i in self.out_indices:
                norm_layer = getattr(self, f'norm{i}')
                x_out = norm_layer(x_out)

                out = x_out.view(-1, H, W, self.num_features[i]).permute(0, 3, 1, 2).contiguous()
                outs.append(out)

        return img + outs

    def train(self, mode=True):
        """Convert the model into training mode while keep layers freezed."""
        super(swin, self).train(mode)
        self._freeze_stages()
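
# A minimal usage sketch, assuming the backbone class defined above is named
# `swin` (as referenced in `train()`); it builds the backbone with its Swin-T
# defaults and runs a dummy forward pass. The forward output is the input
# image followed by the four multi-scale feature maps selected by out_indices.
if __name__ == '__main__':
    backbone = swin()                      # embed_dim=96, depths=[2, 2, 6, 2]
    dummy = torch.rand(1, 3, 224, 224)
    feats = backbone(dummy)
    for f in feats:
        # (1, 3, 224, 224), (1, 96, 56, 56), (1, 192, 28, 28),
        # (1, 384, 14, 14), (1, 768, 7, 7)
        print(f.shape)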

================================================
FILE: RStask/InstanceSegmentation/test.py
================================================
from RStask import InstanceFunction
model=InstanceFunction('cuda:0')
model.inference('/data/haonan.guo/LoveDA/Train/Urban/images_png/1367.png','bike','/data/haonan.guo/RSChatGPT/output.png')

================================================
FILE: RStask/InstanceSegmentation/uper.py
================================================
# Copyright (c) OpenMMLab. All rights reserved.
import torch
import torch.nn as nn
import warnings
from mmcv.cnn import ConvModule
from abc import ABCMeta, abstractmethod
from typing import List, Tuple
import torch
import torch.nn as nn
from mmengine.model import BaseModule
from torch import Tensor



import torch.nn.functional as F

def resize(input,
           size=None,
           scale_factor=None,
           mode='nearest',
           align_corners=None,
           warning=True):
    if warning:
        if size is not None and align_corners:
            input_h, input_w = tuple(int(x) for x in input.shape[2:])
            output_h, output_w = tuple(int(x) for x in size)
            if output_h > input_h or output_w > input_w:
                if ((output_h > 1 and output_w > 1 and input_h > 1
                     and input_w > 1) and (output_h - 1) % (input_h - 1)
                        and (output_w - 1) % (input_w - 1)):
                    warnings.warn(
                        f'When align_corners={align_corners}, '
                        'the output would be more aligned if '
                        f'input size {(input_h, input_w)} is `x+1` and '
                        f'out size {(output_h, output_w)} is `nx+1`')
    return F.interpolate(input, size, scale_factor, mode, align_corners)

class PPM(nn.ModuleList):
    """Pooling Pyramid Module used in PSPNet.

    Args:
        pool_scales (tuple[int]): Pooling scales used in Pooling Pyramid
            Module.
        in_channels (int): Input channels.
        channels (int): Channels after modules, before conv_seg.
        conv_cfg (dict|None): Config of conv layers.
        norm_cfg (dict|None): Config of norm layers.
        act_cfg (dict): Config of activation layers.
        align_corners (bool): align_corners argument of F.interpolate.
    """

    def __init__(self, pool_scales, in_channels, channels, conv_cfg, norm_cfg,
                 act_cfg, align_corners, **kwargs):
        super().__init__()
        self.pool_scales = pool_scales
        self.align_corners = align_corners
        self.in_channels = in_channels
        self.channels = channels
        self.conv_cfg = conv_cfg
        self.norm_cfg = norm_cfg
        self.act_cfg = act_cfg
        for pool_scale in pool_scales:
            self.append(
                nn.Sequential(
                    nn.AdaptiveAvgPool2d(pool_scale),
                    ConvModule(
                        self.in_channels,
                        self.channels,
                        1,
                        conv_cfg=self.conv_cfg,
                        norm_cfg=self.norm_cfg,
                        act_cfg=self.act_cfg,
                        **kwargs)))

    def forward(self, x):
        """Forward function."""
        ppm_outs = []
        for ppm in self:
            ppm_out = ppm(x)
            upsampled_ppm_out = resize(
                ppm_out,
                size=x.size()[2:],
                mode='bilinear',
                align_corners=self.align_corners)
            ppm_outs.append(upsampled_ppm_out)
        return ppm_outs

class BaseDecodeHead(BaseModule, metaclass=ABCMeta):
    """Base class for BaseDecodeHead.

    1. The ``init_weights`` method is used to initialize decode_head's
    model parameters. After segmentor initialization, ``init_weights``
    is triggered when ``segmentor.init_weights()`` is called externally.

    2. The ``loss`` method is used to calculate the loss of decode_head,
    which includes two steps: (1) the decode_head model performs forward
    propagation to obtain the feature maps (2) The ``loss_by_feat`` method
    is called based on the feature maps to calculate the loss.

    .. code:: text

    loss(): forward() -> loss_by_feat()

    3. The ``predict`` method is used to predict segmentation results,
    which includes two steps: (1) the decode_head model performs forward
    propagation to obtain the feature maps (2) The ``predict_by_feat`` method
    is called based on the feature maps to predict segmentation results
    including post-processing.

    .. code:: text

    predict(): forward() -> predict_by_feat()

    Args:
        in_channels (int|Sequence[int]): Input channels.
        channels (int): Channels after modules, before conv_seg.
        num_classes (int): Number of classes.
        out_channels (int): Output channels of conv_seg.
        threshold (float): Threshold for binary segmentation in the case of
            `num_classes==1`. Default: None.
        dropout_ratio (float): Ratio of dropout layer. Default: 0.1.
        conv_cfg (dict|None): Config of conv layers. Default: None.
        norm_cfg (dict|None): Config of norm layers. Default: None.
        act_cfg (dict): Config of activation layers.
            Default: dict(type='ReLU')
        in_index (int|Sequence[int]): Input feature index. Default: -1
        input_transform (str|None): Transformation type of input features.
            Options: 'resize_concat', 'multiple_select', None.
            'resize_concat': Multiple feature maps will be resized to the
                same size as the first one and then concatenated together.
                Usually used in the FCN head of HRNet.
            'multiple_select': Multiple feature maps will be bundled into
                a list and passed into the decode head.
            None: Only one selected feature map is allowed.
            Default: None.
        loss_decode (dict | Sequence[dict]): Config of decode loss.
            The `loss_name` is property of corresponding loss function which
            could be shown in training log. If you want this loss
            item to be included into the backward graph, `loss_` must be the
            prefix of the name. Defaults to 'loss_ce'.
             e.g. dict(type='CrossEntropyLoss'),
             [dict(type='CrossEntropyLoss', loss_name='loss_ce'),
              dict(type='DiceLoss', loss_name='loss_dice')]
            Default: dict(type='CrossEntropyLoss').
        ignore_index (int | None): The label index to be ignored. When using
            masked BCE loss, ignore_index should be set to None. Default: 255.
        sampler (dict|None): The config of segmentation map sampler.
            Default: None.
        align_corners (bool): align_corners argument of F.interpolate.
            Default: False.
        init_cfg (dict or list[dict], optional): Initialization config dict.
    """

    def __init__(self,
                 in_channels,
                 channels,
                 *,
                 num_classes=None,
                 out_channels=None,
                 threshold=None,
                 dropout_ratio=0.1,
                 conv_cfg=None,
                 norm_cfg=None,
                 act_cfg=dict(type='ReLU'),
                 in_index=-1,
                 input_transform=None,
                 loss_decode=dict(
                     type='CrossEntropyLoss',
                     use_sigmoid=False,
                     loss_weight=1.0),
                 ignore_index=255,
                 sampler=None,
                 align_corners=False,
                 init_cfg=dict(
                     type='Normal', std=0.01, override=dict(name='conv_seg'))):
        super().__init__(init_cfg)
        self._init_inputs(in_channels, in_index, input_transform)
        self.channels = channels
        self.dropout_ratio = dropout_ratio
        self.conv_cfg = conv_cfg
        self.norm_cfg = norm_cfg
        self.act_cfg = act_cfg
        self.in_index = in_index

        self.ignore_index = ignore_index
        self.align_corners = align_corners

        if out_channels is None:
            if num_classes == 2:
                warnings.warn('For binary segmentation, we suggest using '
                              '`out_channels = 1` to define the output '
                              'channels of the segmentor, and using `threshold` '
                              'to convert `seg_logits` into a prediction by '
                              'applying a threshold')
            out_channels = num_classes

        if out_channels != num_classes and out_channels != 1:
            raise ValueError(
                'out_channels should be equal to num_classes, '
                'except for binary segmentation where out_channels == 1 and '
                f'num_classes == 2, but got out_channels={out_channels} '
                f'and num_classes={num_classes}')

        if out_channels == 1 and threshold is None:
            threshold = 0.3
            warnings.warn('threshold is not defined for binary segmentation, '
                          'and defaults to 0.3')
        self.num_classes = num_classes
        self.out_channels = out_channels
        self.threshold = threshold

        # if isinstance(loss_decode, dict):
        #     self.loss_decode = build_loss(loss_decode)
        # elif isinstance(loss_decode, (list, tuple)):
        #     self.loss_decode = nn.ModuleList()
        #     for loss in loss_decode:
        #         self.loss_decode.append(build_loss(loss))
        # else:
        #     raise TypeError(f'loss_decode must be a dict or sequence of dict,\
        #         but got {type(loss_decode)}')

        if sampler is not None:
            self.sampler = build_pixel_sampler(sampler, context=self)
        else:
            self.sampler = None

        # self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1)
        if dropout_ratio > 0:
            self.dropout = nn.Dropout2d(dropout_ratio)
        else:
            self.dropout = None

    def extra_repr(self):
        """Extra repr."""
        s = f'input_transform={self.input_transform}, ' \
            f'ignore_index={self.ignore_index}, ' \
            f'align_corners={self.align_corners}'
        return s

    def _init_inputs(self, in_channels, in_index, input_transform):
        """Check and initialize input transforms.

        The in_channels, in_index and input_transform must match.
        Specifically, when input_transform is None, only a single feature map
        will be selected, so in_channels and in_index must be of type int.
        When input_transform is not None, in_channels and in_index must be
        sequences of the same length.

        Args:
            in_channels (int|Sequence[int]): Input channels.
            in_index (int|Sequence[int]): Input feature index.
            input_transform (str|None): Transformation type of input features.
                Options: 'resize_concat', 'multiple_select', None.
                'resize_concat': Multiple feature maps will be resized to the
                    same size as the first one and then concatenated together.
                    Usually used in the FCN head of HRNet.
                'multiple_select': Multiple feature maps will be bundled into
                    a list and passed into the decode head.
                None: Only one selected feature map is allowed.
        """

        if input_transform is not None:
            assert input_transform in ['resize_concat', 'multiple_select']
        self.input_transform = input_transform
        self.in_index = in_index
        if input_transform is not None:
            assert isinstance(in_channels, (list, tuple))
            assert isinstance(in_index, (list, tuple))
            assert len(in_channels) == len(in_index)
            if input_transform == 'resize_concat':
                self.in_channels = sum(in_channels)
            else:
                self.in_channels = in_channels
        else:
            assert isinstance(in_channels, int)
            assert isinstance(in_index, int)
            self.in_channels = in_channels

    def _transform_inputs(self, inputs):
        """Transform inputs for decoder.

        Args:
            inputs (list[Tensor]): List of multi-level img features.

        Returns:
            Tensor: The transformed inputs
        """

        if self.input_transform == 'resize_concat':
            inputs = [inputs[i] for i in self.in_index]
            upsampled_inputs = [
                resize(
                    input=x,
                    size=inputs[0].shape[2:],
                    mode='bilinear',
                    align_corners=self.align_corners) for x in inputs
            ]
            inputs = torch.cat(upsampled_inputs, dim=1)
        elif self.input_transform == 'multiple_select':
            inputs = [inputs[i] for i in self.in_index]
        else:
            inputs = inputs[self.in_index]

        return inputs

    @abstractmethod
    def forward(self, inputs):
        """Placeholder of forward function."""
        pass

    def cls_seg(self, feat):
        """Classify each pixel."""
        if self.dropout is not None:
            feat = self.dropout(feat)
        output = self.conv_seg(feat)
        return output


    def predict(self, inputs: Tuple[Tensor], batch_img_metas: List[dict],
                test_cfg) -> Tensor:
        """Forward function for prediction.

        Args:
            inputs (Tuple[Tensor]): List of multi-level img features.
            batch_img_metas (dict): List Image info where each dict may also
                contain: 'img_shape', 'scale_factor', 'flip', 'img_path',
                'ori_shape', and 'pad_shape'.
                For details on the values of these keys see
                `mmseg/datasets/pipelines/formatting.py:PackSegInputs`.
            test_cfg (dict): The testing config.

        Returns:
            Tensor: Outputs segmentation logits map.
        """
        seg_logits = self.forward(inputs)

        return self.predict_by_feat(seg_logits, batch_img_metas)

    def _stack_batch_gt(self, batch_data_samples) -> Tensor:
        gt_semantic_segs = [
            data_sample.gt_sem_seg.data for data_sample in batch_data_samples
        ]
        return torch.stack(gt_semantic_segs, dim=0)


    def predict_by_feat(self, seg_logits: Tensor,
                        batch_img_metas: List[dict]) -> Tensor:
        """Transform a batch of output seg_logits to the input shape.

        Args:
            seg_logits (Tensor): The output from decode head forward function.
            batch_img_metas (list[dict]): Meta information of each image, e.g.,
                image size, scaling factor, etc.

        Returns:
            Tensor: Outputs segmentation logits map.
        """

        seg_logits = resize(
            input=seg_logits,
            size=batch_img_metas[0]['img_shape'],
            mode='bilinear',
            align_corners=self.align_corners)
        return seg_logits

#@MODELS.register_module()
class UPerHead(BaseDecodeHead):
    """Unified Perceptual Parsing for Scene Understanding.

    This head is the implementation of `UPerNet
    <https://arxiv.org/abs/1807.10221>`_.

    Args:
        pool_scales (tuple[int]): Pooling scales used in Pooling Pyramid
            Module applied on the last feature. Default: (1, 2, 3, 6).
    """

    def __init__(self, pool_scales=(1, 2, 3, 6), **kwargs):
        super().__init__(input_transform='multiple_select', **kwargs)
        # PSP Module
        self.psp_modules = PPM(
            pool_scales,
            self.in_channels[-1],
            self.channels,
            conv_cfg=self.conv_cfg,
            norm_cfg=self.norm_cfg,
            act_cfg=self.act_cfg,
            align_corners=self.align_corners)
        self.bottleneck = ConvModule(
            self.in_channels[-1] + len(pool_scales) * self.channels,
            self.channels,
            3,
            padding=1,
            conv_cfg=self.conv_cfg,
            norm_cfg=self.norm_cfg,
            act_cfg=self.act_cfg)
        # FPN Module
        self.lateral_convs = nn.ModuleList()
        self.fpn_convs = nn.ModuleList()
        for in_channels in self.in_channels[:-1]:  # skip the top layer
            l_conv = ConvModule(
                in_channels,
                self.channels,
                1,
                conv_cfg=self.conv_cfg,
                norm_cfg=self.norm_cfg,
                act_cfg=self.act_cfg,
                inplace=False)
            fpn_conv = ConvModule(
                self.channels,
                self.channels,
                3,
                padding=1,
                conv_cfg=self.conv_cfg,
                norm_cfg=self.norm_cfg,
                act_cfg=self.act_cfg,
                inplace=False)
            self.lateral_convs.append(l_conv)
            self.fpn_convs.append(fpn_conv)

        self.fpn_bottleneck = ConvModule(
            len(self.in_channels) * self.channels,
            self.channels,
            3,
            padding=1,
            conv_cfg=self.conv_cfg,
            norm_cfg=self.norm_cfg,
            act_cfg=self.act_cfg)

    def psp_forward(self, inputs):
        """Forward function of PSP module."""
        x = inputs[-1]
        psp_outs = [x]
        psp_outs.extend(self.psp_modules(x))
        psp_outs = torch.cat(psp_outs, dim=1)
        output = self.bottleneck(psp_outs)

        return output

    def _forward_feature(self, inputs):
        """Forward function for feature maps before classifying each pixel with
        ``self.cls_seg`` fc.

        Args:
            inputs (list[Tensor]): List of multi-level img features.

        Returns:
            feats (Tensor): A tensor of shape (batch_size, self.channels,
                H, W) which is feature map for last layer of decoder head.
        """
        inputs = self._transform_inputs(inputs)

        # build laterals
        laterals = [
            lateral_conv(inputs[i])
            for i, lateral_conv in enumerate(self.lateral_convs)
        ]

        laterals.append(self.psp_forward(inputs))

        # build top-down path
        used_backbone_levels = len(laterals)
        for i in range(used_backbone_levels - 1, 0, -1):
            prev_shape = laterals[i - 1].shape[2:]
            laterals[i - 1] = laterals[i - 1] + resize(
                laterals[i],
                size=prev_shape,
                mode='bilinear',
                align_corners=self.align_corners)

        # build outputs
        fpn_outs = [
            self.fpn_convs[i](laterals[i])
            for i in range(used_backbone_levels - 1)
        ]
        # append psp feature
        fpn_outs.append(laterals[-1])

        for i in range(used_backbone_levels - 1, 0, -1):
            fpn_outs[i] = resize(
                fpn_outs[i],
                size=fpn_outs[0].shape[2:],
                mode='bilinear',
                align_corners=self.align_corners)
        fpn_outs = torch.cat(fpn_outs, dim=1)
        feats = self.fpn_bottleneck(fpn_outs)
        return feats

    def forward(self, *inputs):
        """Forward function."""

        inputs = inputs[1:]

        output = self._forward_feature(inputs)
        #output = self.cls_seg(output)

        output = F.interpolate(output, scale_factor=4, mode='bilinear')
        return output
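
# A minimal usage sketch, assuming the head is fed the output of the Swin
# backbone used elsewhere in this package (raw image followed by four feature
# maps with 96/192/384/768 channels); those channel sizes are an assumption
# for the example, not a requirement of UPerHead.
if __name__ == '__main__':
    head = UPerHead(in_channels=[96, 192, 384, 768],
                    in_index=[0, 1, 2, 3],
                    channels=512,
                    num_classes=7)
    img = torch.rand(1, 3, 224, 224)
    feats = [img,
             torch.rand(1, 96, 56, 56),
             torch.rand(1, 192, 28, 28),
             torch.rand(1, 384, 14, 14),
             torch.rand(1, 768, 7, 7)]
    out = head(*feats)   # forward() drops the raw image in feats[0]
    print(out.shape)     # (1, 512, 224, 224): decoder features upsampled 4x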

================================================
FILE: RStask/LanduseSegmentation/__init__.py
================================================



================================================
FILE: RStask/LanduseSegmentation/seg_hrnet.py
================================================
import logging
from skimage import  io
import torch
import torch.nn as nn
import torch._utils
import torch.nn.functional as F
from PIL import Image
import numpy as np


BatchNorm2d=nn.BatchNorm2d
relu_inplace=True
BN_MOMENTUM = 0.1
ALIGN_CORNERS = None


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=relu_inplace)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out = out + residual
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=False)
        self.bn3 = BatchNorm2d(planes * self.expansion,
                               momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=relu_inplace)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out = out + residual
        out = self.relu(out)

        return out


class HighResolutionModule(nn.Module):
    def __init__(self, num_branches, blocks, num_blocks, num_inchannels,
                 num_channels, fuse_method, multi_scale_output=True):
        super(HighResolutionModule, self).__init__()
        self._check_branches(
            num_branches, blocks, num_blocks, num_inchannels, num_channels)

        self.num_inchannels = num_inchannels
        self.fuse_method = fuse_method
        self.num_branches = num_branches

        self.multi_scale_output = multi_scale_output

        self.branches = self._make_branches(
            num_branches, blocks, num_blocks, num_channels)
        self.fuse_layers = self._make_fuse_layers()
        self.relu = nn.ReLU(inplace=relu_inplace)

    def _check_branches(self, num_branches, blocks, num_blocks,
                        num_inchannels, num_channels):
        if num_branches != len(num_blocks):
            error_msg = 'NUM_BRANCHES({}) <> NUM_BLOCKS({})'.format(
                num_branches, len(num_blocks))
            raise ValueError(error_msg)

        if num_branches != len(num_channels):
            error_msg = 'NUM_BRANCHES({}) <> NUM_CHANNELS({})'.format(
                num_branches, len(num_channels))
            raise ValueError(error_msg)

        if num_branches != len(num_inchannels):
            error_msg = 'NUM_BRANCHES({}) <> NUM_INCHANNELS({})'.format(
                num_branches, len(num_inchannels))
            raise ValueError(error_msg)

    def _make_one_branch(self, branch_index, block, num_blocks, num_channels,
                         stride=1):
        downsample = None
        if stride != 1 or \
           self.num_inchannels[branch_index] != num_channels[branch_index] * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.num_inchannels[branch_index],
                          num_channels[branch_index] * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                BatchNorm2d(num_channels[branch_index] * block.expansion,
                            momentum=BN_MOMENTUM),
            )

        layers = []
        layers.append(block(self.num_inchannels[branch_index],
                            num_channels[branch_index], stride, downsample))
        self.num_inchannels[branch_index] = \
            num_channels[branch_index] * block.expansion
        for i in range(1, num_blocks[branch_index]):
            layers.append(block(self.num_inchannels[branch_index],
                                num_channels[branch_index]))

        return nn.Sequential(*layers)

    def _make_branches(self, num_branches, block, num_blocks, num_channels):
        branches = []

        for i in range(num_branches):
            branches.append(
                self._make_one_branch(i, block, num_blocks, num_channels))

        return nn.ModuleList(branches)

    def _make_fuse_layers(self):
        if self.num_branches == 1:
            return None

        num_branches = self.num_branches
        num_inchannels = self.num_inchannels
        fuse_layers = []
        for i in range(num_branches if self.multi_scale_output else 1):
            fuse_layer = []
            for j in range(num_branches):
                if j > i:
                    fuse_layer.append(nn.Sequential(
                        nn.Conv2d(num_inchannels[j],
                                  num_inchannels[i],
                                  1,
                                  1,
                                  0,
                                  bias=False),
                        BatchNorm2d(num_inchannels[i], momentum=BN_MOMENTUM)))
                elif j == i:
                    fuse_layer.append(None)
                else:
                    conv3x3s = []
                    for k in range(i-j):
                        if k == i - j - 1:
                            num_outchannels_conv3x3 = num_inchannels[i]
                            conv3x3s.append(nn.Sequential(
                                nn.Conv2d(num_inchannels[j],
                                          num_outchannels_conv3x3,
                                          3, 2, 1, bias=False),
                                BatchNorm2d(num_outchannels_conv3x3, 
                                            momentum=BN_MOMENTUM)))
                        else:
                            num_outchannels_conv3x3 = num_inchannels[j]
                            conv3x3s.append(nn.Sequential(
                                nn.Conv2d(num_inchannels[j],
                                          num_outchannels_conv3x3,
                                          3, 2, 1, bias=False),
                                BatchNorm2d(num_outchannels_conv3x3,
                                            momentum=BN_MOMENTUM),
                                nn.ReLU(inplace=relu_inplace)))
                    fuse_layer.append(nn.Sequential(*conv3x3s))
            fuse_layers.append(nn.ModuleList(fuse_layer))

        return nn.ModuleList(fuse_layers)

    def get_num_inchannels(self):
        return self.num_inchannels

    def forward(self, x):
        if self.num_branches == 1:
            return [self.branches[0](x[0])]

        for i in range(self.num_branches):
            x[i] = self.branches[i](x[i])

        x_fuse = []
        for i in range(len(self.fuse_layers)):
            y = x[0] if i == 0 else self.fuse_layers[i][0](x[0])
            for j in range(1, self.num_branches):
                if i == j:
                    y = y + x[j]
                elif j > i:
                    width_output = x[i].shape[-1]
                    height_output = x[i].shape[-2]
                    y = y + F.interpolate(
                        self.fuse_layers[i][j](x[j]),
                        size=[height_output, width_output],
                        mode='bilinear', align_corners=ALIGN_CORNERS)
                else:
                    y = y + self.fuse_layers[i][j](x[j])
            x_fuse.append(self.relu(y))

        return x_fuse


blocks_dict = {
    'BASIC': BasicBlock,
    'BOTTLENECK': Bottleneck
}

class hrmodel(nn.Module):
    def __init__(self):
        super(hrmodel, self).__init__()
        BN_MOMENTUM,relu_inplace=0.1,True
        # stem net
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1,
                               bias=False)
        self.bn1 = BatchNorm2d(64, momentum=BN_MOMENTUM)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1,
                               bias=False)
        self.bn2 = BatchNorm2d(64, momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=relu_inplace)

        self.stage1_cfg = {'NUM_MODULES': 1, 'NUM_BRANCHES': 1, 'BLOCK': 'BOTTLENECK', 'NUM_BLOCKS': [4], 'NUM_CHANNELS': [64], 'FUSE_METHOD': 'SUM'}
        num_channels = self.stage1_cfg['NUM_CHANNELS'][0]
        block = blocks_dict[self.stage1_cfg['BLOCK']]
        num_blocks = self.stage1_cfg['NUM_BLOCKS'][0]
        self.layer1 = self._make_layer(block, 64, 64, 4)
        stage1_out_channel = block.expansion*num_channels

        self.stage2_cfg = {'NUM_MODULES': 1, 'NUM_BRANCHES': 2, 'BLOCK': 'BASIC', 'NUM_BLOCKS': [4, 4], 'NUM_CHANNELS': [48, 96], 'FUSE_METHOD': 'SUM'}
        num_channels = self.stage2_cfg['NUM_CHANNELS']
        block = blocks_dict[self.stage2_cfg['BLOCK']]
        num_channels = [
            num_channels[i] * block.expansion for i in range(len(num_channels))]
        self.transition1 = self._make_transition_layer(
            [stage1_out_channel], num_channels)
        self.stage2, pre_stage_channels = self._make_stage(
            self.stage2_cfg, num_channels)

        self.stage3_cfg = {'NUM_MODULES': 4, 'NUM_BRANCHES': 3, 'BLOCK': 'BASIC', 'NUM_BLOCKS': [4, 4, 4], 'NUM_CHANNELS': [48, 96, 192], 'FUSE_METHOD': 'SUM'}
        num_channels = self.stage3_cfg['NUM_CHANNELS']
        block = blocks_dict[self.stage3_cfg['BLOCK']]
        num_channels = [
            num_channels[i] * block.expansion for i in range(len(num_channels))]
        self.transition2 = self._make_transition_layer(
            pre_stage_channels, num_channels)
        self.stage3, pre_stage_channels = self._make_stage(
            self.stage3_cfg, num_channels)

        self.stage4_cfg = {'NUM_MODULES': 3, 'NUM_BRANCHES': 4, 'BLOCK': 'BASIC', 'NUM_BLOCKS': [4, 4, 4, 4], 'NUM_CHANNELS': [48, 96, 192, 384], 'FUSE_METHOD': 'SUM'}
        num_channels = self.stage4_cfg['NUM_CHANNELS']
        block = blocks_dict[self.stage4_cfg['BLOCK']]
        num_channels = [
            num_channels[i] * block.expansion for i in range(len(num_channels))]
        self.transition3 = self._make_transition_layer(
            pre_stage_channels, num_channels)
        self.stage4, pre_stage_channels = self._make_stage(
            self.stage4_cfg, num_channels, multi_scale_output=True)

        self.convs=nn.Sequential(ConvModule())
        self.conv_seg=nn.Conv2d(720, 7, kernel_size=(1, 1), stride=(1, 1))

    def _make_transition_layer(
            self, num_channels_pre_layer, num_channels_cur_layer):
        num_branches_cur = len(num_channels_cur_layer)
        num_branches_pre = len(num_channels_pre_layer)

        transition_layers = []
        for i in range(num_branches_cur):
            if i < num_branches_pre:
                if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
                    transition_layers.append(nn.Sequential(
                        nn.Conv2d(num_channels_pre_layer[i],
                                  num_channels_cur_layer[i],
                                  3,
                                  1,
                                  1,
                                  bias=False),
                        BatchNorm2d(
                            num_channels_cur_layer[i], momentum=BN_MOMENTUM),
                        nn.ReLU(inplace=relu_inplace)))
                else:
                    transition_layers.append(None)
            else:
                conv3x3s = []
                for j in range(i+1-num_branches_pre):
                    inchannels = num_channels_pre_layer[-1]
                    outchannels = num_channels_cur_layer[i] \
                        if j == i-num_branches_pre else inchannels
                    conv3x3s.append(nn.Sequential(
                        nn.Conv2d(
                            inchannels, outchannels, 3, 2, 1, bias=False),
                        BatchNorm2d(outchannels, momentum=BN_MOMENTUM),
                        nn.ReLU(inplace=relu_inplace)))
                transition_layers.append(nn.Sequential(*conv3x3s))

        return nn.ModuleList(transition_layers)

    def _make_layer(self, block, inplanes, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
            )

        layers = []
        layers.append(block(inplanes, planes, stride, downsample))
        inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(inplanes, planes))

        return nn.Sequential(*layers)

    def _make_stage(self, layer_config, num_inchannels,
                    multi_scale_output=True):
        num_modules = layer_config['NUM_MODULES']
        num_branches = layer_config['NUM_BRANCHES']
        num_blocks = layer_config['NUM_BLOCKS']
        num_channels = layer_config['NUM_CHANNELS']
        block = blocks_dict[layer_config['BLOCK']]
        fuse_method = layer_config['FUSE_METHOD']

        modules = []
        for i in range(num_modules):
            # multi_scale_output is only used by the last module
            if not multi_scale_output and i == num_modules - 1:
                reset_multi_scale_output = False
            else:
                reset_multi_scale_output = True
            modules.append(
                HighResolutionModule(num_branches,
                                      block,
                                      num_blocks,
                                      num_inchannels,
                                      num_channels,
                                      fuse_method,
                                      reset_multi_scale_output)
            )
            num_inchannels = modules[-1].get_num_inchannels()

        return nn.Sequential(*modules), num_inchannels

    def forward(self, x,gts=None):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)
        x = self.layer1(x)

        x_list = []
        for i in range(self.stage2_cfg['NUM_BRANCHES']):
            if self.transition1[i] is not None:
                x_list.append(self.transition1[i](x))
            else:
                x_list.append(x)
        y_list = self.stage2(x_list)

        x_list = []
        for i in range(self.stage3_cfg['NUM_BRANCHES']):
            if self.transition2[i] is not None:
                if i < self.stage2_cfg['NUM_BRANCHES']:
                    x_list.append(self.transition2[i](y_list[i]))
                else:
                    x_list.append(self.transition2[i](y_list[-1]))
            else:
                x_list.append(y_list[i])
        y_list = self.stage3(x_list)

        x_list = []
        for i in range(self.stage4_cfg['NUM_BRANCHES']):
            if self.transition3[i] is not None:
                if i < self.stage3_cfg['NUM_BRANCHES']:
                    x_list.append(self.transition3[i](y_list[i]))
                else:
                    x_list.append(self.transition3[i](y_list[-1]))
            else:
                x_list.append(y_list[i])
        x = self.stage4(x_list)
        # return x
        # Upsampling
        x0_h, x0_w = x[0].size(2), x[0].size(3)
        x1 = F.interpolate(x[1], size=(x0_h, x0_w), mode='bilinear', align_corners=ALIGN_CORNERS)
        x2 = F.interpolate(x[2], size=(x0_h, x0_w), mode='bilinear', align_corners=ALIGN_CORNERS)
        x3 = F.interpolate(x[3], size=(x0_h, x0_w), mode='bilinear', align_corners=ALIGN_CORNERS)

        x = torch.cat([x[0], x1, x2, x3], 1)
        mid=x
        # x = self.last_layer(x)
        x=self.convs(x)
        x=self.conv_seg(x)
        return x



class ConvModule(nn.Module):
    def __init__(self):
        super(ConvModule, self).__init__()
        self.conv=nn.Conv2d(720, 720, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.bn=nn.BatchNorm2d(720, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.relu=nn.ReLU(inplace=True)
    def forward(self,x):
        return self.relu(self.bn(self.conv(x)))
class HRNet48(nn.Module):
    def __init__(self,device):
        super(HRNet48, self).__init__()
        self.model=hrmodel()
        self.device = device
        try:
            trained = torch.load('./checkpoints/HRNET_LoveDA_best.pth')
        except:
            trained = torch.load('../../checkpoints/HRNET_LoveDA_best.pth')
        self.load_state_dict(trained)
        self.model = self.model.to(device)
        self.model.eval()
        self.category = ['Background','Building', 'Road', 'Water', 'Barren', 'Forest', 'Farmland']
        self.color_bar=[[0,0,0],[255,0,0],[255,255,0],[0,0,255],[128,0,128],[0,255,0],[255,128,0]]
        self.mean, self.std = torch.tensor([123.675, 116.28, 103.53]).reshape((1, 3, 1, 1)), torch.tensor(
            [58.395, 57.12, 57.375]).reshape((1, 3, 1, 1))
    def visualize(self,pred,cls):
        vis=np.zeros([pred.shape[0],pred.shape[1],3]).astype(np.uint8)
        if len(cls)>1:
            for i in range(len(self.category)):
                vis[:,:,0][pred==i]=self.color_bar[i][0]
                vis[:,:,1][pred == i] = self.color_bar[i][1]
                vis[:,:,2][pred == i] = self.color_bar[i][2]
        else:
            idx=cls[0]
            vis[:, :, 0][pred == idx] = self.color_bar[idx][0]
            vis[:, :, 1][pred == idx] = self.color_bar[idx][1]
            vis[:, :, 2][pred == idx] = self.color_bar[idx][2]
        return vis


    def inference(self,image_path, det_prompt,updated_image_path):
        det_prompt=det_prompt.strip()
        image = torch.from_numpy(io.imread(image_path))
        image = (image.permute(2, 0, 1).unsqueeze(0) - self.mean) / self.std
        with torch.no_grad():
            b, c, h, w = image.shape
            pred = self.model(image.to(self.device))
            pred = F.interpolate(pred, (h, w), mode='bilinear')
        pred = pred.argmax(1).cpu().squeeze().int().numpy()
        if det_prompt.lower() == 'landuse':
            pred_vis = self.visualize(pred, self.category)
        elif det_prompt.lower() in [i.lower() for i in self.category]:
            idx=[i.lower() for i in self.category].index(det_prompt.strip().lower())
            pred_vis = self.visualize(pred, [idx])
        else:
            print('Category', det_prompt, 'is not supported!')
            return 'Category ' + det_prompt + ' is not supported! The expected input categories include Building, Road, Water, Barren, Forest, Farmland, and Landuse.'

        pred = Image.fromarray(pred_vis.astype(np.uint8))
        pred.save(updated_image_path)
        print(f"\nProcessed Landuse Segmentation, Input Image: {image_path+','+det_prompt}, Output: {updated_image_path}")
        return det_prompt+' segmentation result in '+updated_image_path

if __name__=='__main__':
    # Smoke test on the randomly initialized backbone; HRNet48 itself requires a
    # trained checkpoint on disk and defines no forward(), so hrmodel is used here.
    net = hrmodel()
    print(sum(p.numel() for p in net.parameters()))
    x = torch.ones((2, 3, 512, 512))
    output = net(x)
    print(output.shape)

================================================
FILE: RStask/LanduseSegmentation/test.py
================================================
from RStask import LanduseFunction
model=LanduseFunction('cuda:0')
model.inference('/data/haonan.guo/LoveDA/Train/Urban/images_png/1367.png','road','/data/haonan.guo/RSChatGPT/output.png')

================================================
FILE: RStask/ObjectCounting/Yolocounting.py
================================================
from RStask.ObjectDetection.models.common import DetectMultiBackend
import torch
from skimage import io
import numpy as np
import torchvision
import torch.nn.functional as F
class YoloCounting:
    def __init__(self, device):
        from RStask.ObjectDetection.models.common import DetectMultiBackend
        self.device = device
        try:
            self.model = DetectMultiBackend('./checkpoints/yolov5_best.pt', device=torch.device(device), dnn=False, fp16=False)
        except:
            self.model = DetectMultiBackend('../../checkpoints/yolov5_best.pt', device=torch.device(device), dnn=False,fp16=False)
        self.category = ['small vehicle', 'large vehicle', 'plane', 'storage tank', 'ship', 'harbor',
                         'ground track field',
                         'soccer ball field', 'tennis court', 'swimming pool', 'baseball diamond', 'roundabout',
                         'basketball court', 'bridge', 'helicopter']


    def inference(self, image_path, det_prompt):
        supported_class=False
        for i in range(len(self.category)):
            if self.category[i] == det_prompt or self.category[i] == det_prompt[:-1] or self.category[i] == det_prompt[:-3]:
                supported_class=True
        if supported_class is False:
            log_text=det_prompt+' is not a supported category for the model.'
            print(f"\nProcessed Object Counting, Input Image: {image_path}, Output text: {log_text}")
            return log_text

        image = torch.from_numpy(io.imread(image_path))
        image = image.permute(2, 0, 1).unsqueeze(0) / 255.0
        _, _, h, w = image.shape
        with torch.no_grad():
            out, _ = self.model(image.to(self.device), augment=False,val=True)
            predn = self.non_max_suppression(out, conf_thres=0.001, iou_thres=0.75, labels=[], multi_label=True,
                                             agnostic=False)[0]
            detections = predn.clone()
            detections = detections[predn[:, 4] > 0.75]
            detections_box = (detections[:, :4] / (640 / h)).int().cpu().numpy()
            detection_classes = detections[:, 5].int().cpu().numpy()
        log_text = ''

        for i in range(len(self.category)):
            if (detection_classes == i).sum() > 0 and (
                    self.category[i] == det_prompt or self.category[i] == det_prompt[:-1] or self.category[
                i] == det_prompt[:-3]):
                log_text += str((detection_classes == i).sum()) + ' ' + self.category[i] + ','
        if log_text != '':
            log_text = log_text[:-1] + ' detected.'
        else:
            log_text = 'No ' + det_prompt + ' detected.'

        print(f"\nProcessed Object Counting, Input Image: {image_path}, Output text: {log_text}")
        return log_text

    def non_max_suppression(self, prediction,
                            conf_thres=0.25,
                            iou_thres=0.45,
                            classes=None,
                            agnostic=False,
                            multi_label=False,
                            labels=(),
                            max_det=300):
        """Non-Maximum Suppression (NMS) on inference results to reject overlapping bounding boxes

        Returns:
             list of detections, on (n,6) tensor per image [xyxy, conf, cls]
        """

        def box_iou(box1, box2):
            def box_area(box):
                # box = xyxy(4,n)
                return (box[2] - box[0]) * (box[3] - box[1])

            # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py
            """
            Return intersection-over-union (Jaccard index) of boxes.
            Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
            Arguments:
                box1 (Tensor[N, 4])
                box2 (Tensor[M, 4])
            Returns:
                iou (Tensor[N, M]): the NxM matrix containing the pairwise
                    IoU values for every element in boxes1 and boxes2
            """

            # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
            (a1, a2), (b1, b2) = box1[:, None].chunk(2, 2), box2.chunk(2, 1)
            inter = (torch.min(a2, b2) - torch.max(a1, b1)).clamp(0).prod(2)

            # IoU = inter / (area1 + area2 - inter)
            return inter / (box_area(box1.T)[:, None] + box_area(box2.T) - inter)

        def xywh2xyxy(x):
            # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right

            y = x.clone()
            y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
            y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
            y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
            y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
            return y

        bs = prediction.shape[0]  # batch size
        nc = prediction.shape[2] - 5  # number of classes
        xc = prediction[..., 4] > conf_thres  # candidates

        # Checks
        assert 0 <= conf_thres <= 1, f'Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0'
        assert 0 <= iou_thres <= 1, f'Invalid IoU {iou_thres}, valid values are between 0.0 and 1.0'

        # Settings
        # min_wh = 2  # (pixels) minimum box width and height
        max_wh = 7680  # (pixels) maximum box width and height
        max_nms = 30000  # maximum number of boxes into torchvision.ops.nms()
        time_limit = 0.1 + 0.03 * bs  # seconds to quit after
        redundant = True  # require redundant detections
        multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)
        merge = False  # use merge-NMS


        output = [torch.zeros((0, 6), device=prediction.device)] * bs
        for xi, x in enumerate(prediction):  # image index, image inference
            # Apply constraints
            # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
            x = x[xc[xi]]  # confidence

            # Cat apriori labels if autolabelling
            if labels and len(labels[xi]):
                lb = labels[xi]
                v = torch.zeros((len(lb), nc + 5), device=x.device)
                v[:, :4] = lb[:, 1:5]  # box
                v[:, 4] = 1.0  # conf
                v[range(len(lb)), lb[:, 0].long() + 5] = 1.0  # cls
                x = torch.cat((x, v), 0)

            # If none remain process next image
            if not x.shape[0]:
                continue

            # Compute conf
            x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

            # Box (center x, center y, width, height) to (x1, y1, x2, y2)
            box = xywh2xyxy(x[:, :4])

            # Detections matrix nx6 (xyxy, conf, cls)
            if multi_label:
                i, j = (x[:, 5:] > conf_thres).nonzero(as_tuple=False).T
                x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
            else:  # best class only
                conf, j = x[:, 5:].max(1, keepdim=True)
                x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

            # Filter by class
            if classes is not None:
                x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]

            # Apply finite constraint
            # if not torch.isfinite(x).all():
            #     x = x[torch.isfinite(x).all(1)]

            # Check shape
            n = x.shape[0]  # number of boxes
            if not n:  # no boxes
                continue
            elif n > max_nms:  # excess boxes
                x = x[x[:, 4].argsort(descending=True)[:max_nms]]  # sort by confidence

            # Batched NMS
            c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
            boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
            i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
            if i.shape[0] > max_det:  # limit detections
                i = i[:max_det]
            if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
                # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
                iou = box_iou(boxes[i], boxes) > iou_thres  # iou matrix
                weights = iou * scores[None]  # box weights
                x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True)  # merged boxes
                if redundant:
                    i = i[iou.sum(1) > 1]  # require redundancy

            output[xi] = x[i]

        return output
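
# A hedged, self-contained sketch of the NMS helper above. The method does not
# use `self`, so for illustration it is called unbound with `None`; the dummy
# prediction tensor follows the (batch, boxes, 4 box coords + objectness +
# 15 class scores) layout expected by the detector in this file.
if __name__ == '__main__':
    pred = torch.zeros(1, 3, 20)
    pred[0, :, :4] = torch.tensor([[100., 100., 40., 40.],   # cx, cy, w, h
                                   [102., 101., 40., 40.],   # overlaps the first box
                                   [300., 200., 60., 30.]])
    pred[0, :, 4] = torch.tensor([0.9, 0.8, 0.7])            # objectness
    pred[0, :2, 5] = torch.tensor([0.95, 0.90])              # class 0: small vehicle
    pred[0, 2, 7] = 0.9                                      # class 2: plane
    dets = YoloCounting.non_max_suppression(None, pred,
                                            conf_thres=0.25, iou_thres=0.5)[0]
    # Two rows of [x1, y1, x2, y2, conf, cls]; the duplicate box is suppressed.
    print(dets)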


================================================
FILE: RStask/ObjectCounting/__init__.py
================================================


================================================
FILE: RStask/ObjectCounting/test.py
================================================
from RStask import CountingFuncnction
model=CountingFuncnction('cuda:0')
txt='/data/haonan.guo/RSChatGPT/test.tif,small vehicles'
p,t=txt.split(",")
model.inference(p,t)


================================================
FILE: RStask/ObjectDetection/YOLOv5.py
================================================
from RStask.ObjectDetection.models.common import DetectMultiBackend
import torch
from skimage import io
import numpy as np
import torchvision
import cv2
from PIL import Image
class YoloDetection:
    def __init__(self, device):
        self.device = device
        try:
            self.model = DetectMultiBackend('./checkpoints/yolov5_best.pt', device=torch.device(device), dnn=False, fp16=False)
        except:
            self.model = DetectMultiBackend('/data/haonan.guo/RSChatGPT/checkpoints/yolov5_best.pt', device=torch.device(device), dnn=False,fp16=False)
        self.category = ['small vehicle', 'large vehicle', 'plane', 'storage tank', 'ship', 'harbor',
                         'ground track field',
                         'soccer ball field', 'tennis court', 'swimming pool', 'baseball diamond', 'roundabout',
                         'basketball court', 'bridge', 'helicopter']

    def inference(self, image_path, det_prompt,updated_image_path):
        image = torch.from_numpy(io.imread(image_path))
        image = image.permute(2, 0, 1).unsqueeze(0) / 255.0
        _, _, h, w = image.shape
        with torch.no_grad():
            out, _ = self.model(image.to(self.device), augment=False,val=True)
            predn = self.non_max_suppression(out, conf_thres=0.001, iou_thres=0.75, labels=[], multi_label=True,
                                             agnostic=False)[0]
            detections = predn.clone()
            detections = detections[predn[:, 4] > 0.75]
            detections_box = (detections[:, :4] / (640 / h)).int().cpu().numpy()
            detection_classes = detections[:, 5].int().cpu().numpy()
        if len(detection_classes) > 0:
            det = np.zeros((h, w, 3))
            for i in range(len(detections_box)):
                x1, y1, x2, y2 = detections_box[i]
                det[y1:y2, x1:x2] = detection_classes[i] + 1

            self.visualize(image_path,updated_image_path,detections)
            print(
                f"\nProcessed Object Detection, Input Image: {image_path}, Output Bounding box: {updated_image_path},Output text: {'Object Detection Done'}")
            return  det_prompt+' object detection result in '+updated_image_path
    def visualize(self,image_path, newpic_path,detections):
        font = cv2.FONT_HERSHEY_SIMPLEX
        im = io.imread(image_path)
        boxes = detections.int().cpu().numpy()
        for i in range(len(boxes)):
            cv2.rectangle(im, (boxes[i][0], boxes[i][1]), (boxes[i][2], boxes[i][3]), (0, 255, 255), 2)
            cv2.rectangle(im, (boxes[i][0], boxes[i][1] - 15), (boxes[i][0] + 45, boxes[i][1] - 2), (0, 0, 255), thickness=-1)
            cv2.putText(im, self.category[boxes[i][-1]], (boxes[i][0], boxes[i][1] - 2), font, 0.5, (255, 255, 255), 1)
        Image.fromarray(im.astype(np.uint8)).save(newpic_path)
        with open(newpic_path[:-4]+'.txt','w') as f:
            for i in range(len(boxes)):
                f.write(str(list(boxes[i,:4]))[1:-1]+', '+self.category[boxes[i][-1]]+'\n')
    def non_max_suppression(self, prediction,
                            conf_thres=0.25,
                            iou_thres=0.45,
                            classes=None,
                            agnostic=False,
                            multi_label=False,
                            labels=(),
                            max_det=300):
        """Non-Maximum Suppression (NMS) on inference results to reject overlapping bounding boxes

        Returns:
             list of detections, one (n, 6) tensor per image [xyxy, conf, cls]
        """

        def box_iou(box1, box2):
            def box_area(box):
                # box = xyxy(4,n)
                return (box[2] - box[0]) * (box[3] - box[1])

            # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py
            """
            Return intersection-over-union (Jaccard index) of boxes.
            Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
            Arguments:
                box1 (Tensor[N, 4])
                box2 (Tensor[M, 4])
            Returns:
                iou (Tensor[N, M]): the NxM matrix containing the pairwise
                    IoU values for every element in boxes1 and boxes2
            """

            # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
            (a1, a2), (b1, b2) = box1[:, None].chunk(2, 2), box2.chunk(2, 1)
            inter = (torch.min(a2, b2) - torch.max(a1, b1)).clamp(0).prod(2)

            # IoU = inter / (area1 + area2 - inter)
            return inter / (box_area(box1.T)[:, None] + box_area(box2.T) - inter)
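
        # Worked example (illustrative numbers): for box1 = [[0, 0, 10, 10]] and
        # box2 = [[5, 5, 15, 15]], the intersection is 5 * 5 = 25 and the union is
        # 100 + 100 - 25 = 175, so box_iou returns roughly 0.143.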

        def xywh2xyxy(x):
            # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right

            y = x.clone()
            y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
            y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
            y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
            y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
            return y
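
        # Worked example (illustrative numbers): [x, y, w, h] = [50, 40, 20, 10]
        # converts to [x1, y1, x2, y2] = [40, 35, 60, 45].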

        bs = prediction.shape[0]  # batch size
        nc = prediction.shape[2] - 5  # number of classes
        xc = prediction[..., 4] > conf_thres  # candidates

        # Checks
        assert 0 <= conf_thres <= 1, f'Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0'
        assert 0 <= iou_thres <= 1, f'Invalid IoU {iou_thres}, valid values are between 0.0 and 1.0'

        # Settings
        # min_wh = 2  # (pixels) minimum box width and height
        max_wh = 7680  # (pixels) maximum box width and height
        max_nms = 30000  # maximum number of boxes into torchvision.ops.nms()
        time_limit = 0.1 + 0.03 * bs  # seconds to quit after
        redundant = True  # require redundant detections
        multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)
        merge = False  # use merge-NMS


        output = [torch.zeros((0, 6), device=prediction.device)] * bs
        for xi, x in enumerate(prediction):  # image index, image inference
            # Apply constraints
            # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
            x = x[xc[xi]]  # confidence

            # Cat apriori labels if autolabelling
            if labels and len(labels[xi]):
                lb = labels[xi]
                v = torch.zeros((len(lb), nc + 5), device=x.device)
                v[:, :4] = lb[:, 1:5]  # box
                v[:, 4] = 1.0  # conf
                v[range(len(lb)), lb[:, 0].long() + 5] = 1.0  # cls
                x = torch.cat((x, v), 0)

            # If none remain process next image
            if not x.shape[0]:
                continue

            # Compute conf
            x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

            # Box (center x, center y, width, height) to (x1, y1, x2, y2)
            box = xywh2xyxy(x[:, :4])

            # Detections matrix nx6 (xyxy, conf, cls)
            if multi_label:
                i, j = (x[:, 5:] > conf_thres).nonzero(as_tuple=False).T
                x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
            else:  # best class only
                conf, j = x[:, 5:].max(1, keepdim=True)
                x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

            # Filter by class
            if classes is not None:
                x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]

            # Apply finite constraint
            # if not torch.isfinite(x).all():
            #     x = x[torch.isfinite(x).all(1)]

            # Check shape
            n = x.shape[0]  # number of boxes
            if not n:  # no boxes
                continue
            elif n > max_nms:  # excess boxes
                x = x[x[:, 4].argsort(descending=True)[:max_nms]]  # sort by confidence

            # Batched NMS
            c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
            boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
            i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
            if i.shape[0] > max_det:  # limit detections
                i = i[:max_det]
            if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
                # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
                iou = box_iou(boxes[i], boxes) > iou_thres  # iou matrix
                weights = iou * scores[None]  # box weights
                x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True)  # merged boxes
                if redundant:
                    i = i[iou.sum(1) > 1]  # require redundancy

            output[xi] = x[i]

        return output
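
# Minimal usage sketch (illustrative; 'demo.tif' and 'demo_det.png' are example paths,
# not files shipped with the repo, and the checkpoint loaded in __init__ must exist):
#
#     detector = YoloDetection('cuda:0')
#     message = detector.inference('demo.tif', 'plane', 'demo_det.png')
#     print(message)  # e.g. 'plane object detection result in demo_det.png'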

================================================
FILE: RStask/ObjectDetection/__init__.py
================================================


================================================
FILE: RStask/ObjectDetection/models/__init__.py
================================================


================================================
FILE: RStask/ObjectDetection/models/common.py
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Common modules
"""

import json
import math
import platform
import warnings
from collections import OrderedDict, namedtuple
from copy import copy
from pathlib import Path

import cv2
import numpy as np
import pandas as pd
import requests
import torch
import torch.nn as nn
import yaml
from PIL import Image
from torch.cuda import amp

from RStask.ObjectDetection.utils.datasets import exif_transpose, letterbox
from RStask.ObjectDetection.utils.general import (LOGGER, check_requirements, check_suffix, check_version, colorstr, increment_path,
                           make_divisible, non_max_suppression, scale_coords, xywh2xyxy, xyxy2xywh)
from RStask.ObjectDetection.utils.plots import Annotator, colors, save_one_box
from RStask.ObjectDetection.utils.torch_utils import copy_attr, time_sync


def autopad(k, p=None):  # kernel, padding
    # Pad to 'same'
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad (a list, not a generator, so nn.Conv2d accepts it)
    return p
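
# Illustrative values: autopad(3) -> 1 and autopad(5) -> 2 give 'same' padding for
# stride-1 convolutions; an explicit p is returned unchanged, e.g. autopad(3, 0) -> 0.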


class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        return self.act(self.conv(x))


class DWConv(Conv):
    # Depth-wise convolution class
    def __init__(self, c1, c2, k=1, s=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__(c1, c2, k, s, g=math.gcd(c1, c2), act=act)


class TransformerLayer(nn.Module):
    # Transformer layer https://arxiv.org/abs/2010.11929 (LayerNorm layers removed for better performance)
    def __init__(self, c, num_heads):
        super().__init__()
        self.q = nn.Linear(c, c, bias=False)
        self.k = nn.Linear(c, c, bias=False)
        self.v = nn.Linear(c, c, bias=False)
        self.ma = nn.MultiheadAttention(embed_dim=c, num_heads=num_heads)
        self.fc1 = nn.Linear(c, c, bias=False)
        self.fc2 = nn.Linear(c, c, bias=False)

    def forward(self, x):
        x = self.ma(self.q(x), self.k(x), self.v(x))[0] + x
        x = self.fc2(self.fc1(x)) + x
        return x


class TransformerBlock(nn.Module):
    # Vision Transformer https://arxiv.org/abs/2010.11929
    def __init__(self, c1, c2, num_heads, num_layers):
        super().__init__()
        self.conv = None
        if c1 != c2:
            self.conv = Conv(c1, c2)
        self.linear = nn.Linear(c2, c2)  # learnable position embedding
        self.tr = nn.Sequential(*(TransformerLayer(c2, num_heads) for _ in range(num_layers)))
        self.c2 = c2

    def forward(self, x):
        if self.conv is not None:
            x = self.conv(x)
        b, _, w, h = x.shape
        p = x.flatten(2).permute(2, 0, 1)
        return self.tr(p + self.linear(p)).permute(1, 2, 0).reshape(b, self.c2, w, h)


class Bottleneck(nn.Module):
    # Standard bottleneck
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class BottleneckCSP(nn.Module):
    # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
        self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
        self.cv4 = Conv(2 * c_, c2, 1, 1)
        self.bn = nn.BatchNorm2d(2 * c_)  # applied to cat(cv2, cv3)
        self.act = nn.SiLU()
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

    def forward(self, x):
        y1 = self.cv3(self.m(self.cv1(x)))
        y2 = self.cv2(x)
        return self.cv4(self.act(self.bn(torch.cat((y1, y2), 1))))


class C3(nn.Module):
    # CSP Bottleneck with 3 convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
        # self.m = nn.Sequential(*(CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)))

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))


class C3TR(C3):
    # C3 module with TransformerBlock()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)
        self.m = TransformerBlock(c_, c_, 4, n)


class C3SPP(C3):
    # C3 module with SPP()
    def __init__(self, c1, c2, k=(5, 9, 13), n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)
        self.m = SPP(c_, c_, k)


class C3Ghost(C3):
    # C3 module with GhostBottleneck()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(GhostBottleneck(c_, c_) for _ in range(n)))


class SPP(nn.Module):
    # Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729
    def __init__(self, c1, c2, k=(5, 9, 13)):
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))


class SPPF(nn.Module):
    # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
    def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            y1 = self.m(x)
            y2 = self.m(y1)
            return self.cv2(torch.cat((x, y1, y2, self.m(y2)), 1))
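
# Note: applying the same k=5 max-pool three times in sequence gives effective
# receptive fields of 5, 9 and 13, which is why SPPF(k=5) is equivalent to
# SPP(k=(5, 9, 13)) while re-using the intermediate pooling results.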


class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat((x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]), 1))
        # return self.conv(self.contract(x))


class GhostConv(nn.Module):
    # Ghost Convolution https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, k=1, s=1, g=1, act=True):  # ch_in, ch_out, kernel, stride, groups
        super().__init__()
        c_ = c2 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, k, s, None, g, act)
        self.cv2 = Conv(c_, c_, 5, 1, None, c_, act)

    def forward(self, x):
        y = self.cv1(x)
        return torch.cat((y, self.cv2(y)), 1)


class GhostBottleneck(nn.Module):
    # Ghost Bottleneck https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, k=3, s=1):  # ch_in, ch_out, kernel, stride
        super().__init__()
        c_ = c2 // 2
        self.conv = nn.Sequential(
            GhostConv(c1, c_, 1, 1),  # pw
            DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
            GhostConv(c_, c2, 1, 1, act=False))  # pw-linear
        self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), Conv(c1, c2, 1, 1,
                                                                            act=False)) if s == 2 else nn.Identity()

    def forward(self, x):
        return self.conv(x) + self.shortcut(x)


class Contract(nn.Module):
    # Contract width-height into channels, i.e. x(1,64,80,80) to x(1,256,40,40)
    def __init__(self, gain=2):
        super().__init__()
        self.gain = gain

    def forward(self, x):
        b, c, h, w = x.size()  # assert h % s == 0 and w % s == 0, 'Indivisible gain'
        s = self.gain
        x = x.view(b, c, h // s, s, w // s, s)  # x(1,64,40,2,40,2)
        x = x.permute(0, 3, 5, 1, 2, 4).contiguous()  # x(1,2,2,64,40,40)
        return x.view(b, c * s * s, h // s, w // s)  # x(1,256,40,40)


class Expand(nn.Module):
    # Expand channels into width-height, i.e. x(1,64,80,80) to x(1,16,160,160)
    def __init__(self, gain=2):
        super().__init__()
        self.gain = gain

    def forward(self, x):
        b, c, h, w = x.size()  # assert c % s ** 2 == 0, 'Indivisible gain'
        s = self.gain
        x = x.view(b, s, s, c // s ** 2, h, w)  # x(1,2,2,16,80,80)
        x = x.permute(0, 3, 4, 1, 5, 2).contiguous()  # x(1,16,80,2,80,2)
        return x.view(b, c // s ** 2, h * s, w * s)  # x(1,16,160,160)


class Concat(nn.Module):
    # Concatenate a list of tensors along dimension
    def __init__(self, dimension=1):
        super().__init__()
        self.d = dimension

    def forward(self, x):
        return torch.cat(x, self.d)


class DetectMultiBackend(nn.Module):
    # YOLOv5 MultiBackend class for python inference on various backends
    def __init__(self, weights='yolov5s.pt', device=torch.device('cpu'), dnn=False,fp16=False):

        from RStask.ObjectDetection.models.experimental import attempt_download, attempt_load  # scoped to avoid circular import

        super().__init__()
        w = str(weights[0] if isinstance(weights, list) else weights)

        stride, names = 32, [f'class{i}' for i in range(1000)]  # assign defaults
        w = attempt_download(w)  # download if not local
        fp16 = False

        names = ['small-vehicle', 'large-vehicle','plane','storage-tank',  # DOTA-style defaults; overridden below by the checkpoint's own class names
                 'ship','harbor','ground-track-field','soccer-ball-field',
                 'tennis-court','swimming-pool','baseball-diamond',
                 'roundabout','basketball-court','bridge','helicopter']
        model = attempt_load(weights if isinstance(weights, list) else w, map_location=device)
        stride = max(int(model.stride.max()), 32)  # model stride
        names = model.module.names if hasattr(model, 'module') else model.names  # get class names
        model.half() if fp16 else model.float()
        self.model = model  # explicitly assign for to(), cpu(), cuda(), half()

        self.__dict__.update(locals())  # assign all variables to self

    def forward(self, im, augment=False, visualize=False, val=False):
        # YOLOv5 MultiBackend inference
        b, ch, h, w = im.shape  # batch, channel, height, width
        y = self.model(im, augment=augment, visualize=visualize)[0]
        if isinstance(y, np.ndarray):
            y = torch.tensor(y, device=self.device)
        return (y, []) if val else y





class AutoShape(nn.Module):
    # YOLOv5 input-robust model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and NMS
    conf = 0.25  # NMS confidence threshold
    iou = 0.45  # NMS IoU threshold
    agnostic = False  # NMS class-agnostic
    multi_label = False  # NMS multiple labels per box
    classes = None  # (optional list) filter by class, i.e. = [0, 15, 16] for COCO persons, cats and dogs
    max_det = 1000  # maximum number of detections per image
    amp = False  # Automatic Mixed Precision (AMP) inference

    def __init__(self, model):
        super().__init__()
        LOGGER.info('Adding AutoShape... ')
        copy_attr(self, model, include=('yaml', 'nc', 'hyp', 'names', 'stride', 'abc'), exclude=())  # copy attributes
        self.dmb = isinstance(model, DetectMultiBackend)  # DetectMultiBackend() instance
        self.pt = not self.dmb or model.pt  # PyTorch model
        self.model = model.eval()

    def _apply(self, fn):
        # Apply to(), cpu(), cuda(), half() to model tensors that are not parameters or registered buffers
        self = super()._apply(fn)
        if self.pt:
            m = self.model.model.model[-1] if self.dmb else self.model.model[-1]  # Detect()
            m.stride = fn(m.stride)
            m.grid = list(map(fn, m.grid))
            if isinstance(m.anchor_grid, list):
                m.anchor_grid = list(map(fn, m.anchor_grid))
        return self

    @torch.no_grad()
    def forward(self, imgs, size=640, augment=False, profile=False):
        # Inference from various sources. For height=640, width=1280, RGB images example inputs are:
        #   file:       imgs = 'data/images/zidane.jpg'  # str or PosixPath
        #   URI:             = 'https://ultralytics.com/images/zidane.jpg'
        #   OpenCV:          = cv2.imread('image.jpg')[:,:,::-1]  # HWC BGR to RGB x(640,1280,3)
        #   PIL:             = Image.open('image.jpg') or ImageGrab.grab()  # HWC x(640,1280,3)
        #   numpy:           = np.zeros((640,1280,3))  # HWC
        #   torch:           = torch.zeros(16,3,320,640)  # BCHW (scaled to size=640, 0-1 values)
        #   multiple:        = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...]  # list of images

        t = [time_sync()]
        p = next(self.model.parameters()) if self.pt else torch.zeros(1, device=self.model.device)  # for device, type
        autocast = self.amp and (p.device.type != 'cpu')  # Automatic Mixed Precision (AMP) inference
        if isinstance(imgs, torch.Tensor):  # torch
            with amp.autocast(autocast):
                return self.model(imgs.to(p.device).type_as(p), augment, profile)  # inference

        # Pre-process
        n, imgs = (len(imgs), list(imgs)) if isinstance(imgs, (list, tuple)) else (1, [imgs])  # number, list of images
        shape0, shape1, files = [], [], []  # image and inference shapes, filenames
        for i, im in enumerate(imgs):
            f = f'image{i}'  # filename
            if isinstance(im, (str, Path)):  # filename or uri
                im, f = Image.open(requests.get(im, stream=True).raw if str(im).startswith('http') else im), im
                im = np.asarray(exif_transpose(im))
            elif isinstance(im, Image.Image):  # PIL Image
                im, f = np.asarray(exif_transpose(im)), getattr(im, 'filename', f) or f
            files.append(Path(f).with_suffix('.jpg').name)
            if im.shape[0] < 5:  # image in CHW
                im = im.transpose((1, 2, 0))  # reverse dataloader .transpose(2, 0, 1)
            im = im[..., :3] if im.ndim == 3 else np.tile(im[..., None], 3)  # enforce 3ch input
            s = im.shape[:2]  # HWC
            shape0.append(s)  # image shape
            g = (size / max(s))  # gain
            shape1.append([y * g for y in s])
            imgs[i] = im if im.data.contiguous else np.ascontiguousarray(im)  # update
        shape1 = [make_divisible(x, self.stride) if self.pt else size for x in np.array(shape1).max(0)]  # inf shape
        x = [letterbox(im, shape1, auto=False)[0] for im in imgs]  # pad
        x = np.ascontiguousarray(np.array(x).transpose((0, 3, 1, 2)))  # stack and BHWC to BCHW
        x = torch.from_numpy(x).to(p.device).type_as(p) / 255  # uint8 to fp16/32
        t.append(time_sync())

        with amp.autocast(autocast):
            # Inference
            y = self.model(x, augment, profile)  # forward
            t.append(time_sync())

            # Post-process
            y = non_max_suppression(y if self.dmb else y[0],
                                    self.conf,
                                    self.iou,
                                    self.classes,
                                    self.agnostic,
                                    self.multi_label,
                                    max_det=self.max_det)  # NMS
            for i in range(n):
                scale_coords(shape1, y[i][:, :4], shape0[i])

            t.append(time_sync())
            return Detections(imgs, y, files, t, self.names, x.shape)


class Detections:
    # YOLOv5 detections class for inference results
    def __init__(self, imgs, pred, files, times=(0, 0, 0, 0), names=None, shape=None):
        super().__init__()
        d = pred[0].device  # device
        gn = [torch.tensor([*(im.shape[i] for i in [1, 0, 1, 0]), 1, 1], device=d) for im in imgs]  # normalizations
        self.imgs = imgs  # list of images as numpy arrays
        self.pred = pred  # list of tensors pred[0] = (xyxy, conf, cls)
        self.names = names  # class names
        self.files = files  # image filenames
        self.times = times  # profiling times
        self.xyxy = pred  # xyxy pixels
        self.xywh = [xyxy2xywh(x) for x in pred]  # xywh pixels
        self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)]  # xyxy normalized
        self.xywhn = [x / g for x, g in zip(self.xywh, gn)]  # xywh normalized
        self.n = len(self.pred)  # number of images (batch size)
        self.t = tuple((times[i + 1] - times[i]) * 1000 / self.n for i in range(3))  # timestamps (ms)
        self.s = shape  # inference BCHW shape

    def display(self, pprint=False, show=False, save=False, crop=False, render=False, labels=True, save_dir=Path('')):
        crops = []
        for i, (im, pred) in enumerate(zip(self.imgs, self.pred)):
            s = f'image {i + 1}/{len(self.pred)}: {im.shape[0]}x{im.shape[1]} '  # string
            if pred.shape[0]:
                for c in pred[:, -1].unique():
                    n = (pred[:, -1] == c).sum()  # detections per class
                    s += f"{n} {self.names[int(c)]}{'s' * (n > 1)}, "  # add to string
                if show or save or render or crop:
                    annotator = Annotator(im, example=str(self.names))
                    for *box, conf, cls in reversed(pred):  # xyxy, confidence, class
                        label = f'{self.names[int(cls)]} {conf:.2f}'
                        if crop:
                            file = save_dir / 'crops' / self.names[int(cls)] / self.files[i] if save else None
                            crops.append({
                                'box': box,
                                'conf': conf,
                                'cls': cls,
                                'label': label,
                                'im': save_one_box(box, im, file=file, save=save)})
                        else:  # all others
                            annotator.box_label(box, label if labels else '', color=colors(cls))
                    im = annotator.im
            else:
                s += '(no detections)'

            im = Image.fromarray(im.astype(np.uint8)) if isinstance(im, np.ndarray) else im  # from np
            if pprint:
                print(s.rstrip(', '))
            if show:
                im.show(self.files[i])  # show
            if save:
                f = self.files[i]
                im.save(save_dir / f)  # save
                if i == self.n - 1:
                    LOGGER.info(f"Saved {self.n} image{'s' * (self.n > 1)} to {colorstr('bold', save_dir)}")
            if render:
                self.imgs[i] = np.asarray(im)
        if crop:
            if save:
                LOGGER.info(f'Saved results to {save_dir}\n')
            return crops

    def print(self):
        self.display(pprint=True)  # print results
        print(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {tuple(self.s)}' % self.t)

    def show(self, labels=True):
        self.display(show=True, labels=labels)  # show results

    def save(self, labels=True, save_dir='runs/detect/exp'):
        save_dir = increment_path(save_dir, exist_ok=save_dir != 'runs/detect/exp', mkdir=True)  # increment save_dir
        self.display(save=True, labels=labels, save_dir=save_dir)  # save results

    def crop(self, save=True, save_dir='runs/detect/exp'):
        save_dir = increment_path(save_dir, exist_ok=save_dir != 'runs/detect/exp', mkdir=True) if save else None
        return self.display(crop=True, save=save, save_dir=save_dir)  # crop results

    def render(self, labels=True):
        self.display(render=True, labels=labels)  # render results
        return self.imgs

    def pandas(self):
        # return detections as pandas DataFrames, i.e. print(results.pandas().xyxy[0])
        new = copy(self)  # return copy
        ca = 'xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name'  # xyxy columns
        cb = 'xcenter', 'ycenter', 'width', 'height', 'confidence', 'class', 'name'  # xywh columns
        for k, c in zip(['xyxy', 'xyxyn', 'xywh', 'xywhn'], [ca, ca, cb, cb]):
            a = [[x[:5] + [int(x[5]), self.names[int(x[5])]] for x in x.tolist()] for x in getattr(self, k)]  # update
            setattr(new, k, [pd.DataFrame(x, columns=c) for x in a])
        return new

    def tolist(self):
        # return a list of Detections objects, i.e. 'for result in results.tolist():'
        r = range(self.n)  # iterable
        x = [Detections([self.imgs[i]], [self.pred[i]], [self.files[i]], self.times, self.names, self.s) for i in r]
        # for d in x:
        #    for k in ['imgs', 'pred', 'xyxy', 'xyxyn', 'xywh', 'xywhn']:
        #        setattr(d, k, getattr(d, k)[0])  # pop out of list
        return x

    def __len__(self):
        return self.n  # override len(results)

    def __str__(self):
        self.print()  # override print(results)
        return ''


class Classify(nn.Module):
    # Classification head, i.e. x(b,c1,20,20) to x(b,c2)
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.aap = nn.AdaptiveAvgPool2d(1)  # to x(b,c1,1,1)
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g)  # to x(b,c2,1,1)
        self.flat = nn.Flatten()

    def forward(self, x):
        z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1)  # cat if list
        return self.flat(self.conv(z))  # flatten to x(b,c2)


================================================
FILE: RStask/ObjectDetection/models/experimental.py
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Experimental modules
"""
import math

import numpy as np
import torch
import torch.nn as nn

from RStask.ObjectDetection.models.common import Conv
from RStask.ObjectDetection.utils.downloads import attempt_download


class CrossConv(nn.Module):
    # Cross Convolution Downsample
    def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
        # ch_in, ch_out, kernel, stride, groups, expansion, shortcut
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, (1, k), (1, s))
        self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class Sum(nn.Module):
    # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
    def __init__(self, n, weight=False):  # n: number of inputs
        super().__init__()
        self.weight = weight  # apply weights boolean
        self.iter = range(n - 1)  # iter object
        if weight:
            self.w = nn.Parameter(-torch.arange(1.0, n) / 2, requires_grad=True)  # layer weights

    def forward(self, x):
        y = x[0]  # no weight
        if self.weight:
            w = torch.sigmoid(self.w) * 2
            for i in self.iter:
                y = y + x[i + 1] * w[i]
        else:
            for i in self.iter:
                y = y + x[i + 1]
        return y


class MixConv2d(nn.Module):
    # Mixed Depth-wise Conv https://arxiv.org/abs/1907.09595
    def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):  # ch_in, ch_out, kernel, stride, ch_strategy
        super().__init__()
        n = len(k)  # number of convolutions
        if equal_ch:  # equal c_ per group
            i = torch.linspace(0, n - 1E-6, c2).floor()  # c2 indices
            c_ = [(i == g).sum() for g in range(n)]  # intermediate channels
        else:  # equal weight.numel() per group
            b = [c2] + [0] * n
            a = np.eye(n + 1, n, k=-1)
            a -= np.roll(a, 1, axis=1)
            a *= np.array(k) ** 2
            a[0] = 1
            c_ = np.linalg.lstsq(a, b, rcond=None)[0].round()  # solve for equal weight indices, ax = b

        self.m = nn.ModuleList([
            nn.Conv2d(c1, int(c_), k, s, k // 2, groups=math.gcd(c1, int(c_)), bias=False) for k, c_ in zip(k, c_)])
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))


class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super().__init__()

    def forward(self, x, augment=False, profile=False, visualize=False):
        y = []
        for module in self:
            y.append(module(x, augment, profile, visualize)[0])
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.stack(y).mean(0)  # mean ensemble
        y = torch.cat(y, 1)  # nms ensemble
        return y, None  # inference, train output


def attempt_load(weights, map_location=None, inplace=True, fuse=True):
    from RStask.ObjectDetection.models.yolo import Detect, Model

    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        ckpt = torch.load(attempt_download(w), map_location=map_location)  # load
        ckpt = (ckpt.get('ema') or ckpt['model']).float()  # FP32 model
        model.append(ckpt.fuse().eval() if fuse else ckpt.eval())  # fused or un-fused model in eval mode

    # Compatibility updates
    for m in model.modules():
        t = type(m)
        if t in (nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU, Detect, Model):
            m.inplace = inplace  # torch 1.7.0 compatibility
            if t is Detect:
                if not isinstance(m.anchor_grid, list):  # new Detect Layer compatibility
                    delattr(m, 'anchor_grid')
                    setattr(m, 'anchor_grid', [torch.zeros(1)] * m.nl)
        elif t is Conv:
            m._non_persistent_buffers_set = set()  # torch 1.6.0 compatibility
        elif t is nn.Upsample and not hasattr(m, 'recompute_scale_factor'):
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility

    if len(model) == 1:
        return model[-1]  # return model
    else:
        print(f'Ensemble created with {weights}\n')
        for k in 'names', 'nc', 'yaml':
            setattr(model, k, getattr(model[0], k))
        model.stride = model[torch.argmax(torch.tensor([m.stride.max() for m in model])).int()].stride  # max stride
        assert all(model[0].nc == m.nc for m in model), f'Models have different class counts: {[m.nc for m in model]}'
        return model  # return ensemble
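
# Usage sketch (the path is an example, mirroring the one used in YOLOv5.py): load a
# single checkpoint as a fused FP32 model ready for inference.
#
#     model = attempt_load('./checkpoints/yolov5_best.pt', map_location=torch.device('cpu'))
#     model.eval()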


================================================
FILE: RStask/ObjectDetection/models/hub/anchors.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Default anchors for COCO data


# P5 -------------------------------------------------------------------------------------------------------------------
# P5-640:
anchors_p5_640:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
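# Each row above lists three anchors as width,height pairs in pixels for one output
# level, e.g. 10,13 / 16,30 / 33,23 at P3 (stride 8).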


# P6 -------------------------------------------------------------------------------------------------------------------
# P6-640:  thr=0.25: 0.9964 BPR, 5.54 anchors past thr, n=12, img_size=640, metric_all=0.281/0.716-mean/best, past_thr=0.469-mean: 9,11,  21,19,  17,41,  43,32,  39,70,  86,64,  65,131,  134,130,  120,265,  282,180,  247,354,  512,387
anchors_p6_640:
  - [9,11,  21,19,  17,41]  # P3/8
  - [43,32,  39,70,  86,64]  # P4/16
  - [65,131,  134,130,  120,265]  # P5/32
  - [282,180,  247,354,  512,387]  # P6/64

# P6-1280:  thr=0.25: 0.9950 BPR, 5.55 anchors past thr, n=12, img_size=1280, metric_all=0.281/0.714-mean/best, past_thr=0.468-mean: 19,27,  44,40,  38,94,  96,68,  86,152,  180,137,  140,301,  303,264,  238,542,  436,615,  739,380,  925,792
anchors_p6_1280:
  - [19,27,  44,40,  38,94]  # P3/8
  - [96,68,  86,152,  180,137]  # P4/16
  - [140,301,  303,264,  238,542]  # P5/32
  - [436,615,  739,380,  925,792]  # P6/64

# P6-1920:  thr=0.25: 0.9950 BPR, 5.55 anchors past thr, n=12, img_size=1920, metric_all=0.281/0.714-mean/best, past_thr=0.468-mean: 28,41,  67,59,  57,141,  144,103,  129,227,  270,205,  209,452,  455,396,  358,812,  653,922,  1109,570,  1387,1187
anchors_p6_1920:
  - [28,41,  67,59,  57,141]  # P3/8
  - [144,103,  129,227,  270,205]  # P4/16
  - [209,452,  455,396,  358,812]  # P5/32
  - [653,922,  1109,570,  1387,1187]  # P6/64


# P7 -------------------------------------------------------------------------------------------------------------------
# P7-640:  thr=0.25: 0.9962 BPR, 6.76 anchors past thr, n=15, img_size=640, metric_all=0.275/0.733-mean/best, past_thr=0.466-mean: 11,11,  13,30,  29,20,  30,46,  61,38,  39,92,  78,80,  146,66,  79,163,  149,150,  321,143,  157,303,  257,402,  359,290,  524,372
anchors_p7_640:
  - [11,11,  13,30,  29,20]  # P3/8
  - [30,46,  61,38,  39,92]  # P4/16
  - [78,80,  146,66,  79,163]  # P5/32
  - [149,150,  321,143,  157,303]  # P6/64
  - [257,402,  359,290,  524,372]  # P7/128

# P7-1280:  thr=0.25: 0.9968 BPR, 6.71 anchors past thr, n=15, img_size=1280, metric_all=0.273/0.732-mean/best, past_thr=0.463-mean: 19,22,  54,36,  32,77,  70,83,  138,71,  75,173,  165,159,  148,334,  375,151,  334,317,  251,626,  499,474,  750,326,  534,814,  1079,818
anchors_p7_1280:
  - [19,22,  54,36,  32,77]  # P3/8
  - [70,83,  138,71,  75,173]  # P4/16
  - [165,159,  148,334,  375,151]  # P5/32
  - [334,317,  251,626,  499,474]  # P6/64
  - [750,326,  534,814,  1079,818]  # P7/128

# P7-1920:  thr=0.25: 0.9968 BPR, 6.71 anchors past thr, n=15, img_size=1920, metric_all=0.273/0.732-mean/best, past_thr=0.463-mean: 29,34,  81,55,  47,115,  105,124,  207,107,  113,259,  247,238,  222,500,  563,227,  501,476,  376,939,  749,711,  1126,489,  801,1222,  1618,1227
anchors_p7_1920:
  - [29,34,  81,55,  47,115]  # P3/8
  - [105,124,  207,107,  113,259]  # P4/16
  - [247,238,  222,500,  563,227]  # P5/32
  - [501,476,  376,939,  749,711]  # P6/64
  - [1126,489,  801,1222,  1618,1227]  # P7/128


================================================
FILE: RStask/ObjectDetection/models/hub/yolov3-spp.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# darknet53 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, Bottleneck, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, Bottleneck, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, Bottleneck, [1024]],  # 10
  ]
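
# Each backbone/head row reads [from, number, module, args]: e.g. [-1, 1, Conv, [32, 3, 1]]
# takes its input from the previous layer (-1), is repeated once, and builds a Conv with
# 32 output channels, kernel 3 and stride 1; 'number' is scaled by depth_multiple at build time.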

# YOLOv3-SPP head
head:
  [[-1, 1, Bottleneck, [1024, False]],
   [-1, 1, SPP, [512, [5, 9, 13]]],
   [-1, 1, Conv, [1024, 3, 1]],
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [1024, 3, 1]],  # 15 (P5/32-large)

   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],  # 22 (P4/16-medium)

   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Bottleneck, [256, False]],
   [-1, 2, Bottleneck, [256, False]],  # 27 (P3/8-small)

   [[27, 22, 15], 1, Detect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov3-tiny.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,14, 23,27, 37,58]  # P4/16
  - [81,82, 135,169, 344,319]  # P5/32

# YOLOv3-tiny backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [16, 3, 1]],  # 0
   [-1, 1, nn.MaxPool2d, [2, 2, 0]],  # 1-P1/2
   [-1, 1, Conv, [32, 3, 1]],
   [-1, 1, nn.MaxPool2d, [2, 2, 0]],  # 3-P2/4
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, nn.MaxPool2d, [2, 2, 0]],  # 5-P3/8
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, nn.MaxPool2d, [2, 2, 0]],  # 7-P4/16
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, nn.MaxPool2d, [2, 2, 0]],  # 9-P5/32
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, nn.ZeroPad2d, [[0, 1, 0, 1]]],  # 11
   [-1, 1, nn.MaxPool2d, [2, 1, 0]],  # 12
  ]

# YOLOv3-tiny head
head:
  [[-1, 1, Conv, [1024, 3, 1]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],  # 15 (P5/32-large)

   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Conv, [256, 3, 1]],  # 19 (P4/16-medium)

   [[19, 15], 1, Detect, [nc, anchors]],  # Detect(P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov3.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# darknet53 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, Bottleneck, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, Bottleneck, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, Bottleneck, [1024]],  # 10
  ]

# YOLOv3 head
head:
  [[-1, 1, Bottleneck, [1024, False]],
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [1024, 3, 1]],
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [1024, 3, 1]],  # 15 (P5/32-large)

   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],  # 22 (P4/16-medium)

   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Bottleneck, [256, False]],
   [-1, 2, Bottleneck, [256, False]],  # 27 (P3/8-small)

   [[27, 22, 15], 1, Detect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-bifpn.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 BiFPN head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14, 6], 1, Concat, [1]],  # cat P4 <--- BiFPN change
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-fpn.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 FPN head
head:
  [[-1, 3, C3, [1024, False]],  # 10 (P5/32-large)

   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 3, C3, [512, False]],  # 14 (P4/16-medium)

   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 3, C3, [256, False]],  # 18 (P3/8-small)

   [[18, 14, 10], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-p2.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors: 3  # AutoAnchor evolves 3 anchors per P output layer

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head with (P2, P3, P4, P5) outputs
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P2
   [-1, 1, C3, [128, False]],  # 21 (P2/4-xsmall)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 18], 1, Concat, [1]],  # cat head P3
   [-1, 3, C3, [256, False]],  # 24 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 27 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 30 (P5/32-large)

   [[21, 24, 27, 30], 1, Detect, [nc, anchors]],  # Detect(P2, P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-p34.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors: 3  # AutoAnchor evolves 3 anchors per P output layer

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, Conv, [ 64, 6, 2, 2 ] ],  # 0-P1/2
    [ -1, 1, Conv, [ 128, 3, 2 ] ],  # 1-P2/4
    [ -1, 3, C3, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ],  # 3-P3/8
    [ -1, 6, C3, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ],  # 5-P4/16
    [ -1, 9, C3, [ 512 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ],  # 7-P5/32
    [ -1, 3, C3, [ 1024 ] ],
    [ -1, 1, SPPF, [ 1024, 5 ] ],  # 9
  ]

# YOLOv5 v6.0 head with (P3, P4) outputs
head:
  [ [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 6 ], 1, Concat, [ 1 ] ],  # cat backbone P4
    [ -1, 3, C3, [ 512, False ] ],  # 13

    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 4 ], 1, Concat, [ 1 ] ],  # cat backbone P3
    [ -1, 3, C3, [ 256, False ] ],  # 17 (P3/8-small)

    [ -1, 1, Conv, [ 256, 3, 2 ] ],
    [ [ -1, 14 ], 1, Concat, [ 1 ] ],  # cat head P4
    [ -1, 3, C3, [ 512, False ] ],  # 20 (P4/16-medium)

    [ [ 17, 20 ], 1, Detect, [ nc, anchors ] ],  # Detect(P3, P4)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-p6.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors: 3  # AutoAnchor evolves 3 anchors per P output layer

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 11
  ]

# YOLOv5 v6.0 head with (P3, P4, P5, P6) outputs
head:
  [[-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 15

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 19

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-p7.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors: 3  # AutoAnchor evolves 3 anchors per P output layer

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, Conv, [1280, 3, 2]],  # 11-P7/128
   [-1, 3, C3, [1280]],
   [-1, 1, SPPF, [1280, 5]],  # 13
  ]

# YOLOv5 v6.0 head with (P3, P4, P5, P6, P7) outputs
head:
  [[-1, 1, Conv, [1024, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 10], 1, Concat, [1]],  # cat backbone P6
   [-1, 3, C3, [1024, False]],  # 17

   [-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 21

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 25

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 29 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 26], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 32 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 22], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 35 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 18], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 38 (P6/64-xlarge)

   [-1, 1, Conv, [1024, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P7
   [-1, 3, C3, [1280, False]],  # 41 (P7/128-xxlarge)

   [[29, 32, 35, 38, 41], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6, P7)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5-panet.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 PANet head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5l6.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [19,27,  44,40,  38,94]  # P3/8
  - [96,68,  86,152,  180,137]  # P4/16
  - [140,301,  303,264,  238,542]  # P5/32
  - [436,615,  739,380,  925,792]  # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 15

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 19

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5m6.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.67  # model depth multiple
width_multiple: 0.75  # layer channel multiple
anchors:
  - [19,27,  44,40,  38,94]  # P3/8
  - [96,68,  86,152,  180,137]  # P4/16
  - [140,301,  303,264,  238,542]  # P5/32
  - [436,615,  739,380,  925,792]  # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 15

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 19

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5n6.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.25  # layer channel multiple
anchors:
  - [19,27,  44,40,  38,94]  # P3/8
  - [96,68,  86,152,  180,137]  # P4/16
  - [140,301,  303,264,  238,542]  # P5/32
  - [436,615,  739,380,  925,792]  # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 15

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 19

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5s-ghost.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, GhostConv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3Ghost, [128]],
   [-1, 1, GhostConv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3Ghost, [256]],
   [-1, 1, GhostConv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3Ghost, [512]],
   [-1, 1, GhostConv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3Ghost, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, GhostConv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3Ghost, [512, False]],  # 13

   [-1, 1, GhostConv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3Ghost, [256, False]],  # 17 (P3/8-small)

   [-1, 1, GhostConv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3Ghost, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, GhostConv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3Ghost, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5s-transformer.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3TR, [1024]],  # 8 <--- C3TR() Transformer module
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5s6.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [19,27,  44,40,  38,94]  # P3/8
  - [96,68,  86,152,  180,137]  # P4/16
  - [140,301,  303,264,  238,542]  # P5/32
  - [436,615,  739,380,  925,792]  # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 15

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 19

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]


================================================
FILE: RStask/ObjectDetection/models/hub/yolov5x6.yaml
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.33  # model depth multiple
width_multiple: 1.25  # layer channel multiple
anchors:
  - [19,27,  44,40,  38,94]  # P3/8
  - [96,68,  86,152,  180,137]  # P4/16
  - [140,301,  303,264,  238,542]  # P5/32
  - [436,615,  739,380,  925,792]  # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [768, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [768, False]],  # 15

   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 19

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [768, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [768, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [1024, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
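
The hub variants above all share the same [from, number, module, args] layer grammar and are
consumed by parse_model() in RStask/ObjectDetection/models/yolo.py. A minimal sketch of
instantiating one of them, assuming RStask/ObjectDetection is on sys.path and using the
Model(cfg, ch, nc) constructor defined in models/yolo.py; the config path and input size are
illustrative only:

import torch
from models.yolo import Model  # RStask/ObjectDetection/models/yolo.py

model = Model(cfg='models/hub/yolov5s6.yaml', ch=3, nc=80)  # parse the YAML into an nn.Module
model.eval()
with torch.no_grad():
    pred = model(torch.zeros(1, 3, 640, 640))  # dummy BCHW forward; eval mode returns (inference, raw)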


================================================
FILE: RStask/ObjectDetection/models/tf.py
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
TensorFlow, Keras and TFLite versions of YOLOv5
Authored by https://github.com/zldrobit in PR https://github.com/ultralytics/yolov5/pull/1127

Usage:
    $ python models/tf.py --weights yolov5s.pt

Export:
    $ python path/to/export.py --weights yolov5s.pt --include saved_model pb tflite tfjs
"""

import argparse
import sys
from copy import deepcopy
from pathlib import Path

FILE = Path(__file__).resolve()
ROOT = FILE.parents[1]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
# ROOT = ROOT.relative_to(Path.cwd())  # relative

import numpy as np
import tensorflow as tf
import torch
import torch.nn as nn
from tensorflow import keras

from models.common import C3, SPP, SPPF, Bottleneck, BottleneckCSP, Concat, Conv, DWConv, Focus, autopad
from models.experimental import CrossConv, MixConv2d, attempt_load
from models.yolo import Detect
from utils.activations import SiLU
from utils.general import LOGGER, make_divisible, print_args


class TFBN(keras.layers.Layer):
    # TensorFlow BatchNormalization wrapper
    def __init__(self, w=None):
        super().__init__()
        self.bn = keras.layers.BatchNormalization(
            beta_initializer=keras.initializers.Constant(w.bias.numpy()),
            gamma_initializer=keras.initializers.Constant(w.weight.numpy()),
            moving_mean_initializer=keras.initializers.Constant(w.running_mean.numpy()),
            moving_variance_initializer=keras.initializers.Constant(w.running_var.numpy()),
            epsilon=w.eps)

    def call(self, inputs):
        return self.bn(inputs)


class TFPad(keras.layers.Layer):

    def __init__(self, pad):
        super().__init__()
        self.pad = tf.constant([[0, 0], [pad, pad], [pad, pad], [0, 0]])

    def call(self, inputs):
        return tf.pad(inputs, self.pad, mode='constant', constant_values=0)


class TFConv(keras.layers.Layer):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True, w=None):
        # ch_in, ch_out, weights, kernel, stride, padding, groups
        super().__init__()
        assert g == 1, "TF v2.2 Conv2D does not support 'groups' argument"
        assert isinstance(k, int), "Convolution with multiple kernels is not allowed."
        # TensorFlow convolution padding is inconsistent with PyTorch (e.g. k=3 s=2 'SAME' padding)
        # see https://stackoverflow.com/questions/52975843/comparing-conv2d-with-padding-between-tensorflow-and-pytorch
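        # hence for s > 1 the padding is applied explicitly below (TFPad(autopad(k, p)) + 'VALID')
        # to reproduce PyTorch's symmetric padding instead of TF's possibly asymmetric 'SAME'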

        conv = keras.layers.Conv2D(
            c2,
            k,
            s,
            'SAME' if s == 1 else 'VALID',
            use_bias=False if hasattr(w, 'bn') else True,
            kernel_initializer=keras.initializers.Constant(w.conv.weight.permute(2, 3, 1, 0).numpy()),
            bias_initializer='zeros' if hasattr(w, 'bn') else keras.initializers.Constant(w.conv.bias.numpy()))
        self.conv = conv if s == 1 else keras.Sequential([TFPad(autopad(k, p)), conv])
        self.bn = TFBN(w.bn) if hasattr(w, 'bn') else tf.identity

        # YOLOv5 activations
        if isinstance(w.act, nn.LeakyReLU):
            self.act = (lambda x: keras.activations.relu(x, alpha=0.1)) if act else tf.identity
        elif isinstance(w.act, nn.Hardswish):
            self.act = (lambda x: x * tf.nn.relu6(x + 3) * 0.166666667) if act else tf.identity
        elif isinstance(w.act, (nn.SiLU, SiLU)):
            self.act = (lambda x: keras.activations.swish(x)) if act else tf.identity
        else:
            raise Exception(f'no matching TensorFlow activation found for {w.act}')

    def call(self, inputs):
        return self.act(self.bn(self.conv(inputs)))


class TFFocus(keras.layers.Layer):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True, w=None):
        # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = TFConv(c1 * 4, c2, k, s, p, g, act, w.conv)

    def call(self, inputs):  # x(b,w,h,c) -> y(b,w/2,h/2,4c)
        # inputs = inputs / 255  # normalize 0-255 to 0-1
        return self.conv(
            tf.concat(
                [inputs[:, ::2, ::2, :], inputs[:, 1::2, ::2, :], inputs[:, ::2, 1::2, :], inputs[:, 1::2, 1::2, :]],
                3))


class TFBottleneck(keras.layers.Layer):
    # Standard bottleneck
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, w=None):  # ch_in, ch_out, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = TFConv(c1, c_, 1, 1, w=w.cv1)
        self.cv2 = TFConv(c_, c2, 3, 1, g=g, w=w.cv2)
        self.add = shortcut and c1 == c2

    def call(self, inputs):
        return inputs + self.cv2(self.cv1(inputs)) if self.add else self.cv2(self.cv1(inputs))


class TFConv2d(keras.layers.Layer):
    # Substitution for PyTorch nn.Conv2D
    def __init__(self, c1, c2, k, s=1, g=1, bias=True, w=None):
        super().__init__()
        assert g == 1, "TF v2.2 Conv2D does not support 'groups' argument"
        self.conv = keras.layers.Conv2D(
            c2,
            k,
            s,
            'VALID',
            use_bias=bias,
            kernel_initializer=keras.initializers.Constant(w.weight.permute(2, 3, 1, 0).numpy()),
            bias_initializer=keras.initializers.Constant(w.bias.numpy()) if bias else None,
        )

    def call(self, inputs):
        return self.conv(inputs)


class TFBottleneckCSP(keras.layers.Layer):
    # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
        # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = TFConv(c1, c_, 1, 1, w=w.cv1)
        self.cv2 = TFConv2d(c1, c_, 1, 1, bias=False, w=w.cv2)
        self.cv3 = TFConv2d(c_, c_, 1, 1, bias=False, w=w.cv3)
        self.cv4 = TFConv(2 * c_, c2, 1, 1, w=w.cv4)
        self.bn = TFBN(w.bn)
        self.act = lambda x: keras.activations.swish(x)
        self.m = keras.Sequential([TFBottleneck(c_, c_, shortcut, g, e=1.0, w=w.m[j]) for j in range(n)])

    def call(self, inputs):
        y1 = self.cv3(self.m(self.cv1(inputs)))
        y2 = self.cv2(inputs)
        return self.cv4(self.act(self.bn(tf.concat((y1, y2), axis=3))))


class TFC3(keras.layers.Layer):
    # CSP Bottleneck with 3 convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
        # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = TFConv(c1, c_, 1, 1, w=w.cv1)
        self.cv2 = TFConv(c1, c_, 1, 1, w=w.cv2)
        self.cv3 = TFConv(2 * c_, c2, 1, 1, w=w.cv3)
        self.m = keras.Sequential([TFBottleneck(c_, c_, shortcut, g, e=1.0, w=w.m[j]) for j in range(n)])

    def call(self, inputs):
        return self.cv3(tf.concat((self.m(self.cv1(inputs)), self.cv2(inputs)), axis=3))


class TFSPP(keras.layers.Layer):
    # Spatial pyramid pooling layer used in YOLOv3-SPP
    def __init__(self, c1, c2, k=(5, 9, 13), w=None):
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = TFConv(c1, c_, 1, 1, w=w.cv1)
        self.cv2 = TFConv(c_ * (len(k) + 1), c2, 1, 1, w=w.cv2)
        self.m = [keras.layers.MaxPool2D(pool_size=x, strides=1, padding='SAME') for x in k]

    def call(self, inputs):
        x = self.cv1(inputs)
        return self.cv2(tf.concat([x] + [m(x) for m in self.m], 3))


class TFSPPF(keras.layers.Layer):
    # Spatial pyramid pooling-Fast layer
    def __init__(self, c1, c2, k=5, w=None):
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = TFConv(c1, c_, 1, 1, w=w.cv1)
        self.cv2 = TFConv(c_ * 4, c2, 1, 1, w=w.cv2)
        self.m = keras.layers.MaxPool2D(pool_size=k, strides=1, padding='SAME')

    def call(self, inputs):
        x = self.cv1(inputs)
        y1 = self.m(x)
        y2 = self.m(y1)
        return self.cv2(tf.concat([x, y1, y2, self.m(y2)], 3))


class TFDetect(keras.layers.Layer):
    # TF YOLOv5 Detect layer
    def __init__(self, nc=80, anchors=(), ch=(), imgsz=(640, 640), w=None):  # detection layer
        super().__init__()
        self.stride = tf.convert_to_tensor(w.stride.numpy(), dtype=tf.float32)
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor
        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [tf.zeros(1)] * self.nl  # init grid
        self.anchors = tf.convert_to_tensor(w.anchors.numpy(), dtype=tf.float32)
        self.anchor_grid = tf.reshape(self.anchors * tf.reshape(self.stride, [self.nl, 1, 1]), [self.nl, 1, -1, 1, 2])
        self.m = [TFConv2d(x, self.no * self.na, 1, w=w.m[i]) for i, x in enumerate(ch)]
        self.training = False  # set to False after building model
        self.imgsz = imgsz
        for i in range(self.nl):
            ny, nx = self.imgsz[0] // self.stride[i], self.imgsz[1] // self.stride[i]
            self.grid[i] = self._make_grid(nx, ny)

    def call(self, inputs):
        z = []  # inference output
        x = []
        for i in range(self.nl):
            x.append(self.m[i](inputs[i]))
            # x(bs,20,20,255) to x(bs,3,20,20,85)
            ny, nx = self.imgsz[0] // self.stride[i], self.imgsz[1] // self.stride[i]
            x[i] = tf.reshape(x[i], [-1, ny * nx, self.na, self.no])

            if not self.training:  # inference
                y = tf.sigmoid(x[i])
                grid = tf.transpose(self.grid[i], [0, 2, 1, 3]) - 0.5
                anchor_grid = tf.transpose(self.anchor_grid[i], [0, 2, 1, 3]) * 4
                xy = (y[..., 0:2] * 2 + grid) * self.stride[i]  # xy
                wh = y[..., 2:4] ** 2 * anchor_grid
                # Normalize xywh to 0-1 to reduce calibration error
                xy /= tf.constant([[self.imgsz[1], self.imgsz[0]]], dtype=tf.float32)
                wh /= tf.constant([[self.imgsz[1], self.imgsz[0]]], dtype=tf.float32)
                y = tf.concat([xy, wh, y[..., 4:]], -1)
                z.append(tf.reshape(y, [-1, self.na * ny * nx, self.no]))

        return tf.transpose(x, [0, 2, 1, 3]) if self.training else (tf.concat(z, 1), x)

    @staticmethod
    def _make_grid(nx=20, ny=20):
        # yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        # return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()
        xv, yv = tf.meshgrid(tf.range(nx), tf.range(ny))
        return tf.cast(tf.reshape(tf.stack([xv, yv], 2), [1, 1, ny * nx, 2]), dtype=tf.float32)


class TFUpsample(keras.layers.Layer):
    # TF version of torch.nn.Upsample()
    def __init__(self, size, scale_factor, mode, w=None):  # warning: all arguments needed including 'w'
        super().__init__()
        assert scale_factor == 2, "scale_factor must be 2"
        self.upsample = lambda x: tf.image.resize(x, (x.shape[1] * 2, x.shape[2] * 2), method=mode)
        # self.upsample = keras.layers.UpSampling2D(size=scale_factor, interpolation=mode)
        # with default arguments: align_corners=False, half_pixel_centers=False
        # self.upsample = lambda x: tf.raw_ops.ResizeNearestNeighbor(images=x,
        #                                                            size=(x.shape[1] * 2, x.shape[2] * 2))

    def call(self, inputs):
        return self.upsample(inputs)


class TFConcat(keras.layers.Layer):
    # TF version of torch.concat()
    def __init__(self, dimension=1, w=None):
        super().__init__()
        assert dimension == 1, "convert only NCHW to NHWC concat"
        self.d = 3

    def call(self, inputs):
        return tf.concat(inputs, self.d)


def parse_model(d, ch, model, imgsz):  # model_dict, input_channels(3)
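    # Weight transfer: unlike models/yolo.parse_model, which builds fresh PyTorch modules,
    # this routine instantiates the TF* Keras counterparts and hands each one the matching
    # trained PyTorch module via w=model.model[i], so the Keras weights are initialized
    # directly from the PyTorch checkpoint; torch_m_ below is built only to count parameters.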
    LOGGER.info(f"\n{'':>3}{'from':>18}{'n':>3}{'params':>10}  {'module':<40}{'arguments':<30}")
    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        m_str = m
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            try:
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings
            except NameError:
                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in [nn.Conv2d, Conv, Bottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]:
            c1, c2 = ch[f], args[0]
            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2

            args = [c1, c2, *args[1:]]
            if m in [BottleneckCSP, C3]:
                args.insert(2, n)
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum(ch[-1 if x == -1 else x + 1] for x in f)
        elif m is Detect:
            args.append([ch[x + 1] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
            args.append(imgsz)
        else:
            c2 = ch[f]

        tf_m = eval('TF' + m_str.replace('nn.', ''))
        m_ = keras.Sequential([tf_m(*args, w=model.model[i][j]) for j in range(n)]) if n > 1 \
            else tf_m(*args, w=model.model[i])  # module

        torch_m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
        t = str(m)[8:-2].replace('__main__.', '')  # module type
        np = sum(x.numel() for x in torch_m_.parameters())  # number params
        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params
        LOGGER.info(f'{i:>3}{str(f):>18}{str(n):>3}{np:>10}  {t:<40}{str(args):<30}')  # print
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        ch.append(c2)
    return keras.Sequential(layers), sorted(save)


class TFModel:
    # TF YOLOv5 model
    def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, model=None, imgsz=(640, 640)):  # model, channels, classes
        super().__init__()
        if isinstance(cfg, dict):
            self.yaml = cfg  # model dict
        else:  # is *.yaml
            import yaml  # for torch hub
            self.yaml_file = Path(cfg).name
            with open(cfg) as f:
                self.yaml = yaml.load(f, Loader=yaml.FullLoader)  # model dict

        # Define model
        if nc and nc != self.yaml['nc']:
            LOGGER.info(f"Overriding {cfg} nc={self.yaml['nc']} with nc={nc}")
            self.yaml['nc'] = nc  # override yaml value
        self.model, self.savelist = parse_model(deepcopy(self.yaml), ch=[ch], model=model, imgsz=imgsz)

    def predict(self,
                inputs,
                tf_nms=False,
                agnostic_nms=False,
                topk_per_class=100,
                topk_all=100,
                iou_thres=0.45,
                conf_thres=0.25):
        y = []  # outputs
        x = inputs
        for i, m in enumerate(self.model.layers):
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers

            x = m(x)  # run
            y.append(x if m.i in self.savelist else None)  # save output

        # Add TensorFlow NMS
        if tf_nms:
            boxes = self._xywh2xyxy(x[0][..., :4])
            probs = x[0][:, :, 4:5]
            classes = x[0][:, :, 5:]
            scores = probs * classes
            if agnostic_nms:
                nms = AgnosticNMS()((boxes, classes, scores), topk_all, iou_thres, conf_thres)
                return nms, x[1]
            else:
                boxes = tf.expand_dims(boxes, 2)
                nms = tf.image.combined_non_max_suppression(boxes,
                                                            scores,
                                                            topk_per_class,
                                                            topk_all,
                                                            iou_thres,
                                                            conf_thres,
                                                            clip_boxes=False)
                return nms, x[1]

        return x[0]  # output only first tensor [1,6300,85] = [xywh, conf, class0, class1, ...]
        # x = x[0][0]  # [x(1,6300,85), ...] to x(6300,85)
        # xywh = x[..., :4]  # x(6300,4) boxes
        # conf = x[..., 4:5]  # x(6300,1) confidences
        # cls = tf.reshape(tf.cast(tf.argmax(x[..., 5:], axis=1), tf.float32), (-1, 1))  # x(6300,1)  classes
        # return tf.concat([conf, cls, xywh], 1)

    @staticmethod
    def _xywh2xyxy(xywh):
        # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
        x, y, w, h = tf.split(xywh, num_or_size_splits=4, axis=-1)
        return tf.concat([x - w / 2, y - h / 2, x + w / 2, y + h / 2], axis=-1)


class AgnosticNMS(keras.layers.Layer):
    # TF Agnostic NMS
    def call(self, input, topk_all, iou_thres, conf_thres):
        # wrap map_fn to avoid TypeSpec related error https://stackoverflow.com/a/65809989/3036450
        return tf.map_fn(lambda x: self._nms(x, topk_all, iou_thres, conf_thres),
                         input,
                         fn_output_signature=(tf.float32, tf.float32, tf.float32, tf.int32),
                         name='agnostic_nms')

    @staticmethod
    def _nms(x, topk_all=100, iou_thres=0.45, conf_thres=0.25):  # agnostic NMS
        boxes, classes, scores = x
        class_inds = tf.cast(tf.argmax(classes, axis=-1), tf.float32)
        scores_inp = tf.reduce_max(scores, -1)
        selected_inds = tf.image.non_max_suppression(boxes,
                                                     scores_inp,
                                                     max_output_size=topk_all,
                                                     iou_threshold=iou_thres,
                                                     score_threshold=conf_thres)
        selected_boxes = tf.gather(boxes, selected_inds)
        padded_boxes = tf.pad(selected_boxes,
                              paddings=[[0, topk_all - tf.shape(selected_boxes)[0]], [0, 0]],
                              mode="CONSTANT",
                              constant_values=0.0)
        selected_scores = tf.gather(scores_inp, selected_inds)
        padded_scores = tf.pad(selected_scores,
                               paddings=[[0, topk_all - tf.shape(selected_boxes)[0]]],
                               mode="CONSTANT",
                               constant_values=-1.0)
        selected_classes = tf.gather(class_inds, selected_inds)
        padded_classes = tf.pad(selected_classes,
                                paddings=[[0, topk_all - tf.shape(selected_boxes)[0]]],
                                mode="CONSTANT",
                                constant_values=-1.0)
        valid_detections = tf.shape(selected_inds)[0]
        return padded_boxes, padded_scores, padded_classes, valid_detections


def representative_dataset_gen(dataset, ncalib=100):
    # Representative dataset generator for use with converter.representative_dataset, returns a generator of np arrays
    for n, (path, img, im0s, vid_cap, string) in enumerate(dataset):
        input = np.transpose(img, [1, 2, 0])
        input = np.expand_dims(input, axis=0).astype(np.float32)
        input /= 255
        yield [input]
        if n >= ncalib:
            break
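

# A hedged sketch (assumed usage, not upstream code): wiring representative_dataset_gen()
# into a TFLite converter for int8 post-training quantization. `saved_model_dir` and `dataset`
# are hypothetical inputs produced elsewhere (e.g. an exported SavedModel and a
# utils.datasets.LoadImages iterator).
def export_tflite_int8(saved_model_dir, dataset, ncalib=100):
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = lambda: representative_dataset_gen(dataset, ncalib)
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()  # bytes of the quantized .tflite model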


def run(
        weights=ROOT / 'yolov5s.pt',  # weights path
        imgsz=(640, 640),  # inference size h,w
        batch_size=1,  # batch size
        dynamic=False,  # dynamic batch size
):
    # PyTorch model
    im = torch.zeros((batch_size, 3, *imgsz))  # BCHW image
    model = attempt_load(weights, map_location=torch.device('cpu'), inplace=True, fuse=False)
    _ = model(im)  # inference
    model.info()

    # TensorFlow model
    im = tf.zeros((batch_size, *imgsz, 3))  # BHWC image
    tf_model = TFModel(cfg=model.yaml, model=model, nc=model.nc, imgsz=imgsz)
    _ = tf_model.predict(im)  # inference

    # Keras model
    im = keras.Input(shape=(*imgsz, 3), batch_size=None if dynamic else batch_size)
    keras_model = keras.Model(inputs=im, outputs=tf_model.predict(im))
    keras_model.summary()
SYMBOL INDEX (573 symbols across 34 files)

FILE: RSChatGPT-shell.py
  function prompts (line 19) | def prompts(name, description):
  function get_new_image_name (line 25) | def get_new_image_name(org_img_name, func_name="update"):
  class EdgeDetection (line 35) | class EdgeDetection:
    method __init__ (line 36) | def __init__(self, device):
    method inference (line 44) | def inference(self, inputs):
  class ObjectCounting (line 49) | class ObjectCounting:
    method __init__ (line 50) | def __init__(self, device):
    method inference (line 57) | def inference(self, inputs):
  class InstanceSegmentation (line 63) | class InstanceSegmentation:
    method __init__ (line 64) | def __init__(self, device):
    method inference (line 73) | def inference(self, inputs):
  class SceneClassification (line 79) | class SceneClassification:
    method __init__ (line 80) | def __init__(self, device):
    method inference (line 88) | def inference(self, inputs):
  class LandUseSegmentation (line 93) | class LandUseSegmentation:
    method __init__ (line 94) | def __init__(self, device):
    method inference (line 104) | def inference(self, inputs):
  class ObjectDetection (line 110) | class ObjectDetection:
    method __init__ (line 111) | def __init__(self, device):
    method inference (line 121) | def inference(self, inputs):
  class ImageCaptioning (line 127) | class ImageCaptioning:
    method __init__ (line 128) | def __init__(self, device):
    method inference (line 135) | def inference(self, image_path):
  class RSChatGPT (line 140) | class RSChatGPT:
    method __init__ (line 141) | def __init__(self, gpt_name,load_dict,openai_key,proxy_url):
    method initialize (line 171) | def initialize(self):
    method run_text (line 183) | def run_text(self, text, state):
    method run_image (line 191) | def run_image(self, image_dir, state, txt=None):

FILE: RStask/EdgeDetection/Canny.py
  class Image2Canny (line 4) | class Image2Canny:
    method __init__ (line 5) | def __init__(self):
    method inference (line 10) | def inference(self, inputs,new_image_name):

FILE: RStask/ImageCaptioning/blip.py
  class BLIP (line 5) | class BLIP:
    method __init__ (line 6) | def __init__(self, device):
    method inference (line 12) | def inference(self, image_path):

FILE: RStask/InstanceSegmentation/SwinUpper.py
  class SwinInstance (line 6) | class SwinInstance:
    method __init__ (line 7) | def __init__(self, device):
    method inference (line 24) | def inference(self, image_path, det_prompt ,updated_image_path):

FILE: RStask/InstanceSegmentation/model.py
  class SegmentationHead (line 7) | class SegmentationHead(nn.Sequential):
    method __init__ (line 8) | def __init__(self, in_channels, out_channels, kernel_size=3, activatio...
  class ClassificationHead (line 15) | class ClassificationHead(nn.Sequential):
    method __init__ (line 16) | def __init__(self, in_channels, classes, pooling="avg", dropout=0.2, a...
  function initialize_decoder (line 27) | def initialize_decoder(module):
  function initialize_head (line 45) | def initialize_head(module):
  class SwinUPer (line 56) | class SwinUPer(torch.nn.Module):
    method __init__ (line 57) | def __init__(self, classes: int = 16):
    method initialize (line 77) | def initialize(self):
    method forward (line 80) | def forward(self, x):

FILE: RStask/InstanceSegmentation/swin.py
  class Mlp (line 26) | class Mlp(nn.Module):
    method __init__ (line 29) | def __init__(self, in_features, hidden_features=None, out_features=Non...
    method forward (line 38) | def forward(self, x):
  function window_partition (line 47) | def window_partition(x, window_size):
  function window_reverse (line 62) | def window_reverse(windows, window_size, H, W):
  class WindowAttention (line 79) | class WindowAttention(nn.Module):
    method __init__ (line 93) | def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scal...
    method forward (line 127) | def forward(self, x, mask=None):
  class SwinTransformerBlock (line 162) | class SwinTransformerBlock(nn.Module):
    method __init__ (line 180) | def __init__(self, dim, num_heads, window_size=7, shift_size=0,
    method forward (line 204) | def forward(self, x, mask_matrix):
  class PatchMerging (line 264) | class PatchMerging(nn.Module):
    method __init__ (line 271) | def __init__(self, dim, norm_layer=nn.LayerNorm):
    method forward (line 277) | def forward(self, x, H, W):
  class BasicLayer (line 307) | class BasicLayer(nn.Module):
    method __init__ (line 326) | def __init__(self,
    method forward (line 368) | def forward(self, x, H, W):
  class PatchEmbed (line 411) | class PatchEmbed(nn.Module):
    method __init__ (line 421) | def __init__(self, patch_size=4, in_chans=3, embed_dim=96, norm_layer=...
    method forward (line 435) | def forward(self, x):
  class swin (line 455) | class swin(nn.Module):
    method __init__ (line 484) | def __init__(self,
    method _freeze_stages (line 580) | def _freeze_stages(self):
    method init_weights (line 597) | def init_weights(self, pretrained):
    method forward (line 665) | def forward(self, x):
    method train (line 694) | def train(self, mode=True):

FILE: RStask/InstanceSegmentation/uper.py
  function resize (line 17) | def resize(input,
  class PPM (line 38) | class PPM(nn.ModuleList):
    method __init__ (line 52) | def __init__(self, pool_scales, in_channels, channels, conv_cfg, norm_...
    method forward (line 75) | def forward(self, x):
  class BaseDecodeHead (line 88) | class BaseDecodeHead(BaseModule, metaclass=ABCMeta):
    method __init__ (line 154) | def __init__(self,
    method extra_repr (line 233) | def extra_repr(self):
    method _init_inputs (line 240) | def _init_inputs(self, in_channels, in_index, input_transform):
    method _transform_inputs (line 278) | def _transform_inputs(self, inputs):
    method forward (line 306) | def forward(self, inputs):
    method cls_seg (line 310) | def cls_seg(self, feat):
    method predict (line 318) | def predict(self, inputs: Tuple[Tensor], batch_img_metas: List[dict],
    method _stack_batch_gt (line 338) | def _stack_batch_gt(self, batch_data_samples) -> Tensor:
    method predict_by_feat (line 345) | def predict_by_feat(self, seg_logits: Tensor,
  class UPerHead (line 366) | class UPerHead(BaseDecodeHead):
    method __init__ (line 377) | def __init__(self, pool_scales=(1, 2, 3, 6), **kwargs):
    method psp_forward (line 429) | def psp_forward(self, inputs):
    method _forward_feature (line 439) | def _forward_feature(self, inputs):
    method forward (line 488) | def forward(self, *inputs):

FILE: RStask/LanduseSegmentation/seg_hrnet.py
  function conv3x3 (line 17) | def conv3x3(in_planes, out_planes, stride=1):
  class BasicBlock (line 23) | class BasicBlock(nn.Module):
    method __init__ (line 26) | def __init__(self, inplanes, planes, stride=1, downsample=None):
    method forward (line 36) | def forward(self, x):
  class Bottleneck (line 55) | class Bottleneck(nn.Module):
    method __init__ (line 58) | def __init__(self, inplanes, planes, stride=1, downsample=None):
    method forward (line 73) | def forward(self, x):
  class HighResolutionModule (line 96) | class HighResolutionModule(nn.Module):
    method __init__ (line 97) | def __init__(self, num_branches, blocks, num_blocks, num_inchannels,
    method _check_branches (line 114) | def _check_branches(self, num_branches, blocks, num_blocks,
    method _make_one_branch (line 131) | def _make_one_branch(self, branch_index, block, num_blocks, num_channels,
    method _make_branches (line 155) | def _make_branches(self, num_branches, block, num_blocks, num_channels):
    method _make_fuse_layers (line 164) | def _make_fuse_layers(self):
    method get_num_inchannels (line 210) | def get_num_inchannels(self):
    method forward (line 213) | def forward(self, x):
  class hrmodel (line 245) | class hrmodel(nn.Module):
    method __init__ (line 246) | def __init__(self):
    method _make_transition_layer (line 298) | def _make_transition_layer(
    method _make_layer (line 334) | def _make_layer(self, block, inplanes, planes, blocks, stride=1):
    method _make_stage (line 351) | def _make_stage(self, layer_config, num_inchannels,
    method forward (line 380) | def forward(self, x,gts=None):
  class ConvModule (line 434) | class ConvModule(nn.Module):
    method __init__ (line 435) | def __init__(self):
    method forward (line 440) | def forward(self,x):
  class HRNet48 (line 442) | class HRNet48(nn.Module):
    method __init__ (line 443) | def __init__(self,device):
    method visualize (line 458) | def visualize(self,pred,cls):
    method inference (line 473) | def inference(self,image_path, det_prompt,updated_image_path):

FILE: RStask/ObjectCounting/Yolocounting.py
  class YoloCounting (line 7) | class YoloCounting:
    method __init__ (line 8) | def __init__(self, device):
    method inference (line 21) | def inference(self, image_path, det_prompt):
    method non_max_suppression (line 57) | def non_max_suppression(self, prediction,

FILE: RStask/ObjectDetection/YOLOv5.py
  class YoloDetection (line 8) | class YoloDetection:
    method __init__ (line 9) | def __init__(self, device):
    method inference (line 20) | def inference(self, image_path, det_prompt,updated_image_path):
    method visualize (line 42) | def visualize(self,image_path, newpic_path,detections):
    method non_max_suppression (line 54) | def non_max_suppression(self, prediction,

FILE: RStask/ObjectDetection/models/common.py
  function autopad (line 31) | def autopad(k, p=None):  # kernel, padding
  class Conv (line 38) | class Conv(nn.Module):
    method __init__ (line 40) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in,...
    method forward (line 46) | def forward(self, x):
    method forward_fuse (line 49) | def forward_fuse(self, x):
  class DWConv (line 53) | class DWConv(Conv):
    method __init__ (line 55) | def __init__(self, c1, c2, k=1, s=1, act=True):  # ch_in, ch_out, kern...
  class TransformerLayer (line 59) | class TransformerLayer(nn.Module):
    method __init__ (line 61) | def __init__(self, c, num_heads):
    method forward (line 70) | def forward(self, x):
  class TransformerBlock (line 76) | class TransformerBlock(nn.Module):
    method __init__ (line 78) | def __init__(self, c1, c2, num_heads, num_layers):
    method forward (line 87) | def forward(self, x):
  class Bottleneck (line 95) | class Bottleneck(nn.Module):
    method __init__ (line 97) | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5):  # ch_in, ch_ou...
    method forward (line 104) | def forward(self, x):
  class BottleneckCSP (line 108) | class BottleneckCSP(nn.Module):
    method __init__ (line 110) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ...
    method forward (line 121) | def forward(self, x):
  class C3 (line 127) | class C3(nn.Module):
    method __init__ (line 129) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ...
    method forward (line 138) | def forward(self, x):
  class C3TR (line 142) | class C3TR(C3):
    method __init__ (line 144) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
  class C3SPP (line 150) | class C3SPP(C3):
    method __init__ (line 152) | def __init__(self, c1, c2, k=(5, 9, 13), n=1, shortcut=True, g=1, e=0.5):
  class C3Ghost (line 158) | class C3Ghost(C3):
    method __init__ (line 160) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
  class SPP (line 166) | class SPP(nn.Module):
    method __init__ (line 168) | def __init__(self, c1, c2, k=(5, 9, 13)):
    method forward (line 175) | def forward(self, x):
  class SPPF (line 182) | class SPPF(nn.Module):
    method __init__ (line 184) | def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
    method forward (line 191) | def forward(self, x):
  class Focus (line 200) | class Focus(nn.Module):
    method __init__ (line 202) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in,...
    method forward (line 207) | def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
  class GhostConv (line 212) | class GhostConv(nn.Module):
    method __init__ (line 214) | def __init__(self, c1, c2, k=1, s=1, g=1, act=True):  # ch_in, ch_out,...
    method forward (line 220) | def forward(self, x):
  class GhostBottleneck (line 225) | class GhostBottleneck(nn.Module):
    method __init__ (line 227) | def __init__(self, c1, c2, k=3, s=1):  # ch_in, ch_out, kernel, stride
    method forward (line 237) | def forward(self, x):
  class Contract (line 241) | class Contract(nn.Module):
    method __init__ (line 243) | def __init__(self, gain=2):
    method forward (line 247) | def forward(self, x):
  class Expand (line 255) | class Expand(nn.Module):
    method __init__ (line 257) | def __init__(self, gain=2):
    method forward (line 261) | def forward(self, x):
  class Concat (line 269) | class Concat(nn.Module):
    method __init__ (line 271) | def __init__(self, dimension=1):
    method forward (line 275) | def forward(self, x):
  class DetectMultiBackend (line 279) | class DetectMultiBackend(nn.Module):
    method __init__ (line 281) | def __init__(self, weights='yolov5s.pt', device=torch.device('cpu'), d...
    method forward (line 304) | def forward(self, im, augment=False, visualize=False, val=False):
  class AutoShape (line 316) | class AutoShape(nn.Module):
    method __init__ (line 326) | def __init__(self, model):
    method _apply (line 334) | def _apply(self, fn):
    method forward (line 346) | def forward(self, imgs, size=640, augment=False, profile=False):
  class Detections (line 408) | class Detections:
    method __init__ (line 410) | def __init__(self, imgs, pred, files, times=(0, 0, 0, 0), names=None, ...
    method display (line 427) | def display(self, pprint=False, show=False, save=False, crop=False, re...
    method print (line 470) | def print(self):
    method show (line 474) | def show(self, labels=True):
    method save (line 477) | def save(self, labels=True, save_dir='runs/detect/exp'):
    method crop (line 481) | def crop(self, save=True, save_dir='runs/detect/exp'):
    method render (line 485) | def render(self, labels=True):
    method pandas (line 489) | def pandas(self):
    method tolist (line 499) | def tolist(self):
    method __len__ (line 508) | def __len__(self):
    method __str__ (line 511) | def __str__(self):
  class Classify (line 516) | class Classify(nn.Module):
    method __init__ (line 518) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1):  # ch_in, ch_out, k...
    method forward (line 524) | def forward(self, x):

FILE: RStask/ObjectDetection/models/experimental.py
  class CrossConv (line 15) | class CrossConv(nn.Module):
    method __init__ (line 17) | def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
    method forward (line 25) | def forward(self, x):
  class Sum (line 29) | class Sum(nn.Module):
    method __init__ (line 31) | def __init__(self, n, weight=False):  # n: number of inputs
    method forward (line 38) | def forward(self, x):
  class MixConv2d (line 50) | class MixConv2d(nn.Module):
    method __init__ (line 52) | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):  # ch_in, ch...
    method forward (line 71) | def forward(self, x):
  class Ensemble (line 75) | class Ensemble(nn.ModuleList):
    method __init__ (line 77) | def __init__(self):
    method forward (line 80) | def forward(self, x, augment=False, profile=False, visualize=False):
  function attempt_load (line 90) | def attempt_load(weights, map_location=None, inplace=True, fuse=True):

FILE: RStask/ObjectDetection/models/tf.py
  class TFBN (line 37) | class TFBN(keras.layers.Layer):
    method __init__ (line 39) | def __init__(self, w=None):
    method call (line 48) | def call(self, inputs):
  class TFPad (line 52) | class TFPad(keras.layers.Layer):
    method __init__ (line 54) | def __init__(self, pad):
    method call (line 58) | def call(self, inputs):
  class TFConv (line 62) | class TFConv(keras.layers.Layer):
    method __init__ (line 64) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True, w=None):
    method call (line 93) | def call(self, inputs):
  class TFFocus (line 97) | class TFFocus(keras.layers.Layer):
    method __init__ (line 99) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True, w=None):
    method call (line 104) | def call(self, inputs):  # x(b,w,h,c) -> y(b,w/2,h/2,4c)
  class TFBottleneck (line 112) | class TFBottleneck(keras.layers.Layer):
    method __init__ (line 114) | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, w=None):  # ch_i...
    method call (line 121) | def call(self, inputs):
  class TFConv2d (line 125) | class TFConv2d(keras.layers.Layer):
    method __init__ (line 127) | def __init__(self, c1, c2, k, s=1, g=1, bias=True, w=None):
    method call (line 140) | def call(self, inputs):
  class TFBottleneckCSP (line 144) | class TFBottleneckCSP(keras.layers.Layer):
    method __init__ (line 146) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
    method call (line 158) | def call(self, inputs):
  class TFC3 (line 164) | class TFC3(keras.layers.Layer):
    method __init__ (line 166) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
    method call (line 175) | def call(self, inputs):
  class TFSPP (line 179) | class TFSPP(keras.layers.Layer):
    method __init__ (line 181) | def __init__(self, c1, c2, k=(5, 9, 13), w=None):
    method call (line 188) | def call(self, inputs):
  class TFSPPF (line 193) | class TFSPPF(keras.layers.Layer):
    method __init__ (line 195) | def __init__(self, c1, c2, k=5, w=None):
    method call (line 202) | def call(self, inputs):
  class TFDetect (line 209) | class TFDetect(keras.layers.Layer):
    method __init__ (line 211) | def __init__(self, nc=80, anchors=(), ch=(), imgsz=(640, 640), w=None)...
    method call (line 228) | def call(self, inputs):
    method _make_grid (line 252) | def _make_grid(nx=20, ny=20):
  class TFUpsample (line 259) | class TFUpsample(keras.layers.Layer):
    method __init__ (line 261) | def __init__(self, size, scale_factor, mode, w=None):  # warning: all ...
    method call (line 270) | def call(self, inputs):
  class TFConcat (line 274) | class TFConcat(keras.layers.Layer):
    method __init__ (line 276) | def __init__(self, dimension=1, w=None):
    method call (line 281) | def call(self, inputs):
  function parse_model (line 285) | def parse_model(d, ch, model, imgsz):  # model_dict, input_channels(3)
  class TFModel (line 337) | class TFModel:
    method __init__ (line 339) | def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, model=None, imgs...
    method predict (line 355) | def predict(self,
    method _xywh2xyxy (line 400) | def _xywh2xyxy(xywh):
  class AgnosticNMS (line 406) | class AgnosticNMS(keras.layers.Layer):
    method call (line 408) | def call(self, input, topk_all, iou_thres, conf_thres):
    method _nms (line 416) | def _nms(x, topk_all=100, iou_thres=0.45, conf_thres=0.25):  # agnosti...
  function representative_dataset_gen (line 444) | def representative_dataset_gen(dataset, ncalib=100):
  function run (line 455) | def run(
  function parse_opt (line 480) | def parse_opt():
  function main (line 492) | def main(opt):

FILE: RStask/ObjectDetection/models/yolo.py
  class Detect (line 37) | class Detect(nn.Module):
    method __init__ (line 42) | def __init__(self, nc=80, anchors=(), ch=(), inplace=True):  # detecti...
    method forward (line 54) | def forward(self, x):
    method _make_grid (line 78) | def _make_grid(self, nx=20, ny=20, i=0):
  class Model (line 92) | class Model(nn.Module):
    method __init__ (line 94) | def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, anchors=None):  ...
    method forward (line 132) | def forward(self, x, augment=False, profile=False, visualize=False):
    method _forward_augment (line 137) | def _forward_augment(self, x):
    method _forward_once (line 151) | def _forward_once(self, x, profile=False, visualize=False):
    method _descale_pred (line 164) | def _descale_pred(self, p, flips, scale, img_size):
    method _clip_augmented (line 181) | def _clip_augmented(self, y):
    method _profile_one_layer (line 192) | def _profile_one_layer(self, m, x, dt):
    method _initialize_biases (line 205) | def _initialize_biases(self, cf=None):  # initialize biases into Detec...
    method _print_biases (line 215) | def _print_biases(self):
    method fuse (line 227) | def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
    method info (line 236) | def info(self, verbose=False, img_size=640):  # print model information
    method _apply (line 239) | def _apply(self, fn):
  function parse_model (line 251) | def parse_model(d, ch):  # model_dict, input_channels(3)

FILE: RStask/ObjectDetection/utils/__init__.py
  function notebook_init (line 7) | def notebook_init(verbose=True):

FILE: RStask/ObjectDetection/utils/activations.py
  class SiLU (line 11) | class SiLU(nn.Module):
    method forward (line 14) | def forward(x):
  class Hardswish (line 18) | class Hardswish(nn.Module):
    method forward (line 21) | def forward(x):
  class Mish (line 26) | class Mish(nn.Module):
    method forward (line 29) | def forward(x):
  class MemoryEfficientMish (line 33) | class MemoryEfficientMish(nn.Module):
    class F (line 35) | class F(torch.autograd.Function):
      method forward (line 38) | def forward(ctx, x):
      method backward (line 43) | def backward(ctx, grad_output):
    method forward (line 49) | def forward(self, x):
  class FReLU (line 53) | class FReLU(nn.Module):
    method __init__ (line 55) | def __init__(self, c1, k=3):  # ch_in, kernel
    method forward (line 60) | def forward(self, x):
  class AconC (line 64) | class AconC(nn.Module):
    method __init__ (line 70) | def __init__(self, c1):
    method forward (line 76) | def forward(self, x):
  class MetaAconC (line 81) | class MetaAconC(nn.Module):
    method __init__ (line 87) | def __init__(self, c1, k=1, s=1, r=16):  # ch_in, kernel, stride, r
    method forward (line 97) | def forward(self, x):
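These classes are drop-in nn.Module activations; a tiny sketch of calling two of them on a random tensor (shapes are illustrative only):

    import torch
    from RStask.ObjectDetection.utils.activations import SiLU, Hardswish

    x = torch.randn(1, 8, 4, 4)
    y = SiLU()(x)        # export-friendly x * sigmoid(x)
    z = Hardswish()(x)   # x * hardtanh(x + 3, 0, 6) / 6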

FILE: RStask/ObjectDetection/utils/augmentations.py
  class Albumentations (line 16) | class Albumentations:
    method __init__ (line 18) | def __init__(self):
    method __call__ (line 40) | def __call__(self, im, labels, p=1.0):
  function augment_hsv (line 47) | def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5):
  function hist_equalize (line 63) | def hist_equalize(im, clahe=True, bgr=False):
  function replicate (line 74) | def replicate(im, labels):
  function letterbox (line 91) | def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True...
  function random_perspective (line 124) | def random_perspective(im,
  function copy_paste (line 220) | def copy_paste(im, labels, segments, p=0.5):
  function cutout (line 244) | def cutout(im, labels, p=0.5):
  function mixup (line 271) | def mixup(im, labels, im2, labels2):
  function box_candidates (line 279) | def box_candidates(box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1...
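letterbox is the resize-and-pad step applied before inference; a minimal sketch is below (the image path is a placeholder, and auto=False pads to exactly 640x640 instead of the nearest stride multiple):

    import cv2
    from RStask.ObjectDetection.utils.augmentations import letterbox

    im0 = cv2.imread('scene.jpg')                      # original BGR image, HWC
    im, ratio, (dw, dh) = letterbox(im0, new_shape=640, auto=False)
    print(im.shape, ratio, (dw, dh))                   # padded image, scale factors, padding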

FILE: RStask/ObjectDetection/utils/autoanchor.py
  function check_anchor_order (line 18) | def check_anchor_order(m):
  function check_anchors (line 28) | def check_anchors(dataset, model, thr=4.0, imgsz=640):
  function kmean_anchors (line 68) | def kmean_anchors(dataset='./data/coco128.yaml', n=9, img_size=640, thr=...

FILE: RStask/ObjectDetection/utils/autobatch.py
  function check_train_batch_size (line 16) | def check_train_batch_size(model, imgsz=640):
  function autobatch (line 22) | def autobatch(model, imgsz=640, fraction=0.9, batch_size=16):

FILE: RStask/ObjectDetection/utils/benchmarks.py
  function run (line 48) | def run(
  function test (line 96) | def test(
  function parse_opt (line 128) | def parse_opt():
  function main (line 143) | def main(opt):

FILE: RStask/ObjectDetection/utils/callbacks.py
  class Callbacks (line 7) | class Callbacks:
    method __init__ (line 12) | def __init__(self):
    method register_action (line 36) | def register_action(self, hook, name='', callback=None):
    method get_registered_actions (line 49) | def get_registered_actions(self, hook=None):
    method run (line 58) | def run(self, hook, *args, **kwargs):
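Callbacks is a small hook registry used by the training loop; a sketch of registering and firing a hook (the hook name must be one of the keys predefined in __init__, and the lambda is purely illustrative):

    from RStask.ObjectDetection.utils.callbacks import Callbacks

    callbacks = Callbacks()
    callbacks.register_action('on_train_start', name='greet',
                              callback=lambda: print('training started'))
    callbacks.run('on_train_start')   # invokes every callback registered for that hook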

FILE: RStask/ObjectDetection/utils/datasets.py
  function get_hash (line 47) | def get_hash(paths):
  function exif_size (line 55) | def exif_size(img):
  function exif_transpose (line 70) | def exif_transpose(image):
  function create_dataloader (line 96) | def create_dataloader(path,
  class InfiniteDataLoader (line 144) | class InfiniteDataLoader(dataloader.DataLoader):
    method __init__ (line 150) | def __init__(self, *args, **kwargs):
    method __len__ (line 155) | def __len__(self):
    method __iter__ (line 158) | def __iter__(self):
  class _RepeatSampler (line 163) | class _RepeatSampler:
    method __init__ (line 170) | def __init__(self, sampler):
    method __iter__ (line 173) | def __iter__(self):
  class LoadImages (line 178) | class LoadImages:
    method __init__ (line 180) | def __init__(self, path, img_size=640, stride=32, auto=True):
    method __iter__ (line 209) | def __iter__(self):
    method __next__ (line 213) | def __next__(self):
    method new_video (line 251) | def new_video(self, path):
    method __len__ (line 256) | def __len__(self):
  class LoadWebcam (line 260) | class LoadWebcam:  # for inference
    method __init__ (line 262) | def __init__(self, pipe='0', img_size=640, stride=32):
    method __iter__ (line 269) | def __iter__(self):
    method __next__ (line 273) | def __next__(self):
    method __len__ (line 298) | def __len__(self):
  class LoadStreams (line 302) | class LoadStreams:
    method __init__ (line 304) | def __init__(self, sources='streams.txt', img_size=640, stride=32, aut...
    method update (line 347) | def update(self, i, cap, stream):
    method __iter__ (line 364) | def __iter__(self):
    method __next__ (line 368) | def __next__(self):
    method __len__ (line 387) | def __len__(self):
  function img2label_paths (line 391) | def img2label_paths(img_paths):
  class LoadImagesAndLabels (line 397) | class LoadImagesAndLabels(Dataset):
    method __init__ (line 401) | def __init__(self,
    method cache_labels (line 535) | def cache_labels(self, path=Path('./labels.cache'), prefix=''):
    method __len__ (line 573) | def __len__(self):
    method __getitem__ (line 582) | def __getitem__(self, index):
    method load_image (line 637) | def load_image(self, i):
    method cache_images_to_disk (line 655) | def cache_images_to_disk(self, i):
    method load_mosaic (line 661) | def load_mosaic(self, index):
    method load_mosaic9 (line 719) | def load_mosaic9(self, index):
    method collate_fn (line 796) | def collate_fn(batch):
    method collate_fn4 (line 803) | def collate_fn4(batch):
  function create_folder (line 830) | def create_folder(path='./new'):
  function flatten_recursive (line 837) | def flatten_recursive(path=DATASETS_DIR / 'coco128'):
  function extract_boxes (line 845) | def extract_boxes(path=DATASETS_DIR / 'coco128'):  # from utils.datasets...
  function autosplit (line 879) | def autosplit(path=DATASETS_DIR / 'coco128/images', weights=(0.9, 0.1, 0...
  function verify_image_label (line 903) | def verify_image_label(args):
  function dataset_stats (line 955) | def dataset_stats(path='coco128.yaml', autodownload=False, verbose=False...

FILE: RStask/ObjectDetection/utils/downloads.py
  function gsutil_getsize (line 19) | def gsutil_getsize(url=''):
  function safe_download (line 25) | def safe_download(file, url, url2=None, min_bytes=1E0, error_msg=''):
  function attempt_download (line 46) | def attempt_download(file, repo='ultralytics/yolov5'):  # from utils.dow...
  function gdrive_download (line 90) | def gdrive_download(id='16TiPfZj7htmTyhntwcZyEEAejOUxuT6m', file='tmp.zi...
  function get_token (line 125) | def get_token(cookie="./cookie"):
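attempt_download fetches a release checkpoint only when it is missing locally; a sketch using the upstream yolov5s.pt asset (the checkpoint this project actually loads may differ):

    from RStask.ObjectDetection.utils.downloads import attempt_download

    weights = attempt_download('yolov5s.pt')   # returns the local path, downloading from the
                                               # ultralytics/yolov5 releases on a cache miss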

FILE: RStask/ObjectDetection/utils/flask_rest_api/restapi.py
  function predict (line 19) | def predict():

FILE: RStask/ObjectDetection/utils/general.py
  function is_kaggle (line 55) | def is_kaggle():
  function is_writeable (line 65) | def is_writeable(dir, test=False):
  function set_logging (line 80) | def set_logging(name=None, verbose=VERBOSE):
  function user_config_dir (line 99) | def user_config_dir(dir='Ultralytics', env_var='YOLOV5_CONFIG_DIR'):
  class Profile (line 115) | class Profile(contextlib.ContextDecorator):
    method __enter__ (line 117) | def __enter__(self):
    method __exit__ (line 120) | def __exit__(self, type, value, traceback):
  class Timeout (line 124) | class Timeout(contextlib.ContextDecorator):
    method __init__ (line 126) | def __init__(self, seconds, *, timeout_msg='', suppress_timeout_errors...
    method _timeout_handler (line 131) | def _timeout_handler(self, signum, frame):
    method __enter__ (line 134) | def __enter__(self):
    method __exit__ (line 139) | def __exit__(self, exc_type, exc_val, exc_tb):
  class WorkingDirectory (line 146) | class WorkingDirectory(contextlib.ContextDecorator):
    method __init__ (line 148) | def __init__(self, new_dir):
    method __enter__ (line 152) | def __enter__(self):
    method __exit__ (line 155) | def __exit__(self, exc_type, exc_val, exc_tb):
  function try_except (line 159) | def try_except(func):
  function methods (line 170) | def methods(instance):
  function print_args (line 175) | def print_args(args: Optional[dict] = None, show_file=True, show_fcn=Fal...
  function init_seeds (line 186) | def init_seeds(seed=0):
  function intersect_dicts (line 196) | def intersect_dicts(da, db, exclude=()):
  function get_latest_run (line 201) | def get_latest_run(search_dir='.'):
  function is_docker (line 207) | def is_docker():
  function is_colab (line 212) | def is_colab():
  function is_pip (line 221) | def is_pip():
  function is_ascii (line 226) | def is_ascii(s=''):
  function is_chinese (line 232) | def is_chinese(s='人工智能'):
  function emojis (line 237) | def emojis(str=''):
  function file_age (line 242) | def file_age(path=__file__):
  function file_update_date (line 248) | def file_update_date(path=__file__):
  function file_size (line 254) | def file_size(path):
  function check_online (line 266) | def check_online():
  function git_describe (line 276) | def git_describe(path=ROOT):  # path must be a directory
  function check_git_status (line 287) | def check_git_status():
  function check_python (line 306) | def check_python(minimum='3.7.0'):
  function check_version (line 311) | def check_version(current='0.0.0', minimum='0.0.0', name='version ', pin...
  function check_requirements (line 324) | def check_requirements(requirements=ROOT / 'requirements.txt', exclude=(...
  function check_img_size (line 360) | def check_img_size(imgsz, s=32, floor=0):
  function check_imshow (line 372) | def check_imshow():
  function check_suffix (line 387) | def check_suffix(file='yolov5s.pt', suffix=('.pt',), msg=''):
  function check_yaml (line 398) | def check_yaml(file, suffix=('.yaml', '.yml')):
  function check_file (line 403) | def check_file(file, suffix=''):
  function check_font (line 428) | def check_font(font=FONT, progress=False):
  function check_dataset (line 438) | def check_dataset(data, autodownload=True):
  function url2file (line 472) | def url2file(url):
  function download (line 479) | def download(url, dir='.', unzip=True, delete=True, curl=False, threads=...
  function make_divisible (line 525) | def make_divisible(x, divisor):
  function clean_str (line 532) | def clean_str(s):
  function one_cycle (line 537) | def one_cycle(y1=0.0, y2=1.0, steps=100):
  function colorstr (line 542) | def colorstr(*input):
  function labels_to_class_weights (line 568) | def labels_to_class_weights(labels, nc=80):
  function labels_to_image_weights (line 587) | def labels_to_image_weights(labels, nc=80, class_weights=np.ones(80)):
  function coco80_to_coco91_class (line 595) | def coco80_to_coco91_class():  # converts 80-index (val2014) to 91-index...
  function xyxy2xywh (line 608) | def xyxy2xywh(x):
  function xywh2xyxy (line 618) | def xywh2xyxy(x):
  function xywhn2xyxy (line 628) | def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):
  function xyxy2xywhn (line 638) | def xyxy2xywhn(x, w=640, h=640, clip=False, eps=0.0):
  function xyn2xy (line 650) | def xyn2xy(x, w=640, h=640, padw=0, padh=0):
  function segment2box (line 658) | def segment2box(segment, width=640, height=640):
  function segments2boxes (line 666) | def segments2boxes(segments):
  function resample_segments (line 675) | def resample_segments(segments, n=1000):
  function scale_coords (line 684) | def scale_coords(img1_shape, coords, img0_shape, ratio_pad=None):
  function clip_coords (line 700) | def clip_coords(boxes, shape):
  function non_max_suppression (line 712) | def non_max_suppression(prediction,
  function strip_optimizer (line 814) | def strip_optimizer(f='best.pt', s=''):  # from utils.general import *; ...
  function print_mutation (line 830) | def print_mutation(results, hyp, save_dir, bucket, prefix=colorstr('evol...
  function apply_classifier (line 870) | def apply_classifier(x, model, img, im0):
  function increment_path (line 906) | def increment_path(path, exist_ok=False, sep='', mkdir=False):
  function imread (line 925) | def imread(path, flags=cv2.IMREAD_COLOR):
  function imwrite (line 929) | def imwrite(path, im):
  function imshow (line 937) | def imshow(path, im):
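The detection post-processing chain lives in this module; a minimal sketch assuming pred is the raw model output and img / im0 are the letterboxed NCHW tensor and the original image from the earlier steps:

    from RStask.ObjectDetection.utils.general import non_max_suppression, scale_coords

    det_list = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45)  # one tensor per image
    for det in det_list:
        if len(det):
            # map xyxy boxes from the padded network input back onto the original image
            det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()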

FILE: RStask/ObjectDetection/utils/loggers/__init__.py
  class Loggers (line 37) | class Loggers():
    method __init__ (line 39) | def __init__(self, save_dir=None, weights=None, opt=None, hyp=None, lo...
    method on_train_start (line 92) | def on_train_start(self):
    method on_pretrain_routine_end (line 96) | def on_pretrain_routine_end(self):
    method on_train_batch_end (line 102) | def on_train_batch_end(self, ni, model, imgs, targets, paths, plots):
    method on_train_epoch_end (line 117) | def on_train_epoch_end(self, epoch):
    method on_val_image_end (line 122) | def on_val_image_end(self, pred, predn, path, names, im):
    method on_val_end (line 127) | def on_val_end(self):
    method on_fit_epoch_end (line 133) | def on_fit_epoch_end(self, vals, epoch, best_fitness, fi):
    method on_model_save (line 155) | def on_model_save(self, last, epoch, final_epoch, best_fitness, fi):
    method on_train_end (line 161) | def on_train_end(self, last, best, plots, epoch, results):
    method on_params_update (line 183) | def on_params_update(self, params):

FILE: RStask/ObjectDetection/utils/loggers/wandb/log_dataset.py
  function create_dataset_artifact (line 10) | def create_dataset_artifact(opt):

FILE: RStask/ObjectDetection/utils/loggers/wandb/sweep.py
  function sweep (line 17) | def sweep():

FILE: RStask/ObjectDetection/utils/loggers/wandb/wandb_utils.py
  function remove_prefix (line 32) | def remove_prefix(from_string, prefix=WANDB_ARTIFACT_PREFIX):
  function check_wandb_config_file (line 36) | def check_wandb_config_file(data_config_file):
  function check_wandb_dataset (line 43) | def check_wandb_dataset(data_file):
  function get_run_info (line 59) | def get_run_info(run_path):
  function check_wandb_resume (line 68) | def check_wandb_resume(opt):
  function process_wandb_config_ddp_mode (line 82) | def process_wandb_config_ddp_mode(opt):
  class WandbLogger (line 106) | class WandbLogger():
    method __init__ (line 120) | def __init__(self, opt, run_id=None, job_type='Training'):
    method check_and_upload_dataset (line 191) | def check_and_upload_dataset(self, opt):
    method setup_training (line 208) | def setup_training(self, opt):
    method download_dataset_artifact (line 260) | def download_dataset_artifact(self, path, alias):
    method download_model_artifact (line 280) | def download_model_artifact(self, opt):
    method log_model (line 298) | def log_model(self, path, opt, epoch, fitness_score, best_model=False):
    method log_dataset_artifact (line 323) | def log_dataset_artifact(self, data_file, single_cls, project, overwri...
    method map_val_table_path (line 381) | def map_val_table_path(self):
    method create_dataset_table (line 391) | def create_dataset_table(self, dataset: LoadImagesAndLabels, class_to_...
    method log_training_progress (line 437) | def log_training_progress(self, predn, path, names):
    method val_one_image (line 480) | def val_one_image(self, pred, predn, path, names, im):
    method log (line 508) | def log(self, log_dict):
    method end_epoch (line 519) | def end_epoch(self, best_result=False):
    method finish_run (line 554) | def finish_run(self):
  function all_logging_disabled (line 566) | def all_logging_disabled(highest_level=logging.CRITICAL):

FILE: RStask/ObjectDetection/utils/loss.py
  function smooth_BCE (line 13) | def smooth_BCE(eps=0.1):  # https://github.com/ultralytics/yolov3/issues...
  class BCEBlurWithLogitsLoss (line 18) | class BCEBlurWithLogitsLoss(nn.Module):
    method __init__ (line 20) | def __init__(self, alpha=0.05):
    method forward (line 25) | def forward(self, pred, true):
  class FocalLoss (line 35) | class FocalLoss(nn.Module):
    method __init__ (line 37) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
    method forward (line 45) | def forward(self, pred, true):
  class QFocalLoss (line 65) | class QFocalLoss(nn.Module):
    method __init__ (line 67) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
    method forward (line 75) | def forward(self, pred, true):
  class ComputeLoss (line 91) | class ComputeLoss:
    method __init__ (line 95) | def __init__(self, model, autobalance=False):
    method __call__ (line 121) | def __call__(self, p, targets):  # predictions, targets
    method build_targets (line 177) | def build_targets(self, p, targets):
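ComputeLoss ties the box/objectness/class terms to the Detect head; a training-time sketch assuming model is in train mode with its hyperparameter dict attached (model.hyp), and targets is an (n, 6) tensor of [image_index, class, x, y, w, h] with normalized coordinates:

    from RStask.ObjectDetection.utils.loss import ComputeLoss

    compute_loss = ComputeLoss(model)                # reads anchors and hyperparameters from the model
    pred = model(imgs)                               # raw per-level predictions in train mode
    loss, loss_items = compute_loss(pred, targets)   # total loss and (box, obj, cls) components
    loss.backward()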

FILE: RStask/ObjectDetection/utils/metrics.py
  function fitness (line 15) | def fitness(x):
  function ap_per_class (line 21) | def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='....
  function compute_ap (line 89) | def compute_ap(recall, precision):
  class ConfusionMatrix (line 117) | class ConfusionMatrix:
    method __init__ (line 119) | def __init__(self, nc, conf=0.25, iou_thres=0.45):
    method process_batch (line 125) | def process_batch(self, detections, labels):
    method matrix (line 165) | def matrix(self):
    method tp_fp (line 168) | def tp_fp(self):
    method plot (line 174) | def plot(self, normalize=True, save_dir='', names=()):
    method print (line 204) | def print(self):
  function bbox_iou (line 209) | def bbox_iou(box1, box2, xywh=True, GIoU=False, DIoU=False, CIoU=False, ...
  function box_area (line 250) | def box_area(box):
  function box_iou (line 255) | def box_iou(box1, box2):
  function bbox_ioa (line 276) | def bbox_ioa(box1, box2, eps=1E-7):
  function wh_iou (line 298) | def wh_iou(wh1, wh2):
  function plot_pr_curve (line 309) | def plot_pr_curve(px, py, ap, save_dir='pr_curve.png', names=()):
  function plot_mc_curve (line 330) | def plot_mc_curve(px, py, save_dir='mc_curve.png', names=(), xlabel='Con...
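box_iou computes the pairwise IoU matrix used when matching predictions to labels; a tiny worked sketch with illustrative boxes:

    import torch
    from RStask.ObjectDetection.utils.metrics import box_iou

    a = torch.tensor([[0., 0., 10., 10.], [5., 5., 15., 15.]])   # xyxy predictions
    b = torch.tensor([[0., 0., 10., 10.]])                       # xyxy label
    print(box_iou(a, b))   # shape (2, 1): 1.0 for the exact match, ~0.14 for the offset box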

FILE: RStask/ObjectDetection/utils/plots.py
  class Colors (line 31) | class Colors:
    method __init__ (line 33) | def __init__(self):
    method __call__ (line 40) | def __call__(self, i, bgr=False):
    method hex2rgb (line 45) | def hex2rgb(h):  # rgb order (PIL)
  function check_pil_font (line 52) | def check_pil_font(font=FONT, size=10):
  class Annotator (line 68) | class Annotator:
    method __init__ (line 70) | def __init__(self, im, line_width=None, font_size=None, font='Arial.tt...
    method box_label (line 83) | def box_label(self, box, label='', color=(128, 128, 128), txt_color=(2...
    method rectangle (line 114) | def rectangle(self, xy, fill=None, outline=None, width=1):
    method text (line 118) | def text(self, xy, text, txt_color=(255, 255, 255)):
    method result (line 123) | def result(self):
  function feature_visualization (line 128) | def feature_visualization(x, module_type, stage, n=32, save_dir=Path('ru...
  function hist2d (line 156) | def hist2d(x, y, n=100):
  function butter_lowpass_filtfilt (line 165) | def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5):
  function output_to_target (line 178) | def output_to_target(output):
  function plot_images (line 187) | def plot_images(images, targets, paths=None, fname='images.jpg', names=N...
  function plot_lr_scheduler (line 248) | def plot_lr_scheduler(optimizer, scheduler, epochs=300, save_dir=''):
  function plot_val_txt (line 265) | def plot_val_txt():  # from utils.plots import *; plot_val()
  function plot_targets_txt (line 282) | def plot_targets_txt():  # from utils.plots import *; plot_targets_txt()
  function plot_val_study (line 295) | def plot_val_study(file='', dir='', x=None):  # from utils.plots import ...
  function plot_labels (line 342) | def plot_labels(labels, names=(), save_dir=Path('')):
  function plot_evolve (line 389) | def plot_evolve(evolve_csv='path/to/evolve.csv'):  # from utils.plots im...
  function plot_results (line 416) | def plot_results(file='path/to/results.csv', dir=''):
  function profile_idetection (line 442) | def profile_idetection(start=0, stop=0, labels=(), save_dir=''):
  function save_one_box (line 473) | def save_one_box(xyxy, im, file=Path('im.jpg'), gain=1.02, pad=10, squar...
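Annotator burns labelled boxes into an image; a sketch assuming im0 is a contiguous BGR numpy image and det holds xyxy detections from the NMS step above:

    from RStask.ObjectDetection.utils.plots import Annotator

    annotator = Annotator(im0, line_width=2)
    for *xyxy, conf, cls in det:
        annotator.box_label(xyxy, f'{int(cls)} {conf:.2f}', color=(0, 128, 255))
    im_annotated = annotator.result()   # numpy image with the boxes drawn in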

FILE: RStask/ObjectDetection/utils/torch_utils.py
  function torch_distributed_zero_first (line 33) | def torch_distributed_zero_first(local_rank: int):
  function device_count (line 42) | def device_count():
  function select_device (line 52) | def select_device(device='', batch_size=0, newline=True):
  function time_sync (line 83) | def time_sync():
  function profile (line 90) | def profile(input, ops, n=10, device=None):
  function is_parallel (line 142) | def is_parallel(model):
  function de_parallel (line 147) | def de_parallel(model):
  function initialize_weights (line 152) | def initialize_weights(model):
  function find_modules (line 164) | def find_modules(model, mclass=nn.Conv2d):
  function sparsity (line 169) | def sparsity(model):
  function prune (line 178) | def prune(model, amount=0.3):
  function fuse_conv_and_bn (line 189) | def fuse_conv_and_bn(conv, bn):
  function model_info (line 212) | def model_info(model, verbose=False, img_size=640):
  function scale_img (line 237) | def scale_img(img, ratio=1.0, same_shape=False, gs=32):  # img(16,3,256,...
  function copy_attr (line 250) | def copy_attr(a, b, include=(), exclude=()):
  class EarlyStopping (line 259) | class EarlyStopping:
    method __init__ (line 261) | def __init__(self, patience=30):
    method __call__ (line 267) | def __call__(self, epoch, fitness):
  class ModelEMA (line 282) | class ModelEMA:
    method __init__ (line 288) | def __init__(self, model, decay=0.9999, tau=2000, updates=0):
    method update (line 298) | def update(self, model):
    method update_attr (line 310) | def update_attr(self, model, include=(), exclude=('process_group', 're...
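select_device and ModelEMA cover device selection and weight averaging during training; a minimal sketch assuming model is an nn.Module such as the Model built earlier:

    from RStask.ObjectDetection.utils.torch_utils import select_device, ModelEMA

    device = select_device('')   # '' picks CUDA when available, otherwise CPU; '0' forces GPU 0
    model = model.to(device)
    ema = ModelEMA(model)        # keeps an exponentially smoothed copy of the weights
    # after each optimizer.step() in the training loop:
    ema.update(model)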

FILE: RStask/SceneClassification/ResNetScene.py
  class ResNetAID (line 4) | class ResNetAID:
    method __init__ (line 5) | def __init__(self, device=None):
    method inference (line 29) | def inference(self, inputs):
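This is the scene-classification tool wrapped by the ChatGPT agent; a sketch mirroring RStask/SceneClassification/test.py (the image path is a placeholder, and the return value is assumed to be the predicted scene label):

    from RStask.SceneClassification.ResNetScene import ResNetAID

    model = ResNetAID(device='cuda:0')     # device is optional per the signature above
    label = model.inference('test.tif')    # predicted AID scene category for the image
    print(label)
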
Condensed preview — 91 files, each showing path, character count, and a content snippet (487K chars of structured content in total).
[
  {
    "path": "Prefix/__init__.py",
    "chars": 3400,
    "preview": "RS_CHATGPT_PREFIX = \"\"\"Remote Sensing ChatGPT is designed to assist with a wide range of remote sensing image related ta"
  },
  {
    "path": "README.md",
    "chars": 4821,
    "preview": "# RS-ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models\nIntroduction\n----\nRemote Sensing ChatGPT(RS-Ch"
  },
  {
    "path": "RSChatGPT-shell.py",
    "chars": 13913,
    "preview": "import os\r\n\r\nimport re\r\nimport uuid\r\nfrom skimage import io\r\nimport argparse\r\nimport inspect\r\nfrom langchain.chat_models"
  },
  {
    "path": "RStask/EdgeDetection/Canny.py",
    "chars": 750,
    "preview": "from PIL import Image\r\nimport cv2\r\nimport numpy as np\r\nclass Image2Canny:\r\n    def __init__(self):\r\n        print(\"Initi"
  },
  {
    "path": "RStask/EdgeDetection/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/EdgeDetection/test.py",
    "chars": 167,
    "preview": "from RStask.EdgeDetection.Canny import Image2Canny\r\nmodel=Image2Canny()\r\nmodel.inference('/data/haonan.guo/RSChatGPT/tes"
  },
  {
    "path": "RStask/ImageCaptioning/__init__.py",
    "chars": 63,
    "preview": "from RStask.ImageCaptioning.blip import BLIP as CaptionFunction"
  },
  {
    "path": "RStask/ImageCaptioning/blip.py",
    "chars": 951,
    "preview": "import torch\r\nfrom PIL import Image\r\nfrom transformers import  BlipProcessor, BlipForConditionalGeneration\r\n\r\nclass BLIP"
  },
  {
    "path": "RStask/ImageCaptioning/test.py",
    "chars": 194,
    "preview": "from RStask import LanduseFunction\r\nmodel=LanduseFunction('cuda:0')\r\nmodel.inference('/data/haonan.guo/LoveDA/Train/Urba"
  },
  {
    "path": "RStask/InstanceSegmentation/SwinUpper.py",
    "chars": 2293,
    "preview": "from RStask.InstanceSegmentation.model import SwinUPer\r\nimport torch\r\nfrom skimage import io\r\nfrom PIL import Image\r\nimp"
  },
  {
    "path": "RStask/InstanceSegmentation/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/InstanceSegmentation/model.py",
    "chars": 3230,
    "preview": "import torch\r\nimport torch.nn as nn\r\nfrom RStask.InstanceSegmentation.swin import swin\r\nfrom RStask.InstanceSegmentation"
  },
  {
    "path": "RStask/InstanceSegmentation/swin.py",
    "chars": 26995,
    "preview": "# --------------------------------------------------------\n# Swin Transformer\n# Copyright (c) 2021 Microsoft\n# Licensed "
  },
  {
    "path": "RStask/InstanceSegmentation/test.py",
    "chars": 192,
    "preview": "from RStask import InstanceFunction\r\nmodel=InstanceFunction('cuda:0')\r\nmodel.inference('/data/haonan.guo/LoveDA/Train/Ur"
  },
  {
    "path": "RStask/InstanceSegmentation/uper.py",
    "chars": 18804,
    "preview": "# Copyright (c) OpenMMLab. All rights reserved.\nimport torch\nimport torch.nn as nn\nimport warnings\nfrom mmcv.cnn import "
  },
  {
    "path": "RStask/LanduseSegmentation/__init__.py",
    "chars": 2,
    "preview": "\r\n"
  },
  {
    "path": "RStask/LanduseSegmentation/seg_hrnet.py",
    "chars": 20348,
    "preview": "import logging\nfrom skimage import  io\nimport torch\nimport torch.nn as nn\nimport torch._utils\nimport torch.nn.functional"
  },
  {
    "path": "RStask/LanduseSegmentation/test.py",
    "chars": 190,
    "preview": "from RStask import LanduseFunction\r\nmodel=LanduseFunction('cuda:0')\r\nmodel.inference('/data/haonan.guo/LoveDA/Train/Urba"
  },
  {
    "path": "RStask/ObjectCounting/Yolocounting.py",
    "chars": 8746,
    "preview": "from RStask.ObjectDetection.models.common import DetectMultiBackend\r\nimport torch\r\nfrom skimage import io\r\nimport numpy "
  },
  {
    "path": "RStask/ObjectCounting/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/ObjectCounting/test.py",
    "chars": 175,
    "preview": "from RStask import CountingFuncnction\r\nmodel=CountingFuncnction('cuda:0')\r\ntxt='/data/haonan.guo/RSChatGPT/test.tif,smal"
  },
  {
    "path": "RStask/ObjectDetection/YOLOv5.py",
    "chars": 8959,
    "preview": "from RStask.ObjectDetection.models.common import DetectMultiBackend\r\nimport torch\r\nfrom skimage import io\r\nimport numpy "
  },
  {
    "path": "RStask/ObjectDetection/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/ObjectDetection/models/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/ObjectDetection/models/common.py",
    "chars": 23075,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nCommon modules\n\"\"\"\n\nimport json\nimport math\nimport platform\nimport warnin"
  },
  {
    "path": "RStask/ObjectDetection/models/experimental.py",
    "chars": 4862,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nExperimental modules\n\"\"\"\nimport math\n\nimport numpy as np\nimport torch\nimp"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/anchors.yaml",
    "chars": 3332,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n# Default anchors for COCO data\n\n\n# P5 --------------------------------------"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov3-spp.yaml",
    "chars": 1564,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov3-tiny.yaml",
    "chars": 1229,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov3.yaml",
    "chars": 1555,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-bifpn.yaml",
    "chars": 1420,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-fpn.yaml",
    "chars": 1211,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-p2.yaml",
    "chars": 1684,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-p34.yaml",
    "chars": 1346,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-p6.yaml",
    "chars": 1738,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-p7.yaml",
    "chars": 2119,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5-panet.yaml",
    "chars": 1404,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5l6.yaml",
    "chars": 1817,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5m6.yaml",
    "chars": 1819,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.67  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5n6.yaml",
    "chars": 1819,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5s-ghost.yaml",
    "chars": 1480,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5s-transformer.yaml",
    "chars": 1438,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5s6.yaml",
    "chars": 1819,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/hub/yolov5x6.yaml",
    "chars": 1819,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/tf.py",
    "chars": 21639,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nTensorFlow, Keras and TFLite versions of YOLOv5\nAuthored by https://githu"
  },
  {
    "path": "RStask/ObjectDetection/models/yolo.py",
    "chars": 15204,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nYOLO-specific modules\n\nUsage:\n    $ python path/to/models/yolo.py --cfg y"
  },
  {
    "path": "RStask/ObjectDetection/models/yolov5l.yaml",
    "chars": 1398,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0  # model depth "
  },
  {
    "path": "RStask/ObjectDetection/models/yolov5m.yaml",
    "chars": 1400,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.67  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/yolov5n.yaml",
    "chars": 1400,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/yolov5s.yaml",
    "chars": 1400,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/models/yolov5x.yaml",
    "chars": 1400,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Parameters\nnc: 80  # number of classes\ndepth_multiple: 1.33  # model depth"
  },
  {
    "path": "RStask/ObjectDetection/test.py",
    "chars": 191,
    "preview": "from RStask.ObjectDetection.YOLOv5 import YoloDetection\r\nmodel=YoloDetection('cuda:0')\r\ndet=model.inference('/data/haona"
  },
  {
    "path": "RStask/ObjectDetection/utils/__init__.py",
    "chars": 1088,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\r\n\"\"\"\r\nutils/initialization\r\n\"\"\"\r\n\r\n\r\ndef notebook_init(verbose=True):\r\n    # "
  },
  {
    "path": "RStask/ObjectDetection/utils/activations.py",
    "chars": 3446,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nActivation functions\n\"\"\"\n\nimport torch\nimport torch.nn as nn\nimport torch"
  },
  {
    "path": "RStask/ObjectDetection/utils/augmentations.py",
    "chars": 11953,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nImage augmentation functions\n\"\"\"\n\nimport math\nimport random\n\nimport cv2\ni"
  },
  {
    "path": "RStask/ObjectDetection/utils/autoanchor.py",
    "chars": 7444,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nAutoAnchor utils\n\"\"\"\n\nimport random\n\nimport numpy as np\nimport torch\nimpo"
  },
  {
    "path": "RStask/ObjectDetection/utils/autobatch.py",
    "chars": 2199,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nAuto-batch utils\n\"\"\"\n\nfrom copy import deepcopy\n\nimport numpy as np\nimpor"
  },
  {
    "path": "RStask/ObjectDetection/utils/aws/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/ObjectDetection/utils/aws/mime.sh",
    "chars": 780,
    "preview": "# AWS EC2 instance startup 'MIME' script https://aws.amazon.com/premiumsupport/knowledge-center/execute-user-data-ec2/\n#"
  },
  {
    "path": "RStask/ObjectDetection/utils/aws/resume.py",
    "chars": 1198,
    "preview": "# Resume all interrupted trainings in yolov5/ dir including DDP trainings\n# Usage: $ python utils/aws/resume.py\n\nimport "
  },
  {
    "path": "RStask/ObjectDetection/utils/aws/userdata.sh",
    "chars": 1247,
    "preview": "#!/bin/bash\n# AWS EC2 instance startup script https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html\n# This "
  },
  {
    "path": "RStask/ObjectDetection/utils/benchmarks.py",
    "chars": 6325,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nRun YOLOv5 benchmarks on all supported export formats\n\nFormat            "
  },
  {
    "path": "RStask/ObjectDetection/utils/callbacks.py",
    "chars": 2399,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nCallback utils\n\"\"\"\n\n\nclass Callbacks:\n    \"\"\"\"\n    Handles all registered"
  },
  {
    "path": "RStask/ObjectDetection/utils/datasets.py",
    "chars": 45827,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nDataloaders and dataset utils\n\"\"\"\n\nimport glob\nimport hashlib\nimport json"
  },
  {
    "path": "RStask/ObjectDetection/utils/docker/Dockerfile",
    "chars": 2234,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Start FROM Nvidia PyTorch image https://ngc.nvidia.com/catalog/containers/"
  },
  {
    "path": "RStask/ObjectDetection/utils/docker/Dockerfile-cpu",
    "chars": 1452,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\n# Start FROM Ubuntu image https://hub.docker.com/_/ubuntu\nFROM ubuntu:latest"
  },
  {
    "path": "RStask/ObjectDetection/utils/downloads.py",
    "chars": 6608,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nDownload utils\n\"\"\"\n\nimport logging\nimport os\nimport platform\nimport subpr"
  },
  {
    "path": "RStask/ObjectDetection/utils/flask_rest_api/README.md",
    "chars": 1710,
    "preview": "# Flask REST API\n\n[REST](https://en.wikipedia.org/wiki/Representational_state_transfer) [API](https://en.wikipedia.org/w"
  },
  {
    "path": "RStask/ObjectDetection/utils/flask_rest_api/example_request.py",
    "chars": 365,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nPerform test request\n\"\"\"\n\nimport pprint\n\nimport requests\n\nDETECTION_URL ="
  },
  {
    "path": "RStask/ObjectDetection/utils/flask_rest_api/restapi.py",
    "chars": 1411,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nRun a Flask REST API exposing a YOLOv5s model\n\"\"\"\n\nimport argparse\nimport"
  },
  {
    "path": "RStask/ObjectDetection/utils/general.py",
    "chars": 38702,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nGeneral utils\n\"\"\"\n\nimport contextlib\nimport glob\nimport inspect\nimport lo"
  },
  {
    "path": "RStask/ObjectDetection/utils/google_app_engine/Dockerfile",
    "chars": 821,
    "preview": "FROM gcr.io/google-appengine/python\n\n# Create a virtualenv for dependencies. This isolates these packages from\n# system-"
  },
  {
    "path": "RStask/ObjectDetection/utils/google_app_engine/additional_requirements.txt",
    "chars": 105,
    "preview": "# add these requirements in your app on top of the existing ones\npip==21.1\nFlask==1.0.2\ngunicorn==19.9.0\n"
  },
  {
    "path": "RStask/ObjectDetection/utils/google_app_engine/app.yaml",
    "chars": 174,
    "preview": "runtime: custom\nenv: flex\n\nservice: yolov5app\n\nliveness_check:\n  initial_delay_sec: 600\n\nmanual_scaling:\n  instances: 1\n"
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/__init__.py",
    "chars": 8147,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nLogging utils\n\"\"\"\n\nimport os\nimport warnings\nfrom threading import Thread"
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/wandb/README.md",
    "chars": 10791,
    "preview": "📚 This guide explains how to use **Weights & Biases** (W&B) with YOLOv5 🚀. UPDATED 29 September 2021.\n\n- [About Weights "
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/wandb/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/wandb/log_dataset.py",
    "chars": 1032,
    "preview": "import argparse\n\nfrom wandb_utils import WandbLogger\n\nfrom utils.general import LOGGER\n\nWANDB_ARTIFACT_PREFIX = 'wandb-a"
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/wandb/sweep.py",
    "chars": 1213,
    "preview": "import sys\nfrom pathlib import Path\n\nimport wandb\n\nFILE = Path(__file__).resolve()\nROOT = FILE.parents[3]  # YOLOv5 root"
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/wandb/sweep.yaml",
    "chars": 2463,
    "preview": "# Hyperparameters for training\n# To set range-\n# Provide min and max values as:\n#      parameter:\n#\n#         min: scala"
  },
  {
    "path": "RStask/ObjectDetection/utils/loggers/wandb/wandb_utils.py",
    "chars": 27588,
    "preview": "\"\"\"Utilities and tools for tracking runs with Weights & Biases.\"\"\"\n\nimport logging\nimport os\nimport sys\nfrom contextlib "
  },
  {
    "path": "RStask/ObjectDetection/utils/loss.py",
    "chars": 9908,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nLoss functions\n\"\"\"\n\nimport torch\nimport torch.nn as nn\n\nfrom utils.metric"
  },
  {
    "path": "RStask/ObjectDetection/utils/metrics.py",
    "chars": 14105,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nModel validation metrics\n\"\"\"\n\nimport math\nimport warnings\nfrom pathlib im"
  },
  {
    "path": "RStask/ObjectDetection/utils/plots.py",
    "chars": 21070,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nPlotting utils\n\"\"\"\n\nimport math\nimport os\nfrom copy import copy\nfrom path"
  },
  {
    "path": "RStask/ObjectDetection/utils/torch_utils.py",
    "chars": 13250,
    "preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nPyTorch utils\n\"\"\"\n\nimport math\nimport os\nimport platform\nimport subproces"
  },
  {
    "path": "RStask/SceneClassification/ResNetScene.py",
    "chars": 2309,
    "preview": "import torch\r\nfrom skimage import io\r\n\r\nclass ResNetAID:\r\n    def __init__(self, device=None):\r\n        print(\"Initializ"
  },
  {
    "path": "RStask/SceneClassification/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "RStask/SceneClassification/test.py",
    "chars": 160,
    "preview": "from RStask.SceneClassification.ResNetScene import ResNetAID as SceneFunction\r\nmodel=SceneFunction()\r\nmodel.inference('/"
  },
  {
    "path": "RStask/__init__.py",
    "chars": 537,
    "preview": "from RStask.EdgeDetection.Canny import Image2Canny as ImageEdgeFunction\r\nfrom RStask.ImageCaptioning.blip import BLIP as"
  },
  {
    "path": "requirements.txt",
    "chars": 75,
    "preview": "langchain\ntorch\ntorchvision\nimageio\nnumpy\nopenai\nopencv-python\nscikit-image"
  }
]

// ... and 1 more file (not shown in this preview)
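The test scripts previewed above all follow the same task-wrapper pattern: construct the tool with a device string, then call inference on an image path. A minimal sketch for the object-detection wrapper (the image path is a placeholder):

    from RStask.ObjectDetection.YOLOv5 import YoloDetection

    detector = YoloDetection('cuda:0')
    det = detector.inference('test.tif')   # detections for the given remote-sensing image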

About this extraction

This page contains the full source code of the HaonanGuo/Remote-Sensing-ChatGPT GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 91 files (456.0 KB), approximately 138.8k tokens, and a symbol index with 573 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
