Full Code of prodigysml/Dr.-Watson for AI

Repository: prodigysml/Dr.-Watson
Branch: master
Commit: c52e0464256b
Files: 5
Total size: 19.2 KB

Directory structure:
gitextract_2k41eea_/

├── .gitignore
├── README.md
├── __version__.py
├── dr-watson.py
└── issues_library.json

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/


================================================
FILE: README.md
================================================
# Dr. Watson

Dr. Watson is a simple Burp Suite extension that helps find assets, keys, subdomains, IP addresses, and other useful information! It's your very own discovery sidekick, the Dr. Watson to your Sherlock!

[![License](https://img.shields.io/badge/license-GPL3-_red.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html) [![Twitter](https://img.shields.io/badge/twitter-@sml555__-blue.svg)](https://twitter.com/sml555_) ![Version](https://img.shields.io/badge/version-1.0.1-blue.svg)

# How Does Dr. Watson Work?

Dr. Watson loads regexes from the issues_library.json file and matches them against the in-scope responses that pass through Burp Suite. When a regex matches, it raises an issue for the target host with the severity defined for that regex in issues_library.json. It is simple, sweet, and easy to use!
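
The matching step can be sketched outside Burp as follows. This is a minimal illustration, not the plugin's code: the function name `find_issues`, the two-entry `LIBRARY_JSON` sample, and the sample response are assumptions made for the example, while the `[regex, name, severity, detail]` entry layout and the `$asset$` placeholder follow issues_library.json.

```python
import json
import re

# Hypothetical two-entry library using the same [regex, name, severity, detail] layout
LIBRARY_JSON = r'''
[
    ["-----BEGIN RSA PRIVATE KEY-----", "Asset Discovered: RSA Private Key", "High",
     "RSA Private Key Discovered: <b>$asset$</b>"],
    ["X-NuGet-ApiKey", "Asset Discovered: NuGet API Key", "Medium",
     "NuGet API Key Discovered: <b>$asset$</b>"]
]
'''


def find_issues(library, response_text):
    """Return (name, severity, detail) tuples, one per regex match."""
    issues = []
    for regex, name, severity, detail in library:
        # re.DOTALL lets "." match newlines, mirroring the flag the extension uses
        for match in re.compile(regex, re.DOTALL).findall(response_text):
            issues.append((name, severity, detail.replace("$asset$", match)))
    return issues


library = json.loads(LIBRARY_JSON)
sample_response = "HTTP/1.1 200 OK\r\n\r\nheaders = {'X-NuGet-ApiKey': '...'}"
for issue in find_issues(library, sample_response):
    print(issue)
```

Inside Burp the same idea runs per passively scanned response, and each tuple becomes a scanner issue rather than a printed line.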

# Setup - Installing for Burp Suite Pro
## Setting Up Jython
1. Download the latest standalone version of [Jython](https://www.jython.org/download)
2. Navigate to Extender -> Options
3. Navigate to the "Python Environment" section
4. Click "Select File" and select the previously downloaded file

## Installing the Plugin
1. Navigate to Extender -> Extensions
2. Click the "Add" button
3. Change the "Extension Type" to "Python"
4. Select the plugin Python file (dr-watson.py) in the "Extension file" field
5. Click "Next"
6. Enjoy the plugin!

# How to Use The Plugin

1. Install the plugin
2. Add any domain you want analysed to the scope (responses from out-of-scope hosts are skipped, which keeps the performance impact low)
3. Navigate / crawl through the website and watch the plugin create issues for the resources it identifies.
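
Not every regex match is meant to become an issue: the plugin source discards IP-like matches whose octets are out of range or zero-padded, and it aims to report each asset only once per host. The sketch below illustrates those two filters in isolation. The helper names `is_valid_ipv4` and `first_sighting` and the `seen_assets` dictionary are hypothetical and do not appear in the plugin.

```python
def is_valid_ipv4(candidate):
    """True if candidate has four octets in 0-255 with no leading zeros."""
    octets = candidate.split(".")
    # Like the plugin, treat addresses starting with a "0" octet as noise
    if len(octets) != 4 or octets[0] == "0":
        return False
    for octet in octets:
        try:
            value = int(octet)
        except ValueError:
            return False
        # str(value) != octet catches zero-padded octets such as "01"
        if not 0 <= value <= 255 or str(value) != octet:
            return False
    return True


# Maps a host name to the set of assets already reported for it
seen_assets = {}


def first_sighting(host, asset):
    """Record asset for host; return False if it was already recorded."""
    assets = seen_assets.setdefault(host, set())
    if asset in assets:
        return False
    assets.add(asset)
    return True
```

A version string such as `01.2.3.4` fails the check, and a second sighting of the same asset on the same host is reported as a duplicate.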

# Authors and Thanks
Originally written by Sajeeb Lohani ([sml555](https://twitter.com/sml555_)). I would like to thank the following for helping with the project:
* BugCrowd HUNT for the Jython installation steps
* Redhunt Labs for the original plugin and the idea
* TruffleHog Regexes and git-all-secrets for the regexes

# Contributions
Contributions to this project are very welcome. If you're a newcomer to open source and would like some help getting started, feel free to reach out to me on Twitter ([@sml555_](https://twitter.com/sml555_)) and I'll assist wherever I can.


================================================
FILE: __version__.py
================================================
__version__ = "1.0.1"

================================================
FILE: dr-watson.py
================================================
"""
Dr. Watson: Burp Suite Extension that helps you find assets, keys, and useful information. Your very own Burp sidekick!
By: Sajeeb Lohani (sml555)
Twitter: https://twitter.com/sml555_

todo: fix dups better, add in api secrets, add in more JS parsing

Code Credits:
Redhunt Labs for making the original asset discovery plugin
OpenSecurityResearch CustomPassiveScanner: https://github.com/OpenSecurityResearch/CustomPassiveScanner
PortSwigger example-scanner-checks: https://github.com/PortSwigger/example-scanner-checks
"""

from burp import IBurpExtender
from burp import IScannerCheck
from burp import IScanIssue
from array import array
import re
import json
import __version__


class BurpExtender(IBurpExtender, IScannerCheck):
    """
    Implement BurpExtender to inherit from multiple base classes
    IBurpExtender is the base class required for all extensions
    IScannerCheck lets us register our extension with Burp as a custom scanner check
    """

    def registerExtenderCallbacks(self, callbacks):
        """
        The only method of the IBurpExtender interface.
        This method is invoked when the extension is loaded and registers
        an instance of the IBurpExtenderCallbacks interface
        """

        # Keep the callbacks object on the instance so the other methods can use it
        self._callbacks = callbacks

        # Set the name of our extension, which will appear in the Extender tool when loaded
        self._callbacks.setExtensionName("Dr. Watson")

        # Register our extension as a custom scanner check, so Burp will use this extension
        # to perform active or passive scanning and report on scan issues returned
        self._callbacks.registerScannerCheck(self)

        # Load the regex library; the with-block closes the file once it is parsed
        with open("issues_library.json") as library_file:
            self.library = json.load(library_file)

        return

    def consolidateDuplicateIssues(self, existingIssue, newIssue):
        """
        This method is called when multiple issues are reported for the same URL
        In this case we are checking if the issue detail is different, as the
        issues from our scans include affected parameters/values in the detail,
        which we will want to report as unique issue instances
        """
        if existingIssue.getIssueDetail() == newIssue.getIssueDetail():
            return -1
        return 0

    def doPassiveScan(self, baseRequestResponse):
        """
        Implement the doPassiveScan method of IScannerCheck interface
        Burp Scanner invokes this method for each base request/response that is passively scanned.
        """
        # Local variables used to store a list of ScanIssue objects
        scan_issues = list()

        # Create an instance of our CustomScans object, passing the
        # base request and response, and our callbacks object.
        # A local variable avoids sharing state between scans.
        custom_scans = CustomScans(baseRequestResponse, self._callbacks)

        mime_type = str(custom_scans.response.getStatedMimeType())

        # purposely not including SVG in case any interesting information is found
        image_types = ["GIF", "JPEG", "PNG", "image"]

        # Skip image responses, which are not searched for assets
        if mime_type in image_types:
            return None

        for issue in self.library:
            scan_issues += custom_scans.findRegEx(*issue[:4])

        # Finally, per the interface contract, doPassiveScan needs to return a
        # list of scan issues, if any, and None otherwise
        if scan_issues:
            return scan_issues
        else:
            return None


class CustomScans:
    unique_list = dict()
    regexes_compiled = dict()

    def __init__(self, requestResponse, callbacks):
        # Store the constructor arguments on the instance
        self._requestResponse = requestResponse
        self._callbacks = callbacks

        # Get an instance of IExtensionHelpers, which has lots of useful methods,
        # and keep it on the instance so every method can use the helpers
        self._helpers = self._callbacks.getHelpers()

        self.response = self._helpers.analyzeResponse(requestResponse.getResponse())

        # Keep the parameters from the HTTP request on the instance as well
        self._params = self._helpers.analyzeRequest(requestResponse.getRequest()).getParameters()
        return

    def findRegEx(self, regex, issuename, issuelevel, issuedetail):
        """
        This is a custom scan method that looks for all occurrences in the response
        that match the passed regular expression
        """
        scan_issues = []
        offset = array('i', [0, 0])
        response = self._requestResponse.getResponse()
        response_length = len(response)

        # Only check responses for 'in scope' URLs

        if self._callbacks.isInScope(self._helpers.analyzeRequest(self._requestResponse).getUrl()):

            # Compile the regular expression with re.DOTALL so that '.' also matches newlines,
            # and cache the compiled pattern so it is reused for later responses
            # NOTE: this needs significant testing!
            if regex in CustomScans.regexes_compiled:
                myre = CustomScans.regexes_compiled[regex]
            else:
                myre = re.compile(regex, re.DOTALL)
                CustomScans.regexes_compiled[regex] = myre

            # Using the regular expression, find all occurrences in the base response
            match_vals = myre.findall(self._helpers.bytesToString(response))

            for ref in match_vals:
                url = self._helpers.analyzeRequest(self._requestResponse).getUrl()

                # Don't add the source domain to issues
                if ref.split("//")[-1].split("/")[0].split('?')[0].split(':')[0] == str(url).split("//")[-1].split("/")[0].split(":")[0].split('?')[0]:
                    continue

                # For each matched value found, find its start position, so that we can create
                # the offset needed to apply appropriate markers in the resulting Scanner issue
                start = self._helpers.indexOf(response,
                                              ref, True, 0, response_length)
                # Build a fresh marker array per match so earlier issues keep their own offsets
                offset = array('i', [start, start + len(ref)])
                offsets = [offset]

                base_url = str(url).split("//")[-1].split("/")[0].split('?')[0].split(":")[0]

                # Create a ScanIssue object and append it to our list of issues, marking
                # the matched value in the response.

                # create individual classes per unique asset class

                if issuename == "Asset Discovered: Domain":
                    ref = ref.split("//")[-1].split("/")[0].split('?')[0]
                    if ref.endswith("." + self._get_core_domain(url)):
                        continue

                elif issuename == "Asset Discovered: IP":
                    ref_array = ref.split(".")
                    must_continue = False
                    for s in ref_array:
                        # Reject octets that are zero-padded (e.g. "01")
                        # or outside the 0-255 range
                        i = int(s)
                        if not (0 <= i <= 255) or s != str(i):
                            must_continue = True
                            break

                    if ref_array[0] == "0":
                        continue

                    if must_continue:
                        continue

                elif issuename == "Asset Discovered: Subdomain":
                    ref = ref.split("//")[-1].split("/")[0].split('?')[0]
                    coredomain = self._get_core_domain(url)
                    if not ref.endswith("." + coredomain) or ref == coredomain:
                        continue

                elif issuename == "Asset Discovered: S3 Bucket":
                    try:
                        # getting the S3 bucket name and catch exception if regex catches incorrect data
                        ref = ref.split(" ")[0].split('/')[2]
                    except IndexError:
                        continue
                elif issuename == "Asset Discovered: DigitalOcean Space":
                    ref = ref.split('/')[2]

                elif issuename == "Asset Discovered: Azure Blob":
                    ref = ref.split(" ")[0].split('/')[2] + ":" + ref.split(" ")[0].split('/')[3]

                # this was done to only keep a single issue created for each ref
                if not self.check_unique(base_url, ref):
                    continue

                scan_issues.append(ScanIssue(self._requestResponse.getHttpService(),
                    self._helpers.analyzeRequest(self._requestResponse).getUrl(),
                    [self._callbacks.applyMarkers(self._requestResponse, None, offsets)],
                    issuename, issuelevel, issuedetail.replace("$asset$", ref)))

        return scan_issues

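    def _get_core_domain(self, url):
        # Strip the scheme, path, port, and query string to isolate the host name
        host = str(url).split("//")[-1].split("/")[0].split(":")[0].split('?')[0]
        # Keep the last two labels, e.g. "www.example.com" -> "example.com";
        # single-label hosts such as "localhost" are returned unchanged
        return ".".join(host.split('.')[-2:])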

    def check_unique(self, core, ref):
        # Return True the first time ref is seen for this host and record it,
        # so that later occurrences of the same ref return False
        seen = CustomScans.unique_list.setdefault(core, [])
        if ref in seen:
            return False
        seen.append(ref)
        return True


class ScanIssue(IScanIssue):
    """
    Implementation of the IScanIssue interface with simple constructor and getter methods
    """

    def __init__(self, httpservice, url, requestresponsearray, name, severity, detailmsg):
        self._url = url
        self._httpservice = httpservice
        self._requestresponsearray = requestresponsearray
        self._name = name
        self._severity = severity
        self._detailmsg = detailmsg

    def getUrl(self):
        return self._url

    def getHttpMessages(self):
        return self._requestresponsearray

    def getHttpService(self):
        return self._httpservice

    def getRemediationDetail(self):
        return None

    def getIssueDetail(self):
        return self._detailmsg

    def getIssueBackground(self):
        return None

    def getRemediationBackground(self):
        return None

    def getIssueType(self):
        return 0

    def getIssueName(self):
        return self._name

    def getSeverity(self):
        return self._severity

    def getConfidence(self):
        return "Tentative"


================================================
FILE: issues_library.json
================================================
[
    [
        "(?:[\\d]{1,3})\\.(?:[\\d]{1,3})\\.(?:[\\d]{1,3})\\.(?:[\\d]{1,3})",
        "Asset Discovered: IP",
        "Information",
        "IP Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))[^><'\" \n)]+",
        "Asset Discovered: Domain",
        "Information",
        "Domain Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))[^><'\" \n)]+",
        "Asset Discovered: Subdomain",
        "Information",
        "Subdomain Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "(http(?:s)?://.[^><\\'\\\" \\n\\)]+.s3.amazonaws.com)",
        "Asset Discovered: S3 Bucket",
        "Information",
        "S3 Bucket Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "http(?:s)?://[^><\\.'\" \n\\)]+.[^><\\.'\" \n\\)]+.[^><\\.'\" \n\\)]+.digitaloceanspaces.com",
        "Asset Discovered: DigitalOcean Space",
        "Information",
        "DigitalOcean Space Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "http(?:s)?://.[^><'\" \n\\)]+.blob.core.windows.net/.[^><'\" \n/)]+./",
        "Asset Discovered: Azure Blob",
        "Information",
        "Azure Blob Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "-----BEGIN RSA PRIVATE KEY-----",
        "Asset Discovered: RSA Private Key",
        "High",
        "RSA Private Key Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "-----BEGIN DSA PRIVATE KEY-----",
        "Asset Discovered: DSA Private Key",
        "High",
        "DSA Private Key Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "-----BEGIN EC PRIVATE KEY-----",
        "Asset Discovered: EC Private Key",
        "High",
        "EC Private Key Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "-----BEGIN PGP PRIVATE KEY BLOCK-----",
        "Asset Discovered: PGP Private Key",
        "High",
        "PGP Private Key Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "X-Octopus-ApiKey",
        "Asset Discovered: Octopus API Key",
        "Medium",
        "Octopus API Key Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "X-NuGet-ApiKey",
        "Asset Discovered: NuGet API Key",
        "Medium",
        "NuGet API Key Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "(xox[pboa]-[0-9]{12}-[0-9]{12}-[0-9]{12}-[a-z0-9]{32})",
        "Asset Discovered: Slack token",
        "Medium",
        "Slack token Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "(AccessKeyId|aws_access_key_id)",
        "Asset Discovered: AWS Access Key ID",
        "High",
        "AWS Access Key ID Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ],
    [
        "(SecretAccessKey|aws_secret_access_key)",
        "Asset Discovered: AWS Secret Access Key ID",
        "High",
        "AWS Secret Access Key ID Discovered: <b>$asset$</b><br><br><b>Note:</b> Before performing any active assessment of the identified asset, please check with the owner. The asset might not be owned by the same owner/organization or part of the scope."
    ]
]
SYMBOL INDEX (22 symbols across 1 files)

FILE: dr-watson.py
  class BurpExtender (line 23) | class BurpExtender(IBurpExtender, IScannerCheck):
    method registerExtenderCallbacks (line 30) | def registerExtenderCallbacks(self, callbacks):
    method consolidateDuplicateIssues (line 54) | def consolidateDuplicateIssues(self, existingIssue, newIssue):
    method doPassiveScan (line 65) | def doPassiveScan(self, baseRequestResponse):
  class CustomScans (line 97) | class CustomScans:
    method __init__ (line 101) | def __init__(self, requestResponse, callbacks):
    method findRegEx (line 116) | def findRegEx(self, regex, issuename, issuelevel, issuedetail):
    method _get_core_domain (line 215) | def _get_core_domain(self, url):
    method check_unique (line 219) | def check_unique(self, core, ref):
  class ScanIssue (line 230) | class ScanIssue(IScanIssue):
    method __init__ (line 235) | def __init__(self, httpservice, url, requestresponsearray, name, sever...
    method getUrl (line 243) | def getUrl(self):
    method getHttpMessages (line 246) | def getHttpMessages(self):
    method getHttpService (line 249) | def getHttpService(self):
    method getRemediationDetail (line 252) | def getRemediationDetail(self):
    method getIssueDetail (line 255) | def getIssueDetail(self):
    method getIssueBackground (line 258) | def getIssueBackground(self):
    method getRemediationBackground (line 261) | def getRemediationBackground(self):
    method getIssueType (line 264) | def getIssueType(self):
    method getIssueName (line 267) | def getIssueName(self):
    method getSeverity (line 270) | def getSeverity(self):
    method getConfidence (line 273) | def getConfidence(self):

