[
  {
    "path": "Dockerfile",
    "content": "FROM guysoft/uwsgi-nginx:python3.7\n\nLABEL maintainer=\"hunshcn <hunsh.cn@gmail.com>\"\n\nRUN pip install flask requests\n\nCOPY ./app /app\nWORKDIR /app\n\n# Make /app/* available to be imported by Python globally to better support several use cases like Alembic migrations.\nENV PYTHONPATH=/app\n\n# Move the base entrypoint to reuse it\nRUN mv /entrypoint.sh /uwsgi-nginx-entrypoint.sh\n# Copy the entrypoint that will generate Nginx additional configs\nCOPY entrypoint.sh /entrypoint.sh\nRUN chmod +x /entrypoint.sh\n\nENTRYPOINT [\"/entrypoint.sh\"]\n\n# Run the start script provided by the parent image tiangolo/uwsgi-nginx.\n# It will check for an /app/prestart.sh script (e.g. for migrations)\n# And then will start Supervisor, which in turn will start Nginx and uWSGI\n\nEXPOSE 80\n\nCMD [\"/start.sh\"]\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2020 hunshcn\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# gh-proxy\n\n## 简介\n\ngithub release、archive以及项目文件的加速项目，支持clone，有Cloudflare Workers无服务器版本以及Python版本\n\n## 演示\n\n[https://gh.api.99988866.xyz/](https://gh.api.99988866.xyz/)\n\n演示站为公共服务，如有大规模使用需求请自行部署，演示站有点不堪重负\n\n![imagea272c95887343279.png](https://img.maocdn.cn/img/2021/04/24/imagea272c95887343279.png)\n\n当然也欢迎[捐赠](#捐赠)以支持作者\n\n## python版本和cf worker版本差异\n\n- python版本支持进行文件大小限制，超过设定返回原地址 [issue #8](https://github.com/hunshcn/gh-proxy/issues/8)\n\n- python版本支持特定user/repo 封禁/白名单 以及passby [issue #41](https://github.com/hunshcn/gh-proxy/issues/41)\n\n## 使用\n\n直接在copy出来的url前加`https://gh.api.99988866.xyz/`即可\n\n也可以直接访问，在input输入\n\n***大量使用请自行部署，以上域名仅为演示使用。***\n\n访问私有仓库可以通过\n\n`git clone https://user:TOKEN@ghproxy.com/https://github.com/xxxx/xxxx` [#71](https://github.com/hunshcn/gh-proxy/issues/71)\n\n以下都是合法输入（仅示例，文件不存在）：\n\n- 分支源码：https://github.com/hunshcn/project/archive/master.zip\n\n- release源码：https://github.com/hunshcn/project/archive/v0.1.0.tar.gz\n\n- release文件：https://github.com/hunshcn/project/releases/download/v0.1.0/example.zip\n\n- 分支文件：https://github.com/hunshcn/project/blob/master/filename\n\n- commit文件：https://github.com/hunshcn/project/blob/1111111111111111111111111111/filename\n\n- gist：https://gist.githubusercontent.com/cielpy/351557e6e465c12986419ac5a4dd2568/raw/cmd.py\n\n## cf worker版本部署\n\n首页：https://workers.cloudflare.com\n\n注册，登陆，`Start building`，取一个子域名，`Create a Worker`。\n\n复制 [index.js](https://cdn.jsdelivr.net/gh/hunshcn/gh-proxy@master/index.js)  到左侧代码框，`Save and deploy`。如果正常，右侧应显示首页。\n\n`ASSET_URL`是静态资源的url（实际上就是现在显示出来的那个输入框单页面）\n\n`PREFIX`是前缀，默认（根路径情况为\"/\"），如果自定义路由为example.com/gh/*，请将PREFIX改为 '/gh/'，注意，少一个杠都会错！\n\n## Python版本部署\n\n### Docker部署\n\n```\ndocker run -d --name=\"gh-proxy-py\" \\\n  -p 0.0.0.0:80:80 \\\n  --restart=always \\\n  hunsh/gh-proxy-py:latest\n```\n\n第一个80是你要暴露出去的端口\n\n### 直接部署\n\n安装依赖（请使用python3）\n\n```pip install flask requests```\n\n按需求修改`app/main.py`的前几项配置\n\n*注意:* 可能需要在`return 
Response`前加两行\n```python3\nif 'Transfer-Encoding' in headers:\n    headers.pop('Transfer-Encoding')\n```\n\n### 注意\n\npython版本的机器如果无法正常访问github.io会启动报错，请自行修改静态文件url\n\npython版本默认走服务器（2021.3.27更新）\n\n## Cloudflare Workers计费\n\n到 `overview` 页面可参看使用情况。免费版每天有 10 万次免费请求，并且有每分钟1000次请求的限制。\n\n如果不够用，可升级到 $5 的高级版本，每月可用 1000 万次请求（超出部分 $0.5/百万次请求）。\n\n## Changelog\n\n* 2020.04.10 增加对`raw.githubusercontent.com`文件的支持\n* 2020.04.09 增加Python版本（使用Flask）\n* 2020.03.23 新增了clone的支持\n* 2020.03.22 初始版本\n\n## 链接\n\n[我的博客](https://hunsh.net)\n\n## 参考\n\n[jsproxy](https://github.com/EtherDream/jsproxy/)\n\n## 捐赠\n\n![wx.png](https://img.maocdn.cn/img/2021/04/24/image.md.png)\n![ali.png](https://www.helloimg.com/images/2021/04/24/BK9vmb.md.png)\n"
  },
  {
    "path": "app/main.py",
    "content": "# -*- coding: utf-8 -*-\nimport re\n\nimport requests\nfrom flask import Flask, Response, redirect, request\nfrom requests.exceptions import (\n    ChunkedEncodingError,\n    ContentDecodingError, ConnectionError, StreamConsumedError)\nfrom requests.utils import (\n    stream_decode_response_unicode, iter_slices, CaseInsensitiveDict)\nfrom urllib3.exceptions import (\n    DecodeError, ReadTimeoutError, ProtocolError)\nfrom urllib.parse import quote\n\n# config\n# 分支文件使用jsDelivr镜像的开关，0为关闭，默认关闭\njsdelivr = 0\nsize_limit = 1024 * 1024 * 1024 * 999  # 允许的文件大小，默认999GB，相当于无限制了 https://github.com/hunshcn/gh-proxy/issues/8\n\n\"\"\"\n  先生效白名单再匹配黑名单，pass_list匹配到的会直接302到jsdelivr而忽略设置\n  生效顺序 白->黑->pass，可以前往https://github.com/hunshcn/gh-proxy/issues/41 查看示例\n  每个规则一行，可以封禁某个用户的所有仓库，也可以封禁某个用户的特定仓库，下方用黑名单示例，白名单同理\n  user1 # 封禁user1的所有仓库\n  user1/repo1 # 封禁user1的repo1\n  */repo1 # 封禁所有叫做repo1的仓库\n\"\"\"\nwhite_list = '''\n'''\nblack_list = '''\n'''\npass_list = '''\n'''\n\nHOST = '127.0.0.1'  # 监听地址，建议监听本地然后由web服务器反代\nPORT = 80  # 监听端口\nASSET_URL = 'https://hunshcn.github.io/gh-proxy'  # 主页\n\nwhite_list = [tuple([x.replace(' ', '') for x in i.split('/')]) for i in white_list.split('\\n') if i]\nblack_list = [tuple([x.replace(' ', '') for x in i.split('/')]) for i in black_list.split('\\n') if i]\npass_list = [tuple([x.replace(' ', '') for x in i.split('/')]) for i in pass_list.split('\\n') if i]\napp = Flask(__name__)\nCHUNK_SIZE = 1024 * 10\nindex_html = requests.get(ASSET_URL, timeout=10).text\nicon_r = requests.get(ASSET_URL + '/favicon.ico', timeout=10).content\nexp1 = re.compile(r'^(?:https?://)?github\\.com/(?P<author>.+?)/(?P<repo>.+?)/(?:releases|archive)/.*$')\nexp2 = re.compile(r'^(?:https?://)?github\\.com/(?P<author>.+?)/(?P<repo>.+?)/(?:blob|raw)/.*$')\nexp3 = re.compile(r'^(?:https?://)?github\\.com/(?P<author>.+?)/(?P<repo>.+?)/(?:info|git-).*$')\nexp4 = 
re.compile(r'^(?:https?://)?raw\\.(?:githubusercontent|github)\\.com/(?P<author>.+?)/(?P<repo>.+?)/.+?/.+$')\nexp5 = re.compile(r'^(?:https?://)?gist\\.(?:githubusercontent|github)\\.com/(?P<author>.+?)/.+?/.+$')\n\nrequests.sessions.default_headers = lambda: CaseInsensitiveDict()\n\n\n@app.route('/')\ndef index():\n    if 'q' in request.args:\n        return redirect('/' + request.args.get('q'))\n    return index_html\n\n\n@app.route('/favicon.ico')\ndef icon():\n    return Response(icon_r, content_type='image/vnd.microsoft.icon')\n\n\ndef iter_content(self, chunk_size=1, decode_unicode=False):\n    \"\"\"rewrite requests function, set decode_content with False\"\"\"\n\n    def generate():\n        # Special case for urllib3.\n        if hasattr(self.raw, 'stream'):\n            try:\n                for chunk in self.raw.stream(chunk_size, decode_content=False):\n                    yield chunk\n            except ProtocolError as e:\n                raise ChunkedEncodingError(e)\n            except DecodeError as e:\n                raise ContentDecodingError(e)\n            except ReadTimeoutError as e:\n                raise ConnectionError(e)\n        else:\n            # Standard file-like object.\n            while True:\n                chunk = self.raw.read(chunk_size)\n                if not chunk:\n                    break\n                yield chunk\n\n        self._content_consumed = True\n\n    if self._content_consumed and isinstance(self._content, bool):\n        raise StreamConsumedError()\n    elif chunk_size is not None and not isinstance(chunk_size, int):\n        raise TypeError(\"chunk_size must be an int, it is instead a %s.\" % type(chunk_size))\n    # simulate reading small chunks of the content\n    reused_chunks = iter_slices(self._content, chunk_size)\n\n    stream_chunks = generate()\n\n    chunks = reused_chunks if self._content_consumed else stream_chunks\n\n    if decode_unicode:\n        chunks = 
stream_decode_response_unicode(chunks, self)\n\n    return chunks\n\n\ndef check_url(u):\n    for exp in (exp1, exp2, exp3, exp4, exp5):\n        m = exp.match(u)\n        if m:\n            return m\n    return False\n\n\n@app.route('/<path:u>', methods=['GET', 'POST'])\ndef handler(u):\n    u = u if u.startswith('http') else 'https://' + u\n    if u.rfind('://', 3, 9) == -1:\n        u = u.replace('s:/', 's://', 1)  # uwsgi passes // through as /\n    pass_by = False\n    m = check_url(u)\n    if m:\n        m = tuple(m.groups())\n        if white_list:\n            for i in white_list:\n                if m[:len(i)] == i or i[0] == '*' and len(m) == 2 and m[1] == i[1]:\n                    break\n            else:\n                return Response('Forbidden by white list.', status=403)\n        for i in black_list:\n            if m[:len(i)] == i or i[0] == '*' and len(m) == 2 and m[1] == i[1]:\n                return Response('Forbidden by black list.', status=403)\n        for i in pass_list:\n            if m[:len(i)] == i or i[0] == '*' and len(m) == 2 and m[1] == i[1]:\n                pass_by = True\n                break\n    else:\n        return Response('Invalid input.', status=403)\n\n    if (jsdelivr or pass_by) and exp2.match(u):\n        u = u.replace('/blob/', '@', 1).replace('github.com', 'cdn.jsdelivr.net/gh', 1)\n        return redirect(u)\n    elif (jsdelivr or pass_by) and exp4.match(u):\n        u = re.sub(r'(\\.com/.*?/.+?)/(.+?/)', r'\\1@\\2', u, 1)\n        _u = u.replace('raw.githubusercontent.com', 'cdn.jsdelivr.net/gh', 1)\n        u = u.replace('raw.github.com', 'cdn.jsdelivr.net/gh', 1) if _u == u else _u\n        return redirect(u)\n    else:\n        if exp2.match(u):\n            u = u.replace('/blob/', '/raw/', 1)\n        if pass_by:\n            url = u + request.url.replace(request.base_url, '', 1)\n            if url.startswith('https:/') and not url.startswith('https://'):\n                url = 'https://' + url[7:]\n            return redirect(url)\n        u = quote(u, safe='/:')\n        return proxy(u)\n\n\ndef proxy(u, allow_redirects=False):\n    headers = {}\n    r_headers = dict(request.headers)\n    if 'Host' in r_headers:\n        r_headers.pop('Host')\n    try:\n        url = u + request.url.replace(request.base_url, '', 1)\n        if url.startswith('https:/') and not url.startswith('https://'):\n            url = 'https://' + url[7:]\n        r = requests.request(method=request.method, url=url, data=request.data, headers=r_headers, stream=True, allow_redirects=allow_redirects)\n        headers = dict(r.headers)\n\n        if 'Content-length' in r.headers and int(r.headers['Content-length']) > size_limit:\n            return redirect(u + request.url.replace(request.base_url, '', 1))\n\n        def generate():\n            for chunk in iter_content(r, chunk_size=CHUNK_SIZE):\n                yield chunk\n\n        if 'Location' in r.headers:\n            _location = r.headers.get('Location')\n            if check_url(_location):\n                headers['Location'] = '/' + _location\n            else:\n                return proxy(_location, True)\n\n        return Response(generate(), headers=headers, status=r.status_code)\n    except Exception as e:\n        headers['content-type'] = 'text/html; charset=UTF-8'\n        return Response('server error ' + str(e), status=500, headers=headers)\n\napp.debug = True\nif __name__ == '__main__':\n    app.run(host=HOST, port=PORT)\n"
  },
  {
    "path": "app/uwsgi.ini",
    "content": "[uwsgi]\nmodule = main\ncallable = app\n"
  },
  {
    "path": "entrypoint.sh",
    "content": "#! /usr/bin/env bash\nset -e\n\n/uwsgi-nginx-entrypoint.sh\n\n# Get the listen port for Nginx, default to 80\nUSE_LISTEN_PORT=${LISTEN_PORT:-80}\n\nif [ -f /app/nginx.conf ]; then\n    cp /app/nginx.conf /etc/nginx/nginx.conf\nelse\n    content_server='server {\\n'\n    content_server=$content_server\"    listen ${USE_LISTEN_PORT};\\n\"\n    content_server=$content_server'    location / {\\n'\n    content_server=$content_server'        try_files $uri @app;\\n'\n    content_server=$content_server'    }\\n'\n    content_server=$content_server'    location @app {\\n'\n    content_server=$content_server'        include uwsgi_params;\\n'\n    content_server=$content_server'        uwsgi_pass unix:///tmp/uwsgi.sock;\\n'\n    content_server=$content_server'        uwsgi_buffer_size 256k;\\n'\n    content_server=$content_server'        uwsgi_buffers 32 512k;\\n'\n    content_server=$content_server'        uwsgi_busy_buffers_size 512k;\\n'\n    content_server=$content_server'    }\\n'\n    content_server=$content_server'}\\n'\n    # Save generated server /etc/nginx/conf.d/nginx.conf\n    printf \"$content_server\" > /etc/nginx/conf.d/nginx.conf\nfi\n\nexec \"$@\"\n"
  },
  {
    "path": "index.js",
    "content": "'use strict'\n\n/**\n * static files (404.html, sw.js, conf.js)\n */\nconst ASSET_URL = 'https://hunshcn.github.io/gh-proxy/'\n// 前缀，如果自定义路由为example.com/gh/*，将PREFIX改为 '/gh/'，注意，少一个杠都会错！\nconst PREFIX = '/'\n// 分支文件使用jsDelivr镜像的开关，0为关闭，默认关闭\nconst Config = {\n    jsdelivr: 0\n}\n\nconst whiteList = [] // 白名单，路径里面有包含字符的才会通过，e.g. ['/username/']\n\n/** @type {ResponseInit} */\nconst PREFLIGHT_INIT = {\n    status: 204,\n    headers: new Headers({\n        'access-control-allow-origin': '*',\n        'access-control-allow-methods': 'GET,POST,PUT,PATCH,TRACE,DELETE,HEAD,OPTIONS',\n        'access-control-max-age': '1728000',\n    }),\n}\n\n\nconst exp1 = /^(?:https?:\\/\\/)?github\\.com\\/.+?\\/.+?\\/(?:releases|archive)\\/.*$/i\nconst exp2 = /^(?:https?:\\/\\/)?github\\.com\\/.+?\\/.+?\\/(?:blob|raw)\\/.*$/i\nconst exp3 = /^(?:https?:\\/\\/)?github\\.com\\/.+?\\/.+?\\/(?:info|git-).*$/i\nconst exp4 = /^(?:https?:\\/\\/)?raw\\.(?:githubusercontent|github)\\.com\\/.+?\\/.+?\\/.+?\\/.+$/i\nconst exp5 = /^(?:https?:\\/\\/)?gist\\.(?:githubusercontent|github)\\.com\\/.+?\\/.+?\\/.+$/i\nconst exp6 = /^(?:https?:\\/\\/)?github\\.com\\/.+?\\/.+?\\/tags.*$/i\n\n/**\n * @param {any} body\n * @param {number} status\n * @param {Object<string, string>} headers\n */\nfunction makeRes(body, status = 200, headers = {}) {\n    headers['access-control-allow-origin'] = '*'\n    return new Response(body, { status, headers })\n}\n\n\n/**\n * @param {string} urlStr\n */\nfunction newUrl(urlStr) {\n    try {\n        return new URL(urlStr)\n    } catch (err) {\n        return null\n    }\n}\n\n\naddEventListener('fetch', e => {\n    const ret = fetchHandler(e)\n        .catch(err => makeRes('cfworker error:\\n' + err.stack, 502))\n    e.respondWith(ret)\n})\n\n\nfunction checkUrl(u) {\n    for (let i of [exp1, exp2, exp3, exp4, exp5, exp6]) {\n        if (u.search(i) === 0) {\n            return true\n        }\n    }\n    return false\n}\n\n/**\n * @param {FetchEvent} e\n 
*/\nasync function fetchHandler(e) {\n    const req = e.request\n    const urlStr = req.url\n    const urlObj = new URL(urlStr)\n    let path = urlObj.searchParams.get('q')\n    if (path) {\n        return Response.redirect('https://' + urlObj.host + PREFIX + path, 301)\n    }\n    // cfworker merges `//` in the path into `/`\n    path = urlObj.href.slice(urlObj.origin.length + PREFIX.length).replace(/^https?:\\/+/, 'https://')\n    if (path.search(exp1) === 0 || path.search(exp5) === 0 || path.search(exp6) === 0 || path.search(exp3) === 0) {\n        return httpHandler(req, path)\n    } else if (path.search(exp2) === 0) {\n        if (Config.jsdelivr) {\n            const newUrl = path.replace('/blob/', '@').replace(/^(?:https?:\\/\\/)?github\\.com/, 'https://cdn.jsdelivr.net/gh')\n            return Response.redirect(newUrl, 302)\n        } else {\n            path = path.replace('/blob/', '/raw/')\n            return httpHandler(req, path)\n        }\n    } else if (path.search(exp4) === 0) {\n        if (Config.jsdelivr) {\n            const newUrl = path.replace(/(?<=com\\/.+?\\/.+?)\\/(.+?\\/)/, '@$1').replace(/^(?:https?:\\/\\/)?raw\\.(?:githubusercontent|github)\\.com/, 'https://cdn.jsdelivr.net/gh')\n            return Response.redirect(newUrl, 302)\n        }\n        else {\n            return httpHandler(req, path)\n        }\n    } else {\n        return fetch(ASSET_URL + path)\n    }\n}\n\n\n/**\n * @param {Request} req\n * @param {string} pathname\n */\nfunction httpHandler(req, pathname) {\n    const reqHdrRaw = req.headers\n\n    // preflight\n    if (req.method === 'OPTIONS' &&\n        reqHdrRaw.has('access-control-request-headers')\n    ) {\n        return new Response(null, PREFLIGHT_INIT)\n    }\n\n    const reqHdrNew = new Headers(reqHdrRaw)\n\n    let urlStr = pathname\n    let flag = !Boolean(whiteList.length)\n    for (let i of whiteList) {\n        if (urlStr.includes(i)) {\n            flag = true\n            break\n        }\n    }\n    if (!flag) 
{\n        return new Response(\"blocked\", { status: 403 })\n    }\n    if (urlStr.search(/^https?:\\/\\//) !== 0) {\n        urlStr = 'https://' + urlStr\n    }\n    const urlObj = newUrl(urlStr)\n\n    /** @type {RequestInit} */\n    const reqInit = {\n        method: req.method,\n        headers: reqHdrNew,\n        redirect: 'manual',\n        body: req.body\n    }\n    return proxy(urlObj, reqInit)\n}\n\n\n/**\n *\n * @param {URL} urlObj\n * @param {RequestInit} reqInit\n */\nasync function proxy(urlObj, reqInit) {\n    const res = await fetch(urlObj.href, reqInit)\n    const resHdrOld = res.headers\n    const resHdrNew = new Headers(resHdrOld)\n\n    const status = res.status\n\n    if (resHdrNew.has('location')) {\n        let _location = resHdrNew.get('location')\n        if (checkUrl(_location))\n            resHdrNew.set('location', PREFIX + _location)\n        else {\n            reqInit.redirect = 'follow'\n            return proxy(newUrl(_location), reqInit)\n        }\n    }\n    resHdrNew.set('access-control-expose-headers', '*')\n    resHdrNew.set('access-control-allow-origin', '*')\n\n    resHdrNew.delete('content-security-policy')\n    resHdrNew.delete('content-security-policy-report-only')\n    resHdrNew.delete('clear-site-data')\n\n    return new Response(res.body, {\n        status,\n        headers: resHdrNew,\n    })\n}\n\n"
  }
]