Repository: hwfan/DriveDownloader
Branch: master
Commit: 22517cac9ab3
Files: 31
Total size: 117.3 KB
Directory structure:
gitextract_fmy0tce_/
├── .gitignore
├── DriveDownloader/
│   ├── __init__.py
│   ├── downloader.py
│   ├── netdrives/
│   │   ├── __init__.py
│   │   ├── basedrive.py
│   │   ├── build.py
│   │   ├── directlink.py
│   │   ├── dropbox.py
│   │   ├── googledrive.py
│   │   ├── onedrive.py
│   │   ├── settings.yaml
│   │   └── sharepoint.py
│   ├── pydrive2/
│   │   ├── __init__.py
│   │   ├── apiattr.py
│   │   ├── auth.py
│   │   ├── drive.py
│   │   ├── files.py
│   │   ├── fs/
│   │   │   ├── __init__.py
│   │   │   ├── spec.py
│   │   │   └── utils.py
│   │   └── settings.py
│   └── utils/
│       ├── __init__.py
│       ├── misc.py
│       └── multithread.py
├── LICENSE
├── README.md
├── README_CN.md
├── requirements.txt
├── setup.py
└── tests/
    ├── run.sh
    └── test.list
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
pack.sh
================================================
FILE: DriveDownloader/__init__.py
================================================
================================================
FILE: DriveDownloader/downloader.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
from DriveDownloader.netdrives import get_session
from DriveDownloader.utils import judge_session, MultiThreadDownloader, judge_scheme
import argparse
import os
import sys
from rich.console import Console
from rich.progress import (
BarColumn,
DownloadColumn,
Progress,
TaskID,
TextColumn,
TimeRemainingColumn,
TransferSpeedColumn,
)
MAJOR_VERSION = 1
MINOR_VERSION = 6
POST_VERSION = 0
__version__ = f"{MAJOR_VERSION}.{MINOR_VERSION}.{POST_VERSION}"
console = Console(width=72)
url_scheme_env_key_map = {
"http": "http_proxy",
"https": "https_proxy",
}
def parse_args():
parser = argparse.ArgumentParser(description='Drive Downloader Args')
parser.add_argument('url', help='URL you want to download from.', default='', type=str)
parser.add_argument('--filename', '-o', help='Target file name.', default='', type=str)
parser.add_argument('--thread-number', '-n', help='Number of download threads.', type=int, default=1)
parser.add_argument('--version', '-v', action='version', version=__version__, help='Version.')
parser.add_argument('--force-back-google', '-F', help='Force the backup downloader for Google Drive.', action='store_true')
args = parser.parse_args()
return args
def get_env(key):
    """Return the value of environment variable `key`, or None if it is unset or empty."""
    value = os.environ.get(key)
    if not value:
        return None
    return value
def download_single_file(url, filename="", thread_number=1, force_back_google=False, list_suffix=None):
scheme = judge_scheme(url)
if scheme not in url_scheme_env_key_map.keys():
raise NotImplementedError(f"Unsupported scheme {scheme}")
env_key = url_scheme_env_key_map[scheme]
used_proxy = get_env(env_key)
session_name = judge_session(url)
session_func = get_session(session_name)
google_fix_logic = False
if session_name == 'GoogleDrive' and thread_number > 1 and not force_back_google:
thread_number = 1
google_fix_logic = True
single_progress = Progress(
TextColumn("[bold blue]Downloading: ", justify="left"),
BarColumn(bar_width=15),
"[progress.percentage]{task.percentage:>3.1f}%",
"|",
DownloadColumn(),
"|",
TransferSpeedColumn(),
"|",
TimeRemainingColumn(),
refresh_per_second=10
)
multi_progress = Progress(
TextColumn("[bold blue]Thread {task.fields[proc_id]}: ", justify="left"),
BarColumn(bar_width=15),
"[progress.percentage]{task.percentage:>3.1f}%",
"|",
DownloadColumn(),
"|",
TransferSpeedColumn(),
"|",
TimeRemainingColumn(),
refresh_per_second=10
)
progress_applied = multi_progress if thread_number > 1 else single_progress
download_session = session_func(used_proxy)
download_session.connect(url, filename, force_backup=force_back_google if session_name == 'GoogleDrive' else False)
final_filename = download_session.filename
download_session.show_info(progress_applied, list_suffix)
if google_fix_logic:
console.print('[yellow]Warning: Google Drive URL detected. Only one thread will be created.')
if thread_number > 1:
download_session = MultiThreadDownloader(progress_applied, session_func, used_proxy, download_session.filesize, thread_number)
interrupted = download_session.get(url, final_filename, force_back_google)
if interrupted:
return
download_session.concatenate(final_filename)
else:
with progress_applied:
task_id = progress_applied.add_task("download", filename=final_filename, proc_id=0, start=False)
interrupted = download_session.save_response_content(progress_bar=progress_applied)
if interrupted:
return
console.print('[green]Bye.')
def download_filelist(args):
    # Each line has the form "<url> [filename] [thread_number]", separated by single spaces.
    with open(args.url, 'r') as f:
        lines = f.readlines()
    for line_idx, line in enumerate(lines):
        splitted_line = line.strip().split(" ")
        url = splitted_line[0]
        filename = splitted_line[1] if len(splitted_line) > 1 else ""
        thread_number = int(splitted_line[2]) if len(splitted_line) > 2 else 1
        list_suffix = "({:d}/{:d})".format(line_idx+1, len(lines))
        download_single_file(url, filename, thread_number, args.force_back_google, list_suffix)
def simple_cli():
console.print(f"***********************************************************************")
console.print(f"* *")
console.print(f"* DriveDownloader {MAJOR_VERSION}.{MINOR_VERSION}.{POST_VERSION} *")
console.print(f"* Homesite: https://github.com/hwfan/DriveDownloader *")
console.print(f"* *")
console.print(f"***********************************************************************")
args = parse_args()
assert len(args.url) > 0, "Please input your URL or filelist path!"
if os.path.exists(args.url):
console.print('Downloading filelist: {:s}'.format(os.path.basename(args.url)))
download_filelist(args)
else:
download_single_file(args.url, args.filename, args.thread_number, args.force_back_google)
if __name__ == '__main__':
simple_cli()
================================================
FILE: DriveDownloader/netdrives/__init__.py
================================================
from .build import get_session
================================================
FILE: DriveDownloader/netdrives/basedrive.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
import requests
import requests_random_user_agent
import sys
import re
import os
from tqdm import tqdm
from DriveDownloader.utils.misc import *
from threading import Event
import signal
from rich.console import Console
from googleapiclient.http import _retry_request, DEFAULT_CHUNK_SIZE
import time
import random
console = Console(width=71)
done_event = Event()
def handle_sigint(signum, frame):
console.print("\n[yellow]Interrupted. Will shut down after the current chunk is downloaded.\n")
done_event.set()
signal.signal(signal.SIGINT, handle_sigint)
class DriveSession:
def __init__(self, proxy=None, chunk_size=32768):
self.session = requests.Session()
self.session.headers['Accept-Encoding'] = ''
if proxy is None:
self.proxies = None
else:
self.proxies = { "http": proxy, "https": proxy, }
self.params = dict()
self.chunk_size = chunk_size
self.filename = ''
self.filesize = None
self.response = None
self.file_handler = None
self.base_url = None
def generate_url(self, url):
raise NotImplementedError
def set_range(self, start, end):
self.session.headers['Range'] = 'bytes={:s}-{:s}'.format(str(start), str(end))
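    def parse_response_header(self):
        # Fall back to a placeholder name when the header is missing or has no quoted filename.
        try:
            pattern = re.compile(r'filename=\"(.*?)\"')
            filename = pattern.findall(self.response.headers['content-disposition'])[0]
        except (KeyError, IndexError):
            filename = 'noname.out'
        # Content-Length may be absent (e.g. chunked transfer) or malformed.
        try:
            header_size = int(self.response.headers['Content-Length'])
        except (KeyError, ValueError):
            header_size = None
        return filename, header_size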
def save_response_content(self, start=None, end=None, proc_id=-1, progress_bar=None):
dirname = os.path.dirname(self.filename)
if len(dirname) > 0:
os.makedirs(dirname, exist_ok=True)
interrupted = False
if proc_id >= 0:
name, ext = os.path.splitext(self.filename)
name = name + '_{}'.format(proc_id)
sub_filename = name + ext
sub_dirname = os.path.dirname(sub_filename)
sub_basename = os.path.basename(sub_filename)
sub_tmp_dirname = os.path.join(sub_dirname, 'tmp')
os.makedirs(sub_tmp_dirname, exist_ok=True)
sub_filename = os.path.join(sub_tmp_dirname, sub_basename)
used_filename = sub_filename
else:
proc_id = 0
used_filename = self.filename
start = 0
end = self.filesize-1
ori_filesize = os.path.getsize(used_filename) if os.path.exists(used_filename) else 0
self.file_handler = open(used_filename, 'ab' if ori_filesize > 0 else 'wb' )
progress_bar.update(proc_id, total=end+1-start)
progress_bar.start_task(proc_id)
progress_bar.update(proc_id, advance=ori_filesize)
if 'googleapiclient' in str(type(self.response)):
self.chunk_size = 1 * 1024 * 1024
_headers = {}
for k, v in self.response.headers.items():
if not k.lower() in ("accept", "accept-encoding", "user-agent"):
_headers[k] = v
cur_state = start + ori_filesize
while cur_state < end + 1:
headers = _headers.copy()
remained = end + 1 - cur_state
chunk_size = self.chunk_size if remained >= self.chunk_size else remained
headers["range"] = "bytes=%d-%d" % (
cur_state,
cur_state + chunk_size - 1,
)
http = self.response.http
resp, content = _retry_request(
http,
0,
"media download",
time.sleep,
random.random,
self.response.uri,
"GET",
headers=headers,
)
self.file_handler.write(content)
progress_bar.update(proc_id, advance=len(content))
cur_state += len(content)
if done_event.is_set():
interrupted = True
return interrupted
else:
if ori_filesize > 0:
self.set_range(start + ori_filesize, end)
self.response = self.session.get(self.base_url, params=self.params, proxies=self.proxies, stream=True)
else:
self.set_range(start, end)
cur_state = start + ori_filesize
for chunk in self.response.iter_content(self.chunk_size):
if cur_state >= end + 1:
break
self.file_handler.write(chunk)
chunk_num = len(chunk)
progress_bar.update(proc_id, advance=chunk_num)
cur_state += chunk_num
if done_event.is_set():
interrupted = True
return interrupted
def connect(self, url, custom_filename=''):
self.base_url = url
self.response = self.session.get(url, params=self.params, proxies=self.proxies, stream=True)
if self.response.status_code // 100 >= 4:
raise RuntimeError("Bad status code {}. Please check your connection.".format(self.response.status_code))
filename_parsed, self.filesize = self.parse_response_header()
self.filename = filename_parsed if len(custom_filename) == 0 else custom_filename
def show_info(self, progress_bar, list_suffix):
filesize_str = str(format_size(self.filesize)) if self.filesize is not None else 'Invalid'
progress_bar.console.print('{:s}Name: {:s}, Size: {:s}'.format(list_suffix+' ' if list_suffix else '', self.filename, filesize_str))
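# Illustrative sketch (not part of the original module): how the resume offset used in
# save_response_content maps to an HTTP Range header. The helper name
# `resume_range_header` is hypothetical. It assumes `start` and `end` are inclusive byte
# offsets and `bytes_on_disk` is the size of the partial file already saved.
def resume_range_header(start, end, bytes_on_disk=0):
    """Return the Range header value for the remaining bytes, or None if nothing is left."""
    first_byte = start + bytes_on_disk
    if first_byte > end:
        # The whole part is already on disk, so no request is needed.
        return None
    return 'bytes={:d}-{:d}'.format(first_byte, end)

# Example: a 100-byte part with 40 bytes already saved resumes at offset 40.
# resume_range_header(0, 99, 40) returns 'bytes=40-99'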
================================================
FILE: DriveDownloader/netdrives/build.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
from .googledrive import GoogleDriveSession
from .onedrive import OneDriveSession
from .sharepoint import SharePointSession
from .dropbox import DropBoxSession
from .directlink import DirectLink
__factory__ = {"GoogleDrive": GoogleDriveSession,
"OneDrive": OneDriveSession,
"SharePoint": SharePointSession,
"DropBox": DropBoxSession,
"DirectLink": DirectLink,
}
def get_session(name):
return __factory__[name]
================================================
FILE: DriveDownloader/netdrives/directlink.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
import urllib.parse as urlparse
import os
from DriveDownloader.netdrives.basedrive import DriveSession
class DirectLink(DriveSession):
def __init__(self, *args, **kwargs):
DriveSession.__init__(self, *args, **kwargs)
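    def parse_response_header(self):
        # A direct link carries no Content-Disposition name, so use the last URL path component.
        filename = os.path.basename(self.response.url)
        try:
            header_size = int(self.response.headers['Content-Length'])
        except (KeyError, ValueError):
            header_size = None
        return filename, header_size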
def generate_url(self, url):
return url
def connect(self, url, custom_filename='', proc_id=-1, force_backup=False):
generated_url = self.generate_url(url)
DriveSession.connect(self, generated_url, custom_filename=custom_filename)
================================================
FILE: DriveDownloader/netdrives/dropbox.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
import urllib.parse as urlparse
from DriveDownloader.netdrives.basedrive import DriveSession
class DropBoxSession(DriveSession):
def __init__(self, *args, **kwargs):
DriveSession.__init__(self, *args, **kwargs)
def generate_url(self, url):
'''
Solution provided by:
https://sunpma.com/564.html
'''
parsed_url = urlparse.urlparse(url)
netloc = parsed_url.netloc.replace('www', 'dl-web')
query = ''
parsed_url = parsed_url._replace(netloc=netloc, query=query)
resultUrl = urlparse.urlunparse(parsed_url)
return resultUrl
def connect(self, url, custom_filename='', proc_id=-1, force_backup=False):
generated_url = self.generate_url(url)
DriveSession.connect(self, generated_url, custom_filename=custom_filename)
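# Illustrative sketch (not part of the original module): the rewrite performed by
# generate_url, shown as a standalone function. `to_direct_dropbox_url` is a hypothetical
# name, and the sketch assumes the share link uses the `www.` host prefix. Unlike
# str.replace, it only rewrites a leading 'www.', so any other occurrence of 'www' in
# the host is left alone.
import urllib.parse


def to_direct_dropbox_url(url):
    parsed = urllib.parse.urlparse(url)
    netloc = parsed.netloc
    if netloc.startswith('www.'):
        netloc = 'dl-web.' + netloc[len('www.'):]
    # Drop the query string (e.g. '?dl=0'), as generate_url does.
    return urllib.parse.urlunparse(parsed._replace(netloc=netloc, query=''))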
================================================
FILE: DriveDownloader/netdrives/googledrive.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
import urllib.parse as urlparse
from DriveDownloader.netdrives.basedrive import DriveSession
from DriveDownloader.pydrive2.auth import GoogleAuth
from DriveDownloader.pydrive2.drive import GoogleDrive
import os
import sys
from rich.console import Console
googleauthdata = \
'''
client_config_backend: settings
client_config:
client_id: 367116221053-7n0vf5akeru7on6o2fjinrecpdoe99eg.apps.googleusercontent.com
client_secret: 1qsNodXNaWq1mQuBjUjmvhoO
save_credentials: True
save_credentials_backend: file
get_refresh_token: True
oauth_scope:
- https://www.googleapis.com/auth/drive
'''
info = \
'''
+-------------------------------------------------------------------+
|Warning: DriveDownloader is using the backup downloader due to the |
|forbiddance or manual setting. If this is the first time you meet |
|the notice, please follow the instructions to login your Google |
|Account. This operation only needs to be done once. |
+-------------------------------------------------------------------+
'''
console = Console(width=71)
class GoogleDriveSession(DriveSession):
def __init__(self, *args, **kwargs):
DriveSession.__init__(self, *args, **kwargs)
def generate_url(self, url):
'''
Solution provided by:
https://stackoverflow.com/questions/25010369/wget-curl-large-file-from-google-drive
'''
parsed_url = urlparse.urlparse(url)
parsed_qs = urlparse.parse_qs(parsed_url.query)
if 'id' in parsed_qs:
id_str = parsed_qs['id'][0]
else:
id_str = parsed_url.path.split('/')[3]
replaced_url = "https://drive.google.com/u/0/uc?export=download"
return replaced_url, id_str
def connect(self, url, custom_filename='', force_backup=False, proc_id=-1):
replaced_url, id_str = self.generate_url(url)
if force_backup:
self.backup_connect(url, custom_filename, id_str, proc_id=proc_id)
return
try:
self.params["id"] = id_str
self.params["confirm"] = "t"
DriveSession.connect(self, replaced_url, custom_filename=custom_filename)
except:
self.backup_connect(url, custom_filename, id_str, proc_id=proc_id)
def backup_connect(self, url, custom_filename, id_str, proc_id=-1):
if proc_id == -1:
console.print(info)
settings_file_path = os.path.join(os.path.dirname(__file__), 'settings.yaml')
if not os.path.exists(settings_file_path):
with open(settings_file_path, "w") as f:
f.write(googleauthdata)
self.gauth = GoogleAuth(settings_file=settings_file_path)
self.gauth.CommandLineAuth()
self.gid_str = id_str
drive = GoogleDrive(self.gauth)
file = drive.CreateFile({"id": id_str})
self.filename = file['title'] if len(custom_filename) == 0 else custom_filename
self.filesize = int(file['fileSize'])  # byte count; keep it an integer for range arithmetic
self.response = self.gauth.service.files().get_media(fileId=id_str)
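# Illustrative sketch (not part of the original module): the two URL shapes that
# generate_url distinguishes when looking for the file ID. `extract_drive_id` is a
# hypothetical name and the IDs in the examples are placeholders. It assumes path-style
# links look like /file/d/<id>/..., and raises instead of failing with an IndexError.
import urllib.parse


def extract_drive_id(url):
    parsed = urllib.parse.urlparse(url)
    query = urllib.parse.parse_qs(parsed.query)
    if 'id' in query:
        # Query-style link, e.g. https://drive.google.com/open?id=FILE_ID
        return query['id'][0]
    parts = parsed.path.split('/')
    if len(parts) < 4:
        raise ValueError('Cannot find a file ID in {!r}'.format(url))
    # Path-style link, e.g. https://drive.google.com/file/d/FILE_ID/view
    return parts[3]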
================================================
FILE: DriveDownloader/netdrives/onedrive.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
import base64
from DriveDownloader.netdrives.basedrive import DriveSession
class OneDriveSession(DriveSession):
def __init__(self, *args, **kwargs):
DriveSession.__init__(self, *args, **kwargs)
def generate_url(self, url):
'''
Solution provided by:
https://towardsdatascience.com/how-to-get-onedrive-direct-download-link-ecb52a62fee4
'''
data_bytes64 = base64.b64encode(bytes(url, 'utf-8'))
data_bytes64_String = data_bytes64.decode('utf-8').replace('/','_').replace('+','-').rstrip("=")
resultUrl = f"https://api.onedrive.com/v1.0/shares/u!{data_bytes64_String}/root/content"
return resultUrl
def connect(self, url, custom_filename='', proc_id=-1, force_backup=False):
generated_url = self.generate_url(url)
DriveSession.connect(self, generated_url, custom_filename=custom_filename)
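# Illustrative sketch (not part of the original module): the share-token encoding used by
# generate_url is URL-safe base64 without padding, prefixed with 'u!'. The helper names
# below are hypothetical. The decoder is included only to show that the encoding is
# reversible once the stripped '=' padding is restored.
import base64


def encode_share_token(url):
    token = base64.urlsafe_b64encode(url.encode('utf-8')).decode('ascii')
    return 'u!' + token.rstrip('=')


def decode_share_token(token):
    body = token[len('u!'):]
    # Restore the padding so the length is a multiple of four again.
    padded = body + '=' * (-len(body) % 4)
    return base64.urlsafe_b64decode(padded).decode('utf-8')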
================================================
FILE: DriveDownloader/netdrives/settings.yaml
================================================
client_config_backend: settings
client_config:
client_id: 367116221053-7n0vf5akeru7on6o2fjinrecpdoe99eg.apps.googleusercontent.com
client_secret: 1qsNodXNaWq1mQuBjUjmvhoO
save_credentials: True
save_credentials_backend: file
get_refresh_token: True
oauth_scope:
- https://www.googleapis.com/auth/drive
================================================
FILE: DriveDownloader/netdrives/sharepoint.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
import urllib.parse as urlparse
from DriveDownloader.netdrives.basedrive import DriveSession
class SharePointSession(DriveSession):
def __init__(self, *args, **kwargs):
DriveSession.__init__(self, *args, **kwargs)
def generate_url(self, url):
'''
Solution provided by:
https://www.qian.blue/archives/OneDrive-straight.html
'''
parsed_url = urlparse.urlparse(url)
path = parsed_url.path
netloc = parsed_url.netloc
splitted_path = path.split('/')
personal_attr, domain, sharelink = splitted_path[3:6]
resultUrl = f"https://{netloc}/{personal_attr}/{domain}/_layouts/52/download.aspx?share={sharelink}"
return resultUrl
def connect(self, url, custom_filename='', proc_id=-1, force_backup=False):
generated_url = self.generate_url(url)
DriveSession.connect(self, generated_url, custom_filename=custom_filename)
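# Illustrative sketch (not part of the original module): how generate_url slices a share
# link into its parts. `to_sharepoint_download_url` is a hypothetical name and the host
# and path used in the example are made up. It assumes links of the form
# https://<host>/:u:/g/<personal_attr>/<domain>/<sharelink>.
import urllib.parse


def to_sharepoint_download_url(url):
    parsed = urllib.parse.urlparse(url)
    parts = parsed.path.split('/')
    if len(parts) < 6:
        raise ValueError('Unexpected SharePoint link layout: {!r}'.format(url))
    personal_attr, domain, sharelink = parts[3:6]
    return 'https://{}/{}/{}/_layouts/52/download.aspx?share={}'.format(
        parsed.netloc, personal_attr, domain, sharelink)

# Example:
# to_sharepoint_download_url('https://contoso-my.sharepoint.com/:u:/g/personal/user_contoso_com/EabcDEF')
# returns 'https://contoso-my.sharepoint.com/personal/user_contoso_com/_layouts/52/download.aspx?share=EabcDEF'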
================================================
FILE: DriveDownloader/pydrive2/__init__.py
================================================
# Credit: https://github.com/iterative/PyDrive2
================================================
FILE: DriveDownloader/pydrive2/apiattr.py
================================================
# Credit: https://github.com/iterative/PyDrive2
from six import Iterator, iteritems
class ApiAttribute(object):
"""A data descriptor that sets and returns values."""
def __init__(self, name):
"""Create an instance of ApiAttribute.
:param name: name of this attribute.
:type name: str.
"""
self.name = name
def __get__(self, obj, type=None):
"""Accesses value of this attribute."""
return obj.attr.get(self.name)
def __set__(self, obj, value):
"""Write value of this attribute."""
obj.attr[self.name] = value
if obj.dirty.get(self.name) is not None:
obj.dirty[self.name] = True
def __del__(self, obj=None):
"""Delete value of this attribute."""
if not obj:
return
del obj.attr[self.name]
if obj.dirty.get(self.name) is not None:
del obj.dirty[self.name]
class ApiAttributeMixin(object):
"""Mixin to initialize required global variables to use ApiAttribute."""
def __init__(self):
self.attr = {}
self.dirty = {}
self.http = None # Any element may make requests and will require this
# field.
class ApiResource(dict):
"""Super class of all api resources.
Inherits and behaves as a python dictionary to handle api resources.
Save clean copy of metadata in self.metadata as a dictionary.
Provides changed metadata elements to efficiently update api resources.
"""
auth = ApiAttribute("auth")
def __init__(self, *args, **kwargs):
"""Create an instance of ApiResource."""
super(ApiResource, self).__init__()
self.update(*args, **kwargs)
self.metadata = dict(self)
def __getitem__(self, key):
"""Overwritten method of dictionary.
:param key: key of the query.
:type key: str.
:returns: value of the query.
"""
return dict.__getitem__(self, key)
def __setitem__(self, key, val):
"""Overwritten method of dictionary.
:param key: key of the query.
:type key: str.
:param val: value of the query.
"""
dict.__setitem__(self, key, val)
def __repr__(self):
"""Overwritten method of dictionary."""
dict_representation = dict.__repr__(self)
return "%s(%s)" % (type(self).__name__, dict_representation)
def update(self, *args, **kwargs):
"""Overwritten method of dictionary."""
for k, v in iteritems(dict(*args, **kwargs)):
self[k] = v
def UpdateMetadata(self, metadata=None):
"""Update metadata and mark all of them to be clean."""
if metadata:
self.update(metadata)
self.metadata = dict(self)
def GetChanges(self):
"""Returns changed metadata elements to update api resources efficiently.
:returns: dict -- changed metadata elements.
"""
dirty = {}
for key in self:
if self.metadata.get(key) is None:
dirty[key] = self[key]
elif self.metadata[key] != self[key]:
dirty[key] = self[key]
return dirty
class ApiResourceList(ApiAttributeMixin, ApiResource, Iterator):
"""Abstract class of all api list resources.
Inherits ApiResource and builds iterator to list any API resource.
"""
metadata = ApiAttribute("metadata")
def __init__(self, auth=None, metadata=None):
"""Create an instance of ApiResourceList.
:param auth: authorized GoogleAuth instance.
:type auth: GoogleAuth.
:param metadata: parameter to send to list command.
:type metadata: dict.
"""
ApiAttributeMixin.__init__(self)
ApiResource.__init__(self)
self.auth = auth
self.UpdateMetadata()
if metadata:
self.update(metadata)
def __iter__(self):
"""Returns iterator object.
:returns: ApiResourceList -- self
"""
return self
def __next__(self):
"""Make API call to list resources and return them.
Auto updates 'pageToken' every time it makes API call and
raises StopIteration when it reached the end of iteration.
:returns: list -- list of API resources.
:raises: StopIteration
"""
if "pageToken" in self and self["pageToken"] is None:
raise StopIteration
result = self._GetList()
self["pageToken"] = self.metadata.get("nextPageToken")
return result
def GetList(self):
"""Get list of API resources.
If 'maxResults' is not specified, it will automatically iterate through
every resources available. Otherwise, it will make API call once and
update 'pageToken'.
:returns: list -- list of API resources.
"""
if self.get("maxResults") is None:
self["maxResults"] = 1000
result = []
for x in self:
result.extend(x)
del self["maxResults"]
return result
else:
return next(self)
def _GetList(self):
"""Helper function which actually makes API call.
Should be overwritten.
:raises: NotImplementedError
"""
raise NotImplementedError
def Reset(self):
"""Resets current iteration"""
if "pageToken" in self:
del self["pageToken"]
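# Illustrative sketch (not part of the original module): the comparison that
# ApiResource.GetChanges performs, shown on plain dictionaries. `changed_items` is a
# hypothetical name. As in GetChanges, a key counts as changed when the clean snapshot
# has no value for it or holds a different value.
def changed_items(clean, current):
    changes = {}
    for key, value in current.items():
        if clean.get(key) is None or clean[key] != value:
            changes[key] = value
    return changes

# Example: only the modified and the newly added keys are reported.
# changed_items({'title': 'a', 'size': 1}, {'title': 'b', 'size': 1, 'new': True})
# returns {'title': 'b', 'new': True}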
================================================
FILE: DriveDownloader/pydrive2/auth.py
================================================
# Credit: https://github.com/iterative/PyDrive2
import socket
import webbrowser
import httplib2
import oauth2client.clientsecrets as clientsecrets
from six.moves import input
import threading
from googleapiclient.discovery import build
from functools import wraps
from oauth2client.service_account import ServiceAccountCredentials
from oauth2client.client import FlowExchangeError
from oauth2client.client import AccessTokenRefreshError
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.client import OOB_CALLBACK_URN
from oauth2client.file import Storage
from oauth2client.tools import ClientRedirectHandler
from oauth2client.tools import ClientRedirectServer
from oauth2client._helpers import scopes_to_string
from .apiattr import ApiAttribute
from .apiattr import ApiAttributeMixin
from .settings import LoadSettingsFile
from .settings import ValidateSettings
from .settings import SettingsError
from .settings import InvalidConfigError
import os
class AuthError(Exception):
"""Base error for authentication/authorization errors."""
class InvalidCredentialsError(IOError):
"""Error trying to read credentials file."""
class AuthenticationRejected(AuthError):
"""User rejected authentication."""
class AuthenticationError(AuthError):
"""General authentication error."""
class RefreshError(AuthError):
"""Access token refresh error."""
def LoadAuth(decoratee):
"""Decorator to check if the auth is valid and loads auth if not."""
@wraps(decoratee)
def _decorated(self, *args, **kwargs):
# Initialize auth if needed.
if self.auth is None:
self.auth = GoogleAuth()
# Re-create access token if it expired.
if self.auth.access_token_expired:
if getattr(self.auth, "auth_method", False) == "service":
self.auth.ServiceAuth()
else:
self.auth.LocalWebserverAuth()
# Initialise service if not built yet.
if self.auth.service is None:
self.auth.Authorize()
# Ensure that a thread-safe HTTP object is provided.
if (
kwargs is not None
and "param" in kwargs
and kwargs["param"] is not None
and "http" in kwargs["param"]
and kwargs["param"]["http"] is not None
):
self.http = kwargs["param"]["http"]
del kwargs["param"]["http"]
else:
# If HTTP object not specified, create or reuse an HTTP
# object from the thread local storage.
if not getattr(self.auth.thread_local, "http", None):
self.auth.thread_local.http = self.auth.Get_Http_Object()
self.http = self.auth.thread_local.http
return decoratee(self, *args, **kwargs)
return _decorated
def CheckServiceAuth(decoratee):
"""Decorator to authorize service account."""
@wraps(decoratee)
def _decorated(self, *args, **kwargs):
self.auth_method = "service"
dirty = False
save_credentials = self.settings.get("save_credentials")
if self.credentials is None and save_credentials:
self.LoadCredentials()
if self.credentials is None:
decoratee(self, *args, **kwargs)
self.Authorize()
dirty = True
elif self.access_token_expired:
self.Refresh()
dirty = True
if dirty and save_credentials:
self.SaveCredentials()
return _decorated
def CheckAuth(decoratee):
"""Decorator to check if it requires OAuth2 flow request."""
@wraps(decoratee)
def _decorated(self, *args, **kwargs):
dirty = False
code = None
save_credentials = self.settings.get("save_credentials")
if self.credentials is None and save_credentials:
self.LoadCredentials()
if self.flow is None:
self.GetFlow()
if self.credentials is None:
code = decoratee(self, *args, **kwargs)
dirty = True
else:
if self.access_token_expired:
if self.credentials.refresh_token is not None:
self.Refresh()
else:
code = decoratee(self, *args, **kwargs)
dirty = True
if code is not None:
self.Auth(code)
if dirty and save_credentials:
self.SaveCredentials()
return _decorated
class GoogleAuth(ApiAttributeMixin, object):
"""Wrapper class for oauth2client library in google-api-python-client.
Loads all settings and credentials from one 'settings.yaml' file
and performs common OAuth2.0 related functionality such as authentication
and authorization.
"""
DEFAULT_SETTINGS = {
"client_config_backend": "file",
"client_config_file": "client_secrets.json",
"save_credentials": False,
"oauth_scope": ["https://www.googleapis.com/auth/drive"],
}
CLIENT_CONFIGS_LIST = [
"client_id",
"client_secret",
"auth_uri",
"token_uri",
"revoke_uri",
"redirect_uri",
]
SERVICE_CONFIGS_LIST = ["client_user_email"]
settings = ApiAttribute("settings")
client_config = ApiAttribute("client_config")
flow = ApiAttribute("flow")
credentials = ApiAttribute("credentials")
http = ApiAttribute("http")
service = ApiAttribute("service")
auth_method = ApiAttribute("auth_method")
def __init__(self, settings_file="settings.yaml", http_timeout=None):
"""Create an instance of GoogleAuth.
This constructor just sets the path of settings file.
It does not actually read the file.
:param settings_file: path of settings file. 'settings.yaml' by default.
:type settings_file: str.
"""
self.http_timeout = http_timeout
ApiAttributeMixin.__init__(self)
self.thread_local = threading.local()
self.client_config = {}
try:
self.settings = LoadSettingsFile(settings_file)
except SettingsError:
self.settings = self.DEFAULT_SETTINGS
else:
if self.settings is None:
self.settings = self.DEFAULT_SETTINGS
else:
ValidateSettings(self.settings)
@property
def access_token_expired(self):
"""Checks if access token doesn't exist or is expired.
:returns: bool -- True if access token doesn't exist or is expired.
"""
if self.credentials is None:
return True
return self.credentials.access_token_expired
@CheckAuth
def LocalWebserverAuth(
self, host_name="localhost", port_numbers=None, launch_browser=True
):
"""Authenticate and authorize from user by creating local web server and
retrieving the authentication code.
This function is not intended for web server applications. It starts a local web
server on behalf of a standalone application.
:param host_name: host name of the local web server.
:type host_name: str.
:param port_numbers: list of port numbers to try.
:type port_numbers: list.
:param launch_browser: should browser be launched automatically
:type launch_browser: bool
:returns: str -- code returned from local web server
:raises: AuthenticationRejected, AuthenticationError
"""
if port_numbers is None:
port_numbers = [
8080,
8090,
] # Mutable objects should not be default
# values, as each call's changes are global.
success = False
port_number = 0
for port in port_numbers:
port_number = port
try:
httpd = ClientRedirectServer(
(host_name, port), ClientRedirectHandler
)
except socket.error:
pass
else:
success = True
break
if success:
oauth_callback = "http://%s:%s/" % (host_name, port_number)
else:
print(
"Failed to start a local web server. Please check your firewall"
)
print(
"settings and locally running programs that may be blocking or"
)
print("using configured ports. Default ports are 8080 and 8090.")
raise AuthenticationError()
self.flow.redirect_uri = oauth_callback
authorize_url = self.GetAuthUrl()
if launch_browser:
webbrowser.open(authorize_url, new=1, autoraise=True)
print("Your browser has been opened to visit:")
else:
print("Open your browser to visit:")
print()
print(" " + authorize_url)
print()
httpd.handle_request()
if "error" in httpd.query_params:
print("Authentication request was rejected")
raise AuthenticationRejected("User rejected authentication")
if "code" in httpd.query_params:
return httpd.query_params["code"]
else:
print(
'Failed to find "code" in the query parameters of the redirect.'
)
print("Try command-line authentication")
raise AuthenticationError("No code found in redirect")
@CheckAuth
def CommandLineAuth(self):
"""Authenticate and authorize from user by printing authentication url
and retrieving the authentication code from the command line.
:returns: str -- code returned from commandline.
"""
self.flow.redirect_uri = OOB_CALLBACK_URN
authorize_url = self.GetAuthUrl()
print("Go to the following link in your browser:")
print()
print(" " + authorize_url)
print()
return input("Enter verification code: ").strip()
@CheckServiceAuth
def ServiceAuth(self):
"""Authenticate and authorize using P12 private key, client id
and client email for a Service account.
:raises: AuthError, InvalidConfigError
"""
if set(self.SERVICE_CONFIGS_LIST) - set(self.client_config):
self.LoadServiceConfigSettings()
scopes = scopes_to_string(self.settings["oauth_scope"])
client_service_json = self.client_config.get("client_json_file_path")
if client_service_json:
self.credentials = ServiceAccountCredentials.from_json_keyfile_name(
filename=client_service_json, scopes=scopes
)
else:
service_email = self.client_config["client_service_email"]
file_path = self.client_config["client_pkcs12_file_path"]
self.credentials = ServiceAccountCredentials.from_p12_keyfile(
service_account_email=service_email,
filename=file_path,
scopes=scopes,
)
user_email = self.client_config.get("client_user_email")
if user_email:
self.credentials = self.credentials.create_delegated(
sub=user_email
)
def LoadCredentials(self, backend=None):
"""Loads credentials or create empty credentials if it doesn't exist.
:param backend: backend to load credentials from.
:type backend: str.
:raises: InvalidConfigError
"""
if backend is None:
backend = self.settings.get("save_credentials_backend")
if backend is None:
raise InvalidConfigError("Please specify credential backend")
if backend == "file":
self.LoadCredentialsFile()
else:
raise InvalidConfigError("Unknown save_credentials_backend")
def LoadCredentialsFile(self, credentials_file=None):
"""Loads credentials or create empty credentials if it doesn't exist.
Loads credentials file from path in settings if not specified.
:param credentials_file: path of credentials file to read.
:type credentials_file: str.
:raises: InvalidConfigError, InvalidCredentialsError
"""
if credentials_file is None:
credentials_file = self.settings.get("save_credentials_file")
if credentials_file is None:
raise InvalidConfigError(
"Please specify credentials file to read"
)
try:
storage = Storage(credentials_file)
self.credentials = storage.get()
except IOError:
raise InvalidCredentialsError(
"Credentials file cannot be symbolic link"
)
def SaveCredentials(self, backend=None):
"""Saves credentials according to specified backend.
If you have any specific credentials backend in mind, don't use this
function and use the corresponding function you want.
:param backend: backend to save credentials.
:type backend: str.
:raises: InvalidConfigError
"""
if backend is None:
backend = self.settings.get("save_credentials_backend")
if backend is None:
raise InvalidConfigError("Please specify credential backend")
if backend == "file":
self.SaveCredentialsFile()
else:
raise InvalidConfigError("Unknown save_credentials_backend")
def SaveCredentialsFile(self, credentials_file=None):
"""Saves credentials to the file in JSON format.
:param credentials_file: destination to save file to.
:type credentials_file: str.
:raises: InvalidConfigError, InvalidCredentialsError
"""
if self.credentials is None:
raise InvalidCredentialsError("No credentials to save")
if credentials_file is None:
credentials_file = self.settings.get("save_credentials_file")
if credentials_file is None:
raise InvalidConfigError(
"Please specify credentials file to save"
)
try:
storage = Storage(credentials_file)
storage.put(self.credentials)
self.credentials.set_store(storage)
except IOError:
raise InvalidCredentialsError(
"Credentials file cannot be symbolic link"
)
def LoadClientConfig(self, backend=None):
"""Loads client configuration according to specified backend.
If you have any specific backend to load client configuration from in mind,
don't use this function and use the corresponding function you want.
:param backend: backend to load client configuration from.
:type backend: str.
:raises: InvalidConfigError
"""
if backend is None:
backend = self.settings.get("client_config_backend")
if backend is None:
raise InvalidConfigError(
"Please specify client config backend"
)
if backend == "file":
self.LoadClientConfigFile()
elif backend == "settings":
self.LoadClientConfigSettings()
elif backend == "service":
self.LoadServiceConfigSettings()
else:
raise InvalidConfigError("Unknown client_config_backend")
def LoadClientConfigFile(self, client_config_file=None):
"""Loads client configuration file downloaded from APIs console.
Loads client config file from path in settings if not specified.
:param client_config_file: path of client config file to read.
:type client_config_file: str.
:raises: InvalidConfigError
"""
if client_config_file is None:
client_config_file = self.settings["client_config_file"]
try:
client_type, client_info = clientsecrets.loadfile(
client_config_file
)
except clientsecrets.InvalidClientSecretsError as error:
raise InvalidConfigError("Invalid client secrets file %s" % error)
if client_type not in (
clientsecrets.TYPE_WEB,
clientsecrets.TYPE_INSTALLED,
):
raise InvalidConfigError(
"Unknown client_type of client config file"
)
# General settings.
try:
config_index = [
"client_id",
"client_secret",
"auth_uri",
"token_uri",
]
for config in config_index:
self.client_config[config] = client_info[config]
self.client_config["revoke_uri"] = client_info.get("revoke_uri")
self.client_config["redirect_uri"] = client_info["redirect_uris"][
0
]
except KeyError:
raise InvalidConfigError("Insufficient client config in file")
# Service auth related fields.
service_auth_config = ["client_email"]
try:
for config in service_auth_config:
self.client_config[config] = client_info[config]
except KeyError:
pass # The service auth fields are not present, handling code can go here.
def LoadServiceConfigSettings(self):
"""Loads service account configuration from settings file.
:raises: InvalidConfigError
"""
for file_format in ["json", "pkcs12"]:
config = f"client_{file_format}_file_path"
value = self.settings["service_config"].get(config)
if value:
self.client_config[config] = value
break
else:
raise InvalidConfigError(
"Either json or pkcs12 file path required "
"for service authentication"
)
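# Copy the class-level list so repeated calls do not keep appending to it.
service_configs = list(self.SERVICE_CONFIGS_LIST)
if file_format == "pkcs12":
service_configs.append("client_service_email")
for config in service_configs: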
try:
self.client_config[config] = self.settings["service_config"][
config
]
except KeyError:
err = "Insufficient service config in settings"
err += "\n\nMissing: {} key.".format(config)
raise InvalidConfigError(err)
def LoadClientConfigSettings(self):
"""Loads client configuration from settings file.
:raises: InvalidConfigError
"""
for config in self.CLIENT_CONFIGS_LIST:
try:
self.client_config[config] = self.settings["client_config"][
config
]
except KeyError:
raise InvalidConfigError(
"Insufficient client config in settings"
)
def GetFlow(self):
"""Gets Flow object from client configuration.
:raises: InvalidConfigError
"""
if not all(
config in self.client_config for config in self.CLIENT_CONFIGS_LIST
):
self.LoadClientConfig()
constructor_kwargs = {
"redirect_uri": self.client_config["redirect_uri"],
"auth_uri": self.client_config["auth_uri"],
"token_uri": self.client_config["token_uri"],
"access_type": "online",
}
if self.client_config["revoke_uri"] is not None:
constructor_kwargs["revoke_uri"] = self.client_config["revoke_uri"]
self.flow = OAuth2WebServerFlow(
self.client_config["client_id"],
self.client_config["client_secret"],
scopes_to_string(self.settings["oauth_scope"]),
**constructor_kwargs,
)
if self.settings.get("get_refresh_token"):
self.flow.params.update(
{"access_type": "offline", "approval_prompt": "force"}
)
def Refresh(self):
"""Refreshes the access_token.
:raises: RefreshError
"""
if self.credentials is None:
raise RefreshError("No credential to refresh.")
if (
self.credentials.refresh_token is None
and self.auth_method != "service"
):
raise RefreshError(
"No refresh_token found. "
"Please set access_type of OAuth to offline."
)
if self.http is None:
self.http = self._build_http()
try:
self.credentials.refresh(self.http)
except AccessTokenRefreshError as error:
raise RefreshError("Access token refresh failed: %s" % error)
def GetAuthUrl(self):
"""Creates authentication url where user visits to grant access.
:returns: str -- Authentication url.
"""
if self.flow is None:
self.GetFlow()
return self.flow.step1_get_authorize_url()
def Auth(self, code):
"""Authenticate, authorize, and build service.
:param code: Code for authentication.
:type code: str.
:raises: AuthenticationError
"""
self.Authenticate(code)
self.Authorize()
def Authenticate(self, code):
"""Authenticates given authentication code back from user.
:param code: Code for authentication.
:type code: str.
:raises: AuthenticationError
"""
if self.flow is None:
self.GetFlow()
try:
self.credentials = self.flow.step2_exchange(code)
except FlowExchangeError as e:
raise AuthenticationError("OAuth2 code exchange failed: %s" % e)
print("Authentication successful.")
def _build_http(self):
http = httplib2.Http(timeout=self.http_timeout)
# 308's are used by several Google APIs (Drive, YouTube)
# for Resumable Uploads rather than Permanent Redirects.
# This asks httplib2 to exclude 308s from the status codes
# it treats as redirects
# See also: https://stackoverflow.com/a/59850170/298182
try:
http.redirect_codes = http.redirect_codes - {308}
except AttributeError:
# http.redirect_codes does not exist in previous versions
# of httplib2, so pass
pass
return http
def Authorize(self):
"""Authorizes and builds service.
:raises: AuthenticationError
"""
if self.access_token_expired:
raise AuthenticationError(
"No valid credentials provided to authorize"
)
if self.http is None:
self.http = self._build_http()
self.http = self.credentials.authorize(self.http)
self.service = build(
"drive", "v2", http=self.http, cache_discovery=False
)
def Get_Http_Object(self):
"""Create and authorize an httplib2.Http object. Necessary for
thread-safety.
:return: The http object to be used in each call.
:rtype: httplib2.Http
"""
http = self._build_http()
http = self.credentials.authorize(http)
return http
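# Illustrative sketch (not part of PyDrive2): the port fallback used by
# LocalWebserverAuth, reduced to plain sockets. `bind_first_free_port` is a
# hypothetical helper name; it tries each candidate port in order and stops
# at the first one that can be bound, assuming a failed bind raises OSError.

```python
import socket


def bind_first_free_port(host, ports):
    """Return (listening_socket, port) for the first port that can be bound.

    Mirrors the try/except/else loop in LocalWebserverAuth, but uses a plain
    socket instead of ClientRedirectServer so it runs without oauth2client.
    """
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:  # socket.error is an alias of OSError in Python 3
            sock.close()
            continue
        sock.listen(1)
        # getsockname() reports the real port, which matters if 0 was passed.
        return sock, sock.getsockname()[1]
    raise RuntimeError("none of the candidate ports could be bound")


if __name__ == "__main__":
    server, chosen = bind_first_free_port("localhost", [8080, 8090])
    print("callback would be http://localhost:%s/" % chosen)
    server.close()
```

# Raising once every candidate has failed corresponds to the
# AuthenticationError raised above when no local web server can be started.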
================================================
FILE: DriveDownloader/pydrive2/drive.py
================================================
# Credit: https://github.com/iterative/PyDrive2
from .apiattr import ApiAttributeMixin
from .files import GoogleDriveFile
from .files import GoogleDriveFileList
from .auth import LoadAuth
class GoogleDrive(ApiAttributeMixin, object):
"""Main Google Drive class."""
def __init__(self, auth=None):
"""Create an instance of GoogleDrive.
:param auth: authorized GoogleAuth instance.
:type auth: pydrive2.auth.GoogleAuth.
"""
ApiAttributeMixin.__init__(self)
self.auth = auth
def CreateFile(self, metadata=None):
"""Create an instance of GoogleDriveFile with auth of this instance.
This method would not upload a file to GoogleDrive.
:param metadata: file resource to initialize GoogleDriveFile with.
:type metadata: dict.
:returns: pydrive2.files.GoogleDriveFile -- initialized with auth of this
instance.
"""
return GoogleDriveFile(auth=self.auth, metadata=metadata)
def ListFile(self, param=None):
"""Create an instance of GoogleDriveFileList with auth of this instance.
This method will not fetch from Files.List().
:param param: parameter to be sent to Files.List().
:type param: dict.
:returns: pydrive2.files.GoogleDriveFileList -- initialized with auth of
this instance.
"""
return GoogleDriveFileList(auth=self.auth, param=param)
@LoadAuth
def GetAbout(self):
"""Return information about the Google Drive of the auth instance.
:returns: A dictionary of Google Drive information like user, usage, quota etc.
"""
return self.auth.service.about().get().execute(http=self.http)
================================================
FILE: DriveDownloader/pydrive2/files.py
================================================
# Credit: https://github.com/iterative/PyDrive2
import io
import mimetypes
import json
from googleapiclient import errors
from googleapiclient.http import MediaIoBaseUpload
from googleapiclient.http import MediaIoBaseDownload
from googleapiclient.http import DEFAULT_CHUNK_SIZE
from functools import wraps
from .apiattr import ApiAttribute
from .apiattr import ApiAttributeMixin
from .apiattr import ApiResource
from .apiattr import ApiResourceList
from .auth import LoadAuth
BLOCK_SIZE = 1024
# Usage: MIME_TYPE_TO_BOM['<Google Drive mime type>']['<download mimetype>'].
MIME_TYPE_TO_BOM = {
"application/vnd.google-apps.document": {
"text/plain": u"\ufeff".encode("utf8")
}
}
class FileNotUploadedError(RuntimeError):
"""Error trying to access metadata of file that is not uploaded."""
class ApiRequestError(IOError):
def __init__(self, http_error):
assert isinstance(http_error, errors.HttpError)
content = json.loads(http_error.content.decode("utf-8"))
self.error = content.get("error", {}) if content else {}
# Initialize args for backward compatibility
super().__init__(http_error)
def GetField(self, field):
"""Returns the `field` from the first error"""
return self.error.get("errors", [{}])[0].get(field, "")
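# Illustrative sketch: how ApiRequestError reads a field from the first entry
# of an error body. `first_error_field` is a hypothetical standalone function,
# and the payload below is a made-up example shaped like the structure that
# GetField expects ("error" -> "errors" -> list of dicts).

```python
import json


def first_error_field(raw_body, field):
    """Return `field` from the first error entry, or "" if it is absent."""
    content = json.loads(raw_body.decode("utf-8"))
    error = content.get("error", {}) if content else {}
    return error.get("errors", [{}])[0].get(field, "")


if __name__ == "__main__":
    # Made-up payload for illustration only.
    body = json.dumps(
        {"error": {"code": 403, "errors": [{"reason": "fileNotDownloadable"}]}}
    ).encode("utf-8")
    print(first_error_field(body, "reason"))
```

# Callers such as GetContentFile combine this lookup with the numeric
# error["code"] to decide whether to retry with files.export_media.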
class FileNotDownloadableError(RuntimeError):
"""Error trying to download file that is not downloadable."""
def LoadMetadata(decoratee):
"""Decorator to check if the file has metadata and fetches it if not.
:raises: ApiRequestError, FileNotUploadedError
"""
@wraps(decoratee)
def _decorated(self, *args, **kwargs):
if not self.uploaded:
self.FetchMetadata()
return decoratee(self, *args, **kwargs)
return _decorated
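# Illustrative sketch: the lazy-fetch pattern behind LoadMetadata, shown on a
# network-free stand-in. `FakeFile`, `load_metadata` and `fetch_metadata` are
# hypothetical names; the real decorator calls FetchMetadata() on a
# GoogleDriveFile.

```python
from functools import wraps


def load_metadata(method):
    """Fetch metadata on first use, then run the wrapped method."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self.uploaded:
            self.fetch_metadata()
        return method(self, *args, **kwargs)

    return wrapper


class FakeFile:
    """Stand-in for GoogleDriveFile that counts metadata fetches."""

    def __init__(self):
        self.uploaded = False
        self.fetch_count = 0
        self.metadata = {}

    def fetch_metadata(self):
        self.fetch_count += 1
        self.metadata = {"title": "example.txt"}
        self.uploaded = True  # later calls skip the fetch

    @load_metadata
    def title(self):
        return self.metadata["title"]


if __name__ == "__main__":
    f = FakeFile()
    print(f.title(), f.title(), f.fetch_count)
```

# Because the fetch flips `uploaded`, decorated methods pay for the metadata
# request at most once per object in this sketch.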
class GoogleDriveFileList(ApiResourceList):
"""Google Drive FileList instance.
Equivalent to Files.list() in Drive APIs.
"""
def __init__(self, auth=None, param=None):
"""Create an instance of GoogleDriveFileList."""
super(GoogleDriveFileList, self).__init__(auth=auth, metadata=param)
@LoadAuth
def _GetList(self):
"""Overridden method which actually makes API call to list files.
:returns: list -- list of pydrive2.files.GoogleDriveFile.
"""
# Teamdrive support
self["supportsAllDrives"] = True
self["includeItemsFromAllDrives"] = True
try:
self.metadata = (
self.auth.service.files()
.list(**dict(self))
.execute(http=self.http)
)
except errors.HttpError as error:
raise ApiRequestError(error)
result = []
for file_metadata in self.metadata["items"]:
tmp_file = GoogleDriveFile(
auth=self.auth, metadata=file_metadata, uploaded=True
)
result.append(tmp_file)
return result
class IoBuffer(object):
"""Lightweight retention of one chunk."""
def __init__(self, encoding):
self.encoding = encoding
self.chunk = None
def write(self, chunk):
self.chunk = chunk
def read(self):
return (
self.chunk.decode(self.encoding)
if self.chunk and self.encoding
else self.chunk
)
class MediaIoReadable(object):
def __init__(
self,
request,
encoding=None,
pre_buffer=True,
remove_prefix=b"",
chunksize=DEFAULT_CHUNK_SIZE,
):
"""File-like wrapper around MediaIoBaseDownload.
:param pre_buffer: Whether to read one chunk into an internal buffer
immediately in order to raise any potential errors.
:param remove_prefix: Bytes prefix to remove from internal pre_buffer.
:raises: ApiRequestError
"""
self.done = False
self._fd = IoBuffer(encoding)
self.downloader = MediaIoBaseDownload(
self._fd, request, chunksize=chunksize
)
self.size = None
self._pre_buffer = False
if pre_buffer:
self.read()
if remove_prefix:
chunk = io.BytesIO(self._fd.chunk)
GoogleDriveFile._RemovePrefix(chunk, remove_prefix)
self._fd.chunk = chunk.getvalue()
self._pre_buffer = True
def read(self):
"""
:returns: bytes or str -- chunk (or None if done)
:raises: ApiRequestError
"""
if self._pre_buffer:
self._pre_buffer = False
return self._fd.read()
if self.done:
return None
try:
status, self.done = self.downloader.next_chunk()
self.size = status.total_size
except errors.HttpError as error:
raise ApiRequestError(error)
return self._fd.read()
def __iter__(self):
"""
:raises: ApiRequestError
"""
while True:
chunk = self.read()
if chunk is None:
break
yield chunk
def __len__(self):
return self.size
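# Illustrative sketch: the read()/__iter__ contract of MediaIoReadable on a
# local byte stream. `ChunkReader` is a hypothetical class; it assumes the
# source has a file-like read(n) and treats an empty read as end of data.

```python
import io


class ChunkReader:
    """Network-free analogue of MediaIoReadable with optional pre-buffering."""

    def __init__(self, source, chunksize, pre_buffer=True):
        self._source = source
        self._chunksize = chunksize
        self._buffered = None
        self._has_buffered = False
        if pre_buffer:
            # Read eagerly so that I/O errors surface at construction time.
            self._buffered = self._next_chunk()
            self._has_buffered = True

    def _next_chunk(self):
        chunk = self._source.read(self._chunksize)
        return chunk if chunk else None

    def read(self):
        """Return the next chunk, or None once the source is exhausted."""
        if self._has_buffered:
            self._has_buffered = False
            return self._buffered
        return self._next_chunk()

    def __iter__(self):
        while True:
            chunk = self.read()
            if chunk is None:
                break
            yield chunk


if __name__ == "__main__":
    for piece in ChunkReader(io.BytesIO(b"abcdefg"), chunksize=3):
        print(piece)
```

# The pre-buffered chunk is handed out by the first read() call, so iteration
# sees every chunk exactly once whether or not pre_buffer is set.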
class GoogleDriveFile(ApiAttributeMixin, ApiResource):
"""Google Drive File instance.
Inherits ApiResource which inherits dict.
Can access and modify metadata like dictionary.
"""
content = ApiAttribute("content")
uploaded = ApiAttribute("uploaded")
metadata = ApiAttribute("metadata")
def __init__(self, auth=None, metadata=None, uploaded=False):
"""Create an instance of GoogleDriveFile.
:param auth: authorized GoogleAuth instance.
:type auth: pydrive2.auth.GoogleAuth
:param metadata: file resource to initialize GoogleDriveFile with.
:type metadata: dict.
:param uploaded: True if this file is confirmed to be uploaded.
:type uploaded: bool.
"""
ApiAttributeMixin.__init__(self)
ApiResource.__init__(self)
self.metadata = {}
self.dirty = {"content": False}
self.auth = auth
self.uploaded = uploaded
if uploaded:
self.UpdateMetadata(metadata)
elif metadata:
self.update(metadata)
self.has_bom = True
def __getitem__(self, key):
"""Overwrites manner of accessing Files resource.
If this file instance is not uploaded and id is specified,
it will try to look for metadata with Files.get().
:param key: key of dictionary query.
:type key: str.
:returns: value of Files resource
:raises: KeyError, FileNotUploadedError
"""
try:
return dict.__getitem__(self, key)
except KeyError as e:
if self.uploaded:
raise KeyError(e)
if self.get("id"):
self.FetchMetadata()
return dict.__getitem__(self, key)
else:
raise FileNotUploadedError()
def SetContentString(self, content, encoding="utf-8"):
"""Set content of this file to be a string.
Creates an io.BytesIO instance from the string encoded with `encoding`.
Sets mimeType to be 'text/plain' if not specified.
:param encoding: The encoding to use when setting the content of this file.
:type encoding: str
:param content: content of the file in string.
:type content: str
"""
self.content = io.BytesIO(content.encode(encoding))
if self.get("mimeType") is None:
self["mimeType"] = "text/plain"
def SetContentFile(self, filename):
"""Set content of this file from a file.
Opens the file specified by this method.
Will be read, uploaded, and closed by Upload() method.
Sets metadata 'title' and 'mimeType' automatically if not specified.
:param filename: name of the file to be uploaded.
:type filename: str.
"""
self.content = open(filename, "rb")
if self.get("title") is None:
self["title"] = filename
if self.get("mimeType") is None:
self["mimeType"] = mimetypes.guess_type(filename)[0]
def GetContentString(
self, mimetype=None, encoding="utf-8", remove_bom=False
):
"""Get content of this file as a string.
:param mimetype: The mimetype of the content string.
:type mimetype: str
:param encoding: The encoding to use when decoding the byte string.
:type encoding: str
:param remove_bom: Whether to strip a known BOM.
:type remove_bom: bool
:returns: str -- content of the file, decoded with `encoding`.
:raises: ApiRequestError, FileNotUploadedError, FileNotDownloadableError
"""
if (
self.content is None
or type(self.content) is not io.BytesIO
or self.has_bom == remove_bom
):
self.FetchContent(mimetype, remove_bom)
return self.content.getvalue().decode(encoding)
@LoadAuth
def GetContentFile(
self,
filename,
mimetype=None,
remove_bom=False,
callback=None,
chunksize=DEFAULT_CHUNK_SIZE,
):
"""Save content of this file as a local file.
:param filename: name of the file to write to.
:type filename: str
:param mimetype: mimeType of the file.
:type mimetype: str
:param remove_bom: Whether to remove the byte order marking.
:type remove_bom: bool
:param callback: passed two arguments: (total transferred, file size).
:type callback: callable
:param chunksize: chunksize in bytes (standard 100 MB(1024*1024*100))
:type chunksize: int
:raises: ApiRequestError, FileNotUploadedError
"""
files = self.auth.service.files()
file_id = self.metadata.get("id") or self.get("id")
if not file_id:
raise FileNotUploadedError()
def download(fd, request):
downloader = MediaIoBaseDownload(
fd, self._WrapRequest(request), chunksize=chunksize
)
done = False
while done is False:
status, done = downloader.next_chunk()
if callback:
callback(status.resumable_progress, status.total_size)
with open(filename, mode="w+b") as fd:
# Should use files.export_media instead of files.get_media if
# metadata["mimeType"].startswith("application/vnd.google-apps.").
# But that would first require a slow call to FetchMetadata().
# We prefer to try-except for speed.
try:
download(fd, files.get_media(fileId=file_id))
except errors.HttpError as error:
exc = ApiRequestError(error)
if (
exc.error["code"] != 403
or exc.GetField("reason") != "fileNotDownloadable"
):
raise exc
mimetype = mimetype or "text/plain"
fd.seek(0) # just in case `download()` modified `fd`
try:
download(
fd,
files.export_media(fileId=file_id, mimeType=mimetype),
)
except errors.HttpError as error:
raise ApiRequestError(error)
if mimetype == "text/plain" and remove_bom:
fd.seek(0)
bom = self._GetBOM(mimetype)
if bom:
self._RemovePrefix(fd, bom)
@LoadAuth
def GetContentIOBuffer(
self,
mimetype=None,
encoding=None,
remove_bom=False,
chunksize=DEFAULT_CHUNK_SIZE,
):
"""Get a file-like object which has a buffered read() method.
:param mimetype: mimeType of the file.
:type mimetype: str
:param encoding: The encoding to use when decoding the byte string.
:type encoding: str
:param remove_bom: Whether to remove the byte order marking.
:type remove_bom: bool
:param chunksize: default read()/iter() chunksize.
:type chunksize: int
:returns: MediaIoReadable -- file-like object.
:raises: ApiRequestError, FileNotUploadedError
"""
files = self.auth.service.files()
file_id = self.metadata.get("id") or self.get("id")
if not file_id:
raise FileNotUploadedError()
# Should use files.export_media instead of files.get_media if
# metadata["mimeType"].startswith("application/vnd.google-apps.").
# But that would first require a slow call to FetchMetadata().
# We prefer to try-except for speed.
try:
request = self._WrapRequest(files.get_media(fileId=file_id))
return MediaIoReadable(
request, encoding=encoding, chunksize=chunksize
)
except ApiRequestError as exc:
if (
exc.error["code"] != 403
or exc.GetField("reason") != "fileNotDownloadable"
):
raise exc
mimetype = mimetype or "text/plain"
request = self._WrapRequest(
files.export_media(fileId=file_id, mimeType=mimetype)
)
remove_prefix = (
self._GetBOM(mimetype)
if mimetype == "text/plain" and remove_bom
else b""
)
return MediaIoReadable(
request,
encoding=encoding,
remove_prefix=remove_prefix,
chunksize=chunksize,
)
@LoadAuth
def FetchMetadata(self, fields=None, fetch_all=False):
"""Download file's metadata from id using Files.get().
:param fields: The fields to include, as one string, each entry separated
by commas, e.g. 'fields,labels'.
:type fields: str
:param fetch_all: Whether to fetch all fields.
:type fetch_all: bool
:raises: ApiRequestError, FileNotUploadedError
"""
file_id = self.metadata.get("id") or self.get("id")
if fetch_all:
fields = "*"
if file_id:
try:
metadata = (
self.auth.service.files()
.get(
fileId=file_id,
fields=fields,
# Teamdrive support
supportsAllDrives=True,
)
.execute(http=self.http)
)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
self.uploaded = True
self.UpdateMetadata(metadata)
else:
raise FileNotUploadedError()
@LoadMetadata
def FetchContent(self, mimetype=None, remove_bom=False):
"""Download file's content from download_url.
:raises: ApiRequestError, FileNotUploadedError, FileNotDownloadableError
"""
download_url = self.metadata.get("downloadUrl")
export_links = self.metadata.get("exportLinks")
if download_url:
self.content = io.BytesIO(self._DownloadFromUrl(download_url))
self.dirty["content"] = False
elif export_links and export_links.get(mimetype):
self.content = io.BytesIO(
self._DownloadFromUrl(export_links.get(mimetype))
)
self.dirty["content"] = False
else:
raise FileNotDownloadableError(
"No downloadUrl/exportLinks for mimetype found in metadata"
)
if mimetype == "text/plain" and remove_bom:
self._RemovePrefix(
self.content, MIME_TYPE_TO_BOM[self["mimeType"]][mimetype]
)
self.has_bom = not remove_bom
def Upload(self, param=None):
"""Upload/update file by choosing the most efficient method.
:param param: additional parameter to upload file.
:type param: dict.
:raises: ApiRequestError
"""
if self.uploaded or self.get("id") is not None:
if self.dirty["content"]:
self._FilesUpdate(param=param)
else:
self._FilesPatch(param=param)
else:
self._FilesInsert(param=param)
def Trash(self, param=None):
"""Move a file to the trash.
:raises: ApiRequestError
"""
self._FilesTrash(param=param)
def UnTrash(self, param=None):
"""Move a file out of the trash.
:param param: Additional parameter to file.
:type param: dict.
:raises: ApiRequestError
"""
self._FilesUnTrash(param=param)
def Delete(self, param=None):
"""Hard-delete a file.
:param param: additional parameter to file.
:type param: dict.
:raises: ApiRequestError
"""
self._FilesDelete(param=param)
def InsertPermission(self, new_permission, param=None):
"""Insert a new permission. Re-fetches all permissions after call.
:param new_permission: The new permission to insert, please see the
official Google Drive API guide on permissions.insert
for details.
:type new_permission: object
:param param: additional parameters to pass
:type param: dict
:return: The permission object.
:rtype: object
"""
if param is None:
param = {}
param["fileId"] = self.metadata.get("id") or self["id"]
param["body"] = new_permission
# Teamdrive support
param["supportsAllDrives"] = True
try:
permission = (
self.auth.service.permissions()
.insert(**param)
.execute(http=self.http)
)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
self.GetPermissions() # Update permissions field.
return permission
@LoadAuth
def GetPermissions(self):
"""Get file's or shared drive's permissions.
For files in a shared drive, at most 100 results will be returned.
It doesn't paginate and collect all results.
:return: A list of the permission objects.
:rtype: object[]
"""
file_id = self.metadata.get("id") or self.get("id")
# We can't call FetchMetadata here (which would nicely update the
# local metadata cache, etc.) since it doesn't return
# permissions for the team drive use case.
permissions = (
self.auth.service.permissions()
.list(
fileId=file_id,
# Teamdrive support
supportsAllDrives=True,
)
.execute(http=self.http)
).get("items")
if permissions:
self["permissions"] = permissions
self.metadata["permissions"] = permissions
return permissions
def DeletePermission(self, permission_id):
"""Deletes the permission specified by the permission_id.
:param permission_id: The permission id.
:type permission_id: str
:return: True if it succeeds.
:rtype: bool
"""
return self._DeletePermission(permission_id)
def _WrapRequest(self, request):
"""Replaces request.http with self.http.
Ensures thread safety. Similar to other places where we call
`.execute(http=self.http)` to pass a client from the thread local storage.
"""
if self.http:
request.http = self.http
return request
@LoadAuth
def _FilesInsert(self, param=None):
"""Upload a new file using Files.insert().
:param param: additional parameter to upload file.
:type param: dict.
:raises: ApiRequestError
"""
if param is None:
param = {}
param["body"] = self.GetChanges()
# teamdrive support
param["supportsAllDrives"] = True
try:
if self.dirty["content"]:
param["media_body"] = self._BuildMediaBody()
metadata = (
self.auth.service.files()
.insert(**param)
.execute(http=self.http)
)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
self.uploaded = True
self.dirty["content"] = False
self.UpdateMetadata(metadata)
@LoadAuth
def _FilesUnTrash(self, param=None):
"""Un-delete (Trash) a file using Files.UnTrash().
:param param: additional parameter to file.
:type param: dict.
:raises: ApiRequestError
"""
if param is None:
param = {}
param["fileId"] = self.metadata.get("id") or self["id"]
# Teamdrive support
param["supportsAllDrives"] = True
try:
self.auth.service.files().untrash(**param).execute(http=self.http)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
if self.metadata:
self.metadata[u"labels"][u"trashed"] = False
return True
@LoadAuth
def _FilesTrash(self, param=None):
"""Soft-delete (Trash) a file using Files.Trash().
:param param: additional parameter to file.
:type param: dict.
:raises: ApiRequestError
"""
if param is None:
param = {}
param["fileId"] = self.metadata.get("id") or self["id"]
# Teamdrive support
param["supportsAllDrives"] = True
try:
self.auth.service.files().trash(**param).execute(http=self.http)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
if self.metadata:
self.metadata[u"labels"][u"trashed"] = True
return True
@LoadAuth
def _FilesDelete(self, param=None):
"""Delete a file using Files.Delete()
(WARNING: deleting permanently deletes the file!)
:param param: additional parameter to file.
:type param: dict.
:raises: ApiRequestError
"""
if param is None:
param = {}
param["fileId"] = self.metadata.get("id") or self["id"]
# Teamdrive support
param["supportsAllDrives"] = True
try:
self.auth.service.files().delete(**param).execute(http=self.http)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
return True
@LoadAuth
@LoadMetadata
def _FilesUpdate(self, param=None):
"""Update metadata and/or content using Files.Update().
:param param: additional parameter to upload file.
:type param: dict.
:raises: ApiRequestError, FileNotUploadedError
"""
if param is None:
param = {}
param["body"] = self.GetChanges()
param["fileId"] = self.metadata.get("id")
# Teamdrive support
param["supportsAllDrives"] = True
try:
if self.dirty["content"]:
param["media_body"] = self._BuildMediaBody()
metadata = (
self.auth.service.files()
.update(**param)
.execute(http=self.http)
)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
self.uploaded = True
self.dirty["content"] = False
self.UpdateMetadata(metadata)
@LoadAuth
@LoadMetadata
def _FilesPatch(self, param=None):
"""Update metadata using Files.Patch().
:param param: additional parameter to upload file.
:type param: dict.
:raises: ApiRequestError, FileNotUploadedError
"""
if param is None:
param = {}
param["body"] = self.GetChanges()
param["fileId"] = self.metadata.get("id")
# Teamdrive support
param["supportsAllDrives"] = True
try:
metadata = (
self.auth.service.files()
.patch(**param)
.execute(http=self.http)
)
except errors.HttpError as error:
raise ApiRequestError(error)
else:
self.UpdateMetadata(metadata)
def _BuildMediaBody(self):
"""Build MediaIoBaseUpload to get prepared to upload content of the file.
Sets mimeType as 'application/octet-stream' if not specified.
:returns: MediaIoBaseUpload -- instance that will be used to upload content.
"""
if self.get("mimeType") is None:
self["mimeType"] = "application/octet-stream"
return MediaIoBaseUpload(
self.content, self["mimeType"], resumable=True
)
@LoadAuth
def _DownloadFromUrl(self, url):
"""Download file from url using provided credential.
:param url: link of the file to download.
:type url: str.
:returns: str -- content of downloaded file in string.
:raises: ApiRequestError
"""
resp, content = self.http.request(url)
if resp.status != 200:
raise ApiRequestError(errors.HttpError(resp, content, uri=url))
return content
@LoadAuth
def _DeletePermission(self, permission_id):
"""Deletes the permission remotely, and from the file object itself.
:param permission_id: The ID of the permission.
:type permission_id: str
:return: The permission
:rtype: object
"""
file_id = self.metadata.get("id") or self["id"]
try:
self.auth.service.permissions().delete(
fileId=file_id, permissionId=permission_id
).execute()
except errors.HttpError as error:
raise ApiRequestError(error)
else:
if "permissions" in self and "permissions" in self.metadata:
permissions = self["permissions"]
is_not_current_permission = (
lambda per: per["id"] != permission_id
)
permissions = list(
filter(is_not_current_permission, permissions)
)
self["permissions"] = permissions
self.metadata["permissions"] = permissions
return True
@staticmethod
def _GetBOM(mimetype):
"""Based on download mime type (ignores Google Drive mime type)"""
for bom in MIME_TYPE_TO_BOM.values():
if mimetype in bom:
return bom[mimetype]
@staticmethod
def _RemovePrefix(file_object, prefix, block_size=BLOCK_SIZE):
"""Deletes the passed prefix by shifting the content of the passed file
object to the left. The operation is in-place.
Args:
file_object (obj): The file object to manipulate.
prefix (str): The prefix to remove.
block_size (int): The size of the blocks which are moved one at a time.
"""
prefix_length = len(prefix)
# Detect if prefix exists in file.
content_start = file_object.read(prefix_length)
if content_start == prefix:
# Shift content left by prefix length, copying one block (block_size bytes) at a time.
block_to_write = file_object.read(block_size)
current_block_length = len(block_to_write)
# Read and write location in separate variables for simplicity.
read_location = prefix_length + current_block_length
write_location = 0
while current_block_length > 0:
# Write next block.
file_object.seek(write_location)
file_object.write(block_to_write)
# Set write location to the next block.
write_location += len(block_to_write)
# Read next block of input.
file_object.seek(read_location)
block_to_write = file_object.read(block_size)
# Update the current block length and read_location.
current_block_length = len(block_to_write)
read_location += current_block_length
# Truncate the file to its, now shorter, length.
file_object.truncate(read_location - prefix_length)
@staticmethod
def _InsertPrefix(file_object, prefix, block_size=BLOCK_SIZE):
"""Inserts the passed prefix at the beginning of the file. The operation
is in-place.
Args:
file_object (obj): The file object to manipulate.
prefix (str): The prefix to insert.
block_size (int): The size of the blocks which are moved one at a time.
"""
# Read the first two blocks.
first_block = file_object.read(block_size)
second_block = file_object.read(block_size)
# Pointer to the first byte of the next block to be read.
read_location = block_size * 2
# Write BOM.
file_object.seek(0)
file_object.write(prefix)
# {read|write}_location separated for readability.
write_location = len(prefix)
# Write and read block alternatingly.
while len(first_block):
# Write first block.
file_object.seek(write_location)
file_object.write(first_block)
# Increment write_location.
write_location += block_size
# Move second block into first variable.
first_block = second_block
# Read in the next block.
file_object.seek(read_location)
second_block = file_object.read(block_size)
# Increment read_location.
read_location += block_size
================================================
FILE: DriveDownloader/pydrive2/fs/__init__.py
================================================
# Credit: https://github.com/iterative/PyDrive2
from pydrive2.fs.spec import GDriveFileSystem
__all__ = ["GDriveFileSystem"]
================================================
FILE: DriveDownloader/pydrive2/fs/spec.py
================================================
# Credit: https://github.com/iterative/PyDrive2
import errno
import io
import logging
import os
import posixpath
import threading
from collections import defaultdict
from fsspec.spec import AbstractFileSystem
from funcy import cached_property, retry, wrap_prop, wrap_with
from funcy.py3 import cat
from tqdm.utils import CallbackIOWrapper
from pydrive2.drive import GoogleDrive
from pydrive2.fs.utils import IterStream
logger = logging.getLogger(__name__)
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"
def _gdrive_retry(func):
def should_retry(exc):
from pydrive2.files import ApiRequestError
if not isinstance(exc, ApiRequestError):
return False
error_code = exc.error.get("code", 0)
result = False
if 500 <= error_code < 600:
result = True
if error_code == 403:
result = exc.GetField("reason") in [
"userRateLimitExceeded",
"rateLimitExceeded",
]
if result:
logger.debug(f"Retrying GDrive API call, error: {exc}.")
return result
# 16 tries, start at 0.5s, multiply by golden ratio, cap at 20s
return retry(
16,
timeout=lambda a: min(0.5 * 1.618 ** a, 20),
filter_errors=should_retry,
)(func)
class GDriveFileSystem(AbstractFileSystem):
def __init__(self, path, google_auth, trash_only=True, **kwargs):
self.path = path
self.root, self.base = self.split_path(self.path)
self.client = GoogleDrive(google_auth)
self._trash_only = trash_only
super().__init__(**kwargs)
def split_path(self, path):
parts = path.replace("//", "/").rstrip("/").split("/", 1)
if len(parts) == 2:
return parts
else:
return parts[0], ""
@wrap_prop(threading.RLock())
@cached_property
def _ids_cache(self):
cache = {
"dirs": defaultdict(list),
"ids": {},
"root_id": self._get_item_id(
self.path,
use_cache=False,
hint="Confirm the directory exists and you can access it.",
),
}
self._cache_path_id(self.base, cache["root_id"], cache=cache)
for item in self._gdrive_list(
"'{}' in parents and trashed=false".format(cache["root_id"])
):
item_path = posixpath.join(self.base, item["title"])
self._cache_path_id(item_path, item["id"], cache=cache)
return cache
def _cache_path_id(self, path, *item_ids, cache=None):
cache = cache or self._ids_cache
for item_id in item_ids:
cache["dirs"][path].append(item_id)
cache["ids"][item_id] = path
@cached_property
def _list_params(self):
params = {"corpora": "default"}
if self.root != "root" and self.root != "appDataFolder":
drive_id = self._gdrive_shared_drive_id(self.root)
if drive_id:
logger.debug(
"GDrive remote '{}' is using shared drive id '{}'.".format(
self.path, drive_id
)
)
params["driveId"] = drive_id
params["corpora"] = "drive"
return params
@_gdrive_retry
def _gdrive_shared_drive_id(self, item_id):
from pydrive2.files import ApiRequestError
param = {"id": item_id}
# it does not create a file on the remote
item = self.client.CreateFile(param)
# ID of the shared drive the item resides in.
# Only populated for items in shared drives.
try:
item.FetchMetadata("driveId")
except ApiRequestError as exc:
error_code = exc.error.get("code", 0)
if error_code == 404:
raise PermissionError from exc
raise
return item.get("driveId", None)
def _gdrive_list(self, query):
param = {"q": query, "maxResults": 1000}
param.update(self._list_params)
file_list = self.client.ListFile(param)
# Isolate and decorate fetching of remote drive items in pages.
get_list = _gdrive_retry(lambda: next(file_list, None))
# Fetch pages until None is received, lazily flatten the thing.
return cat(iter(get_list, None))
def _gdrive_list_ids(self, query_ids):
query = " or ".join(
f"'{query_id}' in parents" for query_id in query_ids
)
query = f"({query}) and trashed=false"
return self._gdrive_list(query)
def _get_remote_item_ids(self, parent_ids, title):
if not parent_ids:
return None
query = "trashed=false and ({})".format(
" or ".join(
f"'{parent_id}' in parents" for parent_id in parent_ids
)
)
query += " and title='{}'".format(title.replace("'", "\\'"))
# GDrive list API is case insensitive, we need to compare
# all results and pick the ones with the right title
return [
item["id"]
for item in self._gdrive_list(query)
if item["title"] == title
]
def _get_cached_item_ids(self, path, use_cache):
if not path:
return [self.root]
if use_cache:
return self._ids_cache["dirs"].get(path, [])
return []
def _path_to_item_ids(self, path, create=False, use_cache=True):
item_ids = self._get_cached_item_ids(path, use_cache)
if item_ids:
return item_ids
parent_path, title = posixpath.split(path)
parent_ids = self._path_to_item_ids(parent_path, create, use_cache)
item_ids = self._get_remote_item_ids(parent_ids, title)
if item_ids:
return item_ids
return (
[self._create_dir(min(parent_ids), title, path)] if create else []
)
def _get_item_id(self, path, create=False, use_cache=True, hint=None):
bucket, base = self.split_path(path)
assert bucket == self.root
item_ids = self._path_to_item_ids(base, create, use_cache)
if item_ids:
return min(item_ids)
assert not create
raise FileNotFoundError(
errno.ENOENT, os.strerror(errno.ENOENT), hint or path
)
@_gdrive_retry
def _gdrive_create_dir(self, parent_id, title):
parent = {"id": parent_id}
item = self.client.CreateFile(
{"title": title, "parents": [parent], "mimeType": FOLDER_MIME_TYPE}
)
item.Upload()
return item
@wrap_with(threading.RLock())
def _create_dir(self, parent_id, title, remote_path):
cached = self._ids_cache["dirs"].get(remote_path)
if cached:
return cached[0]
item = self._gdrive_create_dir(parent_id, title)
if parent_id == self._ids_cache["root_id"]:
self._cache_path_id(remote_path, item["id"])
return item["id"]
def exists(self, path):
try:
self._get_item_id(path)
except FileNotFoundError:
return False
else:
return True
@_gdrive_retry
def info(self, path):
bucket, base = self.split_path(path)
item_id = self._get_item_id(path)
gdrive_file = self.client.CreateFile({"id": item_id})
gdrive_file.FetchMetadata()
metadata = {"name": posixpath.join(bucket, base.rstrip("/"))}
if gdrive_file["mimeType"] == FOLDER_MIME_TYPE:
metadata["type"] = "directory"
metadata["size"] = 0
metadata["name"] += "/"
else:
metadata["type"] = "file"
metadata["size"] = int(gdrive_file.get("fileSize"))
metadata["checksum"] = gdrive_file["md5Checksum"]
return metadata
def ls(self, path, detail=False):
bucket, base = self.split_path(path)
cached = base in self._ids_cache["dirs"]
if cached:
dir_ids = self._ids_cache["dirs"][base]
else:
dir_ids = self._path_to_item_ids(base)
if not dir_ids:
return None
root_path = posixpath.join(bucket, base)
contents = []
for item in self._gdrive_list_ids(dir_ids):
item_path = posixpath.join(root_path, item["title"])
if item["mimeType"] == FOLDER_MIME_TYPE:
contents.append(
{
"type": "directory",
"name": item_path.rstrip("/") + "/",
"size": 0,
}
)
else:
contents.append(
{
"type": "file",
"name": item_path,
"size": int(item["fileSize"]),
"checksum": item["md5Checksum"],
}
)
if not cached:
self._cache_path_id(root_path, *dir_ids)
if detail:
return contents
else:
return [content["name"] for content in contents]
def find(self, path, detail=False, **kwargs):
bucket, base = self.split_path(path)
seen_paths = set()
dir_ids = [self._ids_cache["ids"].copy()]
contents = []
while dir_ids:
query_ids = {
dir_id: dir_name
for dir_id, dir_name in dir_ids.pop().items()
if posixpath.commonpath([base, dir_name]) == base
if dir_id not in seen_paths
}
if not query_ids:
continue
seen_paths |= query_ids.keys()
new_query_ids = {}
dir_ids.append(new_query_ids)
for item in self._gdrive_list_ids(query_ids):
parent_id = item["parents"][0]["id"]
item_path = posixpath.join(query_ids[parent_id], item["title"])
if item["mimeType"] == FOLDER_MIME_TYPE:
new_query_ids[item["id"]] = item_path
self._cache_path_id(item_path, item["id"])
continue
contents.append(
{
"name": posixpath.join(bucket, item_path),
"type": "file",
"size": int(item["fileSize"]),
"checksum": item["md5Checksum"],
}
)
if detail:
return {content["name"]: content for content in contents}
else:
return [content["name"] for content in contents]
def upload_fobj(self, stream, rpath, callback=None, **kwargs):
parent_id = self._get_item_id(self._parent(rpath), create=True)
if callback:
stream = CallbackIOWrapper(
callback.relative_update, stream, "read"
)
return self.gdrive_upload_fobj(
posixpath.basename(rpath.rstrip("/")), parent_id, stream
)
def put_file(self, lpath, rpath, callback=None, **kwargs):
if callback:
callback.set_size(os.path.getsize(lpath))
with open(lpath, "rb") as stream:
self.upload_fobj(stream, rpath, callback=callback)
@_gdrive_retry
def gdrive_upload_fobj(self, title, parent_id, stream, callback=None):
item = self.client.CreateFile(
{"title": title, "parents": [{"id": parent_id}]}
)
item.content = stream
item.Upload()
return item
def cp_file(self, lpath, rpath, **kwargs):
"""In-memory streamed copy"""
with self.open(lpath) as stream:
# IterStream objects don't support full-length
# seek() calls, so we have to wrap the data with
# an external buffer.
buffer = io.BytesIO(stream.read())
self.upload_fobj(buffer, rpath)
def get_file(self, lpath, rpath, callback=None, block_size=None, **kwargs):
item_id = self._get_item_id(lpath)
return self.gdrive_get_file(
item_id, rpath, callback=callback, block_size=block_size
)
@_gdrive_retry
def gdrive_get_file(self, item_id, rpath, callback=None, block_size=None):
param = {"id": item_id}
# it does not create a file on the remote
gdrive_file = self.client.CreateFile(param)
extra_args = {}
if block_size:
extra_args["chunksize"] = block_size
if callback:
def cb(value, _):
callback.absolute_update(value)
gdrive_file.FetchMetadata(fields="fileSize")
callback.set_size(int(gdrive_file.get("fileSize")))
extra_args["callback"] = cb
gdrive_file.GetContentFile(rpath, **extra_args)
def _open(self, path, mode, **kwargs):
assert mode in {"rb", "wb"}
if mode == "wb":
return GDriveBufferedWriter(self, path)
else:
item_id = self._get_item_id(path)
return self.gdrive_open_file(item_id)
@_gdrive_retry
def gdrive_open_file(self, item_id):
param = {"id": item_id}
# it does not create a file on the remote
gdrive_file = self.client.CreateFile(param)
fd = gdrive_file.GetContentIOBuffer()
return IterStream(iter(fd))
def rm_file(self, path):
item_id = self._get_item_id(path)
self.gdrive_delete_file(item_id)
@_gdrive_retry
def gdrive_delete_file(self, item_id):
from pydrive2.files import ApiRequestError
param = {"id": item_id}
# it does not create a file on the remote
item = self.client.CreateFile(param)
try:
item.Trash() if self._trash_only else item.Delete()
except ApiRequestError as exc:
http_error_code = exc.error.get("code", 0)
if (
http_error_code == 403
and self._list_params["corpora"] == "drive"
and exc.GetField("location") == "file.permissions"
):
raise PermissionError(
"Insufficient permissions to {}. You should have {} "
"access level for the used shared drive. More details "
"at {}.".format(
"move the file into Trash"
if self._trash_only
else "permanently delete the file",
"Manager or Content Manager"
if self._trash_only
else "Manager",
"https://support.google.com/a/answer/7337554",
)
) from exc
raise
class GDriveBufferedWriter(io.IOBase):
def __init__(self, fs, path):
self.fs = fs
self.path = path
self.buffer = io.BytesIO()
self._closed = False
def write(self, *args, **kwargs):
self.buffer.write(*args, **kwargs)
def readable(self):
return False
def writable(self):
return not self.readable()
def flush(self):
self.buffer.flush()
try:
self.fs.upload_fobj(self.buffer, self.path)
finally:
self._closed = True
def close(self):
if self._closed:
return None
self.flush()
self.buffer.close()
self._closed = True
def __enter__(self):
return self
def __exit__(self, *exc_info):
self.close()
@property
def closed(self):
return self._closed
================================================
FILE: DriveDownloader/pydrive2/fs/utils.py
================================================
# Credit: https://github.com/iterative/PyDrive2
import io
class IterStream(io.RawIOBase):
"""Wraps an iterator yielding bytes as a file object"""
def __init__(self, iterator): # pylint: disable=super-init-not-called
self.iterator = iterator
self.leftover = b""
def readable(self):
return True
def writable(self) -> bool:
return False
# Python 3 requires only .readinto() method, it still uses other ones
# under some circumstances and falls back if those are absent. Since
# iterator already constructs byte strings for us, .readinto() is not the
# most optimal, so we provide .read1() too.
def readinto(self, b):
try:
n = len(b) # We're supposed to return at most this much
chunk = self.leftover or next(self.iterator)
output, self.leftover = chunk[:n], chunk[n:]
n_out = len(output)
b[:n_out] = output
return n_out
except StopIteration:
return 0 # indicate EOF
readinto1 = readinto
def read1(self, n=-1):
try:
chunk = self.leftover or next(self.iterator)
except StopIteration:
return b""
# Return an arbitrary number of bytes
if n <= 0:
self.leftover = b""
return chunk
output, self.leftover = chunk[:n], chunk[n:]
return output
def peek(self, n):
while len(self.leftover) < n:
try:
self.leftover += next(self.iterator)
except StopIteration:
break
return self.leftover[:n]
================================================
FILE: DriveDownloader/pydrive2/settings.py
================================================
# Credit: https://github.com/iterative/PyDrive2
from yaml import load
from yaml import YAMLError
import os
try:
from yaml import CLoader as Loader
except ImportError:
from yaml import Loader
SETTINGS_FILE = "settings.yaml"
SETTINGS_STRUCT = {
"client_config_backend": {
"type": str,
"required": True,
"default": "file",
"dependency": [
{"value": "file", "attribute": ["client_config_file"]},
{"value": "settings", "attribute": ["client_config"]},
{"value": "service", "attribute": ["service_config"]},
],
},
"save_credentials": {
"type": bool,
"required": True,
"default": False,
"dependency": [
{"value": True, "attribute": ["save_credentials_backend"]}
],
},
"get_refresh_token": {"type": bool, "required": False, "default": False},
"client_config_file": {
"type": str,
"required": False,
"default": "client_secrets.json",
},
"save_credentials_backend": {
"type": str,
"required": False,
"dependency": [
{"value": "file", "attribute": ["save_credentials_file"]}
],
},
"client_config": {
"type": dict,
"required": False,
"struct": {
"client_id": {"type": str, "required": True},
"client_secret": {"type": str, "required": True},
"auth_uri": {
"type": str,
"required": True,
"default": "https://accounts.google.com/o/oauth2/auth",
},
"token_uri": {
"type": str,
"required": True,
"default": "https://accounts.google.com/o/oauth2/token",
},
"redirect_uri": {
"type": str,
"required": True,
"default": "urn:ietf:wg:oauth:2.0:oob",
},
"revoke_uri": {"type": str, "required": True, "default": None},
},
},
"service_config": {
"type": dict,
"required": False,
"struct": {
"client_user_email": {
"type": str,
"required": True,
"default": None,
},
"client_service_email": {"type": str, "required": False},
"client_pkcs12_file_path": {"type": str, "required": False},
"client_json_file_path": {"type": str, "required": False},
},
},
"oauth_scope": {
"type": list,
"required": True,
"struct": str,
"default": ["https://www.googleapis.com/auth/drive"],
},
# expanduser("~") also works where the HOME environment variable is unset (e.g. on Windows).
"save_credentials_file": {"type": str, "required": False, "default": os.path.join(os.path.expanduser("~"), ".credentials.json")},
}
class SettingsError(IOError):
"""Error while loading/saving settings"""
class InvalidConfigError(IOError):
"""Error trying to read client configuration."""
def LoadSettingsFile(filename=SETTINGS_FILE):
"""Loads settings file in yaml format given file name.
:param filename: path for settings file. 'settings.yaml' by default.
:type filename: str.
:raises: SettingsError
"""
try:
with open(filename, "r") as stream:
data = load(stream, Loader=Loader)
except (YAMLError, IOError) as e:
raise SettingsError(e)
return data
def ValidateSettings(data):
"""Validates that the current settings are valid.
:param data: dictionary containing all settings.
:type data: dict.
:raises: InvalidConfigError
"""
_ValidateSettingsStruct(data, SETTINGS_STRUCT)
def _ValidateSettingsStruct(data, struct):
"""Validates if provided data fits provided structure.
:param data: dictionary containing settings.
:type data: dict.
:param struct: dictionary containing structure information of settings.
:type struct: dict.
:raises: InvalidConfigError
"""
# Validate required elements of the setting.
for key in struct:
if struct[key]["required"]:
_ValidateSettingsElement(data, struct, key)
def _ValidateSettingsElement(data, struct, key):
"""Validates if provided element of settings data fits provided structure.
:param data: dictionary containing settings.
:type data: dict.
:param struct: dictionary containing structure information of settings.
:type struct: dict.
:param key: key of the settings element to validate.
:type key: str.
:raises: InvalidConfigError
"""
# Check if data exists. If not, check if default value exists.
value = data.get(key)
data_type = struct[key]["type"]
if value is None:
try:
default = struct[key]["default"]
except KeyError:
raise InvalidConfigError("Missing required setting %s" % key)
else:
data[key] = default
# If data exists, Check type of the data
elif type(value) is not data_type:
raise InvalidConfigError(
"Setting %s should be type %s" % (key, data_type)
)
# If type of this data is dict, check if structure of the data is valid.
if data_type is dict:
_ValidateSettingsStruct(data[key], struct[key]["struct"])
# If type of this data is list, check if all values in the list are valid.
elif data_type is list:
for element in data[key]:
if type(element) is not struct[key]["struct"]:
raise InvalidConfigError(
"Setting %s should be list of %s"
% (key, struct[key]["struct"])
)
# Check dependency of this attribute.
dependencies = struct[key].get("dependency")
if dependencies:
for dependency in dependencies:
if value == dependency["value"]:
for reqkey in dependency["attribute"]:
_ValidateSettingsElement(data, struct, reqkey)
================================================
FILE: DriveDownloader/utils/__init__.py
================================================
from .misc import *
from .multithread import MultiThreadDownloader
================================================
FILE: DriveDownloader/utils/misc.py
================================================
#############################################
# Author: Hongwei Fan #
# E-mail: hwnorm@outlook.com #
# Homepage: https://github.com/hwfan #
#############################################
from urllib.parse import urlparse
def format_size(value):
units = ["B", "KB", "MB", "GB", "TB", "PB"]
size = 1024.0
for i in range(len(units)):
if (value / size) < 1:
return "%.2f %s" % (value, units[i])
value = value / size
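# Past PB: still return a formatted string (value is now in EB) instead of a bare number.
return "%.2f %s" % (value, "EB")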
def judge_session(url):
if '1drv.ms' in url or '1drv.ws' in url:
return 'OneDrive'
elif 'drive.google.com' in url:
return 'GoogleDrive'
elif 'sharepoint' in url:
return 'SharePoint'
elif 'dropbox' in url:
return 'DropBox'
else:
return 'DirectLink'
def judge_scheme(url):
return urlparse(url).scheme
================================================
FILE: DriveDownloader/utils/multithread.py
================================================
import copy
import threading
import shutil
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
def download_session(session_func, url, filename, proc_id, start, end, used_proxy, progress_bar, force_back_google):
drive_session = session_func(used_proxy)
drive_session.set_range(start, end)
drive_session.connect(url, filename, proc_id=proc_id, force_backup=force_back_google)
interrupted = drive_session.save_response_content(start=int(start), end=int(end), proc_id=proc_id, progress_bar=progress_bar)
return interrupted
class MultiThreadDownloader:
def __init__(self, progress_bar, session_func, used_proxy, filesize, thread_number):
self.progress = progress_bar
self.session_func = session_func
self.used_proxy = used_proxy
self.thread_number = thread_number
self.filesize = filesize
self.get_ranges()
def get_ranges(self):
self.ranges = []
offset = int(self.filesize / self.thread_number)
for i in range(self.thread_number):
if i == self.thread_number - 1:
self.ranges.append((str(i * offset), str(int(self.filesize-1))))
else:
self.ranges.append((str(i * offset), str((i+1) * offset - 1)))
def get(self, url, filename, force_back_google):
with self.progress:
with ThreadPoolExecutor(max_workers=len(self.ranges)) as pool:
ts = []
status = []
for proc_id, each_range in enumerate(self.ranges):
start, end = each_range
task_id = self.progress.add_task("download", filename=filename, proc_id=proc_id, start=False)
t = pool.submit(download_session, self.session_func, url, filename, proc_id, start, end, self.used_proxy, self.progress, force_back_google)
ts.append(t)
for t in as_completed(ts):
interrupted = t.result()
status.append(interrupted)
if True in status:
return True
return False
def concatenate(self, filename):
sub_filenames = []
dirname = os.path.dirname(filename)
tmp_dirname = os.path.join(dirname, 'tmp')
for proc_id in range(len(self.ranges)):
name, ext = os.path.splitext(filename)
name = name + '_{}'.format(proc_id)
sub_filename = name + ext
sub_basename = os.path.basename(sub_filename)
sub_filename = os.path.join(tmp_dirname, sub_basename)
sub_filenames.append(sub_filename)
with open(filename, 'wb') as wfd:
for f in sub_filenames:
with open(f, 'rb') as fd:
shutil.copyfileobj(fd, wfd)
os.remove(f)
shutil.rmtree(tmp_dirname)
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2020 Hongwei Fan
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# DriveDownloader
English | [中文文档](README_CN.md)
**DriveDownloader** is a Python-based **CLI** tool for downloading files from online drives. With DriveDownloader, you can download resources from a netdrive with **a single command**.
DriveDownloader now supports:
- OneDrive
- OneDrive for Business
- GoogleDrive
- Dropbox
- Direct Link
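
As a rough illustration of how a URL is matched to one of the drive types above, the sketch below mirrors the substring checks of `judge_session` in `DriveDownloader/utils/misc.py`. It assumes that SharePoint links correspond to the "OneDrive for Business" entry, which the source does not state explicitly.

```python
def judge_session(url):
    """Guess the drive type from substrings of the URL."""
    if "1drv.ms" in url or "1drv.ws" in url:
        return "OneDrive"
    if "drive.google.com" in url:
        return "GoogleDrive"
    if "sharepoint" in url:
        return "SharePoint"
    if "dropbox" in url:
        return "DropBox"
    # Anything unrecognized is treated as a plain direct link.
    return "DirectLink"


print(judge_session("https://www.dropbox.com/s/bd0bak3h9dlfw3z/hello.txt?dl=0"))
```

Because the checks are plain substring tests, any URL that matches none of them is downloaded as a direct link.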
## Usage
```
ddl URL/FILELIST [--filename FILENAME] [--thread-number NUMBER] [--force-back-google] [--version] [--help]
```
- `URL/FILELIST`: target URL or filelist to download from. **An example filelist is provided in `tests/test.list`.**
- `--filename/-o FILENAME`: (optional) output filename. Example: 'hello.txt'
- `--thread-number/-n NUMBER`: (optional) the number of threads to use for multi-thread downloading.
- `--force-back-google/-F`: (optional) use the backup downloader for Google Drive (it needs authentication, but is more stable).
- Using proxy:
- Set the environment variables `http_proxy` and `https_proxy` to your proxy addresses, and DriveDownloader will automatically read them.
- Resume:
- If your download is interrupted, simply rerun the same command to resume it, regardless of the number of threads.
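
To make the `--thread-number` option more concrete, here is a minimal sketch of how a file can be split into one byte range per thread. It follows the logic of `MultiThreadDownloader.get_ranges` in `DriveDownloader/utils/multithread.py`, but the standalone function name `split_ranges` is hypothetical, and it returns integers where the original stores strings.

```python
def split_ranges(filesize, thread_number):
    """Split bytes 0..filesize-1 into contiguous inclusive ranges, one per thread."""
    offset = filesize // thread_number
    ranges = []
    for i in range(thread_number):
        start = i * offset
        if i == thread_number - 1:
            # The last thread also takes the remainder bytes.
            end = filesize - 1
        else:
            end = (i + 1) * offset - 1
        ranges.append((start, end))
    return ranges


print(split_ranges(10, 3))  # [(0, 2), (3, 5), (6, 9)]
```

Each thread then downloads its own range, and the partial files are concatenated in order once all threads finish.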
## Installation
1. Install from pip
```
pip install DriveDownloader
```
2. Install from source
```
git clone https://github.com/hwfan/DriveDownloader.git && cd DriveDownloader
python setup.py install
```
## Quick Start
Coming Soon.
## Requirements
- Python 3.7+
- Use `pip install -r requirements.txt` to install the packages.
- Proxy server if necessary. **We don't provide proxy service for DriveDownloader.**
## Examples
You can also see these examples in `tests/run.sh`.
```
echo "Unit Tests of DriveDownloader"
mkdir -p test_outputs
echo "Testing Direct Link..."
# direct link
ddl https://www.google.com.hk/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png -o test_outputs/directlink.png
echo "Testing OneDrive..."
# OneDrive
ddl https://1drv.ms/t/s!ArUVoRxpBphY5U-a3JznLkLG1uEY?e=czbq1R -o test_outputs/hello_od.txt
echo "Testing GoogleDrive..."
# GoogleDrive
ddl https://drive.google.com/file/d/1XQRdK8ewbpOlQn7CvB99aT1FLi6cUKt_/view?usp=sharing -o test_outputs/hello_gd.txt
echo "Testing SharePoint..."
# SharePoint
ddl https://bupteducn-my.sharepoint.com/:t:/g/personal/hwfan_bupt_edu_cn/EQzn4SeFkJZHq8OikhX7X3QB97PSiNvJpPVtllBQln8EQw?e=NmgRSc -o test_outputs/hello_sp.txt
echo "Testing Dropbox..."
# Dropbox
ddl https://www.dropbox.com/s/bd0bak3h9dlfw3z/hello.txt?dl=0 -o test_outputs/hello_db.txt
echo "Testing File List..."
# file list
ddl test.list -l
echo "Testing Multi Thread..."
# Multi Thread
ddl https://www.dropbox.com/s/r4bme0kew42oo7e/Get%20Started%20with%20Dropbox.pdf?dl=0 -o test_outputs/Dropbox.pdf -n 8
```
## FAQ
- Why does "Size:Invalid" occur?
- The file size is read from the "Content-Length" header of the HTTP response. If this header is missing, the size falls back to "Invalid". (Google Drive responses often omit this header.)
- I couldn't connect to the target server through a socks5 proxy.
- Try "socks5h" as the protocol prefix instead. With "socks5h", the hostname in the URL is resolved by the proxy server rather than locally.
- There are some old bugs in my DriveDownloader.
- Try `pip install DriveDownloader --force-reinstall --upgrade` to update. We keep the latest version of DDL free from those bugs.
- !{some string}: event not found
- Bash interprets "!" in the URL as history expansion, so wrap the URL in single quotes (') when using bash.
```
ddl 'https://1drv.ms/t/s!ArUVoRxpBphY5U-a3JznLkLG1uEY?e=czbq1R' -o test_outputs/hello_od.txt
```
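
The "Size:Invalid" fallback from the first FAQ entry can be sketched as follows. `format_size` is adapted from `DriveDownloader/utils/misc.py`; `describe_size` is a hypothetical helper written for this illustration and is not part of the package.

```python
def format_size(value):
    """Format a byte count using 1024-based units."""
    for unit in ["B", "KB", "MB", "GB", "TB", "PB"]:
        if value < 1024:
            return "%.2f %s" % (value, unit)
        value /= 1024.0
    return "%.2f %s" % (value, "EB")


def describe_size(headers):
    """Return a readable size, or "Invalid" when Content-Length is absent."""
    raw = headers.get("Content-Length")
    if not raw:
        return "Invalid"
    return format_size(int(raw))


print(describe_size({"Content-Length": "2048"}))  # 2.00 KB
print(describe_size({}))                          # Invalid
```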
## Acknowledgement
Some code in DriveDownloader is borrowed from [PyDrive2](https://github.com/iterative/PyDrive2) and [rich](https://github.com/Textualize/rich). Thanks for their wonderful work!
## TODOs
- [x] General downloader API - one class for downloading, and several inheritance classes to load the configurations.
- [x] Support more netdrives - OneDrive for Business, Dropbox, ...
- [x] Downloading files from a list.
- [x] Multi-thread downloading.
- [x] Resume downloading.
- [ ] Folder downloading.
- [ ] Window-based UI.
- [ ] Quick Start.
## Update Log
### v1.6.0
- Added automatic resume downloading.
- Changed the progress bar manager to [rich](https://github.com/Textualize/rich).
### v1.5.0
- Solved the problem of "not accessible" when downloading a large file on Google Drive.
- The input type (URL/FILELIST) is now automatically detected by the downloader, and `-l/--list` is deprecated.
- The proxy server is now parsed from environment variables, and `-p/--proxy` is deprecated.
- Added the version option `-v/--version`.
### v1.4.0
- Added support for multi-thread downloading, downloading from a list, and direct links.
- Removed interactive mode.
### v1.3.0
- Supported Sharepoint and Dropbox.
- Removed the deprecated fake-useragent.
================================================
FILE: README_CN.md
================================================
# DriveDownloader
[English](README.md) | 中文文档
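**DriveDownloader** is a Python-based command-line tool for downloading files from online storage such as OneDrive and Google Drive. With DriveDownloader, a single concise command is enough to download files from a variety of netdrives.
DriveDownloader currently supports: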
- OneDrive
- OneDrive for Business
- GoogleDrive
- Dropbox
- Direct links
## Usage
```
ddl URL/FILELIST [--filename FILENAME] [--thread-number NUMBER] [--force-back-google] [--version] [--help]
```
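- `URL/FILELIST`: The target URL or a file list. **For the file list format, see `tests/test.list`.**
- `--filename/-o FILENAME`: (optional) Output filename, e.g. 'hello.txt'.
- `--thread-number/-n NUMBER`: (optional) Number of threads for multi-threaded downloading.
- `--force-back-google/-F`: (optional) Use the backup downloader for Google Drive (requires Google account authentication, but ensures a stable connection).
- Using a proxy server:
- Set the environment variables `http_proxy` and `https_proxy` to your proxy server address; DriveDownloader reads them automatically.
- Resuming downloads:
- If a download is interrupted unexpectedly, just rerun the same command to resume it.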
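The proxy behaviour described in the list above (reading `http_proxy` and `https_proxy` from the environment) can be sketched as below. This is only an illustration under stated assumptions: `proxies_from_env` is a hypothetical helper, not DriveDownloader's own `get_env`, and the uppercase variable fallback is an assumption. The returned dictionary has the shape that the `requests` library accepts through its `proxies=` argument.

```python
import os


def proxies_from_env(environ=None):
    """Build a {"http": ..., "https": ...} proxy mapping from environment variables.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    if environ is None:
        environ = os.environ
    proxies = {}
    for scheme in ("http", "https"):
        # Assumption: accept both lowercase and uppercase variable names.
        value = environ.get(scheme + "_proxy") or environ.get(scheme.upper() + "_PROXY")
        if value:
            proxies[scheme] = value
    return proxies


if __name__ == "__main__":
    # A "socks5h://" prefix asks the proxy server to resolve hostnames.
    print(proxies_from_env({"https_proxy": "socks5h://127.0.0.1:1080"}))
```

Passing the environment in as a parameter keeps the helper easy to test without modifying the real process environment.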
## Installation
1. Install from pip
```
pip install DriveDownloader
```
2. Install from source
```
git clone https://github.com/hwfan/DriveDownloader.git && cd DriveDownloader
python setup.py install
```
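## Quick Start
In progress; it will be released together with a new version.
## Requirements
- Python 3.7+
- Install the dependencies with `pip install -r requirements.txt`.
## Examples
These examples can also be found in `tests/run.sh`.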
```
echo "Unit Tests of DriveDownloader"
mkdir -p test_outputs
echo "Testing Direct Link..."
# direct link
ddl https://www.google.com.hk/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png -o test_outputs/directlink.png
echo "Testing OneDrive..."
# OneDrive
ddl https://1drv.ms/t/s!ArUVoRxpBphY5U-a3JznLkLG1uEY?e=czbq1R -o test_outputs/hello_od.txt
echo "Testing GoogleDrive..."
# GoogleDrive
ddl https://drive.google.com/file/d/1XQRdK8ewbpOlQn7CvB99aT1FLi6cUKt_/view?usp=sharing -o test_outputs/hello_gd.txt
echo "Testing SharePoint..."
# SharePoint
ddl https://bupteducn-my.sharepoint.com/:t:/g/personal/hwfan_bupt_edu_cn/EQzn4SeFkJZHq8OikhX7X3QB97PSiNvJpPVtllBQln8EQw?e=NmgRSc -o test_outputs/hello_sp.txt
echo "Testing Dropbox..."
# Dropbox
ddl https://www.dropbox.com/s/bd0bak3h9dlfw3z/hello.txt?dl=0 -o test_outputs/hello_db.txt
echo "Testing File List..."
# file list
ddl test.list
echo "Testing Multi Thread..."
# Multi Thread
ddl https://www.dropbox.com/s/r4bme0kew42oo7e/Get%20Started%20with%20Dropbox.pdf?dl=0 -o test_outputs/Dropbox.pdf -n 8
```
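The file-list example above passes a file such as `tests/test.list`, in which each line holds a URL followed by an output path, separated by a space. A minimal sketch of reading that format is shown below. `parse_filelist` is a hypothetical name, and the project's own `download_filelist` may handle the file differently.

```python
def parse_filelist(text):
    """Split file-list text into (url, filename) pairs.

    Hypothetical helper for illustration; assumes one "URL FILENAME" entry
    per line, with the URL itself containing no spaces.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split at the first space only: URL on the left, output path on the right.
        url, _, filename = line.partition(" ")
        entries.append((url, filename.strip()))
    return entries


if __name__ == "__main__":
    sample = "https://www.dropbox.com/s/bd0bak3h9dlfw3z/hello.txt?dl=0 test_outputs/list_hello_dropbox.txt\n"
    for url, filename in parse_filelist(sample):
        print(url, "->", filename)
```

A line without an output path yields an empty filename here, which a caller could treat as "use the name reported by the server".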
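## FAQ
- Why does it show "Size:Invalid"?
- We read the file size from the "Content-Length" header of the HTTP response. If the header is empty, the file size falls back to the default "Invalid". (Google Drive hides this header.)
- I can't connect to the target server through a socks5 proxy.
- Use "socks5h" as the protocol prefix. With this prefix, the URL is sent to the proxy server for resolution.
- My DriveDownloader has some unfixed bugs.
- Update DriveDownloader with `pip install DriveDownloader --force-reinstall --upgrade`.
- !{some string}: event not found
- In bash, "!" in a URL is a special character. Wrap the URL in single quotes (') to fix this, for example: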
```
ddl 'https://1drv.ms/t/s!ArUVoRxpBphY5U-a3JznLkLG1uEY?e=czbq1R' -o test_outputs/hello_od.txt
```
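## Acknowledgement
Some code in this project comes from [PyDrive2](https://github.com/iterative/PyDrive2) and [rich](https://github.com/Textualize/rich). Thanks for their excellent work!
## TODOs
- [x] General downloader API - one download class, with a subclass for each netdrive.
- [x] Support more netdrives - OneDrive for Business, Dropbox, direct links, etc.
- [x] Downloading from a list
- [x] Multi-threaded downloading
- [x] Resumable downloading
- [ ] Window-based UI
- [ ] Quick Start
## Update Log
### v1.6.0
- Added resumable downloading.
- Adopted the new progress bar manager [rich](https://github.com/Textualize/rich).
### v1.5.0
- Fixed an issue where Google Drive files could not be accessed in some cases.
- The downloader now determines whether the input is a URL or a file; the `-l/--list` option is no longer maintained.
- Proxies are now read uniformly from environment variables; the `-p/--proxy` option is no longer maintained.
- Added the version option `-v/--version`.
### v1.4.0
- Added support for multi-threading, downloading from a file, and downloading from a direct link.
- Removed interactive mode.
### v1.3.0
- Added support for Sharepoint and Dropbox.
- Removed the fake-useragent dependency.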
================================================
FILE: requirements.txt
================================================
argparse
requests
tqdm
rich
pysocks
requests_random_user_agent
google-api-python-client >= 1.12.5
six >= 1.13.0
oauth2client >= 4.0.0
PyYAML >= 3.0
pyOpenSSL >= 19.1.0
================================================
FILE: setup.py
================================================
from setuptools import setup, find_packages
setup(
    name = "DriveDownloader",
    version = "1.6.0.post1",
    keywords = ("drivedownloader", "drive", "netdrive", "download"),
    description = "A Python netdrive downloader.",
    long_description = "A Python netdrive downloader.",
    license = "MIT Licence",
    url = "https://github.com/hwfan",
    author = "hwfan",
    author_email = "hwnorm@outlook.com",
    packages = find_packages(),
    include_package_data = True,
    platforms = "any",
    install_requires = ['argparse', 'requests', 'tqdm', 'rich', 'pysocks', 'requests_random_user_agent',
                        "google-api-python-client >= 1.12.5", "six >= 1.13.0", "oauth2client >= 4.0.0",
                        "PyYAML >= 3.0", "pyOpenSSL >= 19.1.0"],
    scripts = [],
    entry_points = {
        'console_scripts': [
            'ddl = DriveDownloader.downloader:simple_cli'
        ]
    }
)
================================================
FILE: tests/run.sh
================================================
echo "Unit Tests of DriveDownloader"
mkdir -p test_outputs
echo "Testing Direct Link..."
# direct link
ddl https://www.google.com.hk/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png -o test_outputs/directlink.png
echo "Testing OneDrive..."
# OneDrive
ddl https://1drv.ms/t/s!ArUVoRxpBphY5U-a3JznLkLG1uEY?e=czbq1R -o test_outputs/hello_od.txt
echo "Testing GoogleDrive..."
# GoogleDrive
ddl https://drive.google.com/file/d/1XQRdK8ewbpOlQn7CvB99aT1FLi6cUKt_/view?usp=sharing -o test_outputs/hello_gd.txt
echo "Testing SharePoint..."
# SharePoint
ddl https://bupteducn-my.sharepoint.com/:t:/g/personal/hwfan_bupt_edu_cn/EQzn4SeFkJZHq8OikhX7X3QB97PSiNvJpPVtllBQln8EQw?e=NmgRSc -o test_outputs/hello_sp.txt
echo "Testing Dropbox..."
# Dropbox
ddl https://www.dropbox.com/s/bd0bak3h9dlfw3z/hello.txt?dl=0 -o test_outputs/hello_db.txt
echo "Testing File List..."
# file list
ddl test.list
echo "Testing Multi Thread..."
# Multi Thread
ddl https://www.dropbox.com/s/r4bme0kew42oo7e/Get%20Started%20with%20Dropbox.pdf?dl=0 -o test_outputs/Dropbox.pdf -n 8
================================================
FILE: tests/test.list
================================================
https://drive.google.com/file/d/1XQRdK8ewbpOlQn7CvB99aT1FLi6cUKt_/view?usp=sharing test_outputs/list_hello_google.txt
https://www.dropbox.com/s/bd0bak3h9dlfw3z/hello.txt?dl=0 test_outputs/list_hello_dropbox.txt
SYMBOL INDEX (204 symbols across 17 files)
FILE: DriveDownloader/downloader.py
function parse_args (line 32) | def parse_args():
function get_env (line 42) | def get_env(key):
function download_single_file (line 48) | def download_single_file(url, filename="", thread_number=1, force_back_g...
function download_filelist (line 107) | def download_filelist(args):
function simple_cli (line 116) | def simple_cli():
FILE: DriveDownloader/netdrives/basedrive.py
function handle_sigint (line 22) | def handle_sigint(signum, frame):
class DriveSession (line 27) | class DriveSession:
method __init__ (line 28) | def __init__(self, proxy=None, chunk_size=32768):
method generate_url (line 43) | def generate_url(self, url):
method set_range (line 46) | def set_range(self, start, end):
method parse_response_header (line 49) | def parse_response_header(self):
method save_response_content (line 63) | def save_response_content(self, start=None, end=None, proc_id=-1, prog...
method connect (line 141) | def connect(self, url, custom_filename=''):
method show_info (line 149) | def show_info(self, progress_bar, list_suffix):
FILE: DriveDownloader/netdrives/build.py
function get_session (line 19) | def get_session(name):
FILE: DriveDownloader/netdrives/directlink.py
class DirectLink (line 10) | class DirectLink(DriveSession):
method __init__ (line 11) | def __init__(self, *args, **kwargs):
method parse_response_header (line 14) | def parse_response_header(self):
method generate_url (line 23) | def generate_url(self, url):
method connect (line 26) | def connect(self, url, custom_filename='', proc_id=-1, force_backup=Fa...
FILE: DriveDownloader/netdrives/dropbox.py
class DropBoxSession (line 9) | class DropBoxSession(DriveSession):
method __init__ (line 10) | def __init__(self, *args, **kwargs):
method generate_url (line 13) | def generate_url(self, url):
method connect (line 25) | def connect(self, url, custom_filename='', proc_id=-1, force_backup=Fa...
FILE: DriveDownloader/netdrives/googledrive.py
class GoogleDriveSession (line 42) | class GoogleDriveSession(DriveSession):
method __init__ (line 43) | def __init__(self, *args, **kwargs):
method generate_url (line 46) | def generate_url(self, url):
method connect (line 60) | def connect(self, url, custom_filename='', force_backup=False, proc_id...
method backup_connect (line 72) | def backup_connect(self, url, custom_filename, id_str, proc_id=-1):
FILE: DriveDownloader/netdrives/onedrive.py
class OneDriveSession (line 9) | class OneDriveSession(DriveSession):
method __init__ (line 10) | def __init__(self, *args, **kwargs):
method generate_url (line 13) | def generate_url(self, url):
method connect (line 23) | def connect(self, url, custom_filename='', proc_id=-1, force_backup=Fa...
FILE: DriveDownloader/netdrives/sharepoint.py
class SharePointSession (line 9) | class SharePointSession(DriveSession):
method __init__ (line 10) | def __init__(self, *args, **kwargs):
method generate_url (line 13) | def generate_url(self, url):
method connect (line 26) | def connect(self, url, custom_filename='', proc_id=-1, force_backup=Fa...
FILE: DriveDownloader/pydrive2/apiattr.py
class ApiAttribute (line 6) | class ApiAttribute(object):
method __init__ (line 9) | def __init__(self, name):
method __get__ (line 17) | def __get__(self, obj, type=None):
method __set__ (line 21) | def __set__(self, obj, value):
method __del__ (line 27) | def __del__(self, obj=None):
class ApiAttributeMixin (line 37) | class ApiAttributeMixin(object):
method __init__ (line 40) | def __init__(self):
class ApiResource (line 47) | class ApiResource(dict):
method __init__ (line 57) | def __init__(self, *args, **kwargs):
method __getitem__ (line 63) | def __getitem__(self, key):
method __setitem__ (line 72) | def __setitem__(self, key, val):
method __repr__ (line 81) | def __repr__(self):
method update (line 86) | def update(self, *args, **kwargs):
method UpdateMetadata (line 91) | def UpdateMetadata(self, metadata=None):
method GetChanges (line 97) | def GetChanges(self):
class ApiResourceList (line 111) | class ApiResourceList(ApiAttributeMixin, ApiResource, Iterator):
method __init__ (line 119) | def __init__(self, auth=None, metadata=None):
method __iter__ (line 134) | def __iter__(self):
method __next__ (line 141) | def __next__(self):
method GetList (line 156) | def GetList(self):
method _GetList (line 175) | def _GetList(self):
method Reset (line 184) | def Reset(self):
FILE: DriveDownloader/pydrive2/auth.py
class AuthError (line 29) | class AuthError(Exception):
class InvalidCredentialsError (line 33) | class InvalidCredentialsError(IOError):
class AuthenticationRejected (line 37) | class AuthenticationRejected(AuthError):
class AuthenticationError (line 41) | class AuthenticationError(AuthError):
class RefreshError (line 45) | class RefreshError(AuthError):
function LoadAuth (line 49) | def LoadAuth(decoratee):
function CheckServiceAuth (line 91) | def CheckServiceAuth(decoratee):
function CheckAuth (line 114) | def CheckAuth(decoratee):
class GoogleAuth (line 144) | class GoogleAuth(ApiAttributeMixin, object):
method __init__ (line 175) | def __init__(self, settings_file="settings.yaml", http_timeout=None):
method access_token_expired (line 199) | def access_token_expired(self):
method LocalWebserverAuth (line 209) | def LocalWebserverAuth(
method CommandLineAuth (line 281) | def CommandLineAuth(self):
method ServiceAuth (line 296) | def ServiceAuth(self):
method LoadCredentials (line 324) | def LoadCredentials(self, backend=None):
method LoadCredentialsFile (line 340) | def LoadCredentialsFile(self, credentials_file=None):
method SaveCredentials (line 363) | def SaveCredentials(self, backend=None):
method SaveCredentialsFile (line 382) | def SaveCredentialsFile(self, credentials_file=None):
method LoadClientConfig (line 406) | def LoadClientConfig(self, backend=None):
method LoadClientConfigFile (line 431) | def LoadClientConfigFile(self, client_config_file=None):
method LoadServiceConfigSettings (line 482) | def LoadServiceConfigSettings(self):
method LoadClientConfigSettings (line 511) | def LoadClientConfigSettings(self):
method GetFlow (line 526) | def GetFlow(self):
method Refresh (line 554) | def Refresh(self):
method GetAuthUrl (line 576) | def GetAuthUrl(self):
method Auth (line 585) | def Auth(self, code):
method Authenticate (line 595) | def Authenticate(self, code):
method _build_http (line 610) | def _build_http(self):
method Authorize (line 625) | def Authorize(self):
method Get_Http_Object (line 642) | def Get_Http_Object(self):
FILE: DriveDownloader/pydrive2/drive.py
class GoogleDrive (line 9) | class GoogleDrive(ApiAttributeMixin, object):
method __init__ (line 12) | def __init__(self, auth=None):
method CreateFile (line 21) | def CreateFile(self, metadata=None):
method ListFile (line 33) | def ListFile(self, param=None):
method GetAbout (line 46) | def GetAbout(self):
FILE: DriveDownloader/pydrive2/files.py
class FileNotUploadedError (line 28) | class FileNotUploadedError(RuntimeError):
class ApiRequestError (line 32) | class ApiRequestError(IOError):
method __init__ (line 33) | def __init__(self, http_error):
method GetField (line 41) | def GetField(self, field):
class FileNotDownloadableError (line 46) | class FileNotDownloadableError(RuntimeError):
function LoadMetadata (line 50) | def LoadMetadata(decoratee):
class GoogleDriveFileList (line 65) | class GoogleDriveFileList(ApiResourceList):
method __init__ (line 71) | def __init__(self, auth=None, param=None):
method _GetList (line 76) | def _GetList(self):
class IoBuffer (line 103) | class IoBuffer(object):
method __init__ (line 106) | def __init__(self, encoding):
method write (line 110) | def write(self, chunk):
method read (line 113) | def read(self):
class MediaIoReadable (line 121) | class MediaIoReadable(object):
method __init__ (line 122) | def __init__(
method read (line 152) | def read(self):
method __iter__ (line 169) | def __iter__(self):
method __len__ (line 179) | def __len__(self):
class GoogleDriveFile (line 183) | class GoogleDriveFile(ApiAttributeMixin, ApiResource):
method __init__ (line 194) | def __init__(self, auth=None, metadata=None, uploaded=False):
method __getitem__ (line 216) | def __getitem__(self, key):
method SetContentString (line 238) | def SetContentString(self, content, encoding="utf-8"):
method SetContentFile (line 253) | def SetContentFile(self, filename):
method GetContentString (line 269) | def GetContentString(
method GetContentFile (line 295) | def GetContentFile(
method GetContentIOBuffer (line 363) | def GetContentIOBuffer(
method FetchMetadata (line 420) | def FetchMetadata(self, fields=None, fetch_all=False):
method FetchContent (line 458) | def FetchContent(self, mimetype=None, remove_bom=False):
method Upload (line 486) | def Upload(self, param=None):
method Trash (line 501) | def Trash(self, param=None):
method UnTrash (line 508) | def UnTrash(self, param=None):
method Delete (line 516) | def Delete(self, param=None):
method InsertPermission (line 525) | def InsertPermission(self, new_permission, param=None):
method GetPermissions (line 560) | def GetPermissions(self):
method DeletePermission (line 590) | def DeletePermission(self, permission_id):
method _WrapRequest (line 600) | def _WrapRequest(self, request):
method _FilesInsert (line 611) | def _FilesInsert(self, param=None):
method _FilesUnTrash (line 641) | def _FilesUnTrash(self, param=None):
method _FilesTrash (line 664) | def _FilesTrash(self, param=None):
method _FilesDelete (line 688) | def _FilesDelete(self, param=None):
method _FilesUpdate (line 712) | def _FilesUpdate(self, param=None):
method _FilesPatch (line 744) | def _FilesPatch(self, param=None):
method _BuildMediaBody (line 770) | def _BuildMediaBody(self):
method _DownloadFromUrl (line 784) | def _DownloadFromUrl(self, url):
method _DeletePermission (line 798) | def _DeletePermission(self, permission_id):
method _GetBOM (line 828) | def _GetBOM(mimetype):
method _RemovePrefix (line 835) | def _RemovePrefix(file_object, prefix, block_size=BLOCK_SIZE):
method _InsertPrefix (line 875) | def _InsertPrefix(file_object, prefix, block_size=BLOCK_SIZE):
FILE: DriveDownloader/pydrive2/fs/spec.py
function _gdrive_retry (line 24) | def _gdrive_retry(func):
class GDriveFileSystem (line 54) | class GDriveFileSystem(AbstractFileSystem):
method __init__ (line 55) | def __init__(self, path, google_auth, trash_only=True, **kwargs):
method split_path (line 62) | def split_path(self, path):
method _ids_cache (line 71) | def _ids_cache(self):
method _cache_path_id (line 92) | def _cache_path_id(self, path, *item_ids, cache=None):
method _list_params (line 99) | def _list_params(self):
method _gdrive_shared_drive_id (line 114) | def _gdrive_shared_drive_id(self, item_id):
method _gdrive_list (line 132) | def _gdrive_list(self, query):
method _gdrive_list_ids (line 143) | def _gdrive_list_ids(self, query_ids):
method _get_remote_item_ids (line 150) | def _get_remote_item_ids(self, parent_ids, title):
method _get_cached_item_ids (line 168) | def _get_cached_item_ids(self, path, use_cache):
method _path_to_item_ids (line 175) | def _path_to_item_ids(self, path, create=False, use_cache=True):
method _get_item_id (line 190) | def _get_item_id(self, path, create=False, use_cache=True, hint=None):
method _gdrive_create_dir (line 204) | def _gdrive_create_dir(self, parent_id, title):
method _create_dir (line 213) | def _create_dir(self, parent_id, title, remote_path):
method exists (line 225) | def exists(self, path):
method info (line 234) | def info(self, path):
method ls (line 251) | def ls(self, path, detail=False):
method find (line 293) | def find(self, path, detail=False, **kwargs):
method upload_fobj (line 335) | def upload_fobj(self, stream, rpath, callback=None, **kwargs):
method put_file (line 345) | def put_file(self, lpath, rpath, callback=None, **kwargs):
method gdrive_upload_fobj (line 352) | def gdrive_upload_fobj(self, title, parent_id, stream, callback=None):
method cp_file (line 360) | def cp_file(self, lpath, rpath, **kwargs):
method get_file (line 369) | def get_file(self, lpath, rpath, callback=None, block_size=None, **kwa...
method gdrive_get_file (line 376) | def gdrive_get_file(self, item_id, rpath, callback=None, block_size=No...
method _open (line 396) | def _open(self, path, mode, **kwargs):
method gdrive_open_file (line 405) | def gdrive_open_file(self, item_id):
method rm_file (line 412) | def rm_file(self, path):
method gdrive_delete_file (line 417) | def gdrive_delete_file(self, item_id):
class GDriveBufferedWriter (line 449) | class GDriveBufferedWriter(io.IOBase):
method __init__ (line 450) | def __init__(self, fs, path):
method write (line 456) | def write(self, *args, **kwargs):
method readable (line 459) | def readable(self):
method writable (line 462) | def writable(self):
method flush (line 465) | def flush(self):
method close (line 472) | def close(self):
method __enter__ (line 480) | def __enter__(self):
method __exit__ (line 483) | def __exit__(self, *exc_info):
method closed (line 487) | def closed(self):
FILE: DriveDownloader/pydrive2/fs/utils.py
class IterStream (line 6) | class IterStream(io.RawIOBase):
method __init__ (line 9) | def __init__(self, iterator): # pylint: disable=super-init-not-called
method readable (line 13) | def readable(self):
method writable (line 16) | def writable(self) -> bool:
method readinto (line 24) | def readinto(self, b):
method read1 (line 38) | def read1(self, n=-1):
method peek (line 52) | def peek(self, n):
FILE: DriveDownloader/pydrive2/settings.py
class SettingsError (line 93) | class SettingsError(IOError):
class InvalidConfigError (line 97) | class InvalidConfigError(IOError):
function LoadSettingsFile (line 101) | def LoadSettingsFile(filename=SETTINGS_FILE):
function ValidateSettings (line 116) | def ValidateSettings(data):
function _ValidateSettingsStruct (line 126) | def _ValidateSettingsStruct(data, struct):
function _ValidateSettingsElement (line 141) | def _ValidateSettingsElement(data, struct, key):
FILE: DriveDownloader/utils/misc.py
function format_size (line 8) | def format_size(value):
function judge_session (line 17) | def judge_session(url):
function judge_scheme (line 29) | def judge_scheme(url):
FILE: DriveDownloader/utils/multithread.py
function download_session (line 7) | def download_session(session_func, url, filename, proc_id, start, end, u...
class MultiThreadDownloader (line 14) | class MultiThreadDownloader:
method __init__ (line 15) | def __init__(self, progress_bar, session_func, used_proxy, filesize, t...
method get_ranges (line 23) | def get_ranges(self):
method get (line 32) | def get(self, url, filename, force_back_google):
method concatenate (line 49) | def concatenate(self, filename):