[
  {
    "path": ".github/FUNDING.yml",
    "content": "# These are supported funding model platforms\npatreon: twintproject\ncustom: paypal.me/noneprivacy\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/ISSUE_TEMPLATE.md",
    "content": "### Initial Check\n> If the issue is a request, please specify that it is a request in the title (Example: [REQUEST] more features). If this is a question regarding 'twint', please specify that it's a question in the title (Example: [QUESTION] What is x?). Please **only** submit issues related to 'twint'. Thanks.\n\n>Make sure you've checked the following:\n\n- [ ] Python version is 3.6;\n- [ ] Using the latest version of Twint;\n- [ ] Updated Twint with `pip3 install --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint`;\n\n### Command Run\n>Please provide the _exact_ command you ran, including the username/search/code, so I may reproduce the issue.\n\n### Description of Issue\n>Please use **as much detail as possible.**\n\n### Environment Details\n>Using Windows, Linux? What OS version? Running this in Anaconda? Jupyter Notebook? Terminal?\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE.md",
    "content": "# Issue Template\nPlease use this template!\n\n### Initial Check\n> If the issue is a request, please specify that it is a request in the title (Example: [REQUEST] more features). If this is a question regarding 'twint', please specify that it's a question in the title (Example: [QUESTION] What is x?). Please **only** submit issues related to 'twint'. Thanks.\n\n>Make sure you've checked the following:\n\n- [ ] Python version is 3.6;\n- [ ] Updated Twint with `pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint`;\n- [ ] I have searched the issues and there are no duplicates of this issue/question/request.\n\n### Command Run\n>Please provide the _exact_ command you ran, including the username/search/code, so I may reproduce the issue.\n\n### Description of Issue\n>Please use **as much detail as possible.**\n\n### Environment Details\n>Using Windows, Linux? What OS version? Running this in Anaconda? Jupyter Notebook? Terminal?\n"
  },
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\ntweets.db\n# C extensions\n*.so\n\nconfig.ini\ntwint/storage/mysql.py\n\n# Node Dependency directories\nnode_modules/\njspm_packages/\ntests/\n# Distribution / packaging\n.Python\nenv/\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\n*.egg-info/\n.installed.cfg\n*.egg\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n.hypothesis/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# pyenv\n.python-version\n\n# celery beat schedule file\ncelerybeat-schedule\n\n# SageMath parsed files\n*.sage.py\n\n# dotenv\n.env\n\n# virtualenv\n.venv\nvenv/\nENV/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n\n# output\n*.csv\n*.json\n*.txt\n\ntest_twint.py\n"
  },
  {
    "path": ".travis.yml",
    "content": "dist: bionic\nlanguage: python\npython:\n  - \"3.6\"\n  - \"3.7\"\n  - \"3.8\"\n  - \"nightly\"\nmatrix:\n  allow_failures:\n    - python: \"nightly\"\n    - python: \"3.8\"\ninstall:\n- pip install -r requirements.txt\nscript:\n- python test.py\ndeploy:\n  provider: pypi\n  user: \"codyzacharias\"\n  password:\n    secure: sWWvx50F7KJBtf8z2njc+Q31WIAHiQs4zKEiGD4/7xrshw55H5z+WnqZ9VIP83qm9yKefoRKp7WnaJeXZ3ulZSLn64ue45lqFozWMyGvelRPOKvZi9XPMqBA7+qllR/GseTHSGC3G5EGxac6UEI3irYe3mZXxfjpxNOXVti8rJ2xX8TiJM0AVKRrdDiAstOhMMkXkB7fYXMQALwEp8UoW/UbjbeqsKueXydjStaESNP/QzRFZ3/tuNu+3HMz/olniLUhUWcF/xDbJVpXuaRMUalgqe+BTbDdtUVt/s/GKtpg5GAzJyhQphiCM/huihedUIKSoI+6A8PTzuxrLhB5BMi9pcllED02v7w1enpu5L2l5cRDgQJSOpkxkA5Eese8nxKOOq0KzwDQa3JByrRor8R4yz+p5s4u2r0Rs2A9fkjQYwd/uWBSEIRF4K9WZoniiikahwXq070DMRgV7HbovKSjo5NK5F8j+psrtqPF+OHN2aVfWxbGnezrOOkmzuTHhWZVj3pPSpQU1WFWHo9fPo4I6YstR4q6XjNNjrpY3ojSlv0ThMbUem7zhHTRkRsSA2SpPfqw5E3Jf7vaiQb4M5zkBVqxuq4tXb14GJ26tGD8tel8u8b+ccpkAE9xf+QavP8UHz4PbBhqgFX5TbV/H++cdsICyoZnT35yiaDOELM=\n  on:\n    tags: true\n    python: \"3.7\"\n"
  },
  {
    "path": "Dockerfile",
    "content": "FROM python:3.6-buster\nLABEL maintainer=\"codyzacharias@pm.me\"\n\nWORKDIR /root\n\nRUN git clone --depth=1 https://github.com/twintproject/twint.git && \\\n\tcd /root/twint && \\\n\tpip3 install . -r requirements.txt\n\nCMD /bin/bash\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 Cody Zacharias\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "MANIFEST.in",
    "content": "include README.md LICENSE\n"
  },
  {
    "path": "README.md",
    "content": "# TWINT - Twitter Intelligence Tool\n![2](https://i.imgur.com/iaH3s7z.png)\n![3](https://i.imgur.com/hVeCrqL.png)\n\n[![PyPI](https://img.shields.io/pypi/v/twint.svg)](https://pypi.org/project/twint/) [![Build Status](https://travis-ci.org/twintproject/twint.svg?branch=master)](https://travis-ci.org/twintproject/twint) [![Python 3.6|3.7|3.8](https://img.shields.io/badge/Python-3.6%2F3.7%2F3.8-blue.svg)](https://www.python.org/download/releases/3.0/) [![GitHub license](https://img.shields.io/github/license/haccer/tweep.svg)](https://github.com/haccer/tweep/blob/master/LICENSE) [![Downloads](https://pepy.tech/badge/twint)](https://pepy.tech/project/twint) [![Downloads](https://pepy.tech/badge/twint/week)](https://pepy.tech/project/twint/week) [![Patreon](https://img.shields.io/endpoint.svg?url=https:%2F%2Fshieldsio-patreon.herokuapp.com%2Ftwintproject)](https://www.patreon.com/twintproject) ![](https://img.shields.io/twitter/follow/noneprivacy.svg?label=Follow&style=social) \n\n>No authentication. No API. No limits.\n\nTwint is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles **without** using Twitter's API.\n\nTwint utilizes Twitter's search operators to let you scrape Tweets from specific users, scrape Tweets relating to certain topics, hashtags & trends, or sort out *sensitive* information from Tweets like e-mail and phone numbers. 
I find this very useful, and you can get really creative with it too.\n\nTwint also makes special queries to Twitter allowing you to also scrape a Twitter user's followers, Tweets a user has liked, and who they follow **without** any authentication, API, Selenium, or browser emulation.\n\n## tl;dr Benefits\nSome of the benefits of using Twint vs Twitter API:\n- Can fetch almost __all__ Tweets (Twitter API limits to last 3200 Tweets only);\n- Fast initial setup;\n- Can be used anonymously and without Twitter sign up;\n- **No rate limitations**.\n\n## Limits imposed by Twitter\nTwitter limits scrolls while browsing the user timeline. This means that with `.Profile` or with `.Favorites` you will be able to get ~3200 tweets.\n\n## Requirements\n- Python 3.6;\n- aiohttp;\n- aiodns;\n- beautifulsoup4;\n- cchardet;\n- dataclasses\n- elasticsearch;\n- pysocks;\n- pandas (>=0.23.0);\n- aiohttp_socks;\n- schedule;\n- geopy;\n- fake-useragent;\n- py-googletransx.\n\n## Installing\n\n**Git:**\n```bash\ngit clone --depth=1 https://github.com/twintproject/twint.git\ncd twint\npip3 install . -r requirements.txt\n```\n\n**Pip:**\n```bash\npip3 install twint\n```\n\nor\n\n```bash\npip3 install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint\n```\n\n**Pipenv**:\n```bash\npipenv install git+https://github.com/twintproject/twint.git#egg=twint\n```\n\n### March 2, 2021 Update\n\n**Added**: Dockerfile\n\nNoticed a lot of people are having issues installing (including me). Please use the Dockerfile temporarily while I look into them. 
\n\n## CLI Basic Examples and Combos\nA few simple examples to help you understand the basics:\n\n- `twint -u username` - Scrape all the Tweets of a *user* (doesn't include **retweets** but includes **replies**).\n- `twint -u username -s pineapple` - Scrape all Tweets from the *user*'s timeline containing _pineapple_.\n- `twint -s pineapple` - Collect every Tweet containing *pineapple* from everyone's Tweets.\n- `twint -u username --year 2014` - Collect Tweets that were tweeted **before** 2014.\n- `twint -u username --since \"2015-12-20 20:30:15\"` - Collect Tweets that were tweeted since 2015-12-20 20:30:15.\n- `twint -u username --since 2015-12-20` - Collect Tweets that were tweeted since 2015-12-20 00:00:00.\n- `twint -u username -o file.txt` - Scrape Tweets and save to file.txt.\n- `twint -u username -o file.csv --csv` - Scrape Tweets and save as a csv file.\n- `twint -u username --email --phone` - Show Tweets that might have phone numbers or email addresses.\n- `twint -s \"Donald Trump\" --verified` - Display Tweets by verified users that Tweeted about Donald Trump.\n- `twint -g=\"48.880048,2.385939,1km\" -o file.csv --csv` - Scrape Tweets from a radius of 1km around a place in Paris and export them to a csv file.\n- `twint -u username -es localhost:9200` - Output Tweets to Elasticsearch\n- `twint -u username -o file.json --json` - Scrape Tweets and save as a json file.\n- `twint -u username --database tweets.db` - Save Tweets to a SQLite database.\n- `twint -u username --followers` - Scrape a Twitter user's followers.\n- `twint -u username --following` - Scrape who a Twitter user follows.\n- `twint -u username --favorites` - Collect all the Tweets a user has favorited (gathers ~3200 tweet).\n- `twint -u username --following --user-full` - Collect full user information a person follows\n- `twint -u username --timeline` - Use an effective method to gather Tweets from a user's profile (Gathers ~3200 Tweets, including **retweets** & **replies**).\n- `twint -u 
username --retweets` - Use a quick method to gather the last 900 Tweets (that includes retweets) from a user's profile.\n- `twint -u username --resume resume_file.txt` - Resume a search starting from the last saved scroll-id.\n\nMore details about the commands and options are located in the [wiki](https://github.com/twintproject/twint/wiki/Commands).\n\n## Module Example\n\nTwint can now be used as a module and supports custom formatting. **More details are located in the [wiki](https://github.com/twintproject/twint/wiki/Module)**\n\n```python\nimport twint\n\n# Configure\nc = twint.Config()\nc.Username = \"realDonaldTrump\"\nc.Search = \"great\"\n\n# Run\ntwint.run.Search(c)\n```\n> Output\n\n`955511208597184512 2018-01-22 18:43:19 GMT <now> pineapples are the best fruit`\n\n```python\nimport twint\n\nc = twint.Config()\n\nc.Username = \"noneprivacy\"\nc.Custom[\"tweet\"] = [\"id\"]\nc.Custom[\"user\"] = [\"bio\"]\nc.Limit = 10\nc.Store_csv = True\nc.Output = \"none\"\n\ntwint.run.Search(c)\n```\n\n## Storing Options\n- Write to file;\n- CSV;\n- JSON;\n- SQLite;\n- Elasticsearch.\n\n## Elasticsearch Setup\n\nDetails on setting up Elasticsearch with Twint are located in the [wiki](https://github.com/twintproject/twint/wiki/Elasticsearch).\n\n## Graph Visualization\n![graph](https://i.imgur.com/EEJqB8n.png)\n\n[Graph](https://github.com/twintproject/twint/wiki/Graph) details are also located in the [wiki](https://github.com/twintproject/twint/wiki/Graph).\n\nWe are developing a Twint Desktop App.\n\n![4](https://i.imgur.com/DzcfIgL.png)\n\n## FAQ\n> I tried scraping tweets from a user, I know that they exist but I'm not getting them\n\nTwitter can shadow-ban accounts, which means that their tweets will not be available via search. To solve this, pass `--profile-full` if you are using Twint via CLI or, if you are using Twint as a module, add `config.Profile_full = True`. 
Please note that this process will be quite slow.\n## More Examples\n\n#### Followers/Following\n\n> To get only follower usernames/following usernames\n\n`twint -u username --followers`\n\n`twint -u username --following`\n\n> To get user info of followers/following users\n\n`twint -u username --followers --user-full`\n\n`twint -u username --following --user-full`\n\n#### userlist\n\n> To get only user info of user\n\n`twint -u username --user-full`\n\n> To get user info of users from a userlist\n\n`twint --userlist inputlist --user-full`\n\n\n#### tweet translation (experimental)\n\n> To get 100 english tweets and translate them to italian\n\n`twint -u noneprivacy --csv --output none.csv --lang en --translate --translate-dest it --limit 100`\n\nor\n\n```python\nimport twint\n\nc = twint.Config()\nc.Username = \"noneprivacy\"\nc.Limit = 100\nc.Store_csv = True\nc.Output = \"none.csv\"\nc.Lang = \"en\"\nc.Translate = True\nc.TranslateDest = \"it\"\ntwint.run.Search(c)\n```\n\nNotes:\n- [Google translate has some quotas](https://cloud.google.com/translate/quotas)\n\n## Featured Blog Posts:\n- [How to use Twint as an OSINT tool](https://pielco11.ovh/posts/twint-osint/)\n- [Basic tutorial made by Null Byte](https://null-byte.wonderhowto.com/how-to/mine-twitter-for-targeted-information-with-twint-0193853/)\n- [Analyzing Tweets with NLP in minutes with Spark, Optimus and Twint](https://towardsdatascience.com/analyzing-tweets-with-nlp-in-minutes-with-spark-optimus-and-twint-a0c96084995f)\n- [Loading tweets into Kafka and Neo4j](https://markhneedham.com/blog/2019/05/29/loading-tweets-twint-kafka-neo4j/)\n\n## Contact\n\nIf you have any question, want to join in discussions, or need extra help, you are welcome to join our Twint focused channel at [OSINT team](https://osint.team)\n"
  },
  {
    "path": "automate.py",
    "content": "import twint\nimport schedule\nimport time\n\n# you can change the name of each \"job\" after \"def\" if you'd like.\ndef jobone():\n\tprint(\"Fetching Tweets\")\n\tc = twint.Config()\n\t# choose username (optional)\n\tc.Username = \"insert username here\"\n\t# choose search term (optional)\n\tc.Search = \"insert search term here\"\n\t# choose beginning time (narrow results)\n\tc.Since = \"2018-01-01\"\n\t# set limit on total tweets\n\tc.Limit = 1000\n\t# store the results as CSV\n\tc.Store_csv = True\n\t# columns of the csv (Custom is keyed by object type, as in the README example)\n\tc.Custom[\"tweet\"] = [\"date\", \"time\", \"username\", \"tweet\", \"link\", \"likes\", \"retweets\", \"replies\", \"mentions\", \"hashtags\"]\n\t# change the name of the csv file\n\tc.Output = \"filename.csv\"\n\ttwint.run.Search(c)\n\ndef jobtwo():\n\tprint(\"Fetching Tweets\")\n\tc = twint.Config()\n\t# choose username (optional)\n\tc.Username = \"insert username here\"\n\t# choose search term (optional)\n\tc.Search = \"insert search term here\"\n\t# choose beginning time (narrow results)\n\tc.Since = \"2018-01-01\"\n\t# set limit on total tweets\n\tc.Limit = 1000\n\t# store the results as CSV\n\tc.Store_csv = True\n\t# columns of the csv (Custom is keyed by object type, as in the README example)\n\tc.Custom[\"tweet\"] = [\"date\", \"time\", \"username\", \"tweet\", \"link\", \"likes\", \"retweets\", \"replies\", \"mentions\", \"hashtags\"]\n\t# change the name of the csv file\n\tc.Output = \"filename2.csv\"\n\ttwint.run.Search(c)\n\n# run once when you start the program\n\njobone()\njobtwo()\n\n# run every minute(s), hour, day at, day of the week, day of the week and time. Use \"#\" to comment out the ones you don't want to use. Remove it to activate them. 
Also, replace \"jobone\" and \"jobtwo\" with your new function names (if applicable)\n\n# schedule.every(1).minutes.do(jobone)\nschedule.every().hour.do(jobone)\n# schedule.every().day.at(\"10:30\").do(jobone)\n# schedule.every().monday.do(jobone)\n# schedule.every().wednesday.at(\"13:15\").do(jobone)\n\n# schedule.every(1).minutes.do(jobtwo)\nschedule.every().hour.do(jobtwo)\n# schedule.every().day.at(\"10:30\").do(jobtwo)\n# schedule.every().monday.do(jobtwo)\n# schedule.every().wednesday.at(\"13:15\").do(jobtwo)\n\nwhile True:\n  schedule.run_pending()\n  time.sleep(1)\n"
  },
  {
    "path": "elasticsearch/README.md",
    "content": "# Elasticsearch How-To\n\n![dashboard](https://i.imgur.com/BEbtdo5.png)\n\nPlease read the Wiki [here](https://github.com/twintproject/twint/wiki/Elasticsearch)\n"
  },
  {
    "path": "setup.py",
    "content": "#!/usr/bin/python3\nfrom setuptools import setup\nimport io\nimport os\n\n# Package meta-data\nNAME = 'twint'\nDESCRIPTION = 'An advanced Twitter scraping & OSINT tool.'\nURL = 'https://github.com/twintproject/twint'\nEMAIL = 'codyzacharias@pm.me'\nAUTHOR = 'Cody Zacharias'\nREQUIRES_PYTHON = '>=3.6.0'\nVERSION = None\n\n# Packages required\nREQUIRED = [\n    'aiohttp', 'aiodns', 'beautifulsoup4', 'cchardet', 'dataclasses',\n    'elasticsearch', 'pysocks', 'pandas', 'aiohttp_socks',\n    'schedule', 'geopy', 'fake-useragent', 'googletransx'\n]\n\nhere = os.path.abspath(os.path.dirname(__file__))\n\nwith io.open(os.path.join(here, 'README.md'), encoding='utf-8') as f:\n    long_description = '\\n' + f.read()\n\n# Load the package's __version__.py\nabout = {}\nif not VERSION:\n    with open(os.path.join(here, NAME, '__version__.py')) as f:\n        exec(f.read(), about)\nelse:\n    about['__version__'] = VERSION\n\nsetup(\n    name=NAME,\n    version=about['__version__'],\n    description=DESCRIPTION,\n    long_description=long_description,\n    long_description_content_type=\"text/markdown\",\n    author=AUTHOR,\n    author_email=EMAIL,\n    python_requires=REQUIRES_PYTHON,\n    url=URL,\n    packages=['twint', 'twint.storage'],\n    entry_points={\n        'console_scripts': [\n            'twint = twint.cli:run_as_command',\n        ],\n    },\n    install_requires=REQUIRED,\n    dependency_links=[\n        'git+https://github.com/x0rzkov/py-googletrans#egg=googletrans'\n    ],\n    license='MIT',\n    classifiers=[\n        'License :: OSI Approved :: MIT License',\n        'Programming Language :: Python',\n        'Programming Language :: Python :: 3',\n        'Programming Language :: Python :: 3.6',\n        'Programming Language :: Python :: 3.7',\n        'Programming Language :: Python :: 3.8',\n        'Programming Language :: Python :: Implementation :: CPython',\n    ],\n)\n"
  },
  {
    "path": "test.py",
    "content": "import twint\nimport os\n\n'''\nTest.py - Testing TWINT to make sure everything works.\n'''\n\n\ndef test_reg(c, run):\n    print(\"[+] Beginning vanilla test in {}\".format(str(run)))\n    run(c)\n\n\ndef test_db(c, run):\n    print(\"[+] Beginning DB test in {}\".format(str(run)))\n    c.Database = \"test_twint.db\"\n    run(c)\n\n\ndef custom(c, run, _type):\n    print(\"[+] Beginning custom {} test in {}\".format(_type, str(run)))\n    c.Custom['tweet'] = [\"id\", \"username\"]\n    c.Custom['user'] = [\"id\", \"username\"]\n    run(c)\n\n\ndef test_json(c, run):\n    c.Store_json = True\n    c.Output = \"test_twint.json\"\n    custom(c, run, \"JSON\")\n    print(\"[+] Beginning JSON test in {}\".format(str(run)))\n    run(c)\n\n\ndef test_csv(c, run):\n    c.Store_csv = True\n    c.Output = \"test_twint.csv\"\n    custom(c, run, \"CSV\")\n    print(\"[+] Beginning CSV test in {}\".format(str(run)))\n    run(c)\n\n\ndef main():\n    c = twint.Config()\n    c.Username = \"verified\"\n    c.Limit = 20\n    c.Store_object = True\n\n    # Separate objects are necessary.\n\n    f = twint.Config()\n    f.Username = \"verified\"\n    f.Limit = 20\n    f.Store_object = True\n    f.User_full = True\n\n    runs = [\n        twint.run.Profile,  # this doesn't\n        twint.run.Search,  # this works\n        twint.run.Following,\n        twint.run.Followers,\n        twint.run.Favorites,\n    ]\n\n    tests = [test_reg, test_json, test_csv, test_db]\n\n    # Something breaks if we don't split these up\n\n    for run in runs[:3]:\n        if run == twint.run.Search:\n            c.Since = \"2012-1-1 20:30:22\"\n            c.Until = \"2017-1-1\"\n        else:\n            c.Since = \"\"\n            c.Until = \"\"\n\n        for test in tests:\n            test(c, run)\n\n    for run in runs[3:]:\n        for test in tests:\n            test(f, run)\n\n    files = [\"test_twint.db\", \"test_twint.json\", \"test_twint.csv\"]\n    for _file in files:\n       
 os.remove(_file)\n\n    print(\"[+] Testing complete!\")\n\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "twint/__init__.py",
    "content": "'''\nTWINT - Twitter Intelligence Tool (formerly known as Tweep).\n\nSee wiki on Github for in-depth details.\nhttps://github.com/twintproject/twint/wiki\n\nLicensed under MIT License\nCopyright (c) 2018 Cody Zacharias\n'''\nimport logging, os\n\nfrom .config import Config\nfrom .__version__ import __version__\nfrom . import run\n\n_levels = {\n    'info': logging.INFO,\n    'debug': logging.DEBUG\n}\n\n_level = os.getenv('TWINT_DEBUG', 'info')\n# Fall back to INFO instead of raising KeyError on an unrecognised value\n_logLevel = _levels.get(_level, logging.INFO)\n\nif _level == \"debug\":\n    logger = logging.getLogger()\n    _output_fn = 'twint.log'\n    logger.setLevel(_logLevel)\n    formatter = logging.Formatter('%(levelname)s:%(asctime)s:%(name)s:%(message)s')\n    fileHandler = logging.FileHandler(_output_fn)\n    fileHandler.setLevel(_logLevel)\n    fileHandler.setFormatter(formatter)\n    logger.addHandler(fileHandler)\n"
  },
  {
    "path": "twint/__version__.py",
    "content": "VERSION = (2, 1, 21)\n\n__version__ = '.'.join(map(str, VERSION))\n"
  },
  {
    "path": "twint/cli.py",
    "content": "#!/usr/bin/env python3\n'''\nTwint.py - Twitter Intelligence Tool (formerly known as Tweep).\n\nSee wiki on Github for in-depth details.\nhttps://github.com/twintproject/twint/wiki\n\nLicensed under MIT License\nCopyright (c) 2018 The Twint Project  \n'''\nimport sys\nimport os\nimport argparse\n\nfrom . import run\nfrom . import config\nfrom . import storage\n\n\ndef error(_error, message):\n    \"\"\" Print errors to stdout\n    \"\"\"\n    print(\"[-] {}: {}\".format(_error, message))\n    sys.exit(0)\n\n\ndef check(args):\n    \"\"\" Error checking\n    \"\"\"\n    if args.username is not None or args.userlist or args.members_list:\n        if args.verified:\n            error(\"Contradicting Args\",\n                  \"Please use --verified in combination with -s.\")\n        if args.userid:\n            error(\"Contradicting Args\",\n                  \"--userid and -u cannot be used together.\")\n        if args.all:\n            error(\"Contradicting Args\",\n                  \"--all and -u cannot be used together.\")\n    elif args.search and args.timeline:\n        error(\"Contradicting Args\",\n              \"--s and --tl cannot be used together.\")\n    elif args.timeline and not args.username:\n        error(\"Error\", \"-tl cannot be used without -u.\")\n    elif args.search is None:\n        if args.custom_query is not None:\n            pass\n        elif (args.geo or args.near) is None and not (args.all or args.userid):\n            error(\"Error\", \"Please use at least -u, -s, -g or --near.\")\n    elif args.all and args.userid:\n        error(\"Contradicting Args\",\n              \"--all and --userid cannot be used together\")\n    if args.output is None:\n        if args.csv:\n            error(\"Error\", \"Please specify an output file (Example: -o file.csv).\")\n        elif args.json:\n            error(\"Error\", \"Please specify an output file (Example: -o file.json).\")\n    if args.backoff_exponent <= 0:\n        
error(\"Error\", \"Please specify a positive value for backoff_exponent\")\n    if args.min_wait_time < 0:\n        error(\"Error\", \"Please specify a non-negative value for min_wait_time\")\n\n\ndef loadUserList(ul, _type):\n    \"\"\" Concatenate users\n    \"\"\"\n    if os.path.exists(os.path.abspath(ul)):\n        userlist = open(os.path.abspath(ul), \"r\").read().splitlines()\n    else:\n        userlist = ul.split(\",\")\n    if _type == \"search\":\n        un = \"\"\n        for user in userlist:\n            un += \"%20OR%20from%3A\" + user\n        return un[15:]\n    return userlist\n\n\ndef initialize(args):\n    \"\"\" Set default values for config from args\n    \"\"\"\n    c = config.Config()\n    c.Username = args.username\n    c.User_id = args.userid\n    c.Search = args.search\n    c.Geo = args.geo\n    c.Location = args.location\n    c.Near = args.near\n    c.Lang = args.lang\n    c.Output = args.output\n    c.Elasticsearch = args.elasticsearch\n    c.Year = args.year\n    c.Since = args.since\n    c.Until = args.until\n    c.Email = args.email\n    c.Phone = args.phone\n    c.Verified = args.verified\n    c.Store_csv = args.csv\n    c.Tabs = args.tabs\n    c.Store_json = args.json\n    c.Show_hashtags = args.hashtags\n    c.Show_cashtags = args.cashtags\n    c.Limit = args.limit\n    c.Count = args.count\n    c.Stats = args.stats\n    c.Database = args.database\n    c.To = args.to\n    c.All = args.all\n    c.Essid = args.essid\n    c.Format = args.format\n    c.User_full = args.user_full\n    # c.Profile_full = args.profile_full\n    c.Pandas_type = args.pandas_type\n    c.Index_tweets = args.index_tweets\n    c.Index_follow = args.index_follow\n    c.Index_users = args.index_users\n    c.Debug = args.debug\n    c.Resume = args.resume\n    c.Images = args.images\n    c.Videos = args.videos\n    c.Media = args.media\n    c.Replies = args.replies\n    c.Pandas_clean = args.pandas_clean\n    c.Proxy_host = args.proxy_host\n    c.Proxy_port = 
args.proxy_port\n    c.Proxy_type = args.proxy_type\n    c.Tor_control_port = args.tor_control_port\n    c.Tor_control_password = args.tor_control_password\n    c.Retweets = args.retweets\n    c.Custom_query = args.custom_query\n    c.Popular_tweets = args.popular_tweets\n    c.Skip_certs = args.skip_certs\n    c.Hide_output = args.hide_output\n    c.Native_retweets = args.native_retweets\n    c.Min_likes = args.min_likes\n    c.Min_retweets = args.min_retweets\n    c.Min_replies = args.min_replies\n    c.Links = args.links\n    c.Source = args.source\n    c.Members_list = args.members_list\n    c.Filter_retweets = args.filter_retweets\n    c.Translate = args.translate\n    c.TranslateDest = args.translate_dest\n    c.Backoff_exponent = args.backoff_exponent\n    c.Min_wait_time = args.min_wait_time\n    return c\n\n\ndef options():\n    \"\"\" Parse arguments\n    \"\"\"\n    ap = argparse.ArgumentParser(prog=\"twint\",\n                                 usage=\"python3 %(prog)s [options]\",\n                                 description=\"TWINT - An Advanced Twitter Scraping Tool.\")\n    ap.add_argument(\"-u\", \"--username\", help=\"User's Tweets you want to scrape.\")\n    ap.add_argument(\"-s\", \"--search\", help=\"Search for Tweets containing this word or phrase.\")\n    ap.add_argument(\"-g\", \"--geo\", help=\"Search for geocoded Tweets.\")\n    ap.add_argument(\"--near\", help=\"Near a specified city.\")\n    ap.add_argument(\"--location\", help=\"Show user's location (Experimental).\", action=\"store_true\")\n    ap.add_argument(\"-l\", \"--lang\", help=\"Search for Tweets in a specific language.\")\n    ap.add_argument(\"-o\", \"--output\", help=\"Save output to a file.\")\n    ap.add_argument(\"-es\", \"--elasticsearch\", help=\"Index to Elasticsearch.\")\n    ap.add_argument(\"--year\", help=\"Filter Tweets before specified year.\")\n    ap.add_argument(\"--since\", help=\"Filter Tweets sent since date (Example: \\\"2017-12-27 20:30:15\\\" or 
2017-12-27).\",\n                    metavar=\"DATE\")\n    ap.add_argument(\"--until\", help=\"Filter Tweets sent until date (Example: \\\"2017-12-27 20:30:15\\\" or 2017-12-27).\",\n                    metavar=\"DATE\")\n    ap.add_argument(\"--email\", help=\"Filter Tweets that might have email addresses\", action=\"store_true\")\n    ap.add_argument(\"--phone\", help=\"Filter Tweets that might have phone numbers\", action=\"store_true\")\n    ap.add_argument(\"--verified\", help=\"Display Tweets only from verified users (Use with -s).\",\n                    action=\"store_true\")\n    ap.add_argument(\"--csv\", help=\"Write as .csv file.\", action=\"store_true\")\n    ap.add_argument(\"--tabs\", help=\"Separate CSV fields with tab characters, not commas.\", action=\"store_true\")\n    ap.add_argument(\"--json\", help=\"Write as .json file\", action=\"store_true\")\n    ap.add_argument(\"--hashtags\", help=\"Output hashtags in separate column.\", action=\"store_true\")\n    ap.add_argument(\"--cashtags\", help=\"Output cashtags in separate column.\", action=\"store_true\")\n    ap.add_argument(\"--userid\", help=\"Twitter user id.\")\n    ap.add_argument(\"--limit\", help=\"Number of Tweets to pull (Increments of 20).\")\n    ap.add_argument(\"--count\", help=\"Display number of Tweets scraped at the end of session.\",\n                    action=\"store_true\")\n    ap.add_argument(\"--stats\", help=\"Show number of replies, retweets, and likes.\",\n                    action=\"store_true\")\n    ap.add_argument(\"-db\", \"--database\", help=\"Store Tweets in a sqlite3 database.\")\n    ap.add_argument(\"--to\", help=\"Search Tweets to a user.\", metavar=\"USERNAME\")\n    ap.add_argument(\"--all\", help=\"Search all Tweets associated with a user.\", metavar=\"USERNAME\")\n    ap.add_argument(\"--followers\", help=\"Scrape a person's followers.\", action=\"store_true\")\n    ap.add_argument(\"--following\", help=\"Scrape a person's follows\", 
action=\"store_true\")\n    ap.add_argument(\"--favorites\", help=\"Scrape Tweets a user has liked.\", action=\"store_true\")\n    ap.add_argument(\"--proxy-type\", help=\"Socks5, HTTP, etc.\")\n    ap.add_argument(\"--proxy-host\", help=\"Proxy hostname or IP.\")\n    ap.add_argument(\"--proxy-port\", help=\"The port of the proxy server.\")\n    ap.add_argument(\"--tor-control-port\", help=\"If proxy-host is set to tor, this is the control port\", default=9051)\n    ap.add_argument(\"--tor-control-password\",\n                    help=\"If proxy-host is set to tor, this is the password for the control port\",\n                    default=\"my_password\")\n    ap.add_argument(\"--essid\",\n                    help=\"Elasticsearch Session ID, use this to differentiate scraping sessions.\",\n                    nargs=\"?\", default=\"\")\n    ap.add_argument(\"--userlist\", help=\"Userlist from list or file.\")\n    ap.add_argument(\"--retweets\",\n                    help=\"Include user's Retweets (Warning: limited).\",\n                    action=\"store_true\")\n    ap.add_argument(\"--format\", help=\"Custom output format (See wiki for details).\")\n    ap.add_argument(\"--user-full\",\n                    help=\"Collect all user information (Use with followers or following only).\",\n                    action=\"store_true\")\n    # I am removing this this feature for the time being, because it is no longer required, default method will do this\n    # ap.add_argument(\"--profile-full\",\n    #                 help=\"Slow, but effective method of collecting a user's Tweets and RT.\",\n    #                 action=\"store_true\")\n    ap.add_argument(\n        \"-tl\",\n        \"--timeline\",\n        help=\"Collects every tweet from a User's Timeline. 
(Tweets, RTs & Replies)\",\n        action=\"store_true\",\n    )\n    ap.add_argument(\"--translate\",\n                    help=\"Get tweets translated by Google Translate.\",\n                    action=\"store_true\")\n    ap.add_argument(\"--translate-dest\", help=\"Translate tweet to language (ISO2).\",\n                    default=\"en\")\n    ap.add_argument(\"--store-pandas\", help=\"Save Tweets in a DataFrame (Pandas) file.\")\n    ap.add_argument(\"--pandas-type\",\n                    help=\"Specify HDF5 or Pickle (HDF5 as default)\", nargs=\"?\", default=\"HDF5\")\n    ap.add_argument(\"-it\", \"--index-tweets\",\n                    help=\"Custom Elasticsearch Index name for Tweets.\", nargs=\"?\", default=\"twinttweets\")\n    ap.add_argument(\"-if\", \"--index-follow\",\n                    help=\"Custom Elasticsearch Index name for Follows.\",\n                    nargs=\"?\", default=\"twintgraph\")\n    ap.add_argument(\"-iu\", \"--index-users\", help=\"Custom Elasticsearch Index name for Users.\",\n                    nargs=\"?\", default=\"twintuser\")\n    ap.add_argument(\"--debug\",\n                    help=\"Store information in debug logs\", action=\"store_true\")\n    ap.add_argument(\"--resume\", help=\"Resume from Tweet ID.\", metavar=\"TWEET_ID\")\n    ap.add_argument(\"--videos\", help=\"Display only Tweets with videos.\", action=\"store_true\")\n    ap.add_argument(\"--images\", help=\"Display only Tweets with images.\", action=\"store_true\")\n    ap.add_argument(\"--media\",\n                    help=\"Display Tweets with only images or videos.\", action=\"store_true\")\n    ap.add_argument(\"--replies\", help=\"Display replies to a subject.\", action=\"store_true\")\n    ap.add_argument(\"-pc\", \"--pandas-clean\",\n                    help=\"Automatically clean Pandas dataframe at every scrape.\")\n    ap.add_argument(\"-cq\", \"--custom-query\", help=\"Custom search query.\")\n    ap.add_argument(\"-pt\", \"--popular-tweets\", 
help=\"Scrape popular tweets instead of recent ones.\",\n                    action=\"store_true\")\n    ap.add_argument(\"-sc\", \"--skip-certs\", help=\"Skip certs verification, useful for SSC.\", action=\"store_false\")\n    ap.add_argument(\"-ho\", \"--hide-output\", help=\"Hide output, no tweets will be displayed.\", action=\"store_true\")\n    ap.add_argument(\"-nr\", \"--native-retweets\", help=\"Filter the results for retweets only.\", action=\"store_true\")\n    ap.add_argument(\"--min-likes\", help=\"Filter the tweets by minimum number of likes.\")\n    ap.add_argument(\"--min-retweets\", help=\"Filter the tweets by minimum number of retweets.\")\n    ap.add_argument(\"--min-replies\", help=\"Filter the tweets by minimum number of replies.\")\n    ap.add_argument(\"--links\", help=\"Include or exclude tweets containing one or more links. If not specified\" +\n                                    \" you will get both tweets that might contain links or not.\")\n    ap.add_argument(\"--source\", help=\"Filter the tweets for a specific source client.\")\n    ap.add_argument(\"--members-list\", help=\"Filter the tweets sent by users in a given list.\")\n    ap.add_argument(\"-fr\", \"--filter-retweets\", help=\"Exclude retweets from the results.\", action=\"store_true\")\n    ap.add_argument(\"--backoff-exponent\", help=\"Specify an exponent for the polynomial backoff in case of errors.\",\n                    type=float, default=3.0)\n    ap.add_argument(\"--min-wait-time\", type=float, default=15,\n                    help=\"Specify a minimum wait time in case of scraping limit error. 
This value will be adjusted by twint if the value provided does not satisfy the limits constraints\")\n    args = ap.parse_args()\n\n    return args\n\n\ndef main():\n    \"\"\" Main\n    \"\"\"\n    args = options()\n    check(args)\n\n    if args.pandas_clean:\n        storage.panda.clean()\n\n    c = initialize(args)\n\n    if args.userlist:\n        c.Query = loadUserList(args.userlist, \"search\")\n\n    if args.pandas_clean:\n        storage.panda.clean()\n\n    if args.favorites:\n        if args.userlist:\n            _userlist = loadUserList(args.userlist, \"favorites\")\n            for _user in _userlist:\n                args.username = _user\n                c = initialize(args)\n                run.Favorites(c)\n        else:\n            run.Favorites(c)\n    elif args.following:\n        if args.userlist:\n            _userlist = loadUserList(args.userlist, \"following\")\n            for _user in _userlist:\n                args.username = _user\n                c = initialize(args)\n                run.Following(c)\n        else:\n            run.Following(c)\n    elif args.followers:\n        if args.userlist:\n            _userlist = loadUserList(args.userlist, \"followers\")\n            for _user in _userlist:\n                args.username = _user\n                c = initialize(args)\n                run.Followers(c)\n        else:\n            run.Followers(c)\n    elif args.retweets:  # or args.profile_full:\n        if args.userlist:\n            _userlist = loadUserList(args.userlist, \"profile\")\n            for _user in _userlist:\n                args.username = _user\n                c = initialize(args)\n                run.Profile(c)\n        else:\n            run.Profile(c)\n    elif args.user_full:\n        if args.userlist:\n            _userlist = loadUserList(args.userlist, \"userlist\")\n            for _user in _userlist:\n                args.username = _user\n                c = initialize(args)\n                
run.Lookup(c)\n        else:\n            run.Lookup(c)\n    elif args.timeline:\n        run.Profile(c)\n    else:\n        run.Search(c)\n\n\ndef run_as_command():\n    # Compare version tuples: parsing '3.10' as a float gives 3.1, which would wrongly fail the check\n    if sys.version_info < (3, 6):\n        print(\"[-] TWINT requires Python version 3.6+.\")\n        sys.exit(0)\n\n    main()\n\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "twint/config.py",
    "content": "from dataclasses import dataclass\nfrom typing import Optional\n\n@dataclass\nclass Config:\n    Username: Optional[str] = None\n    User_id: Optional[str] = None\n    Search: Optional[str] = None\n    Lookup: bool = False\n    Geo: str = \"\"\n    Location: bool = False\n    Near: str = None\n    Lang: Optional[str] = None\n    Output: Optional[str] = None\n    Elasticsearch: object = None\n    Year: Optional[int] = None\n    Since: Optional[str] = None\n    Until: Optional[str] = None\n    Email: Optional[str] = None\n    Phone: Optional[str] = None\n    Verified: bool = False\n    Store_csv: bool = False\n    Store_json: bool = False\n    Custom = {\"tweet\": None, \"user\": None, \"username\": None}\n    Show_hashtags: bool = False\n    Show_cashtags: bool = False\n    Limit: Optional[int] = None\n    Count: Optional[int] = None\n    Stats: bool = False\n    Database: object = None\n    To: str = None\n    All = None\n    Debug: bool = False\n    Format = None\n    Essid: str = \"\"\n    Profile: bool = False\n    Followers: bool = False\n    Following: bool = False\n    Favorites: bool = False\n    TwitterSearch: bool = False\n    User_full: bool = False\n    # Profile_full: bool = False\n    Store_object: bool = False\n    Store_object_tweets_list: list = None\n    Store_object_users_list: list = None\n    Store_object_follow_list: list = None\n    Pandas_type: type = None\n    Pandas: bool = False\n    Index_tweets: str = \"twinttweets\"\n    Index_follow: str = \"twintgraph\"\n    Index_users: str = \"twintuser\"\n    Retries_count: int = 10\n    Resume: object = None\n    Images: bool = False\n    Videos: bool = False\n    Media: bool = False\n    Replies: bool = False\n    Pandas_clean: bool = True\n    Lowercase: bool = True\n    Pandas_au: bool = True\n    Proxy_host: str = \"\"\n    Proxy_port: int = 0\n    Proxy_type: object = None\n    Tor_control_port: int = 9051\n    Tor_control_password: str = None\n    Retweets: bool = False\n    
Query: Optional[str] = None\n    Hide_output: bool = False\n    Custom_query: str = \"\"\n    Popular_tweets: bool = False\n    Skip_certs: bool = False\n    Native_retweets: bool = False\n    Min_likes: int = 0\n    Min_retweets: int = 0\n    Min_replies: int = 0\n    Links: Optional[str] = None\n    Source: Optional[str] = None\n    Members_list: Optional[str] = None\n    Filter_retweets: bool = False\n    Translate: bool = False\n    TranslateSrc: str = \"en\"\n    TranslateDest: str = \"en\"\n    Backoff_exponent: float = 3.0\n    Min_wait_time: float = 0\n    Bearer_token: Optional[str] = None\n    Guest_token: Optional[str] = None\n    deleted: Optional[list] = None\n"
  },
  {
    "path": "twint/datelock.py",
    "content": "import datetime\n\nimport logging as logme\n\nfrom .tweet import utc_to_local\n\n\nclass Datelock:\n    until = None\n    since = None\n    _since_def_user = None\n\n\ndef convertToDateTime(string):\n    dateTimeList = string.split()\n    ListLength = len(dateTimeList)\n    if ListLength == 2:\n        return string\n    if ListLength == 1:\n        return string + \" 00:00:00\"\n    else:\n        return \"\"\n\n\ndef Set(Until, Since):\n    logme.debug(__name__+':Set')\n    d = Datelock()\n\n    if Until:\n        d.until = datetime.datetime.strptime(convertToDateTime(Until), \"%Y-%m-%d %H:%M:%S\")\n        d.until = utc_to_local(d.until)\n    else:\n        d.until = datetime.datetime.today()\n\n    if Since:\n        d.since = datetime.datetime.strptime(convertToDateTime(Since), \"%Y-%m-%d %H:%M:%S\")\n        d.since = utc_to_local(d.since)\n        d._since_def_user = True\n    else:\n        d.since = datetime.datetime.strptime(\"2006-03-21 00:00:00\", \"%Y-%m-%d %H:%M:%S\")\n        d.since = utc_to_local(d.since)\n        d._since_def_user = False\n\n    return d\n"
  },
  {
    "path": "twint/feed.py",
    "content": "import time\nfrom datetime import datetime\n\nfrom bs4 import BeautifulSoup\nfrom re import findall\nfrom json import loads\n\nimport logging as logme\n\nfrom .tweet import utc_to_local, Tweet_formats\n\n\nclass NoMoreTweetsException(Exception):\n    def __init__(self, msg):\n        super().__init__(msg)\n\n\ndef Follow(response):\n    logme.debug(__name__ + ':Follow')\n    soup = BeautifulSoup(response, \"html.parser\")\n    follow = soup.find_all(\"td\", \"info fifty screenname\")\n    cursor = soup.find_all(\"div\", \"w-button-more\")\n    try:\n        cursor = findall(r'cursor=(.*?)\">', str(cursor))[0]\n    except IndexError:\n        logme.critical(__name__ + ':Follow:IndexError')\n\n    return follow, cursor\n\n\n# TODO: this won't be used by --profile-full anymore. if it isn't used anywhere else, perhaps remove this in future\ndef Mobile(response):\n    logme.debug(__name__ + ':Mobile')\n    soup = BeautifulSoup(response, \"html.parser\")\n    tweets = soup.find_all(\"span\", \"metadata\")\n    max_id = soup.find_all(\"div\", \"w-button-more\")\n    try:\n        max_id = findall(r'max_id=(.*?)\">', str(max_id))[0]\n    except Exception as e:\n        logme.critical(__name__ + ':Mobile:' + str(e))\n\n    return tweets, max_id\n\n\ndef MobileFav(response):\n    soup = BeautifulSoup(response, \"html.parser\")\n    tweets = soup.find_all(\"table\", \"tweet\")\n    max_id = soup.find_all(\"div\", \"w-button-more\")\n    try:\n        max_id = findall(r'max_id=(.*?)\">', str(max_id))[0]\n    except Exception as e:\n        print(str(e) + \" [x] feed.MobileFav\")\n\n    return tweets, max_id\n\n\ndef _get_cursor(response):\n    try:\n        next_cursor = response['timeline']['instructions'][0]['addEntries']['entries'][-1]['content'][\n            'operation']['cursor']['value']\n    except KeyError:\n        # this is needed because after the first request location of cursor is changed\n        next_cursor = 
response['timeline']['instructions'][-1]['replaceEntry']['entry']['content']['operation'][\n            'cursor']['value']\n    return next_cursor\n\n\ndef Json(response):\n    logme.debug(__name__ + ':Json')\n    json_response = loads(response)\n    html = json_response[\"items_html\"]\n    soup = BeautifulSoup(html, \"html.parser\")\n    feed = soup.find_all(\"div\", \"tweet\")\n    return feed, json_response[\"min_position\"]\n\n\ndef parse_tweets(config, response):\n    logme.debug(__name__ + ':parse_tweets')\n    response = loads(response)\n    if len(response['globalObjects']['tweets']) == 0:\n        msg = 'No more data!'\n        raise NoMoreTweetsException(msg)\n    feed = []\n    for timeline_entry in response['timeline']['instructions'][0]['addEntries']['entries']:\n        # this will handle the cases when the timeline entry is a tweet\n        if (config.TwitterSearch or config.Profile) and (timeline_entry['entryId'].startswith('sq-I-t-') or\n                                                         timeline_entry['entryId'].startswith('tweet-')):\n            if 'tweet' in timeline_entry['content']['item']['content']:\n                _id = timeline_entry['content']['item']['content']['tweet']['id']\n                # skip the ads\n                if 'promotedMetadata' in timeline_entry['content']['item']['content']['tweet']:\n                    continue\n            elif 'tombstone' in timeline_entry['content']['item']['content'] and 'tweet' in \\\n                    timeline_entry['content']['item']['content']['tombstone']:\n                _id = timeline_entry['content']['item']['content']['tombstone']['tweet']['id']\n            else:\n                _id = None\n            if _id is None:\n                raise ValueError('Unable to find ID of tweet in timeline.')\n            try:\n                temp_obj = response['globalObjects']['tweets'][_id]\n            except KeyError:\n                logme.info('encountered a deleted tweet with id 
{}'.format(_id))\n\n                config.deleted.append(_id)\n                continue\n            temp_obj['user_data'] = response['globalObjects']['users'][temp_obj['user_id_str']]\n            if 'retweeted_status_id_str' in temp_obj:\n                rt_id = temp_obj['retweeted_status_id_str']\n                _dt = response['globalObjects']['tweets'][rt_id]['created_at']\n                _dt = datetime.strptime(_dt, '%a %b %d %H:%M:%S %z %Y')\n                _dt = utc_to_local(_dt)\n                _dt = str(_dt.strftime(Tweet_formats['datetime']))\n                temp_obj['retweet_data'] = {\n                    'user_rt_id': response['globalObjects']['tweets'][rt_id]['user_id_str'],\n                    'user_rt': response['globalObjects']['tweets'][rt_id]['full_text'],\n                    'retweet_id': rt_id,\n                    'retweet_date': _dt,\n                }\n            feed.append(temp_obj)\n    next_cursor = _get_cursor(response)\n    return feed, next_cursor\n"
  },
  {
    "path": "twint/format.py",
    "content": "import logging as logme\n\ndef Tweet(config, t):\n    if config.Format:\n        logme.debug(__name__+':Tweet:Format')\n        output = config.Format.replace(\"{id}\", t.id_str)\n        output = output.replace(\"{conversation_id}\", t.conversation_id)\n        output = output.replace(\"{date}\", t.datestamp)\n        output = output.replace(\"{time}\", t.timestamp)\n        output = output.replace(\"{user_id}\", t.user_id_str)\n        output = output.replace(\"{username}\", t.username)\n        output = output.replace(\"{name}\", t.name)\n        output = output.replace(\"{place}\", t.place)\n        output = output.replace(\"{timezone}\", t.timezone)\n        output = output.replace(\"{urls}\", \",\".join(t.urls))\n        output = output.replace(\"{photos}\", \",\".join(t.photos))\n        output = output.replace(\"{video}\", str(t.video))\n        output = output.replace(\"{thumbnail}\", t.thumbnail)\n        output = output.replace(\"{tweet}\", t.tweet)\n        output = output.replace(\"{language}\", t.lang)\n        output = output.replace(\"{hashtags}\", \",\".join(t.hashtags))\n        output = output.replace(\"{cashtags}\", \",\".join(t.cashtags))\n        output = output.replace(\"{replies}\", t.replies_count)\n        output = output.replace(\"{retweets}\", t.retweets_count)\n        output = output.replace(\"{likes}\", t.likes_count)\n        output = output.replace(\"{link}\", t.link)\n        output = output.replace(\"{is_retweet}\", str(t.retweet))\n        output = output.replace(\"{user_rt_id}\", str(t.user_rt_id))\n        output = output.replace(\"{quote_url}\", t.quote_url)\n        output = output.replace(\"{near}\", t.near)\n        output = output.replace(\"{geo}\", t.geo)\n        output = output.replace(\"{mentions}\", \",\".join(t.mentions))\n        output = output.replace(\"{translate}\", t.translate)\n        output = output.replace(\"{trans_src}\", t.trans_src)\n        output = output.replace(\"{trans_dest}\", 
t.trans_dest)\n    else:\n        logme.debug(__name__+':Tweet:notFormat')\n        output = f\"{t.id_str} {t.datestamp} {t.timestamp} {t.timezone} \"\n\n        # TODO: someone who is familiar with this code, needs to take a look at what this is <also see tweet.py>\n        # if t.retweet:\n        #    output += \"RT \"\n\n        output += f\"<{t.username}> {t.tweet}\"\n\n        if config.Show_hashtags:\n            hashtags = \",\".join(t.hashtags)\n            output += f\" {hashtags}\"\n        if config.Show_cashtags:\n            cashtags = \",\".join(t.cashtags)\n            output += f\" {cashtags}\"\n        if config.Stats:\n            output += f\" | {t.replies_count} replies {t.retweets_count} retweets {t.likes_count} likes\"\n        if config.Translate:\n            output += f\" {t.translate} {t.trans_src} {t.trans_dest}\"\n    return output\n\ndef User(_format, u):\n    if _format:\n        logme.debug(__name__+':User:Format')\n        output = _format.replace(\"{id}\", str(u.id))\n        output = output.replace(\"{name}\", u.name)\n        output = output.replace(\"{username}\", u.username)\n        output = output.replace(\"{bio}\", u.bio)\n        output = output.replace(\"{location}\", u.location)\n        output = output.replace(\"{url}\", u.url)\n        output = output.replace(\"{join_date}\", u.join_date)\n        output = output.replace(\"{join_time}\", u.join_time)\n        output = output.replace(\"{tweets}\", str(u.tweets))\n        output = output.replace(\"{following}\", str(u.following))\n        output = output.replace(\"{followers}\", str(u.followers))\n        output = output.replace(\"{likes}\", str(u.likes))\n        output = output.replace(\"{media}\", str(u.media_count))\n        output = output.replace(\"{private}\", str(u.is_private))\n        output = output.replace(\"{verified}\", str(u.is_verified))\n        output = output.replace(\"{avatar}\", u.avatar)\n        if u.background_image:\n            output = 
output.replace(\"{background_image}\", u.background_image)\n        else:\n            output = output.replace(\"{background_image}\", \"\")\n    else:\n        logme.debug(__name__+':User:notFormat')\n        output = f\"{u.id} | {u.name} | @{u.username} | Private: \"\n        output += f\"{u.is_private} | Verified: {u.is_verified} |\"\n        output += f\" Bio: {u.bio} | Location: {u.location} | Url: \"\n        output += f\"{u.url} | Joined: {u.join_date} {u.join_time} \"\n        output += f\"| Tweets: {u.tweets} | Following: {u.following}\"\n        output += f\" | Followers: {u.followers} | Likes: {u.likes} \"\n        output += f\"| Media: {u.media_count} | Avatar: {u.avatar}\"\n\n    return output\n"
  },
  {
    "path": "twint/get.py",
    "content": "from async_timeout import timeout\nfrom datetime import datetime\nfrom bs4 import BeautifulSoup\nimport sys\nimport socket\nimport aiohttp\nfrom fake_useragent import UserAgent\nimport asyncio\nimport concurrent.futures\nimport random\nfrom json import loads, dumps\nfrom aiohttp_socks import ProxyConnector, ProxyType\nfrom urllib.parse import quote\n\nfrom . import url\nfrom .output import Tweets, Users\nfrom .token import TokenExpiryException\n\nimport logging as logme\n\nhttpproxy = None\n\nuser_agent_list = [\n    # 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/60.0.3112.113 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/60.0.3112.90 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 5.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/60.0.3112.90 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/60.0.3112.90 Safari/537.36',\n    # 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/44.0.2403.157 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/60.0.3112.113 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/57.0.2987.133 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/57.0.2987.133 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/55.0.2883.87 Safari/537.36',\n    # 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'\n    # ' Chrome/55.0.2883.87 Safari/537.36',\n\n    'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',\n    'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) 
like Gecko',\n    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)',\n    'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko',\n    'Mozilla/5.0 (Windows NT 6.2; WOW64; Trident/7.0; rv:11.0) like Gecko',\n    'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko',\n    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)',\n    'Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko',\n    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',\n    'Mozilla/5.0 (Windows NT 6.1; Win64; x64; Trident/7.0; rv:11.0) like Gecko',\n    'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)',\n    'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)',\n    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET '\n    'CLR 3.5.30729)',\n]\n\n\n# function to convert python `dict` to json and then encode it to be passed in the url as a parameter\n# some urls require this format\ndef dict_to_url(dct):\n    return quote(dumps(dct))\n\n\ndef get_connector(config):\n    logme.debug(__name__ + ':get_connector')\n    _connector = None\n    if config.Proxy_host:\n        if config.Proxy_host.lower() == \"tor\":\n            _connector = ProxyConnector(\n                host='127.0.0.1',\n                port=9050,\n                rdns=True)\n        elif config.Proxy_port and config.Proxy_type:\n            if config.Proxy_type.lower() == \"socks5\":\n                _type = ProxyType.SOCKS5\n            elif config.Proxy_type.lower() == \"socks4\":\n                _type = ProxyType.SOCKS4\n            elif config.Proxy_type.lower() == \"http\":\n                global httpproxy\n                httpproxy = \"http://\" + config.Proxy_host + \":\" + str(config.Proxy_port)\n                return _connector\n            else:\n                logme.critical(\"get_connector:proxy-type-error\")\n             
   print(\"Error: Proxy types allowed are: http, socks5 and socks4. No https.\")\n                sys.exit(1)\n            _connector = ProxyConnector(\n                proxy_type=_type,\n                host=config.Proxy_host,\n                port=config.Proxy_port,\n                rdns=True)\n        else:\n            logme.critical(__name__ + ':get_connector:proxy-port-type-error')\n            print(\"Error: Please specify --proxy-host, --proxy-port, and --proxy-type\")\n            sys.exit(1)\n    else:\n        if config.Proxy_port or config.Proxy_type:\n            logme.critical(__name__ + ':get_connector:proxy-host-arg-error')\n            print(\"Error: Please specify --proxy-host, --proxy-port, and --proxy-type\")\n            sys.exit(1)\n\n    return _connector\n\n\nasync def RequestUrl(config, init):\n    logme.debug(__name__ + ':RequestUrl')\n    _connector = get_connector(config)\n    _serialQuery = \"\"\n    params = []\n    _url = \"\"\n    _headers = [(\"authorization\", config.Bearer_token), (\"x-guest-token\", config.Guest_token)]\n\n    # TODO : do this later\n    if config.Profile:\n        logme.debug(__name__ + ':RequestUrl:Profile')\n        _url, params, _serialQuery = url.SearchProfile(config, init)\n    elif config.TwitterSearch:\n        logme.debug(__name__ + ':RequestUrl:TwitterSearch')\n        _url, params, _serialQuery = await url.Search(config, init)\n    else:\n        if config.Following:\n            logme.debug(__name__ + ':RequestUrl:Following')\n            _url = await url.Following(config.Username, init)\n        elif config.Followers:\n            logme.debug(__name__ + ':RequestUrl:Followers')\n            _url = await url.Followers(config.Username, init)\n        else:\n            logme.debug(__name__ + ':RequestUrl:Favorites')\n            _url = await url.Favorites(config.Username, init)\n        _serialQuery = _url\n\n    response = await Request(_url, params=params, connector=_connector, headers=_headers)\n\n  
  if config.Debug:\n        # use a context manager so the log file handle is closed after each write\n        with open(\"twint-request_urls.log\", \"a\", encoding=\"utf-8\") as log_file:\n            print(_serialQuery, file=log_file)\n\n    return response\n\n\ndef ForceNewTorIdentity(config):\n    logme.debug(__name__ + ':ForceNewTorIdentity')\n    try:\n        tor_c = socket.create_connection(('127.0.0.1', config.Tor_control_port))\n        tor_c.send('AUTHENTICATE \"{}\"\\r\\nSIGNAL NEWNYM\\r\\n'.format(config.Tor_control_password).encode())\n        response = tor_c.recv(1024)\n        if response != b'250 OK\\r\\n250 OK\\r\\n':\n            sys.stderr.write('Unexpected response from Tor control port: {}\\n'.format(response))\n            logme.critical(__name__ + ':ForceNewTorIdentity:unexpectedResponse')\n    except Exception as e:\n        logme.debug(__name__ + ':ForceNewTorIdentity:errorConnectingTor')\n        sys.stderr.write('Error connecting to Tor control port: {}\\n'.format(repr(e)))\n        sys.stderr.write('If you want to rotate Tor ports automatically - enable Tor control port\\n')\n\n\nasync def Request(_url, connector=None, params=None, headers=None):\n    logme.debug(__name__ + ':Request:Connector')\n    async with aiohttp.ClientSession(connector=connector, headers=headers) as session:\n        return await Response(session, _url, params)\n\n\nasync def Response(session, _url, params=None):\n    logme.debug(__name__ + ':Response')\n    # async_timeout must be entered with `async with`; the plain `with` form is deprecated\n    async with timeout(120):\n        async with session.get(_url, ssl=True, params=params, proxy=httpproxy) as response:\n            resp = await response.text()\n            if response.status == 429:  # 429 implies Too many requests i.e. 
Rate Limit Exceeded\n                raise TokenExpiryException(loads(resp)['errors'][0]['message'])\n            return resp\n\n\nasync def RandomUserAgent(wa=None):\n    logme.debug(__name__ + ':RandomUserAgent')\n    try:\n        if wa:\n            return \"Mozilla/5.0 (Windows NT 6.4; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36\"\n        return UserAgent(verify_ssl=False, use_cache_server=False).random\n    except:\n        return random.choice(user_agent_list)\n\n\nasync def Username(_id, bearer_token, guest_token):\n    logme.debug(__name__ + ':Username')\n    _dct = {'userId': _id, 'withHighlightedLabel': False}\n    _url = \"https://api.twitter.com/graphql/B9FuNQVmyx32rdbIPEZKag/UserByRestId?variables={}\".format(dict_to_url(_dct))\n    _headers = {\n        'authorization': bearer_token,\n        'x-guest-token': guest_token,\n    }\n    r = await Request(_url, headers=_headers)\n    j_r = loads(r)\n    username = j_r['data']['user']['legacy']['screen_name']\n    return username\n\n\nasync def Tweet(url, config, conn):\n    logme.debug(__name__ + ':Tweet')\n    try:\n        response = await Request(url)\n        soup = BeautifulSoup(response, \"html.parser\")\n        tweets = soup.find_all(\"div\", \"tweet\")\n        await Tweets(tweets, config, conn, url)\n    except Exception as e:\n        logme.critical(__name__ + ':Tweet:' + str(e))\n\n\nasync def User(username, config, conn, user_id=False):\n    logme.debug(__name__ + ':User')\n    _dct = {'screen_name': username, 'withHighlightedLabel': False}\n    _url = 'https://api.twitter.com/graphql/jMaTS-_Ea8vh9rpKggJbCQ/UserByScreenName?variables={}'\\\n        .format(dict_to_url(_dct))\n    _headers = {\n        'authorization': config.Bearer_token,\n        'x-guest-token': config.Guest_token,\n    }\n    try:\n        response = await Request(_url, headers=_headers)\n        j_r = loads(response)\n        if user_id:\n            try:\n                _id = 
j_r['data']['user']['rest_id']\n                return _id\n            except KeyError as e:\n                logme.critical(__name__ + ':User:' + str(e))\n                return\n        await Users(j_r, config, conn)\n    except Exception as e:\n        logme.critical(__name__ + ':User:' + str(e))\n        raise\n\n\ndef Limit(Limit, count):\n    logme.debug(__name__ + ':Limit')\n    if Limit is not None and count >= int(Limit):\n        return True\n\n\nasync def Multi(feed, config, conn):\n    logme.debug(__name__ + ':Multi')\n    count = 0\n    try:\n        with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:\n            loop = asyncio.get_event_loop()\n            futures = []\n            for tweet in feed:\n                count += 1\n                if config.Favorites or config.Profile_full:\n                    logme.debug(__name__ + ':Multi:Favorites-profileFull')\n                    link = tweet.find(\"a\")[\"href\"]\n                    url = f\"https://twitter.com{link}&lang=en\"\n                elif config.User_full:\n                    logme.debug(__name__ + ':Multi:userFull')\n                    username = tweet.find(\"a\")[\"name\"]\n                    url = f\"http://twitter.com/{username}?lang=en\"\n                else:\n                    logme.debug(__name__ + ':Multi:else-url')\n                    link = tweet.find(\"a\", \"tweet-timestamp js-permalink js-nav js-tooltip\")[\"href\"]\n                    url = f\"https://twitter.com{link}?lang=en\"\n\n                if config.User_full:\n                    logme.debug(__name__ + ':Multi:user-full-Run')\n                    futures.append(loop.run_in_executor(executor, await User(url,\n                                                                             config, conn)))\n                else:\n                    logme.debug(__name__ + ':Multi:notUser-full-Run')\n                    futures.append(loop.run_in_executor(executor, await Tweet(url,\n          
                                                                    config, conn)))\n            logme.debug(__name__ + ':Multi:asyncioGather')\n            await asyncio.gather(*futures)\n    except Exception as e:\n        # TODO: fix error not error\n        # print(str(e) + \" [x] get.Multi\")\n        # will return \"'NoneType' object is not callable\"\n        # but still works\n        # logme.critical(__name__+':Multi:' + str(e))\n        pass\n\n    return count\n"
  },
  {
    "path": "twint/output.py",
    "content": "from datetime import datetime\n\nfrom . import format, get\nfrom .tweet import Tweet\nfrom .user import User\nfrom .storage import db, elasticsearch, write, panda\n\nimport logging as logme\n\nfollows_list = []\ntweets_list = []\nusers_list = []\n\nauthor_list = {''}\nauthor_list.pop()\n\n# used by Pandas\n_follows_object = {}\n\n\ndef _formatDateTime(datetimestamp):\n    try:\n        return int(datetime.strptime(datetimestamp, \"%Y-%m-%d %H:%M:%S\").timestamp())\n    except ValueError:\n        return int(datetime.strptime(datetimestamp, \"%Y-%m-%d\").timestamp())\n\n\ndef _clean_follow_list():\n    logme.debug(__name__ + ':clean_follow_list')\n    global _follows_object\n    _follows_object = {}\n\n\ndef clean_lists():\n    logme.debug(__name__ + ':clean_lists')\n    global follows_list\n    global tweets_list\n    global users_list\n    follows_list = []\n    tweets_list = []\n    users_list = []\n\n\ndef datecheck(datetimestamp, config):\n    logme.debug(__name__ + ':datecheck')\n    if config.Since:\n        logme.debug(__name__ + ':datecheck:SinceTrue')\n\n        d = _formatDateTime(datetimestamp)\n        s = _formatDateTime(config.Since)\n\n        if d < s:\n            return False\n    if config.Until:\n        logme.debug(__name__ + ':datecheck:UntilTrue')\n\n        d = _formatDateTime(datetimestamp)\n        s = _formatDateTime(config.Until)\n\n        if d > s:\n            return False\n    logme.debug(__name__ + ':datecheck:dateRangeFalse')\n    return True\n\n\n# TODO In this method we need to delete the quoted tweets, because twitter also sends the quoted tweets in the\n#  `tweets` list along with the other tweets\ndef is_tweet(tw):\n    try:\n        tw[\"data-item-id\"]\n        logme.debug(__name__ + ':is_tweet:True')\n        return True\n    except:\n        logme.critical(__name__ + ':is_tweet:False')\n        return False\n\n\ndef _output(obj, output, config, **extra):\n    logme.debug(__name__ + ':_output')\n    if 
config.Lowercase:\n        if isinstance(obj, str):\n            logme.debug(__name__ + ':_output:Lowercase:username')\n            obj = obj.lower()\n        elif obj.__class__.__name__ == \"user\":\n            logme.debug(__name__ + ':_output:Lowercase:user')\n            pass\n        elif obj.__class__.__name__ == \"tweet\":\n            logme.debug(__name__ + ':_output:Lowercase:tweet')\n            obj.username = obj.username.lower()\n            author_list.update({obj.username})\n            for dct in obj.mentions:\n                for key, val in dct.items():\n                    dct[key] = val.lower()\n            for i in range(len(obj.hashtags)):\n                obj.hashtags[i] = obj.hashtags[i].lower()\n            for i in range(len(obj.cashtags)):\n                obj.cashtags[i] = obj.cashtags[i].lower()\n        else:\n            logme.info('_output:Lowercase:hiddenTweetFound')\n            print(\"[x] Hidden tweet found, account suspended due to violation of TOS\")\n            return\n    if config.Output != None:\n        if config.Store_csv:\n            try:\n                write.Csv(obj, config)\n                logme.debug(__name__ + ':_output:CSV')\n            except Exception as e:\n                logme.critical(__name__ + ':_output:CSV:Error:' + str(e))\n                print(str(e) + \" [x] output._output\")\n        elif config.Store_json:\n            write.Json(obj, config)\n            logme.debug(__name__ + ':_output:JSON')\n        else:\n            write.Text(output, config.Output)\n            logme.debug(__name__ + ':_output:Text')\n\n    if config.Elasticsearch:\n        logme.debug(__name__ + ':_output:Elasticsearch')\n        print(\"\", end=\".\", flush=True)\n    else:\n        if not config.Hide_output:\n            try:\n                print(output.replace('\\n', ' '))\n            except UnicodeEncodeError:\n                logme.critical(__name__ + ':_output:UnicodeEncodeError')\n                print(\"unicode 
error [x] output._output\")\n\n\nasync def checkData(tweet, config, conn):\n    logme.debug(__name__ + ':checkData')\n    tweet = Tweet(tweet, config)\n    if not tweet.datestamp:\n        logme.critical(__name__ + ':checkData:hiddenTweetFound')\n        print(\"[x] Hidden tweet found, account suspended due to violation of TOS\")\n        return\n    if datecheck(tweet.datestamp + \" \" + tweet.timestamp, config):\n        output = format.Tweet(config, tweet)\n        if config.Database:\n            logme.debug(__name__ + ':checkData:Database')\n            db.tweets(conn, tweet, config)\n        if config.Pandas:\n            logme.debug(__name__ + ':checkData:Pandas')\n            panda.update(tweet, config)\n        if config.Store_object:\n            logme.debug(__name__ + ':checkData:Store_object')\n            if hasattr(config.Store_object_tweets_list, 'append'):\n                config.Store_object_tweets_list.append(tweet)\n            else:\n                tweets_list.append(tweet)\n        if config.Elasticsearch:\n            logme.debug(__name__ + ':checkData:Elasticsearch')\n            elasticsearch.Tweet(tweet, config)\n        _output(tweet, output, config)\n    # else:\n    #     logme.critical(__name__+':checkData:copyrightedTweet')\n\n\nasync def Tweets(tweets, config, conn):\n    logme.debug(__name__ + ':Tweets')\n    if config.Favorites or config.Location:\n        logme.debug(__name__ + ':Tweets:fav+full+loc')\n        for tw in tweets:\n            await checkData(tw, config, conn)\n    elif config.TwitterSearch or config.Profile:\n        logme.debug(__name__ + ':Tweets:TwitterSearch')\n        await checkData(tweets, config, conn)\n    else:\n        logme.debug(__name__ + ':Tweets:else')\n        if int(tweets[\"data-user-id\"]) == config.User_id or config.Retweets:\n            await checkData(tweets, config, conn)\n\n\nasync def Users(u, config, conn):\n    logme.debug(__name__ + ':User')\n    global users_list\n\n    user = 
User(u)\n    output = format.User(config.Format, user)\n\n    if config.Database:\n        logme.debug(__name__ + ':User:Database')\n        db.user(conn, config, user)\n\n    if config.Elasticsearch:\n        logme.debug(__name__ + ':User:Elasticsearch')\n        _save_date = user.join_date\n        _save_time = user.join_time\n        user.join_date = str(datetime.strptime(user.join_date, \"%d %b %Y\")).split()[0]\n        user.join_time = str(datetime.strptime(user.join_time, \"%I:%M %p\")).split()[1]\n        elasticsearch.UserProfile(user, config)\n        user.join_date = _save_date\n        user.join_time = _save_time\n\n    if config.Store_object:\n        logme.debug(__name__ + ':User:Store_object')\n\n        if hasattr(config.Store_object_follow_list, 'append'):\n            config.Store_object_follow_list.append(user)\n        elif hasattr(config.Store_object_users_list, 'append'):\n            config.Store_object_users_list.append(user)\n        else:\n            users_list.append(user)  # twint.user.user\n\n    if config.Pandas:\n        logme.debug(__name__ + ':User:Pandas+user')\n        panda.update(user, config)\n\n    _output(user, output, config)\n\n\nasync def Username(username, config, conn):\n    logme.debug(__name__ + ':Username')\n    global _follows_object\n    global follows_list\n    follow_var = config.Following * \"following\" + config.Followers * \"followers\"\n\n    if config.Database:\n        logme.debug(__name__ + ':Username:Database')\n        db.follow(conn, config.Username, config.Followers, username)\n\n    if config.Elasticsearch:\n        logme.debug(__name__ + ':Username:Elasticsearch')\n        elasticsearch.Follow(username, config)\n\n    if config.Store_object:\n        if hasattr(config.Store_object_follow_list, 'append'):\n            config.Store_object_follow_list.append(username)\n        else:\n            follows_list.append(username)  # twint.user.user\n\n    if config.Pandas:\n        logme.debug(__name__ + 
':Username:object+pandas')\n        try:\n            _ = _follows_object[config.Username][follow_var]\n        except KeyError:\n            _follows_object.update({config.Username: {follow_var: []}})\n        _follows_object[config.Username][follow_var].append(username)\n        if config.Pandas_au:\n            logme.debug(__name__ + ':Username:object+pandas+au')\n            panda.update(_follows_object[config.Username], config)\n    _output(username, username, config)\n"
  },
  {
    "path": "twint/run.py",
    "content": "import sys, os, datetime\nfrom asyncio import get_event_loop, TimeoutError, ensure_future, new_event_loop, set_event_loop\n\nfrom . import datelock, feed, get, output, verbose, storage\nfrom .token import TokenExpiryException\nfrom . import token\nfrom .storage import db\nfrom .feed import NoMoreTweetsException\n\nimport logging as logme\n\nimport time\n\nbearer = 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs' \\\n         '%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA'\n\n\nclass Twint:\n    def __init__(self, config):\n        logme.debug(__name__ + ':Twint:__init__')\n        if config.Resume is not None and (config.TwitterSearch or config.Followers or config.Following):\n            logme.debug(__name__ + ':Twint:__init__:Resume')\n            self.init = self.get_resume(config.Resume)\n        else:\n            self.init = -1\n\n        config.deleted = []\n        self.feed: list = [-1]\n        self.count = 0\n        self.user_agent = \"\"\n        self.config = config\n        self.config.Bearer_token = bearer\n        # TODO might have to make some adjustments for it to work with multi-treading\n        # USAGE : to get a new guest token simply do `self.token.refresh()`\n        self.token = token.Token(config)\n        self.token.refresh()\n        self.conn = db.Conn(config.Database)\n        self.d = datelock.Set(self.config.Until, self.config.Since)\n        verbose.Elastic(config.Elasticsearch)\n\n        if self.config.Store_object:\n            logme.debug(__name__ + ':Twint:__init__:clean_follow_list')\n            output._clean_follow_list()\n\n        if self.config.Pandas_clean:\n            logme.debug(__name__ + ':Twint:__init__:pandas_clean')\n            storage.panda.clean()\n\n    def get_resume(self, resumeFile):\n        if not os.path.exists(resumeFile):\n            return '-1'\n        with open(resumeFile, 'r') as rFile:\n            _init = rFile.readlines()[-1].strip('\\n')\n            
return _init\n\n    async def Feed(self):\n        logme.debug(__name__ + ':Twint:Feed')\n        consecutive_errors_count = 0\n        while True:\n            # this will receive a JSON string, parse it into a `dict` and do the required stuff\n            try:\n                response = await get.RequestUrl(self.config, self.init)\n            except TokenExpiryException as e:\n                logme.debug(__name__ + 'Twint:Feed:' + str(e))\n                self.token.refresh()\n                response = await get.RequestUrl(self.config, self.init)\n\n            if self.config.Debug:\n                print(response, file=open(\"twint-last-request.log\", \"w\", encoding=\"utf-8\"))\n\n            self.feed = []\n            try:\n                if self.config.Favorites:\n                    self.feed, self.init = feed.MobileFav(response)\n                    favorite_err_cnt = 0\n                    if len(self.feed) == 0 and len(self.init) == 0:\n                        while (len(self.feed) == 0 or len(self.init) == 0) and favorite_err_cnt < 5:\n                            self.user_agent = await get.RandomUserAgent(wa=False)\n                            response = await get.RequestUrl(self.config, self.init,\n                                                            headers=[(\"User-Agent\", self.user_agent)])\n                            self.feed, self.init = feed.MobileFav(response)\n                            favorite_err_cnt += 1\n                            time.sleep(1)\n                        if favorite_err_cnt == 5:\n                            print(\"Favorite page could not be fetched\")\n                    if not self.count % 40:\n                        time.sleep(5)\n                elif self.config.Followers or self.config.Following:\n                    self.feed, self.init = feed.Follow(response)\n                    if not self.count % 40:\n                        time.sleep(5)\n                elif self.config.Profile or 
self.config.TwitterSearch:\n                    try:\n                        self.feed, self.init = feed.parse_tweets(self.config, response)\n                    except NoMoreTweetsException as e:\n                        logme.debug(__name__ + ':Twint:Feed:' + str(e))\n                        print('[!] ' + str(e) + ' Scraping will stop now.')\n                        print('found {} deleted tweets in this search.'.format(len(self.config.deleted)))\n                        break\n                break\n            except TimeoutError as e:\n                if self.config.Proxy_host.lower() == \"tor\":\n                    print(\"[?] Timed out, changing Tor identity...\")\n                    if self.config.Tor_control_password is None:\n                        logme.critical(__name__ + ':Twint:Feed:tor-password')\n                        sys.stderr.write(\"Error: config.Tor_control_password must be set for proxy auto-rotation!\\r\\n\")\n                        sys.stderr.write(\n                            \"Info: What is it? See https://stem.torproject.org/faq.html#can-i-interact-with-tors\"\n                            \"-controller-interface-directly\\r\\n\")\n                        break\n                    else:\n                        get.ForceNewTorIdentity(self.config)\n                        continue\n                else:\n                    logme.critical(__name__ + ':Twint:Feed:' + str(e))\n                    print(str(e))\n                    break\n            except Exception as e:\n                if self.config.Profile or self.config.Favorites:\n                    print(\"[!] Twitter does not return more data, scrape stops here.\")\n                    break\n\n                logme.critical(__name__ + ':Twint:Feed:noData' + str(e))\n                # Sometimes Twitter says there is no data. 
But it's a lie.\n                # raise\n                consecutive_errors_count += 1\n                if consecutive_errors_count < self.config.Retries_count:\n                    # skip to the next iteration if wait time does not satisfy limit constraints\n                    delay = round(consecutive_errors_count ** self.config.Backoff_exponent, 1)\n\n                    # if the delay is less than users set min wait time then replace delay\n                    if self.config.Min_wait_time > delay:\n                        delay = self.config.Min_wait_time\n\n                    sys.stderr.write('sleeping for {} secs\\n'.format(delay))\n                    time.sleep(delay)\n                    self.user_agent = await get.RandomUserAgent(wa=True)\n                    continue\n                logme.critical(__name__ + ':Twint:Feed:Tweets_known_error:' + str(e))\n                sys.stderr.write(str(e) + \" [x] run.Feed\")\n                sys.stderr.write(\n                    \"[!] 
if you get this error but you know for sure that more tweets exist, please open an issue and \"\n                    \"we will investigate it!\")\n                break\n        if self.config.Resume:\n            print(self.init, file=open(self.config.Resume, \"a\", encoding=\"utf-8\"))\n\n    async def follow(self):\n        await self.Feed()\n        if self.config.User_full:\n            logme.debug(__name__ + ':Twint:follow:userFull')\n            self.count += await get.Multi(self.feed, self.config, self.conn)\n        else:\n            logme.debug(__name__ + ':Twint:follow:notUserFull')\n            for user in self.feed:\n                self.count += 1\n                username = user.find(\"a\")[\"name\"]\n                await output.Username(username, self.config, self.conn)\n\n    async def favorite(self):\n        logme.debug(__name__ + ':Twint:favorite')\n        await self.Feed()\n        favorited_tweets_list = []\n        for tweet in self.feed:\n            tweet_dict = {}\n            self.count += 1\n            try:\n                tweet_dict['data-item-id'] = tweet.find(\"div\", {\"class\": \"tweet-text\"})['data-id']\n                t_url = tweet.find(\"span\", {\"class\": \"metadata\"}).find(\"a\")[\"href\"]\n                tweet_dict['data-conversation-id'] = t_url.split('?')[0].split('/')[-1]\n                tweet_dict['username'] = tweet.find(\"div\", {\"class\": \"username\"}).text.replace('\\n', '').replace(' ',\n                                                                                                                 '')\n                tweet_dict['tweet'] = tweet.find(\"div\", {\"class\": \"tweet-text\"}).find(\"div\", {\"class\": \"dir-ltr\"}).text\n                date_str = tweet.find(\"td\", {\"class\": \"timestamp\"}).find(\"a\").text\n                # test_dates = [\"1m\", \"2h\", \"Jun 21, 2019\", \"Mar 12\", \"28 Jun 19\"]\n                # date_str = test_dates[3]\n                if len(date_str) <= 3 and 
(date_str[-1] == \"m\" or date_str[-1] == \"h\"):  # 25m 1h\n                    dateu = str(datetime.date.today())\n                    tweet_dict['date'] = dateu\n                elif ',' in date_str:  # Aug 21, 2019\n                    sp = date_str.replace(',', '').split(' ')\n                    date_str_formatted = sp[1] + ' ' + sp[0] + ' ' + sp[2]\n                    dateu = datetime.datetime.strptime(date_str_formatted, \"%d %b %Y\").strftime(\"%Y-%m-%d\")\n                    tweet_dict['date'] = dateu\n                elif len(date_str.split(' ')) == 3:  # 28 Jun 19\n                    sp = date_str.split(' ')\n                    if len(sp[2]) == 2:\n                        sp[2] = '20' + sp[2]\n                    date_str_formatted = sp[0] + ' ' + sp[1] + ' ' + sp[2]\n                    dateu = datetime.datetime.strptime(date_str_formatted, \"%d %b %Y\").strftime(\"%Y-%m-%d\")\n                    tweet_dict['date'] = dateu\n                else:  # Aug 21\n                    sp = date_str.split(' ')\n                    date_str_formatted = sp[1] + ' ' + sp[0] + ' ' + str(datetime.date.today().year)\n                    dateu = datetime.datetime.strptime(date_str_formatted, \"%d %b %Y\").strftime(\"%Y-%m-%d\")\n                    tweet_dict['date'] = dateu\n\n                favorited_tweets_list.append(tweet_dict)\n\n            except Exception as e:\n                logme.critical(__name__ + ':Twint:favorite:favorite_field_lack')\n                print(\"shit: \", date_str, \" \", str(e))\n\n        try:\n            self.config.favorited_tweets_list += favorited_tweets_list\n        except AttributeError:\n            self.config.favorited_tweets_list = favorited_tweets_list\n\n    async def profile(self):\n        await self.Feed()\n        logme.debug(__name__ + ':Twint:profile')\n        for tweet in self.feed:\n            self.count += 1\n            await output.Tweets(tweet, self.config, self.conn)\n\n    async def tweets(self):\n     
   await self.Feed()\n        # TODO : need to take care of this later\n        if self.config.Location:\n            logme.debug(__name__ + ':Twint:tweets:location')\n            self.count += await get.Multi(self.feed, self.config, self.conn)\n        else:\n            logme.debug(__name__ + ':Twint:tweets:notLocation')\n            for tweet in self.feed:\n                self.count += 1\n                await output.Tweets(tweet, self.config, self.conn)\n\n    async def main(self, callback=None):\n\n        task = ensure_future(self.run())  # Might be changed to create_task in 3.7+.\n\n        if callback:\n            task.add_done_callback(callback)\n\n        await task\n\n    async def run(self):\n        if self.config.TwitterSearch:\n            self.user_agent = await get.RandomUserAgent(wa=True)\n        else:\n            self.user_agent = await get.RandomUserAgent()\n\n        if self.config.User_id is not None and self.config.Username is None:\n            logme.debug(__name__ + ':Twint:main:user_id')\n            self.config.Username = await get.Username(self.config.User_id, self.config.Bearer_token,\n                                                      self.config.Guest_token)\n\n        if self.config.Username is not None and self.config.User_id is None:\n            logme.debug(__name__ + ':Twint:main:username')\n\n            self.config.User_id = await get.User(self.config.Username, self.config, self.conn, True)\n            if self.config.User_id is None:\n                raise ValueError(\"Cannot find twitter account with name = \" + self.config.Username)\n\n        # TODO : will need to modify it to work with the new endpoints\n        if self.config.TwitterSearch and self.config.Since and self.config.Until:\n            logme.debug(__name__ + ':Twint:main:search+since+until')\n            while self.d.since < self.d.until:\n                self.config.Since = datetime.datetime.strftime(self.d.since, \"%Y-%m-%d %H:%M:%S\")\n                
self.config.Until = datetime.datetime.strftime(self.d.until, \"%Y-%m-%d %H:%M:%S\")\n                if len(self.feed) > 0:\n                    await self.tweets()\n                else:\n                    logme.debug(__name__ + ':Twint:main:gettingNewTweets')\n                    break\n\n                if get.Limit(self.config.Limit, self.count):\n                    break\n        elif self.config.Lookup:\n            await self.Lookup()\n        else:\n            logme.debug(__name__ + ':Twint:main:not-search+since+until')\n            while True:\n                if len(self.feed) > 0:\n                    if self.config.Followers or self.config.Following:\n                        logme.debug(__name__ + ':Twint:main:follow')\n                        await self.follow()\n                    elif self.config.Favorites:\n                        logme.debug(__name__ + ':Twint:main:favorites')\n                        await self.favorite()\n                    elif self.config.Profile:\n                        logme.debug(__name__ + ':Twint:main:profile')\n                        await self.profile()\n                    elif self.config.TwitterSearch:\n                        logme.debug(__name__ + ':Twint:main:twitter-search')\n                        await self.tweets()\n                else:\n                    logme.debug(__name__ + ':Twint:main:no-more-tweets')\n                    break\n\n                # logging.info(\"[<] \" + str(datetime.now()) + ':: run+Twint+main+CallingGetLimit2')\n                if get.Limit(self.config.Limit, self.count):\n                    logme.debug(__name__ + ':Twint:main:reachedLimit')\n                    break\n\n        if self.config.Count:\n            verbose.Count(self.count, self.config)\n\n    async def Lookup(self):\n        logme.debug(__name__ + ':Twint:Lookup')\n\n        try:\n            if self.config.User_id is not None and self.config.Username is None:\n                logme.debug(__name__ + 
':Twint:Lookup:user_id')\n                self.config.Username = await get.Username(self.config.User_id, self.config.Bearer_token,\n                                                          self.config.Guest_token)\n            await get.User(self.config.Username, self.config, db.Conn(self.config.Database))\n\n        except Exception as e:\n            logme.exception(__name__ + ':Twint:Lookup:Unexpected exception occurred.')\n            raise\n\n\ndef run(config, callback=None):\n    logme.debug(__name__ + ':run')\n    try:\n        get_event_loop()\n    except RuntimeError as e:\n        if \"no current event loop\" in str(e):\n            set_event_loop(new_event_loop())\n        else:\n            logme.exception(__name__ + ':run:Unexpected exception while handling an expected RuntimeError.')\n            raise\n    except Exception as e:\n        logme.exception(\n            __name__ + ':run:Unexpected exception occurred while attempting to get or create a new event loop.')\n        raise\n\n    get_event_loop().run_until_complete(Twint(config).main(callback))\n\n\ndef Favorites(config):\n    logme.debug(__name__ + ':Favorites')\n    config.Favorites = True\n    config.Following = False\n    config.Followers = False\n    config.Profile = False\n    config.TwitterSearch = False\n    run(config)\n    if config.Pandas_au:\n        storage.panda._autoget(\"tweet\")\n\n\ndef Followers(config):\n    logme.debug(__name__ + ':Followers')\n    config.Followers = True\n    config.Following = False\n    config.Profile = False\n    config.Favorites = False\n    config.TwitterSearch = False\n    run(config)\n    if config.Pandas_au:\n        storage.panda._autoget(\"followers\")\n        if config.User_full:\n            storage.panda._autoget(\"user\")\n    if config.Pandas_clean and not config.Store_object:\n        # storage.panda.clean()\n        output._clean_follow_list()\n\n\ndef Following(config):\n    logme.debug(__name__ + ':Following')\n    config.Following = 
True\n    config.Followers = False\n    config.Profile = False\n    config.Favorites = False\n    config.TwitterSearch = False\n    run(config)\n    if config.Pandas_au:\n        storage.panda._autoget(\"following\")\n        if config.User_full:\n            storage.panda._autoget(\"user\")\n    if config.Pandas_clean and not config.Store_object:\n        # storage.panda.clean()\n        output._clean_follow_list()\n\n\ndef Lookup(config):\n    logme.debug(__name__ + ':Lookup')\n    config.Profile = False\n    config.Lookup = True\n    config.Favorites = False\n    config.Following = False\n    config.Followers = False\n    config.TwitterSearch = False\n    run(config)\n    if config.Pandas_au:\n        storage.panda._autoget(\"user\")\n\n\ndef Profile(config):\n    logme.debug(__name__ + ':Profile')\n    config.Profile = True\n    config.Favorites = False\n    config.Following = False\n    config.Followers = False\n    config.TwitterSearch = False\n    run(config)\n    if config.Pandas_au:\n        storage.panda._autoget(\"tweet\")\n\n\ndef Search(config, callback=None):\n    logme.debug(__name__ + ':Search')\n    config.TwitterSearch = True\n    config.Favorites = False\n    config.Following = False\n    config.Followers = False\n    config.Profile = False\n    run(config, callback)\n    if config.Pandas_au:\n        storage.panda._autoget(\"tweet\")\n"
  },
  {
    "path": "twint/storage/__init__.py",
    "content": ""
  },
  {
    "path": "twint/storage/db.py",
    "content": "import sqlite3\nimport sys\nimport time\nimport hashlib\n\nfrom datetime import datetime\n\ndef Conn(database):\n    if database:\n        print(\"[+] Inserting into Database: \" + str(database))\n        conn = init(database)\n        if isinstance(conn, str): # error\n            print(conn)\n            sys.exit(1)\n    else:\n        conn = \"\"\n\n    return conn\n\ndef init(db):\n    try:\n        conn = sqlite3.connect(db)\n        cursor = conn.cursor()\n\n        table_users = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                users(\n                    id integer not null,\n                    id_str text not null,\n                    name text,\n                    username text not null,\n                    bio text,\n                    location text,\n                    url text,\n                    join_date text not null,\n                    join_time text not null,\n                    tweets integer,\n                    following integer,\n                    followers integer,\n                    likes integer,\n                    media integer,\n                    private integer not null,\n                    verified integer not null,\n                    profile_image_url text not null,\n                    background_image text,\n                    hex_dig  text not null,\n                    time_update integer not null,\n                    CONSTRAINT users_pk PRIMARY KEY (id, hex_dig)\n                );\n            \"\"\"\n        cursor.execute(table_users)\n\n        table_tweets = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                tweets (\n                    id integer not null,\n                    id_str text not null,\n                    tweet text default '',\n                    language text default '',\n                    conversation_id text not null,\n                    created_at integer not null,\n                    date text not null,\n                    time 
text not null,\n                    timezone text not null,\n                    place text default '',\n                    replies_count integer,\n                    likes_count integer,\n                    retweets_count integer,\n                    user_id integer not null,\n                    user_id_str text not null,\n                    screen_name text not null,\n                    name text default '',\n                    link text,\n                    mentions text,\n                    hashtags text,\n                    cashtags text,\n                    urls text,\n                    photos text,\n                    thumbnail text,\n                    quote_url text,\n                    video integer,\n                    geo text,\n                    near text,\n                    source text,\n                    time_update integer not null,\n                    `translate` text default '',\n                    trans_src text default '',\n                    trans_dest text default '',\n                    PRIMARY KEY (id)\n                );\n        \"\"\"\n        cursor.execute(table_tweets)\n\n        table_retweets = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                retweets(\n                    user_id integer not null,\n                    username text not null,\n                    tweet_id integer not null,\n                    retweet_id integer not null,\n                    retweet_date integer,\n                    CONSTRAINT retweets_pk PRIMARY KEY(user_id, tweet_id),\n                    CONSTRAINT user_id_fk FOREIGN KEY(user_id) REFERENCES users(id),\n                    CONSTRAINT tweet_id_fk FOREIGN KEY(tweet_id) REFERENCES tweets(id)\n                );\n        \"\"\"\n        cursor.execute(table_retweets)\n\n        table_reply_to = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                replies(\n                    tweet_id integer not null,\n                    user_id integer not 
null,\n                    username text not null,\n                    CONSTRAINT replies_pk PRIMARY KEY (user_id, tweet_id),\n                    CONSTRAINT tweet_id_fk FOREIGN KEY (tweet_id) REFERENCES tweets(id)\n                );\n        \"\"\"\n        cursor.execute(table_reply_to)\n\n        table_favorites =  \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                favorites(\n                    user_id integer not null,\n                    tweet_id integer not null,\n                    CONSTRAINT favorites_pk PRIMARY KEY (user_id, tweet_id),\n                    CONSTRAINT user_id_fk FOREIGN KEY (user_id) REFERENCES users(id),\n                    CONSTRAINT tweet_id_fk FOREIGN KEY (tweet_id) REFERENCES tweets(id)\n                );\n        \"\"\"\n        cursor.execute(table_favorites)\n\n        table_followers = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                followers (\n                    id integer not null,\n                    follower_id integer not null,\n                    CONSTRAINT followers_pk PRIMARY KEY (id, follower_id),\n                    CONSTRAINT id_fk FOREIGN KEY(id) REFERENCES users(id),\n                    CONSTRAINT follower_id_fk FOREIGN KEY(follower_id) REFERENCES users(id)\n                );\n        \"\"\"\n        cursor.execute(table_followers)\n\n        table_following = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                following (\n                    id integer not null,\n                    following_id integer not null,\n                    CONSTRAINT following_pk PRIMARY KEY (id, following_id),\n                    CONSTRAINT id_fk FOREIGN KEY(id) REFERENCES users(id),\n                    CONSTRAINT following_id_fk FOREIGN KEY(following_id) REFERENCES users(id)\n                );\n        \"\"\"\n        cursor.execute(table_following)\n\n        table_followers_names = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                followers_names (\n               
     user text not null,\n                    time_update integer not null,\n                    follower text not null,\n                    PRIMARY KEY (user, follower)\n                );\n        \"\"\"\n        cursor.execute(table_followers_names)\n\n        table_following_names = \"\"\"\n            CREATE TABLE IF NOT EXISTS\n                following_names (\n                    user text not null,\n                    time_update integer not null,\n                    follows text not null,\n                    PRIMARY KEY (user, follows)\n                );\n        \"\"\"\n        cursor.execute(table_following_names)\n\n        return conn\n    except Exception as e:\n        return str(e)\n\ndef fTable(Followers):\n    if Followers:\n        table = \"followers_names\"\n    else:\n        table = \"following_names\"\n\n    return table\n\ndef uTable(Followers):\n    if Followers:\n        table = \"followers\"\n    else:\n        table = \"following\"\n\n    return table\n\ndef follow(conn, Username, Followers, User):\n    try:\n        time_ms = round(time.time()*1000)\n        cursor = conn.cursor()\n        entry = (User, time_ms, Username,)\n        table = fTable(Followers)\n        query = f\"INSERT INTO {table} VALUES(?,?,?)\"\n        cursor.execute(query, entry)\n        conn.commit()\n    except sqlite3.IntegrityError:\n        pass\n\ndef get_hash_id(conn, id):\n    cursor = conn.cursor()\n    cursor.execute('SELECT hex_dig FROM users WHERE id = ? 
LIMIT 1', (id,))\n    resultset = cursor.fetchall()\n    return resultset[0][0] if resultset else -1\n\ndef user(conn, config, User):\n    try:\n        time_ms = round(time.time()*1000)\n        cursor = conn.cursor()\n        user = [int(User.id), User.id, User.name, User.username, User.bio, User.location, User.url,User.join_date, User.join_time, User.tweets, User.following, User.followers, User.likes, User.media_count, User.is_private, User.is_verified, User.avatar, User.background_image]\n\n        hex_dig = hashlib.sha256(','.join(str(v) for v in user).encode()).hexdigest()\n        entry = tuple(user) + (hex_dig,time_ms,)\n        old_hash = get_hash_id(conn, User.id)\n\n        if old_hash == -1 or old_hash != hex_dig:\n            query = f\"INSERT INTO users VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)\"\n            cursor.execute(query, entry)\n        else:\n            pass\n\n        if config.Followers or config.Following:\n            table = uTable(config.Followers)\n            query = f\"INSERT INTO {table} VALUES(?,?)\"\n            cursor.execute(query, (config.User_id, int(User.id)))\n\n        conn.commit()\n    except sqlite3.IntegrityError:\n        pass\n\ndef tweets(conn, Tweet, config):\n    try:\n        time_ms = round(time.time()*1000)\n        cursor = conn.cursor()\n        entry = (Tweet.id,\n                    Tweet.id_str,\n                    Tweet.tweet,\n                    Tweet.lang,\n                    Tweet.conversation_id,\n                    Tweet.datetime,\n                    Tweet.datestamp,\n                    Tweet.timestamp,\n                    Tweet.timezone,\n                    Tweet.place,\n                    Tweet.replies_count,\n                    Tweet.likes_count,\n                    Tweet.retweets_count,\n                    Tweet.user_id,\n                    Tweet.user_id_str,\n                    Tweet.username,\n                    Tweet.name,\n                    Tweet.link,\n               
     \",\".join(Tweet.mentions),\n                    \",\".join(Tweet.hashtags),\n                    \",\".join(Tweet.cashtags),\n                    \",\".join(Tweet.urls),\n                    \",\".join(Tweet.photos),\n                    Tweet.thumbnail,\n                    Tweet.quote_url,\n                    Tweet.video,\n                    Tweet.geo,\n                    Tweet.near,\n                    Tweet.source,\n                    time_ms,\n                    Tweet.translate,\n                    Tweet.trans_src,\n                    Tweet.trans_dest)\n        cursor.execute('INSERT INTO tweets VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)', entry)\n\n        if config.Favorites:\n            query = 'INSERT INTO favorites VALUES(?,?)'\n            cursor.execute(query, (config.User_id, Tweet.id))\n\n        if Tweet.retweet:\n            query = 'INSERT INTO retweets VALUES(?,?,?,?,?)'\n            _d = datetime.timestamp(datetime.strptime(Tweet.retweet_date, \"%Y-%m-%d %H:%M:%S\"))\n            cursor.execute(query, (int(Tweet.user_rt_id), Tweet.user_rt, Tweet.id, int(Tweet.retweet_id), _d))\n\n        if Tweet.reply_to:\n            for reply in Tweet.reply_to:\n                query = 'INSERT INTO replies VALUES(?,?,?)'\n                cursor.execute(query, (Tweet.id, int(reply['user_id']), reply['username']))\n\n        conn.commit()\n    except sqlite3.IntegrityError:\n        pass\n"
  },
  {
    "path": "twint/storage/elasticsearch.py",
    "content": "## TODO - Fix Weekday situation\nfrom elasticsearch import Elasticsearch, helpers\nfrom geopy.geocoders import Nominatim\nfrom datetime import datetime\nimport contextlib\nimport sys\n\n_index_tweet_status = False\n_index_follow_status = False\n_index_user_status = False\n_is_near_def = False\n_is_location_def = False\n_near = {}\n_location = {}\n\ngeolocator = Nominatim(user_agent=\"twint-1.2\")\n\nclass RecycleObject(object):\n    def write(self, junk): pass\n    def flush(self): pass\n\ndef getLocation(place, **options):\n    location = geolocator.geocode(place,timeout=1000)\n    if location:\n        if options.get(\"near\"):\n            global _near\n            _near = {\"lat\": location.latitude, \"lon\": location.longitude}\n            return True\n        elif options.get(\"location\"):\n            global _location\n            _location = {\"lat\": location.latitude, \"lon\": location.longitude}\n            return True\n        return {\"lat\": location.latitude, \"lon\": location.longitude}\n    else:\n        return {}\n\ndef handleIndexResponse(response):\n    try:\n        if response[\"status\"] == 400:\n            return True\n    except KeyError:\n        pass\n    if response[\"acknowledged\"]:\n        print(\"[+] Index \\\"\" + response[\"index\"] + \"\\\" created!\")\n    else:\n        print(\"[x] error index creation :: storage.elasticsearch.handleIndexCreation\")\n    if response[\"shards_acknowledged\"]:\n        print(\"[+] Shards acknowledged, everything is ready to be used!\")\n        return True\n    else:\n        print(\"[x] error with shards :: storage.elasticsearch.HandleIndexCreation\")\n        return False\n\ndef createIndex(config, instance, **scope):\n    if scope.get(\"scope\") == \"tweet\":\n        tweets_body = {\n                \"mappings\": {\n                    \"properties\": {\n                        \"id\": {\"type\": \"long\"},\n                        \"conversation_id\": {\"type\": 
\"long\"},\n                        \"created_at\": {\"type\": \"text\"},\n                        \"date\": {\"type\": \"date\", \"format\": \"yyyy-MM-dd HH:mm:ss\"},\n                        \"timezone\": {\"type\": \"keyword\"},\n                        \"place\": {\"type\": \"keyword\"},\n                        \"location\": {\"type\": \"keyword\"},\n                        \"tweet\": {\"type\": \"text\"},\n                        \"lang\": {\"type\": \"keyword\"},\n                        \"hashtags\": {\"type\": \"keyword\", \"normalizer\": \"hashtag_normalizer\"},\n                        \"cashtags\": {\"type\": \"keyword\", \"normalizer\": \"hashtag_normalizer\"},\n                        \"user_id_str\": {\"type\": \"keyword\"},\n                        \"username\": {\"type\": \"keyword\", \"normalizer\": \"hashtag_normalizer\"},\n                        \"name\": {\"type\": \"text\"},\n                        \"profile_image_url\": {\"type\": \"text\"},\n                        \"day\": {\"type\": \"integer\"},\n                        \"hour\": {\"type\": \"integer\"},\n                        \"link\": {\"type\": \"text\"},\n                        \"retweet\": {\"type\": \"text\"},\n                        \"essid\": {\"type\": \"keyword\"},\n                        \"nlikes\": {\"type\": \"integer\"},\n                        \"nreplies\": {\"type\": \"integer\"},\n                        \"nretweets\": {\"type\": \"integer\"},\n                        \"quote_url\": {\"type\": \"text\"},\n                        \"video\": {\"type\":\"integer\"},\n                        \"thumbnail\": {\"type\":\"text\"},\n                        \"search\": {\"type\": \"text\"},\n                        \"near\": {\"type\": \"text\"},\n                        \"geo_near\": {\"type\": \"geo_point\"},\n                        \"geo_tweet\": {\"type\": \"geo_point\"},\n                        \"photos\": {\"type\": \"text\"},\n                        
\"user_rt_id\": {\"type\": \"keyword\"},\n                        \"mentions\": {\"type\": \"keyword\", \"normalizer\": \"hashtag_normalizer\"},\n                        \"source\": {\"type\": \"keyword\"},\n                        \"user_rt\": {\"type\": \"keyword\"},\n                        \"retweet_id\": {\"type\": \"keyword\"},\n                        \"reply_to\": {\n                            \"type\": \"nested\",\n                            \"properties\": {\n                                \"user_id\": {\"type\": \"keyword\"},\n                                \"username\": {\"type\": \"keyword\"}\n                            }\n                        },\n                        \"retweet_date\": {\"type\": \"date\", \"format\": \"yyyy-MM-dd HH:mm:ss\", \"ignore_malformed\": True},\n                        \"urls\": {\"type\": \"keyword\"},\n                        \"translate\": {\"type\": \"text\"},\n                        \"trans_src\": {\"type\": \"keyword\"},\n                        \"trans_dest\": {\"type\": \"keyword\"},\n                        }\n                    },\n                    \"settings\": {\n                        \"number_of_shards\": 1,\n                        \"analysis\": {\n                            \"normalizer\": {\n                                \"hashtag_normalizer\": {\n                                    \"type\": \"custom\",\n                                    \"char_filter\": [],\n                                    \"filter\": [\"lowercase\", \"asciifolding\"]\n                                }\n                            }\n                        }\n                    }\n                }\n        with nostdout():\n            resp = instance.indices.create(index=config.Index_tweets, body=tweets_body, ignore=400)\n        return handleIndexResponse(resp)\n    elif scope.get(\"scope\") == \"follow\":\n        follow_body = {\n                \"mappings\": {\n                    \"properties\": {\n        
                \"user\": {\"type\": \"keyword\"},\n                        \"follow\": {\"type\": \"keyword\"},\n                        \"essid\": {\"type\": \"keyword\"}\n                        }\n                    },\n                    \"settings\": {\n                        \"number_of_shards\": 1\n                    }\n                }\n        with nostdout():\n            resp = instance.indices.create(index=config.Index_follow, body=follow_body, ignore=400)\n        return handleIndexResponse(resp)\n    elif scope.get(\"scope\") == \"user\":\n        user_body = {\n                \"mappings\": {\n                    \"properties\": {\n                        \"id\": {\"type\": \"keyword\"},\n                        \"name\": {\"type\": \"keyword\"},\n                        \"username\": {\"type\": \"keyword\"},\n                        \"bio\": {\"type\": \"text\"},\n                        \"location\": {\"type\": \"keyword\"},\n                        \"url\": {\"type\": \"text\"},\n                        \"join_datetime\": {\"type\": \"date\", \"format\": \"yyyy-MM-dd HH:mm:ss\"},\n                        \"tweets\": {\"type\": \"integer\"},\n                        \"following\": {\"type\": \"integer\"},\n                        \"followers\": {\"type\": \"integer\"},\n                        \"likes\": {\"type\": \"integer\"},\n                        \"media\": {\"type\": \"integer\"},\n                        \"private\": {\"type\": \"integer\"},\n                        \"verified\": {\"type\": \"integer\"},\n                        \"avatar\": {\"type\": \"text\"},\n                        \"background_image\": {\"type\": \"text\"},\n                        \"session\": {\"type\": \"keyword\"},\n                        \"geo_user\": {\"type\": \"geo_point\"}\n                        }\n                    },\n                    \"settings\": {\n                        \"number_of_shards\": 1\n                    }\n                }\n  
      with nostdout():\n            resp = instance.indices.create(index=config.Index_users, body=user_body, ignore=400)\n        return handleIndexResponse(resp)\n    else:\n        print(\"[x] error index pre-creation :: storage.elasticsearch.createIndex\")\n        return False\n\n@contextlib.contextmanager\ndef nostdout():\n    savestdout = sys.stdout\n    sys.stdout = RecycleObject()\n    yield\n    sys.stdout = savestdout\n\ndef weekday(day):\n    weekdays = {\n            \"Monday\": 1,\n            \"Tuesday\": 2,\n            \"Wednesday\": 3,\n            \"Thursday\": 4,\n            \"Friday\": 5,\n            \"Saturday\": 6,\n            \"Sunday\": 7,\n            }\n\n    return weekdays[day]\n\ndef Tweet(Tweet, config):\n    global _index_tweet_status\n    global _is_near_def\n    date_obj = datetime.strptime(Tweet.datetime, \"%Y-%m-%d %H:%M:%S %Z\")\n\n    actions = []\n\n    try:\n        retweet = Tweet.retweet\n    except AttributeError:\n        retweet = None\n\n    dt = f\"{Tweet.datestamp} {Tweet.timestamp}\"\n\n    j_data = {\n            \"_index\": config.Index_tweets,\n            \"_id\": str(Tweet.id) + \"_raw_\" + config.Essid,\n            \"_source\": {\n                \"id\": str(Tweet.id),\n                \"conversation_id\": Tweet.conversation_id,\n                \"created_at\": Tweet.datetime,\n                \"date\": dt,\n                \"timezone\": Tweet.timezone,\n                \"place\": Tweet.place,\n                \"tweet\": Tweet.tweet,\n                \"language\": Tweet.lang,\n                \"hashtags\": Tweet.hashtags,\n                \"cashtags\": Tweet.cashtags,\n                \"user_id_str\": Tweet.user_id_str,\n                \"username\": Tweet.username,\n                \"name\": Tweet.name,\n                \"day\": date_obj.weekday(),\n                \"hour\": date_obj.hour,\n                \"link\": Tweet.link,\n                \"retweet\": retweet,\n                \"essid\": 
config.Essid,\n                \"nlikes\": int(Tweet.likes_count),\n                \"nreplies\": int(Tweet.replies_count),\n                \"nretweets\": int(Tweet.retweets_count),\n                \"quote_url\": Tweet.quote_url,\n                \"video\": Tweet.video,\n                \"search\": str(config.Search),\n                \"near\": config.Near\n                }\n            }\n    if retweet is not None:\n        j_data[\"_source\"].update({\"user_rt_id\": Tweet.user_rt_id})\n        j_data[\"_source\"].update({\"user_rt\": Tweet.user_rt})\n        j_data[\"_source\"].update({\"retweet_id\": Tweet.retweet_id})\n        j_data[\"_source\"].update({\"retweet_date\": Tweet.retweet_date})\n    if Tweet.reply_to:\n        j_data[\"_source\"].update({\"reply_to\": Tweet.reply_to})\n    if Tweet.photos:\n        _photos = []\n        for photo in Tweet.photos:\n            _photos.append(photo)\n        j_data[\"_source\"].update({\"photos\": _photos})\n    if Tweet.thumbnail:\n        j_data[\"_source\"].update({\"thumbnail\": Tweet.thumbnail})\n    if Tweet.mentions:\n        _mentions = []\n        for mention in Tweet.mentions:\n            _mentions.append(mention)\n        j_data[\"_source\"].update({\"mentions\": _mentions})\n    if Tweet.urls:\n        _urls = []\n        for url in Tweet.urls:\n            _urls.append(url)\n        j_data[\"_source\"].update({\"urls\": _urls})\n    if config.Near or config.Geo:\n        if not _is_near_def:\n            __geo = \"\"\n            __near = \"\"\n            if config.Geo:\n                __geo = config.Geo\n            if config.Near:\n                __near = config.Near\n            _is_near_def = getLocation(__near + __geo, near=True)\n        if _near:\n            j_data[\"_source\"].update({\"geo_near\": _near})\n    if Tweet.place:\n        _t_place = getLocation(Tweet.place)\n        if _t_place:\n            j_data[\"_source\"].update({\"geo_tweet\": getLocation(Tweet.place)})\n    if 
Tweet.source:\n        j_data[\"_source\"].update({\"source\": Tweet.source})\n    if config.Translate:\n        j_data[\"_source\"].update({\"translate\": Tweet.translate})\n        j_data[\"_source\"].update({\"trans_src\": Tweet.trans_src})\n        j_data[\"_source\"].update({\"trans_dest\": Tweet.trans_dest})\n\n    actions.append(j_data)\n\n    es = Elasticsearch(config.Elasticsearch, verify_certs=config.Skip_certs)\n    if not _index_tweet_status:\n        _index_tweet_status = createIndex(config, es, scope=\"tweet\")\n    with nostdout():\n        helpers.bulk(es, actions, chunk_size=2000, request_timeout=200)\n    actions = []\n\ndef Follow(user, config):\n    global _index_follow_status\n    actions = []\n\n    if config.Following:\n        _user = config.Username\n        _follow = user\n    else:\n        _user = user\n        _follow = config.Username\n    j_data = {\n            \"_index\": config.Index_follow,\n            \"_id\": _user + \"_\" + _follow + \"_\" + config.Essid,\n            \"_source\": {\n                \"user\": _user,\n                \"follow\": _follow,\n                \"essid\": config.Essid\n                }\n            }\n    actions.append(j_data)\n\n    es = Elasticsearch(config.Elasticsearch, verify_certs=config.Skip_certs)\n    if not _index_follow_status:\n        _index_follow_status = createIndex(config, es, scope=\"follow\")\n    with nostdout():\n        helpers.bulk(es, actions, chunk_size=2000, request_timeout=200)\n    actions = []\n\ndef UserProfile(user, config):\n    global _index_user_status\n    global _is_location_def\n    actions = []\n\n    j_data = {\n            \"_index\": config.Index_users,\n            \"_id\": user.id + \"_\" + user.join_date + \"_\" + user.join_time + \"_\" + config.Essid,\n            \"_source\": {\n                \"id\": user.id,\n                \"name\": user.name,\n                \"username\": user.username,\n                \"bio\": user.bio,\n                
\"location\": user.location,\n                \"url\": user.url,\n                \"join_datetime\": user.join_date + \" \" + user.join_time,\n                \"tweets\": user.tweets,\n                \"following\": user.following,\n                \"followers\": user.followers,\n                \"likes\": user.likes,\n                \"media\": user.media_count,\n                \"private\": user.is_private,\n                \"verified\": user.is_verified,\n                \"avatar\": user.avatar,\n                \"background_image\": user.background_image,\n                \"session\": config.Essid\n                }\n            }\n    if config.Location:\n        if not _is_location_def:\n            _is_location_def = getLocation(user.location, location=True)\n        if _location:\n            j_data[\"_source\"].update({\"geo_user\": _location})\n    actions.append(j_data)\n\n    es = Elasticsearch(config.Elasticsearch, verify_certs=config.Skip_certs)\n    if not _index_user_status:\n        _index_user_status = createIndex(config, es, scope=\"user\")\n    with nostdout():\n        helpers.bulk(es, actions, chunk_size=2000, request_timeout=200)\n    actions = []\n"
  },
  {
    "path": "twint/storage/panda.py",
    "content": "import datetime, pandas as pd, warnings\nfrom time import strftime, localtime\nfrom twint.tweet import Tweet_formats\n\nTweets_df = None\nFollow_df = None\nUser_df = None\n\n_object_blocks = {\n    \"tweet\": [],\n    \"user\": [],\n    \"following\": [],\n    \"followers\": []\n}\n\nweekdays = {\n        \"Monday\": 1,\n        \"Tuesday\": 2,\n        \"Wednesday\": 3,\n        \"Thursday\": 4,\n        \"Friday\": 5,\n        \"Saturday\": 6,\n        \"Sunday\": 7,\n        }\n\n_type = \"\"\n\ndef _concat(df, _type):\n    if df is None:\n        df = pd.DataFrame(_object_blocks[_type])\n    else:\n        _df = pd.DataFrame(_object_blocks[_type])\n        df = pd.concat([df, _df], sort=True)\n    return df\n\ndef _autoget(_type):\n    global Tweets_df\n    global Follow_df\n    global User_df\n\n    if _type == \"tweet\":\n        Tweets_df = _concat(Tweets_df, _type)\n    elif _type == \"followers\" or _type == \"following\":\n        Follow_df = _concat(Follow_df, _type)\n    elif _type == \"user\":\n        User_df = _concat(User_df, _type)\n    else:\n        error(\"[x] Wrong type of object passed\")\n\n\ndef update(object, config):\n    global _type\n\n    #try:\n    #    _type = ((object.__class__.__name__ == \"tweet\")*\"tweet\" +\n    #             (object.__class__.__name__ == \"user\")*\"user\")\n    #except AttributeError:\n    #    _type = config.Following*\"following\" + config.Followers*\"followers\"\n    if object.__class__.__name__ == \"tweet\":\n        _type = \"tweet\"\n    elif object.__class__.__name__ == \"user\":\n        _type = \"user\"\n    elif object.__class__.__name__ == \"dict\":\n        _type = config.Following*\"following\" + config.Followers*\"followers\"\n\n    if _type == \"tweet\":\n        Tweet = object\n        datetime_ms = datetime.datetime.strptime(Tweet.datetime, Tweet_formats['datetime']).timestamp() * 1000\n        day = weekdays[strftime(\"%A\", localtime(datetime_ms/1000))]\n        dt = 
f\"{object.datestamp} {object.timestamp}\"\n        _data = {\n            \"id\": str(Tweet.id),\n            \"conversation_id\": Tweet.conversation_id,\n            \"created_at\": datetime_ms,\n            \"date\": dt,\n            \"timezone\": Tweet.timezone,\n            \"place\": Tweet.place,\n            \"tweet\": Tweet.tweet,\n            \"language\": Tweet.lang,\n            \"hashtags\": Tweet.hashtags,\n            \"cashtags\": Tweet.cashtags,\n            \"user_id\": Tweet.user_id,\n            \"user_id_str\": Tweet.user_id_str,\n            \"username\": Tweet.username,\n            \"name\": Tweet.name,\n            \"day\": day,\n            \"hour\": strftime(\"%H\", localtime(datetime_ms/1000)),\n            \"link\": Tweet.link,\n            \"urls\": Tweet.urls,\n            \"photos\": Tweet.photos,\n            \"video\": Tweet.video,\n            \"thumbnail\": Tweet.thumbnail,\n            \"retweet\": Tweet.retweet,\n            \"nlikes\": int(Tweet.likes_count),\n            \"nreplies\": int(Tweet.replies_count),\n            \"nretweets\": int(Tweet.retweets_count),\n            \"quote_url\": Tweet.quote_url,\n            \"search\": str(config.Search),\n            \"near\": Tweet.near,\n            \"geo\": Tweet.geo,\n            \"source\": Tweet.source,\n            \"user_rt_id\": Tweet.user_rt_id,\n            \"user_rt\": Tweet.user_rt,\n            \"retweet_id\": Tweet.retweet_id,\n            \"reply_to\": Tweet.reply_to,\n            \"retweet_date\": Tweet.retweet_date,\n            \"translate\": Tweet.translate,\n            \"trans_src\": Tweet.trans_src,\n            \"trans_dest\": Tweet.trans_dest\n            }\n        _object_blocks[_type].append(_data)\n    elif _type == \"user\":\n        user = object\n        try:\n            background_image = user.background_image\n        except AttributeError:\n            background_image = \"\"\n        _data = {\n            \"id\": user.id,\n            \"name\": 
user.name,\n            \"username\": user.username,\n            \"bio\": user.bio,\n            \"url\": user.url,\n            \"join_datetime\": user.join_date + \" \" + user.join_time,\n            \"join_date\": user.join_date,\n            \"join_time\": user.join_time,\n            \"tweets\": user.tweets,\n            \"location\": user.location,\n            \"following\": user.following,\n            \"followers\": user.followers,\n            \"likes\": user.likes,\n            \"media\": user.media_count,\n            \"private\": user.is_private,\n            \"verified\": user.is_verified,\n            \"avatar\": user.avatar,\n            \"background_image\": background_image,\n            }\n        _object_blocks[_type].append(_data)\n    elif _type == \"followers\" or _type == \"following\":\n        _data = {\n            config.Following*\"following\" + config.Followers*\"followers\" :\n                             {config.Username: object[_type]}\n        }\n        _object_blocks[_type] = _data\n    else:\n        print(\"Wrong type of object passed!\")\n\n\ndef clean():\n    global Tweets_df\n    global Follow_df\n    global User_df\n    _object_blocks[\"tweet\"].clear()\n    _object_blocks[\"following\"].clear()\n    _object_blocks[\"followers\"].clear()\n    _object_blocks[\"user\"].clear()\n    Tweets_df = None\n    Follow_df = None\n    User_df = None\n\ndef save(_filename, _dataframe, **options):\n    if options.get(\"dataname\"):\n        _dataname = options.get(\"dataname\")\n    else:\n        _dataname = \"twint\"\n\n    if not options.get(\"type\"):\n        with warnings.catch_warnings():\n            warnings.simplefilter(\"ignore\")\n            _store = pd.HDFStore(_filename + \".h5\")\n            _store[_dataname] = _dataframe\n            _store.close()\n    elif options.get(\"type\") == \"Pickle\":\n        with warnings.catch_warnings():\n            warnings.simplefilter(\"ignore\")\n            
_dataframe.to_pickle(_filename + \".pkl\")\n    else:\n        print(\"\"\"Please specify: filename, DataFrame, DataFrame name and type\n              (HDF5, default, or Pickle)\"\"\")\n\ndef read(_filename, **options):\n    if not options.get(\"dataname\"):\n        _dataname = \"twint\"\n    else:\n        _dataname = options.get(\"dataname\")\n\n    if not options.get(\"type\"):\n        _store = pd.HDFStore(_filename + \".h5\")\n        _df = _store[_dataname]\n        _store.close()\n        return _df\n    elif options.get(\"type\") == \"Pickle\":\n        _df = pd.read_pickle(_filename + \".pkl\")\n        return _df\n    else:\n        print(\"\"\"Please specify: DataFrame, DataFrame name (twint as default),\n              filename and type (HDF5, default, or Pickle)\"\"\")\n"
  },
  {
    "path": "twint/storage/write.py",
    "content": "from . import write_meta as meta\nimport csv\nimport json\nimport os\n\ndef outputExt(objType, fType):\n    if objType == \"str\":\n        objType = \"username\"\n    outExt = f\"/{objType}s.{fType}\"\n\n    return outExt\n\ndef addExt(base, objType, fType):\n    if len(base.split('.')) == 1:\n        createDirIfMissing(base)\n        base += outputExt(objType, fType)\n\n    return base\n\ndef Text(entry, f):\n    print(entry.replace('\\n', ' '), file=open(f, \"a\", encoding=\"utf-8\"))\n\ndef Type(config):\n    if config.User_full:\n        _type = \"user\"\n    elif config.Followers or config.Following:\n        _type = \"username\"\n    else:\n        _type = \"tweet\"\n\n    return _type\n\ndef struct(obj, custom, _type):\n    if custom:\n        fieldnames = custom\n        row = {}\n        for f in fieldnames:\n            row[f] = meta.Data(obj, _type)[f]\n    else:\n        fieldnames = meta.Fieldnames(_type)\n        row = meta.Data(obj, _type)\n\n    return fieldnames, row\n\ndef createDirIfMissing(dirname):\n    if not os.path.exists(dirname):\n        os.makedirs(dirname)\n\ndef Csv(obj, config):\n    _obj_type = obj.__class__.__name__\n    if _obj_type == \"str\":\n        _obj_type = \"username\"\n    fieldnames, row = struct(obj, config.Custom[_obj_type], _obj_type)\n    \n    base = addExt(config.Output, _obj_type, \"csv\")\n    dialect = 'excel-tab' if 'Tabs' in config.__dict__ else 'excel'\n    \n    if not (os.path.exists(base)):\n        with open(base, \"w\", newline='', encoding=\"utf-8\") as csv_file:\n            writer = csv.DictWriter(csv_file, fieldnames=fieldnames, dialect=dialect)\n            writer.writeheader()\n\n    with open(base, \"a\", newline='', encoding=\"utf-8\") as csv_file:\n        writer = csv.DictWriter(csv_file, fieldnames=fieldnames, dialect=dialect)\n        writer.writerow(row)\n\ndef Json(obj, config):\n    _obj_type = obj.__class__.__name__\n    if _obj_type == \"str\":\n        _obj_type = 
\"username\"\n    null, data = struct(obj, config.Custom[_obj_type], _obj_type)\n\n    base = addExt(config.Output, _obj_type, \"json\")\n\n    with open(base, \"a\", newline='', encoding=\"utf-8\") as json_file:\n        json.dump(data, json_file, ensure_ascii=False)\n        json_file.write(\"\\n\")\n"
  },
  {
    "path": "twint/storage/write_meta.py",
    "content": "def tweetData(t):\n    data = {\n            \"id\": int(t.id),\n            \"conversation_id\": t.conversation_id,\n            \"created_at\": t.datetime,\n            \"date\": t.datestamp,\n            \"time\": t.timestamp,\n            \"timezone\": t.timezone,\n            \"user_id\": t.user_id,\n            \"username\": t.username,\n            \"name\": t.name,\n            \"place\": t.place,\n            \"tweet\": t.tweet,\n            \"language\": t.lang,\n            \"mentions\": t.mentions,\n            \"urls\": t.urls,\n            \"photos\": t.photos,\n            \"replies_count\": int(t.replies_count),\n            \"retweets_count\": int(t.retweets_count),\n            \"likes_count\": int(t.likes_count),\n            \"hashtags\": t.hashtags,\n            \"cashtags\": t.cashtags,\n            \"link\": t.link,\n            \"retweet\": t.retweet,\n            \"quote_url\": t.quote_url,\n            \"video\": t.video,\n            \"thumbnail\": t.thumbnail,\n            \"near\": t.near,\n            \"geo\": t.geo,\n            \"source\": t.source,\n            \"user_rt_id\": t.user_rt_id,\n            \"user_rt\": t.user_rt,\n            \"retweet_id\": t.retweet_id,\n            \"reply_to\": t.reply_to,\n            \"retweet_date\": t.retweet_date,\n            \"translate\": t.translate,\n            \"trans_src\": t.trans_src,\n            \"trans_dest\": t.trans_dest,\n            }\n    return data\n\ndef tweetFieldnames():\n    fieldnames = [\n            \"id\",\n            \"conversation_id\",\n            \"created_at\",\n            \"date\",\n            \"time\",\n            \"timezone\",\n            \"user_id\",\n            \"username\",\n            \"name\",\n            \"place\",\n            \"tweet\",\n            \"language\",\n            \"mentions\",\n            \"urls\",\n            \"photos\",\n            \"replies_count\",\n            \"retweets_count\",\n            
\"likes_count\",\n            \"hashtags\",\n            \"cashtags\",\n            \"link\",\n            \"retweet\",\n            \"quote_url\",\n            \"video\",\n            \"thumbnail\",\n            \"near\",\n            \"geo\",\n            \"source\",\n            \"user_rt_id\",\n            \"user_rt\",\n            \"retweet_id\",\n            \"reply_to\",\n            \"retweet_date\",\n            \"translate\",\n            \"trans_src\",\n            \"trans_dest\"\n            ]\n    return fieldnames\n\ndef userData(u):\n    data = {\n            \"id\": int(u.id),\n            \"name\": u.name,\n            \"username\": u.username,\n            \"bio\": u.bio,\n            \"location\": u.location,\n            \"url\": u.url,\n            \"join_date\": u.join_date,\n            \"join_time\": u.join_time,\n            \"tweets\": int(u.tweets),\n            \"following\": int(u.following),\n            \"followers\": int(u.followers),\n            \"likes\": int(u.likes),\n            \"media\": int(u.media_count),\n            \"private\": u.is_private,\n            \"verified\": u.is_verified,\n            \"profile_image_url\": u.avatar,\n            \"background_image\": u.background_image\n            }\n    return data\n\ndef userFieldnames():\n    fieldnames = [\n            \"id\",\n            \"name\",\n            \"username\",\n            \"bio\",\n            \"location\",\n            \"url\",\n            \"join_date\",\n            \"join_time\",\n            \"tweets\",\n            \"following\",\n            \"followers\",\n            \"likes\",\n            \"media\",\n            \"private\",\n            \"verified\",\n            \"profile_image_url\",\n            \"background_image\"\n            ]\n    return fieldnames\n\ndef usernameData(u):\n    return {\"username\": u}\n\ndef usernameFieldnames():\n    return [\"username\"]\n\ndef Data(obj, _type):\n    if _type == \"user\":\n        ret = 
userData(obj)\n    elif _type == \"username\":\n        ret = usernameData(obj)\n    else:\n        ret = tweetData(obj)\n\n    return ret\n\ndef Fieldnames(_type):\n    if _type == \"user\":\n        ret = userFieldnames()\n    elif _type == \"username\":\n        ret = usernameFieldnames()\n    else:\n        ret = tweetFieldnames()\n\n    return ret\n"
  },
  {
    "path": "twint/token.py",
    "content": "import re\nimport time\n\nimport requests\nimport logging as logme\n\n\nclass TokenExpiryException(Exception):\n    def __init__(self, msg):\n        super().__init__(msg)\n\n        \nclass RefreshTokenException(Exception):\n    def __init__(self, msg):\n        super().__init__(msg)\n        \n\nclass Token:\n    def __init__(self, config):\n        self._session = requests.Session()\n        self._session.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'})\n        self.config = config\n        self._retries = 5\n        self._timeout = 10\n        self.url = 'https://twitter.com'\n\n    def _request(self):\n        for attempt in range(self._retries + 1):\n            # The request is newly prepared on each retry because of potential cookie updates.\n            req = self._session.prepare_request(requests.Request('GET', self.url))\n            logme.debug(f'Retrieving {req.url}')\n            try:\n                r = self._session.send(req, allow_redirects=True, timeout=self._timeout)\n            except requests.exceptions.RequestException as exc:\n                if attempt < self._retries:\n                    retrying = ', retrying'\n                    level = logme.WARNING\n                else:\n                    retrying = ''\n                    level = logme.ERROR\n                logme.log(level, f'Error retrieving {req.url}: {exc!r}{retrying}')\n            else:\n                success, msg = (True, None)\n                msg = f': {msg}' if msg else ''\n\n                if success:\n                    logme.debug(f'{req.url} retrieved successfully{msg}')\n                    return r\n            if attempt < self._retries:\n                # TODO : might wanna tweak this back-off timer\n                sleep_time = 2.0 * 2 ** attempt\n                logme.info(f'Waiting {sleep_time:.0f} seconds')\n                time.sleep(sleep_time)\n        else:\n            
msg = f'{self._retries + 1} requests to {self.url} failed, giving up.'\n            logme.fatal(msg)\n            self.config.Guest_token = None\n            raise RefreshTokenException(msg)\n\n    def refresh(self):\n        logme.debug('Retrieving guest token')\n        res = self._request()\n        match = re.search(r'\\(\"gt=(\\d+);', res.text)\n        if match:\n            logme.debug('Found guest token in HTML')\n            self.config.Guest_token = str(match.group(1))\n        else:\n            self.config.Guest_token = None\n            raise RefreshTokenException('Could not find the Guest token in HTML')\n"
  },
  {
    "path": "twint/tweet.py",
    "content": "from time import strftime, localtime\nfrom datetime import datetime, timezone\n\nimport logging as logme\nfrom googletransx import Translator\n# ref. \n# - https://github.com/x0rzkov/py-googletrans#basic-usage\ntranslator = Translator()\n\n\nclass tweet:\n    \"\"\"Define Tweet class\n    \"\"\"\n    type = \"tweet\"\n\n    def __init__(self):\n        pass\n\n\ndef utc_to_local(utc_dt):\n    return utc_dt.replace(tzinfo=timezone.utc).astimezone(tz=None)\n\n\nTweet_formats = {\n    'datetime': '%Y-%m-%d %H:%M:%S %Z',\n    'datestamp': '%Y-%m-%d',\n    'timestamp': '%H:%M:%S'\n}\n\n\ndef _get_mentions(tw):\n    \"\"\"Extract mentions from tweet\n    \"\"\"\n    logme.debug(__name__ + ':get_mentions')\n    try:\n        mentions = [\n            {\n                'screen_name': _mention['screen_name'],\n                'name': _mention['name'],\n                'id': _mention['id_str'],\n            } for _mention in tw['entities']['user_mentions']\n            if tw['display_text_range'][0] < _mention['indices'][0]\n        ]\n    except KeyError:\n        mentions = []\n    return mentions\n\n\ndef _get_reply_to(tw):\n    try:\n        reply_to = [\n            {\n                'screen_name': _mention['screen_name'],\n                'name': _mention['name'],\n                'id': _mention['id_str'],\n            } for _mention in tw['entities']['user_mentions']\n            if tw['display_text_range'][0] > _mention['indices'][1]\n        ]\n    except KeyError:\n        reply_to = []\n    return reply_to\n\n\ndef getText(tw):\n    \"\"\"Replace some text\n    \"\"\"\n    logme.debug(__name__ + ':getText')\n    text = tw['full_text']\n    text = text.replace(\"http\", \" http\")\n    text = text.replace(\"pic.twitter\", \" pic.twitter\")\n    text = text.replace(\"\\n\", \" \")\n\n    return text\n\n\ndef Tweet(tw, config):\n    \"\"\"Create Tweet object\n    \"\"\"\n    logme.debug(__name__ + ':Tweet')\n    t = tweet()\n    t.id = 
int(tw['id_str'])\n    t.id_str = tw[\"id_str\"]\n    t.conversation_id = tw[\"conversation_id_str\"]\n\n    # parsing date to user-friendly format\n    _dt = tw['created_at']\n    _dt = datetime.strptime(_dt, '%a %b %d %H:%M:%S %z %Y')\n    _dt = utc_to_local(_dt)\n    t.datetime = str(_dt.strftime(Tweet_formats['datetime']))\n    # date is of the format year,\n    t.datestamp = _dt.strftime(Tweet_formats['datestamp'])\n    t.timestamp = _dt.strftime(Tweet_formats['timestamp'])\n    t.user_id = int(tw[\"user_id_str\"])\n    t.user_id_str = tw[\"user_id_str\"]\n    t.username = tw[\"user_data\"]['screen_name']\n    t.name = tw[\"user_data\"]['name']\n    t.place = tw['geo'] if 'geo' in tw and tw['geo'] else \"\"\n    t.timezone = strftime(\"%z\", localtime())\n    t.mentions = _get_mentions(tw)\n    t.reply_to = _get_reply_to(tw)\n    try:\n        t.urls = [_url['expanded_url'] for _url in tw['entities']['urls']]\n    except KeyError:\n        t.urls = []\n    try:\n        t.photos = [_img['media_url_https'] for _img in tw['entities']['media'] if _img['type'] == 'photo' and\n                    _img['expanded_url'].find('/photo/') != -1]\n    except KeyError:\n        t.photos = []\n    try:\n        t.video = 1 if len(tw['extended_entities']['media']) else 0\n    except KeyError:\n        t.video = 0\n    try:\n        t.thumbnail = tw['extended_entities']['media'][0]['media_url_https']\n    except KeyError:\n        t.thumbnail = ''\n    t.tweet = getText(tw)\n    t.lang = tw['lang']\n    try:\n        t.hashtags = [hashtag['text'] for hashtag in tw['entities']['hashtags']]\n    except KeyError:\n        t.hashtags = []\n    try:\n        t.cashtags = [cashtag['text'] for cashtag in tw['entities']['symbols']]\n    except KeyError:\n        t.cashtags = []\n    t.replies_count = tw['reply_count']\n    t.retweets_count = tw['retweet_count']\n    t.likes_count = tw['favorite_count']\n    t.link = f\"https://twitter.com/{t.username}/status/{t.id}\"\n    try:\n      
  if 'user_rt_id' in tw['retweet_data']:\n            t.retweet = True\n            t.retweet_id = tw['retweet_data']['retweet_id']\n            t.retweet_date = tw['retweet_data']['retweet_date']\n            t.user_rt = tw['retweet_data']['user_rt']\n            t.user_rt_id = tw['retweet_data']['user_rt_id']\n    except KeyError:\n        t.retweet = False\n        t.retweet_id = ''\n        t.retweet_date = ''\n        t.user_rt = ''\n        t.user_rt_id = ''\n    try:\n        t.quote_url = tw['quoted_status_permalink']['expanded'] if tw['is_quote_status'] else ''\n    except KeyError:\n        # a missing permalink means the quoted tweet has been deleted\n        t.quote_url = 0\n    t.near = config.Near if config.Near else \"\"\n    t.geo = config.Geo if config.Geo else \"\"\n    t.source = config.Source if config.Source else \"\"\n    t.translate = ''\n    t.trans_src = ''\n    t.trans_dest = ''\n    if config.Translate:\n        try:\n            ts = translator.translate(text=t.tweet, dest=config.TranslateDest)\n            t.translate = ts.text\n            t.trans_src = ts.src\n            t.trans_dest = ts.dest\n        # ref. https://github.com/SuniTheFish/ChainTranslator/blob/master/ChainTranslator/__main__.py#L31\n        except ValueError as e:\n            logme.debug(__name__ + ':Tweet:translator.translate:' + str(e))\n            raise Exception(\"Invalid destination language: {} / Tweet: {}\".format(config.TranslateDest, t.tweet))\n    return t\n"
  },
  {
    "path": "twint/url.py",
    "content": "import datetime\nfrom sys import platform\nimport logging as logme\nfrom urllib.parse import urlencode\nfrom urllib.parse import quote\n\nmobile = \"https://mobile.twitter.com\"\nbase = \"https://api.twitter.com/2/search/adaptive.json\"\n\n\ndef _sanitizeQuery(_url, params):\n    _serialQuery = \"\"\n    _serialQuery = urlencode(params, quote_via=quote)\n    _serialQuery = _url + \"?\" + _serialQuery\n    return _serialQuery\n\n\ndef _formatDate(date):\n    if \"win\" in platform:\n        return f'\\\"{date.split()[0]}\\\"'\n    try:\n        return int(datetime.datetime.strptime(date, \"%Y-%m-%d %H:%M:%S\").timestamp())\n    except ValueError:\n        return int(datetime.datetime.strptime(date, \"%Y-%m-%d\").timestamp())\n\n\nasync def Favorites(username, init):\n    logme.debug(__name__ + ':Favorites')\n    url = f\"{mobile}/{username}/favorites?lang=en\"\n\n    if init != '-1':\n        url += f\"&max_id={init}\"\n\n    return url\n\n\nasync def Followers(username, init):\n    logme.debug(__name__ + ':Followers')\n    url = f\"{mobile}/{username}/followers?lang=en\"\n\n    if init != '-1':\n        url += f\"&cursor={init}\"\n\n    return url\n\n\nasync def Following(username, init):\n    logme.debug(__name__ + ':Following')\n    url = f\"{mobile}/{username}/following?lang=en\"\n\n    if init != '-1':\n        url += f\"&cursor={init}\"\n\n    return url\n\n\nasync def MobileProfile(username, init):\n    logme.debug(__name__ + ':MobileProfile')\n    url = f\"{mobile}/{username}?lang=en\"\n\n    if init != '-1':\n        url += f\"&max_id={init}\"\n\n    return url\n\n\nasync def Search(config, init):\n    logme.debug(__name__ + ':Search')\n    url = base\n    tweet_count = 100\n    q = \"\"\n    params = [\n        # ('include_blocking', '1'),\n        # ('include_blocked_by', '1'),\n        # ('include_followed_by', '1'),\n        # ('include_want_retweets', '1'),\n        # ('include_mute_edge', '1'),\n        # ('include_can_dm', '1'),\n     
   ('include_can_media_tag', '1'),\n        # ('skip_status', '1'),\n        # ('include_cards', '1'),\n        ('include_ext_alt_text', 'true'),\n        ('include_quote_count', 'true'),\n        ('include_reply_count', '1'),\n        ('tweet_mode', 'extended'),\n        ('include_entities', 'true'),\n        ('include_user_entities', 'true'),\n        ('include_ext_media_availability', 'true'),\n        ('send_error_codes', 'true'),\n        ('simple_quoted_tweet', 'true'),\n        ('count', tweet_count),\n        # ('query_source', 'typed_query'),\n        # ('pc', '1'),\n        ('cursor', str(init)),\n        ('spelling_corrections', '1'),\n        ('ext', 'mediaStats%2ChighlightedLabel'),\n        ('tweet_search_mode', 'live'),  # this can be handled better, maybe take an argument and set it then\n    ]\n    if not config.Popular_tweets:\n        params.append(('f', 'tweets'))\n    if config.Lang:\n        params.append((\"l\", config.Lang))\n        params.append((\"lang\", \"en\"))\n    if config.Query:\n        q += f\" from:{config.Query}\"\n    if config.Username:\n        q += f\" from:{config.Username}\"\n    if config.Geo:\n        config.Geo = config.Geo.replace(\" \", \"\")\n        q += f\" geocode:{config.Geo}\"\n    if config.Search:\n\n        q += f\" {config.Search}\"\n    if config.Year:\n        q += f\" until:{config.Year}-1-1\"\n    if config.Since:\n        q += f\" since:{_formatDate(config.Since)}\"\n    if config.Until:\n        q += f\" until:{_formatDate(config.Until)}\"\n    if config.Email:\n        q += ' \"mail\" OR \"email\" OR'\n        q += ' \"gmail\" OR \"e-mail\"'\n    if config.Phone:\n        q += ' \"phone\" OR \"call me\" OR \"text me\"'\n    if config.Verified:\n        q += \" filter:verified\"\n    if config.To:\n        q += f\" to:{config.To}\"\n    if config.All:\n        q += f\" to:{config.All} OR from:{config.All} OR @{config.All}\"\n    if config.Near:\n        q += f' near:\"{config.Near}\"'\n    if 
config.Images:\n        q += \" filter:images\"\n    if config.Videos:\n        q += \" filter:videos\"\n    if config.Media:\n        q += \" filter:media\"\n    if config.Replies:\n        q += \" filter:replies\"\n    # this filter can still be used, but it appeared broken in preliminary testing and needs more testing\n    if config.Native_retweets:\n        q += \" filter:nativeretweets\"\n    if config.Min_likes:\n        q += f\" min_faves:{config.Min_likes}\"\n    if config.Min_retweets:\n        q += f\" min_retweets:{config.Min_retweets}\"\n    if config.Min_replies:\n        q += f\" min_replies:{config.Min_replies}\"\n    if config.Links == \"include\":\n        q += \" filter:links\"\n    elif config.Links == \"exclude\":\n        q += \" exclude:links\"\n    if config.Source:\n        q += f\" source:\\\"{config.Source}\\\"\"\n    if config.Members_list:\n        q += f\" list:{config.Members_list}\"\n    if config.Filter_retweets:\n        q += \" exclude:nativeretweets exclude:retweets\"\n    if config.Custom_query:\n        q = config.Custom_query\n\n    q = q.strip()\n    params.append((\"q\", q))\n    _serialQuery = _sanitizeQuery(url, params)\n    return url, params, _serialQuery\n\n\ndef SearchProfile(config, init=None):\n    logme.debug(__name__ + ':SearchProfile')\n    _url = 'https://api.twitter.com/2/timeline/profile/{user_id}.json'.format(user_id=config.User_id)\n    tweet_count = 100\n    params = [\n        # some of the fields are not required, need to test which ones aren't required\n        ('include_profile_interstitial_type', '1'),\n        ('include_blocking', '1'),\n        ('include_blocked_by', '1'),\n        ('include_followed_by', '1'),\n        ('include_want_retweets', '1'),\n        ('include_mute_edge', '1'),\n        ('include_can_dm', '1'),\n        ('include_can_media_tag', '1'),\n        ('skip_status', '1'),\n        ('cards_platform', 'Web - 12'),\n        ('include_cards', '1'),\n        
('include_ext_alt_text', 'true'),\n        ('include_quote_count', 'true'),\n        ('include_reply_count', '1'),\n        ('tweet_mode', 'extended'),\n        ('include_entities', 'true'),\n        ('include_user_entities', 'true'),\n        ('include_ext_media_color', 'true'),\n        ('include_ext_media_availability', 'true'),\n        ('send_error_codes', 'true'),\n        ('simple_quoted_tweet', 'true'),\n        ('include_tweet_replies', 'true'),\n        ('count', tweet_count),\n        ('ext', 'mediaStats%2ChighlightedLabel'),\n    ]\n\n    if isinstance(init, str):\n        params.append(('cursor', str(init)))\n    _serialQuery = _sanitizeQuery(_url, params)\n    return _url, params, _serialQuery\n"
  },
  {
    "path": "twint/user.py",
    "content": "import datetime\nimport logging as logme\n\n\nclass user:\n    type = \"user\"\n\n    def __init__(self):\n        pass\n\n\nUser_formats = {\n    'join_date': '%Y-%m-%d',\n    'join_time': '%H:%M:%S %Z'\n}\n\n\n# ur object must be a json from the endpoint https://api.twitter.com/graphql\ndef User(ur):\n    logme.debug(__name__ + ':User')\n    if 'data' not in ur and 'user' not in ur['data']:\n        msg = 'malformed json! cannot be parsed to get user data'\n        logme.fatal(msg)\n        raise KeyError(msg)\n    _usr = user()\n    _usr.id = ur['data']['user']['rest_id']\n    _usr.name = ur['data']['user']['legacy']['name']\n    _usr.username = ur['data']['user']['legacy']['screen_name']\n    _usr.bio = ur['data']['user']['legacy']['description']\n    _usr.location = ur['data']['user']['legacy']['location']\n    _usr.url = ur['data']['user']['legacy']['url']\n    # parsing date to user-friendly format\n    _dt = ur['data']['user']['legacy']['created_at']\n    _dt = datetime.datetime.strptime(_dt, '%a %b %d %H:%M:%S %z %Y')\n    # date is of the format year,\n    _usr.join_date = _dt.strftime(User_formats['join_date'])\n    _usr.join_time = _dt.strftime(User_formats['join_time'])\n\n    # :type `int`\n    _usr.tweets = int(ur['data']['user']['legacy']['statuses_count'])\n    _usr.following = int(ur['data']['user']['legacy']['friends_count'])\n    _usr.followers = int(ur['data']['user']['legacy']['followers_count'])\n    _usr.likes = int(ur['data']['user']['legacy']['favourites_count'])\n    _usr.media_count = int(ur['data']['user']['legacy']['media_count'])\n\n    _usr.is_private = ur['data']['user']['legacy']['protected']\n    _usr.is_verified = ur['data']['user']['legacy']['verified']\n    _usr.avatar = ur['data']['user']['legacy']['profile_image_url_https']\n    _usr.background_image = ur['data']['user']['legacy']['profile_banner_url']\n    # TODO : future implementation\n    # legacy_extended_profile is also available in some cases which can 
be used to get DOB of user\n    return _usr\n"
  },
  {
    "path": "twint/verbose.py",
    "content": "def Count(count, config):\n    msg = \"[+] Finished: Successfully collected \"\n    if config.Followers:\n        msg += f\"all {count} users who follow @{config.Username}\"\n    elif config.Following:\n        msg += f\"all {count} users who @{config.Username} follows\"\n    elif config.Favorites:\n        msg += f\"{count} Tweets that @{config.Username} liked\"\n    else:\n        msg += f\"{count} Tweets\"\n        if config.Username:\n            msg += f\" from @{config.Username}\"\n    msg += \".\"\n    print(msg)\n\ndef Elastic(elasticsearch):\n    if elasticsearch:\n        print(\"[+] Indexing to Elasticsearch @ \" + str(elasticsearch))\n"
  }
]