[
  {
    "path": ".gitattributes",
    "content": "# Auto detect text files and perform LF normalization\n* text=auto\n"
  },
  {
    "path": ".github/FUNDING.yml",
    "content": "# These are supported funding model platforms\n\ngithub: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]\npatreon: # Replace with a single Patreon username\nopen_collective: # Replace with a single Open Collective username\nko_fi: # Replace with a single Ko-fi username\ntidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel\ncommunity_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry\nliberapay: # Replace with a single Liberapay username\nissuehunt: # Replace with a single IssueHunt username\notechie: # Replace with a single Otechie username\ncustom: ['https://www.cyfylabs.com']              \n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 harismuneer, hussamh10\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# 📱 Android Apps Scraper/Downloader\n<a href=\"https://github.com/harismuneer\"><img alt=\"views\" title=\"Github views\" src=\"https://komarev.com/ghpvc/?username=harismuneer&style=flat-square\" width=\"125\"/></a>\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2537131.svg)](https://doi.org/10.5281/zenodo.2537131)\n\n[![Open Source Love svg1](https://badges.frapsoft.com/os/v1/open-source.svg?v=103)](#)\n[![GitHub Forks](https://img.shields.io/github/forks/harismuneer/Android-Apps-Downloader.svg?style=social&label=Fork&maxAge=2592000)](https://www.github.com/harismuneer/Android-Apps-Downloader/fork)\n[![GitHub Issues](https://img.shields.io/github/issues/harismuneer/Android-Apps-Downloader.svg?style=flat&label=Issues&maxAge=2592000)](https://www.github.com/harismuneer/Android-Apps-Downloader/issues)\n[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat&label=Contributions&colorA=red&colorB=black\t)](#)\n\n\n\nWe did a research project on comparison of Official Play Store Apps and their 3rd Party App Stores counterparts to analyze what modifications are done to the 3rd Party versions of an app. For this purpose, we wrote this script to download pairs of an app from Google Play Store and Xiaomi App Store (a famous 3rd Party Chinese App Store). It downloads an app from Xiaomi and Google Play store only when that app is available on both stores. This way it creates a dataset of pairs of an app.\n\nWe are open-sourcing this tool so that it can be utilized by the research community for research in Android Security. \n\nMoroever, to compare two Android Apps we wrote another tool named [AndroCompare](https://github.com/harismuneer/AndroCompare). 
We have open-sourced it as well.\n\nFor details on **citing/referencing** this tool in your research, see the 'Citation' section below.\n\n\n## Approach\nThe download URL of an app on the Xiaomi App Store looks like http://app.mi.com/download/23 \nThe number at the end of the URL can be incremented to download as many apps as you want, so in theory you can download every app on the Xiaomi App Store. The tool therefore has a variable named 'target': if target = 1000, the tool scans the URLs for IDs up to 1000. You can change the target to any number you want.\n\n\n## Features\n* download all apps from the well-known Xiaomi App Store\n* download pairs of an app from the Play Store and the Xiaomi App Store\n* a record of all downloaded apps is maintained in a SQLite database\n* if the script is interrupted with CTRL + C, the current progress is saved so that the next run resumes downloading from where it left off\n* if the internet connection drops while the script is running, all current progress is saved and the script waits until the connection is restored, then resumes from where it left off\n\n## How to Run Code\nThe code is ready to run. It can be run on both Windows and Ubuntu Linux. \nIt's written in Python 3. It uses [gplaycli](https://github.com/matlink/gplaycli), so install that using pip. The Xiaomi download step calls `wget`, so it must be available on your PATH.\n\nYou can use [DB Browser for SQLite](http://sqlitebrowser.org/) to view the database.\n\n----------------------------------------------------------------------------------------------------------------------------------------\n## Note\nThis script can easily be modified to meet your specific needs, e.g., it currently first checks whether an app is present on both stores and downloads it from each store only if so. 
You can remove this constraint to download every possible app from the Xiaomi App Store.\n\nThis code is for research purposes only.\n\n----------------------------------------------------------------------------------------------------------------------------------------\n\n## Citation\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2537131.svg)](https://doi.org/10.5281/zenodo.2537131)\n\nIf you use this tool for your research, then kindly cite it. Click the above badge for the complete citation for this tool in different citation formats such as IEEE and APA.\n\n---------------------------------------------------------------------------------------------------------------------------------------\n\n<hr>\n\n## Authors\nYou can get in touch with us on our LinkedIn profiles:\n\n#### Haris Muneer\n\n[![LinkedIn Link](https://img.shields.io/badge/Connect-harismuneer-blue.svg?logo=linkedin&longCache=true&style=social&label=Follow)](https://www.linkedin.com/in/harismuneer) \n\nTo stay updated about my latest projects: [![GitHub Follow](https://img.shields.io/badge/Connect-harismuneer-blue.svg?logo=Github&longCache=true&style=social&label=Follow)](https://github.com/harismuneer)\n\n#### Hussam Habib\n\n[![LinkedIn Link](https://img.shields.io/badge/Connect-hussam--habib-blue.svg?logo=linkedin&longCache=true&style=social&label=Connect)](https://www.linkedin.com/in/hussam-habib-0bb098104/)\n\nTo stay updated about my latest projects: [![GitHub Follow](https://img.shields.io/badge/Connect-hussam--habib-blue.svg?logo=Github&longCache=true&style=social&label=Follow)](https://github.com/hussamh10)\n\n\n---\nIf you liked the repo, kindly support it by giving it a star ⭐ and sharing it in your circles so more people can benefit from the effort.\n\n## Contributions Welcome\n[![GitHub 
Issues](https://img.shields.io/github/issues/harismuneer/Android-Apps-Downloader.svg?style=flat&label=Issues&maxAge=2592000)](https://www.github.com/harismuneer/Android-Apps-Downloader/issues)\n\nIf you find any bugs, have suggestions, or face issues:\n\n- Open an Issue in the Issues Tab to discuss them.\n- Submit a Pull Request to propose fixes or improvements.\n- Review Pull Requests from other contributors to help maintain the project's quality and progress.\n\nThis project thrives on community collaboration! Members are encouraged to take the initiative, support one another, and actively engage in all aspects of the project. Whether it’s debugging, fixing issues, or brainstorming new ideas, your contributions are what keep this project moving forward.\n\nWith modern AI tools like ChatGPT, solving challenges and contributing effectively is easier than ever. Let’s work together to make this project the best it can be! 🚀\n\n## License\n[![MIT](https://img.shields.io/cocoapods/l/AFNetworking.svg?style=style&label=License&maxAge=2592000)](../master/LICENSE)\n\nCopyright (c) 2018-present, harismuneer, hussamh10                                                        \n\n<!-- PROFILE_INTRO_START -->\n\n<hr>\n\n<h1> <a href=\"#\"><img src=\"https://media.giphy.com/media/hvRJCLFzcasrR4ia7z/giphy.gif\" alt=\"Waving hand\" width=\"28\"></a>\nHey there, I'm <a href=\"https://www.linkedin.com/in/harismuneer/\">Haris Muneer</a> 👨🏻‍💻\n</h1>\n\n\n<a href=\"https://github.com/harismuneer\"><img src=\"https://img.shields.io/github/stars/harismuneer\" alt=\"Total Github Stars\"></a>\n<a href=\"https://github.com/harismuneer?tab=followers\"><img src=\"https://img.shields.io/github/followers/harismuneer\" alt=\"Total Github Followers\"></a>\n\n<hr>\n\n- <b>🛠️ Product Builder:</b> Agile Product Manager with 5+ years of hands-on experience delivering SaaS solutions across sales, recruiting, AI, social media, and public sector domains. 
Background in Computer Science, with a proven track record of scaling products from inception to $XXM+ ARR, launching 3 top-ranking tools on Product Hunt, and developing solutions adopted by 250+ B2B clients in 40+ countries.  \n \n- <b>🌟 Open Source Advocate:</b> Passionate about making technology accessible, I’ve developed and open-sourced several software projects for web, mobile, desktop, and AI on my <a href=\"https://github.com/harismuneer\">GitHub profile</a>. These projects have been used by thousands of learners worldwide to enhance their skills and knowledge.\n\n- <b>📫 How to Reach Me:</b> To learn more about my skills and work, visit my <a href=\"https://www.linkedin.com/in/harismuneer\">LinkedIn profile</a>. For collaboration or inquiries, feel free to reach out via <a href=\"mailto:haris.muneer5@gmail.com\">email</a>.\n\n<hr>\n\n<h2 align=\"left\">🤝 Follow my journey</h2>\n<p align=\"left\">\n  <a href=\"https://www.linkedin.com/in/harismuneer\"><img title=\"Follow Haris Muneer on LinkedIn\" src=\"https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white\"/></a>\n  <a href=\"https://github.com/harismuneer\"><img title=\"Follow Haris Muneer on GitHub\" src=\"https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white\"/></a>\n  <a href=\"https://www.youtube.com/@haris_muneer?sub_confirmation=1\"><img title=\"Subscribe on YouTube\" src=\"https://img.shields.io/badge/YouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white\"/></a> \n  <a href=\"mailto:haris.muneer5@gmail.com\"><img title=\"Email\" src=\"https://img.shields.io/badge/Gmail-D14836?style=for-the-badge&logo=gmail&logoColor=white\"/></a>\n</p>\n\n\n\n<!-- PROFILE_INTRO_END -->\n\n\n\n\n\n"
  },
  {
    "path": "code/gplaycli.conf",
    "content": "[Credentials]\ngmail_address=\ngmail_password=\n#keyring_service=gplaycli\ntoken=True\ntoken_url=https://matlink.fr/token/email/gsfid\n\n[Cache]\ntoken=~/.cache/gplaycli/token\n\n[Locale]\nlocale=en_GB\ntimezone=CEST"
  },
  {
    "path": "code/ids_done.txt",
    "content": "1"
  },
  {
    "path": "code/scraper.py",
    "content": "import os\nimport shutil\nimport sqlite3\nimport ssl\nimport time\nimport platform\nfrom urllib.request import urlopen\n\ntry:\n    import httplib\nexcept:\n    import http.client as httplib\n\n\n# check whether the internet is working or not\ndef have_internet():\n    conn = httplib.HTTPConnection(\"www.google.com\", timeout=5)\n    try:\n        conn.request(\"HEAD\", \"/\")\n        conn.close()\n        return True\n    except:\n        conn.close()\n        return False\n\n\n# --------------------------------------------------------------------\n\ndef exit_gracefully():\n    f.close()\n    conn.commit()\n    print(\"Exiting....\")\n\n    exit()\n\n\n# --------------------------------------------------------------------\n\n\n# Ignore SSL certificate errors\nctx = ssl.create_default_context()\nctx.check_hostname = False\nctx.verify_mode = ssl.CERT_NONE\n\nif __name__ == '__main__':\n\n    target = 1000000\n\n    if platform.system()==\"Linux\":\n        download_dir = os.getcwd() + \"/tmp/\"\n    else:\n        download_dir = os.getcwd() + \"\\\\tmp\\\\\"\n\n    database = \"g_x_apps.sqlite\"\n\n    conn = sqlite3.connect(database)\n    cur = conn.cursor()\n\n    base_site = \"http://app.mi.com/download/\"\n\n    # load the last id number from which to continue downloading\n    f = open(\"ids_done.txt\", \"r\")\n    numbers = f.readlines()\n    numbers = [a.rstrip() for a in numbers]\n    curr_num = int(numbers[-1])\n    f.close()\n\n    f = open(\"ids_done.txt\", \"a\")\n    c = 1\n\n    while curr_num != target:\n        try:\n            curr_num += 1\n\n            c += 1\n            # commit after every 500 iterations\n            if c == 500:\n                conn.commit()\n                c = 0\n\n            # check net connectivity\n            if not have_internet():\n\n                # save progress\n                conn.commit()\n                f.close()\n\n                print(\"Internet disconnected.. 
waiting\")\n                while not have_internet():\n                    time.sleep(5)\n\n                f = open(\"ids_done.txt\", \"a\")\n                print(\"Connected!\")\n\n            print(\"------------------------------------------\")\n            print(\"Processing:\", curr_num)\n\n            # empty left over downloads (done in Python: the quoted shell glob\n            # previously passed to rm was never expanded, so nothing was deleted)\n            if os.path.isdir(download_dir):\n                for leftover in os.listdir(download_dir):\n                    leftover_path = os.path.join(download_dir, leftover)\n                    if os.path.isfile(leftover_path):\n                        os.remove(leftover_path)\n\n            # ------------------------------------------------------------------\n            # check whether the app exists on Xiaomi or not\n            try:\n                url = base_site + str(curr_num)\n                html = urlopen(url, context=ctx)\n\n                # a valid ID redirects to a URL ending in .apk\n                if html.url[-4:] != \".apk\":\n                    f.write(str(curr_num) + \"\\n\")\n                    print(\"No app against this ID on Xiaomi Store\")\n                    continue\n\n                # if it exists, the package name is the APK file name\n                package = html.url.split('/')[-1]\n                package = package.split(\".apk\")[0]\n\n                print(\"Found on Xiaomi:\", package)\n            except Exception:\n                # network/HTTP errors only; Ctrl+C still reaches the outer handler\n                f.write(str(curr_num) + \"\\n\")\n                print(\"No app against this ID on Xiaomi Store\")\n                continue\n            # ------------------------------------------------------------------\n\n            # ------------------------------------------------------------------\n            # check if the same app exists on Google Play\n            try:\n                url = \"https://play.google.com/store/apps/details?id=\" + package\n                html = urlopen(url, context=ctx)\n\n                if str(html.getcode()) != '200':\n                    f.write(str(curr_num) + \"\\n\")\n                    print(package, \"doesn't exist on Play Store\")\n                    continue\n\n                # 
if it exists\n                print(\"Found on PlayStore:\", package)\n            except Exception:\n                # urlopen raises on 404 etc.; Ctrl+C still reaches the outer handler\n                f.write(str(curr_num) + \"\\n\")\n                print(package, \"doesn't exist on Play Store\")\n                continue\n            # ------------------------------------------------------------------\n\n            # ------------------------------------------------------------------\n            # download from Google Play\n            print(\"Downloading from Playstore\")\n            os.system('gplaycli -d ' + package + ' -f \"' + 'google_apps\"' + ' -p')\n\n            # path where gplaycli should have written the APK\n            apk_path = os.path.join(os.getcwd(), \"google_apps\", package + \".apk\")\n\n            time.sleep(1)\n\n            # check that the APK file was actually created\n            if not os.path.exists(apk_path):\n                f.write(str(curr_num) + \"\\n\")\n                print(\"App from playstore not downloaded (might be paid app or not available in your country).\")\n                continue\n\n            print(package, \": Google Download Successful\")\n            # ------------------------------------------------------------------\n\n            # ------------------------------------------------------------------\n            # download from Xiaomi\n            print(\"Downloading from Xiaomi\")\n            os.system(\"wget -P tmp/ --content-disposition -q \" + base_site + str(curr_num))\n\n            # rename the file and move it to its folder\n            for file in os.listdir(download_dir):\n                if file.endswith(\".apk\"):\n                    print('Moving Xiaomi File...')\n                    shutil.move(download_dir + file, \"xiomi_apps/\" + package + \".apk\")\n\n                    cur.execute(\"INSERT OR IGNORE INTO 
APPS VALUES (?,1,1,0)\",\n                                (package,))\n\n                    print(package, \": Xiaomi Download Successful\")\n\n                else:\n                    error = 'Error in downloading'\n                    print(error, package)\n            # ------------------------------------------------------------------\n\n            f.write(str(curr_num) + \"\\n\")\n\n        except KeyboardInterrupt:\n            exit_gracefully()\n\n        except Exception as exc:\n            # skip this ID, but report why instead of failing silently\n            print(\"Skipping\", curr_num, \"because of error:\", exc)\n\n    print(\"-----------------------------\")\n    print(\"Downloads Complete!!!\")\n    print(\"-----------------------------\")\n\n    print('\\n\\nProcess Successfully finished!!!!!\\n\\n')\n\n    f.close()\n    conn.commit()\n"
  },
  {
    "path": "code/tmp/Note.txt",
    "content": "This folder is the temporary download directory for an app. After the app is downloaded, it is moved\nto the relavant download folder."
  }
]