Repository: greg-randall/memento-mori Branch: main Commit: a0a86c3bd6e9 Files: 26 Total size: 295.4 KB Directory structure: gitextract_5pnua18_/ ├── .gitignore ├── Dockerfile ├── LICENSE ├── README.md ├── deprecated_php_utility/ │ ├── index.php │ ├── modal.js │ ├── notes.md │ └── style.css ├── docker-compose.yml ├── memento_mori/ │ ├── __init__.py │ ├── cli.py │ ├── extractor.py │ ├── file_mapper.py │ ├── generator.py │ ├── loader.py │ ├── media.py │ ├── static/ │ │ ├── css/ │ │ │ └── style.css │ │ └── js/ │ │ ├── modal.js │ │ └── stories.js │ └── templates/ │ ├── grid.html │ ├── index.html │ ├── stories.html │ └── stories_page.html ├── project_plan.md ├── pyproject.toml └── requirements.txt ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ *.zip *.7z *.tar *.gz *.tar.gz .aider* # Builds and Downloads output/ instagram*/ # Virtual Environments venv/ # Python bytecode __pycache__/ *.py[cod] *$py.class *.so .Python *.cpython-* # System Files .DS_Store # Comments test secrets comments-test/.env comments-test/instagram_cookies.json comments-test/runs/ comments-test/ ================================================ FILE: Dockerfile ================================================ FROM python:3.10-slim # Install system dependencies (including support for image processing and libmagic) RUN apt-get update && apt-get install -y \ libgl1 \ libglib2.0-0 \ libjpeg-dev \ zlib1g-dev \ libmagic-dev \ file \ && rm -rf /var/lib/apt/lists/* # Set up working directory WORKDIR /app # Copy requirements file COPY requirements.txt . # Install dependencies from requirements.txt RUN pip install --no-cache-dir -r requirements.txt # Copy application code COPY . . # Create directories for input/output RUN mkdir -p /input /output # Set the entrypoint ENTRYPOINT ["python", "-m", "memento_mori.cli"] # Default command if none provided CMD ["--help"] ================================================ FILE: LICENSE ================================================ GNU LESSER GENERAL PUBLIC LICENSE Version 2.1, February 1999 Copyright (C) 1991, 1999 Free Software Foundation, Inc. 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. [This is the first released version of the Lesser GPL. It also counts as the successor of the GNU Library Public License, version 2, hence the version number 2.1.] Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below. When we speak of free software, we are referring to freedom of use, not price. 
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things. To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it. For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights. We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library. To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others. Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license. Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs. When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library. We call this license the "Lesser" General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances. For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. 
A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License. In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system. Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library. The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run. GNU LESSER GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you". A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) "Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does. 1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. 
You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) The modified work must itself be a software library. b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. (For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. This option is useful when you wish to copy part of the code of the Library into a program that is not a library. 4. 
You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. 5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables. When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law. If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. 6. As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. 
Also, you must do one of these things: a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with. c) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. d) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. e) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. 7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. 
However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. 10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License. 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 13. The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. 
Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. 14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Libraries If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License). To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Reads an instagram export and creates html output that is browsable. Copyright (C) 2025 Gregory Randall This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Also add information on how to contact you by electronic and paper mail. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob' (a library for tweaking knobs) written by James Random Hacker. , 1 April 1990 Ty Coon, President of Vice That's all there is to it! ================================================ FILE: README.md ================================================ # Memento Mori - Instagram Archive Viewer Memento Mori Interface Preview **Memento Mori** is a tool that converts your Instagram data export into a beautiful, standalone viewer that resembles the Instagram interface. The name "Memento Mori" (Latin for "remember that you will die") reflects the ephemeral nature of our digital content. You can see an example at https://gregr.org/instagram/. If you find a bug that you're able to fix please create a pull request, otherwise create an issue! ## Quick Start Get your Instagram data export zip, throw it in with this code, and run this command: ```bash docker compose run --rm memento-mori #Then open output/index.html in your browser ``` ## ⚠️ IMPORTANT SECURITY WARNING ⚠️ **DO NOT** share your raw Instagram export online! It contains sensitive data you probably don't want to share: - Phone numbers - Precise location data - Personal messages - Email addresses - Other private information Only share the generated output folder after processing with this tool. ## How It Works Memento Mori processes your Instagram data export and generates a static site with your posts and stories, copying all your media files into an organized structure that can be viewed offline or hosted on your own website. ## Key Features - **Familiar Interface**: Grid layout with post details and carousel for multiple images - **Stories Support**: View your Instagram Stories with auto-progression and 9:16 aspect ratio display - **Media Optimization**: Converts images to WebP, generates thumbnails, and supports video playback - **Organization**: Sorts posts by various criteria with shareable links to specific content - **Profile Information**: Displays bio, website, and follower count from your Instagram profile - **Technical Improvements**: - Fixes encoding issues and mislabeled file formats - Shortens filenames for smaller HTML size - Processes files in parallel with a responsive design that works on all devices - Robust error handling with verbose debugging option ## How to Use Memento Mori ### 1. Get Your Instagram Data 1. Request and download your Instagram data archive 2. Place the zip within the folder of this repo ### 2. Preferred Method: Using Docker (Easiest) Docker Compose is the easiest way to run Memento Mori without installing any dependencies. 
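If you cannot use Compose, a plain `docker run` along the same lines should also work. This is only a sketch: the `/input` and `/output` mount targets are assumptions based on the directories the Dockerfile creates, and the Compose commands below remain the supported path.

```bash
# Sketch only: mount points assumed from the Dockerfile's `mkdir -p /input /output`;
# docker-compose.yml is the supported configuration.
docker build -t memento-mori .
docker run --rm \
  -v "$(pwd)":/input \
  -v "$(pwd)/output":/output \
  memento-mori --search-dir /input --output /output
```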
Many thanks to [CarsonDavis](https://github.com/CarsonDavis) for building out all the dockerizing code (as well as generally making my code better): ```bash # Build the Docker image docker compose build # Run with default settings docker compose run --rm memento-mori # Run with specific arguments docker compose run --rm memento-mori --output /output/my-site --quality 90 # Add Google Analytics tracking docker compose run --rm memento-mori --gtag-id G-DX1ZWTC9NZ # Serve the output folder locally to preview in your browser python3 -m http.server -d output ``` By default, Docker will: - Search for exports in the project directory - Output the generated site to the './output' directory ### 3. Alternative Method: Direct Python Installation If you prefer running the tool directly without Docker: ```bash # Install package and dependencies pip install -e . # Or install dependencies manually pip install ftfy==6.3.1 Jinja2==3.0.3 MarkupSafe==2.1.5 opencv_python==4.10.0.84 Pillow==11.1.0 tqdm==4.67.1 python_magic==0.4.27 # Run the CLI python -m memento_mori.cli # Serve the output folder locally to preview in your browser python3 -m http.server -d output ``` ### CLI Arguments The CLI supports the following arguments: ``` Options: --input PATH Path to data (ZIP or folder). If not specified, auto-detection will be used. --output PATH Output directory for generated website [default: ./output] --threads INTEGER Number of parallel processing threads [default: core count - 1] --search-dir PATH Directory to search for exports when auto-detecting [default: current directory] --quality INTEGER WebP conversion quality (1-100) [default: 70] --max-dimension INTEGER Maximum dimension for images in pixels [default: 1920] --thumbnail-size WxH Size of thumbnails [default: 292x292] --no-auto-detect Disable auto-detection (requires --input to be specified) --gtag-id ID Google Analytics tag ID (e.g., 'G-DX1ZWTC9NZ') to add tracking to the generated site --verbose, -v Enable verbose output for debugging ``` Note: Auto-detection is enabled by default and will look for exports in the current directory. Use `--no-auto-detect` if you want to disable this feature and specify an input path manually. ### Example Commands ```bash # Auto-detect export in current directory python -m memento_mori.cli # Specify input file/folder and output directory python -m memento_mori.cli --input path/to/export.zip --output my-site # Use specific number of threads and image quality python -m memento_mori.cli --threads 8 --quality 90 # Specify search directory for auto-detection python -m memento_mori.cli --search-dir ~/Downloads # Use custom thumbnail size python -m memento_mori.cli --thumbnail-size 400x400 # Specify maximum image dimension python -m memento_mori.cli --max-dimension 1600 # Disable auto-detection (requires specifying input) python -m memento_mori.cli --no-auto-detect --input path/to/export.zip # Add Google Analytics tracking python -m memento_mori.cli --gtag-id G-DX1ZWTC9NZ # Enable verbose debugging output python -m memento_mori.cli --verbose ``` ## Viewing Your Generated Site After the tool finishes processing your Instagram data: 1. The website will be generated in the output directory (default: ./output) 2. Open the `index.html` file in this directory with your web browser to view your Instagram archive 3. Click on "stories" in your profile stats to view your Stories archive 4. 
You can also upload the entire output directory to a web hosting service to share it online ## PHP Version (Alternative) If you prefer the deprecated PHP implementation, there are a few notes in the deprecated_php_utility folder; in short, extract your data into the folder containing index.php and run ```bash # Run from command line php index.php ``` ## Why This Exists When requesting your data from Instagram, the export you receive contains your content but in a format that's intentionally difficult to navigate and enjoy. Memento Mori solves this problem by transforming your archive into an intuitive, familiar interface that brings your memories back to life. Instagram, like many social platforms, has undergone significant "enshittification" - a term coined to describe how platforms evolve: 1. First, they attract users with a quality experience 2. Then, they leverage their position to extract data and attention 3. Finally, they degrade the user experience to maximize profit ================================================ FILE: deprecated_php_utility/index.php ================================================ 0) { $percentage_saved = (($total_size_original - $total_size_webp) / $total_size_original * 100); fwrite(STDERR, sprintf("Total space saved: %.2f MB (%.1f%%)\n", $space_saved_mb, $percentage_saved )); } else { fwrite(STDERR, sprintf("Total space saved: %.2f MB (0.0%%)\n", $space_saved_mb)); } echo "Media files copied to distribution folder.\n"; } /** * Copy a single file to the distribution folder, maintaining its path structure * * @param string $file_path The path to the file */ function copy_file_to_distribution($file_path) { // Skip if it's already a data URI if (strpos($file_path, 'data:image') === 0) { return; } $source = $file_path; $destination = 'distribution/' . 
$file_path; // Create directory structure if it doesn't exist $dir = dirname($destination); if (!file_exists($dir)) { mkdir($dir, 0755, true); } // Check if it's an image file that can be converted to WebP $is_image = preg_match('/\.(jpg|jpeg|png|gif)$/i', $file_path); $is_video = preg_match('/\.(mp4|mov|avi|webm)$/i', $file_path); if ($is_image && file_exists($source)) { // Convert image to WebP for better compression $webp_destination = preg_replace('/\.(jpg|jpeg|png|gif)$/i', '.webp', $destination); convert_to_webp($source, $webp_destination); // Generate thumbnail for this file generate_thumbnail($source, $file_path); } else if (file_exists($source)) { // Copy the file as is (for videos and other file types) copy($source, $destination); // Generate thumbnail for this file generate_thumbnail($source, $file_path); } } /** * Convert an image to WebP format without cropping * * @param string $source_path The source image path * @param string $destination_path The destination WebP path * @return bool True if successful, false otherwise */ function convert_to_webp($source_path, $destination_path) { try { // Detect file type by examining file contents $file_info = new finfo(FILEINFO_MIME_TYPE); $mime_type = $file_info->file($source_path); // Create image resource based on mime type $source_image = null; switch ($mime_type) { case 'image/jpeg': $source_image = @imagecreatefromjpeg($source_path); break; case 'image/png': $source_image = @imagecreatefrompng($source_path); // Preserve transparency for PNG if ($source_image) { imagepalettetotruecolor($source_image); imagealphablending($source_image, true); imagesavealpha($source_image, true); } break; case 'image/gif': $source_image = @imagecreatefromgif($source_path); break; default: // Try to load as JPEG first, then PNG, then GIF as fallbacks $source_image = @imagecreatefromjpeg($source_path); if (!$source_image) { $source_image = @imagecreatefrompng($source_path); if ($source_image) { imagepalettetotruecolor($source_image); imagealphablending($source_image, true); imagesavealpha($source_image, true); } } if (!$source_image) { $source_image = @imagecreatefromgif($source_path); } break; } if (!$source_image) { fwrite(STDERR, "Failed to create image resource for conversion: " . $source_path . "\n"); // Fall back to copying the original file copy($source_path, str_replace('.webp', '.jpg', $destination_path)); return false; } // Save as WebP with 80% quality (good balance between quality and file size) $result = imagewebp($source_image, $destination_path, 80); // Clean up imagedestroy($source_image); if ($result) { // Check if the WebP file is actually smaller than the original $original_size = filesize($source_path); $webp_size = filesize($destination_path); if ($webp_size > 0 && $webp_size < $original_size) { fwrite(STDERR, "Converted to WebP: " . $source_path . " (saved " . round(($original_size - $webp_size) / 1024, 2) . " KB)\n"); return true; } else { // If WebP is larger or failed, use the original file unlink($destination_path); copy($source_path, str_replace('.webp', '.jpg', $destination_path)); fwrite(STDERR, "WebP larger than original, using original: " . $source_path . "\n"); return false; } } else { // If WebP conversion failed, use the original file copy($source_path, str_replace('.webp', '.jpg', $destination_path)); fwrite(STDERR, "WebP conversion failed, using original: " . $source_path . "\n"); return false; } } catch (Exception $e) { fwrite(STDERR, "Error converting to WebP: " . $e->getMessage() . 
"\n"); // Fall back to copying the original file copy($source_path, str_replace('.webp', '.jpg', $destination_path)); return false; } } /** * Generate a thumbnail for an image or video file * * @param string $source_path The source file path * @param string $relative_path The relative path for naming the thumbnail * @return string|null The path to the generated thumbnail or null if failed */ function generate_thumbnail($source_path, $relative_path) { // Create thumbnails directory if it doesn't exist $thumbs_dir = 'distribution/thumbnails'; if (!file_exists($thumbs_dir)) { mkdir($thumbs_dir, 0755, true); } // Generate a unique filename for the thumbnail based on the original path $thumb_filename = md5($relative_path) . '.webp'; $thumb_path = $thumbs_dir . '/' . $thumb_filename; // Skip if thumbnail already exists if (file_exists($thumb_path)) { return $thumb_path; } // Target dimensions $target_width = 292; $target_height = 292; fwrite(STDERR, "Generating thumbnail for: $relative_path\n"); try { // Check if file exists if (!file_exists($source_path)) { fwrite(STDERR, "File not found: $source_path\n"); return null; } // Detect file type by examining file contents $file_info = new finfo(FILEINFO_MIME_TYPE); $mime_type = $file_info->file($source_path); // Determine if it's a video based on mime type $is_video = (strpos($mime_type, 'video/') === 0); // For HEIC files (often incorrectly labeled) $is_heic = false; if (strpos($mime_type, 'application/octet-stream') === 0) { // Check for HEIC signature $file_header = file_get_contents($source_path, false, null, 0, 12); if (strpos($file_header, 'ftypheic') !== false || strpos($file_header, 'ftypmif1') !== false || strpos($file_header, 'ftyphevc') !== false) { $is_heic = true; } } if ($is_video) { // For videos, try to use FFmpeg to extract a frame if (function_exists('exec')) { $temp_jpg = tempnam(sys_get_temp_dir(), 'thumb') . '.jpg'; // Extract a frame at 1 second mark exec("ffmpeg -i \"$source_path\" -ss 00:00:01 -vframes 1 -vf \"scale=$target_width:$target_height:force_original_aspect_ratio=decrease,pad=$target_width:$target_height:(ow-iw)/2:(oh-ih)/2:color=black\" \"$temp_jpg\" 2>&1", $output, $return_var); if ($return_var !== 0) { fwrite(STDERR, "FFmpeg error: " . implode("\n", $output) . "\n"); return null; } // Convert the extracted frame to WebP if (function_exists('imagecreatefromjpeg') && function_exists('imagewebp')) { $image = imagecreatefromjpeg($temp_jpg); imagewebp($image, $thumb_path, 80); imagedestroy($image); unlink($temp_jpg); // Clean up temp file return $thumb_path; } } // If FFmpeg fails or is not available, use a placeholder fwrite(STDERR, "Could not generate video thumbnail for: $relative_path\n"); return null; } else if ($is_heic) { // For HEIC files, try to use ImageMagick if available if (function_exists('exec')) { $temp_jpg = tempnam(sys_get_temp_dir(), 'thumb') . '.jpg'; exec("convert \"$source_path\" \"$temp_jpg\" 2>&1", $output, $return_var); if ($return_var !== 0) { fwrite(STDERR, "ImageMagick error for HEIC: " . implode("\n", $output) . 
"\n"); return null; } // Now process the converted JPG if (file_exists($temp_jpg)) { $source_image = imagecreatefromjpeg($temp_jpg); if (!$source_image) { fwrite(STDERR, "Failed to create image from converted HEIC: $relative_path\n"); unlink($temp_jpg); return null; } // Process the image (resize and save as WebP) $result = process_and_save_image($source_image, $thumb_path, $target_width, $target_height); unlink($temp_jpg); // Clean up temp file return $result; } } fwrite(STDERR, "Could not convert HEIC file: $relative_path\n"); return null; } else { // For images, use GD library if (!function_exists('imagecreatefromjpeg') || !function_exists('imagewebp')) { fwrite(STDERR, "GD library with WebP support is required\n"); return null; } // Create image resource based on mime type $source_image = null; switch ($mime_type) { case 'image/jpeg': $source_image = @imagecreatefromjpeg($source_path); break; case 'image/png': $source_image = @imagecreatefrompng($source_path); break; case 'image/gif': $source_image = @imagecreatefromgif($source_path); break; case 'image/webp': $source_image = @imagecreatefromwebp($source_path); break; default: // Try to load as JPEG first, then PNG, then GIF as fallbacks $source_image = @imagecreatefromjpeg($source_path); if (!$source_image) { $source_image = @imagecreatefrompng($source_path); } if (!$source_image) { $source_image = @imagecreatefromgif($source_path); } if (!$source_image) { $source_image = @imagecreatefromwebp($source_path); } break; } if (!$source_image) { fwrite(STDERR, "Failed to create image resource for: $relative_path (MIME: $mime_type)\n"); return null; } return process_and_save_image($source_image, $thumb_path, $target_width, $target_height); } } catch (Exception $e) { fwrite(STDERR, "Error generating thumbnail: " . $e->getMessage() . 
"\n"); return null; } return null; } /** * Process an image resource and save it as a WebP thumbnail * * @param resource $source_image The source image resource * @param string $thumb_path The path to save the thumbnail * @param int $target_width The target width * @param int $target_height The target height * @return string|null The path to the generated thumbnail or null if failed */ function process_and_save_image($source_image, $thumb_path, $target_width, $target_height) { try { // Get original dimensions $original_width = imagesx($source_image); $original_height = imagesy($source_image); // Create the final square thumbnail $thumb_image = imagecreatetruecolor($target_width, $target_height); // Fill with white background $white = imagecolorallocate($thumb_image, 255, 255, 255); imagefilledrectangle($thumb_image, 0, 0, $target_width, $target_height, $white); // Calculate dimensions for cropping to ensure 1:1 aspect ratio // We'll take the center portion of the image if ($original_width > $original_height) { // Landscape image: crop from the center horizontally $src_x = ($original_width - $original_height) / 2; $src_y = 0; $src_w = $original_height; $src_h = $original_height; } else { // Portrait image: crop from the center vertically $src_x = 0; $src_y = ($original_height - $original_width) / 2; $src_w = $original_width; $src_h = $original_width; } // Copy and resize the cropped portion directly to the thumbnail imagecopyresampled( $thumb_image, $source_image, 0, 0, $src_x, $src_y, $target_width, $target_height, $src_w, $src_h ); // Save as WebP imagewebp($thumb_image, $thumb_path, 80); // Clean up imagedestroy($source_image); imagedestroy($thumb_image); return $thumb_path; } catch (Exception $e) { fwrite(STDERR, "Error processing image: " . $e->getMessage() . "\n"); return null; } } function render_instagram_grid($post_data, $lazy_after = 30) { $output = ''; // Process each post $i=1; foreach ($post_data as $timestamp => $post) { if($i > $lazy_after){ $lazy_load = ' loading="lazy"'; } else { $lazy_load = ''; } $index = $post['post_index']; $media_count = count($post['media']); // Determine which media to use for the grid thumbnail $display_media = ''; $is_video = false; if (isset($post['media'][0])) { $first_media = $post['media'][0]; $original_media = $first_media; $display_media = $first_media; // Check if first media is a video $is_video = preg_match('/\.(mp4|mov|avi|webm)$/i', $first_media); // Check if we have a thumbnail for this media $thumb_filename = md5($first_media) . '.webp'; $thumb_path = 'thumbnails/' . $thumb_filename; if (file_exists('distribution/' . $thumb_path)) { // Use the thumbnail instead of the original $display_media = $thumb_path; fwrite(STDERR, "Using thumbnail for: $first_media\n"); } else { // Check if we have a WebP version of the original image if (!$is_video) { $webp_path = preg_replace('/\.(jpg|jpeg|png|gif)$/i', '.webp', $first_media); if (file_exists('distribution/' . $webp_path)) { $display_media = $webp_path; fwrite(STDERR, "Using WebP version for: $first_media\n"); } } // If it's a video, look for a thumbnail among all media items if ($is_video) { $found_thumbnail = false; // First check if there are any image files in the post's media that could be thumbnails foreach ($post['media'] as $media_item) { if (preg_match('/\.(jpg|jpeg|png|webp|gif)$/i', $media_item)) { // Check if we have a thumbnail for this image $img_thumb_filename = md5($media_item) . '.webp'; $img_thumb_path = 'thumbnails/' . $img_thumb_filename; if (file_exists('distribution/' . 
$img_thumb_path)) { $display_media = $img_thumb_path; } else { $display_media = $media_item; } $found_thumbnail = true; break; } } // If no thumbnail found, use a better SVG placeholder if (!$found_thumbnail) { // Create a simple SVG with a play button $svg = ''; $svg .= ''; $svg .= ''; $svg .= ''; $svg .= ''; // Encode the SVG properly for use in an img src attribute $display_media = 'data:image/svg+xml;base64,' . base64_encode($svg); } } } } $output .= '
' . "\n"; $output .= ' Instagram post' . "\n"; // Add video indicator if it's a video if ($is_video) { $output .= '
▶ Video
' . "\n"; } if ($media_count > 1) { $output .= '
⊞ ' . $media_count . '
' . "\n"; } elseif (isset($post['Likes']) && $post['Likes'] !== '') { $output .= ' ' . "\n"; } $output .= '
' . "\n"; $i++; } return $output; } date_default_timezone_set("America/New_York"); $personal_data = file_get_contents("personal_information/personal_information/personal_information.json"); $personal_data = json_decode($personal_data,true); $profile_picture = $personal_data['profile_user'][0]['media_map_data']['Profile Photo']['uri']; $user_name = $personal_data['profile_user'][0]["string_map_data"]["Username"]["value"]; unset($personal_data); //echo "profile picture: $profile_picture //username: $user_name\n"; $location_data = file_get_contents("personal_information/information_about_you/profile_based_in.json"); $location_data = json_decode($location_data ,true); $location = $location_data['inferred_data_primary_location'][0]['string_map_data']['City Name']['value']; unset($location_data); //echo "location: $location\n"; // Function to search for posts_1.json file recursively function find_posts_json() { $standard_path = 'your_instagram_activity/content/posts_1.json'; // First check the standard location if (file_exists($standard_path)) { return $standard_path; } // If not found, search recursively fwrite(STDERR, "posts_1.json not found in standard location, searching directories...\n"); $found_files = []; $iterator = new RecursiveIteratorIterator( new RecursiveDirectoryIterator('.', RecursiveDirectoryIterator::SKIP_DOTS) ); foreach ($iterator as $file) { if ($file->getFilename() === 'posts_1.json') { $found_files[] = $file->getPathname(); } } if (empty($found_files)) { fwrite(STDERR, "ERROR: Could not find posts_1.json anywhere in the directory structure.\n"); return false; } // If multiple files found, use the one that seems most likely if (count($found_files) > 1) { fwrite(STDERR, "Found multiple posts_1.json files:\n"); foreach ($found_files as $index => $path) { fwrite(STDERR, " [$index] $path\n"); } // Try to find the one in a directory with "content" or "activity" in the path foreach ($found_files as $path) { if (strpos($path, 'content') !== false || strpos($path, 'activity') !== false) { fwrite(STDERR, "Selected: $path\n"); return $path; } } // If no preferred path found, use the first one fwrite(STDERR, "Selected: {$found_files[0]}\n"); return $found_files[0]; } fwrite(STDERR, "Found posts_1.json at: {$found_files[0]}\n"); return $found_files[0]; } // Load and decode the JSON files $insights_data = file_get_contents('logged_information/past_instagram_insights/posts.json'); $insights_data = json_decode($insights_data, true); $posts_json_path = find_posts_json(); if (!$posts_json_path) { die("ERROR: Could not find the posts_1.json file. Please ensure your Instagram data is properly extracted."); } $post_data = file_get_contents($posts_json_path); $post_data = json_decode($post_data, true); // Create an indexed array of insights data using creation_timestamp as key $indexed_insights = []; foreach ($insights_data['organic_insights_posts'] as $insight) { $timestamp = $insight['media_map_data']['Media Thumbnail']['creation_timestamp']; $indexed_insights[$timestamp] = $insight; } // Combine the data $combined_data = []; foreach ($post_data as $post) { // Get the timestamp from the first media item (since a post might have multiple media items) $timestamp = $post['media'][0]['creation_timestamp']; // Create the combined post object $combined_post = [ 'post_data' => $post, 'insights' => isset($indexed_insights[$timestamp]) ? 
$indexed_insights[$timestamp] : null ]; // Add to combined data array $combined_data[] = $combined_post; } unset($post_data); unset($insights_data); unset($indexed_insights); function extractRelevantData($combined_data) { $simplified_data = []; foreach ($combined_data as $index => $item) { // Initialize a new post entry $post_entry = [ 'post_index' => $index, 'media' => [], 'creation_timestamp_unix' => "", 'creation_timestamp_readable' => "", 'title' => "", 'Impressions' => "", 'Likes' => "", 'Comments' => "" ]; // Extract post-level data if (isset($item['post_data'])) { if (isset($item['post_data']['creation_timestamp'])) { $post_entry['creation_timestamp_unix'] = $item['post_data']['creation_timestamp']; } elseif (isset($item['post_data']['media'][0]['creation_timestamp'])) { // Fallback to first media item timestamp if post timestamp not available $post_entry['creation_timestamp_unix'] = $item['post_data']['media'][0]['creation_timestamp']; } $post_entry['creation_timestamp_readable'] = gmdate("F j, Y \a\\t g:i A", $post_entry['creation_timestamp_unix']); if (isset($item['post_data']['title'])) { $post_entry['title'] = $item['post_data']['title']; } // Extract media URIs if (isset($item['post_data']['media'])) { foreach ($item['post_data']['media'] as $media) { $post_entry['media'][] = $media['uri'] ?? ""; } } } // Get insights data if available if (isset($item['insights']) && isset($item['insights']['string_map_data'])) { $insights = $item['insights']['string_map_data']; // Extract specific metrics and ensure they're integers or blank if (isset($insights['Impressions'])) { $impressions = $insights['Impressions']['value'] ?? ""; // Validate and convert to integer if numeric, otherwise leave blank $post_entry['Impressions'] = is_numeric($impressions) ? (int)$impressions : ""; } if (isset($insights['Likes'])) { $likes = $insights['Likes']['value'] ?? ""; // Validate and convert to integer if numeric, otherwise leave blank $post_entry['Likes'] = is_numeric($likes) ? (int)$likes : ""; } if (isset($insights['Comments'])) { $comments = $insights['Comments']['value'] ?? ""; // Validate and convert to integer if numeric, otherwise leave blank $post_entry['Comments'] = is_numeric($comments) ? (int)$comments : ""; } } $simplified_data[$post_entry['creation_timestamp_unix']] = $post_entry; } krsort($simplified_data); return $simplified_data; } $post_data = extractRelevantData($combined_data); unset($combined_data); echo "


"; // Assuming your array is stored in $post_data $keys = array_keys($post_data); // Get first and last keys $first_key = reset($keys); // Or $keys[0] $last_key = end($keys); // Or $keys[count($keys) - 1] // Get timestamps from first and last elements $last_timestamp = gmdate("F Y",$post_data[$first_key]['creation_timestamp_unix']); $first_timestamp = gmdate("F Y",$post_data[$last_key]['creation_timestamp_unix']); //echo"
" . print_r($post_data[],true) ."
"; ?> Memento Mori '; echo file_get_contents('modal.js'); echo ''; ?>
Profile Picture

posts
Memento Mori '; echo file_get_contents('modal.js'); echo ''; ?>
Profile Picture

posts
]+src=([\'"])([^"\']+)\\1/i', $html_content, $matches); $image_sources = $matches[2]; $total_images = count($image_sources); $missing_images = 0; $fixed_images = 0; fwrite(STDERR, "Found $total_images image references to verify.\n"); foreach ($image_sources as $src) { // Skip data URIs if (strpos($src, 'data:image') === 0) { continue; } // Check if the image exists in the distribution folder $image_path = 'distribution/' . $src; if (!file_exists($image_path)) { $missing_images++; fwrite(STDERR, "Missing image: $src\n"); // Try to find the image with a different extension $base_path = pathinfo($image_path, PATHINFO_DIRNAME) . '/' . pathinfo($image_path, PATHINFO_FILENAME); $found = false; // Check common image extensions foreach (['.jpg', '.jpeg', '.png', '.gif', '.webp'] as $ext) { $alt_path = $base_path . $ext; if (file_exists($alt_path)) { fwrite(STDERR, " Found alternative: " . basename($alt_path) . "\n"); // Copy the file to the expected path copy($alt_path, $image_path); $fixed_images++; $found = true; break; } } if (!$found) { // Check if the original file exists (before distribution) $original_src = $src; if (file_exists($original_src)) { fwrite(STDERR, " Found original file, copying to distribution: $original_src\n"); // Create directory if it doesn't exist $dir = dirname($image_path); if (!file_exists($dir)) { mkdir($dir, 0755, true); } // Copy the file copy($original_src, $image_path); $fixed_images++; } } } } // Report results if ($missing_images === 0) { fwrite(STDERR, "All images verified successfully!\n"); } else { fwrite(STDERR, "Found $missing_images missing images, fixed $fixed_images.\n"); if ($missing_images > $fixed_images) { fwrite(STDERR, "WARNING: " . ($missing_images - $fixed_images) . " images could not be fixed.\n"); } } } ?> ================================================ FILE: deprecated_php_utility/modal.js ================================================ document.addEventListener('DOMContentLoaded', function() { // Get DOM elements const postsGrid = document.getElementById('postsGrid'); const postModal = document.getElementById('postModal'); const closeModalBtn = document.getElementById('closeModal'); const modalPrev = document.getElementById('modalPrev'); const modalNext = document.getElementById('modalNext'); const postMedia = document.getElementById('postMedia'); const postCaption = document.getElementById('postCaption'); const postStats = document.getElementById('postStats'); const postDate = document.getElementById('postDate'); const postUsername = document.getElementById('postUsername'); const postUserPic = document.getElementById('postUserPic'); const sortLinks = document.querySelectorAll('.sort-link'); // Global variables to track current post and indexes let currentPostIndex = -1; let currentSlideIndex = 0; let postIndexToTimestamp = {}; // Map post index to timestamp let currentSortType = 'newest'; // Default sort // Initialize by creating mapping and attaching listeners function initialize() { // Create a mapping from post_index to timestamp Object.entries(window.postData).forEach(([timestamp, post]) => { postIndexToTimestamp[post.post_index] = timestamp; }); // Attach click listeners to grid items attachGridItemListeners(); // Initialize sorting functionality initializeSorting(); } // Initialize sorting functionality function initializeSorting() { // Add event listeners to sort links sortLinks.forEach(link => { link.addEventListener('click', function(e) { e.preventDefault(); // Update active class sortLinks.forEach(l => 
l.classList.remove('active')); this.classList.add('active'); // Get sort type and sort posts const sortType = this.getAttribute('data-sort'); currentSortType = sortType; sortPosts(sortType); }); }); } // Sort posts based on selected criteria function sortPosts(sortType) { // Get all grid items let gridItems = Array.from(document.querySelectorAll('.grid-item')); // Sort the grid items based on the selected criteria switch(sortType) { case 'newest': // Sort by timestamp (newest first) - this is the default gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const timestampA = getTimestampByIndex(indexA); const timestampB = getTimestampByIndex(indexB); return timestampB - timestampA; }); break; case 'oldest': // Sort by timestamp (oldest first) gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const timestampA = getTimestampByIndex(indexA); const timestampB = getTimestampByIndex(indexB); return timestampA - timestampB; }); break; case 'most-likes': // Sort by number of likes gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const likesA = getLikesByIndex(indexA) || 0; const likesB = getLikesByIndex(indexB) || 0; return likesB - likesA; }); break; case 'most-comments': // Sort by number of comments gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const commentsA = getCommentsByIndex(indexA) || 0; const commentsB = getCommentsByIndex(indexB) || 0; return commentsB - commentsA; }); break; case 'most-views': // Sort by number of views/impressions gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const viewsA = getViewsByIndex(indexA) || 0; const viewsB = getViewsByIndex(indexB) || 0; return viewsB - viewsA; }); break; case 'random': // Shuffle the grid items randomly gridItems.sort(() => Math.random() - 0.5); break; } // Reorder the grid items in the DOM const fragment = document.createDocumentFragment(); gridItems.forEach(item => { fragment.appendChild(item); }); // Clear the grid and append the sorted items postsGrid.innerHTML = ''; postsGrid.appendChild(fragment); // Reattach event listeners to grid items attachGridItemListeners(); } // Helper function to get timestamp by post index function getTimestampByIndex(index) { const timestamp = postIndexToTimestamp[index]; return parseInt(timestamp); } // Helper function to get likes by post index function getLikesByIndex(index) { const timestamp = postIndexToTimestamp[index]; if (timestamp && window.postData[timestamp]) { return parseInt(window.postData[timestamp].Likes) || 0; } return 0; } // Helper function to get comments by post index function getCommentsByIndex(index) { const timestamp = postIndexToTimestamp[index]; if (timestamp && window.postData[timestamp]) { return parseInt(window.postData[timestamp].Comments) || 0; } return 0; } // Helper function to get views/impressions by post index function getViewsByIndex(index) { const timestamp = postIndexToTimestamp[index]; if (timestamp && window.postData[timestamp]) { return parseInt(window.postData[timestamp].Impressions) || 0; } return 0; } // Attach click event listeners to all grid items function attachGridItemListeners() { const gridItems = 
document.querySelectorAll('.grid-item'); gridItems.forEach(item => { item.addEventListener('click', function() { const postIndex = parseInt(this.getAttribute('data-index')); openModal(postIndex); }); }); } // Open the modal with the selected post function openModal(index, imageIndex = 0) { currentPostIndex = index; // Store the current scroll position before opening the modal const scrollPosition = window.pageYOffset || document.documentElement.scrollTop; // Get the timestamp using the post_index mapping const timestamp = postIndexToTimestamp[index]; // Get the post data using the timestamp const post = window.postData[timestamp]; // Show the modal first (important for correct dimensions) postModal.style.display = 'block'; document.body.style.overflow = 'hidden'; // Prevent scrolling // Store the scroll position as a data attribute on the modal postModal.setAttribute('data-scroll-position', scrollPosition); // Update modal content updateModalContent(post, imageIndex); // Update URL with post ID and image index updateUrlWithPostInfo(timestamp, imageIndex); // For mobile devices, ensure content is visible and properly sized if (window.innerWidth <= 768) { // Don't scroll to top on mobile as it causes the issue // Instead, just ensure the modal is properly positioned postModal.scrollTop = 0; // Force layout recalculation with a longer timeout setTimeout(() => { const mediaContainer = document.querySelector('.media-container'); const postMediaEl = document.getElementById('postMedia'); // Ensure post-media has explicit height if (postMediaEl) { postMediaEl.style.height = '50vh'; postMediaEl.style.minHeight = '300px'; } // Ensure media-container has explicit height if (mediaContainer) { mediaContainer.style.height = '100%'; mediaContainer.style.display = 'flex'; // Force reflow void mediaContainer.offsetHeight; } // Reset any active slides to ensure they're visible const activeSlides = document.querySelectorAll('.media-slide.active'); activeSlides.forEach(slide => { slide.style.opacity = '0'; void slide.offsetHeight; // Force reflow slide.style.opacity = '1'; // Make sure images have height const img = slide.querySelector('img'); if (img) { img.style.maxHeight = '100%'; img.style.width = 'auto'; img.style.height = 'auto'; } }); }, 50); // Increase timeout for more reliability } } // Function to update the URL with post and image information function updateUrlWithPostInfo(timestamp, imageIndex) { // Create a new URL object based on the current URL const url = new URL(window.location.href); // Set the post parameter to the timestamp url.searchParams.set('post', timestamp); // Only add the image parameter if it's not the first image if (imageIndex > 0) { url.searchParams.set('image', imageIndex); } else { url.searchParams.delete('image'); } // Update the browser history without reloading the page window.history.pushState({}, '', url); } //Creates the appropriate media element (video or image) based on the file type function createMediaElement(mediaUrl) { // Check if the media is a video based on file extension if (mediaUrl.endsWith('.mp4') || mediaUrl.endsWith('.mov') || mediaUrl.endsWith('.avi') || mediaUrl.endsWith('.webm')) { // Create video element const video = document.createElement('video'); video.src = mediaUrl; video.controls = true; video.autoplay = true; video.loop = true; video.muted = false; video.playsInline = true; video.alt = 'Instagram video post'; return video; } else { // Create image element const img = document.createElement('img'); // Check if there's a WebP version available 
for non-WebP images if (!mediaUrl.endsWith('.webp') && (mediaUrl.endsWith('.jpg') || mediaUrl.endsWith('.jpeg') || mediaUrl.endsWith('.png') || mediaUrl.endsWith('.gif'))) { // Try to use WebP version if it exists const webpUrl = mediaUrl.replace(/\.(jpg|jpeg|png|gif)$/i, '.webp'); // Set up error handling to fall back to original if WebP doesn't exist img.onerror = function() { this.onerror = null; // Prevent infinite loop this.src = mediaUrl; // Fall back to original }; img.src = webpUrl; } else { img.src = mediaUrl; } img.alt = 'Instagram post'; return img; } } // Update the modal content with the post data function updateModalContent(post, initialImageIndex = 0) { // Clear previous content postMedia.innerHTML = ''; postCaption.innerHTML = ''; postStats.innerHTML = ''; // Create media container for the slides const mediaContainer = document.createElement('div'); mediaContainer.className = 'media-container'; // Check if the post has multiple media if (post.media && post.media.length > 1) { // Create slides for each media item post.media.forEach((mediaUrl, index) => { const slide = document.createElement('div'); slide.className = `media-slide ${index === initialImageIndex ? 'active' : ''}`; // Create and add the appropriate media element const mediaElement = createMediaElement(mediaUrl); slide.appendChild(mediaElement); mediaContainer.appendChild(slide); }); // Add navigation buttons for slideshow const prevBtn = document.createElement('div'); prevBtn.className = 'slideshow-nav slideshow-prev'; prevBtn.innerHTML = '❮'; prevBtn.addEventListener('click', function(e) { e.stopPropagation(); navigateSlideshow(-1); }); const nextBtn = document.createElement('div'); nextBtn.className = 'slideshow-nav slideshow-next'; nextBtn.innerHTML = '❯'; nextBtn.addEventListener('click', function(e) { e.stopPropagation(); navigateSlideshow(1); }); // Add indicator dots const indicator = document.createElement('div'); indicator.className = 'slideshow-indicator'; for (let i = 0; i < post.media.length; i++) { const dot = document.createElement('div'); dot.className = `slideshow-dot ${i === initialImageIndex ? 
'active' : ''}`; dot.setAttribute('data-index', i); dot.addEventListener('click', function(e) { e.stopPropagation(); const index = parseInt(this.getAttribute('data-index')); showSlide(index); }); indicator.appendChild(dot); } mediaContainer.appendChild(prevBtn); mediaContainer.appendChild(nextBtn); mediaContainer.appendChild(indicator); // Set the current slide index to the initial image index currentSlideIndex = initialImageIndex; } else { // Single media post const slide = document.createElement('div'); slide.className = 'media-slide active'; // Create and add the appropriate media element const mediaElement = createMediaElement(post.media[0]); slide.appendChild(mediaElement); mediaContainer.appendChild(slide); } postMedia.appendChild(mediaContainer); // Set post caption postCaption.textContent = post.title || ''; // Set post stats if (post.Impressions) { const impressionsDiv = document.createElement('div'); impressionsDiv.className = 'post-stat'; impressionsDiv.innerHTML = ` 👁️ ${post.Impressions} views `; postStats.appendChild(impressionsDiv); } if (post.Likes) { const likesDiv = document.createElement('div'); likesDiv.className = 'post-stat'; likesDiv.innerHTML = ` ${post.Likes} `; postStats.appendChild(likesDiv); } if (post.Comments) { const commentsDiv = document.createElement('div'); commentsDiv.className = 'post-stat'; commentsDiv.innerHTML = ` 💬 ${post.Comments} comments `; postStats.appendChild(commentsDiv); } // Set post date postDate.textContent = post.creation_timestamp_readable; // Show/hide stats container based on whether there are any stats postStats.style.display = postStats.children.length > 0 ? 'flex' : 'none'; } // Navigate between slides in a multi-media post function navigateSlideshow(direction) { const slides = document.querySelectorAll('.media-slide'); const dots = document.querySelectorAll('.slideshow-dot'); let activeIndex = 0; // Find the currently active slide slides.forEach((slide, index) => { if (slide.classList.contains('active')) { activeIndex = index; } }); // Pause any videos in the current slide const currentVideo = slides[activeIndex].querySelector('video'); if (currentVideo) { currentVideo.pause(); } // Calculate the new index let newIndex = activeIndex + direction; if (newIndex < 0) newIndex = slides.length - 1; if (newIndex >= slides.length) newIndex = 0; // Update slides and dots showSlide(newIndex); } // Show a specific slide function showSlide(index) { const slides = document.querySelectorAll('.media-slide'); const dots = document.querySelectorAll('.slideshow-dot'); // Pause all videos before changing slides slides.forEach(slide => { const video = slide.querySelector('video'); if (video) { video.pause(); } }); // Remove active class from all slides and dots slides.forEach(slide => slide.classList.remove('active')); if (dots.length > 0) { dots.forEach(dot => dot.classList.remove('active')); dots[index].classList.add('active'); } // Add active class to the selected slide slides[index].classList.add('active'); // Update current slide index currentSlideIndex = index; // Update URL with the new image index const timestamp = postIndexToTimestamp[currentPostIndex]; updateUrlWithPostInfo(timestamp, index); } // Navigate between posts (next/prev buttons in modal) function navigatePost(direction) { // Pause all videos in the current post const videos = document.querySelectorAll('.media-slide video'); videos.forEach(video => { if (video) { video.pause(); } }); // Get all grid items in their current sorted order const gridItems = 
Array.from(document.querySelectorAll('.grid-item')); const gridIndexes = gridItems.map(item => parseInt(item.getAttribute('data-index'))); // Find the position of the current post in the sorted grid const currentPosition = gridIndexes.indexOf(currentPostIndex); if (currentPosition === -1) { console.error('Current post not found in grid'); return; } // Calculate new position with wraparound let newPosition = (currentPosition + direction + gridIndexes.length) % gridIndexes.length; // Get the new post index from the grid's current order const newPostIndex = gridIndexes[newPosition]; // Open the new post openModal(newPostIndex); } // Close the modal function closeModal() { // Pause all videos before closing the modal const videos = document.querySelectorAll('.media-slide video'); videos.forEach(video => { if (video) { video.pause(); } }); // Store the current scroll position before closing the modal const scrollPosition = window.pageYOffset || document.documentElement.scrollTop; postModal.style.display = 'none'; document.body.style.overflow = 'auto'; // Re-enable scrolling // Remove post and image parameters from URL const url = new URL(window.location.href); url.searchParams.delete('post'); url.searchParams.delete('image'); window.history.pushState({}, '', url); // Restore the scroll position after a short delay setTimeout(() => { window.scrollTo({ top: scrollPosition, behavior: 'auto' // Use 'auto' instead of 'smooth' to prevent visible scrolling }); }, 10); } // Event listeners for modal navigation closeModalBtn.addEventListener('click', closeModal); modalPrev.addEventListener('click', function(e) { e.stopPropagation(); navigatePost(-1); }); modalNext.addEventListener('click', function(e) { e.stopPropagation(); navigatePost(1); }); // Close modal when clicking outside of content postModal.addEventListener('click', function(e) { if (e.target === postModal) { closeModal(); } }); // Keyboard navigation document.addEventListener('keydown', function(e) { if (postModal.style.display === 'block') { if (e.key === 'Escape') { closeModal(); } else if (e.key === 'ArrowLeft') { navigatePost(-1); } else if (e.key === 'ArrowRight') { navigatePost(1); } } }); // Initialize the modal functionality if (typeof window.postData !== 'undefined') { initialize(); // Check if URL has post and image parameters const urlParams = new URLSearchParams(window.location.search); const postTimestamp = urlParams.get('post'); const imageIndex = parseInt(urlParams.get('image') || '0'); if (postTimestamp && window.postData[postTimestamp]) { // Find the post index from the timestamp let postIndex = -1; Object.entries(postIndexToTimestamp).forEach(([index, timestamp]) => { if (timestamp === postTimestamp) { postIndex = parseInt(index); } }); if (postIndex >= 0) { // Open the modal with the specified post and image setTimeout(() => { openModal(postIndex, imageIndex); }, 500); // Delay to ensure everything is loaded } } } else { console.error('Post data not available'); } }); ================================================ FILE: deprecated_php_utility/notes.md ================================================ The PHP version may be easier to run on shared hosting environments and doesn't require additional packages if PHP is already installed with the necessary extensions. 
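A quick way to confirm those extensions are available is to run `php -m` from the command line (assuming the PHP CLI is installed) and look for `gd` and `imagick` in the output.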
## Troubleshooting - If you see errors about GD library or WebP support, you may need to install additional PHP extensions - For video thumbnail generation, ensure FFmpeg is installed and accessible in your system path - For HEIC file support, ensure ImageMagick is installed - If using Docker, ensure you have permissions to write to the output directory - For large archives, be patient as processing media files can take time ================================================ FILE: deprecated_php_utility/style.css ================================================ :root { --instagram-bg: #fafafa; --instagram-border: #dbdbdb; --instagram-text: #262626; --instagram-link: #0095f6; --header-height: 60px; } * { margin: 0; padding: 0; box-sizing: border-box; } body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; background-color: var(--instagram-bg); color: var(--instagram-text); line-height: 1.5; } header { position: fixed; top: 0; left: 0; right: 0; height: var(--header-height); background-color: white; border-bottom: 1px solid var(--instagram-border); display: flex; align-items: center; justify-content: center; padding: 0 20px; z-index: 100; } .header-content { max-width: 975px; width: 100%; display: flex; justify-content: space-between; align-items: center; } .logo { font-size: 24px; font-weight: bold; color: var(--instagram-text); text-decoration: none; } .date-range-header { color: #8e8e8e; font-size: 14px; margin-left: 15px; } main { max-width: 975px; margin: calc(var(--header-height) + 30px) auto 30px; padding: 0 20px; } .profile-info { display: flex; align-items: center; margin-bottom: 30px; } .profile-picture { width: 150px; height: 150px; border-radius: 50%; object-fit: cover; margin-right: 30px; background-color: #eee; display: flex; align-items: center; justify-content: center; font-size: 36px; color: #aaa; } .profile-details h1 { font-size: 28px; font-weight: 300; margin-bottom: 15px; } .stats { display: flex; margin-bottom: 15px; font-size: 16px; } .stat { margin-right: 40px; } .stat-count { font-weight: 600; } .posts-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 28px; } .grid-item { position: relative; aspect-ratio: 1/1; cursor: pointer; overflow: hidden; } .grid-item img, .grid-item video { width: 100%; height: 100%; object-fit: cover; transition: transform 0.3s ease; aspect-ratio: 1/1; } .grid-item:hover img, .grid-item:hover video { transform: scale(1.05); } .multi-indicator { position: absolute; top: 10px; right: 10px; color: white; background-color: rgba(0, 0, 0, 0.6); padding: 3px 8px; border-radius: 4px; font-size: 12px; z-index: 2; } .video-indicator { position: absolute; top: 10px; right: 10px; color: white; background-color: rgba(0, 0, 0, 0.7); padding: 4px 10px; border-radius: 4px; font-size: 12px; font-weight: bold; z-index: 2; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3); } .post-modal { display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100%; background-color: rgba(0, 0, 0, 0.9); z-index: 1000; overflow-y: auto; } .post-modal-content { display: flex; max-width: 1200px; margin: 30px auto; background-color: white; height: calc(100vh - 60px); max-height: 800px; border-radius: 4px; overflow: hidden; position: relative; } .post-media { flex: 1; background-color: black; position: relative; min-width: 0; display: flex; align-items: center; justify-content: center; } .post-media img, .post-media video { max-width: 100%; max-height: 100%; object-fit: contain; } .post-info { width: 340px; 
border-left: 1px solid var(--instagram-border); display: flex; flex-direction: column; } .post-header { padding: 16px; border-bottom: 1px solid var(--instagram-border); display: flex; align-items: center; } .post-user { width: 32px; height: 32px; border-radius: 50%; margin-right: 12px; display: flex; align-items: center; justify-content: center; background-color: #eee; font-size: 14px; color: #aaa; } .post-username { font-weight: 600; flex-grow: 1; } .share-button { cursor: pointer; padding: 5px; border-radius: 50%; display: flex; align-items: center; justify-content: center; transition: background-color 0.2s; } .share-button:hover { background-color: rgba(0, 0, 0, 0.1); } .share-button svg { width: 18px; height: 18px; color: #8e8e8e; } .post-caption { padding: 16px; flex-grow: 1; overflow-y: auto; } .post-date { padding: 16px; color: #8e8e8e; font-size: 12px; border-top: 1px solid var(--instagram-border); } .post-stats { padding: 12px 16px; color: var(--instagram-text); font-size: 14px; border-top: 1px solid var(--instagram-border); display: flex; gap: 16px; } .post-stat { display: flex; align-items: center; gap: 6px; } .post-stat-icon { font-size: 16px; } .likes-indicator { position: absolute; bottom: 10px; left: 10px; color: white; background-color: rgba(0, 0, 0, 0.7); padding: 4px 10px; border-radius: 4px; font-size: 12px; font-weight: bold; z-index: 2; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3); } .close-modal { position: absolute; top: 20px; right: 20px; color: white; font-size: 30px; cursor: pointer; z-index: 1001; } .modal-nav { position: absolute; top: 50%; transform: translateY(-50%); color: white; font-size: 30px; cursor: pointer; z-index: 1001; background-color: rgba(0, 0, 0, 0.5); width: 40px; height: 40px; border-radius: 50%; display: flex; align-items: center; justify-content: center; } .modal-prev { left: 20px; } .modal-next { right: 20px; } .slideshow-nav { position: absolute; top: 50%; transform: translateY(-50%); color: white; font-size: 30px; cursor: pointer; z-index: 5; background-color: rgba(0, 0, 0, 0.7); width: 40px; height: 40px; border-radius: 50%; display: flex; align-items: center; justify-content: center; transition: background-color 0.2s; } .slideshow-nav:hover { background-color: rgba(0, 0, 0, 0.9); } .slideshow-prev { left: 10px; } .slideshow-next { right: 10px; } .slideshow-indicator { position: absolute; bottom: 20px; left: 0; right: 0; display: flex; justify-content: center; z-index: 5; background-color: rgba(0, 0, 0, 0.3); padding: 8px 0; border-radius: 20px; width: auto; max-width: 80%; margin: 0 auto; } .slideshow-dot { width: 8px; height: 8px; border-radius: 50%; background-color: rgba(255, 255, 255, 0.5); margin: 0 4px; cursor: pointer; transition: background-color 0.2s; } .slideshow-dot:hover { background-color: rgba(255, 255, 255, 0.8); } .slideshow-dot.active { background-color: white; } .media-container { position: relative; width: 100%; height: 100%; display: flex; align-items: center; justify-content: center; } .media-slide { position: absolute; top: 0; left: 0; width: 100%; height: 100%; display: flex; align-items: center; justify-content: center; opacity: 0; transition: opacity 0.3s ease; pointer-events: none; } .media-slide img, .media-slide video { max-width: 100%; max-height: 100%; object-fit: contain; } .media-slide.active { opacity: 1; z-index: 2; pointer-events: auto; } .media-slide.active { opacity: 1; z-index: 2; } .file-input-container { margin-bottom: 20px; padding: 20px; background-color: white; border: 1px solid 
var(--instagram-border); border-radius: 4px; } .loading { text-align: center; padding: 40px; font-size: 18px; } .sort-options { display: flex; align-items: center; justify-content: center; padding: 10px 20px; margin-bottom: 20px; } .sort-row { display: flex; align-items: center; justify-content: center; flex-wrap: wrap; margin: 5px 0; width: 100%; max-width: 600px; } .sort-link { margin: 0 10px; color: var(--instagram-text); text-decoration: none; padding: 5px 0; position: relative; transition: color 0.2s; } .sort-link:hover { color: var(--instagram-link); } .sort-link.active { color: var(--instagram-link); font-weight: 600; } .sort-link.active::after { content: ''; position: absolute; bottom: 0; left: 0; width: 100%; height: 2px; background-color: var(--instagram-link); } @media (max-width: 768px) { .posts-grid { grid-template-columns: repeat(2, 1fr); gap: 4px; } .post-modal-content { flex-direction: column; height: auto; max-height: none; margin: 30px auto 0; border-radius: 0; width: 100%; } .post-media { height: 50vh; width: 100%; min-height: 300px; position: relative; } .post-info { width: 100%; border-left: none; border-top: 1px solid var(--instagram-border); } .profile-picture { width: 80px; height: 80px; margin-right: 15px; } .stat { margin-right: 20px; } .post-modal { overflow-y: auto; padding-top: 0; } .media-container { position: relative; width: 100%; height: 100%; } .media-slide { position: absolute; top: 0; left: 0; width: 100%; height: 100%; display: flex; align-items: center; justify-content: center; } .media-slide img, .media-slide video { max-width: 100%; max-height: 100%; object-fit: contain; } } @media (max-width: 480px) { .posts-grid { grid-template-columns: repeat(3, 1fr); gap: 3px; } .profile-info { flex-direction: column; text-align: center; } .profile-picture { margin-right: 0; margin-bottom: 15px; } .stats { justify-content: center; } .header-content { flex-direction: column; align-items: center; padding: 5px 0; } .date-range-header { margin-left: 0; margin-top: 2px; font-size: 12px; } .sort-options { padding: 5px; } .sort-row { width: 100%; flex-wrap: wrap; justify-content: center; } .sort-link { margin: 5px; font-size: 13px; padding: 5px 0; flex: 0 0 auto; } } /* Mobile-specific fixes */ @media (max-width: 768px) { /* Ensure the modal takes up the full screen */ .post-modal { padding: 0; overflow-y: auto; } /* Make modal content take full width */ .post-modal-content { flex-direction: column; height: auto; margin: 0; width: 100%; max-width: 100%; } /* Explicitly set post-media height */ .post-media { height: 50vh !important; /* Important to override any inline styles */ min-height: 300px !important; width: 100%; flex: 0 0 auto; /* Don't grow or shrink */ } /* Ensure media container fills the available space */ .media-container { position: relative; width: 100%; height: 100% !important; display: flex !important; align-items: center; justify-content: center; } /* Fix media slides */ .media-slide { position: absolute; top: 0; left: 0; width: 100%; height: 100%; display: flex !important; align-items: center; justify-content: center; } /* Ensure images don't exceed container */ .media-slide img, .media-slide video { max-width: 100%; max-height: 100%; width: auto; height: auto; object-fit: contain; } /* Make post info section scroll independently if needed */ .post-info { flex: 1 1 auto; overflow-y: auto; max-height: 50vh; } } ================================================ FILE: docker-compose.yml ================================================ services: 
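  # One-off run with explicit arguments (example paths, not part of this repo):
  #   docker compose run --rm memento-mori --input /app/workspace/export.zip --output /output
  # With no arguments, `docker compose up` falls back to the default command below,
  # which auto-detects an export inside the mounted workspace.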
memento-mori: build: . volumes: - ./:/app/workspace - ./output:/output environment: - PYTHONUNBUFFERED=1 command: --search-dir /app/workspace --output /output ================================================ FILE: memento_mori/__init__.py ================================================ # __init__.py """ Memento Mori - Instagram Archive Viewer A tool that converts your Instagram data export into a beautiful, standalone viewer that resembles the Instagram interface. The name "Memento Mori" (Latin for "remember that you will die") reflects the ephemeral nature of our digital content. """ __version__ = "0.1.0" # Import main classes for easier access from .extractor import InstagramArchiveExtractor from .file_mapper import InstagramFileMapper from .loader import InstagramDataLoader from .media import InstagramMediaProcessor from .generator import InstagramSiteGenerator # Define what's available when using `from memento_mori import *` __all__ = [ "InstagramArchiveExtractor", "InstagramFileMapper", "InstagramDataLoader", "InstagramMediaProcessor", "InstagramSiteGenerator", ] ================================================ FILE: memento_mori/cli.py ================================================ # memento_mori/cli.py import os import argparse import multiprocessing from pathlib import Path import traceback import sys from memento_mori.extractor import InstagramArchiveExtractor from memento_mori.loader import InstagramDataLoader from memento_mori.media import InstagramMediaProcessor from memento_mori.generator import InstagramSiteGenerator def main(): """Main entry point for the Memento Mori CLI.""" parser = argparse.ArgumentParser( description="Transform Instagram data export into a viewer." ) parser.add_argument( "--input", type=str, help="Path to Instagram data (ZIP or folder). 
If not specified, auto-detection will be used.", ) parser.add_argument( "--output", type=str, default="./output", help="Output directory for generated website [default: ./output]", ) parser.add_argument( "--threads", type=int, default=0, help="Number of parallel processing threads [default: auto]", ) parser.add_argument( "--search-dir", type=str, default=".", help="Directory to search for Instagram exports when auto-detecting [default: current directory]", ) parser.add_argument( "--quality", type=int, default=70, help="WebP conversion quality (1-100) [default: 70]", ) parser.add_argument( "--max-dimension", type=int, default=1920, help="Maximum dimension for images in pixels [default: 1920]", ) parser.add_argument( "--thumbnail-size", type=str, default="292x292", help="Size of thumbnails [default: 292x292]", ) parser.add_argument( "--no-auto-detect", action="store_true", help="Disable auto-detection (requires --input to be specified)", ) parser.add_argument( "--gtag-id", type=str, help="Google Analytics tag ID (e.g., 'G-DX1ZWTC9NZ') to add tracking to the generated site", ) parser.add_argument( "--verbose", "-v", action="store_true", help="Enable verbose output for debugging", ) args = parser.parse_args() # Set defaults for threads if not specified if args.threads <= 0: args.threads = max(1, multiprocessing.cpu_count() - 1) # Parse thumbnail size try: if "x" in args.thumbnail_size: width, height = map(int, args.thumbnail_size.lower().split("x")) thumbnail_size = (width, height) else: size = int(args.thumbnail_size) thumbnail_size = (size, size) except ValueError: print(f"Invalid thumbnail size: {args.thumbnail_size}, using default 292x292") thumbnail_size = (292, 292) # Create output directory output_dir = Path(args.output) output_dir.mkdir(parents=True, exist_ok=True) # Initialize extractor with input path if specified extractor = InstagramArchiveExtractor(input_path=args.input) # Handle input selection # If input is explicitly provided, use that if args.input: print(f"Using specified input: {args.input}") # If auto-detect is not disabled, try to find an export elif not args.no_auto_detect: print(f"Auto-detecting Instagram archive in {args.search_dir}...") detected_archive = extractor.auto_detect_archive(search_dir=args.search_dir) if not detected_archive: print( "No Instagram archive detected. Please specify an input file with --input." 
) return 1 print(f"Detected archive: {detected_archive}") # If no input and auto-detect disabled, raise error else: print("Error: No input specified and auto-detection is disabled.") print("Please provide an input path with --input.") return 1 try: # Extract archive print("\n📦 EXTRACTING ARCHIVE") print(f" Source: {extractor.input_path}") extraction_dir = extractor.extract() print(f" Extracted to: {extraction_dir}") # Get file mapper from extractor file_mapper = extractor.file_mapper # Initialize loader with the same file mapper print("\n📋 LOADING DATA") loader = InstagramDataLoader(extraction_dir, file_mapper, verbose=args.verbose) # Load and process data data = loader.load_all_data() if args.verbose: print("\n🔍 VERBOSE: Data Loading Details") print(f" Profile data found: {'Yes' if loader.profile_data else 'No'}") print(f" Location data found: {'Yes' if loader.location_data else 'No'}") print(f" Posts data found: {'Yes' if loader.posts_data else 'No'}") print(f" Insights data found: {'Yes' if loader.insights_data else 'No'}") print(f" Combined data entries: {len(loader.combined_data) if loader.combined_data else 0}") # Show file paths that were found print("\n File paths found:") for file_type, file_path in file_mapper.file_map.items(): if isinstance(file_path, list): print(f" {file_type}: {len(file_path)} files") if args.verbose: for i, path in enumerate(file_path[:3]): # Show first 3 only print(f" - {path}") if len(file_path) > 3: print(f" - ... and {len(file_path)-3} more") else: print(f" {file_type}: {file_path}") print(f" Found {data['post_count']} posts from {data['profile']['username']}") # Process media files print(f"\n🖼️ PROCESSING MEDIA") print(f" Using {args.threads} threads, quality {args.quality}, max dimension {args.max_dimension}...") media_processor = InstagramMediaProcessor( extraction_dir, output_dir, thread_count=args.threads, quality=args.quality, max_dimension=args.max_dimension ) media_result = media_processor.process_media_files( data["posts"], data["profile"]["profile_picture"], data.get("stories", {}) ) # Update data with shortened filenames data["posts"] = media_result["updated_post_data"] data["profile"]["profile_picture"] = media_result["shortened_profile"] # Update stories data if it exists if "stories" in data and media_result.get("updated_stories_data"): data["stories"] = media_result["updated_stories_data"] # Generate website with the loaded data print("\n🌐 GENERATING WEBSITE") generator = InstagramSiteGenerator(data, output_dir, gtag_id=args.gtag_id) success = generator.generate() if success: stats = media_result["stats"] print("\n✅ PROCESS COMPLETE") print(f" Website generated at: {output_dir}") print(f" Posts processed: {data['post_count']}") print(f" Media files processed: {stats['thumbnail_count'] + stats['webp_count']}") print(f" Space saved: {stats['space_saved_mb']:.2f} MB ({stats['percentage_saved']:.1f}%)") print(f" Fixed file extensions: {stats['extension_fixes']}") return 0 else: print("\n❌ ERROR: Failed to generate website.") return 1 except Exception as e: print(f"\n❌ ERROR: {str(e)}") if args.verbose: print("\n🔍 VERBOSE: Exception traceback") traceback.print_exc(file=sys.stdout) return 1 if __name__ == "__main__": exit(main()) ================================================ FILE: memento_mori/extractor.py ================================================ # memento_mori/extractor.py import os import zipfile import tempfile import shutil from pathlib import Path from .file_mapper import InstagramFileMapper class InstagramArchiveExtractor: """ 
Class for handling the extraction and validation of Instagram data archives. This class provides methods to: - Auto-detect Instagram archive files - Extract archives to temporary or specified locations - Validate the structure of extracted content - Clean up temporary files after processing """ REQUIRED_FILES = ["profile", "posts"] def __init__(self, input_path=None, output_path=None, cleanup=True): """ Initialize the extractor with paths and options. Args: input_path (str, optional): Path to the Instagram archive (ZIP or folder) output_path (str, optional): Path where extracted content should be placed cleanup (bool): Whether to clean up temporary files after extraction """ self.input_path = input_path self.input_paths = [input_path] if input_path else [] self.output_path = output_path self.cleanup = cleanup self.temp_dir = None self.extraction_dir = None self.file_mapper = None self.file_map = {} # Maps required file types to their actual paths def auto_detect_archive(self, search_dir="."): """ Auto-detect Instagram archive files in the specified directory. Args: search_dir (str): Directory to search for Instagram archives Returns: str: Path to the detected archive or None if not found """ print(f"🔍 DETECTING INSTAGRAM ARCHIVE") print(f" Searching in: {search_dir}") # Look for ZIP files that might be Instagram archives potential_archives = [] for root, _, files in os.walk(search_dir): for file in files: if file.lower().endswith(".zip"): zip_path = os.path.join(root, file) # Check if this ZIP might be an Instagram archive if self._is_instagram_archive(zip_path): potential_archives.append(zip_path) if not potential_archives: print(" No Instagram archives found.") return None # Sort by modification time (oldest first, so newest archive is extracted last and wins on conflicts) potential_archives.sort(key=lambda x: os.path.getmtime(x)) if len(potential_archives) > 1: print(f" Found {len(potential_archives)} archives. All will be merged.") for archive in potential_archives: print(f" - {os.path.basename(archive)}") self.input_path = potential_archives[0] self.input_paths = potential_archives print(f" Selected: {os.path.basename(self.input_path)}") return self.input_path def _is_instagram_archive(self, zip_path): """ Check if a ZIP file is likely an Instagram archive. """ try: with zipfile.ZipFile(zip_path, "r") as zip_ref: namelist = zip_ref.namelist() # More flexible check - look for these directory names anywhere in the paths key_dirs = ["personal_information", "your_instagram_activity"] found_dirs = set() for name in namelist: for dir_name in key_dirs: if dir_name in name.lower(): found_dirs.add(dir_name) # If we found any of the key directories, it's probably an Instagram archive is_archive = len(found_dirs) > 0 return is_archive except Exception as e: print(f"Error examining ZIP: {str(e)}") return False def extract(self): """ Extract the Instagram archive to the specified location. Returns: str: Path to the extracted content Raises: ValueError: If no input path is specified or the file doesn't exist zipfile.BadZipFile: If the ZIP file is invalid """ if not self.input_path: raise ValueError( "No input path specified. Use auto_detect_archive() or specify input_path." 
) if not os.path.exists(self.input_path): raise ValueError(f"Input path does not exist: {self.input_path}") # Determine if input is a ZIP file or a directory if os.path.isfile(self.input_path) and self.input_path.lower().endswith(".zip"): # Create a temporary directory if no output_path is specified if not self.output_path: self.temp_dir = tempfile.mkdtemp(prefix="instagram_export_") self.extraction_dir = self.temp_dir else: self.extraction_dir = self.output_path os.makedirs(self.extraction_dir, exist_ok=True) # Extract all detected ZIP files, merging their contents for zip_path in self.input_paths: print(f"Extracting {zip_path} to {self.extraction_dir}...") self._extract_and_merge(zip_path, self.extraction_dir) else: # Input is already a directory self.extraction_dir = self.input_path # After extraction, check if there's a single directory at the top level contents = os.listdir(self.extraction_dir) if len(contents) == 1 and os.path.isdir( os.path.join(self.extraction_dir, contents[0]) ): # If so, use that as the actual extraction directory self.extraction_dir = os.path.join(self.extraction_dir, contents[0]) print( f"Found single top-level directory, using it as extraction dir: {self.extraction_dir}" ) # Now validate with the correct path if self.validate_structure(): return self.extraction_dir else: raise ValueError( "Extracted content does not appear to be a valid Instagram archive." ) def validate_structure(self): """ Validate the structure of the extracted content. """ if not self.extraction_dir or not os.path.exists(self.extraction_dir): return False # Create file mapper self.file_mapper = InstagramFileMapper(self.extraction_dir) self.file_mapper.discover_all_files() # Validate required files valid, missing_files = self.file_mapper.validate_required_files( self.REQUIRED_FILES ) if not valid: print(f"Missing required files: {', '.join(missing_files)}") return False # For backward compatibility, update self.file_map self.file_map = self.file_mapper.file_map return True def _map_important_files(self): """ Find and map important files that might be in different locations. """ for file_type, patterns in self.FILE_PATTERNS.items(): # Handle both single string patterns and lists of patterns if isinstance(patterns, str): patterns = [patterns] all_matches = [] for pattern in patterns: # Use Path.glob to find files matching each pattern matches = list(Path(self.extraction_dir).glob(pattern)) all_matches.extend(matches) if all_matches: # Store the path to the first matching file self.file_map[file_type] = str(all_matches[0]) # If multiple posts files are found, store them all if file_type == "posts" and len(all_matches) > 1: self.file_map[f"{file_type}_all"] = [ str(match) for match in all_matches ] def _extract_and_merge(self, zip_path, target_dir): """ Extract a ZIP file into target_dir, handling the case where the ZIP contains a single top-level directory by merging its contents directly. 
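Files already present in target_dir are overwritten, which is why detected archives are extracted oldest-first: the newest archive is merged last and wins on any conflicts.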
""" staging_dir = tempfile.mkdtemp(prefix="instagram_staging_") try: with zipfile.ZipFile(zip_path, "r") as zip_ref: zip_ref.extractall(staging_dir) # If the ZIP had a single top-level directory, use its contents directly contents = os.listdir(staging_dir) if len(contents) == 1 and os.path.isdir(os.path.join(staging_dir, contents[0])): source = os.path.join(staging_dir, contents[0]) else: source = staging_dir self._merge_dirs(source, target_dir) finally: shutil.rmtree(staging_dir, ignore_errors=True) def _merge_dirs(self, src, dst): """Recursively merge src directory into dst directory.""" for item in os.listdir(src): s = os.path.join(src, item) d = os.path.join(dst, item) if os.path.isdir(s): if os.path.exists(d): self._merge_dirs(s, d) else: shutil.copytree(s, d) else: shutil.copy2(s, d) def get_file_path(self, file_type): """ Get the path to an important file. Args: file_type (str): Type of file to get (e.g., "posts", "insights") Returns: str: Path to the file or None if not found """ return self.file_map.get(file_type) def cleanup_temp_files(self): """ Clean up temporary files created during extraction. """ if self.cleanup and self.temp_dir and os.path.exists(self.temp_dir): print(f"Cleaning up temporary directory: {self.temp_dir}") shutil.rmtree(self.temp_dir) self.temp_dir = None def __del__(self): """ Ensure cleanup of temporary files when the object is destroyed. """ self.cleanup_temp_files() ================================================ FILE: memento_mori/file_mapper.py ================================================ # memento_mori/file_mapper.py from pathlib import Path import os class InstagramFileMapper: """ Central class for discovering and mapping Instagram export files. Used by both Extractor and Loader to maintain consistency. """ # Define all patterns in one central location FILE_PATTERNS = { "posts": ["**/content/posts*.json", "**/media/posts*.json"], "insights": ["**/past_instagram_insights/posts.json"], "profile": [ "**/personal_information/personal_information/personal_information.json", # Double-nested (newer exports) "**/personal_information/personal_information.json", "**/account_information/personal_information.json", "**/personal_information.json", "**/*/personal_information.json" ], "location": [ "**/personal_information/information_about_you/profile_based_in.json", # Newer exports "**/information_about_you/profile_based_in.json", "**/profile_based_in.json", "**/*/profile_based_in.json", "**/account_information/profile_based_in.json", "**/personal_information/profile_based_in.json" ], "followers": [ "**/connections/followers_and_following/followers*.json", "**/followers_and_following/followers*.json", "**/followers*.json", # Search in any subdirectory "**/*/followers*.json" ], "stories": [ "**/content/stories*.json", "**/media/stories*.json", "**/your_instagram_activity/stories*.json", "**/stories*.json", "**/your_instagram_activity/stories/stories*.json", "**/your_instagram_activity/content/stories*.json", # Search in any subdirectory "**/*/stories*.json" ], # Add more patterns as needed } def __init__(self, base_dir): self.base_dir = Path(base_dir) self.file_map = {} def discover_all_files(self): """ Discover all files defined in FILE_PATTERNS. """ for file_type, patterns in self.FILE_PATTERNS.items(): self.discover_files(file_type, patterns) return self.file_map def discover_files(self, file_type, patterns=None): """ Discover files of a specific type. 
""" if patterns is None: patterns = self.FILE_PATTERNS.get(file_type, []) # Handle both single string patterns and lists of patterns if isinstance(patterns, str): patterns = [patterns] all_matches = [] for pattern in patterns: # First try exact path if it looks like one if not pattern.startswith("**"): exact_path = os.path.join(self.base_dir, pattern) if os.path.exists(exact_path): all_matches.append(Path(exact_path)) continue # Otherwise use Path.glob to find files matching pattern matches = list(self.base_dir.glob(pattern)) all_matches.extend(matches) if all_matches: # Store the path to the first matching file self.file_map[file_type] = str(all_matches[0]) # If multiple matches are found, store them all if len(all_matches) > 1: self.file_map[f"{file_type}_all"] = [ str(match) for match in all_matches ] return self.file_map.get(file_type) def get_file_path(self, file_type): """ Get the path to a specific file type. """ if file_type not in self.file_map and file_type in self.FILE_PATTERNS: # Try to discover it if not already in the map self.discover_files(file_type) return self.file_map.get(file_type) def validate_required_files(self, required_files): """ Validate that all required files exist. """ missing_files = [] for file_type in required_files: if not self.get_file_path(file_type): missing_files.append(file_type) return len(missing_files) == 0, missing_files ================================================ FILE: memento_mori/generator.py ================================================ # memento_mori/generator.py import os import json import shutil import datetime from pathlib import Path from jinja2 import Environment, FileSystemLoader from markupsafe import Markup import re import hashlib import base64 class InstagramSiteGenerator: """ Class for generating the static website from processed Instagram data. 
This class handles: - Creating HTML using templates - Copying static assets (CSS, JS) - Verifying the completeness of the output """ def __init__(self, data_package, output_dir, template_dir=None, static_dir=None, gtag_id=None): """Initialize the generator with data and path options.""" self.data_package = data_package self.output_dir = Path(output_dir) self.gtag_id = gtag_id # Store the Google tag ID # Find template directory if template_dir is None: # Try to find templates relative to this file or common locations module_dir = Path(__file__).parent template_dir = module_dir / "templates" if not template_dir.exists(): for path in [ Path("templates"), Path("./templates"), Path("../templates"), ]: if path.exists(): template_dir = path break # Find static directory if static_dir is None: module_dir = Path(__file__).parent static_dir = module_dir / "static" if not static_dir.exists(): for path in [Path("static"), Path("./static"), Path("../static")]: if path.exists(): static_dir = path break self.template_dir = Path(template_dir) self.static_dir = Path(static_dir) print(f"Using template directory: {self.template_dir}") print(f"Using static directory: {self.static_dir}") # Set up Jinja environment self.jinja_env = Environment( loader=FileSystemLoader(str(self.template_dir)), autoescape=True ) def generate(self): """Generate the complete static website and verify output.""" try: # Create output directory self.output_dir.mkdir(parents=True, exist_ok=True) # Create CSS and JS directories in output (self.output_dir / "css").mkdir(exist_ok=True) (self.output_dir / "js").mkdir(exist_ok=True) # Copy static assets self._copy_static_assets() # Generate HTML self._generate_html() # Generate stories HTML if we have stories data if "stories" in self.data_package and self.data_package["stories"]: self._generate_stories_html() print(f"Website successfully generated at {self.output_dir}") return True except Exception as e: print(f"Error generating website: {str(e)}") return False def _copy_static_assets(self): """Copy CSS and JS files to the output directory.""" # Copy CSS css_dir = self.static_dir / "css" if css_dir.exists(): for css_file in css_dir.glob("*.css"): shutil.copy2(css_file, self.output_dir / "css" / css_file.name) print(f"Copied CSS: {css_file.name}") # Copy JS js_dir = self.static_dir / "js" if js_dir.exists(): for js_file in js_dir.glob("*.js"): shutil.copy2(js_file, self.output_dir / "js" / js_file.name) print(f"Copied JS: {js_file.name}") # Ensure stories.js exists, create it if not stories_js = js_dir / "stories.js" if not stories_js.exists(): # Create a minimal stories.js file if it doesn't exist with open(stories_js, "w") as f: f.write("// Stories viewer functionality\n") print(f"Created placeholder: stories.js") # Copy stories.js to output shutil.copy2(stories_js, self.output_dir / "js" / "stories.js") print(f"Copied JS: stories.js") def _generate_html(self): """Generate HTML using templates.""" # Generate the grid HTML grid_html = self._render_grid() # Extract data for the main template profile_info = self.data_package["profile"] location_info = self.data_package.get("location", {"location": "Unknown"}) date_range = self.data_package["date_range"]["range"] post_count = self.data_package["post_count"] story_count = self.data_package.get("story_count", 0) # Get profile picture path and check for WebP version profile_picture = profile_info["profile_picture"] # Check if we have a WebP version of the profile picture if profile_picture: webp_path = re.sub(r"\.(jpg|jpeg|png|gif)$", ".webp", 
profile_picture, flags=re.I) if os.path.exists(os.path.join(self.output_dir, webp_path)): profile_picture = webp_path # Current date for footer generation_date = datetime.datetime.now().strftime("%Y-%m-%d") # Get stories data or empty dict if not available stories_data = self.data_package.get("stories", {}) # Render the main template template = self.jinja_env.get_template("index.html") html_content = template.render( username=profile_info["username"], profile_picture=profile_picture, bio=profile_info.get("bio", ""), # Pass bio to template profile=profile_info, # Pass the entire profile object date_range=date_range, post_count=post_count, story_count=story_count, has_stories=story_count > 0, # Flag to show stories link grid_html=grid_html, post_data_json=json.dumps(self.data_package["posts"], ensure_ascii=False), stories_data_json=json.dumps(stories_data, ensure_ascii=False), # Add stories data generation_date=generation_date, gtag_id=self.gtag_id, # Add Google tag ID ) # Write HTML file with open(self.output_dir / "index.html", "w", encoding="utf-8") as f: f.write(html_content) print(f"Generated HTML file: {self.output_dir / 'index.html'}") def _render_grid(self): """Render the grid HTML using the grid.html template.""" posts_data = self.data_package["posts"] lazy_after = 30 # Start lazy loading after this many posts # Check if posts_data is valid if not posts_data or not isinstance(posts_data, dict): print("Warning: No valid posts data found for grid rendering") return "" # Prepare data for the grid template grid_posts = [] for i, (timestamp, post) in enumerate(posts_data.items()): # Determine which media to use for the grid thumbnail display_media = self._get_display_media(post, i >= lazy_after) grid_posts.append( { "index": post["i"], "display_media": display_media["url"], "is_video": display_media["is_video"], "media_count": len(post["m"]), "likes": post.get("l", ""), "lazy_load": Markup(' loading="lazy"') if i >= lazy_after else "", } ) # Render grid template grid_template = self.jinja_env.get_template("grid.html") return grid_template.render(posts=grid_posts) def _get_display_media(self, post, use_lazy_loading=False): """Determine which media to use for the grid thumbnail.""" result = {"url": "", "is_video": False} if not post["m"] or len(post["m"]) == 0: return result first_media = post["m"][0] result["url"] = first_media # Check if first media is a video result["is_video"] = bool( re.search(r"\.(mp4|mov|avi|webm)$", first_media, re.I) if first_media else False ) # Check if we have a thumbnail for this media if first_media: thumb_filename = hashlib.md5(first_media.encode()).hexdigest() + ".webp" thumb_path = f"thumbnails/{thumb_filename}" if os.path.exists(os.path.join(self.output_dir, thumb_path)): # Use the thumbnail instead of the original result["url"] = thumb_path elif not result["is_video"]: # Check if we have a WebP version of the original image webp_path = re.sub( r"\.(jpg|jpeg|png|gif)$", ".webp", first_media, flags=re.I ) if os.path.exists(os.path.join(self.output_dir, webp_path)): result["url"] = webp_path # If it's a video, look for a thumbnail among all media items if ( result["is_video"] and result["url"] == first_media ): # No thumbnail found yet for media_item in post["m"]: if re.search(r"\.(jpg|jpeg|png|webp|gif)$", media_item, re.I): # Check if we have a thumbnail for this image img_thumb_filename = ( hashlib.md5(media_item.encode()).hexdigest() + ".webp" ) img_thumb_path = f"thumbnails/{img_thumb_filename}" if os.path.exists( os.path.join(self.output_dir, 
img_thumb_path) ): result["url"] = img_thumb_path break else: result["url"] = media_item break # If no thumbnail found, use a SVG placeholder if result["url"] == first_media: # Create a simple SVG with a play button svg = ( '' '' '' '' "" ) # Encode the SVG properly for use in an img src attribute result["url"] = ( "data:image/svg+xml;base64," + base64.b64encode(svg.encode()).decode() ) return result def _generate_stories_html(self): """Generate a separate HTML file for stories.""" stories_data = self.data_package.get("stories", {}) if not stories_data: print("No stories data found, skipping stories.html generation") return # Extract data for the stories template profile_info = self.data_package["profile"] date_range = self.data_package["date_range"]["range"] story_count = len(stories_data) post_count = self.data_package["post_count"] # Get profile picture path and check for WebP version profile_picture = profile_info["profile_picture"] # Check if we have a WebP version of the profile picture if profile_picture: webp_path = re.sub(r"\.(jpg|jpeg|png|gif)$", ".webp", profile_picture, flags=re.I) if os.path.exists(os.path.join(self.output_dir, webp_path)): profile_picture = webp_path # Current date for footer generation_date = datetime.datetime.now().strftime("%Y-%m-%d") # Prepare stories data for the template stories_list = [] lazy_after = 30 # Start lazy loading after this many stories for i, (timestamp, story) in enumerate(stories_data.items()): # Check for story-specific thumbnail story_thumb = story.get("story_thumb", None) if story_thumb and os.path.exists(os.path.join(self.output_dir, story_thumb)): # Use the 9:16 story thumbnail media_url = story_thumb else: # Fall back to regular thumbnail or original media display_media = self._get_display_media(story, i >= lazy_after) media_url = display_media["url"] # Determine if it's a video is_video = bool(re.search(r"\.(mp4|mov|avi|webm)$", story["m"][0], re.I)) if story["m"] else False stories_list.append({ "index": story["i"], "media": media_url, "is_video": is_video, "date": story.get("d", ""), "caption": story.get("tt", ""), "timestamp": timestamp, "lazy_load": Markup(' loading="lazy"') if i >= lazy_after else "", "original_media": story["m"][0] if story["m"] else "", # Include original media path }) # Render the stories template template = self.jinja_env.get_template("stories_page.html") html_content = template.render( username=profile_info["username"], profile_picture=profile_picture, bio=profile_info.get("bio", ""), profile=profile_info, date_range=date_range, post_count=post_count, story_count=story_count, stories=stories_list, stories_data_json=json.dumps(stories_data, ensure_ascii=False), generation_date=generation_date, gtag_id=self.gtag_id, ) # Write HTML file with open(self.output_dir / "stories.html", "w", encoding="utf-8") as f: f.write(html_content) print(f"Generated stories HTML file: {self.output_dir / 'stories.html'}") ================================================ FILE: memento_mori/loader.py ================================================ # memento_mori/loader.py import json import re import os from datetime import datetime import html from ftfy import fix_text from pathlib import Path def fix_double_encoded_utf8(text): """ Fix double-encoded UTF-8 sequences in text using ftfy. This handles cases where UTF-8 characters (like emoji) were incorrectly encoded twice. 
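For example, an emoji that was mangled into mojibake like "ðŸ˜Š" should come back as the intended character.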
""" if not isinstance(text, str): return text # Use ftfy to fix the text encoding issues return fix_text(text) class InstagramDataLoader: """ Class for loading and processing Instagram data from the exported archive. This class provides methods to: - Load JSON files (posts, insights, user data) - Parse and merge data sources - Convert timestamps and format data - Provide a clean data structure for the generator """ def __init__(self, extraction_dir, file_mapper=None, verbose=False): """ Initialize the loader with the path to the extracted data. Args: extraction_dir (str): Path to the extracted Instagram data file_mapper (InstagramFileMapper, optional): File mapper from extractor verbose (bool): Whether to print verbose debug information """ self.extraction_dir = extraction_dir self.file_mapper = file_mapper self.verbose = verbose # If no file mapper was provided, create one if self.file_mapper is None: from .file_mapper import InstagramFileMapper self.file_mapper = InstagramFileMapper(extraction_dir) self.file_mapper.discover_all_files() # Storage for loaded data self.profile_data = None self.location_data = None self.posts_data = None self.insights_data = None self.combined_data = None def load_profile_data(self): """ Load user profile data. Returns: dict: User profile information """ profile_path = self.file_mapper.get_file_path("profile") if not profile_path: print("Profile data not found") return {"username": "Unknown", "profile_picture": "", "bio": ""} try: with open(profile_path, "r", encoding="utf-8") as f: self.profile_data = json.load(f) string_map = self.profile_data["profile_user"][0]["string_map_data"] media_map = self.profile_data["profile_user"][0]["media_map_data"] profile_info = { "username": string_map["Username"]["value"], "profile_picture": "", "bio": "", "website": "", "name": "", } for key, value in media_map.items(): if key.lower() == "profile photo": profile_info["profile_picture"] = value.get("uri", "") break if "Name" in string_map: profile_info["name"] = string_map["Name"]["value"] if "Bio" in string_map: profile_info["bio"] = string_map["Bio"]["value"] if "Website" in string_map: profile_info["website"] = string_map["Website"]["value"] return profile_info except Exception as e: print(f"Error loading profile data: {str(e)}") return {"username": "Unknown", "profile_picture": ""} def load_location_data(self): """ Load user location data. Returns: dict: User location information """ location_path = self.file_mapper.get_file_path("location") if not location_path: print("Location data not found") return {"location": "Unknown"} try: with open(location_path, "r", encoding="utf-8") as f: self.location_data = json.load(f) string_map = self.location_data["inferred_data_primary_location"][0]["string_map_data"] location_value = "Unknown" for key in ["Town/city name", "City Name", "Name"]: if key in string_map: location_value = string_map[key]["value"] break return {"location": location_value} return location_info except Exception as e: print(f"Error loading location data: {str(e)}") return {"location": "Unknown"} def load_posts_data(self): """ Load posts data from one or more posts JSON files. 
Returns: list: Combined posts data from all posts files """ all_posts = [] # Check if we have multiple post files post_paths = [] if self.file_mapper.file_map.get("posts_all"): post_paths = self.file_mapper.file_map["posts_all"] elif self.file_mapper.get_file_path("posts"): post_paths = [self.file_mapper.get_file_path("posts")] if not post_paths: print("No posts data found") return [] if self.verbose: print(f"Found {len(post_paths)} posts data file(s):") for i, path in enumerate(post_paths): print(f" {i+1}. {path}") for posts_path in post_paths: try: if self.verbose: print(f"Loading posts from: {posts_path}") with open(posts_path, "r", encoding="utf-8") as f: # Read the file content first file_content = f.read() if self.verbose: print(f" File size: {len(file_content)} bytes") # Fix encoding issues with ftfy file_content = fix_text(file_content) # Parse the modified content posts_data = json.loads(file_content, strict=False) # Check if posts_data is a list (expected format) if isinstance(posts_data, list): if self.verbose: print(f" Found {len(posts_data)} posts in list format") all_posts.extend(posts_data) elif isinstance(posts_data, dict): # Some exports might have posts as a dictionary if self.verbose: print(f" Found posts in dictionary format") print(f" Dictionary keys: {', '.join(list(posts_data.keys())[:5])}...") # Try to extract a list from it if "posts" in posts_data and isinstance(posts_data["posts"], list): if self.verbose: print(f" Found {len(posts_data['posts'])} posts in 'posts' key") all_posts.extend(posts_data["posts"]) else: # Add the dict as a single item if we can't extract a list if self.verbose: print(f" No 'posts' list found, adding dictionary as a single item") all_posts.append(posts_data) else: print(f"Warning: Unexpected posts data format in {posts_path}") if self.verbose: print(f" Data type: {type(posts_data)}") except Exception as e: print(f"Error loading posts data from {posts_path}: {str(e)}") if self.verbose: import traceback traceback.print_exc() if not all_posts: print("Warning: No posts data could be loaded from any file") elif self.verbose: print(f"Successfully loaded {len(all_posts)} posts in total") self.posts_data = all_posts return all_posts def load_insights_data(self): """ Load insights data. Returns: dict: Insights data indexed by timestamp """ insights_path = self.file_mapper.get_file_path("insights") if not insights_path: print( "Warning: No insights file found. Insights data will not be available." 
) # Initialize as empty dict, not None self.insights_data = {} return {} try: with open(insights_path, "r", encoding="utf-8") as f: file_content = f.read() # Fix encoding issues file_content = fix_text(file_content) insights_raw = json.loads(file_content, strict=False) # Index insights by timestamp insights_indexed = {} # Handle different possible structures if "organic_insights_posts" in insights_raw: for insight in insights_raw.get("organic_insights_posts", []): timestamp = None # Try to get timestamp from media_map_data if "media_map_data" in insight and "Media Thumbnail" in insight["media_map_data"]: timestamp = insight["media_map_data"]["Media Thumbnail"].get("creation_timestamp") # If no timestamp yet, try other fields if not timestamp and "creation_timestamp" in insight: timestamp = insight["creation_timestamp"] if timestamp: insights_indexed[str(timestamp)] = insight else: # Try alternative structure for insight in insights_raw: if isinstance(insight, dict) and "creation_timestamp" in insight: timestamp = insight["creation_timestamp"] insights_indexed[str(timestamp)] = insight self.insights_data = insights_indexed return insights_indexed except Exception as e: print(f"Error loading insights data: {str(e)}") self.insights_data = {} return {} def combine_data(self): """ Combine posts and insights data. Returns: list: Combined data with posts and their associated insights """ if self.posts_data is None: if self.verbose: print("No posts data yet, loading posts data") self.load_posts_data() if self.insights_data is None: if self.verbose: print("No insights data yet, loading insights data") self.load_insights_data() # Ensure insights_data is a dictionary if not isinstance(self.insights_data, dict): if self.verbose: print("Warning: insights_data is not a dictionary, initializing as empty") self.insights_data = {} if self.verbose: print(f"Combining {len(self.posts_data) if self.posts_data else 0} posts with {len(self.insights_data)} insights entries") combined = [] # Create a mapping of timestamps to insights for faster lookup insights_map = {} for timestamp, insight in self.insights_data.items(): insights_map[str(timestamp)] = insight if not self.posts_data: if self.verbose: print("Warning: No posts data to combine") self.combined_data = [] return [] for post in self.posts_data: try: # Get the timestamp from the first media item timestamp = None if "media" in post and len(post["media"]) > 0 and "creation_timestamp" in post["media"][0]: timestamp = str(post["media"][0]["creation_timestamp"]) elif "creation_timestamp" in post: timestamp = str(post["creation_timestamp"]) # Find associated insights insight = insights_map.get(timestamp) if timestamp else None # Create combined entry combined.append({"post_data": post, "insights": insight}) if self.verbose and not timestamp: print(f"Warning: Post without timestamp") print(f" Post keys: {', '.join(list(post.keys())[:5])}...") if "media" in post: print(f" Media items: {len(post['media'])}") if len(post["media"]) > 0: print(f" First media keys: {', '.join(list(post['media'][0].keys())[:5])}...") except (IndexError, KeyError) as e: print(f"Error processing post: {str(e)}") if self.verbose: import traceback traceback.print_exc() print(f" Post keys: {', '.join(list(post.keys())[:5])}...") # Add post without insights combined.append({"post_data": post, "insights": None}) if self.verbose: print(f"Created {len(combined)} combined entries") self.combined_data = combined return combined def extract_relevant_data(self): """ Extract relevant data from the 
combined posts and insights data. Returns: dict: Simplified data structure with relevant information """ if self.combined_data is None: if self.verbose: print("No combined data yet, calling combine_data()") self.combine_data() # Check if combined_data is still None or empty after trying to combine if not self.combined_data: print("Warning: No post data found or could not be processed.") if self.verbose: print("combined_data is None or empty after combine_data() call") print(f"posts_data: {type(self.posts_data)}, length: {len(self.posts_data) if self.posts_data else 0}") print(f"insights_data: {type(self.insights_data)}, length: {len(self.insights_data) if self.insights_data else 0}") return {} if self.verbose: print(f"Processing {len(self.combined_data)} combined data entries") simplified_data = {} for index, item in enumerate(self.combined_data): # Initialize a new post entry with shortened keys post_entry = { "i": index, # post_index "m": [], # media "t": "", # creation_timestamp_unix "d": "", # creation_timestamp_readable "tt": "", # title "im": "", # Impressions "l": "", # Likes "c": "", # Comments } # Extract post-level data if "post_data" in item: if "creation_timestamp" in item["post_data"]: post_entry["t"] = item["post_data"]["creation_timestamp"] elif "media" in item["post_data"] and len(item["post_data"]["media"]) > 0 and "creation_timestamp" in item["post_data"]["media"][0]: # Fallback to first media item timestamp if post timestamp not available post_entry["t"] = item["post_data"]["media"][0]["creation_timestamp"] post_entry["d"] = datetime.utcfromtimestamp( post_entry["t"] ).strftime("%B %d, %Y at %I:%M %p") # Get title from post data post_title = "" # Check for title directly in post_data if "title" in item["post_data"] and item["post_data"]["title"]: post_title = item["post_data"]["title"] if isinstance(post_title, str): # Use ftfy to fix text encoding issues post_title = fix_text(post_title) # Then unescape HTML entities post_title = html.unescape(post_title) # Check for title in media items if not post_title and "media" in item["post_data"]: for media_item in item["post_data"]["media"]: if "title" in media_item and media_item["title"]: post_title = media_item["title"] if isinstance(post_title, str): post_title = fix_text(post_title) post_title = html.unescape(post_title) break # Use the first media item with a title # Extract media URIs if "media" in item["post_data"]: for media in item["post_data"]["media"]: if "uri" in media: post_entry["m"].append(media["uri"]) else: if self.verbose: print(f"Warning: Media item without URI at post index {index}") print(f" Media keys: {', '.join(list(media.keys())[:5])}...") post_entry["m"].append("") # Get insights data if available insights_title = "" if "insights" in item and item["insights"]: insights = item["insights"] # Try to get caption from insights if "string_map_data" in insights: insights_data = insights["string_map_data"] # Extract specific metrics and ensure they're integers or blank if "Impressions" in insights_data: impressions = insights_data["Impressions"].get("value", "") # Validate and convert to integer if numeric, otherwise leave blank post_entry["im"] = int(impressions) if impressions and impressions.isdigit() else "" if "Likes" in insights_data: likes = insights_data["Likes"].get("value", "") # Validate and convert to integer if numeric, otherwise leave blank post_entry["l"] = int(likes) if likes and likes.isdigit() else "" if "Comments" in insights_data: comments = insights_data["Comments"].get("value", "") # 
Validate and convert to integer if numeric, otherwise leave blank post_entry["c"] = int(comments) if comments and comments.isdigit() else "" # Try to get caption from insights if "Caption" in insights_data and insights_data["Caption"].get("value"): insights_title = insights_data["Caption"].get("value", "") if isinstance(insights_title, str): insights_title = fix_text(insights_title) insights_title = html.unescape(insights_title) # Check for title directly in insights if not insights_title and "title" in insights and insights["title"]: insights_title = insights["title"] if isinstance(insights_title, str): insights_title = fix_text(insights_title) insights_title = html.unescape(insights_title) # Check for title in media_map_data if not insights_title and "media_map_data" in insights: for media_key, media_data in insights["media_map_data"].items(): if "title" in media_data and media_data["title"]: insights_title = media_data["title"] if isinstance(insights_title, str): insights_title = fix_text(insights_title) insights_title = html.unescape(insights_title) break # Use the first media item with a title # Use the longer or non-empty title between post data and insights if post_title and insights_title: post_entry["tt"] = post_title if len(post_title) >= len(insights_title) else insights_title elif post_title: post_entry["tt"] = post_title elif insights_title: post_entry["tt"] = insights_title # Only add posts with valid timestamps if post_entry["t"]: simplified_data[post_entry["t"]] = post_entry elif self.verbose: print(f"Skipping post at index {index} due to missing timestamp") if self.verbose: print(f"Extracted {len(simplified_data)} posts with valid timestamps") # Sort by timestamp (newest first) sorted_data = dict(sorted(simplified_data.items(), key=lambda x: x[0], reverse=True)) if self.verbose and sorted_data: print(f"Posts date range: {datetime.utcfromtimestamp(int(list(sorted_data.keys())[-1])).strftime('%Y-%m-%d')} to {datetime.utcfromtimestamp(int(list(sorted_data.keys())[0])).strftime('%Y-%m-%d')}") return sorted_data def load_followers_data(self): """ Load followers data and count the number of followers. Returns: int: Number of followers """ followers_path = self.file_mapper.get_file_path("followers") if not followers_path: if self.verbose: print("Followers data not found") return 0 try: with open(followers_path, "r", encoding="utf-8") as f: file_content = f.read() # Fix encoding issues file_content = fix_text(file_content) followers_data = json.loads(file_content, strict=False) # Count the number of followers follower_count = len(followers_data) if self.verbose: print(f"Found {follower_count} followers") return follower_count except Exception as e: print(f"Error loading followers data: {str(e)}") return 0 def process_json_strings(self, data): """ Recursively process all string values in JSON data to fix encoding issues. """ if isinstance(data, dict): return {k: self.process_json_strings(v) for k, v in data.items()} elif isinstance(data, list): return [self.process_json_strings(item) for item in data] elif isinstance(data, str): # Apply all string fixes # Use ftfy to fix text encoding issues fixed = fix_text(data) # Still apply HTML unescaping after fixing encoding fixed = html.unescape(fixed) return fixed else: return data def load_stories_data(self): """ Load stories data from stories JSON files. 
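        Illustrative return shape (timestamps and paths are made up; the
        shortened keys mirror the ones used for posts):

            {
                "1577836800": {
                    "i": 0,                                   # story index
                    "m": ["media/stories/202001/clip.mp4"],   # media URIs
                    "t": 1577836800,                          # unix timestamp
                    "d": "January 01, 2020 at 12:00 AM",      # readable date
                    "tt": ""                                  # caption, if any
                }
            }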
Returns: dict: Processed stories data """ stories_path = self.file_mapper.get_file_path("stories") if not stories_path: if self.verbose: print("\n🔍 STORIES DATA SEARCH") print(" No stories file found in standard locations") print(" Checking all patterns for stories files:") for pattern in self.file_mapper.FILE_PATTERNS["stories"]: print(f" - Searching with pattern: {pattern}") matches = list(Path(self.file_mapper.base_dir).glob(pattern)) if matches: print(f" Found {len(matches)} matches:") for match in matches[:3]: # Show first 3 print(f" • {match}") if len(matches) > 3: print(f" • ... and {len(matches)-3} more") else: print(f" No matches found") # Try a more aggressive search print("\n Performing deep search for any files containing 'stories':") for root, dirs, files in os.walk(self.file_mapper.base_dir): for file in files: if 'stories' in file.lower() and file.endswith('.json'): print(f" • Found potential stories file: {os.path.join(root, file)}") return {} try: if self.verbose: print(f"\n🔍 STORIES DATA LOADING") print(f" Found stories file: {stories_path}") file_size = os.path.getsize(stories_path) print(f" File size: {file_size} bytes") with open(stories_path, "r", encoding="utf-8") as f: file_content = f.read() # Fix encoding issues file_content = fix_text(file_content) if self.verbose: print(f" Parsing JSON content...") stories_data = json.loads(file_content, strict=False) if self.verbose: print(f" JSON parsed successfully") if isinstance(stories_data, dict): print(f" Data structure: Dictionary with {len(stories_data)} keys") print(f" Top-level keys: {', '.join(list(stories_data.keys())[:5])}") elif isinstance(stories_data, list): print(f" Data structure: List with {len(stories_data)} items") else: print(f" Data structure: {type(stories_data)}") # Process stories data similar to posts simplified_stories = {} # Handle different possible structures stories_list = [] if self.verbose: print(f"\n Extracting stories list from data structure...") # Check for "ig_stories" key specifically if isinstance(stories_data, dict) and "ig_stories" in stories_data: stories_list = stories_data["ig_stories"] if self.verbose: print(f" Found stories in 'ig_stories' key: {len(stories_list)} items") # Also keep the existing checks for other formats elif isinstance(stories_data, list): stories_list = stories_data if self.verbose: print(f" Using top-level list with {len(stories_list)} items") elif isinstance(stories_data, dict): # Try different possible keys where stories might be stored possible_keys = ["stories", "story_activities", "story_media", "items"] for key in possible_keys: if key in stories_data and isinstance(stories_data[key], list): stories_list = stories_data[key] if self.verbose: print(f" Found stories in '{key}' key: {len(stories_list)} items") break if not stories_list and self.verbose: print(f" Could not find stories list in dictionary keys") print(f" Available keys: {', '.join(list(stories_data.keys()))}") if self.verbose: print(f"\n Processing {len(stories_list)} stories...") for index, story in enumerate(stories_list): # Initialize a new story entry with shortened keys story_entry = { "i": index, # story_index "m": [], # media "t": "", # creation_timestamp_unix "d": "", # creation_timestamp_readable "tt": "", # title/caption } if self.verbose and index < 3: # Only show details for first 3 stories print(f"\n Story #{index+1}:") if isinstance(story, dict): print(f" Keys: {', '.join(list(story.keys())[:10])}") # Extract timestamp timestamp_found = False if isinstance(story, dict): # Try 
different possible timestamp fields timestamp_fields = ["creation_timestamp", "taken_at", "timestamp"] for field in timestamp_fields: if field in story and story[field]: story_entry["t"] = int(story[field]) timestamp_found = True if self.verbose and index < 3: print(f" Timestamp found in '{field}': {story_entry['t']}") break # Try media items if no timestamp at story level if not timestamp_found and "media" in story and isinstance(story["media"], list) and len(story["media"]) > 0: for media_item in story["media"]: if isinstance(media_item, dict): for field in timestamp_fields: if field in media_item and media_item[field]: story_entry["t"] = int(media_item[field]) timestamp_found = True if self.verbose and index < 3: print(f" Timestamp found in media item '{field}': {story_entry['t']}") break if timestamp_found: break # Format date if timestamp found if story_entry["t"]: story_entry["d"] = datetime.utcfromtimestamp( int(story_entry["t"]) ).strftime("%B %d, %Y at %I:%M %p") if self.verbose and index < 3: print(f" Formatted date: {story_entry['d']}") # Extract caption/title caption_found = False if isinstance(story, dict): # Try different possible caption fields caption_fields = ["caption", "title", "text"] for field in caption_fields: if field in story and story[field]: story_entry["tt"] = story[field] caption_found = True if self.verbose and index < 3: print(f" Caption found in '{field}': {story_entry['tt'][:30]}...") break # Try string_map_data if no caption found directly if not caption_found and "string_map_data" in story and isinstance(story["string_map_data"], dict): string_map = story["string_map_data"] caption_keys = ["Caption", "Text", "Story Text"] for key in caption_keys: if key in string_map and isinstance(string_map[key], dict) and "value" in string_map[key]: story_entry["tt"] = string_map[key]["value"] caption_found = True if self.verbose and index < 3: print(f" Caption found in string_map_data['{key}']: {story_entry['tt'][:30]}...") break # Extract media URIs media_found = False if isinstance(story, dict): # Try direct URI field if "uri" in story and story["uri"]: story_entry["m"].append(story["uri"]) media_found = True if self.verbose and index < 3: print(f" Media found directly in 'uri': {story_entry['m'][0]}") # Try media list if "media" in story and isinstance(story["media"], list): for media_item in story["media"]: if isinstance(media_item, dict) and "uri" in media_item and media_item["uri"]: story_entry["m"].append(media_item["uri"]) media_found = True if self.verbose and index < 3 and len(story_entry["m"]) <= 3: print(f" Media found in media list: {media_item['uri']}") # Try media_map_data if not media_found and "media_map_data" in story and isinstance(story["media_map_data"], dict): for key, media_item in story["media_map_data"].items(): if isinstance(media_item, dict) and "uri" in media_item and media_item["uri"]: story_entry["m"].append(media_item["uri"]) media_found = True if self.verbose and index < 3 and len(story_entry["m"]) <= 3: print(f" Media found in media_map_data['{key}']: {media_item['uri']}") # Only add stories with valid timestamps and media if story_entry["t"] and story_entry["m"]: simplified_stories[str(story_entry["t"])] = story_entry if self.verbose and index < 3: print(f" ✓ Story added with timestamp {story_entry['t']} and {len(story_entry['m'])} media items") elif self.verbose and index < 3: if not story_entry["t"]: print(f" ✗ Story skipped: No timestamp found") if not story_entry["m"]: print(f" ✗ Story skipped: No media found") if self.verbose: 
print(f"\n Extracted {len(simplified_stories)} valid stories from {len(stories_list)} total") # Sort by timestamp (newest first) sorted_stories = dict(sorted(simplified_stories.items(), key=lambda x: int(x[0]), reverse=True)) if self.verbose and sorted_stories: newest = datetime.utcfromtimestamp(int(list(sorted_stories.keys())[0])).strftime('%Y-%m-%d') oldest = datetime.utcfromtimestamp(int(list(sorted_stories.keys())[-1])).strftime('%Y-%m-%d') print(f" Stories date range: {oldest} to {newest}") return sorted_stories except Exception as e: print(f"Error loading stories data: {str(e)}") if self.verbose: import traceback traceback.print_exc() return {} def load_all_data(self): """ Load all data and return a comprehensive data package. Returns: dict: Data package containing all processed data """ profile_info = self.load_profile_data() location_info = self.load_location_data() posts_data = self.extract_relevant_data() stories_data = self.load_stories_data() follower_count = self.load_followers_data() # Add follower count to profile info profile_info["follower_count"] = follower_count # Process all string values to fix encoding issues profile_info = self.process_json_strings(profile_info) location_info = self.process_json_strings(location_info) posts_data = self.process_json_strings(posts_data) stories_data = self.process_json_strings(stories_data) # Get date range for display if posts_data and isinstance(posts_data, dict) and len(posts_data) > 0: keys = list(posts_data.keys()) first_key = keys[0] # Newest post last_key = keys[-1] # Oldest post # Format timestamps newest_post_date = datetime.utcfromtimestamp(int(first_key)).strftime( "%B %Y" ) oldest_post_date = datetime.utcfromtimestamp(int(last_key)).strftime( "%B %Y" ) date_range = { "newest": newest_post_date, "oldest": oldest_post_date, "range": f"{oldest_post_date} - {newest_post_date}", } else: date_range = {"newest": "Unknown", "oldest": "Unknown", "range": "Unknown"} # If no posts data, create an empty dict to avoid NoneType errors if not isinstance(posts_data, dict): posts_data = {} return { "profile": profile_info, "location": location_info, "posts": posts_data, "stories": stories_data, "date_range": date_range, "post_count": len(posts_data), "story_count": len(stories_data), } ================================================ FILE: memento_mori/media.py ================================================ # memento_mori/media.py import os import shutil import hashlib import base64 import re import mimetypes import magic # python-magic library from pathlib import Path from PIL import Image from concurrent.futures import ThreadPoolExecutor import multiprocessing from tqdm import tqdm class InstagramMediaProcessor: """ Class for processing Instagram media files. 
This class handles: - Converting images to WebP format - Generating thumbnails for images and videos - Copying media files to the output directory """ def __init__(self, extraction_dir, output_dir, thread_count=None, quality=70, max_dimension=1200): """Initialize the media processor with paths and options.""" self.extraction_dir = Path(extraction_dir) self.output_dir = Path(output_dir) self.thread_count = thread_count or max(1, multiprocessing.cpu_count() - 1) self.quality = quality # Store the quality setting self.max_dimension = max_dimension # Maximum dimension for resizing # Create output directories self.media_dirs = [ self.output_dir / "media", self.output_dir / "media" / "posts", self.output_dir / "media" / "other", self.output_dir / "thumbnails", ] for directory in self.media_dirs: directory.mkdir(parents=True, exist_ok=True) # Statistics self.thumbnail_count = 0 self.webp_count = 0 self.total_size_original = 0 self.total_size_webp = 0 # Initialize filename mapping self.filename_map = {} # Build a basename -> [Path, ...] index for fallback file lookup self.file_index = self._build_file_index() def _build_file_index(self): """ Walk extraction_dir and build a basename -> [Path, ...] index. Warns about any filename collisions found. """ index = {} for path in self.extraction_dir.rglob("*"): if not path.is_file(): continue name = path.name if name not in index: index[name] = [] index[name].append(path) collisions = {name: paths for name, paths in index.items() if len(paths) > 1} if collisions: print(f"\n⚠️ WARNING: {len(collisions)} duplicate filename(s) found across archive") print(" Fallback lookup will use the first match:") for name, paths in collisions.items(): print(f" {name} ({len(paths)} copies):") for p in paths: print(f" {p.relative_to(self.extraction_dir)}") return index def shorten_filename(self, original_path): """ Create a shortened version of a filename while preserving extension. 
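        The new name is the first 8 hex characters of the MD5 of the original
        basename plus the original extension (lowercased), with the parent
        directory kept intact. For example (hash value hypothetical):

            "media/posts/202001/IMG_1234_original_export_name.jpg"
                -> "media/posts/202001/a1b2c3d4.jpg"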
Args: original_path (str): Original file path Returns: str: Shortened file path """ if not original_path or not isinstance(original_path, str): return original_path # Skip if it's already a data URI if original_path.startswith('data:'): return original_path # Check if we already have a mapping for this path if original_path in self.filename_map: return self.filename_map[original_path] # Parse the path path_obj = Path(original_path) parent_dir = path_obj.parent filename = path_obj.name extension = path_obj.suffix.lower() # Create a hash of the original filename filename_hash = hashlib.md5(filename.encode()).hexdigest()[:8] # Use first 8 chars of hash # Create new filename: hash + original extension new_filename = f"{filename_hash}{extension}" # Create the new path if parent_dir == Path('.'): new_path = new_filename else: new_path = str(parent_dir / new_filename) # Store the mapping self.filename_map[original_path] = new_path return new_path def process_media_files(self, post_data, profile_picture, stories_data=None): """Process all media files from posts, stories, and profile picture.""" # First, fix any incorrect file extensions in the extraction directory print("Checking and fixing file extensions...") extension_stats = self.fix_file_extensions(self.extraction_dir) print(f"Fixed {extension_stats['fixed']} files with incorrect extensions") # Create a path mapping for quick lookups path_mapping = extension_stats.get("path_mapping", {}) # Update profile picture path if it was fixed if profile_picture in path_mapping: profile_picture = path_mapping[profile_picture] # Process profile picture and get shortened path (only if profile picture exists) shortened_profile = "" if profile_picture and profile_picture.strip(): # Check if the profile picture file actually exists profile_path = Path(self.extraction_dir) / profile_picture if profile_path.exists() and profile_path.is_file(): shortened_profile = self.shorten_filename(profile_picture) self.copy_file_to_distribution(profile_picture) self.generate_thumbnail(profile_picture, shortened_profile) else: print(f"Warning: Profile picture not found or is not a file: {profile_picture}") else: print("Warning: No profile picture specified in data") # Collect all media files to process all_media = [] story_media = [] # Separate list for story media # Create a deep copy of post_data to modify updated_post_data = {} for timestamp, post in post_data.items(): # Create a copy of the post updated_post = post.copy() updated_media = [] for media_url in post["m"]: # Check if this media URL was fixed if str(self.extraction_dir / media_url) in path_mapping: # Get the new path relative to extraction_dir new_full_path = path_mapping[str(self.extraction_dir / media_url)] media_url = str(Path(new_full_path).relative_to(self.extraction_dir)) # Add to processing list all_media.append(media_url) # Get shortened path shortened_url = self.shorten_filename(media_url) updated_media.append(shortened_url) # Update the post with shortened media URLs updated_post["m"] = updated_media updated_post_data[timestamp] = updated_post # Process stories data if provided updated_stories_data = {} story_thumbnails = {} # Store story thumbnail paths if stories_data: for timestamp, story in stories_data.items(): # Create a copy of the story updated_story = story.copy() updated_media = [] for media_url in story["m"]: # Check if this media URL was fixed if str(self.extraction_dir / media_url) in path_mapping: # Get the new path relative to extraction_dir new_full_path = 
path_mapping[str(self.extraction_dir / media_url)] media_url = str(Path(new_full_path).relative_to(self.extraction_dir)) # Add to processing list all_media.append(media_url) story_media.append(media_url) # Also add to story-specific list # Get shortened path shortened_url = self.shorten_filename(media_url) updated_media.append(shortened_url) # Update the story with shortened media URLs updated_story["m"] = updated_media updated_stories_data[timestamp] = updated_story total_media = len(all_media) print( f"Processing {total_media} media files using {self.thread_count} threads..." ) # Process media files in parallel using ThreadPoolExecutor with tqdm with ThreadPoolExecutor(max_workers=self.thread_count) as executor: list( tqdm( executor.map(self.copy_file_to_distribution, all_media), total=total_media, desc="Processing media files", unit="files", ) ) # Generate story thumbnails with 9:16 aspect ratio if story_media: print(f"Generating {len(story_media)} story thumbnails with 9:16 aspect ratio...") # Process story thumbnails and collect results with ThreadPoolExecutor(max_workers=self.thread_count) as executor: story_thumb_results = list( tqdm( executor.map( lambda media_url: ( media_url, self.generate_story_thumbnail( self.extraction_dir / media_url, self.shorten_filename(media_url) ) ), story_media ), total=len(story_media), desc="Processing story thumbnails", unit="files", ) ) # Create a mapping of original media to thumbnail paths for media_url, thumb_path in story_thumb_results: if thumb_path: # Store relative path from output directory rel_thumb_path = str(Path(thumb_path).relative_to(self.output_dir)) story_thumbnails[self.shorten_filename(media_url)] = rel_thumb_path # Add thumbnail paths to story data for timestamp, story in updated_stories_data.items(): for i, media_url in enumerate(story["m"]): if media_url in story_thumbnails: # If this is the first media item, add the thumbnail path to the story if i == 0: story["story_thumb"] = story_thumbnails[media_url] # Calculate space savings self._calculate_space_savings(post_data) # Return updated post data and statistics return { "updated_post_data": updated_post_data, "updated_stories_data": updated_stories_data, "shortened_profile": shortened_profile, "stats": { "thumbnail_count": self.thumbnail_count, "webp_count": self.webp_count, "total_size_original": self.total_size_original, "total_size_webp": self.total_size_webp, "space_saved_mb": (self.total_size_original - self.total_size_webp) / (1024 * 1024), "percentage_saved": ( (self.total_size_original - self.total_size_webp) / self.total_size_original * 100 if self.total_size_original > 0 else 0 ), "extension_fixes": extension_stats["fixed"], } } def _calculate_space_savings(self, post_data): """Calculate space savings from WebP conversion and other optimizations.""" # Count thumbnails if (self.output_dir / "thumbnails").exists(): self.thumbnail_count = len( list((self.output_dir / "thumbnails").glob("*.webp")) ) # Calculate total size of original files and their optimized versions self.total_size_original = 0 self.total_size_webp = 0 # Track all media files that were processed processed_files = set() # First, add all media from posts for timestamp, post in post_data.items(): for media_url in post["m"]: processed_files.add(media_url) # Process each file to calculate size differences for media_url in processed_files: # Get the original file path original_path = Path(self.extraction_dir) / media_url # Get the shortened filename shortened_url = self.shorten_filename(media_url) # Check if 
the original exists if original_path.exists(): original_size = original_path.stat().st_size self.total_size_original += original_size # Check for WebP version first webp_path = self.output_dir / (shortened_url.rsplit('.', 1)[0] + '.webp') if webp_path.exists(): # WebP version exists webp_size = webp_path.stat().st_size self.total_size_webp += webp_size self.webp_count += 1 else: # No WebP, check for the original format in output dest_path = self.output_dir / shortened_url if dest_path.exists(): dest_size = dest_path.stat().st_size self.total_size_webp += dest_size def copy_file_to_distribution(self, file_path, quiet=True): """Copy a file to distribution, optionally converting images to WebP and generating thumbnails.""" # Skip if it's already a data URI if str(file_path).startswith("data:image"): return True source = Path(self.extraction_dir) / file_path # Create shortened filename shortened_path = self.shorten_filename(file_path) destination = Path(self.output_dir) / shortened_path # Create directory structure if it doesn't exist destination.parent.mkdir(parents=True, exist_ok=True) # Check if source file exists; fall back to basename index if not found if not source.exists(): basename = Path(file_path).name candidates = self.file_index.get(basename, []) if candidates: if len(candidates) > 1 and not quiet: print(f"Warning: '{basename}' matched {len(candidates)} files in archive" f" — using {candidates[0].relative_to(self.extraction_dir)}") source = candidates[0] else: if not quiet: print(f"Warning: Source file not found anywhere in archive: {file_path}") return False # Check if it's an image or video is_image = bool(re.search(r"\.(jpg|jpeg|png|gif)$", str(file_path), re.I)) is_video = bool(re.search(r"\.(mp4|mov|avi|webm)$", str(file_path), re.I)) if is_image: # Convert image to WebP for better compression webp_destination = Path( str(destination).replace(destination.suffix, ".webp") ) self.convert_to_webp(source, webp_destination, quiet) # Generate thumbnail self.generate_thumbnail(source, shortened_path, quiet) return True else: # Copy the file as is (for videos and other file types) shutil.copy2(source, destination) # Generate thumbnail for videos if is_video: self.generate_thumbnail(source, shortened_path, quiet) return True def convert_to_webp(self, source_path, destination_path, quiet=False): """Convert an image to WebP format if it results in a smaller file.""" try: # Open the image with Image.open(source_path) as img: # Get original dimensions original_width, original_height = img.size # Resize if the image is larger than the maximum dimension if original_width > self.max_dimension or original_height > self.max_dimension: # Calculate the scaling factor scale = self.max_dimension / max(original_width, original_height) new_width = int(original_width * scale) new_height = int(original_height * scale) # Resize the image img = img.resize((new_width, new_height), Image.LANCZOS) if not quiet: print(f"Resized image from {original_width}x{original_height} to {new_width}x{new_height}") # Handle transparency if img.mode in ("RGBA", "LA") or ( img.mode == "P" and "transparency" in img.info ): if img.mode != "RGBA": img = img.convert("RGBA") else: img = img.convert("RGB") # Save as WebP with the configured quality and method=6 for better compression img.save(destination_path, "WEBP", quality=self.quality, method=6) # Check if the WebP file is actually smaller original_size = source_path.stat().st_size webp_size = destination_path.stat().st_size if webp_size > 0 and webp_size < original_size: 
if not quiet: print( f"Converted to WebP: {source_path} (saved {(original_size - webp_size) / 1024:.2f} KB)" ) return True else: # If WebP is larger, use the original file if destination_path.exists(): destination_path.unlink() # Copy with original extension original_ext = source_path.suffix original_destination = Path( str(destination_path).replace(".webp", original_ext) ) shutil.copy2(source_path, original_destination) if not quiet: print(f"WebP larger than original, using original: {source_path}") return False except Exception as e: if not quiet: print(f"Error converting to WebP: {str(e)}") # Fall back to copying the original file original_ext = source_path.suffix original_destination = Path( str(destination_path).replace(".webp", original_ext) ) original_destination.parent.mkdir(parents=True, exist_ok=True) shutil.copy2(source_path, original_destination) return False def fix_file_extensions(self, directory_path): """ Scan a directory for files with incorrect extensions and fix them. Particularly focuses on media files like HEIC files that are actually JPEGs. Args: directory_path (str or Path): Directory to scan for files with incorrect extensions Returns: dict: Statistics about the fixed files """ directory_path = Path(directory_path) stats = { "total_checked": 0, "fixed": 0, "already_correct": 0, "errors": 0, "fixed_files": [] } # Make sure we have the mime-type libraries try: # Try the libmagic binding first (common on Linux/Mac) mime = magic.Magic(mime=True) except (TypeError, AttributeError): try: # Try the alternative API (common in some python-magic implementations) mime = magic.open(magic.MAGIC_MIME_TYPE) mime.load() except (AttributeError, TypeError): print("Error initializing python-magic. Please install it:") print("pip install python-magic") if os.name == "nt": # Windows print("Windows users also need to install the binary from: https://github.com/ahupp/python-magic#windows") return stats # Mapping of MIME types to extensions mime_to_ext = { "image/jpeg": ".jpg", "image/png": ".png", "image/gif": ".gif", "image/webp": ".webp", "image/heic": ".heic", "video/mp4": ".mp4", "video/quicktime": ".mov", "video/webm": ".webm" } # List of media MIME type prefixes we want to process media_mime_prefixes = ["image/", "video/"] print(f"Scanning {directory_path} for files with incorrect extensions...") # Find all files recursively for file_path in tqdm(list(directory_path.glob("**/*.*")), desc="Checking files"): stats["total_checked"] += 1 try: # Skip directories if file_path.is_dir(): continue # Skip non-media files based on extension current_ext = file_path.suffix.lower() if current_ext in ['.json', '.txt', '.srt', '.csv', '.html', '.xml', '.md']: stats["already_correct"] += 1 continue # Get the current extension and mime type # Handle different magic library interfaces try: # First approach (libmagic binding) file_mime = mime.from_file(str(file_path)) except AttributeError: # Second approach (alternative API) file_mime = mime.file(str(file_path)) # Skip if not a media file if not any(file_mime.startswith(prefix) for prefix in media_mime_prefixes): stats["already_correct"] += 1 continue # Get the correct extension for this mime type correct_ext = mime_to_ext.get(file_mime) if correct_ext is None: # If we don't have a mapping for this mime type, use mimetypes correct_ext = mimetypes.guess_extension(file_mime) or current_ext # If extensions don't match, rename the file if correct_ext != current_ext: new_path = file_path.with_suffix(correct_ext) # Avoid overwriting existing files counter = 
1 while new_path.exists(): new_stem = f"{file_path.stem}_{counter}" new_path = file_path.with_stem(new_stem).with_suffix(correct_ext) counter += 1 # copy the file with the new extension # leave the old file in place so if we run the program again, path_mapping is created properly shutil.copy(file_path, new_path) stats["fixed"] += 1 stats["fixed_files"].append({ "old_path": str(file_path), "new_path": str(new_path), "old_type": current_ext, "new_type": correct_ext }) # Don't print each fixed file to keep output clean else: stats["already_correct"] += 1 except Exception as e: print(f"Error processing {file_path}: {str(e)}") stats["errors"] += 1 # Print summary if stats["fixed"] > 0: print(f"\n🔧 EXTENSION CORRECTION") print(f" Fixed {stats['fixed']} media files with incorrect extensions") # Group fixes by type for a cleaner summary fixes_by_type = {} for item in stats["fixed_files"]: key = f"{item['old_type']} → {item['new_type']}" if key not in fixes_by_type: fixes_by_type[key] = 0 fixes_by_type[key] += 1 # Print summary by type for fix_type, count in fixes_by_type.items(): print(f" • {count} files: {fix_type}") else: print(f"\n✓ All {stats['already_correct']} media files had correct extensions") if stats["errors"] > 0: print(f" ⚠️ Errors: {stats['errors']}") # Add a path mapping to the stats stats["path_mapping"] = {item["old_path"]: item["new_path"] for item in stats["fixed_files"]} return stats def generate_thumbnail(self, source_path, relative_path, quiet=False): """Generate a thumbnail for an image or video file.""" # Ensure source_path is a Path object source_path = ( Path(source_path) if not isinstance(source_path, Path) else source_path ) # Create thumbnails directory thumbs_dir = self.output_dir / "thumbnails" thumbs_dir.mkdir(parents=True, exist_ok=True) # Generate unique filename for the thumbnail thumb_filename = hashlib.md5(str(relative_path).encode()).hexdigest() + ".webp" thumb_path = thumbs_dir / thumb_filename # Skip if thumbnail already exists if thumb_path.exists(): return thumb_path # Target dimensions for square thumbnail target_width = 292 target_height = 292 try: # Check if file exists if not source_path.exists(): if not quiet: print(f"File not found: {source_path}") return None # Determine if it's a video is_video = bool(re.search(r"\.(mp4|mov|avi|webm)$", str(source_path), re.I)) if is_video: # Try using OpenCV for video thumbnail try: import cv2 video = cv2.VideoCapture(str(source_path)) if not video.isOpened(): raise Exception(f"Could not open video: {source_path}") # Get video properties total_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT)) fps = video.get(cv2.CAP_PROP_FPS) # Get frame from 1 second in or middle of video target_frame = ( min(int(fps), total_frames // 2) if fps > 0 else total_frames // 2 ) video.set(cv2.CAP_PROP_POS_FRAMES, target_frame) success, frame = video.read() video.release() if not success: raise Exception(f"Failed to extract frame from video") # Convert to RGB and create PIL image frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) img = Image.fromarray(frame_rgb) except (ImportError, Exception) as e: if not quiet: print(f"Video thumbnail error: {str(e)}") # Create video placeholder if extraction fails svg = ( '' '' '' '' "" ) return None else: # For images, use PIL img = Image.open(source_path) # Get original dimensions original_width, original_height = img.size # Calculate dimensions for cropping to 1:1 aspect ratio (center crop) if original_width > original_height: src_x = (original_width - original_height) // 2 src_y = 0 src_w = 
original_height src_h = original_height else: src_x = 0 src_y = (original_height - original_width) // 2 src_w = original_width src_h = original_width # Crop and resize img = img.crop((src_x, src_y, src_x + src_w, src_y + src_h)) img = img.resize((target_width, target_height), Image.LANCZOS) # Save as WebP img.save(thumb_path, "WEBP", quality=80) return thumb_path except Exception as e: if not quiet: print(f"Error generating thumbnail: {str(e)}") return None def generate_story_thumbnail(self, source_path, relative_path, quiet=False): """Generate a 9:16 aspect ratio thumbnail for a story.""" # Ensure source_path is a Path object source_path = Path(source_path) if not isinstance(source_path, Path) else source_path # Create thumbnails directory thumbs_dir = self.output_dir / "thumbnails" / "stories" thumbs_dir.mkdir(parents=True, exist_ok=True) # Generate unique filename for the thumbnail thumb_filename = hashlib.md5(str(relative_path).encode()).hexdigest() + ".webp" thumb_path = thumbs_dir / thumb_filename # Skip if thumbnail already exists if thumb_path.exists(): return thumb_path # Target dimensions for 9:16 aspect ratio thumbnail target_width = 270 # Keeping similar width as square thumbnails target_height = 480 # 9:16 ratio (270 * 16/9) try: # Check if file exists if not source_path.exists(): if not quiet: print(f"File not found: {source_path}") return None # Determine if it's a video is_video = bool(re.search(r"\.(mp4|mov|avi|webm)$", str(source_path), re.I)) if is_video: # Try using OpenCV for video thumbnail try: import cv2 video = cv2.VideoCapture(str(source_path)) if not video.isOpened(): raise Exception(f"Could not open video: {source_path}") # Get video properties total_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT)) fps = video.get(cv2.CAP_PROP_FPS) # Get frame from 1 second in or middle of video target_frame = ( min(int(fps), total_frames // 2) if fps > 0 else total_frames // 2 ) video.set(cv2.CAP_PROP_POS_FRAMES, target_frame) success, frame = video.read() video.release() if not success: raise Exception(f"Failed to extract frame from video") # Convert to RGB and create PIL image frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) img = Image.fromarray(frame_rgb) except (ImportError, Exception) as e: if not quiet: print(f"Video thumbnail error: {str(e)}") return None else: # For images, use PIL img = Image.open(source_path) # Get original dimensions original_width, original_height = img.size # Calculate dimensions for cropping to 9:16 aspect ratio (center crop) target_ratio = 9 / 16 original_ratio = original_width / original_height if original_ratio > target_ratio: # Image is wider than 9:16 # Crop width to match 9:16 new_width = int(original_height * target_ratio) src_x = (original_width - new_width) // 2 src_y = 0 src_w = new_width src_h = original_height else: # Image is taller than 9:16 # Crop height to match 9:16 new_height = int(original_width / target_ratio) src_x = 0 src_y = (original_height - new_height) // 2 src_w = original_width src_h = new_height # Crop and resize img = img.crop((src_x, src_y, src_x + src_w, src_y + src_h)) img = img.resize((target_width, target_height), Image.LANCZOS) # Save as WebP img.save(thumb_path, "WEBP", quality=80) return thumb_path except Exception as e: if not quiet: print(f"Error generating story thumbnail: {str(e)}") return None ================================================ FILE: memento_mori/static/css/style.css ================================================ /* memento_mori/static/css/style.css */ :root { --instagram-bg: #fafafa; 
--instagram-border: #dbdbdb; --instagram-text: #262626; --instagram-link: #0095f6; --header-height: 60px; } * { margin: 0; padding: 0; box-sizing: border-box; } body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; background-color: var(--instagram-bg); color: var(--instagram-text); line-height: 1.5; } header { position: fixed; top: 0; left: 0; right: 0; height: var(--header-height); background-color: white; border-bottom: 1px solid var(--instagram-border); display: flex; align-items: center; justify-content: center; padding: 0 20px; z-index: 100; } .header-content { max-width: 975px; width: 100%; display: flex; justify-content: space-between; align-items: center; } .logo { font-size: 24px; font-weight: bold; color: var(--instagram-text); text-decoration: none; } .date-range-header { color: #8e8e8e; font-size: 14px; margin-left: 15px; } main { max-width: 975px; margin: calc(var(--header-height) + 30px) auto 30px; padding: 0 20px; } .profile-info { display: flex; align-items: center; margin-bottom: 30px; } .profile-picture { width: 150px; height: 150px; border-radius: 50%; object-fit: cover; margin-right: 30px; background-color: #eee; display: flex; align-items: center; justify-content: center; font-size: 36px; color: #aaa; } .profile-details h1 { font-size: 28px; font-weight: 300; margin-bottom: 15px; } .profile-details .bio { margin: 0 0 10px 0; max-width: 600px; line-height: 1.4; } .profile-details .website { margin: 0 0 15px 0; font-size: 14px; } .profile-details .website a { color: var(--instagram-link); text-decoration: none; } .profile-details .website a:hover { text-decoration: underline; } .stats { display: flex; margin-bottom: 15px; font-size: 16px; } .stat { margin-right: 40px; } .stat-count { font-weight: 600; } .posts-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 28px; } .grid-item { position: relative; aspect-ratio: 1/1; cursor: pointer; overflow: hidden; } .grid-item img, .grid-item video { width: 100%; height: 100%; object-fit: cover; transition: transform 0.3s ease; aspect-ratio: 1/1; } .grid-item:hover img, .grid-item:hover video { transform: scale(1.05); } .multi-indicator { position: absolute; top: 10px; right: 10px; color: white; background-color: rgba(0, 0, 0, 0.6); padding: 3px 8px; border-radius: 4px; font-size: 12px; z-index: 2; } .video-indicator { position: absolute; top: 10px; right: 10px; color: white; background-color: rgba(0, 0, 0, 0.7); padding: 4px 10px; border-radius: 4px; font-size: 12px; font-weight: bold; z-index: 2; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3); } .post-modal { display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100%; background-color: rgba(0, 0, 0, 0.9); z-index: 1000; overflow-y: auto; } .post-modal-content { display: flex; max-width: 1200px; margin: 30px auto; background-color: white; height: calc(100vh - 60px); max-height: 800px; border-radius: 4px; overflow: hidden; position: relative; } .post-media { flex: 1; background-color: black; position: relative; min-width: 0; display: flex; align-items: center; justify-content: center; } .post-media img, .post-media video { max-width: 100%; max-height: 100%; object-fit: contain; } .post-info { width: 340px; border-left: 1px solid var(--instagram-border); display: flex; flex-direction: column; } .post-header { padding: 16px; border-bottom: 1px solid var(--instagram-border); display: flex; align-items: center; } .post-user { width: 32px; height: 32px; border-radius: 50%; margin-right: 12px; display: flex; 
align-items: center; justify-content: center; background-color: #eee; font-size: 14px; color: #aaa; } .post-username { font-weight: 600; flex-grow: 1; } .share-button { cursor: pointer; padding: 5px; border-radius: 50%; display: flex; align-items: center; justify-content: center; transition: background-color 0.2s; } .share-button:hover { background-color: rgba(0, 0, 0, 0.1); } .share-button svg { width: 18px; height: 18px; color: #8e8e8e; } .post-caption { padding: 16px; flex-grow: 1; overflow-y: auto; } .post-date { padding: 16px; color: #8e8e8e; font-size: 12px; border-top: 1px solid var(--instagram-border); } .post-stats { padding: 12px 16px; color: var(--instagram-text); font-size: 14px; border-top: 1px solid var(--instagram-border); display: flex; gap: 16px; } .post-stat { display: flex; align-items: center; gap: 6px; } .post-stat-icon { font-size: 16px; } .likes-indicator { position: absolute; bottom: 10px; left: 10px; color: white; background-color: rgba(0, 0, 0, 0.7); padding: 4px 10px; border-radius: 4px; font-size: 12px; font-weight: bold; z-index: 2; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3); } .close-modal { position: absolute; top: 20px; right: 20px; color: white; font-size: 30px; cursor: pointer; z-index: 1001; } .modal-nav { position: absolute; top: 50%; transform: translateY(-50%); color: white; font-size: 30px; cursor: pointer; z-index: 1001; background-color: rgba(0, 0, 0, 0.5); width: 40px; height: 40px; border-radius: 50%; display: flex; align-items: center; justify-content: center; } .modal-prev { left: 20px; } .modal-next { right: 20px; } .slideshow-nav { position: absolute; top: 50%; transform: translateY(-50%); color: white; font-size: 30px; cursor: pointer; z-index: 5; background-color: rgba(0, 0, 0, 0.7); width: 40px; height: 40px; border-radius: 50%; display: flex; align-items: center; justify-content: center; transition: background-color 0.2s; } .slideshow-nav:hover { background-color: rgba(0, 0, 0, 0.9); } .slideshow-prev { left: 10px; } .slideshow-next { right: 10px; } .slideshow-indicator { position: absolute; bottom: 20px; left: 0; right: 0; display: flex; justify-content: center; z-index: 5; background-color: rgba(0, 0, 0, 0.3); padding: 8px 0; border-radius: 20px; width: auto; max-width: 80%; margin: 0 auto; } .slideshow-dot { width: 8px; height: 8px; border-radius: 50%; background-color: rgba(255, 255, 255, 0.5); margin: 0 4px; cursor: pointer; transition: background-color 0.2s; } .slideshow-dot:hover { background-color: rgba(255, 255, 255, 0.8); } .slideshow-dot.active { background-color: white; } .media-container { position: relative; width: 100%; height: 100%; display: flex; align-items: center; justify-content: center; } .media-slide { position: absolute; top: 0; left: 0; width: 100%; height: 100%; display: flex; align-items: center; justify-content: center; opacity: 0; transition: opacity 0.3s ease, transform 0.5s ease; pointer-events: none; will-change: transform, opacity; transform: translateX(0); } .media-slide img, .media-slide video { max-width: 100%; max-height: 100%; object-fit: contain; } .media-slide.active { opacity: 1; z-index: 2; pointer-events: auto; } .media-slide.active { opacity: 1; z-index: 2; } .file-input-container { margin-bottom: 20px; padding: 20px; background-color: white; border: 1px solid var(--instagram-border); border-radius: 4px; } .loading { text-align: center; padding: 40px; font-size: 18px; } .sort-options { display: flex; align-items: center; justify-content: center; padding: 10px 20px; margin-bottom: 20px; } .sort-row 
{ display: flex; align-items: center; justify-content: center; flex-wrap: wrap; margin: 5px 0; width: 100%; max-width: 600px; } .sort-link { margin: 0 10px; color: var(--instagram-text); text-decoration: none; padding: 5px 0; position: relative; transition: color 0.2s; } .sort-link:hover { color: var(--instagram-link); } .sort-link.active { color: var(--instagram-link); font-weight: 600; } .sort-link.active::after { content: ''; position: absolute; bottom: 0; left: 0; width: 100%; height: 2px; background-color: var(--instagram-link); } @media (max-width: 768px) { .posts-grid { grid-template-columns: repeat(2, 1fr); gap: 4px; } .post-modal-content { flex-direction: column; height: auto; max-height: none; margin: 30px auto 0; border-radius: 0; width: 100%; } .post-media { height: 50vh; width: 100%; min-height: 300px; position: relative; } .post-info { width: 100%; border-left: none; border-top: 1px solid var(--instagram-border); } .profile-picture { width: 80px; height: 80px; margin-right: 15px; } .stat { margin-right: 20px; } .post-modal { overflow-y: auto; padding-top: 0; } .media-container { position: relative; width: 100%; height: 100%; } .media-slide { position: absolute; top: 0; left: 0; width: 100%; height: 100%; display: flex; align-items: center; justify-content: center; } .media-slide img, .media-slide video { max-width: 100%; max-height: 100%; object-fit: contain; } } @media (max-width: 480px) { .posts-grid { grid-template-columns: repeat(3, 1fr); gap: 3px; } .profile-info { flex-direction: column; text-align: center; } .profile-picture { margin-right: 0; margin-bottom: 15px; } .stats { justify-content: center; } .header-content { flex-direction: column; align-items: center; padding: 5px 0; } .date-range-header { margin-left: 0; margin-top: 2px; font-size: 12px; } .sort-options { padding: 5px; } .sort-row { width: 100%; flex-wrap: wrap; justify-content: center; } .sort-link { margin: 5px; font-size: 13px; padding: 5px 0; flex: 0 0 auto; } } /* Stories grid styles */ .stories-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(300px, 1fr)); gap: 20px; margin-top: 20px; } .story-item { position: relative; border-radius: 8px; overflow: hidden; background-color: #fff; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1); transition: transform 0.2s ease; } .story-item:hover { transform: translateY(-5px); } .story-info { padding: 12px; } .story-date { font-size: 12px; color: #8e8e8e; margin-bottom: 4px; } /* Story caption styles removed as they're now used for alt text */ .video-indicator { position: absolute; top: 10px; right: 10px; width: 30px; height: 30px; background-color: rgba(0, 0, 0, 0.5); border-radius: 50%; display: flex; align-items: center; justify-content: center; } .video-placeholder { width: 100%; height: 300px; background-color: #f0f0f0; display: flex; align-items: center; justify-content: center; } /* Navigation links */ .nav-link { margin-right: 20px; font-weight: 600; color: #8e8e8e; text-decoration: none; } .nav-link.active { color: #262626; border-bottom: 2px solid #262626; } /* Mobile-specific fixes */ @media (max-width: 768px) { /* Ensure the modal takes up the full screen */ .post-modal { padding: 0; overflow-y: auto; } /* Make modal content take full width */ .post-modal-content { flex-direction: column; height: auto; margin: 0; width: 100%; max-width: 100%; } /* Explicitly set post-media height */ .post-media { height: 50vh !important; /* Important to override any inline styles */ min-height: 300px !important; width: 100%; flex: 0 0 auto; /* Don't grow or 
shrink */ } /* Ensure media container fills the available space */ .media-container { position: relative; width: 100%; height: 100% !important; display: flex !important; align-items: center; justify-content: center; } /* Fix media slides */ .media-slide { position: absolute; top: 0; left: 0; width: 100%; height: 100%; display: flex !important; align-items: center; justify-content: center; } /* Ensure images don't exceed container */ .media-slide img, .media-slide video { max-width: 100%; max-height: 100%; width: auto; height: auto; object-fit: contain; } /* Make post info section scroll independently if needed */ .post-info { flex: 1 1 auto; overflow-y: auto; max-height: 50vh; } } /* Stories section */ .section-title { text-align: center; margin: 30px 0 15px; } .section-title h2 { font-size: 24px; font-weight: 600; color: #262626; margin: 0; padding-bottom: 10px; border-bottom: 1px solid #dbdbdb; } .story-item { position: relative; } .story-info { position: absolute; bottom: 0; left: 0; right: 0; background: rgba(0, 0, 0, 0.5); color: white; padding: 8px; font-size: 12px; opacity: 0; transition: opacity 0.3s ease; } .story-item:hover .story-info { opacity: 1; } .story-date { font-weight: bold; margin-bottom: 4px; color: white; opacity: 1; } .story-caption { overflow: hidden; text-overflow: ellipsis; display: -webkit-box; -webkit-line-clamp: 2; -webkit-box-orient: vertical; } /* Stories page specific styles */ .stories-container { margin-top: 30px; } .stories-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(300px, 1fr)); gap: 20px; margin-top: 20px; } .story-item { border-radius: 8px; overflow: hidden; background-color: #fff; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1); transition: transform 0.2s ease; cursor: pointer; } .story-item:hover { transform: translateY(-5px); } .story-media { position: relative; aspect-ratio: 9/16; overflow: hidden; background-color: #000; } .story-media img { width: 100%; height: 100%; object-fit: cover; } .story-viewer { display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100vh; /* Explicitly use viewport height */ background-color: rgba(0, 0, 0, 0.95); z-index: 2000; flex-direction: column; justify-content: center; align-items: center; } .story-close { position: absolute; top: 20px; right: 20px; color: white; font-size: 30px; cursor: pointer; z-index: 2001; width: 40px; height: 40px; display: flex; align-items: center; justify-content: center; background-color: rgba(0, 0, 0, 0.5); border-radius: 50%; } .story-nav { position: absolute; top: 50%; transform: translateY(-50%); color: white; font-size: 30px; cursor: pointer; z-index: 2001; background-color: rgba(0, 0, 0, 0.5); width: 40px; height: 40px; border-radius: 50%; display: flex; align-items: center; justify-content: center; } .story-prev { left: 20px; } .story-next { right: 20px; } .story-progress-container { position: absolute; top: 0; left: 0; width: 100%; height: 4px; background-color: rgba(255, 255, 255, 0.3); z-index: 2001; } .story-progress { height: 100%; width: 0%; background-color: white; transition: width 10s linear; } .story-content { position: relative; width: 100%; height: 100%; display: flex; justify-content: center; align-items: center; } .story-media-container { max-width: 100%; max-height: 100%; width: 100%; height: 100%; display: flex; justify-content: center; align-items: center; position: relative; overflow: hidden; /* Important to hide slides that move outside the container */ } .story-media-container img, .story-media-container video { width: 100%; 
height: 100%; max-height: 90vh; object-fit: contain; } .story-info-overlay { position: absolute; bottom: 0; left: 0; width: 100%; padding: 20px; background: linear-gradient(transparent, rgba(0, 0, 0, 0.7)); color: white; } /* Caption styles removed as they're now used for alt text */ .story-info-overlay .story-date { font-size: 14px; opacity: 1; color: white; max-width: 800px; margin: 0 auto; text-align: center; } @media (max-width: 768px) { .story-media-container img, .story-media-container video { max-height: 80vh; } .story-nav { width: 30px; height: 30px; font-size: 20px; } .story-close { width: 30px; height: 30px; font-size: 20px; } .story-info-overlay { padding: 15px; } .story-info-overlay .story-caption { font-size: 14px; } .story-info-overlay .story-date { font-size: 12px; } } /* Navigation links */ .nav-link { color: #0095f6; text-decoration: none; padding-bottom: 2px; transition: color 0.2s ease; position: relative; } .nav-link:hover { color: #00376b; text-decoration: underline; } .nav-link.active { font-weight: 600; color: #262626; border-bottom: 2px solid #262626; } ================================================ FILE: memento_mori/static/js/modal.js ================================================ // memento_mori/static/js/modal.js document.addEventListener('DOMContentLoaded', function () { // Get DOM elements const postsGrid = document.getElementById('postsGrid'); const postModal = document.getElementById('postModal'); const closeModalBtn = document.getElementById('closeModal'); const modalPrev = document.getElementById('modalPrev'); const modalNext = document.getElementById('modalNext'); const postMedia = document.getElementById('postMedia'); const postCaption = document.getElementById('postCaption'); const postStats = document.getElementById('postStats'); const postDate = document.getElementById('postDate'); const postUsername = document.getElementById('postUsername'); const postUserPic = document.getElementById('postUserPic'); const sortLinks = document.querySelectorAll('.sort-link'); // Global variables to track current post and indexes let currentPostIndex = -1; let currentSlideIndex = 0; let postIndexToTimestamp = {}; // Map post index to timestamp let currentSortType = 'newest'; // Default sort // Initialize by creating mapping and attaching listeners function initialize() { // Create a mapping from post index to timestamp Object.entries(window.postData).forEach(([timestamp, post]) => { postIndexToTimestamp[post.i] = timestamp; }); // Attach click listeners to grid items attachGridItemListeners(); // Initialize sorting functionality initializeSorting(); } // Initialize sorting functionality function initializeSorting() { // Add event listeners to sort links sortLinks.forEach(link => { link.addEventListener('click', function (e) { e.preventDefault(); // Update active class sortLinks.forEach(l => l.classList.remove('active')); this.classList.add('active'); // Get sort type and sort posts const sortType = this.getAttribute('data-sort'); currentSortType = sortType; sortPosts(sortType); }); }); } // Sort posts based on selected criteria function sortPosts(sortType) { // Get all grid items let gridItems = Array.from(document.querySelectorAll('.grid-item')); // Sort the grid items based on the selected criteria switch (sortType) { case 'newest': // Sort by timestamp (newest first) - this is the default gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const timestampA = 
getTimestampByIndex(indexA); const timestampB = getTimestampByIndex(indexB); return timestampB - timestampA; }); break; case 'oldest': // Sort by timestamp (oldest first) gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const timestampA = getTimestampByIndex(indexA); const timestampB = getTimestampByIndex(indexB); return timestampA - timestampB; }); break; case 'most-likes': // Sort by number of likes gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const likesA = getLikesByIndex(indexA) || 0; const likesB = getLikesByIndex(indexB) || 0; return likesB - likesA; }); break; case 'most-comments': // Sort by number of comments gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const commentsA = getCommentsByIndex(indexA) || 0; const commentsB = getCommentsByIndex(indexB) || 0; return commentsB - commentsA; }); break; case 'most-views': // Sort by number of views/impressions gridItems.sort((a, b) => { const indexA = parseInt(a.getAttribute('data-index')); const indexB = parseInt(b.getAttribute('data-index')); const viewsA = getViewsByIndex(indexA) || 0; const viewsB = getViewsByIndex(indexB) || 0; return viewsB - viewsA; }); break; case 'random': // Shuffle the grid items randomly gridItems.sort(() => Math.random() - 0.5); break; } // Reorder the grid items in the DOM const fragment = document.createDocumentFragment(); gridItems.forEach(item => { fragment.appendChild(item); }); // Clear the grid and append the sorted items postsGrid.innerHTML = ''; postsGrid.appendChild(fragment); // Reattach event listeners to grid items attachGridItemListeners(); } // Helper function to get timestamp by post index function getTimestampByIndex(index) { const timestamp = postIndexToTimestamp[index]; return parseInt(timestamp); } // Helper function to get likes by post index function getLikesByIndex(index) { const timestamp = postIndexToTimestamp[index]; if (timestamp && window.postData[timestamp]) { return parseInt(window.postData[timestamp].l) || 0; } return 0; } // Helper function to get comments by post index function getCommentsByIndex(index) { const timestamp = postIndexToTimestamp[index]; if (timestamp && window.postData[timestamp]) { return parseInt(window.postData[timestamp].c) || 0; } return 0; } // Helper function to get views/impressions by post index function getViewsByIndex(index) { const timestamp = postIndexToTimestamp[index]; if (timestamp && window.postData[timestamp]) { return parseInt(window.postData[timestamp].im) || 0; } return 0; } // Attach click event listeners to all grid items function attachGridItemListeners() { const gridItems = document.querySelectorAll('.grid-item'); gridItems.forEach(item => { item.addEventListener('click', function () { const postIndex = parseInt(this.getAttribute('data-index')); openModal(postIndex); }); }); } // Open the modal with the selected post function openModal(index, imageIndex = 0) { currentPostIndex = index; // Store the current scroll position before opening the modal const scrollPosition = window.pageYOffset || document.documentElement.scrollTop; // Get the timestamp using the post_index mapping const timestamp = postIndexToTimestamp[index]; // Get the post data using the timestamp const post = window.postData[timestamp]; // Show the modal first (important for correct dimensions) 
postModal.style.display = 'block'; document.body.style.overflow = 'hidden'; // Prevent scrolling // Store the scroll position as a data attribute on the modal postModal.setAttribute('data-scroll-position', scrollPosition); // Update modal content updateModalContent(post, imageIndex); // Update URL with post ID and image index updateUrlWithPostInfo(timestamp, imageIndex); // For mobile devices, ensure content is visible and properly sized if (window.innerWidth <= 768) { // Don't scroll to top on mobile as it causes the issue // Instead, just ensure the modal is properly positioned postModal.scrollTop = 0; // Force layout recalculation with a longer timeout setTimeout(() => { const mediaContainer = document.querySelector('.media-container'); const postMediaEl = document.getElementById('postMedia'); // Ensure post-media has explicit height if (postMediaEl) { postMediaEl.style.height = '50vh'; postMediaEl.style.minHeight = '300px'; } // Ensure media-container has explicit height if (mediaContainer) { mediaContainer.style.height = '100%'; mediaContainer.style.display = 'flex'; // Force reflow void mediaContainer.offsetHeight; } // Reset any active slides to ensure they're visible const activeSlides = document.querySelectorAll('.media-slide.active'); activeSlides.forEach(slide => { slide.style.opacity = '0'; void slide.offsetHeight; // Force reflow slide.style.opacity = '1'; // Make sure images have height const img = slide.querySelector('img'); if (img) { img.style.maxHeight = '100%'; img.style.width = 'auto'; img.style.height = 'auto'; } }); }, 50); // Increase timeout for more reliability } } // Function to update the URL with post and image information function updateUrlWithPostInfo(timestamp, imageIndex) { // Create a new URL object based on the current URL const url = new URL(window.location.href); // Set the post parameter to the timestamp url.searchParams.set('post', timestamp); // Only add the image parameter if it's not the first image if (imageIndex > 0) { url.searchParams.set('image', imageIndex); } else { url.searchParams.delete('image'); } // Update the browser history without reloading the page window.history.pushState({}, '', url); } // Creates the appropriate media element (video or image) based on the file type function createMediaElement(mediaUrl) { // Check if the media is a video based on file extension if (mediaUrl.endsWith('.mp4') || mediaUrl.endsWith('.mov') || mediaUrl.endsWith('.avi') || mediaUrl.endsWith('.webm')) { // Create video element const video = document.createElement('video'); video.src = mediaUrl; video.controls = true; video.autoplay = true; video.loop = true; video.muted = false; video.playsInline = true; video.alt = 'Instagram video post'; return video; } else { // Create image element const img = document.createElement('img'); // Check if there's a WebP version available for non-WebP images if (!mediaUrl.endsWith('.webp') && (mediaUrl.endsWith('.jpg') || mediaUrl.endsWith('.jpeg') || mediaUrl.endsWith('.png') || mediaUrl.endsWith('.gif'))) { // Try to use WebP version if it exists const webpUrl = mediaUrl.replace(/\.(jpg|jpeg|png|gif)$/i, '.webp'); // Set up error handling to fall back to original if WebP doesn't exist img.onerror = function () { this.onerror = null; // Prevent infinite loop this.src = mediaUrl; // Fall back to original }; img.src = webpUrl; } else { img.src = mediaUrl; } img.alt = 'Instagram post'; return img; } } // Update modal content with the post data function updateModalContent(post, initialImageIndex = 0) { // Clear previous 
content postMedia.innerHTML = ''; postCaption.innerHTML = ''; postStats.innerHTML = ''; // Create media container for the slides const mediaContainer = document.createElement('div'); mediaContainer.className = 'media-container'; // Check if the post has multiple media if (post.m && post.m.length > 1) { // Changed from media // Create slides for each media item post.m.forEach((mediaUrl, index) => { // Changed from media const slide = document.createElement('div'); slide.className = `media-slide ${index === initialImageIndex ? 'active' : ''}`; // Create and add the appropriate media element const mediaElement = createMediaElement(mediaUrl); slide.appendChild(mediaElement); mediaContainer.appendChild(slide); }); // Add navigation buttons for slideshow const prevBtn = document.createElement('div'); prevBtn.className = 'slideshow-nav slideshow-prev'; prevBtn.innerHTML = '❮'; prevBtn.addEventListener('click', function (e) { e.stopPropagation(); navigateSlideshow(-1); }); const nextBtn = document.createElement('div'); nextBtn.className = 'slideshow-nav slideshow-next'; nextBtn.innerHTML = '❯'; nextBtn.addEventListener('click', function (e) { e.stopPropagation(); navigateSlideshow(1); }); // Add indicator dots const indicator = document.createElement('div'); indicator.className = 'slideshow-indicator'; for (let i = 0; i < post.m.length; i++) { const dot = document.createElement('div'); dot.className = `slideshow-dot ${i === initialImageIndex ? 'active' : ''}`; dot.setAttribute('data-index', i); dot.addEventListener('click', function (e) { e.stopPropagation(); const index = parseInt(this.getAttribute('data-index')); showSlide(index); }); indicator.appendChild(dot); } mediaContainer.appendChild(prevBtn); mediaContainer.appendChild(nextBtn); mediaContainer.appendChild(indicator); // Set the current slide index to the initial image index currentSlideIndex = initialImageIndex; } else { // Single media post const slide = document.createElement('div'); slide.className = 'media-slide active'; // Create and add the appropriate media element const mediaElement = createMediaElement(post.m[0]); // Changed from media slide.appendChild(mediaElement); mediaContainer.appendChild(slide); } postMedia.appendChild(mediaContainer); // Set post caption if (post.tt) { postCaption.innerHTML = post.tt.replace(/\n/g, '
'); } else { postCaption.innerHTML = ''; } // Set post stats if (post.im) { const impressionsDiv = document.createElement('div'); impressionsDiv.className = 'post-stat'; impressionsDiv.innerHTML = ` 👁️ ${post.im} views `; postStats.appendChild(impressionsDiv); } if (post.l) { const likesDiv = document.createElement('div'); likesDiv.className = 'post-stat'; likesDiv.innerHTML = ` ${post.l} `; postStats.appendChild(likesDiv); } if (post.c) { const commentsDiv = document.createElement('div'); commentsDiv.className = 'post-stat'; commentsDiv.innerHTML = ` 💬 ${post.c} comments `; postStats.appendChild(commentsDiv); } // Set post date postDate.textContent = post.d; // Show/hide stats container based on whether there are any stats postStats.style.display = postStats.children.length > 0 ? 'flex' : 'none'; } // Navigate between slides in a multi-media post function navigateSlideshow(direction) { const slides = document.querySelectorAll('.media-slide'); const dots = document.querySelectorAll('.slideshow-dot'); let activeIndex = 0; // Find the currently active slide slides.forEach((slide, index) => { if (slide.classList.contains('active')) { activeIndex = index; } }); // Pause any videos in the current slide const currentVideo = slides[activeIndex].querySelector('video'); if (currentVideo) { currentVideo.pause(); } // Calculate the new index let newIndex = activeIndex + direction; if (newIndex < 0) newIndex = slides.length - 1; if (newIndex >= slides.length) newIndex = 0; // Update slides and dots showSlide(newIndex); } // Show a specific slide function showSlide(index) { const slides = document.querySelectorAll('.media-slide'); const dots = document.querySelectorAll('.slideshow-dot'); // Pause all videos before changing slides slides.forEach(slide => { const video = slide.querySelector('video'); if (video) { video.pause(); } }); // Remove active class from all slides and dots slides.forEach(slide => slide.classList.remove('active')); if (dots.length > 0) { dots.forEach(dot => dot.classList.remove('active')); dots[index].classList.add('active'); } // Add active class to the selected slide slides[index].classList.add('active'); // Update current slide index currentSlideIndex = index; // Update URL with the new image index const timestamp = postIndexToTimestamp[currentPostIndex]; updateUrlWithPostInfo(timestamp, index); } // Navigate between posts (next/prev buttons in modal) function navigatePost(direction) { // Pause all videos in the current post const videos = document.querySelectorAll('.media-slide video'); videos.forEach(video => { if (video) { video.pause(); } }); // Get all grid items in their current sorted order const gridItems = Array.from(document.querySelectorAll('.grid-item')); const gridIndexes = gridItems.map(item => parseInt(item.getAttribute('data-index'))); // Find the position of the current post in the sorted grid const currentPosition = gridIndexes.indexOf(currentPostIndex); if (currentPosition === -1) { console.error('Current post not found in grid'); return; } // Calculate new position with wraparound let newPosition = (currentPosition + direction + gridIndexes.length) % gridIndexes.length; // Get the new post index from the grid's current order const newPostIndex = gridIndexes[newPosition]; // Open the new post openModal(newPostIndex); } // Close the modal function closeModal() { // Pause all videos before closing the modal const videos = document.querySelectorAll('.media-slide video'); videos.forEach(video => { if (video) { video.pause(); } }); // Store the current scroll 
position before closing the modal const scrollPosition = window.pageYOffset || document.documentElement.scrollTop; postModal.style.display = 'none'; document.body.style.overflow = 'auto'; // Re-enable scrolling // Remove post and image parameters from URL const url = new URL(window.location.href); url.searchParams.delete('post'); url.searchParams.delete('image'); window.history.pushState({}, '', url); // Restore the scroll position after a short delay setTimeout(() => { window.scrollTo({ top: scrollPosition, behavior: 'auto' // Use 'auto' instead of 'smooth' to prevent visible scrolling }); }, 10); } // Event listeners for modal navigation closeModalBtn.addEventListener('click', closeModal); modalPrev.addEventListener('click', function (e) { e.stopPropagation(); navigatePost(-1); }); modalNext.addEventListener('click', function (e) { e.stopPropagation(); navigatePost(1); }); // Close modal when clicking outside of content postModal.addEventListener('click', function (e) { if (e.target === postModal) { closeModal(); } }); // Keyboard navigation document.addEventListener('keydown', function (e) { if (postModal.style.display === 'block') { if (e.key === 'Escape') { closeModal(); } else if (e.key === 'ArrowLeft') { navigatePost(-1); } else if (e.key === 'ArrowRight') { navigatePost(1); } } }); // Initialize the modal functionality if (typeof window.postData !== 'undefined') { initialize(); // Check if URL has post and image parameters const urlParams = new URLSearchParams(window.location.search); const postTimestamp = urlParams.get('post'); const imageIndex = parseInt(urlParams.get('image') || '0'); if (postTimestamp && window.postData[postTimestamp]) { // Find the post index from the timestamp let postIndex = -1; Object.entries(postIndexToTimestamp).forEach(([index, timestamp]) => { if (timestamp === postTimestamp) { postIndex = parseInt(index); } }); if (postIndex >= 0) { // Open the modal with the specified post and image setTimeout(() => { openModal(postIndex, imageIndex); }, 500); // Delay to ensure everything is loaded } } } else { console.error('Post data not available'); } }); /** * Fixes common Unicode encoding issues in text * @param {string} text - The text to fix * @return {string} - The fixed text */ function fixEncodingIssues(text) { if (!text) return text; // Common replacements for incorrectly encoded characters const replacements = [ // Fix smart quotes and apostrophes { pattern: /â\u0080\u0099/g, replacement: "\u2019" }, // Right single quote/apostrophe { pattern: /â\u0080\u009c/g, replacement: "\u201C" }, // Left double quote { pattern: /â\u0080\u009d/g, replacement: "\u201D" }, // Right double quote { pattern: /â\u0080\u0098/g, replacement: "\u2018" }, // Left single quote // Fix dashes and ellipsis { pattern: /â\u0080\u0093/g, replacement: "\u2013" }, // En dash { pattern: /â\u0080\u0094/g, replacement: "\u2014" }, // Em dash { pattern: /â\u0080¦/g, replacement: "\u2026" }, // Ellipsis // Remove non-breaking space indicator { pattern: /Â/g, replacement: "" }, // Fix fractions { pattern: /½/g, replacement: "\u00BD" }, // Half fraction // Fix bullet point { pattern: /•/g, replacement: "•" }, // Fix common mis-encoded accented characters { pattern: /é/g, replacement: "é" }, { pattern: /è/g, replacement: "è" }, { pattern: /â/g, replacement: "â" }, { pattern: /ê/g, replacement: "ê" }, { pattern: /ë/g, replacement: "ë" }, { pattern: /î/g, replacement: "î" }, { pattern: /ï/g, replacement: "ï" }, { pattern: /ô/g, replacement: "ô" }, { pattern: /ö/g, replacement: "ö" }, { pattern: 
/ù/g, replacement: "ù" }, { pattern: /ú/g, replacement: "ú" }, { pattern: /ü/g, replacement: "ü" }, { pattern: /ç/g, replacement: "ç" } ]; // Apply all replacements let fixedText = text; for (const { pattern, replacement } of replacements) { fixedText = fixedText.replace(pattern, replacement); } return fixedText; } // Apply the fix to all post captions when the page loads document.addEventListener('DOMContentLoaded', function() { // Fix the JSON data directly for (const timestamp in window.postData) { const post = window.postData[timestamp]; if (post.tt) { // Changed from title post.tt = fixEncodingIssues(post.tt); // Changed from title } } // Update any already rendered content const captions = document.querySelectorAll('.post-caption'); captions.forEach(caption => { caption.textContent = fixEncodingIssues(caption.textContent); }); }); ================================================ FILE: memento_mori/static/js/stories.js ================================================ // Stories viewer functionality document.addEventListener('DOMContentLoaded', function() { // Story viewer elements const storyViewer = document.getElementById('storyViewer'); const storyMedia = document.getElementById('storyMedia'); const storyCaption = document.getElementById('storyCaption'); const storyDate = document.getElementById('storyDate'); const storyClose = document.getElementById('storyClose'); const storyPrev = document.getElementById('storyPrev'); const storyNext = document.getElementById('storyNext'); const storyProgress = document.getElementById('storyProgress'); // Story data and state let currentStoryIndex = 0; let storyItems = []; let autoProgressTimer = null; let isNavigating = false; const autoProgressDelay = 10000; // 10 seconds // Initialize story items from the grid const storyGridItems = document.querySelectorAll('.story-item'); storyGridItems.forEach(item => { item.addEventListener('click', function() { const storyIndex = parseInt(this.getAttribute('data-index')); openStory(storyIndex); }); }); // Open a story by index function openStory(index) { // Get all story items in their current order storyItems = Array.from(document.querySelectorAll('.story-item')); currentStoryIndex = storyItems.findIndex(item => parseInt(item.getAttribute('data-index')) === index); if (currentStoryIndex === -1) return; // Show the story viewer storyViewer.style.display = 'flex'; document.body.style.overflow = 'hidden'; // Prevent scrolling // Load the current story loadCurrentStory(); // Update URL with story info const timestamp = storyItems[currentStoryIndex].getAttribute('data-timestamp'); if (timestamp) { const url = new URL(window.location.href); url.searchParams.set('story', timestamp); window.history.pushState({}, '', url); } } // Load the current story content function loadCurrentStory() { if (currentStoryIndex < 0 || currentStoryIndex >= storyItems.length) return; // Clear any existing timer clearAutoProgressTimer(); // Reset pause state when loading a new story if (isPaused) { isPaused = false; pauseIcon.style.display = 'block'; playIcon.style.display = 'none'; } // Reset progress bar storyProgress.style.width = '0%'; // Clear previous media storyMedia.innerHTML = ''; // Ensure the story media container has the correct class storyMedia.className = 'story-media-container'; // Create a new slide const slide = document.createElement('div'); slide.className = 'media-slide active'; slide.style.opacity = '1'; slide.style.transform = 'translateX(0)'; // Load content into the slide loadStoryContent(slide, 
currentStoryIndex); // Add the slide to the container storyMedia.appendChild(slide); } // Start auto-progress timer with visual indicator function startAutoProgressTimer() { // Don't start timer if paused if (isPaused) { console.log('Not starting timer because story is paused'); return; } console.log('Starting auto-progress timer with delay:', autoProgressDelay, 'ms'); // Animate progress bar storyProgress.style.transition = `width ${autoProgressDelay}ms linear`; storyProgress.style.width = '100%'; // Set timer for auto-progression autoProgressTimer = setTimeout(() => { console.log('Auto-progress timer completed, navigating to next story'); navigateStory(1); }, autoProgressDelay); } // Clear auto-progress timer function clearAutoProgressTimer() { console.log('Clearing auto-progress timer'); clearTimeout(autoProgressTimer); autoProgressTimer = null; // Also clear any video timer if it exists const videoElement = storyMedia.querySelector('video'); if (videoElement && videoElement.videoTimer) { clearTimeout(videoElement.videoTimer); videoElement.videoTimer = null; } // Reset progress bar immediately storyProgress.style.transition = 'none'; storyProgress.style.width = '0%'; // Force a reflow to ensure the transition is reset storyProgress.offsetHeight; } // Helper function to load story content into a slide function loadStoryContent(slide, index) { const storyItem = storyItems[index]; const timestamp = storyItem.getAttribute('data-timestamp'); const storyData = window.storiesData[timestamp]; if (!storyData) { isNavigating = false; // Reset navigation lock if we can't load content return; } // Update story date storyDate.textContent = storyData.d || ''; // Create media element based on type const mediaUrl = storyData.m[0]; // Use first media item const isVideo = mediaUrl.endsWith('.mp4') || mediaUrl.endsWith('.mov') || mediaUrl.endsWith('.avi') || mediaUrl.endsWith('.webm'); if (isVideo) { console.log('Loading video story:', mediaUrl); const video = document.createElement('video'); video.src = mediaUrl; video.controls = true; video.autoplay = !isPaused; // Only autoplay if not paused video.muted = false; // Force the video to take the full size of its container video.style.width = '100%'; video.style.height = '100%'; video.style.maxHeight = '90vh'; video.style.objectFit = 'contain'; // Create a wrapper div to help control dimensions const videoWrapper = document.createElement('div'); videoWrapper.style.width = '100%'; videoWrapper.style.height = '100%'; videoWrapper.style.display = 'flex'; videoWrapper.appendChild(video); slide.appendChild(videoWrapper); // Handle video events as in the original loadCurrentStory function video.addEventListener('loadedmetadata', function() { // Once we know the video duration, decide how to handle it const videoLength = video.duration; console.log(`Video duration: ${videoLength}s, Auto-progress delay: ${autoProgressDelay/1000}s`); // Clear any existing video timer first if (video.videoTimer) { clearTimeout(video.videoTimer); video.videoTimer = null; } if (videoLength > autoProgressDelay/1000) { // For longer videos, we'll let them play through once console.log('Video is longer than auto-progress delay, will play once'); video.loop = false; } else { // For shorter videos, loop until we reach the delay time console.log('Video is shorter than auto-progress delay, will loop'); video.loop = true; // Set up a timer to move to next story after delay if (!isPaused) { video.videoTimer = setTimeout(() => { if (!isPaused && !isNavigating) { console.log(`Auto-progress timer 
completed after ${autoProgressDelay/1000}s`); navigateStory(1); } }, autoProgressDelay); } } // Start progress bar animation if (!isPaused) { storyProgress.style.transition = `width ${autoProgressDelay}ms linear`; storyProgress.style.width = '100%'; } }); // Store the video element in a variable accessible to the togglePause function currentVideoElement = video; } else { console.log('Loading image story:', mediaUrl); const img = document.createElement('img'); // Check if there's a WebP version available for non-WebP images if (!mediaUrl.endsWith('.webp') && (mediaUrl.endsWith('.jpg') || mediaUrl.endsWith('.jpeg') || mediaUrl.endsWith('.png') || mediaUrl.endsWith('.gif'))) { // Try to use WebP version if it exists const webpUrl = mediaUrl.replace(/\.(jpg|jpeg|png|gif)$/i, '.webp'); img.onerror = function() { this.onerror = null; // Prevent infinite loop this.src = mediaUrl; // Fall back to original }; img.src = webpUrl; } else { img.src = mediaUrl; } img.alt = storyData.tt || 'Instagram Story'; slide.appendChild(img); // Start auto-progress for images if (!isPaused && !isNavigating) { startAutoProgressTimer(); } } // Update navigation buttons visibility - always show both buttons for circular navigation storyPrev.style.display = 'flex'; storyNext.style.display = 'flex'; } // Navigate to previous/next story function navigateStory(direction) { // Prevent rapid clicks from causing issues if (isNavigating) return; isNavigating = true; // If we're paused and this is an automatic navigation (not user-initiated), // don't advance to the next story const isUserInitiated = event && (event.type === 'click' || event.type === 'keydown'); if (isPaused && direction > 0 && !isUserInitiated) { console.log('Auto-navigation blocked because story is paused'); isNavigating = false; return; } // Always clear any existing timers first clearAutoProgressTimer(); // Calculate the new index with circular navigation let newIndex = currentStoryIndex + direction; // Implement circular navigation (only once) if (newIndex < 0) { newIndex = storyItems.length - 1; // Wrap to the last story console.log('Wrapping to the last story'); } else if (newIndex >= storyItems.length) { newIndex = 0; // Wrap to the first story console.log('Wrapping to the first story'); } // Get the current slide for animation const currentSlide = storyMedia.querySelector('.media-slide.active'); // Animate the current slide out if (currentSlide) { currentSlide.style.transition = 'transform 0.5s ease'; currentSlide.style.transform = `translateX(${direction < 0 ? '100%' : '-100%'})`; // Create and prepare the new slide with initial position const newSlide = document.createElement('div'); newSlide.className = 'media-slide'; newSlide.style.transition = 'none'; // No transition initially newSlide.style.transform = `translateX(${direction > 0 ? 
'100%' : '-100%'})`; // Start from right or left newSlide.style.opacity = '1'; // Load the content into the new slide loadStoryContent(newSlide, newIndex); storyMedia.appendChild(newSlide); // Force a reflow to ensure the initial transform is applied newSlide.offsetHeight; // Now animate the slide into view newSlide.style.transition = 'transform 0.5s ease'; newSlide.style.transform = 'translateX(0)'; // After animation completes, update to the new story setTimeout(() => { currentStoryIndex = newIndex; // Remove old slides const oldSlides = storyMedia.querySelectorAll('.media-slide:not(:last-child)'); oldSlides.forEach(slide => slide.remove()); // Make the new slide active newSlide.classList.add('active'); // Update URL const timestamp = storyItems[currentStoryIndex].getAttribute('data-timestamp'); if (timestamp) { const url = new URL(window.location.href); url.searchParams.set('story', timestamp); window.history.pushState({}, '', url); } // Reset navigation lock isNavigating = false; }, 500); } else { // If no current slide (shouldn't happen), just load the new story currentStoryIndex = newIndex; loadCurrentStory(); // Update URL const timestamp = storyItems[currentStoryIndex].getAttribute('data-timestamp'); if (timestamp) { const url = new URL(window.location.href); url.searchParams.set('story', timestamp); window.history.pushState({}, '', url); } // Reset navigation lock isNavigating = false; } } // Close the story viewer function closeStory() { // Pause any playing videos before closing const videoElements = storyMedia.querySelectorAll('video'); videoElements.forEach(video => { if (video && !video.paused) { video.pause(); } }); clearAutoProgressTimer(); storyViewer.style.display = 'none'; document.body.style.overflow = ''; // Restore scrolling // Remove story parameter from URL const url = new URL(window.location.href); url.searchParams.delete('story'); window.history.pushState({}, '', url); } // Get pause button element const storyPause = document.getElementById('storyPause'); const pauseIcon = document.getElementById('pauseIcon'); const playIcon = document.getElementById('playIcon'); let isPaused = false; let currentVideoElement = null; // Track the current video element // Event listeners storyClose.addEventListener('click', closeStory); storyPrev.addEventListener('click', () => navigateStory(-1)); storyNext.addEventListener('click', () => navigateStory(1)); storyPause.addEventListener('click', togglePause); // Toggle pause function function togglePause() { console.log('Toggle pause called, current state:', isPaused); isPaused = !isPaused; if (isPaused) { console.log('Pausing story playback'); // Show play icon when paused pauseIcon.style.display = 'none'; playIcon.style.display = 'block'; // Clear the timer and stop progress clearAutoProgressTimer(); storyProgress.style.transition = 'none'; // Don't pause videos, let them continue playing in loop console.log('Video will continue playing but auto-advance is disabled'); // Get the current video if there is one const videoElement = storyMedia.querySelector('video'); if (videoElement && videoElement.videoTimer) { clearTimeout(videoElement.videoTimer); videoElement.videoTimer = null; } } else { console.log('Resuming story playback'); // Show pause icon when playing pauseIcon.style.display = 'block'; playIcon.style.display = 'none'; // Get the media element in the story viewer const videoElement = storyMedia.querySelector('video'); const isVideo = videoElement !== null; console.log('Is video:', isVideo); if (!isVideo) { 
console.log('Starting auto progress timer for image'); startAutoProgressTimer(); } else { console.log('Playing video'); videoElement.play(); } } } // Keyboard navigation document.addEventListener('keydown', function(e) { if (storyViewer.style.display !== 'none') { if (e.key === 'ArrowLeft') { navigateStory(-1); } else if (e.key === 'ArrowRight') { navigateStory(1); } else if (e.key === 'Escape') { closeStory(); } } }); // Click on the story media area to navigate forward storyMedia.addEventListener('click', function(e) { // Only if it's not a video (to avoid interfering with video controls) if (!e.target.matches('video')) { navigateStory(1); } }); // Check URL for story parameter on page load function checkUrlForStory() { const urlParams = new URLSearchParams(window.location.search); const storyTimestamp = urlParams.get('story'); if (storyTimestamp) { // Find the story item with this timestamp const storyItem = Array.from(storyGridItems).find( item => item.getAttribute('data-timestamp') === storyTimestamp ); if (storyItem) { const storyIndex = parseInt(storyItem.getAttribute('data-index')); // Slight delay to ensure DOM is fully loaded setTimeout(() => openStory(storyIndex), 100); } } } // Run URL check checkUrlForStory(); }); ================================================ FILE: memento_mori/templates/grid.html ================================================ {% for post in posts %}
Instagram post {% if post.is_video %}
▶ Video
{% endif %} {% if post.media_count > 1 %}
⊞ {{ post.media_count }}
{% elif post.likes %} {% endif %}
{% endfor %}
================================================ FILE: memento_mori/templates/index.html ================================================
Memento Mori - {{ username }}'s Instagram Archive {% if gtag_id %} {% endif %}
{{ date_range }}
Profile Picture

{{ username }}

{% if bio %}

{{ bio }}

{% endif %} {% if profile.website %}

{{ profile.website }}

{% endif %}
{% if has_stories %} {% endif %} {% if profile.follower_count is defined and profile.follower_count is not none %}
{{ profile.follower_count }} followers
{% endif %}
{{ grid_html|safe }}
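Note: the extraction above stripped the HTML markup from index.html (and grid.html), leaving only text nodes and Jinja2 expressions. The sketch below is a hedged reconstruction, not the recovered template: the element IDs (postsGrid, postModal, closeModal, modalPrev, modalNext, postMedia, postCaption, postStats, postDate), the data-index attribute on grid items, and the window.postData field names (i = index, m = media list, tt = caption, l = likes, c = comments, im = impressions, d = date) are taken from modal.js and style.css, while the exact nesting, some class names, and the post_data_json context variable are assumptions.

```html
<!-- Sketch only: structure modal.js expects; nesting and some class names are guesses -->
<div class="posts-grid" id="postsGrid">
  {{ grid_html|safe }}  {# grid.html supplies .grid-item elements carrying data-index="..." #}
</div>

<div class="post-modal" id="postModal">
  <span id="closeModal">&times;</span>
  <span id="modalPrev">&#10094;</span>
  <span id="modalNext">&#10095;</span>
  <div class="post-modal-content">
    <div class="post-media" id="postMedia"></div>
    <div class="post-info">
      <div class="post-caption" id="postCaption"></div>
      <div id="postStats"></div>
      <div id="postDate"></div>
    </div>
  </div>
</div>

<script>
  // modal.js reads this global: keys are post timestamps, values use the
  // abbreviated fields it dereferences (i, m, tt, l, c, im, d).
  window.postData = {{ post_data_json|safe }};  /* hypothetical variable name */
</script>
```

With markup along these lines, the sort links (`.sort-link` with a data-sort attribute) and the grid items' data-index values are all modal.js needs to reorder the grid and map a clicked item back into window.postData.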
================================================ FILE: memento_mori/templates/stories.html ================================================
{% for story in stories %}
{% if story.v %}
{% if story.thumb %} {{ story.tt|default('Instagram Story') }} {% else %}
{% endif %}
{% else %}
{{ story.tt|default('Instagram Story') }}
{% endif %}
{% endfor %}
================================================ FILE: memento_mori/templates/stories_page.html ================================================
Memento Mori - {{ username }}'s Instagram Stories {% if gtag_id %} {% endif %}
{{ date_range }}
Profile Picture

{{ username }}

{% if bio %}

{{ bio }}

{% endif %} {% if profile.website %}

{{ profile.website }}

{% endif %}
{% if profile.follower_count is defined and profile.follower_count is not none %}
{{ profile.follower_count }} followers
{% endif %}

Stories

{% for story in stories %}
{% if story.is_video %}
{% endif %} {{ story.caption|default('Instagram Story') }}
{% endfor %}
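stories_page.html is likewise reduced to bare text and template tags above. Below is a minimal sketch of markup that would satisfy the contract in stories.js and style.css — an illustration, not the original template: the .story-item / data-index / data-timestamp selectors, the viewer IDs (storyViewer, storyMedia, storyCaption, storyDate, storyClose, storyPrev, storyNext, storyProgress, storyPause, pauseIcon, playIcon) and the window.storiesData fields (m = media URLs, d = date, tt = alt text) come from stories.js; story.timestamp, story.thumbnail and the stories_json variable are hypothetical names.

```html
<!-- Sketch only: selectors come from stories.js/style.css; hypothetical names noted above -->
<div class="stories-grid">
  {% for story in stories %}
  <div class="story-item" data-index="{{ loop.index0 }}" data-timestamp="{{ story.timestamp }}">
    <div class="story-media">
      <img src="{{ story.thumbnail }}" alt="{{ story.caption|default('Instagram Story') }}">
      {% if story.is_video %}<div class="video-indicator">&#9654;</div>{% endif %}
    </div>
  </div>
  {% endfor %}
</div>

<div class="story-viewer" id="storyViewer">
  <div class="story-progress-container"><div class="story-progress" id="storyProgress"></div></div>
  <div class="story-close" id="storyClose">&times;</div>
  <div class="story-nav story-prev" id="storyPrev">&#10094;</div>
  <div class="story-nav story-next" id="storyNext">&#10095;</div>
  <div id="storyPause">
    <span id="pauseIcon">&#10074;&#10074;</span>
    <span id="playIcon" style="display: none;">&#9654;</span>
  </div>
  <div class="story-content">
    <div class="story-media-container" id="storyMedia"></div>
    <div class="story-info-overlay">
      <div class="story-date" id="storyDate"></div>
      <div class="story-caption" id="storyCaption"></div>
    </div>
  </div>
</div>

<script>
  // stories.js reads this global: keys are timestamps matching data-timestamp above,
  // values expose m (media URL list), d (display date) and tt (alt text).
  window.storiesData = {{ stories_json|safe }};  /* hypothetical variable name */
</script>
```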
================================================ FILE: project_plan.md ================================================ # Memento Mori - Instagram Archive Viewer ## Project Overview Memento Mori is a tool that transforms your Instagram data export into a beautiful, standalone viewer that resembles the Instagram interface. The name "Memento Mori" (Latin for "remember that you will die") reflects the ephemeral nature of digital content. This README outlines the architecture for refactoring the project from a single script into a modular package that supports: - Automatic detection and extraction of Instagram exports - Modular processing of media and data - Docker-based execution - Extensible architecture for future enhancements ## Architecture The project follows a modular architecture that separates concerns while maintaining simplicity: ### Core Components 1. **File Mapping** (`file_mapper.py`) - Central source of truth for file discovery patterns - Discovers and maps files in Instagram export structure - Provides consistent file access for all components 2. **Archive Extraction** (`extractor.py`) - Detects and extracts Instagram data archives - Identifies required files in the archive structure - Creates a file mapper instance for use by other components 3. **Data Loading & Processing** (`loader.py`) - Loads JSON data from Instagram export using file mapper - Processes and merges posts with insights - Creates structured data models 4. **Media Processing** (`media.py`) - Converts images to optimized formats (WebP) - Generates thumbnails for images and videos - Organizes media files in the output directory 5. **Website Generation** (`generator.py`) - Creates HTML/CSS/JS files using templates - Renders Instagram-like interface - Verifies output for completeness 6. 
**Command Line Interface** (`cli.py`) - Processes command-line arguments - Coordinates the execution flow - Provides user feedback ## Project Structure ``` memento-mori/ ├── pyproject.toml # Package metadata and dependencies ├── README.md # Documentation ├── Dockerfile # Docker container configuration ├── docker-compose.yml # For easy Docker operation ├── memento_mori/ │ ├── __init__.py # Package version, exports │ ├── cli.py # Command-line interface │ ├── config.py # Configuration handling │ ├── file_mapper.py # File discovery and mapping │ ├── extractor.py # Archive detection and extraction │ ├── loader.py # Load and process Instagram data │ ├── media.py # Media processing (conversion, thumbnails) │ ├── generator.py # Website generation │ ├── templates/ # HTML templates │ │ ├── index.html # Main template │ │ └── components/ # Reusable components │ │ └── modal.html # Post modal component │ ├── static/ # Static assets │ │ ├── css/ │ │ │ └── style.css │ │ └── js/ │ │ └── modal.js │ └── utils.py # Common utilities and helpers └── tests/ # Tests directory ``` ## Component Details ### file_mapper.py Provides a central location for file discovery and mapping: - Defines patterns for locating important files in Instagram exports - Discovers files based on patterns and maps them to easy-to-use identifiers - Ensures consistency between different components accessing the same files - Provides validation for required files - Handles variations in Instagram export structure gracefully ### extractor.py Responsible for locating and extracting Instagram data archives: - Auto-detection of ZIP files in specified directories - Extraction of archives to temporary or specified locations - Creates a file_mapper instance for the extracted content - Validation of extracted content structure via file_mapper - Cleanup of temporary files after processing ### loader.py Handles loading and processing Instagram data: - Uses file_mapper to access JSON files (posts, insights, user data) - Parsing and merging data sources - Converting timestamps and other data formatting - Providing a clean data structure for the generator ### media.py Manages all media processing operations: - Converting images to WebP format when beneficial - Generating thumbnails for grid view - Creating video thumbnails using appropriate libraries - Parallel processing of media files - Tracking conversion statistics ### generator.py Creates the static website: - Using templates instead of hardcoded HTML - Generating responsive layout - Including JavaScript for interactive features - Verifying output completeness - Supporting customization options ### cli.py Provides command-line interface: - Processing command-line arguments - Validating inputs - Coordinating processing flow - Using the file_mapper for consistent file access - Reporting progress and statistics ## Processing Flow The typical processing flow is: ```python # Initialize extractor extractor = InstagramArchiveExtractor() # Extract archive extractor.auto_detect_archive() extraction_dir = extractor.extract() # Get file mapper from extractor file_mapper = extractor.file_mapper # Initialize loader with the same file mapper loader = InstagramDataLoader(extraction_dir, file_mapper) # Load and process data data = loader.load_all_data() # Generate website with the loaded data generator = WebsiteGenerator(data, output_dir) generator.generate() ``` ## Implementation Roadmap ### Phase 1: Basic Structure 1. Create package structure 2. Implement file_mapper for centralized file discovery 3. 
Move core functionality from original script to appropriate modules 4. Create basic CLI ### Phase 2: Features 1. Implement archive auto-detection 2. Add archive extraction 3. Implement templating system 4. Enhance media processing ### Phase 3: Packaging & Deployment 1. Create Docker configuration 2. Set up package installation 3. Add documentation 4. Create tests ## Docker Usage The Docker configuration will allow easy execution: ``` # Run using docker-compose docker-compose run --rm memento-mori --input /input/instagram-export.zip --output /output # Or directly with docker docker run -v $(pwd)/input:/input -v $(pwd)/output:/output memento-mori --input /input/instagram-export.zip ``` ## CLI Usage ``` Usage: memento-mori [OPTIONS] Options: --input PATH Path to Instagram data (ZIP or folder) --output PATH Output directory for generated website [default: ./distribution] --threads INTEGER Number of parallel processing threads [default: auto] --auto-detect Auto-detect Instagram export in current directory --quality INTEGER WebP conversion quality (1-100) [default: 80] --thumbnail-size WxH Size of thumbnails [default: 292x292] --help Show this message and exit ``` ## Future Extensions The architecture supports several planned extensions: 1. Multiple archive merging 2. Custom themes 3. Additional statistics and visualizations 4. Progressive enhancement of the viewer 5. Support for Stories and other Instagram content types 6. Support for different Instagram export formats as they evolve ================================================ FILE: pyproject.toml ================================================ [build-system] requires = ["setuptools>=42", "wheel"] build-backend = "setuptools.build_meta" [project] name = "memento-mori" version = "0.1.0" description = "Transform Instagram data export into a viewer" readme = "README.md" requires-python = ">=3.9" license = "MIT" dependencies = [ "ftfy==6.3.1", "Jinja2==3.0.3", "MarkupSafe==2.1.5", "opencv_python==4.10.0.84", "Pillow==11.1.0", "python_magic>=0.4.27", "tqdm==4.67.1" ] [tool.setuptools] packages = ["memento_mori"] [project.scripts] memento-mori = "memento_mori.cli:main" [project.urls] "Homepage" = "https://github.com/greg-randall/memento-mori" "Bug Tracker" = "https://github.com/greg-randall/memento-mori/issues" ================================================ FILE: requirements.txt ================================================ ftfy==6.3.1 Jinja2==3.0.3 MarkupSafe==2.1.5 opencv_python==4.10.0.84 Pillow>=11.1.0 python_magic==0.4.27 tqdm==4.67.1