Full Code of greg-randall/memento-mori for AI

Repository: greg-randall/memento-mori
Branch: main
Commit: a0a86c3bd6e9
Files: 26
Total size: 295.4 KB

Directory structure:
memento-mori/

├── .gitignore
├── Dockerfile
├── LICENSE
├── README.md
├── deprecated_php_utility/
│   ├── index.php
│   ├── modal.js
│   ├── notes.md
│   └── style.css
├── docker-compose.yml
├── memento_mori/
│   ├── __init__.py
│   ├── cli.py
│   ├── extractor.py
│   ├── file_mapper.py
│   ├── generator.py
│   ├── loader.py
│   ├── media.py
│   ├── static/
│   │   ├── css/
│   │   │   └── style.css
│   │   └── js/
│   │       ├── modal.js
│   │       └── stories.js
│   └── templates/
│       ├── grid.html
│       ├── index.html
│       ├── stories.html
│       └── stories_page.html
├── project_plan.md
├── pyproject.toml
└── requirements.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.zip
*.7z
*.tar
*.gz
*.tar.gz
.aider*

# Builds and Downloads
output/
instagram*/

# Virtual Environments
venv/

# Python bytecode
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.cpython-*

# System Files
.DS_Store

# Comments test secrets
comments-test/.env
comments-test/instagram_cookies.json
comments-test/runs/
comments-test/


================================================
FILE: Dockerfile
================================================
FROM python:3.10-slim

# Install system dependencies (including support for image processing and libmagic)
RUN apt-get update && apt-get install -y \
    libgl1 \
    libglib2.0-0 \
    libjpeg-dev \
    zlib1g-dev \
    libmagic-dev \
    file \
    && rm -rf /var/lib/apt/lists/*

# Set up working directory
WORKDIR /app

# Copy requirements file
COPY requirements.txt .

# Install dependencies from requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create directories for input/output
RUN mkdir -p /input /output

# Set the entrypoint
ENTRYPOINT ["python", "-m", "memento_mori.cli"]

# Default command if none provided
CMD ["--help"]

================================================
FILE: LICENSE
================================================
                  GNU LESSER GENERAL PUBLIC LICENSE
                       Version 2.1, February 1999

 Copyright (C) 1991, 1999 Free Software Foundation, Inc.
 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

[This is the first released version of the Lesser GPL.  It also counts
 as the successor of the GNU Library Public License, version 2, hence
 the version number 2.1.]

                            Preamble

  The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.

  This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it.  You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations below.

  When we speak of free software, we are referring to freedom of use,
not price.  Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.

  To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights.  These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.

  For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you.  You must make sure that they, too, receive or can get the source
code.  If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it.  And you must show them these terms so they know their rights.

  We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.

  To protect each distributor, we want to make it very clear that
there is no warranty for the free library.  Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.

  Finally, software patents pose a constant threat to the existence of
any free program.  We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder.  Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.

  Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License.  This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License.  We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.

  When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library.  The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom.  The Lesser General
Public License permits more lax criteria for linking other code with
the library.

  We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License.  It also provides other free software developers Less
of an advantage over competing non-free programs.  These disadvantages
are the reason we use the ordinary General Public License for many
libraries.  However, the Lesser license provides advantages in certain
special circumstances.

  For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it becomes
a de-facto standard.  To achieve this, non-free programs must be
allowed to use the library.  A more frequent case is that a free
library does the same job as widely used non-free libraries.  In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.

  In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software.  For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.

  Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.

  The precise terms and conditions for copying, distribution and
modification follow.  Pay close attention to the difference between a
"work based on the library" and a "work that uses the library".  The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.

                  GNU LESSER GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".

  A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.

  The "Library", below, refers to any such software library or work
which has been distributed under these terms.  A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language.  (Hereinafter, translation is
included without limitation in the term "modification".)

  "Source code" for a work means the preferred form of the work for
making modifications to it.  For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.

  Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it).  Whether that is true depends on what the Library does
and what the program that uses the Library does.

  1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.

  You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.

  2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) The modified work must itself be a software library.

    b) You must cause the files modified to carry prominent notices
    stating that you changed the files and the date of any change.

    c) You must cause the whole of the work to be licensed at no
    charge to all third parties under the terms of this License.

    d) If a facility in the modified Library refers to a function or a
    table of data to be supplied by an application program that uses
    the facility, other than as an argument passed when the facility
    is invoked, then you must make a good faith effort to ensure that,
    in the event an application does not supply such function or
    table, the facility still operates, and performs whatever part of
    its purpose remains meaningful.

    (For example, a function in a library to compute square roots has
    a purpose that is entirely well-defined independent of the
    application.  Therefore, Subsection 2d requires that any
    application-supplied function or table used by this function must
    be optional: if the application does not supply it, the square
    root function must still compute square roots.)

These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.

In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

  3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library.  To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License.  (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.)  Do not make any other change in
these notices.

  Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.

  This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.

  4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.

  If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.

  5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library".  Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.

  However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library".  The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.

  When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library.  The
threshold for this to be true is not precisely defined by law.

  If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work.  (Executables containing this object code plus portions of the
Library will still fall under Section 6.)

  Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.

  6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.

  You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License.  You must supply a copy of this License.  If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License.  Also, you must do one
of these things:

    a) Accompany the work with the complete corresponding
    machine-readable source code for the Library including whatever
    changes were used in the work (which must be distributed under
    Sections 1 and 2 above); and, if the work is an executable linked
    with the Library, with the complete machine-readable "work that
    uses the Library", as object code and/or source code, so that the
    user can modify the Library and then relink to produce a modified
    executable containing the modified Library.  (It is understood
    that the user who changes the contents of definitions files in the
    Library will not necessarily be able to recompile the application
    to use the modified definitions.)

    b) Use a suitable shared library mechanism for linking with the
    Library.  A suitable mechanism is one that (1) uses at run time a
    copy of the library already present on the user's computer system,
    rather than copying library functions into the executable, and (2)
    will operate properly with a modified version of the library, if
    the user installs one, as long as the modified version is
    interface-compatible with the version that the work was made with.

    c) Accompany the work with a written offer, valid for at
    least three years, to give the same user the materials
    specified in Subsection 6a, above, for a charge no more
    than the cost of performing this distribution.

    d) If distribution of the work is made by offering access to copy
    from a designated place, offer equivalent access to copy the above
    specified materials from the same place.

    e) Verify that the user has already received a copy of these
    materials or that you have already sent this user a copy.

  For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it.  However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.

  It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system.  Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.

  7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:

    a) Accompany the combined library with a copy of the same work
    based on the Library, uncombined with any other library
    facilities.  This must be distributed under the terms of the
    Sections above.

    b) Give prominent notice with the combined library of the fact
    that part of it is a work based on the Library, and explaining
    where to find the accompanying uncombined form of the same work.

  8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License.  Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License.  However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.

  9. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Library or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.

  10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.

  11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all.  For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.

If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

  12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded.  In such case, this License incorporates the limitation as if
written in the body of this License.

  13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number.  If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation.  If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.

  14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission.  For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this.  Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.

                            NO WARRANTY

  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.

                     END OF TERMS AND CONDITIONS

           How to Apply These Terms to Your New Libraries

  If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change.  You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).

  To apply these terms, attach the following notices to the library.  It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.

    Reads an instagram export and creates html output that is browsable.
    Copyright (C) 2025  Gregory Randall

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
    version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301
    USA

Also add information on how to contact you by electronic and paper mail.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the
  library `Frob' (a library for tweaking knobs) written by James Random
  Hacker.

  <signature of Ty Coon>, 1 April 1990
  Ty Coon, President of Vice

That's all there is to it!


================================================
FILE: README.md
================================================
# Memento Mori - Instagram Archive Viewer

<img align="right" width="300" hspace="20" src="preview.gif" alt="Memento Mori Interface Preview">

**Memento Mori** is a tool that converts your Instagram data export into a beautiful, standalone viewer that resembles the Instagram interface. The name "Memento Mori" (Latin for "remember that you will die") reflects the ephemeral nature of our digital content. You can see an example at https://gregr.org/instagram/.

If you find a bug that you're able to fix, please create a pull request; otherwise, create an issue!

## Quick Start
Download your Instagram data export zip, place it in this repository's folder, and run this command:
```bash
docker compose run --rm memento-mori
# Then open output/index.html in your browser
```

## ⚠️ IMPORTANT SECURITY WARNING ⚠️

**DO NOT** share your raw Instagram export online! It contains sensitive data you probably don't want to share:

- Phone numbers
- Precise location data
- Personal messages
- Email addresses
- Other private information

Only share the generated output folder after processing with this tool.

## How It Works
Memento Mori processes your Instagram data export and generates a static site with your posts and stories, copying all your media files into an organized structure that can be viewed offline or hosted on your own website.
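
In practice the whole flow is a handful of commands. Here is a minimal end-to-end sketch that simply combines the Docker and preview steps described below (the export zip just needs to sit in the repo folder):
```bash
# Place your Instagram export zip in the repo folder, then:
docker compose build                    # build the image once
docker compose run --rm memento-mori    # auto-detect the export and write the site to ./output
python3 -m http.server -d output        # preview at http://localhost:8000
```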

## Key Features
- **Familiar Interface**: Grid layout with post details and carousel for multiple images
- **Stories Support**: View your Instagram Stories with auto-progression and 9:16 aspect ratio display
- **Media Optimization**: Converts images to WebP, generates thumbnails, and supports video playback
- **Organization**: Sorts posts by various criteria with shareable links to specific content
- **Profile Information**: Displays bio, website, and follower count from your Instagram profile
- **Technical Improvements**:
  - Fixes encoding issues and mislabeled file formats
  - Shortens filenames for smaller HTML size
  - Processes files in parallel for faster builds
  - Responsive design that works on all devices
  - Robust error handling with verbose debugging option

## How to Use Memento Mori

### 1. Get Your Instagram Data
1. Request and download your Instagram data archive
2. Place the zip within the folder of this repo

### 2. Preferred Method: Using Docker (Easiest)
Docker Compose is the easiest way to run Memento Mori without installing any dependencies. Many thanks to [CarsonDavis](https://github.com/CarsonDavis) for building out all the dockerizing code (as well as generally making my code better):
```bash
# Build the Docker image
docker compose build

# Run with default settings
docker compose run --rm memento-mori

# Run with specific arguments
docker compose run --rm memento-mori --output /output/my-site --quality 90

# Add Google Analytics tracking
docker compose run --rm memento-mori --gtag-id G-DX1ZWTC9NZ

# Serve the output folder locally to preview in your browser
python3 -m http.server -d output
```

By default, Docker will:
- Search for exports in the project directory
- Output the generated site to the './output' directory

### 3. Alternative Method: Direct Python Installation
If you prefer running the tool directly without Docker:
```bash
# Install package and dependencies
pip install -e .

# Or install dependencies manually
pip install ftfy==6.3.1 Jinja2==3.0.3 MarkupSafe==2.1.5 opencv_python==4.10.0.84 Pillow==11.1.0 tqdm==4.67.1 python_magic==0.4.27

# Run the CLI
python -m memento_mori.cli

# Serve the output folder locally to preview in your browser
python3 -m http.server -d output
```

### CLI Arguments
The CLI supports the following arguments:
```
Options:
  --input PATH             Path to data (ZIP or folder). If not specified, auto-detection will be used.
  --output PATH            Output directory for generated website [default: ./output]
  --threads INTEGER        Number of parallel processing threads [default: core count - 1]
  --search-dir PATH        Directory to search for exports when auto-detecting [default: current directory]
  --quality INTEGER        WebP conversion quality (1-100) [default: 70]
  --max-dimension INTEGER  Maximum dimension for images in pixels [default: 1920]
  --thumbnail-size WxH     Size of thumbnails [default: 292x292]
  --no-auto-detect         Disable auto-detection (requires --input to be specified)
  --gtag-id ID             Google Analytics tag ID (e.g., 'G-DX1ZWTC9NZ') to add tracking to the generated site
  --verbose, -v            Enable verbose output for debugging
```

Note: Auto-detection is enabled by default and will look for exports in the current directory. Use `--no-auto-detect` if you want to disable this feature and specify an input path manually.

### Example Commands
```bash
# Auto-detect export in current directory
python -m memento_mori.cli

# Specify input file/folder and output directory
python -m memento_mori.cli --input path/to/export.zip --output my-site

# Use specific number of threads and image quality
python -m memento_mori.cli --threads 8 --quality 90

# Specify search directory for auto-detection
python -m memento_mori.cli --search-dir ~/Downloads

# Use custom thumbnail size
python -m memento_mori.cli --thumbnail-size 400x400

# Specify maximum image dimension
python -m memento_mori.cli --max-dimension 1600

# Disable auto-detection (requires specifying input)
python -m memento_mori.cli --no-auto-detect --input path/to/export.zip

# Add Google Analytics tracking
python -m memento_mori.cli --gtag-id G-DX1ZWTC9NZ

# Enable verbose debugging output
python -m memento_mori.cli --verbose
```

## Viewing Your Generated Site
After the tool finishes processing your Instagram data:
1. The website will be generated in the output directory (default: ./output)
2. Open the `index.html` file in this directory with your web browser to view your Instagram archive
3. Click on "stories" in your profile stats to view your Stories archive
4. You can also upload the entire output directory to a web hosting service to share it online (see the example below)
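
For step 4, a minimal publishing sketch, assuming an SSH-accessible web host (the host name and remote path are placeholders):
```bash
# Copy the generated site to your web server; adjust the destination to your host
rsync -avz --delete output/ user@example.com:/var/www/instagram-archive/
```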

## PHP Version (Alternative)
For those who prefer the deprecated PHP implementation, there are a few notes in the deprecated_php_utility folder. In short, extract your data into the folder containing the PHP file and run:
```bash
# Run from command line
php index.php
```
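
The script's copy functions write their output into a `distribution/` folder, so, assuming that layout, a quick local preview looks like:
```bash
# Serve the PHP utility's output folder locally
python3 -m http.server -d distribution
```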

## Why This Exists
When requesting your data from Instagram, the export you receive contains your content but in a format that's intentionally difficult to navigate and enjoy. Memento Mori solves this problem by transforming your archive into an intuitive, familiar interface that brings your memories back to life.

Instagram, like many social platforms, has undergone significant "enshittification" - a term coined to describe how platforms evolve:

1. First, they attract users with a quality experience
2. Then, they leverage their position to extract data and attention
3. Finally, they degrade the user experience to maximize profit


================================================
FILE: deprecated_php_utility/index.php
================================================
<?php

// Create distribution directory if it doesn't exist
if (!file_exists('distribution')) {
    mkdir('distribution', 0755, true);
}

/**
 * Copy media files to the distribution folder
 * 
 * @param array $post_data The post data containing media URLs
 * @param string $profile_picture The profile picture URL
 */
function copy_media_files($post_data, $profile_picture) {
  // Create media directories in distribution folder
  $media_dirs = [
      'distribution/media',
      'distribution/media/posts',
      'distribution/media/other',
      'distribution/thumbnails'
  ];
  
  foreach ($media_dirs as $dir) {
      if (!file_exists($dir)) {
          mkdir($dir, 0755, true);
      }
  }
  
  // Copy profile picture
  copy_file_to_distribution($profile_picture);
  
  // Generate thumbnail for profile picture
  generate_thumbnail($profile_picture, $profile_picture);
  
  // Copy all post media
  $total_media = 0;
  $processed_media = 0;
  
  // Count total media files first
  foreach ($post_data as $post) {
      $total_media += count($post['media']);
  }
  
  fwrite(STDERR, "Generating thumbnails for $total_media media files...\n");
  
  // Process each media file
  foreach ($post_data as $post) {
      foreach ($post['media'] as $media_url) {
          copy_file_to_distribution($media_url);
          $processed_media++;
          
          // Show progress
          if ($processed_media % 10 === 0 || $processed_media === $total_media) {
              $percent = round(($processed_media / $total_media) * 100);
              fwrite(STDERR, "Progress: $processed_media/$total_media ($percent%)\n");
          }
      }
  }
  
  // Count how many thumbnails and WebP conversions were successfully generated
  $thumbnail_count = 0;
  $webp_count = 0;
  $total_size_original = 0;
  $total_size_webp = 0;
  
  if (is_dir('distribution/thumbnails')) {
      $thumbnail_count = count(glob('distribution/thumbnails/*.webp'));
  }
  
  // Count WebP conversions and calculate space savings
  foreach ($post_data as $post) {
      foreach ($post['media'] as $media_url) {
          if (preg_match('/\.(jpg|jpeg|png|gif)$/i', $media_url)) {
              $original_path = $media_url;
              $webp_path = preg_replace('/\.(jpg|jpeg|png|gif)$/i', '.webp', $media_url);
              
              if (file_exists('distribution/' . $webp_path)) {
                  $webp_count++;
                  
                  // Calculate size difference if original exists
                  if (file_exists($original_path)) {
                      $original_size = filesize($original_path);
                      $webp_size = filesize('distribution/' . $webp_path);
                      $total_size_original += $original_size;
                      $total_size_webp += $webp_size;
                  }
              }
          }
      }
  }
  
  // Calculate total space savings
  $space_saved_mb = ($total_size_original - $total_size_webp) / (1024 * 1024);
  
  fwrite(STDERR, "All media files and thumbnails processed.\n");
  fwrite(STDERR, "Successfully generated $thumbnail_count thumbnails.\n");
  fwrite(STDERR, "Successfully converted $webp_count images to WebP format.\n");
  
  if ($total_size_original > 0) {
      $percentage_saved = (($total_size_original - $total_size_webp) / $total_size_original * 100);
      fwrite(STDERR, sprintf("Total space saved: %.2f MB (%.1f%%)\n", 
          $space_saved_mb, 
          $percentage_saved
      ));
  } else {
      fwrite(STDERR, sprintf("Total space saved: %.2f MB (0.0%%)\n", $space_saved_mb));
  }
  
  echo "Media files copied to distribution folder.\n";
}

/**
* Copy a single file to the distribution folder, maintaining its path structure
* 
* @param string $file_path The path to the file
*/
function copy_file_to_distribution($file_path) {
  // Skip if it's already a data URI
  if (strpos($file_path, 'data:image') === 0) {
      return;
  }
  
  $source = $file_path;
  $destination = 'distribution/' . $file_path;
  
  // Create directory structure if it doesn't exist
  $dir = dirname($destination);
  if (!file_exists($dir)) {
      mkdir($dir, 0755, true);
  }
  
  // Check if it's an image file that can be converted to WebP
  $is_image = preg_match('/\.(jpg|jpeg|png|gif)$/i', $file_path);
  $is_video = preg_match('/\.(mp4|mov|avi|webm)$/i', $file_path);
  
  if ($is_image && file_exists($source)) {
      // Convert image to WebP for better compression
      $webp_destination = preg_replace('/\.(jpg|jpeg|png|gif)$/i', '.webp', $destination);
      convert_to_webp($source, $webp_destination);
      
      // Generate thumbnail for this file
      generate_thumbnail($source, $file_path);
  } else if (file_exists($source)) {
      // Copy the file as is (for videos and other file types)
      copy($source, $destination);
      
      // Generate thumbnail for this file
      generate_thumbnail($source, $file_path);
  }
}

/**
 * Convert an image to WebP format without cropping
 * 
 * @param string $source_path The source image path
 * @param string $destination_path The destination WebP path
 * @return bool True if successful, false otherwise
 */
function convert_to_webp($source_path, $destination_path) {
    try {
        // Detect file type by examining file contents
        $file_info = new finfo(FILEINFO_MIME_TYPE);
        $mime_type = $file_info->file($source_path);
        
        // Create image resource based on mime type
        $source_image = null;
        
        switch ($mime_type) {
            case 'image/jpeg':
                $source_image = @imagecreatefromjpeg($source_path);
                break;
            case 'image/png':
                $source_image = @imagecreatefrompng($source_path);
                // Preserve transparency for PNG
                if ($source_image) {
                    imagepalettetotruecolor($source_image);
                    imagealphablending($source_image, true);
                    imagesavealpha($source_image, true);
                }
                break;
            case 'image/gif':
                $source_image = @imagecreatefromgif($source_path);
                break;
            default:
                // Try to load as JPEG first, then PNG, then GIF as fallbacks
                $source_image = @imagecreatefromjpeg($source_path);
                if (!$source_image) {
                    $source_image = @imagecreatefrompng($source_path);
                    if ($source_image) {
                        imagepalettetotruecolor($source_image);
                        imagealphablending($source_image, true);
                        imagesavealpha($source_image, true);
                    }
                }
                if (!$source_image) {
                    $source_image = @imagecreatefromgif($source_path);
                }
                break;
        }
        
        if (!$source_image) {
            fwrite(STDERR, "Failed to create image resource for conversion: " . $source_path . "\n");
            // Fall back to copying the original file
            copy($source_path, str_replace('.webp', '.jpg', $destination_path));
            return false;
        }
        
        // Save as WebP with 80% quality (good balance between quality and file size)
        $result = imagewebp($source_image, $destination_path, 80);
        
        // Clean up
        imagedestroy($source_image);
        
        if ($result) {
            // Check if the WebP file is actually smaller than the original
            $original_size = filesize($source_path);
            $webp_size = filesize($destination_path);
            
            if ($webp_size > 0 && $webp_size < $original_size) {
                fwrite(STDERR, "Converted to WebP: " . $source_path . " (saved " . 
                       round(($original_size - $webp_size) / 1024, 2) . " KB)\n");
                return true;
            } else {
                // If WebP is larger or failed, use the original file
                unlink($destination_path);
                copy($source_path, str_replace('.webp', '.jpg', $destination_path));
                fwrite(STDERR, "WebP larger than original, using original: " . $source_path . "\n");
                return false;
            }
        } else {
            // If WebP conversion failed, use the original file
            copy($source_path, str_replace('.webp', '.jpg', $destination_path));
            fwrite(STDERR, "WebP conversion failed, using original: " . $source_path . "\n");
            return false;
        }
    } catch (Exception $e) {
        fwrite(STDERR, "Error converting to WebP: " . $e->getMessage() . "\n");
        // Fall back to copying the original file
        copy($source_path, str_replace('.webp', '.jpg', $destination_path));
        return false;
    }
}

/**
 * Generate a thumbnail for an image or video file
 * 
 * @param string $source_path The source file path
 * @param string $relative_path The relative path for naming the thumbnail
 * @return string|null The path to the generated thumbnail or null if failed
 */
function generate_thumbnail($source_path, $relative_path) {
    // Create thumbnails directory if it doesn't exist
    $thumbs_dir = 'distribution/thumbnails';
    if (!file_exists($thumbs_dir)) {
        mkdir($thumbs_dir, 0755, true);
    }
    
    // Generate a unique filename for the thumbnail based on the original path
    $thumb_filename = md5($relative_path) . '.webp';
    $thumb_path = $thumbs_dir . '/' . $thumb_filename;
    
    // Skip if thumbnail already exists
    if (file_exists($thumb_path)) {
        return $thumb_path;
    }
    
    // Target dimensions
    $target_width = 292;
    $target_height = 292;
    
    fwrite(STDERR, "Generating thumbnail for: $relative_path\n");
    
    try {
        // Check if file exists
        if (!file_exists($source_path)) {
            fwrite(STDERR, "File not found: $source_path\n");
            return null;
        }
        
        // Detect file type by examining file contents
        $file_info = new finfo(FILEINFO_MIME_TYPE);
        $mime_type = $file_info->file($source_path);
        
        // Determine if it's a video based on mime type
        $is_video = (strpos($mime_type, 'video/') === 0);
        
        // For HEIC files (often incorrectly labeled)
        $is_heic = false;
        if (strpos($mime_type, 'application/octet-stream') === 0) {
            // Check for HEIC signature
            $file_header = file_get_contents($source_path, false, null, 0, 12);
            if (strpos($file_header, 'ftypheic') !== false || 
                strpos($file_header, 'ftypmif1') !== false || 
                strpos($file_header, 'ftyphevc') !== false) {
                $is_heic = true;
            }
        }
        
        if ($is_video) {
            // For videos, try to use FFmpeg to extract a frame
            if (function_exists('exec')) {
                $temp_jpg = tempnam(sys_get_temp_dir(), 'thumb') . '.jpg';
                // Extract a frame at 1 second mark
                exec("ffmpeg -i \"$source_path\" -ss 00:00:01 -vframes 1 -vf \"scale=$target_width:$target_height:force_original_aspect_ratio=decrease,pad=$target_width:$target_height:(ow-iw)/2:(oh-ih)/2:color=black\" \"$temp_jpg\" 2>&1", $output, $return_var);
                
                if ($return_var !== 0) {
                    fwrite(STDERR, "FFmpeg error: " . implode("\n", $output) . "\n");
                    return null;
                }
                
                // Convert the extracted frame to WebP
                if (function_exists('imagecreatefromjpeg') && function_exists('imagewebp')) {
                    $image = imagecreatefromjpeg($temp_jpg);
                    imagewebp($image, $thumb_path, 80);
                    imagedestroy($image);
                    unlink($temp_jpg); // Clean up temp file
                    return $thumb_path;
                }
            }
            
            // If FFmpeg fails or is not available, use a placeholder
            fwrite(STDERR, "Could not generate video thumbnail for: $relative_path\n");
            return null;
        } else if ($is_heic) {
            // For HEIC files, try to use ImageMagick if available
            if (function_exists('exec')) {
                $temp_jpg = tempnam(sys_get_temp_dir(), 'thumb') . '.jpg';
                exec("convert \"$source_path\" \"$temp_jpg\" 2>&1", $output, $return_var);
                
                if ($return_var !== 0) {
                    fwrite(STDERR, "ImageMagick error for HEIC: " . implode("\n", $output) . "\n");
                    return null;
                }
                
                // Now process the converted JPG
                if (file_exists($temp_jpg)) {
                    $source_image = imagecreatefromjpeg($temp_jpg);
                    if (!$source_image) {
                        fwrite(STDERR, "Failed to create image from converted HEIC: $relative_path\n");
                        unlink($temp_jpg);
                        return null;
                    }
                    
                    // Process the image (resize and save as WebP)
                    $result = process_and_save_image($source_image, $thumb_path, $target_width, $target_height);
                    unlink($temp_jpg); // Clean up temp file
                    return $result;
                }
            }
            
            fwrite(STDERR, "Could not convert HEIC file: $relative_path\n");
            return null;
        } else {
            // For images, use GD library
            if (!function_exists('imagecreatefromjpeg') || !function_exists('imagewebp')) {
                fwrite(STDERR, "GD library with WebP support is required\n");
                return null;
            }
            
            // Create image resource based on mime type
            $source_image = null;
            
            switch ($mime_type) {
                case 'image/jpeg':
                    $source_image = @imagecreatefromjpeg($source_path);
                    break;
                case 'image/png':
                    $source_image = @imagecreatefrompng($source_path);
                    break;
                case 'image/gif':
                    $source_image = @imagecreatefromgif($source_path);
                    break;
                case 'image/webp':
                    $source_image = @imagecreatefromwebp($source_path);
                    break;
                default:
                    // Try to load as JPEG first, then PNG, then GIF as fallbacks
                    $source_image = @imagecreatefromjpeg($source_path);
                    if (!$source_image) {
                        $source_image = @imagecreatefrompng($source_path);
                    }
                    if (!$source_image) {
                        $source_image = @imagecreatefromgif($source_path);
                    }
                    if (!$source_image) {
                        $source_image = @imagecreatefromwebp($source_path);
                    }
                    break;
            }
            
            if (!$source_image) {
                fwrite(STDERR, "Failed to create image resource for: $relative_path (MIME: $mime_type)\n");
                return null;
            }
            
            return process_and_save_image($source_image, $thumb_path, $target_width, $target_height);
        }
    } catch (Exception $e) {
        fwrite(STDERR, "Error generating thumbnail: " . $e->getMessage() . "\n");
        return null;
    }
    
    return null;
}

/**
 * Process an image resource and save it as a WebP thumbnail
 * 
 * @param resource $source_image The source image resource
 * @param string $thumb_path The path to save the thumbnail
 * @param int $target_width The target width
 * @param int $target_height The target height
 * @return string|null The path to the generated thumbnail or null if failed
 */
function process_and_save_image($source_image, $thumb_path, $target_width, $target_height) {
    try {
        // Get original dimensions
        $original_width = imagesx($source_image);
        $original_height = imagesy($source_image);
        
        // Create the final square thumbnail
        $thumb_image = imagecreatetruecolor($target_width, $target_height);
        
        // Fill with white background
        $white = imagecolorallocate($thumb_image, 255, 255, 255);
        imagefilledrectangle($thumb_image, 0, 0, $target_width, $target_height, $white);
        
        // Calculate dimensions for cropping to ensure 1:1 aspect ratio
        // We'll take the center portion of the image
        if ($original_width > $original_height) {
            // Landscape image: crop from the center horizontally
            $src_x = ($original_width - $original_height) / 2;
            $src_y = 0;
            $src_w = $original_height;
            $src_h = $original_height;
        } else {
            // Portrait image: crop from the center vertically
            $src_x = 0;
            $src_y = ($original_height - $original_width) / 2;
            $src_w = $original_width;
            $src_h = $original_width;
        }
        
        // Copy and resize the cropped portion directly to the thumbnail
        imagecopyresampled(
            $thumb_image, $source_image,
            0, 0, $src_x, $src_y,
            $target_width, $target_height, $src_w, $src_h
        );
        
        // Save as WebP
        imagewebp($thumb_image, $thumb_path, 80);
        
        // Clean up
        imagedestroy($source_image);
        imagedestroy($thumb_image);
        
        return $thumb_path;
    } catch (Exception $e) {
        fwrite(STDERR, "Error processing image: " . $e->getMessage() . "\n");
        return null;
    }
}



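/**
 * Render the Instagram-style grid of post thumbnails as HTML.
 *
 * @param array $post_data Posts keyed by timestamp
 * @param int $lazy_after Number of posts after which images get loading="lazy"
 * @return string The generated HTML for the grid items
 */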
function render_instagram_grid($post_data, $lazy_after = 30) {
    $output = '';
    
    // Process each post
    $i=1;
    foreach ($post_data as $timestamp => $post) {
        if($i > $lazy_after){
            $lazy_load = ' loading="lazy"';
        } else {
            $lazy_load = '';
        }
        $index = $post['post_index'];
        $media_count = count($post['media']);
        
        // Determine which media to use for the grid thumbnail
        $display_media = '';
        $is_video = false;
        
        if (isset($post['media'][0])) {
            $first_media = $post['media'][0];
            $original_media = $first_media;
            $display_media = $first_media;
            
            // Check if first media is a video
            $is_video = preg_match('/\.(mp4|mov|avi|webm)$/i', $first_media);
            
            // Check if we have a thumbnail for this media
            $thumb_filename = md5($first_media) . '.webp';
            $thumb_path = 'thumbnails/' . $thumb_filename;
            
            if (file_exists('distribution/' . $thumb_path)) {
                // Use the thumbnail instead of the original
                $display_media = $thumb_path;
                fwrite(STDERR, "Using thumbnail for: $first_media\n");
            } else {
                // Check if we have a WebP version of the original image
                if (!$is_video) {
                    $webp_path = preg_replace('/\.(jpg|jpeg|png|gif)$/i', '.webp', $first_media);
                    if (file_exists('distribution/' . $webp_path)) {
                        $display_media = $webp_path;
                        fwrite(STDERR, "Using WebP version for: $first_media\n");
                    }
                }
                
                // If it's a video, look for a thumbnail among all media items
                if ($is_video) {
                    $found_thumbnail = false;
                    
                    // First check if there are any image files in the post's media that could be thumbnails
                    foreach ($post['media'] as $media_item) {
                        if (preg_match('/\.(jpg|jpeg|png|webp|gif)$/i', $media_item)) {
                            // Check if we have a thumbnail for this image
                            $img_thumb_filename = md5($media_item) . '.webp';
                            $img_thumb_path = 'thumbnails/' . $img_thumb_filename;
                            
                            if (file_exists('distribution/' . $img_thumb_path)) {
                                $display_media = $img_thumb_path;
                            } else {
                                $display_media = $media_item;
                            }
                            $found_thumbnail = true;
                            break;
                        }
                    }
                    
                    // If no thumbnail found, use a better SVG placeholder
                    if (!$found_thumbnail) {
                        // Create a simple SVG with a play button
                        $svg = '<svg xmlns="http://www.w3.org/2000/svg" width="400" height="400" viewBox="0 0 400 400">';
                        $svg .= '<rect width="400" height="400" fill="#333333"/>';
                        $svg .= '<circle cx="200" cy="200" r="60" fill="#ffffff" fill-opacity="0.8"/>';
                        $svg .= '<polygon points="180,160 180,240 240,200" fill="#333333"/>';
                        $svg .= '</svg>';
                        
                        // Encode the SVG properly for use in an img src attribute
                        $display_media = 'data:image/svg+xml;base64,' . base64_encode($svg);
                    }
                }
            }
        }
        
        $output .= '        <div class="grid-item" data-index="' . $index . '">' . "\n";
        $output .= '          <img src="' . $display_media . '" alt="Instagram post"'.$lazy_load.'>' . "\n";
        
        // Add video indicator if it's a video
        if ($is_video) {
            $output .= '          <div class="video-indicator">▶ Video</div>' . "\n";
        }
        
        if ($media_count > 1) {
            $output .= '          <div class="multi-indicator">⊞ ' . $media_count . '</div>' . "\n";
        } elseif (isset($post['Likes']) && $post['Likes'] !== '') {
            $output .= '          <div class="likes-indicator">♥ ' . $post['Likes'] . '</div>' . "\n";
        }
        
        $output .= '        </div>' . "\n";
        $i++;
    }
    
    return $output;
}

date_default_timezone_set("America/New_York");


$personal_data = file_get_contents("personal_information/personal_information/personal_information.json");
$personal_data = json_decode($personal_data,true);
$profile_picture = $personal_data['profile_user'][0]['media_map_data']['Profile Photo']['uri'];
$user_name = $personal_data['profile_user'][0]["string_map_data"]["Username"]["value"];
unset($personal_data);

//echo "profile picture: $profile_picture\nusername: $user_name\n";


$location_data = file_get_contents("personal_information/information_about_you/profile_based_in.json");
$location_data  = json_decode($location_data ,true);
$location = $location_data['inferred_data_primary_location'][0]['string_map_data']['City Name']['value'];
unset($location_data);

//echo "location: $location\n";


// Function to search for posts_1.json file recursively
function find_posts_json() {
    $standard_path = 'your_instagram_activity/content/posts_1.json';
    
    // First check the standard location
    if (file_exists($standard_path)) {
        return $standard_path;
    }
    
    // If not found, search recursively
    fwrite(STDERR, "posts_1.json not found in standard location, searching directories...\n");
    
    $found_files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator('.', RecursiveDirectoryIterator::SKIP_DOTS)
    );
    
    foreach ($iterator as $file) {
        if ($file->getFilename() === 'posts_1.json') {
            $found_files[] = $file->getPathname();
        }
    }
    
    if (empty($found_files)) {
        fwrite(STDERR, "ERROR: Could not find posts_1.json anywhere in the directory structure.\n");
        return false;
    }
    
    // If multiple files found, use the one that seems most likely
    if (count($found_files) > 1) {
        fwrite(STDERR, "Found multiple posts_1.json files:\n");
        foreach ($found_files as $index => $path) {
            fwrite(STDERR, "  [$index] $path\n");
        }
        
        // Try to find the one in a directory with "content" or "activity" in the path
        foreach ($found_files as $path) {
            if (strpos($path, 'content') !== false || strpos($path, 'activity') !== false) {
                fwrite(STDERR, "Selected: $path\n");
                return $path;
            }
        }
        
        // If no preferred path found, use the first one
        fwrite(STDERR, "Selected: {$found_files[0]}\n");
        return $found_files[0];
    }
    
    fwrite(STDERR, "Found posts_1.json at: {$found_files[0]}\n");
    return $found_files[0];
}

// Load and decode the JSON files
$insights_data = file_get_contents('logged_information/past_instagram_insights/posts.json');
$insights_data = json_decode($insights_data, true);

$posts_json_path = find_posts_json();
if (!$posts_json_path) {
    die("ERROR: Could not find the posts_1.json file. Please ensure your Instagram data is properly extracted.");
}
$post_data = file_get_contents($posts_json_path);
$post_data = json_decode($post_data, true);

// Create an indexed array of insights data using creation_timestamp as key
$indexed_insights = [];
foreach ($insights_data['organic_insights_posts'] as $insight) {
    $timestamp = $insight['media_map_data']['Media Thumbnail']['creation_timestamp'];
    $indexed_insights[$timestamp] = $insight;
}

// Combine the data
$combined_data = [];
foreach ($post_data as $post) {
    // Get the timestamp from the first media item (since a post might have multiple media items)
    $timestamp = $post['media'][0]['creation_timestamp'];
    
    // Create the combined post object
    $combined_post = [
        'post_data' => $post,
        'insights' => isset($indexed_insights[$timestamp]) ? $indexed_insights[$timestamp] : null
    ];
    
    // Add to combined data array
    $combined_data[] = $combined_post;
}

unset($post_data);
unset($insights_data);
unset($indexed_insights);

function extractRelevantData($combined_data) {
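    // Flattens the combined post + insights records into one summary entry per post:
    // media URIs, unix and human-readable creation timestamps, title, and the
    // Impressions/Likes/Comments metrics (cast to integers when numeric, otherwise "").
    // Entries are keyed by creation timestamp and returned newest-first via krsort().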
    $simplified_data = [];

    foreach ($combined_data as $index => $item) {
        // Initialize a new post entry
        $post_entry = [
            'post_index' => $index,
            'media' => [],
            'creation_timestamp_unix' => "",
            'creation_timestamp_readable' => "",
            'title' => "",
            'Impressions' => "",
            'Likes' => "",
            'Comments' => ""
        ];
        
        // Extract post-level data
        if (isset($item['post_data'])) {
            if (isset($item['post_data']['creation_timestamp'])) {
                $post_entry['creation_timestamp_unix'] = $item['post_data']['creation_timestamp'];
            } elseif (isset($item['post_data']['media'][0]['creation_timestamp'])) {
                // Fallback to first media item timestamp if post timestamp not available
                $post_entry['creation_timestamp_unix'] = $item['post_data']['media'][0]['creation_timestamp'];
            }

            $post_entry['creation_timestamp_readable'] = gmdate("F j, Y \a\\t g:i A", $post_entry['creation_timestamp_unix']);

            if (isset($item['post_data']['title'])) {
                $post_entry['title'] = $item['post_data']['title'];
            }
            
            // Extract media URIs
            if (isset($item['post_data']['media'])) {
                foreach ($item['post_data']['media'] as $media) {
                    $post_entry['media'][] = $media['uri'] ?? "";
                }
            }
        }
        
        // Get insights data if available
        if (isset($item['insights']) && isset($item['insights']['string_map_data'])) {
            $insights = $item['insights']['string_map_data'];
            
            // Extract specific metrics and ensure they're integers or blank
            if (isset($insights['Impressions'])) {
                $impressions = $insights['Impressions']['value'] ?? "";
                // Validate and convert to integer if numeric, otherwise leave blank
                $post_entry['Impressions'] = is_numeric($impressions) ? (int)$impressions : "";
            }
            
            if (isset($insights['Likes'])) {
                $likes = $insights['Likes']['value'] ?? "";
                // Validate and convert to integer if numeric, otherwise leave blank
                $post_entry['Likes'] = is_numeric($likes) ? (int)$likes : "";
            }
            
            if (isset($insights['Comments'])) {
                $comments = $insights['Comments']['value'] ?? "";
                // Validate and convert to integer if numeric, otherwise leave blank
                $post_entry['Comments'] = is_numeric($comments) ? (int)$comments : "";
            }
        }
        
        $simplified_data[$post_entry['creation_timestamp_unix']] = $post_entry;
        
    }

    krsort($simplified_data);
    return $simplified_data;
}

$post_data = extractRelevantData($combined_data);
unset($combined_data);



echo "<br><br><br>";

// Assuming your array is stored in $post_data
$keys = array_keys($post_data);

// Get first and last keys
$first_key = reset($keys); // Or $keys[0]
$last_key = end($keys);    // Or $keys[count($keys) - 1]

// Get timestamps from first and last elements
$last_timestamp = gmdate("F Y",$post_data[$first_key]['creation_timestamp_unix']);
$first_timestamp = gmdate("F Y",$post_data[$last_key]['creation_timestamp_unix']);

//echo "<pre>" . print_r($post_data, true) . "</pre>";



?>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Memento Mori</title>
    <link rel="stylesheet" href="style.css">
    <!-- Script to make post data available to JavaScript -->
    <script>
        window.postData = <?php echo json_encode($post_data); ?>;
    </script>
    <?php
    // Include the modal.js content directly in the output
    echo '<script>';
    echo file_get_contents('modal.js');
    echo '</script>';
    ?>
  </head>
  <body class="vsc-initialized">
    <header>
      <div class="header-content">
        <a href="https://github.com/greg-randall/memento-mori" class="logo">Memento Mori</a>
        <div class="date-range-header" id="date-range-header"><?php echo "$first_timestamp - $last_timestamp"; ?></div>
      </div>
    </header>
    <main>
      <div class="loading" id="loadingPosts" style="display: none;"> Loading posts... </div>
      <div class="profile-info">
        <div class="profile-picture">
          <img alt="Profile Picture" src="<?php echo $profile_picture; ?>" style="width: 100%; height: 100%; object-fit: cover; border-radius: 50%;">
        </div>
        <div class="profile-details">
          <h1 id="username"><?php echo $user_name; ?></h1>
          <div class="stats">
            <div class="stat">
              <span class="stat-count" id="post-count"><?php echo count($post_data); ?></span> posts
            </div>
          </div>
        </div>
      </div>
      <div class="sort-options">
        <div class="sort-row">
          <a href="#" class="sort-link active" data-sort="newest">Newest</a>
          <a href="#" class="sort-link" data-sort="oldest">Oldest</a>
          <a href="#" class="sort-link" data-sort="most-likes">Most Likes</a>
          <a href="#" class="sort-link" data-sort="most-comments">Most Comments</a>
          <a href="#" class="sort-link" data-sort="most-views">Most Views</a>
          <a href="#" class="sort-link" data-sort="random">Random</a>
        </div>
      </div>
      <div class="posts-grid" id="postsGrid">
        <?php echo render_instagram_grid($post_data); ?>
        </div>
       
    </main>

    <!-- Modal for post details -->
    <div class="post-modal" id="postModal">
        <div class="close-modal" id="closeModal">✕</div>
        <div class="modal-nav modal-prev" id="modalPrev">❮</div>
        <div class="modal-nav modal-next" id="modalNext">❯</div>
        <div class="post-modal-content">
            <div class="post-media" id="postMedia"></div>
            <div class="post-info">
                <div class="post-header">
                    <div class="post-user" id="postUserPic">
                        <img src="<?php echo $profile_picture; ?>" alt="Profile" style="width: 100%; height: 100%; object-fit: cover; border-radius: 50%;">
                    </div>
                    <div class="post-username" id="postUsername"><?php echo $user_name; ?></div>
                </div>
                <div class="post-caption" id="postCaption"></div>
                <div class="post-stats" id="postStats"></div>
                <div class="post-date" id="postDate"></div>
            </div>
        </div>
    </div>
  </body>
</html>

<?php
// Copy media files and generate thumbnails first
copy_media_files($post_data, $profile_picture);

// Now start output buffering to capture HTML after thumbnails are generated
ob_start();

// Include the HTML generation code here
?>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Memento Mori</title>
    <link rel="stylesheet" href="style.css">
    <!-- Script to make post data available to JavaScript -->
    <script>
        window.postData = <?php echo json_encode($post_data); ?>;
        
        // Function to copy the current URL to clipboard
        function copyCurrentUrl() {
            const url = window.location.href;
            navigator.clipboard.writeText(url)
                .then(() => {
                    alert('Link copied to clipboard!');
                })
                .catch(err => {
                    console.error('Could not copy URL: ', err);
                });
        }
    </script>
    <?php
    // Include the modal.js content directly in the output
    echo '<script>';
    echo file_get_contents('modal.js');
    echo '</script>';
    ?>
  </head>
  <body class="vsc-initialized">
    <header>
      <div class="header-content">
        <a href="https://github.com/greg-randall/memento-mori" class="logo">Memento Mori</a>
        <div class="date-range-header" id="date-range-header"><?php echo "$first_timestamp - $last_timestamp"; ?></div>
      </div>
    </header>
    <main>
      <div class="loading" id="loadingPosts" style="display: none;"> Loading posts... </div>
      <div class="profile-info">
        <div class="profile-picture">
          <img alt="Profile Picture" src="<?php echo $profile_picture; ?>" style="width: 100%; height: 100%; object-fit: cover; border-radius: 50%;">
        </div>
        <div class="profile-details">
          <h1 id="username"><?php echo $user_name; ?></h1>
          <div class="stats">
            <div class="stat">
              <span class="stat-count" id="post-count"><?php echo count($post_data); ?></span> posts
            </div>
          </div>
        </div>
      </div>
      <div class="sort-options">
        <div class="sort-row">
          <a href="#" class="sort-link active" data-sort="newest">Newest</a>
          <a href="#" class="sort-link" data-sort="oldest">Oldest</a>
          <a href="#" class="sort-link" data-sort="most-likes">Most Likes</a>
          <a href="#" class="sort-link" data-sort="most-comments">Most Comments</a>
          <a href="#" class="sort-link" data-sort="most-views">Most Views</a>
          <a href="#" class="sort-link" data-sort="random">Random</a>
        </div>
      </div>
      <div class="posts-grid" id="postsGrid">
        <?php echo render_instagram_grid($post_data); ?>
        </div>
       
    </main>

    <!-- Modal for post details -->
    <div class="post-modal" id="postModal">
        <div class="close-modal" id="closeModal">✕</div>
        <div class="modal-nav modal-prev" id="modalPrev">❮</div>
        <div class="modal-nav modal-next" id="modalNext">❯</div>
        <div class="post-modal-content">
            <div class="post-media" id="postMedia"></div>
            <div class="post-info">
                <div class="post-header">
                    <div class="post-user" id="postUserPic">
                        <img src="<?php echo $profile_picture; ?>" alt="Profile" style="width: 100%; height: 100%; object-fit: cover; border-radius: 50%;">
                    </div>
                    <div class="post-username" id="postUsername"><?php echo $user_name; ?></div>
                </div>
                <div class="post-caption" id="postCaption"></div>
                <div class="post-stats" id="postStats"></div>
                <div class="post-date" id="postDate"></div>
            </div>
        </div>
    </div>
  </body>
</html>
<?php
// Get the HTML content
$html_content = ob_get_contents();
ob_end_clean();

// Write the HTML to the distribution folder
file_put_contents('distribution/index.html', $html_content);

// Copy the CSS file to the distribution folder
if (file_exists('style.css')) {
    copy('style.css', 'distribution/style.css');
    fwrite(STDERR, "CSS file copied to distribution folder.\n");
}

// Verify all images in the HTML are accessible
fwrite(STDERR, "Verifying all images in the generated HTML...\n");
verify_images_in_html($html_content);

// Output the content to the browser as well
echo $html_content;

/**
 * Verify that all images referenced in the HTML actually exist
 * 
 * @param string $html_content The HTML content to check
 */
function verify_images_in_html($html_content) {
    // Extract all image sources from the HTML
    preg_match_all('/<img[^>]+src=([\'"])([^"\']+)\\1/i', $html_content, $matches);
    
    $image_sources = $matches[2];
    $total_images = count($image_sources);
    $missing_images = 0;
    $fixed_images = 0;
    
    fwrite(STDERR, "Found $total_images image references to verify.\n");
    
    foreach ($image_sources as $src) {
        // Skip data URIs
        if (strpos($src, 'data:image') === 0) {
            continue;
        }
        
        // Check if the image exists in the distribution folder
        $image_path = 'distribution/' . $src;
        
        if (!file_exists($image_path)) {
            $missing_images++;
            fwrite(STDERR, "Missing image: $src\n");
            
            // Try to find the image with a different extension
            $base_path = pathinfo($image_path, PATHINFO_DIRNAME) . '/' . pathinfo($image_path, PATHINFO_FILENAME);
            $found = false;
            
            // Check common image extensions
            foreach (['.jpg', '.jpeg', '.png', '.gif', '.webp'] as $ext) {
                $alt_path = $base_path . $ext;
                if (file_exists($alt_path)) {
                    fwrite(STDERR, "  Found alternative: " . basename($alt_path) . "\n");
                    
                    // Copy the file to the expected path
                    copy($alt_path, $image_path);
                    $fixed_images++;
                    $found = true;
                    break;
                }
            }
            
            if (!$found) {
                // Check if the original file exists (before distribution)
                $original_src = $src;
                if (file_exists($original_src)) {
                    fwrite(STDERR, "  Found original file, copying to distribution: $original_src\n");
                    
                    // Create directory if it doesn't exist
                    $dir = dirname($image_path);
                    if (!file_exists($dir)) {
                        mkdir($dir, 0755, true);
                    }
                    
                    // Copy the file
                    copy($original_src, $image_path);
                    $fixed_images++;
                }
            }
        }
    }
    
    // Report results
    if ($missing_images === 0) {
        fwrite(STDERR, "All images verified successfully!\n");
    } else {
        fwrite(STDERR, "Found $missing_images missing images, fixed $fixed_images.\n");
        if ($missing_images > $fixed_images) {
            fwrite(STDERR, "WARNING: " . ($missing_images - $fixed_images) . " images could not be fixed.\n");
        }
    }
}
?>



================================================
FILE: deprecated_php_utility/modal.js
================================================
document.addEventListener('DOMContentLoaded', function() {
    // Get DOM elements
    const postsGrid = document.getElementById('postsGrid');
    const postModal = document.getElementById('postModal');
    const closeModalBtn = document.getElementById('closeModal');
    const modalPrev = document.getElementById('modalPrev');
    const modalNext = document.getElementById('modalNext');
    const postMedia = document.getElementById('postMedia');
    const postCaption = document.getElementById('postCaption');
    const postStats = document.getElementById('postStats');
    const postDate = document.getElementById('postDate');
    const postUsername = document.getElementById('postUsername');
    const postUserPic = document.getElementById('postUserPic');
    const sortLinks = document.querySelectorAll('.sort-link');
    
    // Global variables to track current post and indexes
    let currentPostIndex = -1;
    let currentSlideIndex = 0;
    let postIndexToTimestamp = {}; // Map post index to timestamp
    let currentSortType = 'newest'; // Default sort
    
    // Initialize by creating mapping and attaching listeners
    function initialize() {
        // Create a mapping from post_index to timestamp
        Object.entries(window.postData).forEach(([timestamp, post]) => {
            postIndexToTimestamp[post.post_index] = timestamp;
        });
        
        // Attach click listeners to grid items
        attachGridItemListeners();
        
        // Initialize sorting functionality
        initializeSorting();
    }
    
    // Initialize sorting functionality
    function initializeSorting() {
        // Add event listeners to sort links
        sortLinks.forEach(link => {
            link.addEventListener('click', function(e) {
                e.preventDefault();
                
                // Update active class
                sortLinks.forEach(l => l.classList.remove('active'));
                this.classList.add('active');
                
                // Get sort type and sort posts
                const sortType = this.getAttribute('data-sort');
                currentSortType = sortType;
                sortPosts(sortType);
            });
        });
    }
    
    // Sort posts based on selected criteria
    function sortPosts(sortType) {
        // Get all grid items
        let gridItems = Array.from(document.querySelectorAll('.grid-item'));
        
        // Sort the grid items based on the selected criteria
        switch(sortType) {
            case 'newest':
                // Sort by timestamp (newest first) - this is the default
                gridItems.sort((a, b) => {
                    const indexA = parseInt(a.getAttribute('data-index'));
                    const indexB = parseInt(b.getAttribute('data-index'));
                    const timestampA = getTimestampByIndex(indexA);
                    const timestampB = getTimestampByIndex(indexB);
                    return timestampB - timestampA;
                });
                break;
                
            case 'oldest':
                // Sort by timestamp (oldest first)
                gridItems.sort((a, b) => {
                    const indexA = parseInt(a.getAttribute('data-index'));
                    const indexB = parseInt(b.getAttribute('data-index'));
                    const timestampA = getTimestampByIndex(indexA);
                    const timestampB = getTimestampByIndex(indexB);
                    return timestampA - timestampB;
                });
                break;
                
            case 'most-likes':
                // Sort by number of likes
                gridItems.sort((a, b) => {
                    const indexA = parseInt(a.getAttribute('data-index'));
                    const indexB = parseInt(b.getAttribute('data-index'));
                    const likesA = getLikesByIndex(indexA) || 0;
                    const likesB = getLikesByIndex(indexB) || 0;
                    return likesB - likesA;
                });
                break;
                
            case 'most-comments':
                // Sort by number of comments
                gridItems.sort((a, b) => {
                    const indexA = parseInt(a.getAttribute('data-index'));
                    const indexB = parseInt(b.getAttribute('data-index'));
                    const commentsA = getCommentsByIndex(indexA) || 0;
                    const commentsB = getCommentsByIndex(indexB) || 0;
                    return commentsB - commentsA;
                });
                break;
                
            case 'most-views':
                // Sort by number of views/impressions
                gridItems.sort((a, b) => {
                    const indexA = parseInt(a.getAttribute('data-index'));
                    const indexB = parseInt(b.getAttribute('data-index'));
                    const viewsA = getViewsByIndex(indexA) || 0;
                    const viewsB = getViewsByIndex(indexB) || 0;
                    return viewsB - viewsA;
                });
                break;
                
            case 'random':
                // Shuffle the grid items randomly
                gridItems.sort(() => Math.random() - 0.5);
                break;
        }
        
        // Reorder the grid items in the DOM
        const fragment = document.createDocumentFragment();
        gridItems.forEach(item => {
            fragment.appendChild(item);
        });
        
        // Clear the grid and append the sorted items
        postsGrid.innerHTML = '';
        postsGrid.appendChild(fragment);
        
        // Reattach event listeners to grid items
        attachGridItemListeners();
    }
    
    // Helper function to get timestamp by post index
    function getTimestampByIndex(index) {
        const timestamp = postIndexToTimestamp[index];
        return parseInt(timestamp);
    }
    
    // Helper function to get likes by post index
    function getLikesByIndex(index) {
        const timestamp = postIndexToTimestamp[index];
        if (timestamp && window.postData[timestamp]) {
            return parseInt(window.postData[timestamp].Likes) || 0;
        }
        return 0;
    }
    
    // Helper function to get comments by post index
    function getCommentsByIndex(index) {
        const timestamp = postIndexToTimestamp[index];
        if (timestamp && window.postData[timestamp]) {
            return parseInt(window.postData[timestamp].Comments) || 0;
        }
        return 0;
    }
    
    // Helper function to get views/impressions by post index
    function getViewsByIndex(index) {
        const timestamp = postIndexToTimestamp[index];
        if (timestamp && window.postData[timestamp]) {
            return parseInt(window.postData[timestamp].Impressions) || 0;
        }
        return 0;
    }
    
    // Attach click event listeners to all grid items
    function attachGridItemListeners() {
        const gridItems = document.querySelectorAll('.grid-item');
        gridItems.forEach(item => {
            item.addEventListener('click', function() {
                const postIndex = parseInt(this.getAttribute('data-index'));
                openModal(postIndex);
            });
        });
    }
    
    // Open the modal with the selected post
function openModal(index, imageIndex = 0) {
    currentPostIndex = index;
    
    // Store the current scroll position before opening the modal
    const scrollPosition = window.pageYOffset || document.documentElement.scrollTop;
    
    // Get the timestamp using the post_index mapping
    const timestamp = postIndexToTimestamp[index];
    
    // Get the post data using the timestamp
    const post = window.postData[timestamp];
    
    // Show the modal first (important for correct dimensions)
    postModal.style.display = 'block';
    document.body.style.overflow = 'hidden'; // Prevent scrolling
    
    // Store the scroll position as a data attribute on the modal
    postModal.setAttribute('data-scroll-position', scrollPosition);
    
    // Update modal content
    updateModalContent(post, imageIndex);
    
    // Update URL with post ID and image index
    updateUrlWithPostInfo(timestamp, imageIndex);
    
    // For mobile devices, ensure content is visible and properly sized
    if (window.innerWidth <= 768) {
        // Don't scroll to top on mobile as it causes the issue
        // Instead, just ensure the modal is properly positioned
        postModal.scrollTop = 0;
        
        // Force layout recalculation with a longer timeout
        setTimeout(() => {
            const mediaContainer = document.querySelector('.media-container');
            const postMediaEl = document.getElementById('postMedia');
            
            // Ensure post-media has explicit height
            if (postMediaEl) {
                postMediaEl.style.height = '50vh';
                postMediaEl.style.minHeight = '300px';
            }
            
            // Ensure media-container has explicit height
            if (mediaContainer) {
                mediaContainer.style.height = '100%';
                mediaContainer.style.display = 'flex';
                
                // Force reflow
                void mediaContainer.offsetHeight;
            }
            
            // Reset any active slides to ensure they're visible
            const activeSlides = document.querySelectorAll('.media-slide.active');
            activeSlides.forEach(slide => {
                slide.style.opacity = '0';
                void slide.offsetHeight; // Force reflow
                slide.style.opacity = '1';
                
                // Make sure images have height
                const img = slide.querySelector('img');
                if (img) {
                    img.style.maxHeight = '100%';
                    img.style.width = 'auto';
                    img.style.height = 'auto';
                }
            });
        }, 50); // Increase timeout for more reliability
    }
}

// Function to update the URL with post and image information
function updateUrlWithPostInfo(timestamp, imageIndex) {
    // Create a new URL object based on the current URL
    const url = new URL(window.location.href);
    
    // Set the post parameter to the timestamp
    url.searchParams.set('post', timestamp);
    
    // Only add the image parameter if it's not the first image
    if (imageIndex > 0) {
        url.searchParams.set('image', imageIndex);
    } else {
        url.searchParams.delete('image');
    }
    
    // Update the browser history without reloading the page
    window.history.pushState({}, '', url);
}
    //Creates the appropriate media element (video or image) based on the file type
    function createMediaElement(mediaUrl) {
        // Check if the media is a video based on file extension
        if (mediaUrl.endsWith('.mp4') || mediaUrl.endsWith('.mov') || 
            mediaUrl.endsWith('.avi') || mediaUrl.endsWith('.webm')) {
            
            // Create video element
            const video = document.createElement('video');
            video.src = mediaUrl;
            video.controls = true;
            video.autoplay = true;
            video.loop = true;
            video.muted = false;
            video.playsInline = true;
            video.setAttribute('aria-label', 'Instagram video post');
            
            return video;
        } else {
            // Create image element
            const img = document.createElement('img');
            
            // Check if there's a WebP version available for non-WebP images
            if (!mediaUrl.endsWith('.webp') && 
                (mediaUrl.endsWith('.jpg') || mediaUrl.endsWith('.jpeg') || 
                 mediaUrl.endsWith('.png') || mediaUrl.endsWith('.gif'))) {
                
                // Try to use WebP version if it exists
                const webpUrl = mediaUrl.replace(/\.(jpg|jpeg|png|gif)$/i, '.webp');
                
                // Set up error handling to fall back to original if WebP doesn't exist
                img.onerror = function() {
                    this.onerror = null; // Prevent infinite loop
                    this.src = mediaUrl; // Fall back to original
                };
                
                img.src = webpUrl;
            } else {
                img.src = mediaUrl;
            }
            
            img.alt = 'Instagram post';
            
            return img;
        }
    }
    // Update the modal content with the post data
    function updateModalContent(post, initialImageIndex = 0) {
        // Clear previous content
        postMedia.innerHTML = '';
        postCaption.innerHTML = '';
        postStats.innerHTML = '';
        
        // Create media container for the slides
        const mediaContainer = document.createElement('div');
        mediaContainer.className = 'media-container';
        
        // Check if the post has multiple media
        if (post.media && post.media.length > 1) {
            // Create slides for each media item
            post.media.forEach((mediaUrl, index) => {
                const slide = document.createElement('div');
                slide.className = `media-slide ${index === initialImageIndex ? 'active' : ''}`;
                
                // Create and add the appropriate media element
                const mediaElement = createMediaElement(mediaUrl);
                slide.appendChild(mediaElement);
                
                mediaContainer.appendChild(slide);
            });
            
            // Add navigation buttons for slideshow
            const prevBtn = document.createElement('div');
            prevBtn.className = 'slideshow-nav slideshow-prev';
            prevBtn.innerHTML = '❮';
            prevBtn.addEventListener('click', function(e) {
                e.stopPropagation();
                navigateSlideshow(-1);
            });
            
            const nextBtn = document.createElement('div');
            nextBtn.className = 'slideshow-nav slideshow-next';
            nextBtn.innerHTML = '❯';
            nextBtn.addEventListener('click', function(e) {
                e.stopPropagation();
                navigateSlideshow(1);
            });
            
            // Add indicator dots
            const indicator = document.createElement('div');
            indicator.className = 'slideshow-indicator';
            
            for (let i = 0; i < post.media.length; i++) {
                const dot = document.createElement('div');
                dot.className = `slideshow-dot ${i === initialImageIndex ? 'active' : ''}`;
                dot.setAttribute('data-index', i);
                dot.addEventListener('click', function(e) {
                    e.stopPropagation();
                    const index = parseInt(this.getAttribute('data-index'));
                    showSlide(index);
                });
                indicator.appendChild(dot);
            }
            
            mediaContainer.appendChild(prevBtn);
            mediaContainer.appendChild(nextBtn);
            mediaContainer.appendChild(indicator);
            
            // Set the current slide index to the initial image index
            currentSlideIndex = initialImageIndex;
        } else {
            // Single media post
            const slide = document.createElement('div');
            slide.className = 'media-slide active';
            
            // Create and add the appropriate media element
            const mediaElement = createMediaElement(post.media[0]);
            slide.appendChild(mediaElement);
            
            mediaContainer.appendChild(slide);
        }
        
        postMedia.appendChild(mediaContainer);
        
        // Set post caption
        postCaption.textContent = post.title || '';
        
        // Set post stats
        if (post.Impressions) {
            const impressionsDiv = document.createElement('div');
            impressionsDiv.className = 'post-stat';
            impressionsDiv.innerHTML = `
                <span class="post-stat-icon">👁️</span>
                <span>${post.Impressions} views</span>
            `;
            postStats.appendChild(impressionsDiv);
        }
        
        if (post.Likes) {
            const likesDiv = document.createElement('div');
            likesDiv.className = 'post-stat';
            likesDiv.innerHTML = `
                <span class="post-stat-icon">♥</span>
                <span>${post.Likes}</span>
            `;
            postStats.appendChild(likesDiv);
        }
        
        if (post.Comments) {
            const commentsDiv = document.createElement('div');
            commentsDiv.className = 'post-stat';
            commentsDiv.innerHTML = `
                <span class="post-stat-icon">💬</span>
                <span>${post.Comments} comments</span>
            `;
            postStats.appendChild(commentsDiv);
        }
        
        // Set post date
        postDate.textContent = post.creation_timestamp_readable;
        
        // Show/hide stats container based on whether there are any stats
        postStats.style.display = postStats.children.length > 0 ? 'flex' : 'none';
    }
    
    // Navigate between slides in a multi-media post
    function navigateSlideshow(direction) {
        const slides = document.querySelectorAll('.media-slide');
        let activeIndex = 0;
        
        // Find the currently active slide
        slides.forEach((slide, index) => {
            if (slide.classList.contains('active')) {
                activeIndex = index;
            }
        });
        
        // Pause any videos in the current slide
        const currentVideo = slides[activeIndex].querySelector('video');
        if (currentVideo) {
            currentVideo.pause();
        }
        
        // Calculate the new index
        let newIndex = activeIndex + direction;
        if (newIndex < 0) newIndex = slides.length - 1;
        if (newIndex >= slides.length) newIndex = 0;
        
        // Update slides and dots
        showSlide(newIndex);
    }
    
    // Show a specific slide
    function showSlide(index) {
        const slides = document.querySelectorAll('.media-slide');
        const dots = document.querySelectorAll('.slideshow-dot');
        
        // Pause all videos before changing slides
        slides.forEach(slide => {
            const video = slide.querySelector('video');
            if (video) {
                video.pause();
            }
        });
        
        // Remove active class from all slides and dots
        slides.forEach(slide => slide.classList.remove('active'));
        if (dots.length > 0) {
            dots.forEach(dot => dot.classList.remove('active'));
            dots[index].classList.add('active');
        }
        
        // Add active class to the selected slide
        slides[index].classList.add('active');
        
        // Update current slide index
        currentSlideIndex = index;
        
        // Update URL with the new image index
        const timestamp = postIndexToTimestamp[currentPostIndex];
        updateUrlWithPostInfo(timestamp, index);
    }
    
    // Navigate between posts (next/prev buttons in modal)
    function navigatePost(direction) {
        // Pause all videos in the current post
        const videos = document.querySelectorAll('.media-slide video');
        videos.forEach(video => {
            if (video) {
                video.pause();
            }
        });
        
        // Get all grid items in their current sorted order
        const gridItems = Array.from(document.querySelectorAll('.grid-item'));
        const gridIndexes = gridItems.map(item => parseInt(item.getAttribute('data-index')));
        
        // Find the position of the current post in the sorted grid
        const currentPosition = gridIndexes.indexOf(currentPostIndex);
        
        if (currentPosition === -1) {
            console.error('Current post not found in grid');
            return;
        }
        
        // Calculate new position with wraparound
        let newPosition = (currentPosition + direction + gridIndexes.length) % gridIndexes.length;
        
        // Get the new post index from the grid's current order
        const newPostIndex = gridIndexes[newPosition];
        
        // Open the new post
        openModal(newPostIndex);
    }
    
    // Close the modal
    function closeModal() {
        // Pause all videos before closing the modal
        const videos = document.querySelectorAll('.media-slide video');
        videos.forEach(video => {
            if (video) {
                video.pause();
            }
        });
        
        // Store the current scroll position before closing the modal
        const scrollPosition = window.pageYOffset || document.documentElement.scrollTop;
        
        postModal.style.display = 'none';
        document.body.style.overflow = 'auto'; // Re-enable scrolling
        
        // Remove post and image parameters from URL
        const url = new URL(window.location.href);
        url.searchParams.delete('post');
        url.searchParams.delete('image');
        window.history.pushState({}, '', url);
        
        // Restore the scroll position after a short delay
        setTimeout(() => {
            window.scrollTo({
                top: scrollPosition,
                behavior: 'auto' // Use 'auto' instead of 'smooth' to prevent visible scrolling
            });
        }, 10);
    }
    
    // Event listeners for modal navigation
    closeModalBtn.addEventListener('click', closeModal);
    modalPrev.addEventListener('click', function(e) {
        e.stopPropagation();
        navigatePost(-1);
    });
    modalNext.addEventListener('click', function(e) {
        e.stopPropagation();
        navigatePost(1);
    });
    
    // Close modal when clicking outside of content
    postModal.addEventListener('click', function(e) {
        if (e.target === postModal) {
            closeModal();
        }
    });
    
    // Keyboard navigation
    document.addEventListener('keydown', function(e) {
        if (postModal.style.display === 'block') {
            if (e.key === 'Escape') {
                closeModal();
            } else if (e.key === 'ArrowLeft') {
                navigatePost(-1);
            } else if (e.key === 'ArrowRight') {
                navigatePost(1);
            }
        }
    });
    
    // Initialize the modal functionality
    if (typeof window.postData !== 'undefined') {
        initialize();
        
        // Check if URL has post and image parameters
        const urlParams = new URLSearchParams(window.location.search);
        const postTimestamp = urlParams.get('post');
        const imageIndex = parseInt(urlParams.get('image') || '0');
        
        if (postTimestamp && window.postData[postTimestamp]) {
            // Find the post index from the timestamp
            let postIndex = -1;
            Object.entries(postIndexToTimestamp).forEach(([index, timestamp]) => {
                if (timestamp === postTimestamp) {
                    postIndex = parseInt(index);
                }
            });
            
            if (postIndex >= 0) {
                // Open the modal with the specified post and image
                setTimeout(() => {
                    openModal(postIndex, imageIndex);
                }, 500); // Delay to ensure everything is loaded
            }
        }
    } else {
        console.error('Post data not available');
    }
});


================================================
FILE: deprecated_php_utility/notes.md
================================================
The PHP version may be easier to run on shared hosting environments and doesn't require additional packages if PHP is already installed with the necessary extensions.

## Troubleshooting

- If you see errors about the GD library or WebP support, you may need to install additional PHP extensions (a quick check is sketched below)
- For video thumbnail generation, ensure FFmpeg is installed and accessible in your system path
- For HEIC file support, ensure ImageMagick is installed
- If using Docker, ensure you have permissions to write to the output directory
- For large archives, be patient as processing media files can take time
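
If you want to confirm the PHP-side dependencies before running the utility, a minimal check along these lines can help. This is only a sketch, not part of the repository; it assumes the PHP CLI is available and that `convert` is the ImageMagick binary on your system:

```php
<?php
// check_environment.php (hypothetical helper for the troubleshooting steps above).

// GD with WebP support is needed for thumbnail generation.
echo 'GD extension:    ' . (extension_loaded('gd') ? 'yes' : 'NO') . "\n";
echo 'WebP functions:  ' . (function_exists('imagewebp') ? 'yes' : 'NO') . "\n";

// FFmpeg is needed for video thumbnails; ImageMagick for HEIC support.
foreach (['ffmpeg' => 'ffmpeg -version', 'ImageMagick' => 'convert -version'] as $name => $cmd) {
    $output = [];
    exec($cmd . ' 2>&1', $output, $status);
    echo str_pad($name . ':', 17) . ($status === 0 ? 'yes' : 'NO') . "\n";
}
```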


================================================
FILE: deprecated_php_utility/style.css
================================================
:root {
  --instagram-bg: #fafafa;
  --instagram-border: #dbdbdb;
  --instagram-text: #262626;
  --instagram-link: #0095f6;
  --header-height: 60px;
}

* {
  margin: 0;
  padding: 0;
  box-sizing: border-box;
}

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
  background-color: var(--instagram-bg);
  color: var(--instagram-text);
  line-height: 1.5;
}

header {
  position: fixed;
  top: 0;
  left: 0;
  right: 0;
  height: var(--header-height);
  background-color: white;
  border-bottom: 1px solid var(--instagram-border);
  display: flex;
  align-items: center;
  justify-content: center;
  padding: 0 20px;
  z-index: 100;
}

.header-content {
  max-width: 975px;
  width: 100%;
  display: flex;
  justify-content: space-between;
  align-items: center;
}

.logo {
  font-size: 24px;
  font-weight: bold;
  color: var(--instagram-text);
  text-decoration: none;
}

.date-range-header {
  color: #8e8e8e;
  font-size: 14px;
  margin-left: 15px;
}

main {
  max-width: 975px;
  margin: calc(var(--header-height) + 30px) auto 30px;
  padding: 0 20px;
}

.profile-info {
  display: flex;
  align-items: center;
  margin-bottom: 30px;
}

.profile-picture {
  width: 150px;
  height: 150px;
  border-radius: 50%;
  object-fit: cover;
  margin-right: 30px;
  background-color: #eee;
  display: flex;
  align-items: center;
  justify-content: center;
  font-size: 36px;
  color: #aaa;
}

.profile-details h1 {
  font-size: 28px;
  font-weight: 300;
  margin-bottom: 15px;
}

.stats {
  display: flex;
  margin-bottom: 15px;
  font-size: 16px;
}

.stat {
  margin-right: 40px;
}

.stat-count {
  font-weight: 600;
}

.posts-grid {
  display: grid;
  grid-template-columns: repeat(3, 1fr);
  gap: 28px;
}

.grid-item {
  position: relative;
  aspect-ratio: 1/1;
  cursor: pointer;
  overflow: hidden;
}

.grid-item img,
.grid-item video {
  width: 100%;
  height: 100%;
  object-fit: cover;
  transition: transform 0.3s ease;
  aspect-ratio: 1/1;
}

.grid-item:hover img,
.grid-item:hover video {
  transform: scale(1.05);
}

.multi-indicator {
  position: absolute;
  top: 10px;
  right: 10px;
  color: white;
  background-color: rgba(0, 0, 0, 0.6);
  padding: 3px 8px;
  border-radius: 4px;
  font-size: 12px;
  z-index: 2;
}

.video-indicator {
  position: absolute;
  top: 10px;
  right: 10px;
  color: white;
  background-color: rgba(0, 0, 0, 0.7);
  padding: 4px 10px;
  border-radius: 4px;
  font-size: 12px;
  font-weight: bold;
  z-index: 2;
  box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3);
}

.post-modal {
  display: none;
  position: fixed;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  background-color: rgba(0, 0, 0, 0.9);
  z-index: 1000;
  overflow-y: auto;
}

.post-modal-content {
  display: flex;
  max-width: 1200px;
  margin: 30px auto;
  background-color: white;
  height: calc(100vh - 60px);
  max-height: 800px;
  border-radius: 4px;
  overflow: hidden;
  position: relative;
}

.post-media {
  flex: 1;
  background-color: black;
  position: relative;
  min-width: 0;
  display: flex;
  align-items: center;
  justify-content: center;
}

.post-media img,
.post-media video {
  max-width: 100%;
  max-height: 100%;
  object-fit: contain;
}

.post-info {
  width: 340px;
  border-left: 1px solid var(--instagram-border);
  display: flex;
  flex-direction: column;
}

.post-header {
  padding: 16px;
  border-bottom: 1px solid var(--instagram-border);
  display: flex;
  align-items: center;
}

.post-user {
  width: 32px;
  height: 32px;
  border-radius: 50%;
  margin-right: 12px;
  display: flex;
  align-items: center;
  justify-content: center;
  background-color: #eee;
  font-size: 14px;
  color: #aaa;
}

.post-username {
  font-weight: 600;
  flex-grow: 1;
}

.share-button {
  cursor: pointer;
  padding: 5px;
  border-radius: 50%;
  display: flex;
  align-items: center;
  justify-content: center;
  transition: background-color 0.2s;
}

.share-button:hover {
  background-color: rgba(0, 0, 0, 0.1);
}

.share-button svg {
  width: 18px;
  height: 18px;
  color: #8e8e8e;
}

.post-caption {
  padding: 16px;
  flex-grow: 1;
  overflow-y: auto;
}

.post-date {
  padding: 16px;
  color: #8e8e8e;
  font-size: 12px;
  border-top: 1px solid var(--instagram-border);
}

.post-stats {
  padding: 12px 16px;
  color: var(--instagram-text);
  font-size: 14px;
  border-top: 1px solid var(--instagram-border);
  display: flex;
  gap: 16px;
}

.post-stat {
  display: flex;
  align-items: center;
  gap: 6px;
}

.post-stat-icon {
  font-size: 16px;
}

.likes-indicator {
  position: absolute;
  bottom: 10px;
  left: 10px;
  color: white;
  background-color: rgba(0, 0, 0, 0.7);
  padding: 4px 10px;
  border-radius: 4px;
  font-size: 12px;
  font-weight: bold;
  z-index: 2;
  box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3);
}

.close-modal {
  position: absolute;
  top: 20px;
  right: 20px;
  color: white;
  font-size: 30px;
  cursor: pointer;
  z-index: 1001;
}

.modal-nav {
  position: absolute;
  top: 50%;
  transform: translateY(-50%);
  color: white;
  font-size: 30px;
  cursor: pointer;
  z-index: 1001;
  background-color: rgba(0, 0, 0, 0.5);
  width: 40px;
  height: 40px;
  border-radius: 50%;
  display: flex;
  align-items: center;
  justify-content: center;
}

.modal-prev {
  left: 20px;
}

.modal-next {
  right: 20px;
}

.slideshow-nav {
  position: absolute;
  top: 50%;
  transform: translateY(-50%);
  color: white;
  font-size: 30px;
  cursor: pointer;
  z-index: 5;
  background-color: rgba(0, 0, 0, 0.7);
  width: 40px;
  height: 40px;
  border-radius: 50%;
  display: flex;
  align-items: center;
  justify-content: center;
  transition: background-color 0.2s;
}

.slideshow-nav:hover {
  background-color: rgba(0, 0, 0, 0.9);
}

.slideshow-prev {
  left: 10px;
}

.slideshow-next {
  right: 10px;
}

.slideshow-indicator {
  position: absolute;
  bottom: 20px;
  left: 0;
  right: 0;
  display: flex;
  justify-content: center;
  z-index: 5;
  background-color: rgba(0, 0, 0, 0.3);
  padding: 8px 0;
  border-radius: 20px;
  width: auto;
  max-width: 80%;
  margin: 0 auto;
}

.slideshow-dot {
  width: 8px;
  height: 8px;
  border-radius: 50%;
  background-color: rgba(255, 255, 255, 0.5);
  margin: 0 4px;
  cursor: pointer;
  transition: background-color 0.2s;
}

.slideshow-dot:hover {
  background-color: rgba(255, 255, 255, 0.8);
}

.slideshow-dot.active {
  background-color: white;
}

.media-container {
  position: relative;
  width: 100%;
  height: 100%;
  display: flex;
  align-items: center;
  justify-content: center;
}

.media-slide {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  display: flex;
  align-items: center;
  justify-content: center;
  opacity: 0;
  transition: opacity 0.3s ease;
  pointer-events: none;
}

.media-slide img,
.media-slide video {
  max-width: 100%;
  max-height: 100%;
  object-fit: contain;
}

.media-slide.active {
  opacity: 1;
  z-index: 2;
  pointer-events: auto;
}

.file-input-container {
  margin-bottom: 20px;
  padding: 20px;
  background-color: white;
  border: 1px solid var(--instagram-border);
  border-radius: 4px;
}

.loading {
  text-align: center;
  padding: 40px;
  font-size: 18px;
}

.sort-options {
  display: flex;
  align-items: center;
  justify-content: center;
  padding: 10px 20px;
  margin-bottom: 20px;
}

.sort-row {
  display: flex;
  align-items: center;
  justify-content: center;
  flex-wrap: wrap;
  margin: 5px 0;
  width: 100%;
  max-width: 600px;
}

.sort-link {
  margin: 0 10px;
  color: var(--instagram-text);
  text-decoration: none;
  padding: 5px 0;
  position: relative;
  transition: color 0.2s;
}

.sort-link:hover {
  color: var(--instagram-link);
}

.sort-link.active {
  color: var(--instagram-link);
  font-weight: 600;
}

.sort-link.active::after {
  content: '';
  position: absolute;
  bottom: 0;
  left: 0;
  width: 100%;
  height: 2px;
  background-color: var(--instagram-link);
}

@media (max-width: 768px) {
  .posts-grid {
    grid-template-columns: repeat(2, 1fr);
    gap: 4px;
  }

  .post-modal-content {
    flex-direction: column;
    height: auto;
    max-height: none;
    margin: 30px auto 0;
    border-radius: 0;
    width: 100%;
  }

  .post-media {
    height: 50vh;
    width: 100%;
    min-height: 300px;
    position: relative;
  }

  .post-info {
    width: 100%;
    border-left: none;
    border-top: 1px solid var(--instagram-border);
  }

  .profile-picture {
    width: 80px;
    height: 80px;
    margin-right: 15px;
  }

  .stat {
    margin-right: 20px;
  }
  
  .post-modal {
    overflow-y: auto;
    padding-top: 0;
  }
  
  .media-container {
    position: relative;
    width: 100%;
    height: 100%;
  }
  
  .media-slide {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
    display: flex;
    align-items: center;
    justify-content: center;
  }
  
  .media-slide img,
  .media-slide video {
    max-width: 100%;
    max-height: 100%;
    object-fit: contain;
  }
}

@media (max-width: 480px) {
  .posts-grid {
    grid-template-columns: repeat(3, 1fr);
    gap: 3px;
  }

  .profile-info {
    flex-direction: column;
    text-align: center;
  }

  .profile-picture {
    margin-right: 0;
    margin-bottom: 15px;
  }

  .stats {
    justify-content: center;
  }
  
  .header-content {
    flex-direction: column;
    align-items: center;
    padding: 5px 0;
  }
  
  .date-range-header {
    margin-left: 0;
    margin-top: 2px;
    font-size: 12px;
  }
  
  .sort-options {
    padding: 5px;
  }
  
  .sort-row {
    width: 100%;
    flex-wrap: wrap;
    justify-content: center;
  }
  
  .sort-link {
    margin: 5px;
    font-size: 13px;
    padding: 5px 0;
    flex: 0 0 auto;
  }
}

/* Mobile-specific fixes */
@media (max-width: 768px) {
  /* Ensure the modal takes up the full screen */
  .post-modal {
    padding: 0;
    overflow-y: auto;
  }
  
  /* Make modal content take full width */
  .post-modal-content {
    flex-direction: column;
    height: auto;
    margin: 0;
    width: 100%;
    max-width: 100%;
  }
  
  /* Explicitly set post-media height */
  .post-media {
    height: 50vh !important; /* Important to override any inline styles */
    min-height: 300px !important;
    width: 100%;
    flex: 0 0 auto; /* Don't grow or shrink */
  }
  
  /* Ensure media container fills the available space */
  .media-container {
    position: relative;
    width: 100%;
    height: 100% !important;
    display: flex !important;
    align-items: center;
    justify-content: center;
  }
  
  /* Fix media slides */
  .media-slide {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
    display: flex !important;
    align-items: center;
    justify-content: center;
  }
  
  /* Ensure images don't exceed container */
  .media-slide img,
  .media-slide video {
    max-width: 100%;
    max-height: 100%;
    width: auto;
    height: auto;
    object-fit: contain;
  }
  
  /* Make post info section scroll independently if needed */
  .post-info {
    flex: 1 1 auto;
    overflow-y: auto;
    max-height: 50vh;
  }
}


================================================
FILE: docker-compose.yml
================================================
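# Builds the image from the local Dockerfile, mounts the current directory as the
# archive search path and ./output for the generated site, and passes those paths
# to the CLI. A typical invocation (assumption): `docker compose up --build`.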
services:
  memento-mori:
    build: .
    volumes:
      - ./:/app/workspace
      - ./output:/output
    environment:
      - PYTHONUNBUFFERED=1
    command: --search-dir /app/workspace --output /output

================================================
FILE: memento_mori/__init__.py
================================================
# __init__.py
"""
Memento Mori - Instagram Archive Viewer

A tool that converts your Instagram data export into a beautiful, standalone viewer that
resembles the Instagram interface. The name "Memento Mori" (Latin for "remember that
you will die") reflects the ephemeral nature of our digital content.
"""

__version__ = "0.1.0"

# Import main classes for easier access
from .extractor import InstagramArchiveExtractor
from .file_mapper import InstagramFileMapper
from .loader import InstagramDataLoader
from .media import InstagramMediaProcessor
from .generator import InstagramSiteGenerator

# Define what's available when using `from memento_mori import *`
__all__ = [
    "InstagramArchiveExtractor",
    "InstagramFileMapper",
    "InstagramDataLoader",
    "InstagramMediaProcessor",
    "InstagramSiteGenerator",
]
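
# Illustrative usage sketch (mirrors the pipeline driven by cli.py; the input/output
# paths below are hypothetical):
#
#     from memento_mori import (
#         InstagramArchiveExtractor,
#         InstagramDataLoader,
#         InstagramMediaProcessor,
#         InstagramSiteGenerator,
#     )
#
#     extractor = InstagramArchiveExtractor(input_path="instagram-export.zip")
#     extraction_dir = extractor.extract()
#     loader = InstagramDataLoader(extraction_dir, extractor.file_mapper)
#     data = loader.load_all_data()
#     media = InstagramMediaProcessor(extraction_dir, "./output")
#     result = media.process_media_files(
#         data["posts"], data["profile"]["profile_picture"], data.get("stories", {})
#     )
#     data["posts"] = result["updated_post_data"]
#     data["profile"]["profile_picture"] = result["shortened_profile"]
#     InstagramSiteGenerator(data, "./output").generate()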


================================================
FILE: memento_mori/cli.py
================================================
# memento_mori/cli.py

import os
import argparse
import multiprocessing
from pathlib import Path
import traceback
import sys

from memento_mori.extractor import InstagramArchiveExtractor
from memento_mori.loader import InstagramDataLoader
from memento_mori.media import InstagramMediaProcessor
from memento_mori.generator import InstagramSiteGenerator


def main():
    """Main entry point for the Memento Mori CLI."""
    parser = argparse.ArgumentParser(
        description="Transform Instagram data export into a viewer."
    )

    parser.add_argument(
        "--input",
        type=str,
        help="Path to Instagram data (ZIP or folder). If not specified, auto-detection will be used.",
    )
    parser.add_argument(
        "--output",
        type=str,
        default="./output",
        help="Output directory for generated website [default: ./output]",
    )
    parser.add_argument(
        "--threads",
        type=int,
        default=0,
        help="Number of parallel processing threads [default: auto]",
    )
    parser.add_argument(
        "--search-dir",
        type=str,
        default=".",
        help="Directory to search for Instagram exports when auto-detecting [default: current directory]",
    )
    parser.add_argument(
        "--quality",
        type=int,
        default=70,
        help="WebP conversion quality (1-100) [default: 70]",
    )
    parser.add_argument(
        "--max-dimension",
        type=int,
        default=1920,
        help="Maximum dimension for images in pixels [default: 1920]",
    )
    parser.add_argument(
        "--thumbnail-size",
        type=str,
        default="292x292",
        help="Size of thumbnails [default: 292x292]",
    )
    parser.add_argument(
        "--no-auto-detect",
        action="store_true",
        help="Disable auto-detection (requires --input to be specified)",
    )
    parser.add_argument(
        "--gtag-id",
        type=str,
        help="Google Analytics tag ID (e.g., 'G-DX1ZWTC9NZ') to add tracking to the generated site",
    )
    parser.add_argument(
        "--verbose", "-v",
        action="store_true",
        help="Enable verbose output for debugging",
    )

    args = parser.parse_args()

    # Set defaults for threads if not specified
    if args.threads <= 0:
        args.threads = max(1, multiprocessing.cpu_count() - 1)

    # Parse thumbnail size
    try:
        if "x" in args.thumbnail_size:
            width, height = map(int, args.thumbnail_size.lower().split("x"))
            thumbnail_size = (width, height)
        else:
            size = int(args.thumbnail_size)
            thumbnail_size = (size, size)
    except ValueError:
        print(f"Invalid thumbnail size: {args.thumbnail_size}, using default 292x292")
        thumbnail_size = (292, 292)

    # Create output directory
    output_dir = Path(args.output)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Initialize extractor with input path if specified
    extractor = InstagramArchiveExtractor(input_path=args.input)

    # Handle input selection
    # If input is explicitly provided, use that
    if args.input:
        print(f"Using specified input: {args.input}")
    # If auto-detect is not disabled, try to find an export
    elif not args.no_auto_detect:
        print(f"Auto-detecting Instagram archive in {args.search_dir}...")
        detected_archive = extractor.auto_detect_archive(search_dir=args.search_dir)
        if not detected_archive:
            print(
                "No Instagram archive detected. Please specify an input file with --input."
            )
            return 1
        print(f"Detected archive: {detected_archive}")
    # If no input and auto-detect disabled, raise error
    else:
        print("Error: No input specified and auto-detection is disabled.")
        print("Please provide an input path with --input.")
        return 1

    try:
        # Extract archive
        print("\n📦 EXTRACTING ARCHIVE")
        print(f"   Source: {extractor.input_path}")
        extraction_dir = extractor.extract()
        print(f"   Extracted to: {extraction_dir}")

        # Get file mapper from extractor
        file_mapper = extractor.file_mapper

        # Initialize loader with the same file mapper
        print("\n📋 LOADING DATA")
        loader = InstagramDataLoader(extraction_dir, file_mapper, verbose=args.verbose)

        # Load and process data
        data = loader.load_all_data()
        
        if args.verbose:
            print("\n🔍 VERBOSE: Data Loading Details")
            print(f"   Profile data found: {'Yes' if loader.profile_data else 'No'}")
            print(f"   Location data found: {'Yes' if loader.location_data else 'No'}")
            print(f"   Posts data found: {'Yes' if loader.posts_data else 'No'}")
            print(f"   Insights data found: {'Yes' if loader.insights_data else 'No'}")
            print(f"   Combined data entries: {len(loader.combined_data) if loader.combined_data else 0}")
            
            # Show file paths that were found
            print("\n   File paths found:")
            for file_type, file_path in file_mapper.file_map.items():
                if isinstance(file_path, list):
                    print(f"      {file_type}: {len(file_path)} files")
                    if args.verbose:
                        for i, path in enumerate(file_path[:3]):  # Show first 3 only
                            print(f"         - {path}")
                        if len(file_path) > 3:
                            print(f"         - ... and {len(file_path)-3} more")
                else:
                    print(f"      {file_type}: {file_path}")
        
        print(f"   Found {data['post_count']} posts from {data['profile']['username']}")

        # Process media files
        print(f"\n🖼️  PROCESSING MEDIA")
        print(f"   Using {args.threads} threads, quality {args.quality}, max dimension {args.max_dimension}...")
        media_processor = InstagramMediaProcessor(
            extraction_dir, output_dir, thread_count=args.threads,
            quality=args.quality, max_dimension=args.max_dimension
        )
        media_result = media_processor.process_media_files(
            data["posts"], data["profile"]["profile_picture"], data.get("stories", {})
        )

        # Update data with shortened filenames
        data["posts"] = media_result["updated_post_data"]
        data["profile"]["profile_picture"] = media_result["shortened_profile"]
        
        # Update stories data if it exists
        if "stories" in data and media_result.get("updated_stories_data"):
            data["stories"] = media_result["updated_stories_data"]

        # Generate website with the loaded data
        print("\n🌐 GENERATING WEBSITE")
        generator = InstagramSiteGenerator(data, output_dir, gtag_id=args.gtag_id)
        success = generator.generate()

        if success:
            stats = media_result["stats"]
            print("\n✅ PROCESS COMPLETE")
            print(f"   Website generated at: {output_dir}")
            print(f"   Posts processed: {data['post_count']}")
            print(f"   Media files processed: {stats['thumbnail_count'] + stats['webp_count']}")
            print(f"   Space saved: {stats['space_saved_mb']:.2f} MB ({stats['percentage_saved']:.1f}%)")
            print(f"   Fixed file extensions: {stats['extension_fixes']}")
            return 0
        else:
            print("\n❌ ERROR: Failed to generate website.")
            return 1

    except Exception as e:
        print(f"\n❌ ERROR: {str(e)}")
        if args.verbose:
            print("\n🔍 VERBOSE: Exception traceback")
            traceback.print_exc(file=sys.stdout)
        return 1


if __name__ == "__main__":
    exit(main())
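
# Typical invocation (illustrative; flags correspond to the argparse definitions above):
#     python -m memento_mori.cli --input instagram-export.zip --output ./output --threads 4 --quality 70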


================================================
FILE: memento_mori/extractor.py
================================================
# memento_mori/extractor.py
import os
import zipfile
import tempfile
import shutil
from pathlib import Path
from .file_mapper import InstagramFileMapper


class InstagramArchiveExtractor:
    """
    Class for handling the extraction and validation of Instagram data archives.

    This class provides methods to:
    - Auto-detect Instagram archive files
    - Extract archives to temporary or specified locations
    - Validate the structure of extracted content
    - Clean up temporary files after processing
    """

    REQUIRED_FILES = ["profile", "posts"]

    def __init__(self, input_path=None, output_path=None, cleanup=True):
        """
        Initialize the extractor with paths and options.

        Args:
            input_path (str, optional): Path to the Instagram archive (ZIP or folder)
            output_path (str, optional): Path where extracted content should be placed
            cleanup (bool): Whether to clean up temporary files after extraction
        """
        self.input_path = input_path
        self.input_paths = [input_path] if input_path else []
        self.output_path = output_path
        self.cleanup = cleanup
        self.temp_dir = None
        self.extraction_dir = None
        self.file_mapper = None
        self.file_map = {}  # Maps required file types to their actual paths

    def auto_detect_archive(self, search_dir="."):
        """
        Auto-detect Instagram archive files in the specified directory.

        Args:
            search_dir (str): Directory to search for Instagram archives

        Returns:
            str: Path to the detected archive or None if not found
        """
        print(f"🔍 DETECTING INSTAGRAM ARCHIVE")
        print(f"   Searching in: {search_dir}")
        
        # Look for ZIP files that might be Instagram archives
        potential_archives = []

        for root, _, files in os.walk(search_dir):
            for file in files:
                if file.lower().endswith(".zip"):
                    zip_path = os.path.join(root, file)
                    # Check if this ZIP might be an Instagram archive
                    if self._is_instagram_archive(zip_path):
                        potential_archives.append(zip_path)

        if not potential_archives:
            print("   No Instagram archives found.")
            return None

        # Sort by modification time (oldest first, so newest archive is extracted last and wins on conflicts)
        potential_archives.sort(key=lambda x: os.path.getmtime(x))

        if len(potential_archives) > 1:
            print(f"   Found {len(potential_archives)} archives. All will be merged.")
            for archive in potential_archives:
                print(f"   - {os.path.basename(archive)}")

        self.input_path = potential_archives[0]
        self.input_paths = potential_archives
        print(f"   Selected: {os.path.basename(self.input_path)}")
        return self.input_path

    def _is_instagram_archive(self, zip_path):
        """
        Check if a ZIP file is likely an Instagram archive.
        """

        try:
            with zipfile.ZipFile(zip_path, "r") as zip_ref:
                namelist = zip_ref.namelist()

                # More flexible check - look for these directory names anywhere in the paths
                key_dirs = ["personal_information", "your_instagram_activity"]
                found_dirs = set()

                for name in namelist:
                    for dir_name in key_dirs:
                        if dir_name in name.lower():
                            found_dirs.add(dir_name)

                # If we found any of the key directories, it's probably an Instagram archive
                is_archive = len(found_dirs) > 0
                return is_archive

        except Exception as e:
            print(f"Error examining ZIP: {str(e)}")
            return False

    def extract(self):
        """
        Extract the Instagram archive to the specified location.

        Returns:
            str: Path to the extracted content

        Raises:
            ValueError: If no input path is specified or the file doesn't exist
            zipfile.BadZipFile: If the ZIP file is invalid
        """
        if not self.input_path:
            raise ValueError(
                "No input path specified. Use auto_detect_archive() or specify input_path."
            )

        if not os.path.exists(self.input_path):
            raise ValueError(f"Input path does not exist: {self.input_path}")

        # Determine if input is a ZIP file or a directory
        if os.path.isfile(self.input_path) and self.input_path.lower().endswith(".zip"):
            # Create a temporary directory if no output_path is specified
            if not self.output_path:
                self.temp_dir = tempfile.mkdtemp(prefix="instagram_export_")
                self.extraction_dir = self.temp_dir
            else:
                self.extraction_dir = self.output_path
                os.makedirs(self.extraction_dir, exist_ok=True)

            # Extract all detected ZIP files, merging their contents
            for zip_path in self.input_paths:
                print(f"Extracting {zip_path} to {self.extraction_dir}...")
                self._extract_and_merge(zip_path, self.extraction_dir)
        else:
            # Input is already a directory
            self.extraction_dir = self.input_path

        # After extraction, check if there's a single directory at the top level
        contents = os.listdir(self.extraction_dir)
        if len(contents) == 1 and os.path.isdir(
            os.path.join(self.extraction_dir, contents[0])
        ):
            # If so, use that as the actual extraction directory
            self.extraction_dir = os.path.join(self.extraction_dir, contents[0])
            print(
                f"Found single top-level directory, using it as extraction dir: {self.extraction_dir}"
            )

        # Now validate with the correct path
        if self.validate_structure():
            return self.extraction_dir
        else:
            raise ValueError(
                "Extracted content does not appear to be a valid Instagram archive."
            )

    def validate_structure(self):
        """
        Validate the structure of the extracted content.
        """
        if not self.extraction_dir or not os.path.exists(self.extraction_dir):
            return False

        # Create file mapper
        self.file_mapper = InstagramFileMapper(self.extraction_dir)
        self.file_mapper.discover_all_files()

        # Validate required files
        valid, missing_files = self.file_mapper.validate_required_files(
            self.REQUIRED_FILES
        )

        if not valid:
            print(f"Missing required files: {', '.join(missing_files)}")
            return False

        # For backward compatibility, update self.file_map
        self.file_map = self.file_mapper.file_map
        return True

    def _map_important_files(self):
        """
        Find and map important files that might be in different locations.
        """
        for file_type, patterns in self.FILE_PATTERNS.items():
            # Handle both single string patterns and lists of patterns
            if isinstance(patterns, str):
                patterns = [patterns]

            all_matches = []
            for pattern in patterns:
                # Use Path.glob to find files matching each pattern
                matches = list(Path(self.extraction_dir).glob(pattern))
                all_matches.extend(matches)

            if all_matches:
                # Store the path to the first matching file
                self.file_map[file_type] = str(all_matches[0])

                # If multiple posts files are found, store them all
                if file_type == "posts" and len(all_matches) > 1:
                    self.file_map[f"{file_type}_all"] = [
                        str(match) for match in all_matches
                    ]

    def _extract_and_merge(self, zip_path, target_dir):
        """
        Extract a ZIP file into target_dir, handling the case where the ZIP
        contains a single top-level directory by merging its contents directly.
        """
        staging_dir = tempfile.mkdtemp(prefix="instagram_staging_")
        try:
            with zipfile.ZipFile(zip_path, "r") as zip_ref:
                zip_ref.extractall(staging_dir)

            # If the ZIP had a single top-level directory, use its contents directly
            contents = os.listdir(staging_dir)
            if len(contents) == 1 and os.path.isdir(os.path.join(staging_dir, contents[0])):
                source = os.path.join(staging_dir, contents[0])
            else:
                source = staging_dir

            self._merge_dirs(source, target_dir)
        finally:
            shutil.rmtree(staging_dir, ignore_errors=True)

    def _merge_dirs(self, src, dst):
        """Recursively merge src directory into dst directory."""
        for item in os.listdir(src):
            s = os.path.join(src, item)
            d = os.path.join(dst, item)
            if os.path.isdir(s):
                if os.path.exists(d):
                    self._merge_dirs(s, d)
                else:
                    shutil.copytree(s, d)
            else:
                shutil.copy2(s, d)

    def get_file_path(self, file_type):
        """
        Get the path to an important file.

        Args:
            file_type (str): Type of file to get (e.g., "posts", "insights")

        Returns:
            str: Path to the file or None if not found
        """
        return self.file_map.get(file_type)

    def cleanup_temp_files(self):
        """
        Clean up temporary files created during extraction.
        """
        if self.cleanup and self.temp_dir and os.path.exists(self.temp_dir):
            print(f"Cleaning up temporary directory: {self.temp_dir}")
            shutil.rmtree(self.temp_dir)
            self.temp_dir = None

    def __del__(self):
        """
        Ensure cleanup of temporary files when the object is destroyed.
        """
        self.cleanup_temp_files()
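
# Illustrative usage sketch (mirrors how cli.py drives the extractor):
#     extractor = InstagramArchiveExtractor()
#     if extractor.auto_detect_archive(search_dir="."):
#         extraction_dir = extractor.extract()   # raises ValueError if the archive is invalid
#         mapper = extractor.file_mapper          # InstagramFileMapper with discovered paths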


================================================
FILE: memento_mori/file_mapper.py
================================================
# memento_mori/file_mapper.py
from pathlib import Path
import os


class InstagramFileMapper:
    """
    Central class for discovering and mapping Instagram export files.
    Used by both Extractor and Loader to maintain consistency.
    """

    # Define all patterns in one central location
    FILE_PATTERNS = {
        "posts": ["**/content/posts*.json", "**/media/posts*.json"],
        "insights": ["**/past_instagram_insights/posts.json"],
		"profile": [
			"**/personal_information/personal_information/personal_information.json",  # Double-nested (newer exports)
			"**/personal_information/personal_information.json",
			"**/account_information/personal_information.json",
			"**/personal_information.json",
			"**/*/personal_information.json"
		],
		"location": [
			"**/personal_information/information_about_you/profile_based_in.json",  # Newer exports
			"**/information_about_you/profile_based_in.json",
			"**/profile_based_in.json",
			"**/*/profile_based_in.json",
			"**/account_information/profile_based_in.json",
			"**/personal_information/profile_based_in.json"
		],
        "followers": [
            "**/connections/followers_and_following/followers*.json",
            "**/followers_and_following/followers*.json",
            "**/followers*.json",
            # Search in any subdirectory
            "**/*/followers*.json"
        ],
        "stories": [
            "**/content/stories*.json",
            "**/media/stories*.json",
            "**/your_instagram_activity/stories*.json",
            "**/stories*.json",
            "**/your_instagram_activity/stories/stories*.json",
            "**/your_instagram_activity/content/stories*.json",
            # Search in any subdirectory
            "**/*/stories*.json"
        ],
        # Add more patterns as needed
    }

    def __init__(self, base_dir):
        self.base_dir = Path(base_dir)
        self.file_map = {}

    def discover_all_files(self):
        """
        Discover all files defined in FILE_PATTERNS.
        """
        for file_type, patterns in self.FILE_PATTERNS.items():
            self.discover_files(file_type, patterns)
        return self.file_map

    def discover_files(self, file_type, patterns=None):
        """
        Discover files of a specific type.
        """
        if patterns is None:
            patterns = self.FILE_PATTERNS.get(file_type, [])

        # Handle both single string patterns and lists of patterns
        if isinstance(patterns, str):
            patterns = [patterns]

        all_matches = []
        for pattern in patterns:
            # First try exact path if it looks like one
            if not pattern.startswith("**"):
                exact_path = os.path.join(self.base_dir, pattern)
                if os.path.exists(exact_path):
                    all_matches.append(Path(exact_path))
                    continue

            # Otherwise use Path.glob to find files matching pattern
            matches = list(self.base_dir.glob(pattern))
            all_matches.extend(matches)

        if all_matches:
            # Store the path to the first matching file
            self.file_map[file_type] = str(all_matches[0])

            # If multiple matches are found, store them all
            if len(all_matches) > 1:
                self.file_map[f"{file_type}_all"] = [
                    str(match) for match in all_matches
                ]

        return self.file_map.get(file_type)

    def get_file_path(self, file_type):
        """
        Get the path to a specific file type.
        """
        if file_type not in self.file_map and file_type in self.FILE_PATTERNS:
            # Try to discover it if not already in the map
            self.discover_files(file_type)

        return self.file_map.get(file_type)

    def validate_required_files(self, required_files):
        """
        Validate that all required files exist.
        """
        missing_files = []
        for file_type in required_files:
            if not self.get_file_path(file_type):
                missing_files.append(file_type)

        return len(missing_files) == 0, missing_files
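
# Illustrative usage sketch (the directory path is hypothetical):
#     mapper = InstagramFileMapper("/tmp/instagram_export_abc123")
#     mapper.discover_all_files()
#     posts_path = mapper.get_file_path("posts")
#     ok, missing = mapper.validate_required_files(["profile", "posts"])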


================================================
FILE: memento_mori/generator.py
================================================
# memento_mori/generator.py
import os
import json
import shutil
import datetime
from pathlib import Path
from jinja2 import Environment, FileSystemLoader
from markupsafe import Markup
import re
import hashlib
import base64


class InstagramSiteGenerator:
    """
    Class for generating the static website from processed Instagram data.

    This class handles:
    - Creating HTML using templates
    - Copying static assets (CSS, JS)
    - Verifying the completeness of the output
    """

    def __init__(self, data_package, output_dir, template_dir=None, static_dir=None, gtag_id=None):
        """Initialize the generator with data and path options."""
        self.data_package = data_package
        self.output_dir = Path(output_dir)
        self.gtag_id = gtag_id  # Store the Google tag ID

        # Find template directory
        if template_dir is None:
            # Try to find templates relative to this file or common locations
            module_dir = Path(__file__).parent
            template_dir = module_dir / "templates"

            if not template_dir.exists():
                for path in [
                    Path("templates"),
                    Path("./templates"),
                    Path("../templates"),
                ]:
                    if path.exists():
                        template_dir = path
                        break

        # Find static directory
        if static_dir is None:
            module_dir = Path(__file__).parent
            static_dir = module_dir / "static"

            if not static_dir.exists():
                for path in [Path("static"), Path("./static"), Path("../static")]:
                    if path.exists():
                        static_dir = path
                        break

        self.template_dir = Path(template_dir)
        self.static_dir = Path(static_dir)

        print(f"Using template directory: {self.template_dir}")
        print(f"Using static directory: {self.static_dir}")

        # Set up Jinja environment
        self.jinja_env = Environment(
            loader=FileSystemLoader(str(self.template_dir)), autoescape=True
        )

    def generate(self):
        """Generate the complete static website and verify output."""
        try:
            # Create output directory
            self.output_dir.mkdir(parents=True, exist_ok=True)

            # Create CSS and JS directories in output
            (self.output_dir / "css").mkdir(exist_ok=True)
            (self.output_dir / "js").mkdir(exist_ok=True)

            # Copy static assets
            self._copy_static_assets()

            # Generate HTML
            self._generate_html()
            
            # Generate stories HTML if we have stories data
            if "stories" in self.data_package and self.data_package["stories"]:
                self._generate_stories_html()

            print(f"Website successfully generated at {self.output_dir}")
            return True

        except Exception as e:
            print(f"Error generating website: {str(e)}")
            return False

    def _copy_static_assets(self):
        """Copy CSS and JS files to the output directory."""
        # Copy CSS
        css_dir = self.static_dir / "css"
        if css_dir.exists():
            for css_file in css_dir.glob("*.css"):
                shutil.copy2(css_file, self.output_dir / "css" / css_file.name)
                print(f"Copied CSS: {css_file.name}")

        # Copy JS
        js_dir = self.static_dir / "js"
        if js_dir.exists():
            for js_file in js_dir.glob("*.js"):
                shutil.copy2(js_file, self.output_dir / "js" / js_file.name)
                print(f"Copied JS: {js_file.name}")
            
            # Ensure stories.js exists, create it if not
            stories_js = js_dir / "stories.js"
            if not stories_js.exists():
                # Create a minimal stories.js file if it doesn't exist
                with open(stories_js, "w") as f:
                    f.write("// Stories viewer functionality\n")
                print(f"Created placeholder: stories.js")
            
            # Copy stories.js to output
            shutil.copy2(stories_js, self.output_dir / "js" / "stories.js")
            print(f"Copied JS: stories.js")

    def _generate_html(self):
        """Generate HTML using templates."""
        # Generate the grid HTML
        grid_html = self._render_grid()

        # Extract data for the main template
        profile_info = self.data_package["profile"]
        location_info = self.data_package.get("location", {"location": "Unknown"})
        date_range = self.data_package["date_range"]["range"]
        post_count = self.data_package["post_count"]
        story_count = self.data_package.get("story_count", 0)
        
        # Get profile picture path and check for WebP version
        profile_picture = profile_info["profile_picture"]
        
        # Check if we have a WebP version of the profile picture
        if profile_picture:
            webp_path = re.sub(r"\.(jpg|jpeg|png|gif)$", ".webp", profile_picture, flags=re.I)
            if os.path.exists(os.path.join(self.output_dir, webp_path)):
                profile_picture = webp_path

        # Current date for footer
        generation_date = datetime.datetime.now().strftime("%Y-%m-%d")

        # Get stories data or empty dict if not available
        stories_data = self.data_package.get("stories", {})

        # Render the main template
        template = self.jinja_env.get_template("index.html")
        html_content = template.render(
            username=profile_info["username"],
            profile_picture=profile_picture,
            bio=profile_info.get("bio", ""),  # Pass bio to template
            profile=profile_info,  # Pass the entire profile object
            date_range=date_range,
            post_count=post_count,
            story_count=story_count,
            has_stories=story_count > 0,  # Flag to show stories link
            grid_html=grid_html,
            post_data_json=json.dumps(self.data_package["posts"], ensure_ascii=False),
            stories_data_json=json.dumps(stories_data, ensure_ascii=False),  # Add stories data
            generation_date=generation_date,
            gtag_id=self.gtag_id,  # Add Google tag ID
        )

        # Write HTML file
        with open(self.output_dir / "index.html", "w", encoding="utf-8") as f:
            f.write(html_content)

        print(f"Generated HTML file: {self.output_dir / 'index.html'}")

    def _render_grid(self):
        """Render the grid HTML using the grid.html template."""
        posts_data = self.data_package["posts"]
        lazy_after = 30  # Start lazy loading after this many posts

        # Check if posts_data is valid
        if not posts_data or not isinstance(posts_data, dict):
            print("Warning: No valid posts data found for grid rendering")
            return ""

        # Prepare data for the grid template
        grid_posts = []
        for i, (timestamp, post) in enumerate(posts_data.items()):
            # Determine which media to use for the grid thumbnail
            display_media = self._get_display_media(post, i >= lazy_after)

            grid_posts.append(
                {
                    "index": post["i"],
                    "display_media": display_media["url"],
                    "is_video": display_media["is_video"],
                    "media_count": len(post["m"]),
                    "likes": post.get("l", ""),
                    "lazy_load": Markup(' loading="lazy"') if i >= lazy_after else "",
                }
            )

        # Render grid template
        grid_template = self.jinja_env.get_template("grid.html")
        return grid_template.render(posts=grid_posts)

    def _get_display_media(self, post, use_lazy_loading=False):
        """Determine which media to use for the grid thumbnail."""
        result = {"url": "", "is_video": False}

        if not post["m"] or len(post["m"]) == 0:
            return result

        first_media = post["m"][0]
        result["url"] = first_media

        # Check if first media is a video
        result["is_video"] = bool(
            re.search(r"\.(mp4|mov|avi|webm)$", first_media, re.I)
            if first_media
            else False
        )

        # Check if we have a thumbnail for this media
        if first_media:
            thumb_filename = hashlib.md5(first_media.encode()).hexdigest() + ".webp"
            thumb_path = f"thumbnails/{thumb_filename}"

            if os.path.exists(os.path.join(self.output_dir, thumb_path)):
                # Use the thumbnail instead of the original
                result["url"] = thumb_path
            elif not result["is_video"]:
                # Check if we have a WebP version of the original image
                webp_path = re.sub(
                    r"\.(jpg|jpeg|png|gif)$", ".webp", first_media, flags=re.I
                )
                if os.path.exists(os.path.join(self.output_dir, webp_path)):
                    result["url"] = webp_path

            # If it's a video, look for a thumbnail among all media items
            if (
                result["is_video"] and result["url"] == first_media
            ):  # No thumbnail found yet
                for media_item in post["m"]:
                    if re.search(r"\.(jpg|jpeg|png|webp|gif)$", media_item, re.I):
                        # Check if we have a thumbnail for this image
                        img_thumb_filename = (
                            hashlib.md5(media_item.encode()).hexdigest() + ".webp"
                        )
                        img_thumb_path = f"thumbnails/{img_thumb_filename}"

                        if os.path.exists(
                            os.path.join(self.output_dir, img_thumb_path)
                        ):
                            result["url"] = img_thumb_path
                            break
                        else:
                            result["url"] = media_item
                            break

                # If no thumbnail found, use a SVG placeholder
                if result["url"] == first_media:
                    # Create a simple SVG with a play button
                    svg = (
                        '<svg xmlns="http://www.w3.org/2000/svg" width="400" height="400" viewBox="0 0 400 400">'
                        '<rect width="400" height="400" fill="#333333"/>'
                        '<circle cx="200" cy="200" r="60" fill="#ffffff" fill-opacity="0.8"/>'
                        '<polygon points="180,160 180,240 240,200" fill="#333333"/>'
                        "</svg>"
                    )

                    # Encode the SVG properly for use in an img src attribute
                    result["url"] = (
                        "data:image/svg+xml;base64,"
                        + base64.b64encode(svg.encode()).decode()
                    )

        return result
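
    # Thumbnail lookup convention used above (illustrative example path): a thumbnail is
    # expected to be named after the MD5 hash of the original media path plus ".webp",
    # e.g.
    #     "thumbnails/" + hashlib.md5("media/posts/202101/photo.jpg".encode()).hexdigest() + ".webp"
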
    def _generate_stories_html(self):
        """Generate a separate HTML file for stories."""
        stories_data = self.data_package.get("stories", {})
        
        if not stories_data:
            print("No stories data found, skipping stories.html generation")
            return
        
        # Extract data for the stories template
        profile_info = self.data_package["profile"]
        date_range = self.data_package["date_range"]["range"]
        story_count = len(stories_data)
        post_count = self.data_package["post_count"]
        
        # Get profile picture path and check for WebP version
        profile_picture = profile_info["profile_picture"]
        
        # Check if we have a WebP version of the profile picture
        if profile_picture:
            webp_path = re.sub(r"\.(jpg|jpeg|png|gif)$", ".webp", profile_picture, flags=re.I)
            if os.path.exists(os.path.join(self.output_dir, webp_path)):
                profile_picture = webp_path

        # Current date for footer
        generation_date = datetime.datetime.now().strftime("%Y-%m-%d")
        
        # Prepare stories data for the template
        stories_list = []
        lazy_after = 30  # Start lazy loading after this many stories
        
        for i, (timestamp, story) in enumerate(stories_data.items()):
            # Check for story-specific thumbnail
            story_thumb = story.get("story_thumb", None)
            
            if story_thumb and os.path.exists(os.path.join(self.output_dir, story_thumb)):
                # Use the 9:16 story thumbnail
                media_url = story_thumb
            else:
                # Fall back to regular thumbnail or original media
                display_media = self._get_display_media(story, i >= lazy_after)
                media_url = display_media["url"]
            
            # Determine if it's a video
            is_video = bool(re.search(r"\.(mp4|mov|avi|webm)$", story["m"][0], re.I)) if story["m"] else False
            
            stories_list.append({
                "index": story["i"],
                "media": media_url,
                "is_video": is_video,
                "date": story.get("d", ""),
                "caption": story.get("tt", ""),
                "timestamp": timestamp,
                "lazy_load": Markup(' loading="lazy"') if i >= lazy_after else "",
                "original_media": story["m"][0] if story["m"] else "",  # Include original media path
            })
        
        # Render the stories template
        template = self.jinja_env.get_template("stories_page.html")
        html_content = template.render(
            username=profile_info["username"],
            profile_picture=profile_picture,
            bio=profile_info.get("bio", ""),
            profile=profile_info,
            date_range=date_range,
            post_count=post_count,
            story_count=story_count,
            stories=stories_list,
            stories_data_json=json.dumps(stories_data, ensure_ascii=False),
            generation_date=generation_date,
            gtag_id=self.gtag_id,
        )
        
        # Write HTML file
        with open(self.output_dir / "stories.html", "w", encoding="utf-8") as f:
            f.write(html_content)
        
        print(f"Generated stories HTML file: {self.output_dir / 'stories.html'}")


================================================
FILE: memento_mori/loader.py
================================================
# memento_mori/loader.py
import json
import re
import os
from datetime import datetime
import html
from ftfy import fix_text
from pathlib import Path


def fix_double_encoded_utf8(text):
    """
    Fix double-encoded UTF-8 sequences in text using ftfy.
    This handles cases where UTF-8 characters (like emoji) were incorrectly encoded twice.
    """
    if not isinstance(text, str):
        return text
    
    # Use ftfy to fix the text encoding issues
    return fix_text(text)
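
# Illustrative example (the mojibake string follows ftfy's documented behavior; exact
# input/output are indicative only):
#     fix_double_encoded_utf8("The Mona Lisa doesnÃ¢â‚¬â„¢t have eyebrows.")
#     # -> "The Mona Lisa doesn't have eyebrows."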


class InstagramDataLoader:
    """
    Class for loading and processing Instagram data from the exported archive.

    This class provides methods to:
    - Load JSON files (posts, insights, user data)
    - Parse and merge data sources
    - Convert timestamps and format data
    - Provide a clean data structure for the generator
    """

    def __init__(self, extraction_dir, file_mapper=None, verbose=False):
        """
        Initialize the loader with the path to the extracted data.

        Args:
            extraction_dir (str): Path to the extracted Instagram data
            file_mapper (InstagramFileMapper, optional): File mapper from extractor
            verbose (bool): Whether to print verbose debug information
        """
        self.extraction_dir = extraction_dir
        self.file_mapper = file_mapper
        self.verbose = verbose

        # If no file mapper was provided, create one
        if self.file_mapper is None:
            from .file_mapper import InstagramFileMapper

            self.file_mapper = InstagramFileMapper(extraction_dir)
            self.file_mapper.discover_all_files()

        # Storage for loaded data
        self.profile_data = None
        self.location_data = None
        self.posts_data = None
        self.insights_data = None
        self.combined_data = None

    def load_profile_data(self):
        """
        Load user profile data.

        Returns:
            dict: User profile information
        """
        profile_path = self.file_mapper.get_file_path("profile")
        if not profile_path:
            print("Profile data not found")
            return {"username": "Unknown", "profile_picture": "", "bio": ""}

        try:
            with open(profile_path, "r", encoding="utf-8") as f:
                self.profile_data = json.load(f)

            string_map = self.profile_data["profile_user"][0]["string_map_data"]
            media_map = self.profile_data["profile_user"][0]["media_map_data"]

            profile_info = {
                "username": string_map["Username"]["value"],
                "profile_picture": "",
                "bio": "",
                "website": "",
                "name": "",
            }

            for key, value in media_map.items():
                if key.lower() == "profile photo":
                    profile_info["profile_picture"] = value.get("uri", "")
                    break

            if "Name" in string_map:
                profile_info["name"] = string_map["Name"]["value"]

            if "Bio" in string_map:
                profile_info["bio"] = string_map["Bio"]["value"]

            if "Website" in string_map:
                profile_info["website"] = string_map["Website"]["value"]

            return profile_info
        except Exception as e:
            print(f"Error loading profile data: {str(e)}")
            return {"username": "Unknown", "profile_picture": ""}

    def load_location_data(self):
        """
        Load user location data.

        Returns:
            dict: User location information
        """
        location_path = self.file_mapper.get_file_path("location")
        if not location_path:
            print("Location data not found")
            return {"location": "Unknown"}

        try:
            with open(location_path, "r", encoding="utf-8") as f:
                self.location_data = json.load(f)

            string_map = self.location_data["inferred_data_primary_location"][0]["string_map_data"]

            location_value = "Unknown"
            for key in ["Town/city name", "City Name", "Name"]:
                if key in string_map:
                    location_value = string_map[key]["value"]
                    break

            return {"location": location_value}

            return location_info
        except Exception as e:
            print(f"Error loading location data: {str(e)}")
            return {"location": "Unknown"}

    def load_posts_data(self):
        """
        Load posts data from one or more posts JSON files.

        Returns:
            list: Combined posts data from all posts files
        """
        all_posts = []

        # Check if we have multiple post files
        post_paths = []
        if self.file_mapper.file_map.get("posts_all"):
            post_paths = self.file_mapper.file_map["posts_all"]
        elif self.file_mapper.get_file_path("posts"):
            post_paths = [self.file_mapper.get_file_path("posts")]

        if not post_paths:
            print("No posts data found")
            return []

        if self.verbose:
            print(f"Found {len(post_paths)} posts data file(s):")
            for i, path in enumerate(post_paths):
                print(f"  {i+1}. {path}")

        for posts_path in post_paths:
            try:
                if self.verbose:
                    print(f"Loading posts from: {posts_path}")
                
                with open(posts_path, "r", encoding="utf-8") as f:
                    # Read the file content first
                    file_content = f.read()
                    
                    if self.verbose:
                        print(f"  File size: {len(file_content)} bytes")
                    
                    # Fix encoding issues with ftfy
                    file_content = fix_text(file_content)
                    
                    # Parse the modified content
                    posts_data = json.loads(file_content, strict=False)
                    
                    # Check if posts_data is a list (expected format)
                    if isinstance(posts_data, list):
                        if self.verbose:
                            print(f"  Found {len(posts_data)} posts in list format")
                        all_posts.extend(posts_data)
                    elif isinstance(posts_data, dict):
                        # Some exports might have posts as a dictionary
                        if self.verbose:
                            print(f"  Found posts in dictionary format")
                            print(f"  Dictionary keys: {', '.join(list(posts_data.keys())[:5])}...")
                        
                        # Try to extract a list from it
                        if "posts" in posts_data and isinstance(posts_data["posts"], list):
                            if self.verbose:
                                print(f"  Found {len(posts_data['posts'])} posts in 'posts' key")
                            all_posts.extend(posts_data["posts"])
                        else:
                            # Add the dict as a single item if we can't extract a list
                            if self.verbose:
                                print(f"  No 'posts' list found, adding dictionary as a single item")
                            all_posts.append(posts_data)
                    else:
                        print(f"Warning: Unexpected posts data format in {posts_path}")
                        if self.verbose:
                            print(f"  Data type: {type(posts_data)}")
            except Exception as e:
                print(f"Error loading posts data from {posts_path}: {str(e)}")
                if self.verbose:
                    import traceback
                    traceback.print_exc()

        if not all_posts:
            print("Warning: No posts data could be loaded from any file")
        elif self.verbose:
            print(f"Successfully loaded {len(all_posts)} posts in total")
            
        self.posts_data = all_posts
        return all_posts

    def load_insights_data(self):
        """
        Load insights data.

        Returns:
            dict: Insights data indexed by timestamp
        """
        insights_path = self.file_mapper.get_file_path("insights")
        if not insights_path:
            print(
                "Warning: No insights file found. Insights data will not be available."
            )
            # Initialize as empty dict, not None
            self.insights_data = {}
            return {}

        try:
            with open(insights_path, "r", encoding="utf-8") as f:
                file_content = f.read()
                # Fix encoding issues
                file_content = fix_text(file_content)
                insights_raw = json.loads(file_content, strict=False)

            # Index insights by timestamp
            insights_indexed = {}
            
            # Handle different possible structures
            if "organic_insights_posts" in insights_raw:
                for insight in insights_raw.get("organic_insights_posts", []):
                    timestamp = None
                    
                    # Try to get timestamp from media_map_data
                    if "media_map_data" in insight and "Media Thumbnail" in insight["media_map_data"]:
                        timestamp = insight["media_map_data"]["Media Thumbnail"].get("creation_timestamp")
                    
                    # If no timestamp yet, try other fields
                    if not timestamp and "creation_timestamp" in insight:
                        timestamp = insight["creation_timestamp"]
                    
                    if timestamp:
                        insights_indexed[str(timestamp)] = insight
            else:
                # Try alternative structure
                for insight in insights_raw:
                    if isinstance(insight, dict) and "creation_timestamp" in insight:
                        timestamp = insight["creation_timestamp"]
                        insights_indexed[str(timestamp)] = insight

            self.insights_data = insights_indexed
            return insights_indexed
        except Exception as e:
            print(f"Error loading insights data: {str(e)}")
            self.insights_data = {}
            return {}

    def combine_data(self):
        """
        Combine posts and insights data.

        Returns:
            list: Combined data with posts and their associated insights
        """
        if self.posts_data is None:
            if self.verbose:
                print("No posts data yet, loading posts data")
            self.load_posts_data()

        if self.insights_data is None:
            if self.verbose:
                print("No insights data yet, loading insights data")
            self.load_insights_data()

        # Ensure insights_data is a dictionary
        if not isinstance(self.insights_data, dict):
            if self.verbose:
                print("Warning: insights_data is not a dictionary, initializing as empty")
            self.insights_data = {}

        if self.verbose:
            print(f"Combining {len(self.posts_data) if self.posts_data else 0} posts with {len(self.insights_data)} insights entries")

        combined = []
        
        # Create a mapping of timestamps to insights for faster lookup
        insights_map = {}
        for timestamp, insight in self.insights_data.items():
            insights_map[str(timestamp)] = insight

        if not self.posts_data:
            if self.verbose:
                print("Warning: No posts data to combine")
            self.combined_data = []
            return []

        for post in self.posts_data:
            try:
                # Get the timestamp from the first media item
                timestamp = None
                if "media" in post and len(post["media"]) > 0 and "creation_timestamp" in post["media"][0]:
                    timestamp = str(post["media"][0]["creation_timestamp"])
                elif "creation_timestamp" in post:
                    timestamp = str(post["creation_timestamp"])
                
                # Find associated insights
                insight = insights_map.get(timestamp) if timestamp else None
                
                # Create combined entry
                combined.append({"post_data": post, "insights": insight})
                
                if self.verbose and not timestamp:
                    print(f"Warning: Post without timestamp")
                    print(f"  Post keys: {', '.join(list(post.keys())[:5])}...")
                    if "media" in post:
                        print(f"  Media items: {len(post['media'])}")
                        if len(post["media"]) > 0:
                            print(f"  First media keys: {', '.join(list(post['media'][0].keys())[:5])}...")
                
            except (IndexError, KeyError) as e:
                print(f"Error processing post: {str(e)}")
                if self.verbose:
                    import traceback
                    traceback.print_exc()
                    print(f"  Post keys: {', '.join(list(post.keys())[:5])}...")
                # Add post without insights
                combined.append({"post_data": post, "insights": None})

        if self.verbose:
            print(f"Created {len(combined)} combined entries")
            
        self.combined_data = combined
        return combined

    def extract_relevant_data(self):
        """
        Extract relevant data from the combined posts and insights data.

        Returns:
            dict: Simplified data structure with relevant information
        """
        if self.combined_data is None:
            if self.verbose:
                print("No combined data yet, calling combine_data()")
            self.combine_data()
            
        # Check if combined_data is still None or empty after trying to combine
        if not self.combined_data:
            print("Warning: No post data found or could not be processed.")
            if self.verbose:
                print("combined_data is None or empty after combine_data() call")
                print(f"posts_data: {type(self.posts_data)}, length: {len(self.posts_data) if self.posts_data else 0}")
                print(f"insights_data: {type(self.insights_data)}, length: {len(self.insights_data) if self.insights_data else 0}")
            return {}

        if self.verbose:
            print(f"Processing {len(self.combined_data)} combined data entries")

        simplified_data = {}

        for index, item in enumerate(self.combined_data):
            # Initialize a new post entry with shortened keys
            post_entry = {
                "i": index,  # post_index
                "m": [],     # media
                "t": "",     # creation_timestamp_unix
                "d": "",     # creation_timestamp_readable
                "tt": "",    # title
                "im": "",    # Impressions
                "l": "",     # Likes
                "c": "",     # Comments
            }

            # Extract post-level data
            post_title = ""
            if "post_data" in item:
                if "creation_timestamp" in item["post_data"]:
                    post_entry["t"] = item["post_data"]["creation_timestamp"]
                elif "media" in item["post_data"] and len(item["post_data"]["media"]) > 0 and "creation_timestamp" in item["post_data"]["media"][0]:
                    # Fallback to first media item timestamp if post timestamp not available
                    post_entry["t"] = item["post_data"]["media"][0]["creation_timestamp"]

                post_entry["d"] = datetime.utcfromtimestamp(
                    post_entry["t"]
                ).strftime("%B %d, %Y at %I:%M %p")

                # Get title from post data
                post_title = ""
                # Check for title directly in post_data
                if "title" in item["post_data"] and item["post_data"]["title"]:
                    post_title = item["post_data"]["title"]
                    if isinstance(post_title, str):
                        # Use ftfy to fix text encoding issues
                        post_title = fix_text(post_title)
                        # Then unescape HTML entities
                        post_title = html.unescape(post_title)
                
                # Check for title in media items
                if not post_title and "media" in item["post_data"]:
                    for media_item in item["post_data"]["media"]:
                        if "title" in media_item and media_item["title"]:
                            post_title = media_item["title"]
                            if isinstance(post_title, str):
                                post_title = fix_text(post_title)
                                post_title = html.unescape(post_title)
                            break  # Use the first media item with a title

                # Extract media URIs
                if "media" in item["post_data"]:
                    for media in item["post_data"]["media"]:
                        if "uri" in media:
                            post_entry["m"].append(media["uri"])
                        else:
                            if self.verbose:
                                print(f"Warning: Media item without URI at post index {index}")
                                print(f"  Media keys: {', '.join(list(media.keys())[:5])}...")
                            post_entry["m"].append("")

            # Get insights data if available
            insights_title = ""
            if "insights" in item and item["insights"]:
                insights = item["insights"]
                
                # Try to get caption from insights
                if "string_map_data" in insights:
                    insights_data = insights["string_map_data"]
                    
                    # Extract specific metrics and ensure they're integers or blank
                    if "Impressions" in insights_data:
                        impressions = insights_data["Impressions"].get("value", "")
                        # Validate and convert to integer if numeric, otherwise leave blank
                        post_entry["im"] = int(impressions) if impressions and impressions.isdigit() else ""

                    if "Likes" in insights_data:
                        likes = insights_data["Likes"].get("value", "")
                        # Validate and convert to integer if numeric, otherwise leave blank
                        post_entry["l"] = int(likes) if likes and likes.isdigit() else ""

                    if "Comments" in insights_data:
                        comments = insights_data["Comments"].get("value", "")
                        # Validate and convert to integer if numeric, otherwise leave blank
                        post_entry["c"] = int(comments) if comments and comments.isdigit() else ""
                    
                    # Try to get caption from insights
                    if "Caption" in insights_data and insights_data["Caption"].get("value"):
                        insights_title = insights_data["Caption"].get("value", "")
                        if isinstance(insights_title, str):
                            insights_title = fix_text(insights_title)
                            insights_title = html.unescape(insights_title)
                
                # Check for title directly in insights
                if not insights_title and "title" in insights and insights["title"]:
                    insights_title = insights["title"]
                    if isinstance(insights_title, str):
                        insights_title = fix_text(insights_title)
                        insights_title = html.unescape(insights_title)
                
                # Check for title in media_map_data
                if not insights_title and "media_map_data" in insights:
                    for media_key, media_data in insights["media_map_data"].items():
                        if "title" in media_data and media_data["title"]:
                            insights_title = media_data["title"]
                            if isinstance(insights_title, str):
                                insights_title = fix_text(insights_title)
                                insights_title = html.unescape(insights_title)
                            break  # Use the first media item with a title

            # Use the longer or non-empty title between post data and insights
            if post_title and insights_title:
                post_entry["tt"] = post_title if len(post_title) >= len(insights_title) else insights_title
            elif post_title:
                post_entry["tt"] = post_title
            elif insights_title:
                post_entry["tt"] = insights_title

            # Only add posts with valid timestamps
            if post_entry["t"]:
                simplified_data[post_entry["t"]] = post_entry
            elif self.verbose:
                print(f"Skipping post at index {index} due to missing timestamp")

        if self.verbose:
            print(f"Extracted {len(simplified_data)} posts with valid timestamps")
            
        # Sort by timestamp (newest first)
        sorted_data = dict(sorted(simplified_data.items(), key=lambda x: x[0], reverse=True))
        
        if self.verbose and sorted_data:
            print(f"Posts date range: {datetime.utcfromtimestamp(int(list(sorted_data.keys())[-1])).strftime('%Y-%m-%d')} to {datetime.utcfromtimestamp(int(list(sorted_data.keys())[0])).strftime('%Y-%m-%d')}")
            
        return sorted_data
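
    # --- Illustrative sketch (editor addition, not part of the original source) ---
    # Shape of one entry in the dict returned above, keyed by the post's unix
    # timestamp. Only keys touched in this method are shown; all values are invented:
    #
    #   sorted_data[1577836800] == {
    #       "t": 1577836800,                          # creation timestamp (unix)
    #       "m": ["media/posts/202001/photo.jpg"],    # media URIs
    #       "tt": "Caption text",                     # title/caption
    #       "im": 123, "l": 45, "c": 6,               # insights metrics, or "" when non-numeric
    #       # ...plus the keys initialized earlier in this method
    #   }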

    def load_followers_data(self):
        """
        Load followers data and count the number of followers.

        Returns:
            int: Number of followers
        """
        followers_path = self.file_mapper.get_file_path("followers")
        if not followers_path:
            if self.verbose:
                print("Followers data not found")
            return 0

        try:
            with open(followers_path, "r", encoding="utf-8") as f:
                file_content = f.read()
                # Fix encoding issues
                file_content = fix_text(file_content)
                followers_data = json.loads(file_content, strict=False)

            # Count the number of followers
            follower_count = len(followers_data)
            
            if self.verbose:
                print(f"Found {follower_count} followers")
                
            return follower_count
        except Exception as e:
            print(f"Error loading followers data: {str(e)}")
            return 0
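
    # --- Illustrative note (editor addition, not part of the original source) ---
    # len(followers_data) assumes the followers export parses to a top-level JSON
    # list with one element per follower, e.g.:
    #
    #   [ {...follower one...}, {...follower two...}, ... ]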
            
    def process_json_strings(self, data):
        """
        Recursively process all string values in JSON data to fix encoding issues.
        """
        if isinstance(data, dict):
            return {k: self.process_json_strings(v) for k, v in data.items()}
        elif isinstance(data, list):
            return [self.process_json_strings(item) for item in data]
        elif isinstance(data, str):
            # Apply all string fixes
            # Use ftfy to fix text encoding issues
            fixed = fix_text(data)
            # Still apply HTML unescaping after fixing encoding
            fixed = html.unescape(fixed)
            return fixed
        else:
            return data
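
    # --- Illustrative sketch (editor addition, not part of the original source) ---
    # Example of the two fixes applied to every string, however deeply nested:
    #
    #   self.process_json_strings({"c": ["cafÃ©", "fish &amp; chips"]})
    #   # -> {"c": ["café", "fish & chips"]}   (ftfy repairs mojibake, html.unescape decodes entities)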

    def load_stories_data(self):
        """
        Load stories data from stories JSON files.

        Returns:
            dict: Processed stories data
        """
        stories_path = self.file_mapper.get_file_path("stories")
        if not stories_path:
            if self.verbose:
                print("\n🔍 STORIES DATA SEARCH")
                print("   No stories file found in standard locations")
                print("   Checking all patterns for stories files:")
                for pattern in self.file_mapper.FILE_PATTERNS["stories"]:
                    print(f"   - Searching with pattern: {pattern}")
                    matches = list(Path(self.file_mapper.base_dir).glob(pattern))
                    if matches:
                        print(f"     Found {len(matches)} matches:")
                        for match in matches[:3]:  # Show first 3
                            print(f"     • {match}")
                        if len(matches) > 3:
                            print(f"     • ... and {len(matches)-3} more")
                    else:
                        print(f"     No matches found")
                
                # Try a more aggressive search
                print("\n   Performing deep search for any files containing 'stories':")
                for root, dirs, files in os.walk(self.file_mapper.base_dir):
                    for file in files:
                        if 'stories' in file.lower() and file.endswith('.json'):
                            print(f"     • Found potential stories file: {os.path.join(root, file)}")
            return {}

        try:
            if self.verbose:
                print(f"\n🔍 STORIES DATA LOADING")
                print(f"   Found stories file: {stories_path}")
                file_size = os.path.getsize(stories_path)
                print(f"   File size: {file_size} bytes")
            
            with open(stories_path, "r", encoding="utf-8") as f:
                file_content = f.read()
                # Fix encoding issues
                file_content = fix_text(file_content)
                
                if self.verbose:
                    print(f"   Parsing JSON content...")
                
                stories_data = json.loads(file_content, strict=False)
                
                if self.verbose:
                    print(f"   JSON parsed successfully")
                    if isinstance(stories_data, dict):
                        print(f"   Data structure: Dictionary with {len(stories_data)} keys")
                        print(f"   Top-level keys: {', '.join(list(stories_data.keys())[:5])}")
                    elif isinstance(stories_data, list):
                        print(f"   Data structure: List with {len(stories_data)} items")
                    else:
                        print(f"   Data structure: {type(stories_data)}")

            # Process stories data similar to posts
            simplified_stories = {}
            
            # Handle different possible structures
            stories_list = []
            
            if self.verbose:
                print(f"\n   Extracting stories list from data structure...")
            
            # Check for "ig_stories" key specifically
            if isinstance(stories_data, dict) and "ig_stories" in stories_data:
                stories_list = stories_data["ig_stories"]
                if self.verbose:
                    print(f"   Found stories in 'ig_stories' key: {len(stories_list)} items")
            # Also keep the existing checks for other formats
            elif isinstance(stories_data, list):
                stories_list = stories_data
                if self.verbose:
                    print(f"   Using top-level list with {len(stories_list)} items")
            elif isinstance(stories_data, dict):
                # Try different possible keys where stories might be stored
                possible_keys = ["stories", "story_activities", "story_media", "items"]
                for key in possible_keys:
                    if key in stories_data and isinstance(stories_data[key], list):
                        stories_list = stories_data[key]
                        if self.verbose:
                            print(f"   Found stories in '{key}' key: {len(stories_list)} items")
                        break
                
                if not stories_list and self.verbose:
                    print(f"   Could not find stories list in dictionary keys")
                    print(f"   Available keys: {', '.join(list(stories_data.keys()))}")
            
            if self.verbose:
                print(f"\n   Processing {len(stories_list)} stories...")
            
            for index, story in enumerate(stories_list):
                # Initialize a new story entry with shortened keys
                story_entry = {
                    "i": index,  # story_index
                    "m": [],     # media
                    "t": "",     # creation_timestamp_unix
                    "d": "",     # creation_timestamp_readable
                    "tt": "",    # title/caption
                }
                
                if self.verbose and index < 3:  # Only show details for first 3 stories
                    print(f"\n   Story #{index+1}:")
                    if isinstance(story, dict):
                        print(f"   Keys: {', '.join(list(story.keys())[:10])}")
                
                # Extract timestamp
                timestamp_found = False
                if isinstance(story, dict):
                    # Try different possible timestamp fields
                    timestamp_fields = ["creation_timestamp", "taken_at", "timestamp"]
                    for field in timestamp_fields:
                        if field in story and story[field]:
                            story_entry["t"] = int(story[field])
                            timestamp_found = True
                            if self.verbose and index < 3:
                                print(f"   Timestamp found in '{field}': {story_entry['t']}")
                            break
                    
                    # Try media items if no timestamp at story level
                    if not timestamp_found and "media" in story and isinstance(story["media"], list) and len(story["media"]) > 0:
                        for media_item in story["media"]:
                            if isinstance(media_item, dict):
                                for field in timestamp_fields:
                                    if field in media_item and media_item[field]:
                                        story_entry["t"] = int(media_item[field])
                                        timestamp_found = True
                                        if self.verbose and index < 3:
                                            print(f"   Timestamp found in media item '{field}': {story_entry['t']}")
                                        break
                                if timestamp_found:
                                    break
                
                # Format date if timestamp found
                if story_entry["t"]:
                    story_entry["d"] = datetime.utcfromtimestamp(
                        int(story_entry["t"])
                    ).strftime("%B %d, %Y at %I:%M %p")
                    if self.verbose and index < 3:
                        print(f"   Formatted date: {story_entry['d']}")
                
                # Extract caption/title
                caption_found = False
                if isinstance(story, dict):
                    # Try different possible caption fields
                    caption_fields = ["caption", "title", "text"]
                    for field in caption_fields:
                        if field in story and story[field]:
                            story_entry["tt"] = story[field]
                            caption_found = True
                            if self.verbose and index < 3:
                                print(f"   Caption found in '{field}': {story_entry['tt'][:30]}...")
                            break
                    
                    # Try string_map_data if no caption found directly
                    if not caption_found and "string_map_data" in story and isinstance(story["string_map_data"], dict):
                        string_map = story["string_map_data"]
                        caption_keys = ["Caption", "Text", "Story Text"]
                        for key in caption_keys:
                            if key in string_map and isinstance(string_map[key], dict) and "value" in string_map[key]:
                                story_entry["tt"] = string_map[key]["value"]
                                caption_found = True
                                if self.verbose and index < 3:
                                    print(f"   Caption found in string_map_data['{key}']: {story_entry['tt'][:30]}...")
                                break
                
                # Extract media URIs
                media_found = False
                if isinstance(story, dict):
                    # Try direct URI field
                    if "uri" in story and story["uri"]:
                        story_entry["m"].append(story["uri"])
                        media_found = True
                        if self.verbose and index < 3:
                            print(f"   Media found directly in 'uri': {story_entry['m'][0]}")
                    
                    # Try media list
                    if "media" in story and isinstance(story["media"], list):
                        for media_item in story["media"]:
                            if isinstance(media_item, dict) and "uri" in media_item and media_item["uri"]:
                                story_entry["m"].append(media_item["uri"])
                                media_found = True
                                if self.verbose and index < 3 and len(story_entry["m"]) <= 3:
                                    print(f"   Media found in media list: {media_item['uri']}")
                    
                    # Try media_map_data
                    if not media_found and "media_map_data" in story and isinstance(story["media_map_data"], dict):
                        for key, media_item in story["media_map_data"].items():
                            if isinstance(media_item, dict) and "uri" in media_item and media_item["uri"]:
                                story_entry["m"].append(media_item["uri"])
                                media_found = True
                                if self.verbose and index < 3 and len(story_entry["m"]) <= 3:
                                    print(f"   Media found in media_map_data['{key}']: {media_item['uri']}")
                
                # Only add stories with valid timestamps and media
                if story_entry["t"] and story_entry["m"]:
                    simplified_stories[str(story_entry["t"])] = story_entry
                    if self.verbose and index < 3:
                        print(f"   ✓ Story added with timestamp {story_entry['t']} and {len(story_entry['m'])} media items")
                elif self.verbose and index < 3:
                    if not story_entry["t"]:
                        print(f"   ✗ Story skipped: No timestamp found")
                    if not story_entry["m"]:
                        print(f"   ✗ Story skipped: No media found")
            
            if self.verbose:
                print(f"\n   Extracted {len(simplified_stories)} valid stories from {len(stories_list)} total")
            
            # Sort by timestamp (newest first)
            sorted_stories = dict(sorted(simplified_stories.items(), key=lambda x: int(x[0]), reverse=True))
            
            if self.verbose and sorted_stories:
                newest = datetime.utcfromtimestamp(int(list(sorted_stories.keys())[0])).strftime('%Y-%m-%d')
                oldest = datetime.utcfromtimestamp(int(list(sorted_stories.keys())[-1])).strftime('%Y-%m-%d')
                print(f"   Stories date range: {oldest} to {newest}")
            
            return sorted_stories
            
        except Exception as e:
            print(f"Error loading stories data: {str(e)}")
            if self.verbose:
                import traceback
                traceback.print_exc()
            return {}
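
    # --- Illustrative sketch (editor addition, not part of the original source) ---
    # Shape of one entry in the dict returned by load_stories_data(); unlike posts,
    # the keys are the string form of the timestamp. All values are invented:
    #
    #   sorted_stories["1577836800"] == {
    #       "i": 0,                                   # story index
    #       "m": ["media/stories/202001/clip.mp4"],   # media URIs
    #       "t": 1577836800,                          # creation timestamp (unix)
    #       "d": "January 01, 2020 at 12:00 AM",      # readable date
    #       "tt": "",                                 # caption, when one was found
    #   }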

    def load_all_data(self):
        """
        Load all data and return a comprehensive data package.

        Returns:
            dict: Data package containing all processed data
        """
        profile_info = self.load_profile_data()
        location_info = self.load_location_data()
        posts_data = self.extract_relevant_data()
        stories_data = self.load_stories_data()
        follower_count = self.load_followers_data()
        
        # Add follower count to profile info
        profile_info["follower_count"] = follower_count
        
        # Process all string values to fix encoding issues
        profile_info = self.process_json_strings(profile_info)
        location_info = self.process_json_strings(location_info)
        posts_data = self.process_json_strings(posts_data)
        stories_data = self.process_json_strings(stories_data)

        # Get date range for display
        if posts_data and isinstance(posts_data, dict) and len(posts_data) > 0:
            keys = list(posts_data.keys())
            first_key = keys[0]  # Newest post
            last_key = keys[-1]  # Oldest post

            # Format timestamps
            newest_post_date = datetime.utcfromtimestamp(int(first_key)).strftime(
                "%B %Y"
            )
            oldest_post_date = datetime.utcfromtimestamp(int(last_key)).strftime(
                "%B %Y"
            )

            date_range = {
                "newest": newest_post_date,
                "oldest": oldest_post_date,
                "range": f"{oldest_post_date} - {newest_post_date}",
            }
        else:
            date_range = {"newest": "Unknown", "oldest": "Unknown", "range": "Unknown"}
            # If no posts data, create an empty dict to avoid NoneType errors
            if not isinstance(posts_data, dict):
                posts_data = {}

        return {
            "profile": profile_info,
            "location": location_info,
            "posts": posts_data,
            "stories": stories_data,
            "date_range": date_range,
            "post_count": len(posts_data),
            "story_count": len(stories_data),
        }
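
# --- Illustrative usage sketch (editor addition, not part of the original file) ---
# Driving the loader by hand, assuming an already-extracted archive directory:
#
#   loader = InstagramDataLoader("extracted_archive/", verbose=True)
#   package = loader.load_all_data()
#   print(package["post_count"], "posts,", package["story_count"], "stories")
#   print("Date range:", package["date_range"]["range"])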


================================================
FILE: memento_mori/media.py
================================================
# memento_mori/media.py
import os
import shutil
import hashlib
import base64
import re
import mimetypes
import magic  # python-magic library
from pathlib import Path
from PIL import Image
from concurrent.futures import ThreadPoolExecutor
import multiprocessing
from tqdm import tqdm


class InstagramMediaProcessor:
    """
    Class for processing Instagram media files.

    This class handles:
    - Converting images to WebP format
    - Generating thumbnails for images and videos
    - Copying media files to the output directory
    """

    def __init__(self, extraction_dir, output_dir, thread_count=None, quality=70, max_dimension=1200):
        """Initialize the media processor with paths and options."""
        self.extraction_dir = Path(extraction_dir)
        self.output_dir = Path(output_dir)
        self.thread_count = thread_count or max(1, multiprocessing.cpu_count() - 1)
        self.quality = quality  # Store the quality setting
        self.max_dimension = max_dimension  # Maximum dimension for resizing

        # Create output directories
        self.media_dirs = [
            self.output_dir / "media",
            self.output_dir / "media" / "posts",
            self.output_dir / "media" / "other",
            self.output_dir / "thumbnails",
        ]

        for directory in self.media_dirs:
            directory.mkdir(parents=True, exist_ok=True)

        # Statistics
        self.thumbnail_count = 0
        self.webp_count = 0
        self.total_size_original = 0
        self.total_size_webp = 0

        # Initialize filename mapping
        self.filename_map = {}

        # Build a basename -> [Path, ...] index for fallback file lookup
        self.file_index = self._build_file_index()

    def _build_file_index(self):
        """
        Walk extraction_dir and build a basename -> [Path, ...] index.
        Warns about any filename collisions found.
        """
        index = {}
        for path in self.extraction_dir.rglob("*"):
            if not path.is_file():
                continue
            name = path.name
            if name not in index:
                index[name] = []
            index[name].append(path)

        collisions = {name: paths for name, paths in index.items() if len(paths) > 1}
        if collisions:
            print(f"\n⚠️  WARNING: {len(collisions)} duplicate filename(s) found across archive")
            print("   Fallback lookup will use the first match:")
            for name, paths in collisions.items():
                print(f"   {name} ({len(paths)} copies):")
                for p in paths:
                    print(f"      {p.relative_to(self.extraction_dir)}")

        return index
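
    # --- Illustrative sketch (editor addition, not part of the original source) ---
    # Shape of the basename index built above (absolute Paths under extraction_dir
    # in practice; shortened and invented here for illustration):
    #
    #   {
    #       "photo_1.jpg": [Path(".../media/posts/202001/photo_1.jpg")],
    #       "clip.mp4":    [Path(".../media/stories/202001/clip.mp4"),
    #                       Path(".../media/other/clip.mp4")],   # collision -> warning printed above
    #   }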

    def shorten_filename(self, original_path):
        """
        Create a shortened version of a filename while preserving extension.
        
        Args:
            original_path (str): Original file path
            
        Returns:
            str: Shortened file path
        """
        if not original_path or not isinstance(original_path, str):
            return original_path
            
        # Skip if it's already a data URI
        if original_path.startswith('data:'):
            return original_path
            
        # Check if we already have a mapping for this path
        if original_path in self.filename_map:
            return self.filename_map[original_path]
            
        # Parse the path
        path_obj = Path(original_path)
        parent_dir = path_obj.parent
        filename = path_obj.name
        extension = path_obj.suffix.lower()
        
        # Create a hash of the original filename
        filename_hash = hashlib.md5(filename.encode()).hexdigest()[:8]  # Use first 8 chars of hash
        
        # Create new filename: hash + original extension
        new_filename = f"{filename_hash}{extension}"
        
        # Create the new path
        if parent_dir == Path('.'):
            new_path = new_filename
        else:
            new_path = str(parent_dir / new_filename)
        
        # Store the mapping
        self.filename_map[original_path] = new_path
        
        return new_path
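
    # --- Illustrative sketch (editor addition, not part of the original source) ---
    # The mapping keeps the directory and extension but replaces the long Instagram
    # filename with the first 8 hex chars of its MD5 (hash shown as a placeholder):
    #
    #   self.shorten_filename("media/posts/202001/123456789_1234567890_n.jpg")
    #   # -> "media/posts/202001/<8-char-md5>.jpg"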

    def process_media_files(self, post_data, profile_picture, stories_data=None):
        """Process all media files from posts, stories, and profile picture."""
        # First, fix any incorrect file extensions in the extraction directory
        print("Checking and fixing file extensions...")
        extension_stats = self.fix_file_extensions(self.extraction_dir)
        print(f"Fixed {extension_stats['fixed']} files with incorrect extensions")
        
        # Create a path mapping for quick lookups
        path_mapping = extension_stats.get("path_mapping", {})
        
        # Update profile picture path if it was fixed
        if profile_picture in path_mapping:
            profile_picture = path_mapping[profile_picture]

        # Process profile picture and get shortened path (only if profile picture exists)
        shortened_profile = ""
        if profile_picture and profile_picture.strip():
            # Check if the profile picture file actually exists
            profile_path = Path(self.extraction_dir) / profile_picture
            if profile_path.exists() and profile_path.is_file():
                shortened_profile = self.shorten_filename(profile_picture)
                self.copy_file_to_distribution(profile_picture)
                self.generate_thumbnail(profile_picture, shortened_profile)
            else:
                print(f"Warning: Profile picture not found or is not a file: {profile_picture}")
        else:
            print("Warning: No profile picture specified in data")

        # Collect all media files to process
        all_media = []
        story_media = []  # Separate list for story media
        
        # Create a deep copy of post_data to modify
        updated_post_data = {}
        
        for timestamp, post in post_data.items():
            # Create a copy of the post
            updated_post = post.copy()
            updated_media = []
            
            for media_url in post["m"]:
                # Check if this media URL was fixed
                if str(self.extraction_dir / media_url) in path_mapping:
                    # Get the new path relative to extraction_dir
                    new_full_path = path_mapping[str(self.extraction_dir / media_url)]
                    media_url = str(Path(new_full_path).relative_to(self.extraction_dir))
                
                # Add to processing list
                all_media.append(media_url)
                
                # Get shortened path
                shortened_url = self.shorten_filename(media_url)
                updated_media.append(shortened_url)
            
            # Update the post with shortened media URLs
            updated_post["m"] = updated_media
            updated_post_data[timestamp] = updated_post
            
        # Process stories data if provided
        updated_stories_data = {}
        story_thumbnails = {}  # Store story thumbnail paths
        
        if stories_data:
            for timestamp, story in stories_data.items():
                # Create a copy of the story
                updated_story = story.copy()
                updated_media = []
                
                for media_url in story["m"]:
                    # Check if this media URL was fixed
                    if str(self.extraction_dir / media_url) in path_mapping:
                        # Get the new path relative to extraction_dir
                        new_full_path = path_mapping[str(self.extraction_dir / media_url)]
                        media_url = str(Path(new_full_path).relative_to(self.extraction_dir))
                    
                    # Add to processing list
                    all_media.append(media_url)
                    story_media.append(media_url)  # Also add to story-specific list
                    
                    # Get shortened path
                    shortened_url = self.shorten_filename(media_url)
                    updated_media.append(shortened_url)
                
                # Update the story with shortened media URLs
                updated_story["m"] = updated_media
                updated_stories_data[timestamp] = updated_story

        total_media = len(all_media)
        print(
            f"Processing {total_media} media files using {self.thread_count} threads..."
        )

        # Process media files in parallel using ThreadPoolExecutor with tqdm
        with ThreadPoolExecutor(max_workers=self.thread_count) as executor:
            list(
                tqdm(
                    executor.map(self.copy_file_to_distribution, all_media),
                    total=total_media,
                    desc="Processing media files",
                    unit="files",
                )
            )
        
        # Generate story thumbnails with 9:16 aspect ratio
        if story_media:
            print(f"Generating {len(story_media)} story thumbnails with 9:16 aspect ratio...")
            
            # Process story thumbnails and collect results
            with ThreadPoolExecutor(max_workers=self.thread_count) as executor:
                story_thumb_results = list(
                    tqdm(
                        executor.map(
                            lambda media_url: (
                                media_url,
                                self.generate_story_thumbnail(
                                    self.extraction_dir / media_url, 
                                    self.shorten_filename(media_url)
                                )
                            ),
                            story_media
                        ),
                        total=len(story_media),
                        desc="Processing story thumbnails",
                        unit="files",
                    )
                )
            
            # Create a mapping of original media to thumbnail paths
            for media_url, thumb_path in story_thumb_results:
                if thumb_path:
                    # Store relative path from output directory
                    rel_thumb_path = str(Path(thumb_path).relative_to(self.output_dir))
                    story_thumbnails[self.shorten_filename(media_url)] = rel_thumb_path
            
            # Add thumbnail paths to story data
            for timestamp, story in updated_stories_data.items():
                for i, media_url in enumerate(story["m"]):
                    if media_url in story_thumbnails:
                        # If this is the first media item, add the thumbnail path to the story
                        if i == 0:
                            story["story_thumb"] = story_thumbnails[media_url]

        # Calculate space savings
        self._calculate_space_savings(post_data)

        # Return updated post data and statistics
        return {
            "updated_post_data": updated_post_data,
            "updated_stories_data": updated_stories_data,
            "shortened_profile": shortened_profile,
            "stats": {
                "thumbnail_count": self.thumbnail_count,
                "webp_count": self.webp_count,
                "total_size_original": self.total_size_original,
                "total_size_webp": self.total_size_webp,
                "space_saved_mb": (self.total_size_original - self.total_size_webp)
                / (1024 * 1024),
                "percentage_saved": (
                    (self.total_size_original - self.total_size_webp)
                    / self.total_size_original
                    * 100
                    if self.total_size_original > 0
                    else 0
                ),
                "extension_fixes": extension_stats["fixed"],
            }
        }
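
    # --- Illustrative usage sketch (editor addition, not part of the original source) ---
    # Wiring the processor to a data package from InstagramDataLoader; the
    # "profile_picture" key below is a hypothetical name, not confirmed by the source:
    #
    #   processor = InstagramMediaProcessor("extracted_archive/", "output/", quality=70)
    #   result = processor.process_media_files(
    #       package["posts"],
    #       package["profile"].get("profile_picture", ""),   # hypothetical key
    #       package["stories"],
    #   )
    #   print(f"Saved {result['stats']['space_saved_mb']:.1f} MB "
    #         f"({result['stats']['percentage_saved']:.0f}%) via WebP conversion")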

    def _calculate_space_savings(self, post_data):
        """Calculate space savings from WebP conversion and o
SYMBOL INDEX (102 symbols across 10 files)

FILE: deprecated_php_utility/index.php
  function copy_media_files (line 14) | function copy_media_files($post_data, $profile_picture) {
  function copy_file_to_distribution (line 117) | function copy_file_to_distribution($file_path) {
  function convert_to_webp (line 159) | function convert_to_webp($source_path, $destination_path) {
  function generate_thumbnail (line 251) | function generate_thumbnail($source_path, $relative_path) {
  function process_and_save_image (line 415) | function process_and_save_image($source_image, $thumb_path, $target_widt...
  function render_instagram_grid (line 467) | function render_instagram_grid($post_data, $lazy_after = 30) {
  function find_posts_json (line 591) | function find_posts_json() {
  function extractRelevantData (line 680) | function extractRelevantData($combined_data) {
  function verify_images_in_html (line 978) | function verify_images_in_html($html_content) {

FILE: deprecated_php_utility/modal.js
  function initialize (line 23) | function initialize() {
  function initializeSorting (line 37) | function initializeSorting() {
  function sortPosts (line 56) | function sortPosts(sortType) {
  function getTimestampByIndex (line 138) | function getTimestampByIndex(index) {
  function getLikesByIndex (line 144) | function getLikesByIndex(index) {
  function getCommentsByIndex (line 153) | function getCommentsByIndex(index) {
  function getViewsByIndex (line 162) | function getViewsByIndex(index) {
  function attachGridItemListeners (line 171) | function attachGridItemListeners() {
  function openModal (line 182) | function openModal(index, imageIndex = 0) {
  function updateUrlWithPostInfo (line 253) | function updateUrlWithPostInfo(timestamp, imageIndex) {
  function createMediaElement (line 271) | function createMediaElement(mediaUrl) {
  function updateModalContent (line 316) | function updateModalContent(post, initialImageIndex = 0) {
  function navigateSlideshow (line 435) | function navigateSlideshow(direction) {
  function showSlide (line 463) | function showSlide(index) {
  function navigatePost (line 494) | function navigatePost(direction) {
  function closeModal (line 526) | function closeModal() {

FILE: memento_mori/cli.py
  function main (line 16) | def main():

FILE: memento_mori/extractor.py
  class InstagramArchiveExtractor (line 10) | class InstagramArchiveExtractor:
    method __init__ (line 23) | def __init__(self, input_path=None, output_path=None, cleanup=True):
    method auto_detect_archive (line 41) | def auto_detect_archive(self, search_dir="."):
    method _is_instagram_archive (line 82) | def _is_instagram_archive(self, zip_path):
    method extract (line 108) | def extract(self):
    method validate_structure (line 164) | def validate_structure(self):
    method _map_important_files (line 188) | def _map_important_files(self):
    method _extract_and_merge (line 213) | def _extract_and_merge(self, zip_path, target_dir):
    method _merge_dirs (line 234) | def _merge_dirs(self, src, dst):
    method get_file_path (line 247) | def get_file_path(self, file_type):
    method cleanup_temp_files (line 259) | def cleanup_temp_files(self):
    method __del__ (line 268) | def __del__(self):

FILE: memento_mori/file_mapper.py
  class InstagramFileMapper (line 6) | class InstagramFileMapper:
    method __init__ (line 51) | def __init__(self, base_dir):
    method discover_all_files (line 55) | def discover_all_files(self):
    method discover_files (line 63) | def discover_files(self, file_type, patterns=None):
    method get_file_path (line 99) | def get_file_path(self, file_type):
    method validate_required_files (line 109) | def validate_required_files(self, required_files):

FILE: memento_mori/generator.py
  class InstagramSiteGenerator (line 14) | class InstagramSiteGenerator:
    method __init__ (line 24) | def __init__(self, data_package, output_dir, template_dir=None, static...
    method generate (line 68) | def generate(self):
    method _copy_static_assets (line 95) | def _copy_static_assets(self):
    method _generate_html (line 123) | def _generate_html(self):
    method _render_grid (line 174) | def _render_grid(self):
    method _get_display_media (line 205) | def _get_display_media(self, post, use_lazy_loading=False):
    method _generate_stories_html (line 277) | def _generate_stories_html(self):

FILE: memento_mori/loader.py
  function fix_double_encoded_utf8 (line 11) | def fix_double_encoded_utf8(text):
  class InstagramDataLoader (line 23) | class InstagramDataLoader:
    method __init__ (line 34) | def __init__(self, extraction_dir, file_mapper=None, verbose=False):
    method load_profile_data (line 61) | def load_profile_data(self):
    method load_location_data (line 107) | def load_location_data(self):
    method load_posts_data (line 138) | def load_posts_data(self):
    method load_insights_data (line 220) | def load_insights_data(self):
    method combine_data (line 275) | def combine_data(self):
    method extract_relevant_data (line 352) | def extract_relevant_data(self):
    method load_followers_data (line 509) | def load_followers_data(self):
    method process_json_strings (line 540) | def process_json_strings(self, data):
    method load_stories_data (line 558) | def load_stories_data(self):
    method load_all_data (line 788) | def load_all_data(self):

FILE: memento_mori/media.py
  class InstagramMediaProcessor (line 16) | class InstagramMediaProcessor:
    method __init__ (line 26) | def __init__(self, extraction_dir, output_dir, thread_count=None, qual...
    method _build_file_index (line 57) | def _build_file_index(self):
    method shorten_filename (line 82) | def shorten_filename(self, original_path):
    method process_media_files (line 126) | def process_media_files(self, post_data, profile_picture, stories_data...
    method _calculate_space_savings (line 294) | def _calculate_space_savings(self, post_data):
    method copy_file_to_distribution (line 342) | def copy_file_to_distribution(self, file_path, quiet=True):
    method convert_to_webp (line 394) | def convert_to_webp(self, source_path, destination_path, quiet=False):
    method fix_file_extensions (line 465) | def fix_file_extensions(self, directory_path):
    method generate_thumbnail (line 611) | def generate_thumbnail(self, source_path, relative_path, quiet=False):
    method generate_story_thumbnail (line 722) | def generate_story_thumbnail(self, source_path, relative_path, quiet=F...

FILE: memento_mori/static/js/modal.js
  function initialize (line 24) | function initialize() {
  function initializeSorting (line 38) | function initializeSorting() {
  function sortPosts (line 57) | function sortPosts(sortType) {
  function getTimestampByIndex (line 139) | function getTimestampByIndex(index) {
  function getLikesByIndex (line 145) | function getLikesByIndex(index) {
  function getCommentsByIndex (line 154) | function getCommentsByIndex(index) {
  function getViewsByIndex (line 163) | function getViewsByIndex(index) {
  function attachGridItemListeners (line 172) | function attachGridItemListeners() {
  function openModal (line 183) | function openModal(index, imageIndex = 0) {
  function updateUrlWithPostInfo (line 254) | function updateUrlWithPostInfo(timestamp, imageIndex) {
  function createMediaElement (line 272) | function createMediaElement(mediaUrl) {
  function updateModalContent (line 317) | function updateModalContent(post, initialImageIndex = 0) {
  function navigateSlideshow (line 440) | function navigateSlideshow(direction) {
  function showSlide (line 468) | function showSlide(index) {
  function navigatePost (line 499) | function navigatePost(direction) {
  function closeModal (line 531) | function closeModal() {
  function fixEncodingIssues (line 631) | function fixEncodingIssues(text) {

FILE: memento_mori/static/js/stories.js
  function openStory (line 30) | function openStory(index) {
  function loadCurrentStory (line 54) | function loadCurrentStory() {
  function startAutoProgressTimer (line 90) | function startAutoProgressTimer() {
  function clearAutoProgressTimer (line 111) | function clearAutoProgressTimer() {
  function loadStoryContent (line 132) | function loadStoryContent(slide, index) {
  function navigateStory (line 252) | function navigateStory(direction) {
  function closeStory (line 348) | function closeStory() {
  function togglePause (line 381) | function togglePause() {
  function checkUrlForStory (line 448) | function checkUrlForStory() {

