Full Code of dosyago/DiskerNet for AI

Repository: dosyago/DiskerNet
Branch: fun
Commit: c757475ca9f1
Files: 64
Total size: 322.0 KB

Directory structure:
gitextract_oucmwqcm/

├── .eslintrc.cjs
├── .github/
│   ├── FUNDING.yml
│   └── workflows/
│       └── node.js.yml
├── .gitignore
├── .npm.release
├── .npmignore
├── .npmrelease
├── CONTRIBUTING.md
├── NOTICE
├── README.md
├── TODO
├── docs/
│   ├── OLD-README.md
│   ├── SECURITY.md
│   ├── features.md
│   ├── issues
│   └── todo
├── eslint.config.js
├── exec.js
├── global-run.cjs
├── icons/
│   └── dk.icns
├── package.json
├── public/
│   ├── find_cleaned_duplicates.mjs
│   ├── find_crawlable.mjs
│   ├── injection.js
│   ├── library/
│   │   └── README.md
│   ├── make_top.mjs
│   ├── old-index.html
│   ├── problem_find.mjs
│   ├── redirector.html
│   ├── style.css
│   ├── test-injection.html
│   └── top.html
├── scripts/
│   ├── build_only.sh
│   ├── clean.sh
│   ├── downloadnet-entitlements.xml
│   ├── go_build.sh
│   ├── go_dev.sh
│   ├── postinstall.sh
│   ├── publish.sh
│   ├── release.sh
│   └── sign_windows_release.ps1
├── sign-win.ps1
├── src/
│   ├── app.js
│   ├── archivist.js
│   ├── args.js
│   ├── blockedResponse.js
│   ├── bookmarker.js
│   ├── common.js
│   ├── gem-highlighter.js
│   ├── hello.js
│   ├── highlighter.js
│   ├── index.js
│   ├── installBrowser.js
│   ├── launcher.js
│   ├── libraryServer.js
│   ├── protocol.js
│   ├── root.cjs
│   └── root.js
├── stampers/
│   ├── macos-new.sh
│   ├── macos.sh
│   ├── nix.sh
│   ├── notarize_macos.sh
│   └── win.bat
└── test.sh

================================================
FILE CONTENTS
================================================

================================================
FILE: .eslintrc.cjs
================================================
module.exports = {
  "env": {
    "es2021": true,
    "node": true
  },
  "extends": "eslint:recommended",
  "parserOptions": {
    "ecmaVersion": 13,
    "sourceType": "module"
  },
  "ignorePatterns": [
    "build/**/*.js"
  ],
  "rules": {
  }
};


================================================
FILE: .github/FUNDING.yml
================================================
# These are supported funding model platforms

custom: https://dosaygo.com/downloadnet


================================================
FILE: .github/workflows/node.js.yml
================================================
# This workflow will do a clean installation of node dependencies, cache/restore them, build the source code and run tests across different versions of node
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-nodejs

name: Node.js CI

on:
  push:
    branches: [ "fun" ]
  pull_request:
    branches: [ "fun" ]

jobs:
  build:

    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [16.x, 18.x, 19.x]
        # See supported Node.js release schedule at https://nodejs.org/en/about/releases/

    steps:
    - uses: actions/checkout@v3
    - name: Use Node.js ${{ matrix.node-version }}
      uses: actions/setup-node@v3
      with:
        node-version: ${{ matrix.node-version }}
        cache: 'npm'
    - run: npm ci
    - run: npm run build --if-present
    - run: npm test


================================================
FILE: .gitignore
================================================
*.pkg
"
"*
*~
.*.un~
*.blob
.\build\*
22120-arc

.*.swp

# Bundling and packaging
22120.exe
22120.nix
22120.mac
22120.win32.exe
22120.nix32
bin/*
build/*

#Leave these to allow install by npm -g
#22120.js
#*.22120.js

# Library
public/library/cache.json
public/library/http*


# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*

# Diagnostic reports (https://nodejs.org/api/report.html)
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage
*.lcov

# nyc test coverage
.nyc_output

# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# TypeScript v1 declaration files
typings/

# TypeScript cache
*.tsbuildinfo

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Microbundle cache
.rpt2_cache/
.rts2_cache_cjs/
.rts2_cache_es/
.rts2_cache_umd/

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variables file
.env
.env.test

# parcel-bundler cache (https://parceljs.org/)
.cache

# Next.js build output
.next

# Nuxt.js build / generate output
.nuxt
dist

# Gatsby files
.cache/
# Comment in the public line in if your project uses Gatsby and *not* Next.js
# https://nextjs.org/blog/next-9-1#public-directory-support
# public

# vuepress build output
.vuepress/dist

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# TernJS port file
.tern-port


================================================
FILE: .npm.release
================================================
Sun Jan 15 15:11:49 CST 2023


================================================
FILE: .npmignore
================================================

.*.swp
*~
.*un~

# Bundling and packaging
build/bin/*

build/cjs/*



================================================
FILE: .npmrelease
================================================
Fri Aug 30 00:09:47 CST 2024


================================================
FILE: CONTRIBUTING.md
================================================
# Contributing

When contributing to this repository, please first discuss the change you wish to make via issue,
email, or any other method with the owners of this repository before making a change. 

Please note we have a code of conduct, please follow it in all your interactions with the project.

## Pull Request Process

1. Ensure any install or build dependencies are removed before the end of the layer when doing a 
   build.
2. Update the README.md with details of changes to the interface, this includes new environment 
   variables, exposed ports, useful file locations and container parameters.
3. Increase the version numbers in any examples files and the README.md to the new version that this
   Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/).
4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you 
   do not have permission to do that, you may request the second reviewer to merge it for you.

## Code of Conduct

### Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, gender identity and expression, level of experience,
nationality, personal appearance, race, religion, or sexual identity and
orientation.

### Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
  address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

### Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

### Scope

This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.

### Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at [INSERT EMAIL ADDRESS]. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

### Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at [http://contributor-covenant.org/version/1/4][version]

[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/


================================================
FILE: NOTICE
================================================
Copyright Dosyago Corporation & Cris Stringfellow (https://dosaygo.com)

22120 and all previously released versions, including binaries, NPM packages, and 
Docker images (including those named archivist1, and all other previous names),
are re-licensed under the following PolyForm Strict License 1.0.0, and all previous
licenses are revoked.



================================================
FILE: README.md
================================================
# :floppy_disk: [DownloadNet (dn)](https://github.com/dosyago/DownloadNet) – Your Offline Web Archive with Full Text Search

![source lines of code](https://sloc.xyz/github/crisdosyago/Diskernet)
![binary downloads](https://img.shields.io/github/downloads/c9fe/22120/total?label=OS%20binary%20downloads)
![DownloadNet slogan](https://img.shields.io/badge/%F0%9F%92%BE%20dn-an%20internet%20on%20yer%20disc-hotpink)

Imagine a world where everything you browse online is saved and accessible, even when you're offline. That's the magic of DownloadNet (dn).

## Why dn?

- **Seamless Offline Experience** :earth_africa:: With dn, your offline browsing feels exactly like being online. It hooks directly into your browser, caching every page you visit, so you never lose track of that one article or resource you meant to revisit.
- **Full Text Search** :mag:: Unlike other archiving tools, dn gives you the power to search through your entire archive. No more digging through countless files—just search and find.
- **Completely Private** :lock:: Everything is stored locally on your machine. Browse whatever you want, with the peace of mind that it's all private and secure.

## Getting Started

### 1. **Download a Pre-built Binary (Simplest Option)** :package:
If you’re not familiar with Git or npm, this is the easiest way to get started:

1. **Go to the [Releases Page](https://github.com/dosyago/DownloadNet/releases)**
2. **Download** the binary for your operating system (e.g., Windows, macOS, Linux).
3. **Run** the downloaded application. That’s it! You’re ready to start archiving.

>[!NOTE]
> macOS now has a proper package installer, making installation even easier.

### 2. **Install via npm (For Users Familiar with Command Line)** :rocket:

1. **Open your terminal** (Command Prompt on Windows, Terminal on macOS/Linux).
2. **Install dn globally** with npm:
   ```sh
   npm i -g downloadnet@latest
   ```
3. **Start dn** by typing:
   ```sh
   dn
   ```

> [!NOTE]
> Make sure you have Node.js installed before attempting to use npm. If you're new to npm, see the next section for guidance.

### 3. **New to npm? No Problem!** :bulb:

If you’ve never used npm before, don’t worry—it’s easy to get started.

- **What is npm?** npm is a package manager for Node.js, a JavaScript runtime that allows you to run server-side code. You’ll use npm to install and manage software like dn.
- **Installing Node.js and npm:** The easiest way to install Node.js (which includes npm) is by using Node Version Manager (nvm). This tool allows you to easily install, manage, and switch between different versions of Node.js.

**To install nvm:**

1. **Visit the [nvm GitHub page](https://github.com/nvm-sh/nvm#installing-and-updating)** for installation instructions.
2. **Follow the steps** to install nvm on your system.
3. Once nvm is installed, **install the latest version of Node.js** by running:
   ```sh
   nvm install node
   ```
4. Now you can install dn using npm as described in the section above!

> [!TIP]
> Using nvm allows you to easily switch between Node.js versions and manage your environment more effectively.

### 4. **Build Your Own Binary (For Developers or Power Users)** :hammer_and_wrench:

If you like to tinker and want to build the binary yourself, here’s how:

1. **Download Git:** If you haven’t used Git before, download and install it from [git-scm.com](https://git-scm.com/).
2. **Clone the Repository:**
   ```sh
   git clone https://github.com/dosyago/DownloadNet.git
   ```
3. **Navigate to the Project Directory:**
   ```sh
   cd DownloadNet
   ```
4. **Install Dependencies:**
   ```sh
   npm i
   ```
5. **Build the Binary:**
   ```sh
   npm run build
   ```

6. **Find Your Binary:** The newly built binary will be in the `./build/bin` directory, ready to be executed!

### 5. **Run Directly from the Repository (Quick Start)** :runner:

Want to get dn up and running without building a binary? No problem!

1. **Clone the Repository:**
   ```sh
   git clone https://github.com/dosyago/DownloadNet.git
   ```
2. **Navigate to the Project Directory:**
   ```sh
   cd DownloadNet
   ```
3. **Install Dependencies:**
   ```sh
   npm i
   ```
4. **Start dn:**
   ```sh
   npm start
   ```

And just like that, you’re archiving!

## How It Works

dn runs as an intercepting proxy, hooking into your browser's internal fetch cycle. Once you fire up dn, it automatically configures your browser, and you’re good to go. Everything you browse is archived, and you can choose to save everything or just what you bookmark.

### Modes:

- **Save Mode** :floppy_disk:: Archive and index as you browse.
- **Serve Mode** :open_file_folder:: Browse your saved content as if you were still online.

> [!CAUTION]
> As your archive grows, you may encounter performance issues. If that happens, you can adjust the memory settings by setting environment variables for NODE runtime arguments, like `--max-old-space-size`.
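
If you run dn through the Node.js runtime (e.g. the npm install), the runtime honors the standard `NODE_OPTIONS` environment variable, so one way to apply the setting looks like this (whether the packaged binaries also read this variable is an assumption to verify):

```sh
# Raise the V8 old-space heap limit to 4 GB before starting dn
export NODE_OPTIONS="--max-old-space-size=4096"
dn
```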

## Accessing Your Archive

Once dn is running, your archive is at your fingertips. Just go to `http://localhost:22120` in your browser. Your archive’s control panel opens automatically, and from there, you can search, configure settings, and explore everything you’ve saved.

## Minimalistic Interface, Maximum Power

dn’s interface is basic but functional. It’s not about flashy design; it’s about delivering what you need—offline access to the web, as if you were still connected.

## Advanced Settings (If Needed)

As your archive grows, you may want to adjust where it's stored, manage memory settings, or blacklist domains you don’t want to archive. All of these settings can be tweaked directly from the control panel or command line.

## Get Started Now

With dn, you’ll never lose track of anything you’ve read online. It’s all right there in your own offline archive, fully searchable and always accessible. Whether you're in save mode or serve mode, dn keeps your digital life intact.

**:arrow_down: Download** | **:rocket: Install** | **:runner: Run** | **:mag_right: Never Lose Anything Again**

[Get Started with dn](https://github.com/dosyago/DownloadNet)

----


================================================
FILE: TODO
================================================
Ultimate Goal

- stable across releases (binaries, npm, can add to winget/choco in future)
- revenue

----

Releases

- macos signed
- win signed
- linux 
- release per arch where relevant as well

Future

  - UX to select an existing Chrome profile from standard locations. Ident via Google Profile Picture.png and then copy to a $USER_DATA_DIR/Default directory (rsync or robocopy maybe) and pass --user-data-dir=$USER_DATA_DIR for it to just work as of chrome 136 anyway. 
  - consider other mitigations if these are ineffective (watermark in archives, other limitations, closed source / more advanced features unlocked, etc)
    - More future tasks:
      Marketing

      - /download-net page on dosaygo.com

      Crawl fixes

      - make batch size work
      - ensure no more than one tab from any domain per batch (so that between-load timeouts are enforced)
      - save crawl in a "running crawls page"
      - be able to pause a crawl and restart it (should be simple), and crawl state persisted to disk.

      Add product key

      - product key section in crawl and settings
      - 15 minutes then shutdown
      - no free eval license key 
      - license is 69 per seat per year
      - plumbing for the backend

      Dev

      - add cross plat node exec.js for scripts



================================================
FILE: docs/OLD-README.md
================================================
# :floppy_disk: [DownloadNet](https://github.com/c9fe/22120) [![source lines of code](https://sloc.xyz/github/crisdosyago/Diskernet)](https://sloc.xyz) [![npm downloads (22120)](https://img.shields.io/npm/dt/archivist1?label=npm%20downloads%20%2822120%29)](https://npmjs.com/package/archivist1) [![npm downloads (downloadnet, since Jan 2022)](https://img.shields.io/npm/dt/downloadnet?label=npm%20downloads%20%28downloadnet%2C%20since%20Jan%202022%29)](https://npmjs.com/package/downloadnet) [![binary downloads](https://img.shields.io/github/downloads/c9fe/22120/total?label=OS%20binary%20downloads)](https://GitHub.com/crisdosyago/DownloadNet/releases) [![visitors+++](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fc9fe%2F22120&count_bg=%2379C83D&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=%28today%2Ftotal%29%20visitors%2B%2B%2B%20since%20Oct%2027%202020&edge_flat=false)](https://hits.seeyoufarm.com) ![version](https://img.shields.io/npm/v/archivist1)

:floppy_disk: - an internet on yer Disk

**DownloadNet** (codename *PROJECT 22120*) is an archivist browser controller that caches everything you browse, and a library server with full text search to serve your archive. 

**Now with full text search over your archive.** 

This feature is just released in version 2 so it will improve over time.

## And one more thing...

**Coming to a future release, soon!**: The ability to publish your own search engine that you curated with the best resources based on your expert knowledge and experience.

## Get it

[Download a release](https://github.com/crisdosyago/Diskernet/releases)

or ...

**Get it on [npm](https://www.npmjs.com/package/downloadnet):**

```sh
$ npm i -g downloadnet@latest
```

or...

**Build your own binaries:**

```sh
$ git clone https://github.com/crisdosyago/DownloadNet
$ cd DownloadNet
$ npm i 
$ ./scripts/build_setup.sh
$ ./scripts/compile.sh
$ cd bin/
```

<span id=toc></span>
----------------
- [Overview](#classical_building-22120---)
  * [License](#license)
  * [About](#about)
  * [Get 22120](#get-22120)
  * [Using](#using)
    + [Pick save mode or serve mode](#pick-save-mode-or-serve-mode)
    + [Exploring your 22120 archive](#exploring-your-22120-archive)
  * [Format](#format)
  * [Why not WARC (or another format like MHTML) ?](#why-not-warc-or-another-format-like-mhtml-)
  * [How it works](#how-it-works)
  * [FAQ](#faq)
    + [Do I need to download something?](#do-i-need-to-download-something)
    + [Can I use this with a browser that's not Chrome-based?](#can-i-use-this-with-a-browser-thats-not-chrome-based)
    + [How does this interact with Ad blockers?](#how-does-this-interact-with-ad-blockers)
    + [How secure is running chrome with remote debugging port open?](#how-secure-is-running-chrome-with-remote-debugging-port-open)
    + [Is this free?](#is-this-free)
    + [What if it can't find my chrome?](#what-if-it-cant-find-my-chrome)
    + [What's the roadmap?](#whats-the-roadmap)
    + [What about streaming content?](#what-about-streaming-content)
    + [Can I black list domains to not archive them?](#can-i-black-list-domains-to-not-archive-them)
    + [Is there a DEBUG mode for troubleshooting?](#is-there-a-debug-mode-for-troubleshooting)
    + [Can I version the archive?](#can-i-version-the-archive)
    + [Can I change the archive path?](#can-i-change-the-archive-path)
    + [Can I change this other thing?](#can-i-change-this-other-thing)

------------------

## License 

22120 is licensed under Polyform Strict License 1.0.0 (no modification, no distribution). You can purchase a license for different uses below:


-  for personal, research, noncommercial purposes: 
[Buy a Perpetual Non-commercial Use License of the current Version re-upped Monthly to the Latest Version, USD$1.99 per month](https://buy.stripe.com/fZeg0a45zdz58U028z) [Read license](https://github.com/DOSYCORPS/polyform-licenses/blob/1.0.0/PolyForm-Noncommercial-1.0.0.md)
- for part of your internal tooling in your org: [Buy a Perpetual Internal Use License of the current Version re-upped Monthly to the Latest Version, USD $12.99 per month](https://buy.stripe.com/00g4hsgSlbqXb288wY) [Read license](https://github.com/DOSYCORPS/polyform-licenses/blob/1.0.0/PolyForm-Internal-Use-1.0.0.md)
- for anywhere in your business: [Buy a Perpetual Small-medium Business License of the current Version re-upped Monthly to the Latest Version, USD $99 per month](https://buy.stripe.com/aEUbJUgSl2UreekdRj) [Read license](https://github.com/DOSYCORPS/polyform-licenses/blob/1.0.0/PolyForm-Small-Business-1.0.0.md)

<p align=right><small><a href=#toc>Top</a></small></p>

## About

**This project literally makes your web browsing available COMPLETELY OFFLINE.** Your browser does not even know the difference. It's literally that amazing. Yes. 

Save your browsing, then switch off the net and go to `http://localhost:22120` and switch mode to **serve** then browse what you browsed before. It all still works.

**warning: if you have Chrome open, it will close it automatically when you open 22120, and relaunch it. You may lose any unsaved work.**

<p align=right><small><a href=#toc>Top</a></small></p>

## Get 22120

3 ways to get it:

1. Get binary from the [releases page.](https://github.com/c9fe/22120/releases), or
2. Run with npx: `npx downloadnet@latest`, or
    - `npm i -g downloadnet@latest && exlibris`
3. Clone this repo and run as a Node.JS app: `npm i && npm start` 

<p align=right><small><a href=#toc>Top</a></small></p>

## Using

### Pick save mode or serve mode

Go to http://localhost:22120 in your browser, 
and follow the instructions. 

<p align=right><small><a href=#toc>Top</a></small></p>

### Exploring your 22120 archive

Archive will be located in `22120-arc/public/library`\*

But it's not public, don't worry!

You can also check out the archive index, for a listing of every title in the archive. The index is accessible from the control page, which by default is at [http://localhost:22120](http://localhost:22120) (unless you changed the port).

\**Note:`22120-arc` is the archive root of a single archive, and by default it is placed in your home directory. But you can change the parent directory for `22120-arc` to have multiple archives.*

<p align=right><small><a href=#toc>Top</a></small></p>

## Format

The archive format is:

`22120-arc/public/library/<resource-origin>/<path-hash>.json`

Inside the JSON file is a JSON object with the headers, response code, key, and a base64-encoded response body.
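
As a sketch of what building such a record might look like (the field names, directory-name sanitization, and hash function here are illustrative assumptions, not the exact ones dn uses):

```javascript
import crypto from 'node:crypto';
import path from 'node:path';

// Build an archive-relative path for a resource, following the
// 22120-arc/public/library/<resource-origin>/<path-hash>.json layout.
// sha1 and the origin sanitization are assumptions for illustration.
function recordPath(urlString) {
  const url = new URL(urlString);
  const pathHash = crypto.createHash('sha1')
    .update(url.pathname + url.search)
    .digest('hex');
  const originDir = url.origin.replace(/[/:]/g, '_');
  return path.join('22120-arc', 'public', 'library', originDir, `${pathHash}.json`);
}

// A record shaped like the description above: headers, response code,
// key, and a base64-encoded body (field names are illustrative).
function makeRecord(method, urlString, headers, statusCode, body) {
  return {
    key: `${method} ${urlString}`,
    responseCode: statusCode,
    headers,
    body: Buffer.from(body).toString('base64'),
  };
}
```

Because the body is stored as base64 alongside the original headers, replaying a record is just decoding the body and re-sending it with the saved status and headers.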

<p align=right><small><a href=#toc>Top</a></small></p>

## Why not WARC (or another format like MHTML) ?

**The case for the 22120 format.**

Other formats (like MHTML and SingleFile) save translations of the resources you archive. They create modifications, such as altering the internal structure of the HTML, changing hyperlinks and URLs into "flat" embedded data URIs or local references, and they require other "hacks" in order to save a "perceptually similar" copy of the archived resource.

22120 throws all that out, and calls rubbish on it. 22120 saves a *verbatim* **high-fidelity** copy of the resources you archive. It does not alter their internal structure in any way. Instead it records each resource in its own metadata file. In that way it is more similar to HAR and WARC, but still radically different. Compared to WARC and HAR, our format is radically simplified, throwing out most of the metadata and unnecessary fields those formats collect.

**Why?**

At 22120, we believe in the resources and in verbatim copies. We don't anoint ourselves as all-knowing enough to modify the resource source of truth before we archive it, just so it can "fit the format" we choose. We don't believe we need to decorate it with obtuse and superfluous metadata. We don't believe we should be modifying or altering resources we archive. We believe we should save them exactly as they were presented. We believe in simplicity. We believe the format should fit (or at least accommodate, and be suited to) the resource, not the other way around. We don't believe in conflating **metadata** with **content**; so we separate them. We believe separating metadata and content, and keeping the content pure and unaltered throughout the archiving process, is not only the right thing to do, it simplifies every part of the audit trail, because we know that any differences between archived copies of a resource are due to changes in the resource itself, not artefacts of the format or archiving process.

Both SingleFile and MHTML require mutilating modifications of the resources so that the resources can be "forced to fit" the format. At 22120, we believe this is not required (and in any case should never be performed). We see it as akin to lopping off the arms of a Roman statue in order to fit it into a presentation and security display box. How ridiculous! The web may be a more "pliable" medium, but that does not mean we should treat it without respect for its inherent content. 

**Why is changing the internal structure of resources so bad?**

In our view, the internal structure of the resource as presented *is the canon*. Internal structure is not just substitutable "presentation" - no, in fact it encodes vital semantic information such as hyperlink relationships, source choices, and the "strokes" of the resource author as they create their content, even if it's mediated through a web server or web framework. 

**Why else is 22120 the obvious and natural choice?**

22120 also archives resources exactly as they are sent to the browser. Because it runs connected to a browser, it can access the full scope of resources the browser receives (currently with the exception of video, audio and WebSockets) in their highest fidelity, without modification, and archive them in the exact form presented to the user. Many resources undergo presentational and processing changes before they reach the user. This is the ubiquitous "web app", where client-side scripting enabled by JavaScript creates resources and resource views on the fly. These sorts of "hyper", "realtime", or "client-side" resources, prevalent in SPAs, cannot be archived by traditional `wget`-based archiving tools, at least not with the normal archive flow. 

In short, the web is an *online* medium, and it should be archived and presented in the same fashion. 22120 archives content exactly as it is received and presented by a browser, and it also replays that content exactly as if the resource were being fetched online. Yes, it requires a browser for this exercise, but that browser need not be connected to the internet. It is only natural that viewing a web resource requires a web browser. And because of 22120 the browser doesn't know the difference! Resources presented to the browser from a remote web site, and resources given to the browser by 22120, are seen by the browser as ***exactly the same.*** This ensures that people viewing the archive are not let down, and are given the chance to have the exact same experience as if they were viewing the resource online. 

<p align=right><small><a href=#toc>Top</a></small></p>

## How it works

Uses the DevTools protocol to intercept all requests, and caches responses onto disk against a key made of (METHOD and URL). It also maintains an in-memory set of keys so it knows what it has on disk. 
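
The core bookkeeping described above can be sketched like this (a simplified illustration with invented names, not dn's actual API; the real logic presumably lives in `src/archivist.js`):

```javascript
// Responses are stored against a (METHOD, URL) key, and an in-memory
// Set of keys tracks what is already on disk.
class ResponseCache {
  constructor() {
    this.keys = new Set();  // in-memory index of what's on disk
    this.disk = new Map();  // stands in for the on-disk library store
  }
  static key(method, url) {
    return `${method} ${url}`;
  }
  save(method, url, response) {
    const k = ResponseCache.key(method, url);
    this.disk.set(k, response); // in dn this would be a write to the library dir
    this.keys.add(k);
  }
  has(method, url) {
    // membership check never touches disk, only the Set
    return this.keys.has(ResponseCache.key(method, url));
  }
  load(method, url) {
    return this.disk.get(ResponseCache.key(method, url));
  }
}
```

Keeping the key set in memory means the "do we have this?" decision on every intercepted request is a Set lookup, with disk I/O only on save and replay.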

<p align=right><small><a href=#toc>Top</a></small></p>

## FAQ

### Do I need to download something?

Yes. But....If you like **22120**, you might love the clientless hosted version coming in future. You'll be able to build your archives online from any device, without any download, then download the archive to run on any desktop. You'll need to sign up to use it, but you can jump the queue and sign up [today](https://dosyago.com).

### Can I use this with a browser that's not Chrome-based? 

No. 

<p align=right><small><a href=#toc>Top</a></small></p>

### How does this interact with Ad blockers?

Interacts just fine. The things ad blockers stop will not be archived.

<p align=right><small><a href=#toc>Top</a></small></p>

### How secure is running chrome with remote debugging port open?

Seems pretty secure. It's not exposed to the public internet, and pages you load that tried to use it cannot use the protocol for anything (except to open a new tab, which they can do anyway). It seems there's a potential risk from malicious browser extensions, but we'd need to confirm that and if that's so, work out blocks. See [this useful security related post](https://github.com/c9fe/22120/issues/67) for some info.

<p align=right><small><a href=#toc>Top</a></small></p>

### Is this free?

Yes this is totally free to download and use for personal non-commercial use. If you want to modify or distribute it, or use it commercially (either internally or for customer functions) you need to purchase a [Noncommercial, internal use, or SMB license](#license). 

<p align=right><small><a href=#toc>Top</a></small></p>

### What if it can't find my chrome?

See this useful [issue](https://github.com/c9fe/22120/issues/68).

<p align=right><small><a href=#toc>Top</a></small></p>

### What's the roadmap?

- Full text search ✅
- Library server to serve archive publicly.
- Distributed p2p web browser on IPFS

<p align=right><small><a href=#toc>Top</a></small></p>

### What about streaming content?

The following are probably hard (and I haven't thought much about):

- Streaming content (audio, video)
- "Impure" request/response pairs (for example, if you call GET /endpoint once you get "A", but if you call it a second time you get "AA", and so on)
- WebSockets (how to capture and replay that faithfully?)

There's probably some way to do these, though.
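For WebSockets specifically, the DevTools protocol does expose sent/received frame events, so capture (if not a faithful replay) seems feasible. A minimal sketch, assuming a CDP session wrapper that provides an `on(event, handler)` registration function (the wrapper shape and `recordWebSocketTraffic` name here are hypothetical, not 22120's actual code):

```javascript
// Hypothetical sketch: log WebSocket frames per connection using the
// Chrome DevTools Protocol Network domain events, keyed by requestId.
function recordWebSocketTraffic(on) {
  const frames = new Map(); // requestId -> ordered frame log

  const log = (requestId, dir, payloadData) => {
    if (!frames.has(requestId)) frames.set(requestId, []);
    frames.get(requestId).push({dir, payloadData, t: Date.now()});
  };

  // CDP fires these for every frame on pages with Network domain enabled
  on('Network.webSocketFrameSent', ({requestId, response}) =>
    log(requestId, 'sent', response.payloadData));
  on('Network.webSocketFrameReceived', ({requestId, response}) =>
    log(requestId, 'received', response.payloadData));

  return frames;
}
```

Replaying such a log faithfully is the genuinely hard part, since frame timing and server state are outside the archive's control.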

<p align=right><small><a href=#toc>Top</a></small></p>

### Can I black list domains to not archive them?

Yes! Put any domains into `22120-arc/no.json`\*, e.g.:

```json
[
  "*.horribleplantations.com",
  "*.cactusfernfurniture.com",
  "*.gustymeadows.com",
  "*.nytimes.com",
  "*.cnn.co?"
]
```

Any resource with a host matching one of those patterns will not be cached. Wildcards:

- `*` (0 or more of anything) and
- `?` (0 or 1 of anything)

\**Note: the `no` file is per-archive. `22120-arc` is the archive root of a single archive, and by default it is placed in your home directory. But you can change the parent directory of `22120-arc` to keep multiple archives, and each archive requires its own `no` file if you want a blacklist in that archive.*
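For illustration, here is a minimal sketch of how such wildcard patterns can be matched against a host. This is a hypothetical re-implementation of the semantics described above, not necessarily the exact matcher 22120 uses:

```javascript
// Convert a no.json wildcard pattern to a RegExp:
// `*` matches 0 or more characters, `?` matches 0 or 1 character.
function globToRegExp(pattern) {
  // escape regex metacharacters, but leave * and ? for translation
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const body = escaped.replace(/\*/g, '.*').replace(/\?/g, '.?');
  return new RegExp(`^${body}$`);
}

// true if the host matches any blacklist pattern
function isBlocked(host, patterns) {
  return patterns.some(p => globToRegExp(p).test(host));
}
```

With the example list above, `isBlocked('www.nytimes.com', patterns)` would be true, while the bare apex `nytimes.com` would not match `*.nytimes.com` (the literal dot is required).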

<p align=right><small><a href=#toc>Top</a></small></p>

### Is there a DEBUG mode for troubleshooting?

Yes, just set an environment variable called `DEBUG_22120` to anything non-empty.

For example, on POSIX systems:

```bash
export DEBUG_22120=True
```

<p align=right><small><a href=#toc>Top</a></small></p>

### Can I version the archive?

Yes! But you need to use `git` for versioning. Just initialize a git repository in your archive directory, and when you want to save a snapshot, make a new git commit.

<p align=right><small><a href=#toc>Top</a></small></p>

### Can I change the archive path?

Yes, there's a control for changing the archive path in the control page: http://localhost:22120

<p align=right><small><a href=#toc>Top</a></small></p>

### Can I change this other thing?

There are a few command line arguments; you'll see the accepted format printed as the first line when you start the program.

For other things, you can examine the source code.

<p align=right><small><a href=#toc>Top</a></small></p>



================================================
FILE: docs/SECURITY.md
================================================
# Security Policy

## Supported Versions

The following versions of this project are currently supported with security updates.

| Version | Supported          |
| ------- | ------------------ |
| Latest  | :white_check_mark: |


## Reporting a Vulnerability

To report a vulnerability, contact: cris@dosycorp.com

To view previous responsible disclosure vulnerability reports, remediation write-ups, notes, and other information, please visit the [Dosyago Responsible Disclosure Center](https://github.com/dosyago/vulnerability-reports).


================================================
FILE: docs/features.md
================================================
Cool Possible Feature Ideas

- It might be nice to have historical documents indexed as well. For example, every time we reload a page, we could add a new copy to the index if it's different, or if it's been more than X time (say 1 day or 1 week) since the last copy was added. Then we show all results in search, perhaps in an expander under the main URL, labelled "historical URL". That way you can find a result that was on the front page of HN 1 year or 3 weeks ago, even if you revisit and reindex HN every day.
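The revisit policy above (store a new copy when the content differs, or when more than X time has passed since the last copy) could be sketched as follows; `shouldSnapshot` and `contentSignature` are hypothetical names, and the signature is a cheap stand-in where a real archiver might hash the DOM text:

```javascript
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // e.g. re-snapshot after 1 day even if unchanged

// djb2-style rolling hash of the page text, as a hex string
function contentSignature(text) {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

// prev: {signature, savedAt} for the last stored copy of this URL, or null
function shouldSnapshot(prev, pageText, now = Date.now()) {
  if (!prev) return true;                                         // first visit: always store
  if (prev.signature !== contentSignature(pageText)) return true; // content changed
  return (now - prev.savedAt) > MAX_AGE_MS;                       // unchanged but stale
}
```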



================================================
FILE: docs/issues
================================================
- ndx index seems to lose documents.
  - e.g.
  1. visit goog:hell
  2. visit top link: wiki - hell
  3. visit hellomagazine.com
  4. search hell
  5. see results: goog/hell, wiki/hell, hellomag
  6. reload wiki - hell
  7. search hell
  8. see results: wiki/hell, hellomag
  - WHERE THE HELL DID goog/hell go? 



================================================
FILE: docs/todo
================================================
- complete snippet generation
  - sometimes we are not getting any segments. In that case we should just show the first part of the file. 
  - improve trigram segmenter: lower max segment length, increase fore and aft context
- Index.json is randomly getting clobbered sometimes. Investigate and fix. Important because this breaks the whole archive.
  - No idea what's causing this after a small investigation, but I've added a log on saveIndex to see when it writes.
- publish button
  - way to selectively add (bookmark mode) 
  - way to remove (all modes) items from index
- save trigram index to disk
- let's not reindex unless we have changed contentSignature
- let's not write FTS indexes unless we have changed them since last time (UpdatedKeys)
- result paging
- We need to not open other localhosts if we already have one open
- We need to reload on localhost 22120 if we open with that
  - throttle how often this can occur per URL
- search improvements
  - use different min score options for different sources (noticed URL not match meghan highlight for hello mag even tho query got megan and did match and highlight queen in url)
  - get snippets earlier (before rendering in lib server) and use to add to signal
  - if we have multiple query terms (multiple determined by some form of tokenization) then try to show all terms present in the snippet. even tho one term may be higher scoring. Should we do multiple passes of ukkonen distance one for whole query and one for each term? This will be easier / faster with trigrams I guess. Basically we want snippet to be a relevant summary that provides signal.
  - Another way to improve snippet highlight is to 'revert back' the highlighted text, and calculate their match/ukkonen on the query term. So e.g. if we get q:'israle beverly', hl:['beverly', 'beverly'], it's good overlap, but if we get hl:['is it really'] even tho that might score ok for israle, it's not a good match. so can we 'score that back' if we go match('is it really', 'israel') and see it is low, so we exclude it?
  - try an exact match on the query term if possible for highlight. first one.
  - we could also add signal from the highlighting to just in time alter the order (e.g. 'hell wiki' search brings google search to top rank, but the Hell wikipedia page has more highlight visible)
  - Create instant search (or at least instant queries (so search over previous queries -- not results necessarily))
  - an error in Full text search can corrupt the index and make it unrecoverable...we need to guard against this
    - this is still happening. sometimes the index is not saved, even on a normal error free restart. unknown why. 
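The two notes above about contentSignature and UpdatedKeys amount to dirty tracking: only reindex a document whose signature changed, and only write the index to disk when something is dirty. A minimal sketch (names hypothetical):

```javascript
const signatures = new Map();   // url -> last indexed content signature
const updatedKeys = new Set();  // urls whose index entries changed since last flush

// returns true if the caller should reindex this document
function noteDocument(url, signature) {
  if (signatures.get(url) === signature) return false; // unchanged: skip reindex
  signatures.set(url, signature);
  updatedKeys.add(url);
  return true;
}

// writeIndex receives the dirty keys; returns true if a write happened
function flushIfDirty(writeIndex) {
  if (updatedKeys.size === 0) return false; // nothing changed: skip the disk write
  writeIndex([...updatedKeys]);
  updatedKeys.clear();
  return true;
}
```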


================================================
FILE: eslint.config.js
================================================
import globals from "globals";
import pluginJs from "@eslint/js";


export default [
  {languageOptions: { globals: globals.browser }},
  pluginJs.configs.recommended,
];

================================================
FILE: exec.js
================================================
import path from 'path';
import {execSync} from 'child_process';

const runPath = path.resolve(process.argv[2]);
execSync(`"${runPath}"`,{stdio:'inherit'});


================================================
FILE: global-run.cjs
================================================
#!/usr/bin/env node

const os = require('os');
const { spawn, spawnSync } = require('child_process');
const fs = require('fs');
const path = require('path');

if (!fs.existsSync(path.join(process.cwd(), 'node_modules'))) {
  // Install synchronously so dependencies exist before the app is spawned below
  spawnSync('npm', ['i'], { stdio: 'inherit' });
}

// Getting the total system memory
const totalMemory = os.totalmem();

// Allocating 80% of the total memory
const memoryAllocation = Math.floor((totalMemory / (1024 * 1024)) * 0.8); // Convert bytes to MB and take 80% of it

console.log(`Index can use up to: ${memoryAllocation}MB RAM`);

// Running the application
spawn('node', [`--max-old-space-size=${memoryAllocation}`, path.resolve(__dirname, 'build', 'global', 'downloadnet.cjs')], { stdio: 'inherit' });



================================================
FILE: package.json
================================================
{
  "name": "downloadnet",
  "version": "4.5.2",
  "type": "module",
  "description": "Library server and an archivist browser controller.",
  "main": "global-run.cjs",
  "module": "build/esm/downloadnet.mjs",
  "bin": {
    "dn": "global-run.cjs"
  },
  "scripts": {
    "start": "node --max-old-space-size=4096 src/app.js",
    "build": "node exec.js \"./scripts/build_only.sh\"",
    "parcel": "node exec.js \"./scripts/parcel.sh\"",
    "clean": "node exec.js \"./scripts/clean.sh\"",
    "test": "node --watch src/app.js",
    "inspect": "node --inspect-brk=127.0.0.1:9999 src/app.js",
    "save": "node src/app.js DownloadNet save",
    "serve": "node src/app.js DownloadNet serve",
    "lint": "watch -n 5 npx eslint .",
    "test-hl": "node src/highlighter.js",
    "prepublishOnly": "npm run build"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/dosyago/DownloadNet.git"
  },
  "keywords": [
    "archivist",
    "library"
  ],
  "author": "@dosy",
  "bugs": {
    "url": "https://github.com/dosyago/DownloadNet/issues"
  },
  "homepage": "https://github.com/dosyago/DownloadNet#readme",
  "dependencies": {
    "@667/ps-list": "latest",
    "@dosyago/rainsum": "latest",
    "chalk": "latest",
    "chrome-launcher": "latest",
    "express": "latest",
    "flexsearch": "latest",
    "fz-search": "latest",
    "inquirer": "latest",
    "natural": "latest",
    "ndx": "^1.0.2",
    "ndx-query": "^1.0.1",
    "ndx-serializable": "^1.0.0",
    "ukkonen": "latest",
    "ws": "latest"
  },
  "devDependencies": {
    "@eslint/js": "latest",
    "esbuild": "latest",
    "eslint": "latest",
    "globals": "latest",
    "postject": "latest"
  }
}


================================================
FILE: public/find_cleaned_duplicates.mjs
================================================
#!/usr/bin/env node

import fs from 'node:fs';
import path from 'node:path';
import child_process from 'node:child_process';

import {
  loadPref,
  cache_file,
  index_file,
} from '../src/args.js';

const CLEAN = true;
const CONCURRENT = 7;
const sleep = ms => new Promise(res => setTimeout(res, ms));
const problems = new Map();
let cleaning = false;
let made = false;

process.on('exit', cleanup);
process.on('SIGINT', cleanup);
process.on('SIGTERM', cleanup);
process.on('SIGHUP', cleanup);
process.on('SIGUSR2', cleanup);
process.on('beforeExit', cleanup);

console.log({Pref:loadPref(), cache_file: cache_file(), index_file: index_file()});
make();

async function make() {
  const indexFile = fs.readFileSync(index_file()).toString();
  JSON.parse(indexFile).map(([key, value]) => {
    if ( typeof key === "number" ) return;
    if ( key.startsWith('ndx') ) return;
    if ( value.title === undefined ) {
      console.log('no title property', {key, value});
      return;
    }
    const url = key;
    const title = value.title.toLocaleLowerCase();
    if ( title.length === 0 || title.includes('404') || title.includes('not found') ) {
      if ( problems.has(url) ) {
        console.log('Found duplicate', url, title, problems.get(url));
      }
      const prob = {title, dupes:[], dupe:false};
      problems.set(url, prob);
      const cleaned1 = clean(url);
      if ( problems.has(cleaned1) ) {
        console.log(`Found duplicate`, {url, title, cleaned1, dupeEntry:problems.get(cleaned1)});
        prob.dupe = true;
        prob.dupes.push(cleaned1);
        url !== cleaned1 && (problems.delete(cleaned1), prob.diff = true);
      }
      const cleaned2 = clean2(url);
      if ( problems.has(cleaned2) ) {
        console.log(`Found duplicate`, {url, title, cleaned2, dupeEntry: problems.get(cleaned2)});
        prob.dupe = true;
        prob.dupes.push(cleaned2);
        url !== cleaned2 && (problems.delete(cleaned2), prob.diff = true);
      }
    }
  });

  made = true;

  cleanup();
}

function cleanup() {
  if ( cleaning ) return;
  if ( ! made ) return;
  cleaning = true;
  console.log('cleanup running');
  const outData = [...problems.entries()].filter(([key, {dupe}]) => dupe);
  outData.sort(([a], [b]) => a.localeCompare(b));
  fs.writeFileSync(
    path.resolve('.', 'url-cleaned-dupes.json'), 
    JSON.stringify(outData, null, 2)
  );
  const {size:bytesWritten} = fs.statSync(
    path.resolve('.', 'url-cleaned-dupes.json'), 
    {bigint: true}
  );
  console.log(`Wrote ${outData.length} dupe urls in ${bytesWritten} bytes.`);
  process.exit(0);
}

function clean(urlString) {
  const url = new URL(urlString);
  if ( url.hash.startsWith('#!') || url.hostname.includes('google.com') || url.hostname.includes('80s.nyc') ) {
  } else {
    url.hash = '';
  }
  for ( const key of [...url.searchParams.keys()] ) {
    if ( key.startsWith('utm_') ) {
      url.searchParams.delete(key);
    }
  }
  url.pathname = url.pathname.replace(/\/$/, '');
  url.protocol = 'https:';
  url.pathname = url.pathname.replace(/(\.htm.?|\.php|\.asp.?)$/, '');
  if ( url.hostname.startsWith('www.') ) {
    url.hostname = url.hostname.replace(/^www\./, '');
  }
  const key = url.toString();
  return key;
}

function clean2(urlString) {
  const url = new URL(urlString);
  url.pathname = ''; 
  return url.toString();
}

function curlCommand(url) {
  return `curl -k -L -s -o /dev/null -w '%{url_effective}' ${JSON.stringify(url)} \
    -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
    -H 'Accept-Language: en,en-US;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7,zh;q=0.6,ja;q=0.5' \
    -H 'Cache-Control: no-cache' \
    -H 'Connection: keep-alive' \
    -H 'DNT: 1' \
    -H 'Pragma: no-cache' \
    -H 'Sec-Fetch-Dest: document' \
    -H 'Sec-Fetch-Mode: navigate' \
    -H 'Sec-Fetch-Site: none' \
    -H 'Sec-Fetch-User: ?1' \
    -H 'Upgrade-Insecure-Requests: 1' \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36' \
    -H 'sec-ch-ua: "Chromium";v="104", " Not A;Brand";v="99", "Google Chrome";v="104"' \
    -H 'sec-ch-ua-mobile: ?0' \
    -H 'sec-ch-ua-platform: "macOS"' \
    --compressed ;
  `;
}


================================================
FILE: public/find_crawlable.mjs
================================================
#!/usr/bin/env node

import fs from 'node:fs';
import path from 'node:path';
import child_process from 'node:child_process';

const CLEAN = false;
const CONCURRENT = 7;
const sleep = ms => new Promise(res => setTimeout(res, ms));
const entries = [];
let cleaning = false;

process.on('exit', cleanup);
process.on('SIGINT', cleanup);
process.on('SIGTERM', cleanup);
process.on('SIGHUP', cleanup);
process.on('SIGUSR2', cleanup);
process.on('beforeExit', cleanup);

make();

async function make() {
  const titlesFile = fs.readFileSync(path.resolve('.', 'topTitles.json')).toString();
  const titles = new Map(JSON.parse(titlesFile).map(([url, title]) => [url, {url,title}]));
  titles.forEach(({url,title}) => {
    if ( title.length === 0 && url.startsWith('https:') && !url.endsWith('.pdf') ) {
      entries.push(url);
    }
  });

  cleanup();
}

function cleanup() {
  if ( cleaning ) return;
  cleaning = true;
  console.log('cleanup running');
  fs.writeFileSync(
    path.resolve('.', 'recrawl-https-3.json'), 
    JSON.stringify(entries, null, 2)
  );
  console.log(`Wrote recrawlable urls`);
  process.exit(0);
}

function clean(urlString) {
  const url = new URL(urlString);
  if ( url.hash.startsWith('#!') || url.hostname.includes('google.com') || url.hostname.includes('80s.nyc') ) {
  } else {
    url.hash = '';
  }
  for ( const key of [...url.searchParams.keys()] ) {
    if ( key.startsWith('utm_') ) {
      url.searchParams.delete(key);
    }
  }
  url.pathname = url.pathname.replace(/\/$/, '');
  url.protocol = 'https:';
  url.pathname = url.pathname.replace(/(\.htm.?|\.php)$/, '');
  if ( url.hostname.startsWith('www.') ) {
    url.hostname = url.hostname.replace(/^www\./, '');
  }
  const key = url.toString();
  return key;
}

function clean2(urlString) {
  const url = new URL(urlString);
  url.pathname = ''; 
  return url.toString();
}

function curlCommand(url) {
  return `curl -k -L -s -o /dev/null -w '%{url_effective}' ${JSON.stringify(url)} \
    -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
    -H 'Accept-Language: en,en-US;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7,zh;q=0.6,ja;q=0.5' \
    -H 'Cache-Control: no-cache' \
    -H 'Connection: keep-alive' \
    -H 'DNT: 1' \
    -H 'Pragma: no-cache' \
    -H 'Sec-Fetch-Dest: document' \
    -H 'Sec-Fetch-Mode: navigate' \
    -H 'Sec-Fetch-Site: none' \
    -H 'Sec-Fetch-User: ?1' \
    -H 'Upgrade-Insecure-Requests: 1' \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36' \
    -H 'sec-ch-ua: "Chromium";v="104", " Not A;Brand";v="99", "Google Chrome";v="104"' \
    -H 'sec-ch-ua-mobile: ?0' \
    -H 'sec-ch-ua-platform: "macOS"' \
    --compressed ;
  `;
}


================================================
FILE: public/injection.js
================================================
import {DEBUG as debug} from '../src/common.js';

const DEBUG = debug || false;

export function getInjection({sessionId}) {
  // Notes:
    // say() function
      // why aliased? Resistant to page overwriting
      // just a precaution as we are already in an isolated world here, but this makes
      // this script more portable if it were introduced globally as well as robust 
      // against API or behaviour changes of the browser or its remote debugging protocol
      // in future
  return `
    {
      const X = 1;
      const DEBUG = ${JSON.stringify(DEBUG, null, 2)};
      const MIN_CHECK_TEXT = 3000;  // min time between checking documentElement.innerText
      const MIN_NOTIFY = 5000;      // min time between telling controller text maybe changed
      const MAX_NOTIFICATIONS = 13; // max times we will tell controller text maybe changed
      const OBSERVER_OPTS = {
        subtree: true,
        childList: true,
        characterData: true
      };
      const Top = globalThis.top;
      let lastInnerText;

      if ( Top === globalThis ) {
        const ConsoleInfo = console.info.bind(console);
        const JSONStringify = JSON.stringify.bind(JSON);
        const TITLE_CHANGES = 10;
        const INITIAL_CHECK_TIME = 500;
        const TIME_MULTIPLIER = Math.E;
        const sessionId = "${sessionId}";
        const sleep = ms => new Promise(res => setTimeout(res, ms));
        const handler = throttle(handleFrameMessage, MIN_NOTIFY);
        let count = 0;

        installTop();

        async function installTop() {
          console.log("Installing in top frame...");
          self.startUrl = location.href;
          say({install: { sessionId, startUrl }});
          await sleep(1000);
          beginTitleChecks();
          // start monitoring text changes from 30 seconds after load
          setTimeout(() => beginTextNotifications(), 30000);
          console.log("Installed.");
        }

        function beginTitleChecks() {
          let lastTitle = null;
          let checker;
          let timeToNextCheck = INITIAL_CHECK_TIME;
          let changesLogged = 0;

          check();
          console.log('Begun logging title changes.');

          function check() {
            clearTimeout(checker);
            const currentTitle = document.title; 
            if ( lastTitle !== currentTitle ) {
              say({titleChange: {lastTitle, currentTitle, url: location.href, sessionId}});
              lastTitle = currentTitle;
              changesLogged++;
            } else {
              // increase check time if there's no change
              timeToNextCheck *= TIME_MULTIPLIER;
            }
            if ( changesLogged < TITLE_CHANGES ) {
              checker = setTimeout(check, timeToNextCheck);
            } else {
              console.log('Finished logging title changes.'); 
            }
          }
        }

        function say(thing) {
          ConsoleInfo(JSONStringify(thing));
        }

        function beginTextNotifications() {
          // listen for {textChange:true} messages
          // throttle them
          // on leading throttle edge send message to controller with 
          // console.info(JSON.stringify({textChange:...}));
          self.addEventListener('message', messageParser);

          console.log('Begun notifying of text changes.');

          function messageParser({data, origin}) {
            let source;
            try {
              ({source} = data.frameTextChangeNotification);
              if ( count > MAX_NOTIFICATIONS ) {
                self.removeEventListener('message', messageParser);
                return;
              }
              count++;
              handler({textChange:{source}});
            } catch(e) {
              DEBUG.verboseSlow && console.warn('could not parse message', data, e);
            }
          }
        }

        function handleFrameMessage({textChange}) {
          const {source} = textChange;
          console.log('Telling controller that text changed');
          say({textChange:{source, sessionId, count}});
        }
      } 

      beginTextMutationChecks();

      function beginTextMutationChecks() {
        // create mutation observer for text
        // throttle output

        const observer = new MutationObserver(throttle(check, MIN_CHECK_TEXT));
        observer.observe(document.documentElement || document, OBSERVER_OPTS);

        console.log('Begun observing text changes.');
        
        function check() {
          console.log('check');
          const textMutated = document.documentElement.innerText !== lastInnerText;
          if ( textMutated ) {
            DEBUG.verboseSlow && console.log('Text changed');
            lastInnerText = document.documentElement.innerText;
            Top.postMessage({frameTextChangeNotification:{source:location.href}}, '*');
          }
        }
      }

      // javascript throttle function
        // source: https://stackoverflow.com/a/59378445 
        /*
        function throttle(func, timeFrame) {
          var lastTime = 0;
          return function (...args) {
            var now = new Date();
            if (now - lastTime >= timeFrame) {
              func.apply(this, args);
              lastTime = now;
            }
          };
        }
        */

      // alternate throttle function with trailing edge call
        // source: https://stackoverflow.com/a/27078401
        ///*
        // Notes
          // Returns a function, that, when invoked, will only be triggered at most once
          // during a given window of time. Normally, the throttled function will run
          // as much as it can, without ever going more than once per \`wait\` duration;
          // but if you'd like to disable the execution on the leading edge, pass
          // \`{leading: false}\`. To disable execution on the trailing edge, ditto.
        function throttle(func, wait, options) {
          var context, args, result;
          var timeout = null;
          var previous = 0;
          if (!options) options = {};
          var later = function() {
            previous = options.leading === false ? 0 : Date.now();
            timeout = null;
            result = func.apply(context, args);
            if (!timeout) context = args = null;
          };
          return function() {
            var now = Date.now();
            if (!previous && options.leading === false) previous = now;
            var remaining = wait - (now - previous);
            context = this;
            args = arguments;
            if (remaining <= 0 || remaining > wait) {
              if (timeout) {
                clearTimeout(timeout);
                timeout = null;
              }
              previous = now;
              result = func.apply(context, args);
              if (!timeout) context = args = null;
            } else if (!timeout && options.trailing !== false) {
              timeout = setTimeout(later, remaining);
            }
            return result;
          };
        }
        //*/
    }
  `;
}


================================================
FILE: public/library/README.md
================================================
# ALT Default storage directory for library

Remove `public/library/http*` and `public/library/cache.json` from `.gitignore` if you forked this repo and want to commit your library using git.

## Clearing your cache

To clear everything, delete all directories that start with `http` or `https`, and delete `cache.json`.

To clear only content from domains you don't want, delete just the directories you don't want that start with `http` or `https`, and do NOT delete `cache.json`.



================================================
FILE: public/make_top.mjs
================================================
#!/usr/bin/env node

import fs from 'node:fs';
import path from 'node:path';
import child_process from 'node:child_process';

const CLEAN = false;
const CONCURRENT = 7;
const sleep = ms => new Promise(res => setTimeout(res, ms));
const entries = [];
const counted = new Set();
const errors = new Map();
let counts;
let cleaning = false;

process.on('exit', cleanup);
process.on('SIGINT', cleanup);
process.on('SIGTERM', cleanup);
process.on('SIGHUP', cleanup);
process.on('SIGUSR2', cleanup);
process.on('beforeExit', cleanup);

make();

async function make() {
  const titlesFile = fs.readFileSync(path.resolve('.', 'topTitles.json')).toString();
  const titles = new Map(JSON.parse(titlesFile).map(([url, title]) => [url, {url,title}]));
  if ( CLEAN ) {
    for ( const [url, obj] of titles ) {
      const k1 = clean(url);
      const k2 = clean2(url);
      if ( !titles.has(k1) ) {
        titles.set(k1, obj);
      }
      if ( !titles.has(k2) ) {
        titles.set(k2, obj);
      }
    }
  }
  const remainingFile =  fs.readFileSync(path.resolve('.', 'remainingFile.json')).toString();
  const remainingSet = new Set(JSON.parse(remainingFile));
  const countsFile = fs.readFileSync(path.resolve('.', 'ran-counts.json')).toString();
  counts = new Map(JSON.parse(countsFile).filter(([url, count]) => remainingSet.has(url)));
  let current = 0;
  for ( const [url, count] of counts ) {
    let title;
    let realUrl;
    if ( titles.has(url) ) {
      ({title} = titles.get(url));
      entries.push({
        url, 
        title, 
        count,
      });
      counted.add(url);
    } else {
      console.log(`Curl call for ${url} in progress...`);
      let notifyCurlComplete;
      const curlCall = new Promise(res => notifyCurlComplete = res);
      do {
        await sleep(1000);
      } while ( current >= CONCURRENT );
      child_process.exec(curlCommand(url), (err, stdout, stderr) => {
        if ( ! err && (!stderr || stderr.length == 0)) {
          realUrl = stdout; 
          if ( titles.has(realUrl) ) {
            ({title} = titles.get(realUrl));
            entries.push({
              url, 
              realUrl,
              title, 
              count,
            });
            counted.add(url);
          }
        } else {
          console.log(`Error on curl for ${url}`, {err, stderr});
          errors.set(url, {err, stderr});
        }
        console.log(`Curl call for ${url} complete!`);
        notifyCurlComplete();
      });
      current += 1;
      curlCall.then(() => current -= 1);
    }
  }
  cleanup();
}

async function make_v2() {
  const titlesFile = fs.readFileSync(path.resolve('.', 'topTitles.json')).toString();
  const titles = new Map(JSON.parse(titlesFile).map(([url, title]) => [url, {url,title}]));
  if ( CLEAN ) {
    for ( const [url, obj] of titles ) {
      const k1 = clean(url);
      const k2 = clean2(url);
      if ( !titles.has(k1) ) {
        titles.set(k1, obj);
      }
      if ( !titles.has(k2) ) {
        titles.set(k2, obj);
      }
    }
  }
  const countsFile = fs.readFileSync(path.resolve('.', 'ran-counts.json')).toString();
  counts = new Map(JSON.parse(countsFile));
  let current = 0;
  for ( const [url, count] of counts ) {
    let title;
    let realUrl;
    if ( titles.has(url) ) {
      ({title} = titles.get(url));
      entries.push({
        url, 
        title, 
        count,
      });
      counted.add(url);
    } else {
      console.log(`Curl call for ${url} in progress...`);
      let notifyCurlComplete;
      const curlCall = new Promise(res => notifyCurlComplete = res);
      do {
        await sleep(250);
      } while ( current >= CONCURRENT );
      child_process.exec(curlCommand(url), (err, stdout, stderr) => {
        if ( ! err && (!stderr || stderr.length == 0)) {
          realUrl = stdout; 
          if ( titles.has(realUrl) ) {
            ({title} = titles.get(realUrl));
            entries.push({
              url, 
              realUrl,
              title, 
              count,
            });
            counted.add(url);
          }
        } else {
          console.log(`Error on curl for ${url}`, {err, stderr});
          errors.set(url, {err, stderr});
        }
        console.log(`Curl call for ${url} complete!`);
        notifyCurlComplete();
      });
      current += 1;
      curlCall.then(() => current -= 1);
    }
  }
  cleanup();
}

function cleanup() {
  if ( cleaning ) return;
  cleaning = true;
  console.log('cleanup running');
  if ( errors.size ) {
    fs.writeFileSync(
      path.resolve('.', 'errorLinks4.json'),
      JSON.stringify([...errors.keys()], null, 2)
    );
    console.log(`Wrote errors`);
  }
  if ( counted.size !== counts.size ) {
    counted.forEach(url => counts.delete(url)); 
    fs.writeFileSync(
      path.resolve('.', 'noTitleFound4.json'),
      JSON.stringify([...counts.keys()], null, 2)
    )
    console.log(`Wrote noTitleFound`);
  }
  fs.writeFileSync(
    path.resolve('.', 'topFrontPageLinksWithCounts4.json'), 
    JSON.stringify(entries, null, 2)
  );
  console.log(`Wrote top links with counts`);
  process.exit(0);
}

async function make_v1() {
  const titlesFile = fs.readFileSync(path.resolve('.', 'topTitles.json')).toString();
  const titles = new Map(JSON.parse(titlesFile).map(([url, title]) => [clean(url), {url,title}]));
  const countsFile = fs.readFileSync(path.resolve('.', 'counts.json')).toString();
  const counts = new Map(JSON.parse(countsFile).map(([url, count]) => [clean(url), count]));
  for ( const [key, count] of counts ) {
    counts.set(clean2(key), count);
  }
  const entries = [];
  for ( const [key, {url,title}] of titles ) {
    entries.push({
      url, title, 
      count: counts.get(key) || 
        counts.get(url) || 
        counts.get(clean2(key)) || 
        console.log(`No count found for`, {key, url, title, c2key: clean2(key)})
    });
  }
  fs.writeFileSync(
    path.resolve('.', 'topFrontPageLinks.json'), 
    JSON.stringify(entries, null, 2)
  );
}

function clean(urlString) {
  const url = new URL(urlString);
  if ( url.hash.startsWith('#!') || url.hostname.includes('google.com') || url.hostname.includes('80s.nyc') ) {
  } else {
    url.hash = '';
  }
  for ( const key of [...url.searchParams.keys()] ) {
    if ( key.startsWith('utm_') ) {
      url.searchParams.delete(key);
    }
  }
  url.pathname = url.pathname.replace(/\/$/, '');
  url.protocol = 'https:';
  url.pathname = url.pathname.replace(/(\.htm.?|\.php)$/, '');
  if ( url.hostname.startsWith('www.') ) {
    url.hostname = url.hostname.replace(/^www\./, '');
  }
  const key = url.toString();
  return key;
}

function clean2(urlString) {
  const url = new URL(urlString);
  url.pathname = ''; 
  return url.toString();
}

function curlCommand(url) {
  return `curl -k -L -s -o /dev/null -w '%{url_effective}' ${JSON.stringify(url)} \
    -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
    -H 'Accept-Language: en,en-US;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7,zh;q=0.6,ja;q=0.5' \
    -H 'Cache-Control: no-cache' \
    -H 'Connection: keep-alive' \
    -H 'DNT: 1' \
    -H 'Pragma: no-cache' \
    -H 'Sec-Fetch-Dest: document' \
    -H 'Sec-Fetch-Mode: navigate' \
    -H 'Sec-Fetch-Site: none' \
    -H 'Sec-Fetch-User: ?1' \
    -H 'Upgrade-Insecure-Requests: 1' \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36' \
    -H 'sec-ch-ua: "Chromium";v="104", " Not A;Brand";v="99", "Google Chrome";v="104"' \
    -H 'sec-ch-ua-mobile: ?0' \
    -H 'sec-ch-ua-platform: "macOS"' \
    --compressed ;
  `;
}
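
For orientation, here is a condensed, standalone sketch of the URL canonicalization that `clean()` performs above. The function name `normalize` and the sample URLs are illustrative, not from the repo:

```javascript
// Sketch of the canonicalization used to key titles/counts maps:
// drop hashes (except hash-bang routers and special hosts), strip utm_*
// params, force https, trim trailing slash and .htm*/.php, drop www.
function normalize(urlString) {
  const url = new URL(urlString);
  if (!(url.hash.startsWith('#!') || url.hostname.includes('google.com'))) {
    url.hash = '';
  }
  // Collect keys first: deleting while iterating searchParams can skip entries.
  const tracking = [...url.searchParams.keys()].filter(k => k.startsWith('utm_'));
  for (const k of tracking) url.searchParams.delete(k);
  url.protocol = 'https:';
  url.pathname = url.pathname.replace(/\/$/, '').replace(/(\.htm.?|\.php)$/, '');
  url.hostname = url.hostname.replace(/^www\./, '');
  return url.toString();
}

console.log(normalize('http://www.example.com/page.html?utm_source=x&q=1'));
// → https://example.com/page?q=1
```

Canonicalizing both sides before the `Map` lookups is what lets `titles` and `counts` entries written from slightly different URL spellings still collide onto one key.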


================================================
FILE: public/old-index.html
================================================
<!DOCTYPE html>
<meta charset=utf-8>
<title>Your Personal Search Engine and Archive</title>
<link rel=stylesheet href=style.css>
<header>
  <h1><a href=/>DownloadNet</a> &mdash; Personal Web Search and Archive</h1>
</header>
<p>
  View <a href=/archive_index.html>your index</a>
</p>
<!--
<form method=POST action=/crawl>
  <fieldset>
    <legend>Crawl and Index</legend>
    <p>
      Crawl and index a list of links. 
      <br>
      <small>This will open 1 link at a time, and index it when it has loaded.</small>
    <p>
      <label>
        Links
        <br>
        <textarea class=long name=links>
          https://cnn.com
          https://bloomberg.com
          https://microsoft.com
          https://dosyago.com
          https://intel.com
        </textarea>
        <br>
        <small>List format is 1 link per line.</small>
      </label>
    </p>
    <details open>
      <summary>Advanced settings</summary>
      <p>
        <label>
          Timeout
          <br>
          <input required name=timeout
            type=number min=1 max=300 value=3.6 step=0.1> <span class=units>seconds</span>
          <br>
          <small>Seconds to wait for each page to load before indexing.</small>
        </label>
      <p>
      <label>
        Depth
        <br>
        <input required name=depth 
          type=number min=1 max=20 value=1 step=1> <span class=units>clicks</span>
      </label>
      <br>
      <section class=small>
        <strong>Value guide</strong>
        <ol>
          <li>Only each link.
          <li>Plus anything 1 click from the link.
          <li>Plus anything 2 clicks from the link.
        </ol>
        <em>And so on&hellip;</em>
      </section>
      <p>
        <label>
          Min Page Crawl Time
          <br>
          <input name=minPageCrawlTime
            type=number min=1 max=60 value=20> <span class=units>seconds</span>
          <br>
          <small>Minimum seconds to spend on each page before moving on.</small>
        </label>
      <p>
      <p>
        <label>
          Max Page Crawl Time
          <br>
          <input name=maxPageCrawlTime
            type=number min=3 max=120 value=30> <span class=units>seconds</span>
          <br>
          <small>Max time to allow for each page.</small>
        </label>
      <p>
      <p>
        <label>
          Batch size
          <br>
          <input name=batchSize
            type=number min=1 max=32 value=2> <span class=units>tabs</span>
          <br>
          <small>Number of concurrent tabs.</small>
        </label>
      <p>
      <p>
        <label>
          <input name=saveToFile
            type=checkbox checked>
            Save the harvested URLs to a file
        </label>
      <p>
      <p>
        <label>
          <span class=text>Program to run on every page</span>
          <br>
          <textarea class=long rows=9 name=program>
            if ( ! State.titles ) {
              State.titles = new Map();
              State.onExit.addHandler(() => {
                fs.writeFileSync(
                  path.resolve('.', `titles-${(new Date).toISOString()}.txt`), 
                  JSON.stringify([...State.titles.entries()], null, 2) + '\n'
                );
              });
            }
            const {result:{value:data}} = await send("Runtime.evaluate", 
              {
                expression: `(function () {
                  return {
                    url: document.location.href,
                    title: document.title,
                  };
                }())`,
                returnByValue: true
              }, 
              sessionId
            );
            State.titles.set(data.url, data.title);
            console.log(`Saved ${State.titles.size} titles`);
          </textarea>
        </label>
      </p>
    </details>
    <p>
      <button>Crawl</button>
      <script>
        {
          const button = document.currentScript.previousElementSibling;
          let disabled = false;
          button.addEventListener('click', click => {
            if ( disabled ) return click.preventDefault(); 
            disabled = true;
            setTimeout(() => button.disabled = true, 0);
          });
        }
      </script>
  </fieldset>
</form>
-->
<form method=GET action=/search>
  <fieldset class=search>
    <legend>Search your archive</legend>
    <input autofocus class=search type=search name=query placeholder="search your library">
    <button>Search</button>
  </fieldset>
</form>
<form method=POST action=/mode>
  <fieldset>
    <legend>Save or Serve: Mode Control</legend>
    <p>
      Control whether pages you browse are <label class=cmd for=save>saved to</label>, or 
      <label class=cmd for=serve>served from</label> your archive
      <br>
      <small><em class=caps>Pro-Tip:</em> Serve pages when you're offline, and it will still feel like you're online</small>
    <p>
      <label>
        <input type=radio name=mode value=save id=save>
        Save
      </label>
      <label>
        <input type=radio name=mode value=serve id=serve>
        Serve
      </label>
      <label>
        <input type=radio name=mode value=select id=select>
        Select (<em>Bookmark mode</em>)
      </label>
      <output name=notification>
    <p>
      <button>Change mode</button>
    <script>
      {
        const form = document.currentScript.closest('form');
        form.notification.value = "Getting current mode...";
        setTimeout(showCurrentMode, 300);

        async function showCurrentMode() {
          const mode = await fetch('/mode').then(r => r.text());
          console.log({mode});
          if ( ! mode ) {
            setTimeout(showCurrentMode, 300);
            return;
          }
          form.notification.value = "";
          form.querySelector(`[name="mode"][value="${mode}"]`).checked = true;
        }
      }
    </script>
  </fieldset>
</form>
<form method=POST action=/base_path>
  <fieldset>
    <legend id=new_base_path>File system path of archive</legend>
    <p>
      Set the path to where your archive folder will go
      <br>
      <small>The default is your home directory</small>
    <p>
      <label>
        Base path
        <input class=long type=text name=base_path placeholder="A folder path...">
      </label>
    <p>
      <button>Change base path</button>
    <script>
      {
        const form = document.currentScript.closest('form');
        showCurrentLibraryPath();

        form.base_path.onchange = e => {
          self.target = e.target;
        }
        async function showCurrentLibraryPath() {
          const base_path = await fetch('/base_path').then(r => r.text());
          form.querySelector(`[name="base_path"]`).value = base_path;
        }
      }
    </script>
  </fieldset>
</form>
<form disabled method=POST action=/publish>
  <fieldset>
    <legend>Publish your archive</legend>
    <p>
      Publish a search engine from your archive 
      <br>
      <small>This will generate a server.zip file that you can unzip and run</small>
    <p>
      <button disabled>Publish</button>
  </fieldset>
</form>
<p>
  Notice a bug? <a href=https://github.com/dosyago/DownloadNet/issues>Open an issue!</a>
</p>
<footer>
  <cite>
    <a rel=author href=https://github.com/dosyago/DownloadNet>DownloadNet GitHub</a>
  </cite>
</footer>
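
The mode form's `showCurrentMode()` above retries `/mode` every 300 ms until the server returns a non-empty answer. The same retry-until-nonempty pattern, generalized into a helper (the helper name and the fake fetcher are illustrative, not from the repo):

```javascript
// Retry an async fetcher until it yields a truthy value or we give up.
async function pollUntil(fetchValue, { interval = 300, maxTries = 10 } = {}) {
  for (let i = 0; i < maxTries; i++) {
    const value = await fetchValue();
    if (value) return value;                      // got a non-empty answer
    await new Promise(res => setTimeout(res, interval));
  }
  throw new Error('gave up polling');
}

// Simulate a server that only answers on the third request.
let calls = 0;
const fakeFetchMode = async () => (++calls < 3 ? '' : 'save');

pollUntil(fakeFetchMode, { interval: 1 }).then(mode => console.log(mode)); // prints "save"
```

In the page itself the fetcher would be `() => fetch('/mode').then(r => r.text())`; bounding the retries (unlike the page's unbounded `setTimeout` loop) avoids polling forever if the server never comes up.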


================================================
FILE: public/problem_find.mjs
================================================
#!/usr/bin/env node

import fs from 'node:fs';
import path from 'node:path';
import child_process from 'node:child_process';

import {
  loadPref,
  cache_file,
  index_file,
} from '../src/args.js';

const CLEAN = false;
const CONCURRENT = 7;
const sleep = ms => new Promise(res => setTimeout(res, ms));
const problems = new Map();
let cleaning = false;
let made = false;

process.on('exit', cleanup);
process.on('SIGINT', cleanup);
process.on('SIGTERM', cleanup);
process.on('SIGHUP', cleanup);
process.on('SIGUSR2', cleanup);
process.on('beforeExit', cleanup);

console.log({Pref:loadPref(), cache_file: cache_file(), index_file: index_file()});
make();

async function make() {
  const indexFile = fs.readFileSync(index_file()).toString();
  JSON.parse(indexFile).map(([key, value]) => {
    if ( typeof key === "number" ) return;
    if ( key.startsWith('ndx') ) return;
    if ( value.title === undefined ) {
      console.log('no title property', {key, value});
      return; // skip: reading value.title below would throw
    }
    const url = key;
    const title = value.title.toLocaleLowerCase();
    if ( title.length === 0 || title.includes('404') || title.includes('not found') ) {
      if ( problems.has(url) ) {
        console.log('Found duplicate', url, title, problems.get(url));
      }
      problems.set(url, title);
    }
  });

  made = true;

  cleanup();
}

function cleanup() {
  if ( cleaning ) return;
  if ( ! made ) return;
  cleaning = true;
  console.log('cleanup running');
  const outData = [...problems.entries()];
  fs.writeFileSync(
    path.resolve('.', 'url-problems.json'), 
    JSON.stringify(outData, null, 2)
  );
  const {size:bytesWritten} = fs.statSync(
    path.resolve('.', 'url-problems.json'), 
    {bigint: true}
  );
  console.log(`Wrote ${outData.length} problem urls in ${bytesWritten} bytes.`);
  process.exit(0);
}

function clean(urlString) {
  const url = new URL(urlString);
  if ( url.hash.startsWith('#!') || url.hostname.includes('google.com') || url.hostname.includes('80s.nyc') ) {
  } else {
    url.hash = '';
  }
  // Collect the keys first: deleting while iterating searchParams can skip entries.
  const trackingKeys = [...url.searchParams.keys()].filter(key => key.startsWith('utm_'));
  for ( const key of trackingKeys ) {
    url.searchParams.delete(key);
  }
  url.pathname = url.pathname.replace(/\/$/, '');
  url.protocol = 'https:';
  url.pathname = url.pathname.replace(/(\.htm.?|\.php)$/, '');
  if ( url.hostname.startsWith('www.') ) {
    url.hostname = url.hostname.replace(/^www\./, '');
  }
  const key = url.toString();
  return key;
}

function clean2(urlString) {
  const url = new URL(urlString);
  url.pathname = ''; 
  return url.toString();
}

function curlCommand(url) {
  return `curl -k -L -s -o /dev/null -w '%{url_effective}' ${JSON.stringify(url)} \
    -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
    -H 'Accept-Language: en,en-US;q=0.9,zh-TW;q=0.8,zh-CN;q=0.7,zh;q=0.6,ja;q=0.5' \
    -H 'Cache-Control: no-cache' \
    -H 'Connection: keep-alive' \
    -H 'DNT: 1' \
    -H 'Pragma: no-cache' \
    -H 'Sec-Fetch-Dest: document' \
    -H 'Sec-Fetch-Mode: navigate' \
    -H 'Sec-Fetch-Site: none' \
    -H 'Sec-Fetch-User: ?1' \
    -H 'Upgrade-Insecure-Requests: 1' \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36' \
    -H 'sec-ch-ua: "Chromium";v="104", " Not A;Brand";v="99", "Google Chrome";v="104"' \
    -H 'sec-ch-ua-mobile: ?0' \
    -H 'sec-ch-ua-platform: "macOS"' \
    --compressed ;
  `;
}
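
The filter inside `make()` above flags a page as a "problem" when its title is empty, contains `404`, or contains `not found`. That heuristic, extracted into a standalone predicate (the function name and sample titles are illustrative, not from the repo):

```javascript
// Heuristic from problem_find.mjs: empty or error-looking titles
// usually mean the archived page is a 404 / missing document.
function isProblemTitle(title) {
  const t = (title ?? '').toLocaleLowerCase();
  return t.length === 0 || t.includes('404') || t.includes('not found');
}

console.log(isProblemTitle('404 Not Found'));   // true
console.log(isProblemTitle(''));                // true
console.log(isProblemTitle('My Great Post'));   // false
```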


================================================
FILE: public/redirector.html
================================================
<!DOCTYPE html>
<meta name="referrer" content="no-referrer" />
<h1>About to archive and index <code id=url-text></code></h1>
<script type=module>
  const url = new URLSearchParams(location.search).get('url');
  const text = document.querySelector('#url-text');
  let valid = false;
  try {
    new URL(url);
    valid = true;
  } catch(e) {
    console.warn(`URL ${url} is not a valid URL`);
  }

  if ( valid ) {
    text.innerText = url;
    setTimeout(() => {
      window.location.href = url;
    }, 1000);
  }
</script>
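
redirector.html validates the `url` query parameter by constructing a `URL` and treating a throw as "invalid". As a standalone helper (the name is illustrative, not from the repo):

```javascript
// new URL() throws a TypeError for strings that are not absolute,
// parseable URLs, so try/catch doubles as a validity check.
function isValidUrl(candidate) {
  try {
    new URL(candidate);
    return true;
  } catch {
    return false;
  }
}

console.log(isValidUrl('https://example.com'));  // true
console.log(isValidUrl('not a url'));            // false
```

Note that relative paths like `/foo` also fail this check unless a base URL is passed as the second argument to `new URL()`.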


================================================
FILE: public/style.css
================================================
/* public/style.css */

/* 1. Modern CSS Reset (Simplified) */
*, *::before, *::after {
  box-sizing: border-box;
  margin: 0;
  padding: 0;
}

html {
  -webkit-text-size-adjust: 100%;
  tab-size: 4;
  font-family: system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
  line-height: 1.5;
}

body {
  min-height: 100vh;
  display: flex;
  flex-direction: column;
}

img, picture, video, canvas, svg {
  display: block;
  max-width: 100%;
}

input, button, textarea, select {
  font: inherit;
}

button {
  cursor: pointer;
}

a {
  text-decoration: none;
  color: inherit;
}

ul, ol {
  list-style: none;
}

/* 2. CSS Custom Properties (Variables) & Theming */
:root {
  /* Light Mode (Default) */
  --font-primary: system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
  --font-monospace: 'SFMono-Regular', Consolas, 'Liberation Mono', Menlo, Courier, monospace;

  --color-text: #222;
  --color-text-muted: #555;
  --color-background: #f8f9fa;
  --color-surface: #ffffff;
  --color-primary: #007bff;
  --color-primary-hover: #0056b3;
  --color-secondary: #6c757d;
  --color-border: #dee2e6;
  --color-accent: #17a2b8;
  --color-success: #28a745;
  --color-danger: #dc3545;
  --color-warning: #ffc107;
  --color-highlight-bg: #ffe082; /* For search term highlighting */

  --spacing-xs: 0.25rem;
  --spacing-sm: 0.5rem;
  --spacing-md: 1rem;
  --spacing-lg: 1.5rem;
  --spacing-xl: 2rem;

  --border-radius: 0.375rem;
  --shadow-sm: 0 1px 2px 0 rgba(0, 0, 0, 0.05);
  --shadow-md: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
}

@media (prefers-color-scheme: dark) {
  :root {
    --color-text: #e9ecef;
    --color-text-muted: #adb5bd;
    --color-background: #121212; /* Slightly off-black for depth */
    --color-surface: #1e1e1e;   /* For cards, modals, etc. */
    --color-primary: #0d6efd;
    --color-primary-hover: #0b5ed7;
    --color-secondary: #495057;
    --color-border: #343a40;
    --color-accent: #20c997;
    --color-success: #198754;
    --color-danger: #dc3545;
    --color-warning: #ffca2c;
    --color-highlight-bg: #4a3c00; /* Darker highlight for dark mode */
  }
}

/* 3. Base & Layout Styles */
body {
  font-family: var(--font-primary);
  background-color: var(--color-background);
  color: var(--color-text);
  display: flex;
  flex-direction: column;
  min-height: 100vh;
}

.container {
  width: 90%;
  max-width: 1000px;
  margin: 0 auto;
  padding: var(--spacing-lg) var(--spacing-md);
  flex-grow: 1;
  display: flex;
  flex-direction: column;
}

.site-header {
  padding-bottom: var(--spacing-md);
  margin-bottom: var(--spacing-lg);
  border-bottom: 1px solid var(--color-border);
  display: flex;
  justify-content: space-between;
  align-items: center;
  flex-wrap: wrap; /* Allow wrapping on small screens */
}

.site-header h1 {
  font-size: 1.75rem;
  font-weight: 600;
  margin: 0;
}
.site-header h1 a {
  color: var(--color-primary);
  transition: color 0.2s ease-in-out;
}
.site-header h1 a:hover {
  color: var(--color-primary-hover);
}

.main-nav ul {
  display: flex;
  gap: var(--spacing-md);
}
.main-nav a {
  color: var(--color-text-muted);
  font-weight: 500;
  transition: color 0.2s ease-in-out;
}
.main-nav a:hover, .main-nav a.active {
  color: var(--color-primary);
}

main {
  flex-grow: 1;
}

.page-title {
  font-size: 1.5rem;
  margin-bottom: var(--spacing-lg);
  color: var(--color-text);
}

.site-footer {
  text-align: center;
  padding: var(--spacing-md);
  margin-top: var(--spacing-xl);
  border-top: 1px solid var(--color-border);
  font-size: 0.9rem;
  color: var(--color-text-muted);
}

/* 4. Form Elements */
form {
  background-color: var(--color-surface);
  padding: var(--spacing-lg);
  border-radius: var(--border-radius);
  box-shadow: var(--shadow-sm);
  margin-bottom: var(--spacing-lg);
}

fieldset {
  border: none;
  padding: 0;
  margin: 0;
}

legend {
  font-size: 1.2rem;
  font-weight: 600;
  margin-bottom: var(--spacing-md);
  color: var(--color-text);
  padding: 0; /* Resetting some browser defaults */
  display: block; /* Ensure it takes full width if needed */
  width: 100%;
}

.form-group {
  margin-bottom: var(--spacing-md);
}

.form-group label {
  display: block;
  margin-bottom: var(--spacing-sm);
  font-weight: 500;
  color: var(--color-text-muted);
}

.form-group label small {
  font-weight: normal;
  font-size: 0.85em;
  display: block;
}

input[type="text"],
input[type="search"],
input[type="url"],
input[type="number"],
input[type="email"],
textarea,
select {
  width: 100%;
  padding: var(--spacing-sm) var(--spacing-md);
  border: 1px solid var(--color-border);
  border-radius: var(--border-radius);
  background-color: var(--color-background); /* Slightly different from surface for depth */
  color: var(--color-text);
  transition: border-color 0.2s ease-in-out, box-shadow 0.2s ease-in-out;
}

input[type="text"]:focus,
input[type="search"]:focus,
input[type="url"]:focus,
input[type="number"]:focus,
input[type="email"]:focus,
textarea:focus,
select:focus {
  outline: none;
  border-color: var(--color-primary);
  box-shadow: 0 0 0 0.2rem color-mix(in srgb, var(--color-primary) 25%, transparent); /* rgba(var(--hex), a) is invalid CSS */
}

textarea {
  min-height: 100px;
  resize: vertical;
}

.input-group {
  display: flex;
}
.input-group input[type="search"] {
  border-top-right-radius: 0;
  border-bottom-right-radius: 0;
  flex-grow: 1;
}
.input-group button {
  border-top-left-radius: 0;
  border-bottom-left-radius: 0;
}


button, .button {
  display: inline-block;
  padding: var(--spacing-sm) var(--spacing-lg);
  font-weight: 500;
  text-align: center;
  vertical-align: middle;
  border: 1px solid transparent;
  border-radius: var(--border-radius);
  background-color: var(--color-primary);
  color: #fff;
  transition: background-color 0.2s ease-in-out, border-color 0.2s ease-in-out;
  line-height: 1.5; /* Ensure consistent height with inputs */
}

button:hover, .button:hover {
  background-color: var(--color-primary-hover);
}

button.secondary, .button.secondary {
  background-color: var(--color-secondary);
  color: #fff;
}
button.secondary:hover, .button.secondary:hover {
  background-color: color-mix(in srgb, var(--color-secondary) 90%, black); /* darken() is Sass, not CSS */
}

button.danger, .button.danger {
  background-color: var(--color-danger);
  color: #fff;
}
button.danger:hover, .button.danger:hover {
  background-color: color-mix(in srgb, var(--color-danger) 90%, black); /* darken() is Sass, not CSS */
}

button.icon-button {
  background: none;
  border: none;
  color: var(--color-text-muted);
  padding: var(--spacing-xs);
  font-size: 1.2em; /* Adjust as needed */
  line-height: 1;
}
button.icon-button:hover {
  color: var(--color-primary);
}


/* 5. List & Item Styles (for search results, index) */
.item-list {
  margin-top: var(--spacing-lg);
}

.item-list li {
  background-color: var(--color-surface);
  padding: var(--spacing-md);
  margin-bottom: var(--spacing-md);
  border-radius: var(--border-radius);
  box-shadow: var(--shadow-sm);
  border: 1px solid var(--color-border);
}

.item-list li .item-title {
  font-size: 1.15rem;
  font-weight: 600;
  margin-bottom: var(--spacing-xs);
}
.item-list li .item-title a {
  color: var(--color-primary);
}
.item-list li .item-title a:hover {
  text-decoration: underline;
}

.item-list li .item-url {
  font-size: 0.9rem;
  color: var(--color-text-muted);
  word-break: break-all;
  margin-bottom: var(--spacing-sm);
  display: block; /* Ensure it's on its own line if needed */
}
.item-list li .item-url a {
  color: var(--color-secondary);
}
.item-list li .item-url a:hover {
  text-decoration: underline;
}


.item-list li .item-snippet {
  font-size: 0.95rem;
  line-height: 1.6;
  color: var(--color-text);
}
.item-list li .item-snippet mark { /* For highlighted search terms */
  background-color: var(--color-highlight-bg);
  color: var(--color-text); /* Ensure text is readable on highlight */
  padding: 0.1em 0.2em;
  border-radius: 0.2em;
}

.item-actions {
  margin-top: var(--spacing-sm);
  display: flex;
  gap: var(--spacing-sm);
}


/* Pagination */
.pagination {
  display: flex;
  justify-content: center;
  align-items: center;
  gap: var(--spacing-sm);
  margin-top: var(--spacing-lg);
  padding: var(--spacing-md);
}
.pagination a, .pagination span {
  padding: var(--spacing-sm) var(--spacing-md);
  border-radius: var(--border-radius);
  color: var(--color-primary);
}
.pagination a {
  border: 1px solid var(--color-primary);
}
.pagination a:hover {
  background-color: var(--color-primary);
  color: #fff;
}
.pagination span { /* Current page */
  background-color: var(--color-primary);
  color: #fff;
  font-weight: 600;
}
.pagination .disabled {
    color: var(--color-text-muted);
    pointer-events: none;
    border-color: var(--color-border);
}


/* Utilities */
.text-center {
  text-align: center;
}
.text-muted {
  color: var(--color-text-muted) !important;
}
.mb-0 { margin-bottom: 0 !important; }
.mt-0 { margin-top: 0 !important; }
.debug-info {
  font-size: 0.8rem;
  color: var(--color-accent);
  font-family: var(--font-monospace);
}

/* Specific for edit index delete button */
.delete-form {
  display: inline; /* Keep it on the same line */
}
.delete-button {
  background: none;
  border: none;
  color: var(--color-danger);
  padding: 0 var(--spacing-xs);
  font-size: 1em;
  cursor: pointer;
  margin-left: var(--spacing-sm);
}
.delete-button:hover {
  color: color-mix(in srgb, var(--color-danger) 85%, black); /* darken() is Sass, not CSS */
}
.strikethrough {
  text-decoration: line-through;
  opacity: 0.7;
}

/* Edit toggle */
.edit-toggle-section {
  display: flex;
  justify-content: flex-end;
  margin-bottom: var(--spacing-md);
}
.edit-toggle-section details {
  position: relative; /* For absolute positioning of the button */
}
.edit-toggle-section summary {
  display: inline-block;
  cursor: pointer;
  padding: var(--spacing-xs) var(--spacing-sm);
  border-radius: var(--border-radius);
  background-color: var(--color-surface);
  border: 1px solid var(--color-border);
  color: var(--color-text-muted);
}
.edit-toggle-section summary:hover {
  border-color: var(--color-primary);
  color: var(--color-primary);
}
.edit-toggle-section summary::-webkit-details-marker { /* Hide default arrow */
  display: none;
}
.edit-toggle-section summary::marker { /* Hide default arrow FF */
 display: none;
}
.edit-toggle-section .details-content {
  position: absolute;
  right: 0;
  top: calc(100% + var(--spacing-xs)); /* Position below the summary */
  background-color: var(--color-surface);
  border: 1px solid var(--color-border);
  border-radius: var(--border-radius);
  padding: var(--spacing-sm);
  box-shadow: var(--shadow-md);
  z-index: 10;
  white-space: nowrap; /* Prevent button text from wrapping */
}


/* Responsive adjustments */
@media (max-width: 768px) {
  .site-header {
    flex-direction: column;
    align-items: flex-start;
    gap: var(--spacing-sm);
  }
  .main-nav ul {
    flex-direction: column;
    gap: var(--spacing-xs);
  }
  .input-group {
    flex-direction: column;
  }
  .input-group input[type="search"], .input-group button {
    border-radius: var(--border-radius); /* Reset individual border radius */
  }
  .input-group input[type="search"] {
    margin-bottom: var(--spacing-sm);
  }
}

@media (max-width: 480px) {
  .container {
    width: 95%;
    padding-left: var(--spacing-sm);
    padding-right: var(--spacing-sm);
  }
  .site-header h1 {
    font-size: 1.5rem;
  }
  .page-title {
    font-size: 1.3rem;
  }
  button, .button {
    padding: var(--spacing-sm) var(--spacing-md); /* Slightly smaller padding */
  }
}

/* public/style.css */
/* ... (keep all existing CSS from the previous version) ... */

/* ADD THE FOLLOWING AT THE END OF THE FILE, OR INTEGRATE INTO RELEVANT SECTIONS */

/* Layout for pages with a sidebar */
.page-with-sidebar {
  display: grid;
  grid-template-columns: 220px 1fr; /* Sidebar width and main content */
  gap: var(--spacing-lg);
  flex-grow: 1; /* Ensure it takes available space in the container */
}

.page-sidebar {
  background-color: var(--color-surface);
  padding: var(--spacing-md);
  border-radius: var(--border-radius);
  box-shadow: var(--shadow-sm);
  border-right: 1px solid var(--color-border);
  height: fit-content; /* So it doesn't stretch unnecessarily if content is short */
  position: sticky; /* Make sidebar sticky */
  top: var(--spacing-lg); /* Adjust based on your header or desired spacing */
}

.page-sidebar h3 {
  font-size: 1.1rem;
  font-weight: 600;
  margin-bottom: var(--spacing-md);
  padding-bottom: var(--spacing-sm);
  border-bottom: 1px solid var(--color-border);
  color: var(--color-text);
}

.sidebar-nav ul {
  list-style: none;
  padding: 0;
  margin: 0;
}

.sidebar-nav li a {
  display: block;
  padding: var(--spacing-sm) var(--spacing-md);
  color: var(--color-text-muted);
  text-decoration: none;
  border-radius: calc(var(--border-radius) / 2);
  transition: background-color 0.2s ease-in-out, color 0.2s ease-in-out;
  margin-bottom: var(--spacing-xs);
}

.sidebar-nav li a:hover {
  background-color: var(--color-background); /* Subtle hover */
  color: var(--color-primary);
}

.sidebar-nav li a.active {
  background-color: var(--color-primary);
  color: #fff;
  font-weight: 500;
}

.main-content-area {
  /* This will hold the sections that are shown/hidden */
}

.main-content-area > section {
  display: none; /* Hide all sections by default */
  animation: fadeIn 0.3s ease-in-out;
}

.main-content-area > section.active-section {
  display: block; /* Show only the active section */
}

@keyframes fadeIn {
  from { opacity: 0; transform: translateY(10px); }
  to { opacity: 1; transform: translateY(0); }
}


/* Responsive adjustments for sidebar layout */
@media (max-width: 992px) { /* Adjust breakpoint as needed */
  .page-with-sidebar {
    grid-template-columns: 1fr; /* Stack sidebar and content */
  }
  .page-sidebar {
    position: static; /* Remove stickiness on smaller screens */
    margin-bottom: var(--spacing-lg);
    border-right: none;
    border-bottom: 1px solid var(--color-border);
  }
}

/* Styling for form error messages (if not already present or to refine) */
.form-error-message {
  color: var(--color-danger);
  background-color: var(--color-surface); /* Or a light red like #f8d7da */
  border: 1px solid var(--color-danger);
  padding: var(--spacing-md);
  margin-bottom: var(--spacing-md);
  border-radius: var(--border-radius);
}


================================================
FILE: public/test-injection.html
================================================
<script type=module src=injection.js></script>


================================================
FILE: public/top.html
================================================
<script>
  
</script>


================================================
FILE: scripts/build_only.sh
================================================
#!/usr/bin/env bash

#set -x
source $HOME/.nvm/nvm.sh

rm -rf build
mkdir -p build/esm/
mkdir -p build/cjs/
mkdir -p build/global/
mkdir -p build/bin/
nvm use v22
if [[ ! -d "node_modules" ]]; then
  npm i
fi
if [[ -n "$NO_MINIFY" ]]; then
  ./node_modules/.bin/esbuild src/app.js --bundle --outfile=build/esm/downloadnet.mjs --format=esm --platform=node --analyze
  ./node_modules/.bin/esbuild src/app.js --bundle --outfile=build/cjs/out.cjs --platform=node --analyze
else
  ./node_modules/.bin/esbuild src/app.js --bundle --outfile=build/esm/downloadnet.mjs --format=esm --platform=node --minify --analyze
  ./node_modules/.bin/esbuild src/app.js --bundle --outfile=build/cjs/out.cjs --platform=node --minify --analyze
fi
cp -r public build/
echo "const bigR = require('module').createRequire(__dirname); require = bigR; process.traceProcessWarnings = true; " > build/cjs/dn.cjs
# polyfill for the warning-suppression idea, since the node arg --disable-warning=ExperimentalWarning is likely not accessible in this setup
#echo "const __orig_emit = process.emit; process.emit = (event, error) => event === 'warning' && error.name === 'ExperimentalWarning' ? false : __orig_emit.call(process, event, error);" >> build/cjs/dn.cjs
# although we can use the sea config key disableExperimentalSEAWarning to achieve same 
cat build/cjs/out.cjs >> build/cjs/dn.cjs
echo "#!/usr/bin/env node" > build/global/downloadnet.cjs
cat build/cjs/dn.cjs >> build/global/downloadnet.cjs
chmod +x build/global/downloadnet.cjs
if [[ "$OSTYPE" == darwin* ]]; then
  echo "Using macOS builder..." >&2
  ./stampers/macos-new.sh dn-macos build/cjs/dn.cjs build/bin/
  #./stampers/macos.sh dn-macos build/cjs/dn.cjs build/bin/
elif [[ "$(node.exe -p process.platform)" == win* ]]; then
  echo "Using windows builder..." >&2
  ./stampers/win.bat dn-win.exe ./build/cjs/dn.cjs ./build/bin/
else
  echo "Using linux builder..." >&2
  ./stampers/nix.sh dn-nix build/cjs/dn.cjs build/bin/
fi
echo "Done"

read -p "Any key to exit"
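
The commented-out polyfill in this script sketches a `process.emit` wrapper that swallows Node's ExperimentalWarning. A standalone version of that idea (variable names here are illustrative):

```javascript
// Wrap process.emit so only 'warning' events whose error is an
// ExperimentalWarning are dropped; everything else passes through.
const origEmit = process.emit;
process.emit = function (event, error, ...rest) {
  if (event === 'warning' && error && error.name === 'ExperimentalWarning') {
    return false; // swallow just this warning class
  }
  return origEmit.call(this, event, error, ...rest);
};

process.emitWarning('demo', { type: 'ExperimentalWarning' }); // silently dropped
```

This is a runtime workaround; when the embedding allows it, the SEA config key `disableExperimentalSEAWarning` (mentioned above) or Node's `--disable-warning` flag are cleaner.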



================================================
FILE: scripts/clean.sh
================================================
#!/usr/bin/env bash

rm package-lock.json; rm -rf node_modules; rm -rf build/*


================================================
FILE: scripts/downloadnet-entitlements.xml
================================================
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>com.apple.security.network.server</key>
    <true/>
  <key>com.apple.security.cs.allow-jit</key>
    <true/>
  <key>com.apple.security.cs.allow-unsigned-executable-memory</key>
    <true/>
  <key>com.apple.security.cs.disable-library-validation</key>
    <true/>
  <key>com.apple.security.cs.disable-executable-page-protection</key>
    <true/>
</dict>
</plist>



================================================
FILE: scripts/go_build.sh
================================================
#!/usr/bin/env bash

cp ./.package.build.json ./package.json
cp ./src/.common.build.js ./src/common.js



================================================
FILE: scripts/go_dev.sh
================================================
#!/usr/bin/env bash

gut "Just built"
cp ./.package.dev.json ./package.json
cp ./src/.common.dev.js ./src/common.js



================================================
FILE: scripts/postinstall.sh
================================================
#!/usr/bin/env bash

which brew || /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
which mkcert || brew install mkcert
mkdir -p $HOME/local-sslcerts
cd $HOME/local-sslcerts

mkcert -key-file privkey.pem -cert-file fullchain.pem localhost
mkcert -install



================================================
FILE: scripts/publish.sh
================================================
#!/usr/bin/env bash

./scripts/go_build.sh
gpush minor "$@"
./scripts/go_dev.sh



================================================
FILE: scripts/release.sh
================================================
#!/bin/sh

#./scripts/compile.sh
description=$1
latest_tag=$(git describe --abbrev=0)
grel release -u o0101 -r dn --tag "$latest_tag" --name "New release" --description '"'"$description"'"'
grel upload -u o0101 -r dn --tag "$latest_tag" --name "downloadnet-win.exe" --file bin/downloadnet-win.exe
grel upload -u o0101 -r dn --tag "$latest_tag" --name "downloadnet-linux" --file bin/downloadnet-linux
grel upload -u o0101 -r dn --tag "$latest_tag" --name "downloadnet-macos" --file bin/downloadnet-macos





================================================
FILE: scripts/sign_windows_release.ps1
================================================
param (
    [Parameter(Mandatory=$true)]
    [string]$ExePath,

    [Parameter(Mandatory=$true)]
    [string]$KeyVaultName,

    [string]$SubscriptionId,
    [string]$ResourceGroup,
    [string]$CertificateName,
    [string]$AppId,
    [string]$ClientSecret,
    [string]$TenantId,

    # --- Version Info Metadata ---
    [string]$CompanyName = "DOSAYGO",
    [string]$ProductName = "DownloadNet",
    [string]$FileDescription = "Offline full-text search archive of what you browse",
    [string]$FileVersion = "4.5.1.0",
    [string]$ProductVersion = "4.5.1.0",

    # --- Signature Metadata ---
    [string]$SignatureDescription = "DownloadNet - offline full-text search archive of the web for you.",
    [string]$SignatureUrl = "https://github.com/DO-SAY-GO/dn"
)

# --- Function to check/install resedit-cli via npm ---
function Ensure-ReseditInstalled {
    $isInstalled = Get-Command "resedit" -ErrorAction SilentlyContinue

    if (-not $isInstalled) {
        Write-Host "resedit-cli not found. Attempting to install with npm..." -ForegroundColor Yellow
        npm i -g resedit-cli
        if ($LASTEXITCODE -ne 0) {
            Write-Error "Failed to install resedit-cli using npm. Ensure npm is installed and accessible."
            exit 1
        }
        # Refresh PATH to include newly installed resedit-cli
        $env:Path = [System.Environment]::GetEnvironmentVariable("Path", "User") + ";" + [System.Environment]::GetEnvironmentVariable("Path", "Machine")
    } else {
        Write-Host "resedit-cli is already installed." -ForegroundColor Green
    }
}

# --- Call resedit-cli to update version metadata ---
function Set-VersionMetadata {
    Ensure-ReseditInstalled

    Write-Host "Setting executable metadata using resedit-cli..." -ForegroundColor Yellow
    $tempOutput = "$ExePath.tmp.exe"
    $reseditArgs = @(
        "--in", "`"$ExePath`"",
        "--out", "`"$tempOutput`"",
        "--company-name", "`"$CompanyName`"",
        "--product-name", "`"$ProductName`"",
        "--file-description", "`"$FileDescription`"",
        "--file-version", "`"$FileVersion`"",
        "--product-version", "`"$ProductVersion`""
    )

    $reseditCommand = "resedit $($reseditArgs -join ' ')"
    Write-Verbose "Executing: $reseditCommand"
    Invoke-Expression $reseditCommand

    if ($LASTEXITCODE -ne 0) {
        Write-Error "resedit-cli failed to apply version metadata."
        if (Test-Path $tempOutput) { Remove-Item $tempOutput -Force }
        exit 1
    }

    # Replace original file with updated one
    Move-Item -Path $tempOutput -Destination $ExePath -Force
    Write-Host "Version metadata applied successfully." -ForegroundColor Green
}

# --- RUN METADATA SETTING STEP FIRST ---
Set-VersionMetadata

# --- Configuration (Defaults from original script) ---
$DefaultSPNName = "CodeSigningSP" # Original SPN name
$TimestampServer = "http://timestamp.digicert.com"
$AzureSignToolExe = "AzureSignTool.exe" # Assumes in PATH
$SignToolExe = "signtool.exe"           # Assumes in PATH

# --- Original Script's Flow (unchanged) ---

function Show-Usage {
    Write-Host "Usage: .\sign_windows_release.ps1 -ExePath <path> -KeyVaultName <kv-name> [-SubscriptionId <sub-id>] [-ResourceGroup <rg>] [-CertificateName <cert-name>] [-AppId <id> -ClientSecret <secret> -TenantId <tenant>] [-SignatureDescription <desc>] [-SignatureUrl <url>]"
    exit 1
}

if (-not $ExePath -or -not $KeyVaultName) { Show-Usage }
if ($AppId -and (-not $ClientSecret -or -not $TenantId)) { Write-Error "Error: If -AppId is provided, -ClientSecret and -TenantId must also be provided."; Show-Usage }
if (-not (Test-Path $ExePath -PathType Leaf)) { Write-Error "Error: Executable not found at path: $ExePath"; exit 1 }

if (-not $SubscriptionId) {
    Write-Host "Fetching the active Azure subscription..."
    $subscriptionOutput = az account show | ConvertFrom-Json -ErrorAction SilentlyContinue
    if ($LASTEXITCODE -ne 0 -or !$subscriptionOutput.id) { Write-Error "Error: Failed to retrieve active subscription. Ensure 'az' CLI is installed and you are logged in with 'az login'."; exit 1 }
    $SubscriptionId = $subscriptionOutput.id
    Write-Host "Using active subscription ID: $SubscriptionId"
}

Write-Host "Setting active subscription to: $SubscriptionId"
az account set --subscription $SubscriptionId
if ($LASTEXITCODE -ne 0) { Write-Error "Error: Failed to set active subscription."; exit 1 }

Write-Host "Fetching Key Vault details for: $KeyVaultName"
$keyVaultOutput = az keyvault show --name $KeyVaultName --subscription $SubscriptionId | ConvertFrom-Json -ErrorAction SilentlyContinue
if ($LASTEXITCODE -ne 0 -or !$keyVaultOutput.properties.vaultUri) { Write-Error "Error: Failed to retrieve Key Vault details."; exit 1 }
$KeyVaultUrl = $keyVaultOutput.properties.vaultUri
Write-Host "Key Vault URL: $KeyVaultUrl"

if (-not $ResourceGroup) {
    $ResourceGroup = $keyVaultOutput.resourceGroup
    if (-not $ResourceGroup) { Write-Error "Error: Could not retrieve resource group from Key Vault details."; exit 1 }
    Write-Host "Using resource group from Key Vault: $ResourceGroup"
}

if (-not $CertificateName) {
    Write-Host "CertificateName not provided. Fetching available certificates in Key Vault: $KeyVaultName"
    $certListOutput = az keyvault certificate list --vault-name $KeyVaultName | ConvertFrom-Json -ErrorAction SilentlyContinue
    if ($LASTEXITCODE -ne 0 -or !$certListOutput) { Write-Error "Error: Failed to list certificates in Key Vault, or no certificates found."; exit 1 }
    $certificates = @($certListOutput)
    if ($certificates.Count -eq 0) { Write-Error "Error: No certificates found in Key Vault: $KeyVaultName"; exit 1 }
    Write-Host "Available certificates:"
    $certificates | ForEach-Object { Write-Host "  - $($_.name)" }
    $CertificateName = $certificates[0].name
    Write-Host "Using first available certificate: $CertificateName" -ForegroundColor Green
}

if (-not $AppId) {
    Write-Host "Service Principal AppId not provided. Creating a new service principal named '$DefaultSPNName'..."
    $scope = "/subscriptions/$SubscriptionId/resourceGroups/$ResourceGroup/providers/Microsoft.KeyVault/vaults/$KeyVaultName"
    # Using "Contributor" role as in the original script.
    # For production, consider least privilege (e.g., custom role with only cert get & key sign).
    $spnOutput = az ad sp create-for-rbac --name $DefaultSPNName --role Contributor --scopes $scope | ConvertFrom-Json -ErrorAction SilentlyContinue
    if ($LASTEXITCODE -ne 0 -or !$spnOutput.appId) { Write-Error "Error: Failed to create service principal."; exit 1 }
    $AppId = $spnOutput.appId
    $ClientSecret = $spnOutput.password
    $TenantId = $spnOutput.tenant
    Write-Host "Service principal '$DefaultSPNName' created successfully." -ForegroundColor Green
    Write-Host "AppId   : $AppId"
    Write-Host "Secret  : $ClientSecret (Note: This secret is shown only once. Store it securely.)"
    Write-Host "TenantId: $TenantId"

    # Grant permissions using set-policy as in the original script
    Write-Host "Setting Key Vault access policy for SPN '$AppId'..."
    az keyvault set-policy --name $KeyVaultName --spn $AppId --key-permissions sign --certificate-permissions get
    if ($LASTEXITCODE -ne 0) { Write-Error "Error: Failed to set Key Vault policy."; exit 1 }
    Write-Host "Key Vault access policy set successfully." -ForegroundColor Green
}

# --- Construct AzureSignTool command with metadata flags ---
$signToolBaseArgs = @(
    "sign",
    "-kvu", "`"$KeyVaultUrl`"",
    "-kvi", "`"$AppId`"",
    "-kvs", "`"$ClientSecret`"", # ClientSecret might contain special characters
    "-kvt", "`"$TenantId`"",
    "-kvc", "`"$CertificateName`"",
    "-tr", "`"$TimestampServer`""
)
# Add description if provided
if ($SignatureDescription) {
    $signToolBaseArgs += "-d", "`"$SignatureDescription`""
}
# Add description URL if provided
if ($SignatureUrl) {
    $signToolBaseArgs += "-du", "`"$SignatureUrl`""
}
# Add verbose flag and executable path
$signToolBaseArgs += "-v", "`"$ExePath`""

$signCommand = "$AzureSignToolExe $($signToolBaseArgs -join ' ')"

Write-Host "Signing the executable: $ExePath (Cert: $CertificateName, KV: $KeyVaultName)" -ForegroundColor Yellow
Write-Verbose "Executing: $signCommand"
$signOutput = Invoke-Expression $signCommand

if ($LASTEXITCODE -ne 0) {
    Write-Error "Error: Failed to sign the executable with AzureSignTool. Exit code: $LASTEXITCODE"
    Write-Error "AzureSignTool Output: $signOutput"
    exit 1
}
Write-Host "Executable signed successfully by AzureSignTool." -ForegroundColor Green
$signOutput | Write-Host

Write-Host "Verifying the signature using $SignToolExe..." -ForegroundColor Yellow
$verifyCommand = "$SignToolExe verify /pa `"$ExePath`""
Write-Verbose "Executing: $verifyCommand"
$verifyOutput = Invoke-Expression $verifyCommand

if ($LASTEXITCODE -ne 0) {
    Write-Error "Error: Signature verification failed with $SignToolExe. Exit code: $LASTEXITCODE"
    Write-Error "$SignToolExe Output: $verifyOutput"
    exit 1
}
Write-Host "Signature verified successfully by $SignToolExe." -ForegroundColor Green
$verifyOutput | Write-Host

Write-Host "Signing process completed." -ForegroundColor Green


================================================
FILE: sign-win.ps1
================================================
.\scripts\sign_windows_release.ps1 -ExePath .\build\bin\dn-win.exe -KeyVaultName codeSigningForever



================================================
FILE: src/app.js
================================================
// app.js
import os from 'os';
import path from 'path';
import fs from 'fs/promises';
import { exec } from 'child_process';
import { promisify } from 'util';
import inquirer from 'inquirer';
import chalk from 'chalk';
// import ChromeLauncher from './launcher.js'; // OLD
import BrowserLauncher from './launcher.js'; // NEW - Renamed for clarity
import psList from '@667/ps-list';
import { DEBUG, sleep, NO_SANDBOX, GO_SECURE } from './common.js';
import { Archivist } from './archivist.js';
import LibraryServer from './libraryServer.js';
import args from './args.js';

const { server_port, mode, chrome_port } = args;
const execAsync = promisify(exec);

// Browser definitions with platform-specific executable, package names, and paths
const BROWSERS = [
  {
    name: 'Chrome',
    // For psList matching, use a pattern that matches the process name or command line.
    // For launching, we'll find the specific executable.
    psPattern: /chrome$/i, // Matches 'chrome' or 'google-chrome' at the end of a path/name
    executables: { // Platform-specific executable names (for `where` or `command -v`)
        win32: 'chrome.exe',
        darwin: 'Google Chrome', // Application name for `open -a` or finding in /Applications
        linux: 'google-chrome',
        freebsd: 'chrome'
    },
    // For direct launch if found via these paths
    defaultPaths: [
      '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
      'C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe',
      'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
      '/usr/bin/google-chrome',
      '/usr/local/bin/google-chrome'
    ],
    // For RDP check, what does data.Browser typically start with?
    rdpBrowserName: /Chrome/i,
    // For user display and installation guidance
    packageName: { linux: 'google-chrome-stable', darwin: 'https://www.google.com/chrome/', win32: 'https://www.google.com/chrome/', freebsd: 'chrome' },
  },
  {
    name: 'Chromium',
    psPattern: /chromium(-browser)?$/i,
    executables: {
        win32: 'chromium.exe', // Often chrome.exe if it's a Chromium build
        darwin: 'Chromium',
        linux: 'chromium-browser', // or just 'chromium'
        freebsd: 'chromium'
    },
    defaultPaths: [
      '/Applications/Chromium.app/Contents/MacOS/Chromium',
      'C:\\Program Files\\Chromium\\Application\\chrome.exe', // Some builds use chrome.exe
      'C:\\Program Files\\Chromium\\Application\\chromium.exe',
      '/usr/bin/chromium-browser',
      '/usr/bin/chromium',
      '/usr/local/bin/chromium-browser',
      '/usr/local/bin/chromium'
    ],
    rdpBrowserName: /Chromium/i,
    packageName: { linux: 'chromium-browser', darwin: 'https://www.chromium.org/getting-involved/download-chromium/', win32: 'https://www.chromium.org/getting-involved/download-chromium/', freebsd: 'chromium' },
  },
  {
    name: 'Vivaldi',
    psPattern: /vivaldi$/i,
    executables: { win32: 'vivaldi.exe', darwin: 'Vivaldi', linux: 'vivaldi', freebsd: 'vivaldi' },
    defaultPaths: [
      '/Applications/Vivaldi.app/Contents/MacOS/Vivaldi',
      'C:\\Program Files\\Vivaldi\\Application\\vivaldi.exe',
      '/usr/bin/vivaldi',
      '/usr/local/bin/vivaldi'
    ],
    rdpBrowserName: /Vivaldi/i,
    packageName: { linux: 'vivaldi-stable', darwin: 'https://vivaldi.com/download/', win32: 'https://vivaldi.com/download/', freebsd: 'vivaldi' },
  },
  {
    name: 'Brave',
    psPattern: /brave(-browser)?$/i,
    executables: { win32: 'brave.exe', darwin: 'Brave Browser', linux: 'brave-browser', freebsd: 'brave' },
    defaultPaths: [
      '/Applications/Brave Browser.app/Contents/MacOS/Brave Browser',
      'C:\\Program Files\\BraveSoftware\\Brave-Browser\\Application\\brave.exe',
      '/usr/bin/brave-browser',
      '/usr/local/bin/brave-browser'
    ],
    rdpBrowserName: /Brave/i,
    packageName: { linux: 'brave-browser', darwin: 'https://brave.com/download/', win32: 'https://brave.com/download/', freebsd: 'brave' },
  },
  {
    name: 'Edge',
    psPattern: /(msedge|microsoft-edge)$/i,
    executables: { win32: 'msedge.exe', darwin: 'Microsoft Edge', linux: 'microsoft-edge', freebsd: 'edge' },
    defaultPaths: [
      '/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge',
      'C:\\Program Files (x86)\\Microsoft\\Edge\\Application\\msedge.exe',
      'C:\\Program Files\\Microsoft\\Edge\\Application\\msedge.exe',
      '/usr/bin/microsoft-edge',
      '/usr/local/bin/microsoft-edge'
    ],
    rdpBrowserName: /Edg/i,
    packageName: { linux: 'microsoft-edge-stable', darwin: 'https://www.microsoft.com/edge', win32: 'https://www.microsoft.com/edge', freebsd: 'edge' },
  }
];


// Base Chrome launch flags
const BASE_CHROME_FLAGS = [
  `--remote-debugging-port=${chrome_port}`,
  `--disk-cache-dir=${args.temp_browser_cache()}`,
  `--aggressive-cache-discard`,
  // '--no-first-run', // Often useful
  // '--no-default-browser-check', // Often useful
  // '--disable-features=TranslateUI', // Example: disable a feature
  // '--disable-default-apps',
  // '--disable-component-update',
  // '--disable-background-networking',
  // '--disable-sync',
  // '--metrics-recording-only',
  // '--disable-breakpad', // Disables crash reporting
];
if (NO_SANDBOX) {
  BASE_CHROME_FLAGS.push('--no-sandbox');
  // On Linux, --no-sandbox often requires --disable-setuid-sandbox as well,
  // or running as root (which is not recommended for browsers).
  // if (process.platform === 'linux') BASE_CHROME_FLAGS.push('--disable-setuid-sandbox');
}
if (process.env.DK_HEADLESS) {
  BASE_CHROME_FLAGS.push('--headless=new'); // Modern headless
  // BASE_CHROME_FLAGS.push('--disable-gpu'); // Often needed with headless
  // BASE_CHROME_FLAGS.push('--window-size=1920,1080'); // Example size
}

// Platform-specific kill commands
// Uses the executable name for killing, which should be more reliable
const KILL_ON = browserDefinition => {
    const execName = browserDefinition.executables[process.platform];
    if (!execName) return {}; // Should not happen if browserDefinition is valid
    return {
        win32: `taskkill /IM ${execName} /F`,
        darwin: `pkill -if "${execName}"`, // -i: case-insensitive, -f: match against the full argument list
        freebsd: `pkill -15 -f "${execName}"`, // -f to match against full command line
        linux: `pkill -15 -f "${execName}"`   // -f to match against full command line
    };
};


let quitting = false;

// Start the application
start().catch(async err => {
  console.error(chalk.red('Critical startup error:'), err);
  await cleanup('Startup error', err, { exit: true });
});

async function promptUser(question, options) {
  const choices = options.map((opt, i) => ({
    name: `${i + 1}. ${opt.text}`,
    value: opt.value
  }));
  const defaultChoice = options.find(opt => opt.default)?.value || (choices.length > 0 ? choices[0].value : null);

  const { choice } = await inquirer.prompt([
    {
      type: 'list',
      name: 'choice',
      message: chalk.blue.bold(question),
      choices,
      default: defaultChoice
    }
  ]);
  return choice;
}

// --- MODIFIED: findExecutablePath ---
// Finds the executable path for a given browser definition
async function findExecutablePath(browserDef) {
    const platform = process.platform;
    const execName = browserDef.executables[platform];

    // 1. Check predefined defaultPaths
    for (const p of browserDef.defaultPaths) {
        // Ensure path is relevant for current platform (e.g. C:\ for win32)
        if ((platform === 'win32' && p.includes(':')) || (platform !== 'win32' && p.startsWith('/'))) {
            try {
                await fs.access(p, fs.constants.X_OK); // Check if exists and is executable
                DEBUG.verbose && console.log(`Found ${browserDef.name} at default path: ${p}`);
                return p;
            } catch { /* Path not accessible or doesn't exist */ }
        }
    }

    // 2. Check system PATH (where/command -v)
    if (execName) {
        try {
            const cmd = platform === 'win32' ? `where ${execName}` : `command -v ${execName}`;
            const { stdout } = await execAsync(cmd, { shell: platform === 'win32' ? 'cmd.exe' : '/bin/bash' });
            const foundPath = stdout.trim().split('\n')[0]; // Take the first result
            if (foundPath) {
                 await fs.access(foundPath, fs.constants.X_OK); // Verify it's executable
                 DEBUG.verbose && console.log(`Found ${browserDef.name} in PATH: ${foundPath}`);
                 return foundPath;
            }
        } catch { /* Not in PATH or not executable */ }
    }
    
    DEBUG.verbose && console.log(`Executable path for ${browserDef.name} not found.`);
    return null;
}


async function detectInstalledBrowsers() {
  const installed = [];
  for (const browserDef of BROWSERS) {
    const executablePath = await findExecutablePath(browserDef);
    if (executablePath) {
      // Store the found executable path in the browser definition for later use
      installed.push({ ...browserDef, foundPath: executablePath });
    }
  }
  return installed;
}

async function checkIsConnectable(browserDef) { // Takes browserDef
  const hosts = ['localhost', '127.0.0.1', '[::1]'];
  for (const host of hosts) {
    try {
      const url = `http://${host}:${chrome_port}/json/version`;
      DEBUG.verbose && console.log(`RDP Check: Testing ${url} for ${browserDef.name}`);
      // WHATWG fetch (global in Node 18+) ignores a `timeout` option; use an AbortSignal instead.
      const response = await fetch(url, { signal: AbortSignal.timeout(700) }); // 700ms timeout
      if (response.ok) {
        const data = await response.json();
        DEBUG.verbose && console.log(`RDP Response from ${host}:${chrome_port}:`, data.Browser);
        if (data.Browser && browserDef.rdpBrowserName.test(data.Browser)) {
          DEBUG.verbose && console.log(chalk.green(`RDP Connectable: ${browserDef.name} on ${host}:${chrome_port}`));
          return true;
        }
      }
    } catch (e) {
      DEBUG.verboseSlow && console.warn(chalk.yellow(`RDP check failed for ${browserDef.name} on ${host}:${chrome_port}: ${e.message.split('\n')[0]}`));
    }
  }
  return false;
}

async function detectBrowsers() {
  const processes = await psList();
  (DEBUG.verbose || DEBUG.showList) && console.log("Running processes:", processes.map(p=>p.name).filter(Boolean).join(', '));

  const installedBrowserDefs = await detectInstalledBrowsers(); // These now include 'foundPath'
  
  const browserStatus = await Promise.all(
    // Map over all BROWSERS definitions, but enrich with foundPath if installed
    BROWSERS.map(async baseBrowserDef => {
      const installedDef = installedBrowserDefs.find(ib => ib.name === baseBrowserDef.name);
      const browserDef = installedDef || baseBrowserDef; // Use enriched def if available

      const proc = processes.find(({ name, cmd, path: procPath }) => {
        // Try to match against the psPattern or the executable name
        const execNameForPs = browserDef.executables[process.platform];
        return (name && browserDef.psPattern.test(name)) || 
               (cmd && browserDef.psPattern.test(cmd)) ||
               (procPath && execNameForPs && procPath.toLowerCase().includes(execNameForPs.toLowerCase()));
      });
      const isRunning = !!proc;
      const isConnectable = isRunning && browserDef.foundPath && await checkIsConnectable(browserDef); // Only check if installed
      
      return { 
        ...browserDef, // Includes name, psPattern, executables, defaultPaths, rdpBrowserName, packageName
        isInstalled: !!browserDef.foundPath, // True if foundPath exists
        isRunning, 
        isConnectable, 
        proc 
        // foundPath is already part of browserDef if installed
      };
    })
  );

  const installed = browserStatus.filter(b => b.isInstalled);
  const running = browserStatus.filter(b => b.isRunning && b.isInstalled); // Only consider installed browsers as "running" for our purposes
  
  return { installed, running, all: browserStatus };
}


async function killBrowser(browserName) {
  const browserDefinition = BROWSERS.find(b => b.name === browserName);
  if (!browserDefinition) {
      console.warn(chalk.yellow(`No definition found for browser ${browserName} to kill.`));
      return;
  }
  const execNameForKill = browserDefinition.executables[process.platform];
  if (!execNameForKill) {
      console.warn(chalk.yellow(`No executable defined for ${browserName} on ${process.platform} to kill.`));
      return;
  }

  const killCommands = KILL_ON(browserDefinition); // Pass the full definition
  if (!(process.platform in killCommands)) {
    console.warn(chalk.yellow(`Platform ${process.platform} not supported for killing ${browserName}. Please close it manually.`));
    return;
  }

  try {
    console.log(chalk.cyan(`Attempting to shut down ${browserName} (processes matching ${execNameForKill})...`));
    const killCommand = killCommands[process.platform];
    DEBUG.verbose && console.log(`Executing kill command: ${killCommand}`);
    const { stdout, stderr } = await execAsync(killCommand, { shell: process.platform === 'win32' ? 'cmd.exe' : '/bin/bash' });
    
    if (stderr && !stderr.toLowerCase().includes('no tasks running') && !stderr.toLowerCase().includes('not found') && !stderr.toLowerCase().includes('no process found')) {
      DEBUG.verboseSlow && console.warn(chalk.yellow(`Error output during kill for ${browserName}: ${stderr.trim()}`));
      console.log(chalk.cyan(`${browserName} might not have been running or an issue occurred during shutdown.`));
    } else if (stdout.toLowerCase().includes('terminated') || stdout.toLowerCase().includes('success') || !stderr || stderr.toLowerCase().includes('no tasks running') || stderr.toLowerCase().includes('not found') || stderr.toLowerCase().includes('no process found')) {
      console.log(chalk.green(`${browserName} processes shut down or were not running.`));
    } else {
      console.log(chalk.green(`${browserName} shutdown command issued.`));
    }
    await sleep(1000);
  } catch (e) {
    if (e.message.toLowerCase().includes('process not found') || e.message.toLowerCase().includes('no matching processes') || e.message.toLowerCase().includes('no tasks running')) {
        console.log(chalk.cyan(`${browserName} was not found or already closed.`));
    } else {
        console.warn(chalk.yellow(`Error executing kill command for ${browserName}: ${e.message}`));
    }
  }
}

async function cleanTempCache() {
  const tempDir = args.temp_browser_cache();
  try {
    await fs.access(tempDir); // Check if exists first
    console.log(chalk.cyan(`Removing temporary browser cache (${tempDir})...`));
    await fs.rm(tempDir, { recursive: true, force: true });
    console.log(chalk.green(`Temporary cache deleted.`));
  } catch (e) {
    if (e.code === 'ENOENT') {
        DEBUG.verbose && console.log(chalk.cyan(`Temporary cache directory (${tempDir}) not found, nothing to delete.`));
    } else {
        console.warn(chalk.yellow(`Error deleting temporary cache: ${e.message}`));
    }
  }
}

async function start() {
  console.log(chalk.cyan(`DownloadNet starting...`));

  const signals = ['error', 'unhandledRejection', 'uncaughtException', 'SIGHUP', 'beforeExit'];
  signals.forEach(signal => process.on(signal, async (err) => await cleanup(err?.message || signal, err)));
  const exitSignals = ['SIGINT', 'SIGTERM', 'SIGQUIT', 'SIGBREAK', 'SIGABRT'];
  exitSignals.forEach(signal => process.on(signal, async (code) => await cleanup(code, 'signal', { exit: true })));

  console.log(chalk.cyan(`Checking browsers...`));
  const { installed, running, all: browserStatus } = await detectBrowsers();
  const connectable = browserStatus.filter(b => b.isConnectable && b.isInstalled);

  console.log(chalk.blue.bold(`\nBrowser Status:`));
  if ( DEBUG.verbose ) {
    console.log(chalk.cyan(`  Installed: ${installed.map(b => `${b.name} (at ${b.foundPath || 'path not confirmed'})`).join(', ') || 'None'}`));
  } else {
    console.log(chalk.cyan(`  Installed: ${installed.map(b => `${b.name}`).join(', ') || 'None'}`));
  }
  console.log(chalk.cyan(`  Running:   ${running.map(b => b.name).join(', ') || 'None'}`));
  console.log(chalk.cyan(`  Connectable: ${connectable.map(b => b.name).join(', ') || 'None'}`));

  let action = null;
  const menuOptions = [];

  connectable.forEach(b => menuOptions.push({
    text: `Use running ${b.name} (already open and connectable)`,
    value: { action: 'connect', browser: b },
    default: true
  }));

  running.forEach(b => {
    // Only offer relaunch if not already connectable, or if user might want a fresh start
    if (!connectable.some(cb => cb.name === b.name)) {
        menuOptions.push({
            text: `Relaunch ${b.name} (to enable archiving features)`,
            value: { action: 'relaunch', browser: b }
        });
    }
  });
  
  // Offer to launch installed but not running browsers
  installed.forEach(b => {
    if (!running.some(rb => rb.name === b.name)) {
      menuOptions.push({
        text: `Launch ${b.name} (new instance)`,
        value: { action: 'launch', browser: b }
      });
    }
  });

  // Offer to install browsers not detected as installed
  BROWSERS.forEach(bDef => {
    if (!installed.some(ib => ib.name === bDef.name)) {
      menuOptions.push({
        text: `Install and launch ${bDef.name} (requires installation)`,
        value: { action: 'install', browser: bDef } // Pass the base definition
      });
    }
  });
  
  if (running.length > 0) {
    menuOptions.push({
      text: 'Shut down all detected browser processes and exit',
      value: { action: 'shutdown_all_and_exit' }
    });
  }
  menuOptions.push({ text: 'Exit', value: { action: 'exit_only' } });

  const uniqueMenuOptions = [];
  const seenValues = new Set();
  for (const opt of menuOptions) {
      let key = opt.value.action;
      if (opt.value.browser) key += `_${opt.value.browser.name}`;
      if (!seenValues.has(key)) {
          uniqueMenuOptions.push(opt);
          seenValues.add(key);
      } else if (opt.default) {
          const existingIndex = uniqueMenuOptions.findIndex(uo => (uo.value.action + (uo.value.browser ? `_${uo.value.browser.name}`: '')) === key);
          if (existingIndex !== -1) uniqueMenuOptions[existingIndex] = opt;
      }
  }

  if (uniqueMenuOptions.some(opt => opt.value.action !== 'exit_only')) {
    action = await promptUser('Select an action:', uniqueMenuOptions);
  } else {
    console.log(chalk.red('No actionable browser options. Please install a compatible browser.'));
    await cleanup('No browsers available or actionable', null, { exit: true });
    return;
  }

  if (!action || action.action === 'exit_only') {
    console.log(chalk.cyan('Exiting as requested.'));
    await cleanup('User chose to exit', null, { exit: true });
    return;
  }

  if (action.action === 'shutdown_all_and_exit') {
    console.log(chalk.cyan('Attempting to shut down all detected running browser processes...'));
    const runningToKill = browserStatus.filter(b => b.isRunning && b.isInstalled); // Get full defs
    if (runningToKill.length > 0) {
      for (const browserToKill of runningToKill) {
        await killBrowser(browserToKill.name); // killBrowser uses name to find definition
      }
      console.log(chalk.green('Shutdown commands issued for all detected running browser processes.'));
    } else {
      console.log(chalk.cyan('No running browser processes (that we manage) were detected to shut down.'));
    }
    await cleanup('User chose to shut down all browsers and exit', null, { exit: true });
    return;
  }

  let browserToUse = action.browser; // This is a browser definition object

  if (action.action === 'connect') {
    console.log(chalk.cyan(`Connecting to running ${browserToUse.name}...`));
  } else if (action.action === 'relaunch') {
    console.log(chalk.cyan(`Relaunching ${browserToUse.name}...`));
    await killBrowser(browserToUse.name);
    action.action = 'launch'; // Proceed to launch
  } else if (action.action === 'install') {
    console.log(chalk.red(`\n${browserToUse.name} is not installed or not found.`));
    const pkgInfo = browserToUse.packageName[process.platform];
    if (pkgInfo) {
        if (pkgInfo.startsWith('http')) {
            console.log(chalk.cyan(`  Please download and install from: ${pkgInfo}`));
        } else if (process.platform === 'linux') {
            console.log(chalk.cyan(`  For example, on Ubuntu/Debian, try: sudo apt update && sudo apt install ${pkgInfo}`));
        } else if (process.platform === 'freebsd') {
            console.log(chalk.cyan(`  For example, try: sudo pkg install ${pkgInfo}`));
        } else {
            console.log(chalk.cyan(`  Please visit the ${browserToUse.name} website to download and install.`));
        }
    } else {
        console.log(chalk.cyan(`  No specific installation instructions for ${browserToUse.name} on ${process.platform}. Please visit its website.`));
    }
    await cleanup(`${browserToUse.name} not installed`, null, { exit: true });
    return;
  }

  // Ensure browserToUse has foundPath if we are launching/relaunching
  if (action.action === 'launch' && !browserToUse.foundPath) {
      console.error(chalk.red(`Error: Attempting to launch ${browserToUse.name}, but its executable path was not found.`));
      console.log(chalk.yellow(`Please ensure ${browserToUse.name} is installed correctly and accessible.`));
      await cleanup('Executable path missing for launch', null, { exit: true });
      return;
  }

  await cleanTempCache();
  console.log(chalk.cyan(`Launching library server...`));
  await LibraryServer.start({ server_port });
  console.log(chalk.green(`Library server started on port ${server_port}.`));

  let launchedBrowserProcess = null;
  if (action.action === 'launch') {
    console.log(chalk.cyan(`Launching ${browserToUse.name} from ${browserToUse.foundPath}...`));
    
    const browserArgsForLaunch = [
        ...BASE_CHROME_FLAGS, // Includes remote debugging port
        `--user-data-dir=${path.resolve(os.homedir(), '.config', 'dosaygo', 'DN-Profile')}`,
        // Add any browser-specific flags if needed, e.g. based on browserToUse.name
        // For now, assuming BASE_CHROME_FLAGS are generic enough for Chromium-based ones
        `${GO_SECURE ? 'https' : 'http'}://localhost:${server_port}` // Starting URL
    ];
    
    // Note: userDataDir is not a standard spawn option; it is passed as the
    // --user-data-dir flag above, pointing at a persistent DN-Profile directory.
    // For a throwaway profile instead, substitute a temp directory, e.g.:
    // browserArgsForLaunch.push(`--user-data-dir=${args.temp_browser_profile()}`);


    launchedBrowserProcess = BrowserLauncher.launch(browserToUse.foundPath, browserArgsForLaunch);

    if (!launchedBrowserProcess) {
      console.error(chalk.red(`Failed to launch ${browserToUse.name}.`));
      await cleanup('Browser launch failed', null, { exit: true });
      return;
    }

    launchedBrowserProcess.on('exit', async (code, signal) => {
      const exitReason = code !== null ? `exited with code ${code}` : `killed by signal ${signal}`;
      console.log(chalk.magenta(`Browser process (${browserToUse.name}) ${exitReason}.`));
      if (!quitting) { // Avoid info message if we are intentionally quitting everything
        console.info(chalk.cyan(`
          ---------------------------------------------------------------------
          INFO: Browser exited. If this was unexpected or too quick:
          - Check for error messages above from the browser process.
          - If running headless (DK_HEADLESS=true), ensure your setup is correct.
            You might need a display server like Xvfb on Linux if not using --headless=new.
          - The browser might have crashed or failed to start with the given flags.
          ---------------------------------------------------------------------
        `));
      }
      await cleanup(`Browser ${exitReason}`, null, { exit: true });
    });

    // Give the browser a moment to start up and open the remote debugging port
    console.log(chalk.green(`${browserToUse.name} launched. PID: ${launchedBrowserProcess.pid}. Waiting for it to become connectable...`));
    await sleep(2500); // Wait a bit for the remote debugging port to be available
    
    // Verify connectability after launch
    const isNowConnectable = await checkIsConnectable(browserToUse);
    if (!isNowConnectable) {
        console.warn(chalk.yellow(`Launched ${browserToUse.name}, but it's not connectable on port ${chrome_port} after waiting.`));
        console.warn(chalk.yellow(`Archivist might not function correctly. Check browser console for errors.`));
        // Decide if this is a fatal error or if we should proceed with caution
        // For now, proceed with caution.
    } else {
        console.log(chalk.green(`${browserToUse.name} is connectable.`));
    }

  } else if (action.action === 'connect') {
    console.log(chalk.cyan(`Proceeding with already running and connectable ${browserToUse.name}.`));
  }

  if (quitting) return;

  console.log(chalk.cyan(`Launching archivist and connecting to browser on port ${chrome_port}...`));
  await Archivist.collect({ chrome_port, mode });
  console.log(chalk.green.bold(`System ready. Archivist connected.`));
}
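A fixed `sleep(2500)` can race a slow browser start. One way to tighten this (a hypothetical helper, not part of the codebase, assuming Node 18+ global `fetch` and the standard Chromium DevTools HTTP endpoint) is to poll `/json/version` until the debugging port answers or a deadline passes:

```javascript
// Hypothetical alternative to the fixed sleep above: poll the DevTools
// HTTP endpoint until the remote debugging port is actually accepting
// connections, or give up after a deadline. Assumes Node 18+ fetch.
async function untilConnectable(port, { timeout = 15000, interval = 500 } = {}) {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(`http://127.0.0.1:${port}/json/version`);
      if (res.ok) return true; // browser is up and the port is open
    } catch {
      // connection refused: browser not ready yet, keep polling
    }
    await new Promise(resolve => setTimeout(resolve, interval));
  }
  return false;
}
```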

async function cleanup(reason, err, { exit = false } = {}) {
  if (quitting && exit) {
    DEBUG.verbose && console.log(chalk.cyan(`Cleanup already in progress for exit. Reason: ${reason}`));
    return;
  }
  console.log(chalk.cyan(`\nInitiating shutdown sequence. Reason: ${reason}`));
  if (err) {
    console.error(chalk.red('Error during operation or shutdown:'), err instanceof Error ? err.stack : err);
  }

  if (exit) quitting = true;

  DEBUG.verbose && console.log(chalk.yellow(`Cleanup called. Reason: ${reason}`));

  Archivist.shutdown(); // Signal archivist to stop its work
  LibraryServer.stop(); // Stop the HTTP server

  // Note: We don't explicitly kill the browser here if it was launched by us.
  // Its 'exit' handler calls cleanup. If the user chose 'connect', we don't own the process.
  // If 'shutdown_all_and_exit' was chosen, browsers were killed before this.

  if (exit) {
    console.log(chalk.cyan(`All components signaled to stop. Exiting in 3 seconds...`));
    await sleep(3000);
    process.exit(err instanceof Error ? 1 : 0);
  }
}
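The `quitting` flag above makes the exit-bound cleanup effectively run-once: the first exit-path call proceeds, later ones return early. The same contract, restated as a standalone sketch (hypothetical `makeOnce` helper, not in the codebase):

```javascript
// Run-once guard mirroring the quitting flag in cleanup(): only the first
// invocation executes the wrapped function; subsequent calls are no-ops.
function makeOnce(fn) {
  let called = false;
  return (...args) => {
    if (called) return; // shutdown already in progress
    called = true;
    return fn(...args);
  };
}
```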


================================================
FILE: src/archivist.js
================================================
// Licenses
  // FlexSearch is Apache-2.0 licensed
    // Source: https://github.com/nextapps-de/flexsearch/blob/bffb255b7904cb7f79f027faeb963ecef0a85dba/LICENSE
  // NDX is MIT licensed
    // Source: https://github.com/ndx-search/ndx/blob/cc9ec2780d88918338d4edcfca2d4304af9dc721/LICENSE
  
// module imports
  import crypto from 'crypto';
  import { rainbowHash } from '@dosyago/rainsum';
  import {URL} from 'url';
  import Path from 'path';
  import os from 'os';
  import Fs from 'fs';
  import {stdin as input, stdout as output} from 'process';
  import util from 'util';
  import readline from 'readline';

  // search related
    import FlexSearch from 'flexsearch';
    const {Index: FTSIndex} = FlexSearch;
    //const {Index: FTSIndex} = require('flexsearch');
    import { 
      createIndex as NDX, 
      addDocumentToIndex as ndx, 
      removeDocumentFromIndex, 
      vacuumIndex 
    } from 'ndx';
    import { query as NDXQuery } from 'ndx-query';
    import { toSerializable, fromSerializable } from 'ndx-serializable';
    //import { DocumentIndex } from 'ndx';
    import Fuzzy from 'fz-search';
    //import * as _Fuzzy from './lib/fz.js';
    import Nat from 'natural';

  import args from './args.js';
  import {
    GO_SECURE,
    untilTrue,
    sleep, DEBUG as debug, 
    BATCH_SIZE,
    MIN_TIME_PER_PAGE,
    MAX_TIME_PER_PAGE,
    MAX_TITLE_LENGTH,
    MAX_URL_LENGTH,
    clone,
    CHECK_INTERVAL, TEXT_NODE, FORBIDDEN_TEXT_PARENT,
    RichError
  } from './common.js';
  import {connect} from './protocol.js';
  import {BLOCKED_CODE, BLOCKED_HEADERS} from './blockedResponse.js';
  import {getInjection} from '../public/injection.js';
  import {hasBookmark, bookmarkChanges} from './bookmarker.js';

// search related state: constants and variables
  const DEBUG = debug || false;
  // common
    /* eslint-disable no-control-regex */
    const STRIP_CHARS = /[\u0001-\u001a\0\v\f\r\t\n]/g;
    /* eslint-enable no-control-regex */
    //const Fuzzy = globalThis.FuzzySearch;
    const NDX_OLD = false;
    const USE_FLEX = true;
    const FTS_INDEX_DIR = args.fts_index_dir;
    const URI_SPLIT = /[/.]/g;
    const NDX_ID_KEY = 'ndx_id';
    const INDEX_HIDDEN_KEYS = new Set([
      NDX_ID_KEY
    ]);
    const hiddenKey = key => key.startsWith('ndx') || INDEX_HIDDEN_KEYS.has(key);
    let Id;

  // natural (NLP tools -- stemmers and tokenizers, etc)
    const {WordTokenizer, PorterStemmer} = Nat;
    const Tokenizer = new WordTokenizer();
    const Stemmer = PorterStemmer;
    const words = Tokenizer.tokenize.bind(Tokenizer);
    const termFilter = Stemmer.stem.bind(Stemmer);
    //const termFilter = s => s.toLocaleLowerCase();

  // FlexSearch
    const FLEX_OPTS = {
      charset: "utf8",
      context: true,
      language: "en",
      tokenize: "reverse"
    };
    let Flex = new FTSIndex(FLEX_OPTS);
    DEBUG.verboseSlow && console.log({Flex});

  // NDX
    const NDXRemoved = new Set();
    const REMOVED_CAP_TO_VACUUM_NDX = 10;
    const NDX_FIELDS = ndxDocFields();
    let NDX_FTSIndex = new NDXIndex(NDX_FIELDS);
    let NDXId;
    DEBUG.verboseSlow && console.log({NDX_FTSIndex});

  // fuzzy (maybe just for queries ?)
    const REGULAR_SEARCH_OPTIONS_FUZZY = {
      minimum_match: 1.0
    };
    const HIGHLIGHT_OPTIONS_FUZZY = {
      minimum_match: 2.0 // or 3.0 seems to be good
    };
    const FUZZ_OPTS = {
      keys: ndxDocFields({namesOnly:true})
    };
    const Docs = new Map();
    let fuzzy = new Fuzzy({source: [...Docs.values()], keys: FUZZ_OPTS.keys});

// module state: constants and variables
  // Cache is a simple map
    // from serialized request key
    // to the on-disk path of the saved response
  const Status = {
    loaded: false
  };
  const FrameNodes = new Map();
  const Targets = new Map();
  const UpdatedKeys = new Set();
  const Cache = new Map();
  const Index = new Map();
  const Indexing = new Set();
  const CrawlIndexing = new Set();
  const CrawlTargets = new Set();
  const CrawlData = new Map();
  const Q = new Set();
  const Sessions = new Map();
  const Installations = new Set();
  const ConfirmedInstalls = new Set();
  const BLANK_STATE = {
    Targets,
    Sessions,
    Installations,
    ConfirmedInstalls,
    FrameNodes,
    Docs,
    Indexing,
    CrawlIndexing,
    CrawlData,
    CrawlTargets,
    Cache, 
    Index,
    NDX_FTSIndex,
    Flex,
    SavedCacheFilePath: null,
    SavedIndexFilePath: null,
    SavedFTSIndexDirPath: null,
    SavedFuzzyIndexDirPath: null,
    saver: null,
    indexSaver: null,
    ftsIndexSaver: null,
    saveInProgress: false,
    ftsSaveInProgress: false
  };
  const State = Object.assign({}, BLANK_STATE);
  export const Archivist = { 
    NDX_OLD,
    USE_FLEX,
    collect, getMode, changeMode, shutdown, 
    beforePathChanged,
    afterPathChanged,
    saveIndex,
    getIndex,
    deleteFromIndexAndSearch,
    search,
    getDetails,
    isReady,
    findOffsets,
    archiveAndIndexURL
  }
  const BODYLESS = new Set([
    301,
    302,
    303,
    307
  ]);
  const NEVER_CACHE = new Set([
    `${GO_SECURE ? 'https://localhost' : 'http://127.0.0.1'}:${args.server_port}`,
    `http://localhost:${args.server_port}`,
    `http://localhost:${args.chrome_port}`,
    `http://127.0.0.1:${args.chrome_port}`,
    `https://127.0.0.1:${args.chrome_port}`,
    `http://::1:${args.chrome_port}`,
    `https://::1:${args.chrome_port}`
  ]);
  const SORT_URLS = ([urlA],[urlB]) => urlA < urlB ? -1 : 1;
  const CACHE_FILE = args.cache_file; 
  const INDEX_FILE = args.index_file;
  const NO_FILE = args.no_file;
  const TBL = /(:\/\/|:|@)/g;
  const UNCACHED_BODY = b64('We have not saved this data');
  const UNCACHED_CODE = 404;
  const UNCACHED_HEADERS = [
    { name: 'Content-type', value: 'text/plain' },
    { name: 'Content-length', value: '27' }
  ];
  const UNCACHED = {
    body:UNCACHED_BODY, responseCode:UNCACHED_CODE, responseHeaders:UNCACHED_HEADERS
  }
  let Mode, Close;

// shutdown and cleanup
  // handle writing out indexes and closing browser connection when resetting under nodemon
    process.once('SIGUSR2', function () {
      shutdown(function () {
        process.kill(process.pid, 'SIGUSR2');
      });
    });

// logging
    let logName;
    let logStream;

// main
  async function collect({chrome_port:port, mode} = {}) {
    try {
      console.log('Starting collect');
      const {library_path} = args;
      const exitHandlers = [];
      process.on('beforeExit', runHandlers);
      process.on('SIGUSR2', code => runHandlers(code, 'SIGUSR2', {exit: true}));
      process.on('exit', code => runHandlers(code, 'exit', {exit: true}));
      State.connection = State.connection || await connect({port});
      console.log('Connection established');
      State.onExit = {
        addHandler(h) {
          exitHandlers.push(h);
        }
      };
      const {send, on, close} = State.connection;
      //const DELAY = 100; // 500 ?
      Close = close;

      let requestStage;
      
      console.log('Loading files...');
      await loadFiles();

      clearSavers();

      Mode = mode; 
      console.log({Mode});
      if ( Mode == 'save' || Mode == 'select' ) {
        requestStage = "Response";
        // in case we get an updateBasePath call before an interval fires
        // and we don't clear it in time (which would erroneously save the old
        // cache to the new path), we always use our saved copy of the path
        State.saver = setInterval(() => saveCache(State.SavedCacheFilePath), 17000);
        // we use a timeout (not an interval) because we can also trigger this
        // ourselves; to avoid a race condition (overlapping calls) we ensure
        // only one call runs at a time
        State.indexSaver = setTimeout(() => saveIndex(State.SavedIndexFilePath), 11001);
        State.ftsIndexSaver = setTimeout(() => saveFTS(State.SavedFTSIndexDirPath), 31001);
      } else if ( Mode == 'serve' ) {
        requestStage = "Request";
        clearSavers();
      } else {
        throw new TypeError(`Must specify mode, and must be one of: save, serve, select`);
      }

      on("Target.targetInfoChanged", attachToTarget);
      on("Target.targetInfoChanged", updateTargetInfo);
      on("Target.targetInfoChanged", indexURL);
      on("Target.attachedToTarget", installForSession);
      on("Page.loadEventFired", reloadIfNotLive);
      on("Fetch.requestPaused", cacheRequest);
      on("Runtime.consoleAPICalled", handleMessage);

      await send("Target.setDiscoverTargets", {discover:true});
      await send("Target.setAutoAttach", {autoAttach:true, waitForDebuggerOnStart:false, flatten: true});
      await send("Security.setIgnoreCertificateErrors", {ignore:true});
      await send("Fetch.enable", {
        patterns: [
          {
            urlPattern: "http*://*", 
            requestStage
          }
        ], 
      });

      const {targetInfos:targets} = await send("Target.getTargets", {});
      DEBUG.debug && console.log({targets});
      const pageTargets = targets.filter(({type}) => type == 'page').map(targetInfo => ({targetInfo}));
      await Promise.all(pageTargets.map(attachToTarget));
      sleep(5000).then(() => Promise.all(pageTargets.map(reloadIfNotLive)));

      State.bookmarkObserver = State.bookmarkObserver || startObservingBookmarkChanges();

      Status.loaded = true;

      return Status.loaded;

      async function runHandlers(reason, err, {exit = false} = {}) {
        debug.verbose && console.log('before exit running', exitHandlers, {reason, err});
        while(exitHandlers.length) {
          const h = exitHandlers.shift();
          try {
            h();
          } catch(e) {
            console.warn(`Error in exit handler`, h, e);
          }
        }
        if ( exit ) {
          console.log(`Exiting in 3 seconds...`);
          await sleep(3000);
          process.exit(0);
        }
      }

      function handleMessage(args) {
        const {type, args:[{value:strVal} = {}] = []} = args;
        if ( type == 'info' ) {
          try {
            const val = JSON.parse(strVal);
            // possible messages
            const {install, titleChange, textChange} = val;
            switch(true) {
              case !!install: {
                  confirmInstall({install});
                } break;
              case !!titleChange: {
                  reindexOnContentChange({titleChange});
                } break;
              case !!textChange: {
                  reindexOnContentChange({textChange});
                } break;
              default: {
                  if ( DEBUG ) {
                    console.warn(`Unknown message`, strVal);
                  }
                } break;
            }
          } catch(e) {
            DEBUG.verboseSlow && console.info('Not the message we expected to confirm install. This is OK.', {originalMessage:args});
          } 
        }
      }
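The try/`JSON.parse` handling above is a small message protocol: the injected page script serializes `{install|titleChange|textChange}` objects and emits them via `console.info`, which arrives here through `Runtime.consoleAPICalled`. A standalone parser restating that contract (hypothetical function name, for illustration only):

```javascript
// Restates the protocol handled by handleMessage above: accept only
// 'info' console events whose first argument is JSON carrying one of
// the three known message kinds; anything else is ordinary page logging.
function parseInjectionMessage(event) {
  if (event.type !== 'info') return null;
  const str = event.args?.[0]?.value;
  try {
    const val = JSON.parse(str);
    if (val.install || val.titleChange || val.textChange) return val;
  } catch {
    // not JSON: a normal console.info from the page, ignore it
  }
  return null;
}
```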

      function confirmInstall({install}) {
        const {sessionId} = install;
        if ( ! State.ConfirmedInstalls.has(sessionId) ) {
          State.ConfirmedInstalls.add(sessionId);
          DEBUG.verboseSlow && console.log({confirmedInstall:install});
        }
      }

      async function reindexOnContentChange({titleChange, textChange}) {
        const data = titleChange || textChange;
        if ( data ) {
          const {sessionId} = data;
          const latestTargetInfo = clone(await untilHas(Targets, sessionId));
          if ( titleChange ) {
            const {currentTitle} = titleChange;
            DEBUG.verboseSlow && console.log('Received titleChange', titleChange);
            latestTargetInfo.title = currentTitle;
            Targets.set(sessionId, latestTargetInfo);
            DEBUG.verboseSlow && console.log('Updated stored target info', latestTargetInfo);
          } else {
            DEBUG.verboseSlow && console.log('Received textChange', textChange);
          }
          if ( ! dontCache(latestTargetInfo) ) {
            DEBUG.verboseSlow && console.log(
              `Will reindex because we were told ${titleChange ? 'title' : 'text'} content may have changed.`, 
              data
            );
            indexURL({targetInfo:latestTargetInfo});
          }
        }
      }

      function updateTargetInfo({targetInfo}) {
        if ( targetInfo.type === 'page' ) {
          const sessionId = State.Sessions.get(targetInfo.targetId); 
          DEBUG.verboseSlow && console.log('Updating target info', targetInfo, sessionId);
          if ( sessionId ) {
            const existingTargetInfo = Targets.get(sessionId);
            // if we have an existing target info for this URL and have saved an updated title
            DEBUG.verboseSlow && console.log('Existing target info', existingTargetInfo);
            if ( existingTargetInfo && existingTargetInfo.url === targetInfo.url ) {
              // keep that title (because targetInfo does not reflect the latest title)
              if ( existingTargetInfo.title !== existingTargetInfo.url ) {
                DEBUG.verboseSlow && console.log('Setting title to existing', existingTargetInfo);
                targetInfo.title = existingTargetInfo.title;
              }
            }
            Targets.set(sessionId, clone(targetInfo));
          }
        }
      }

      async function reloadIfNotLive({targetInfo, sessionId} = {}) {
        if ( Mode == 'serve' ) return; 
        if ( !targetInfo && !!sessionId ) {
          targetInfo = Targets.get(sessionId);
          DEBUG.verboseSlow && console.log('reloadIfNotLive recovered targetInfo', targetInfo);
        }
        if ( neverCache(targetInfo?.url) ) return;
        const {attached, type} = targetInfo;
        if ( attached && type == 'page' ) {
          const {url, targetId} = targetInfo;
          const sessionId = State.Sessions.get(targetId);
          if ( !!sessionId && !State.ConfirmedInstalls.has(sessionId) ) {
            DEBUG.verboseSlow && console.log({
              reloadingAsNotConfirmedInstalled:{
                url, 
                sessionId
              },
              confirmedInstalls: State.ConfirmedInstalls
            });
            await sleep(600);
            send("Page.stopLoading", {}, sessionId);
            send("Page.reload", {}, sessionId);
          }
        }
      }

      function neverCache(url) {
        if ( ! url ) return true;
        try {
          url = new URL(url);
          return url?.href == "about:blank" || url?.href?.startsWith('chrome') || NEVER_CACHE.has(url.origin);
        } catch(e) {
          DEBUG.debug && console.warn('Could not form url', url, e);
          return true;
        } 
      }

      async function installForSession({sessionId, targetInfo, waitingForDebugger}) {
        if ( waitingForDebugger ) {
          console.warn(targetInfo);
          throw new TypeError(`Target not ready for install`);
        }
        if ( ! sessionId ) {
          throw new TypeError(`installForSession needs a sessionId`);
        }

        const {targetId, url} = targetInfo;

        const installUnneeded = dontInstall(targetInfo) ||
          State.Installations.has(sessionId)
        ;

        if ( installUnneeded ) return;

        DEBUG.verboseSlow && console.log("installForSession running on target " + targetId);

        State.Sessions.set(targetId, sessionId);
        Targets.set(sessionId, clone(targetInfo));

        if ( Mode == 'save' || Mode == 'select' ) {
          send("Network.setCacheDisabled", {cacheDisabled:true}, sessionId);
          send("Network.setBypassServiceWorker", {bypass:true}, sessionId);

          await send("Runtime.enable", {}, sessionId);
          await send("Page.enable", {}, sessionId);
          await send("Page.setAdBlockingEnabled", {enabled: true}, sessionId);
          await send("DOMSnapshot.enable", {}, sessionId);

          on("Page.frameNavigated", updateFrameNode);
          on("Page.frameAttached", addFrameNode);
          // on("Page.frameDetached", updateFrameNodes); // necessary? maybe not 

          await send("Page.addScriptToEvaluateOnNewDocument", {
            source: getInjection({sessionId}),
            worldName: "Context-22120-Indexing",
            runImmediately: true
          }, sessionId);

          DEBUG.verboseSlow && console.log("Just request install", targetId, url);
        }

        State.Installations.add(sessionId);

        DEBUG.verboseSlow && console.log('Installed sessionId', sessionId);
        if ( Mode == 'save' ) {
          indexURL({targetInfo});
        }
      }

      async function indexURL({targetInfo:info = {}, sessionId, waitingForDebugger} = {}) {
        if ( waitingForDebugger ) {
          console.warn(info);
          throw new TypeError(`Target not ready for install`);
        }
        if ( Mode == 'serve' ) return;
        if ( info.type != 'page' ) return;
        if ( ! info.url  || info.url == 'about:blank' ) return;
        if ( info.url.startsWith('chrome') ) return;
        if ( dontCache(info) ) return;

        DEBUG.verboseSlow && console.log('Index URL called', info);

        if ( State.Indexing.has(info.targetId) ) return;
        State.Indexing.add(info.targetId);

        if ( ! sessionId ) {
          sessionId = await untilHas(
            State.Sessions, info.targetId, 
            {timeout: State.crawling && State.crawlTimeout}
          );
        }

        if ( !State.Installations.has(sessionId) ) {
          await untilHas(
            State.Installations, sessionId, 
            {timeout: State.crawling && State.crawlTimeout}
          );
        }

        send("DOMSnapshot.enable", {}, sessionId);

        await sleep(500);

        const flatDoc = await send("DOMSnapshot.captureSnapshot", {
          computedStyles: [],
        }, sessionId);
        const pageText = processDoc(flatDoc).replace(STRIP_CHARS, ' ');

        if ( State.crawling ) {
          const has = await untilTrue(() => State.CrawlData.has(info.targetId));

          const {url} = Targets.get(sessionId);
          if ( ! dontCache({url}) ) {
            if ( has ) {
              const {depth,links} = State.CrawlData.get(info.targetId);
              DEBUG.verboseSlow && console.log(info, {depth,links});

              const {result:{value:{title,links:crawlLinks}}} = await send("Runtime.evaluate", {
                expression: `(function () { 
                  return {
                    links: Array.from(
                      document.querySelectorAll('a[href]')
                    ).map(a => a.href),
                    title: document.title
                  };
                }())`,
                returnByValue: true
              }, sessionId);

              const shouldCrawl = depth <= State.crawlDepth;

              if ( shouldCrawl ) {
                links.length = 0;
                links.push(...crawlLinks.filter(url => url.startsWith('http')).map(url => ({url,depth:depth+1})));
              }
              if ( logStream ) {
                console.log(`Writing ${links.length} entries to ${logName}`);
                logStream.cork();
                links.forEach(({url}) => {
                  logStream.write(`${url}\n`);
                });
                logStream.uncork();
              }
              console.log(`Just crawled: ${title} (${info.url})`);
            }

            if ( ! State.titles ) {
              State.titles = new Map();
              State.onExit.addHandler(() => {
                Fs.writeFileSync(
                  Path.resolve(args.CONFIG_DIR, `titles-${(new Date).toISOString()}.txt`), 
                  JSON.stringify([...State.titles.entries()], null, 2) + '\n'
                );
              });
            }

            const {result:{value:data}} = await send("Runtime.evaluate", 
              {
                expression: `(function () {
                  return {
                    url: document.location.href,
                    title: document.title,
                  };
                }())`,
                returnByValue: true
              }, 
              sessionId
            );

            State.titles.set(data.url, data.title);
            console.log(`Saved ${State.titles.size} titles`);

            if ( State.program && ! dontCache(info) ) {
              const targetInfo = info;
              const fs = Fs;
              const path = Path;
              try {
                await sleep(500);
                await eval(`(async () => {
                  try {
                    ${State.program}
                  } catch(e) {
                    console.warn('Error in program', e, State.program);
                  }
                })();`);
                await sleep(500);
              } catch(e) {
                console.warn(`Error evaluate program`, e);
              }
            }
          }
        }

        const {title, url} = Targets.get(sessionId);
        let id, ndx_id;
        if ( State.Index.has(url) ) {
          ({ndx_id, id} = State.Index.get(url));
        } else {
          Id++;
          id = Id;
        }
        const doc = toNDXDoc({id, url, title, pageText});
        State.Index.set(url, {date:Date.now(),id:doc.id, ndx_id:doc.ndx_id, title});   
        State.Index.set(doc.id, url);
        State.Index.set('ndx'+doc.ndx_id, url);

        const contentSignature = getContentSig(doc);

        //Flex code
        Flex.update(doc.id, contentSignature);

        //New NDX code
        NDX_FTSIndex.update(doc, ndx_id);

        // Fuzzy 
        // eventually we can use this update logic for everyone
        let updateFuzz = true;
        if ( State.Docs.has(url) ) {
          const current = State.Docs.get(url);
          if ( current.contentSignature === contentSignature ) {
            updateFuzz = false;
          }
        }
        if ( updateFuzz ) {
          doc.contentSignature = contentSignature;
          fuzzy.add(doc);
          State.Docs.set(url, doc);
          DEBUG.verboseSlow && console.log({updateFuzz: {doc,url}});
        }

        DEBUG.verboseSlow && console.log("NDX updated", doc.ndx_id);

        UpdatedKeys.add(url);

        DEBUG.verboseSlow && console.log({id: doc.id, title, url, indexed: true});

        State.Indexing.delete(info.targetId);
        State.CrawlIndexing.delete(info.targetId);
      }
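The three `State.Index.set` calls in `indexURL` above key the same map three ways, so an entry can be reached from any handle. A minimal standalone sketch of that scheme (hypothetical `register` helper, for illustration):

```javascript
// The Index map is keyed three ways:
//   url           -> { date, id, ndx_id, title }   (full record)
//   numeric id    -> url                           (Flex result -> url)
//   'ndx' + ndx_id -> url                          (NDX result -> url)
const Index = new Map();
function register({ id, ndx_id, url, title }) {
  Index.set(url, { date: Date.now(), id, ndx_id, title });
  Index.set(id, url);
  Index.set('ndx' + ndx_id, url);
}
```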

      async function attachToTarget({targetInfo}, retryCount = 0) {
        if ( dontInstall(targetInfo) ) return;
        const {url} = targetInfo;
        if ( url && targetInfo.type == 'page' ) {
          try {
            if ( ! targetInfo.attached ) {
              const {sessionId} = (await send("Target.attachToTarget", {
                targetId: targetInfo.targetId,
                flatten: true
              }));
              State.Sessions.set(targetInfo.targetId, sessionId);
            }
          } catch(e) {
            DEBUG.verboseSlow && console.error(`Attach to target failed`, targetInfo);
            if ( retryCount < 3 ) {
              const ms = 1500;
              DEBUG.verboseSlow && console.log(`Retrying attach in ${ms/1000} seconds...`);
              setTimeout(() => attachToTarget({targetInfo}, retryCount + 1), ms);
            } 
          }
        }
      }

      async function cacheRequest(pausedRequest) {
        const {
          requestId, request, resourceType, 
          frameId,
          responseStatusCode, responseHeaders, responseErrorReason
        } = pausedRequest;
        const isNavigationRequest = resourceType == "Document";
        const isFont = resourceType == "Font";

        if ( dontCache(request) ) {
          DEBUG.verboseSlow && console.log("Not caching", request.url);
          send(`Fetch.continue${requestStage}`, {requestId});
          return;
        }
        const key = serializeRequestKey(request);
        if ( Mode == 'serve' ) {
          if ( State.Cache.has(key) ) {
            let {body, responseCode, responseHeaders} = await getResponseData(State.Cache.get(key));
            responseCode = responseCode || 200;
            //DEBUG.verboseSlow && console.log("Fulfilling", key, responseCode, responseHeaders, body.slice(0,140));
            DEBUG.verboseSlow && console.log("Fulfilling", key, responseCode, body.slice(0,140));
            await send("Fetch.fulfillRequest", {
              requestId, body, responseCode, responseHeaders
            });
          } else {
            DEBUG.verboseSlow && console.log("Sending cache stub", key);
            await send("Fetch.fulfillRequest", {
              requestId, ...UNCACHED
            });
          } 
        } else {
          let saveIt = false;
          if ( Mode == 'select' ) {
            const rootFrameURL = getRootFrameURL(frameId);
            const frameDescendsFromBookmarkedURLFrame = hasBookmark(rootFrameURL);
            saveIt = frameDescendsFromBookmarkedURLFrame;
            DEBUG.verboseSlow && console.log({rootFrameURL, frameId, mode, saveIt});
          } else if ( Mode == 'save' ) {
            saveIt = true;
          }
          if ( saveIt ) {
            const response = {key, responseCode: responseStatusCode, responseHeaders};
            const resp = await getBody({requestId, responseStatusCode});
            if ( resp ) {
              let {body, base64Encoded} = resp;
              if ( ! base64Encoded ) {
                body = b64(body);
              }
              response.body = body;
              const responsePath = await saveResponseData(key, request.url, response);
              State.Cache.set(key, responsePath);
            } else {
              DEBUG.verboseSlow && console.warn("get response body error", key, responseStatusCode, responseHeaders, pausedRequest.responseErrorReason);  
              response.body = '';
            }
            //await sleep(DELAY);
            if ( !isFont && responseErrorReason ) {
              if ( isNavigationRequest ) {
                await send("Fetch.fulfillRequest", {
                    requestId,
                    responseHeaders: BLOCKED_HEADERS,
                    responseCode: BLOCKED_CODE,
                    body: Buffer.from(responseErrorReason).toString("base64"),
                  },
                );
              } else {
                await send("Fetch.failRequest", {
                    requestId,
                    errorReason: responseErrorReason
                  },
                );
              }
              return;
            }
          } 
          send(`Fetch.continue${requestStage}`, {requestId}).catch(
            e => console.warn("Issue with continuing request", {e, requestStage, requestId})
          );
        }
      }

      async function getBody({requestId, responseStatusCode}) {
        let resp;
        if ( ! BODYLESS.has(responseStatusCode) ) {
          resp = await send("Fetch.getResponseBody", {requestId});
        } else {
          resp = {body:'', base64Encoded:true};
        }
        return resp;
      }
      
      function dontInstall(targetInfo) {
        return targetInfo.type !== 'page';
      }

      async function getResponseData(path) {
        try {
          return JSON.parse(await Fs.promises.readFile(path));
        } catch(e) {
          console.warn(`Error with ${path}`, e);
          return UNCACHED;
        }
      }

      async function saveResponseData(key, url, response) {
        try {
          const origin = (new URL(url).origin);
          let originDir = State.Cache.get(origin);
          if ( ! originDir ) {
            originDir = Path.resolve(library_path(), origin.replace(TBL, '_'));
            try {
              Fs.mkdirSync(originDir, {recursive:true});
            } catch(e) {
              console.warn(`Issue with origin directory ${originDir}`, e);
            }
            State.Cache.set(origin, originDir);
          } else {
            if ( originDir.includes(':\\\\') ) {
              originDir = originDir.split(/:\\\\/, 2);
              originDir[1] = originDir[1]?.replace?.(TBL, '_');
              originDir = originDir.join(':\\\\');
            }
          }

          const fileName = `${await sha1(key)}.json`;

          const responsePath = Path.resolve(originDir, fileName);
          try {
            await Fs.promises.writeFile(responsePath, JSON.stringify(response,null,2));
          } catch(e) {
            console.warn(`Issue with origin directory or file: ${responsePath}`, e);
          }

          return responsePath;
        } catch(e) {
          console.warn(`Could not save response data`, e);
          return '';
        }
      }

      async function sha1(key) {
        return crypto.createHash('sha1').update(key).digest('hex');
      }
      
      async function rainbow(key) {
        return rainbowHash(128, 0, new Uint8Array(Buffer.from(key)));
      }

      function serializeRequestKey(request) {
        const {url, /*urlFragment,*/ method, /*headers, postData, hasPostData*/} = request;

        /**
        let sortedHeaders = '';
        for( const key of Object.keys(headers).sort() ) {
          sortedHeaders += `${key}:${headers[key]}/`;
        }
        **/

        return `${method}${url}`;
        //return `${url}${urlFragment}:${method}:${sortedHeaders}:${postData}:${hasPostData}`;
      }

      async function startObservingBookmarkChanges() {
        // bookmark observation is currently disabled; the loop below is unreachable
        console.info("Not observing");
        return;
        for await ( const change of bookmarkChanges() ) {
          if ( Mode == 'select' ) {
            switch(change.type) {
              case 'new': {
                  DEBUG.verboseSlow && console.log(change);
                  archiveAndIndexURL(change.url);
                } break;
              case 'delete': {
                  DEBUG.verboseSlow && console.log(change);
                  deleteFromIndexAndSearch(change.url);
                } break;
              default: {
                console.log(`We don't do anything about this bookmark change, currently`, change);
              } break;
            }
          }
        }
      }
    } catch(e) {
      console.error('Error while collecting', e);
    }
  }

// helpers
  function neverCache(url) {
    return !url || url == "about:blank" || url.match(/^(?:chrome|vivaldi|brave|edge)/) || NEVER_CACHE.has(url);
  }

  function dontCache(request) {
    if ( ! request.url ) return true;
    if ( neverCache(request.url) ) return true;
    if ( Mode == 'select' && ! hasBookmark(request.url) ) return true;
    const url = new URL(request.url);
    return NEVER_CACHE.has(url.origin) || !!(State.No && State.No.test(url.host));
  }

  function processDoc({documents, strings}) {
    /* 
      Info
      Implementation Notes 

      1. Code uses spec at: 
        https://chromedevtools.github.io/devtools-protocol/tot/DOMSnapshot/#type-NodeTreeSnapshot

      2. Note that so far the below will NOT produce text for textarea or input
      elements, and therefore we will NOT index them. We can access those via the
      textValue and inputValue array properties of the doc, if we want to implement that.
    */
       
    const texts = [];
    for( const doc of documents) {
      const textIndices = doc.nodes.nodeType.reduce((Indices, type, index) => {
        if ( type === TEXT_NODE ) {
          const parentIndex = doc.nodes.parentIndex[index];
          const forbiddenParent = parentIndex >= 0 && 
            FORBIDDEN_TEXT_PARENT.has(strings[
              doc.nodes.nodeName[
                parentIndex
              ]
            ])
          if ( ! forbiddenParent ) {
            Indices.push(index);
          }
        }
        return Indices;
      }, []);
      textIndices.forEach(index => {
        const stringsIndex = doc.nodes.nodeValue[index];
        if ( stringsIndex >= 0 ) {
          const text = strings[stringsIndex];
          texts.push(text);
        }
      });
    }

    const pageText = texts.filter(t => t.trim()).join(' ');
    DEBUG.verboseSlow && console.log('Page text>>>', pageText);
    return pageText;
  }
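// The processDoc walk above resolves DOMSnapshot's parallel arrays: node fields
// hold integer indices into a shared `strings` table, so text extraction is a
// matter of chasing indices. A minimal sketch with a hand-made mock snapshot
// (the data below is invented and far smaller than a real
// DOMSnapshot.captureSnapshot result):

```javascript
const TEXT_NODE = 3;
const FORBIDDEN_TEXT_PARENT = new Set(['SCRIPT', 'STYLE']);

// shared string table; all node fields below index into this
const strings = ['BODY', 'SCRIPT', '#text', 'Hello world', 'var x = 1;'];
const doc = {
  nodes: {
    nodeType:    [1, 3, 1, 3],     // BODY, text, SCRIPT, text
    nodeName:    [0, 2, 1, 2],     // indices into strings
    nodeValue:   [-1, 3, -1, 4],   // -1 means "no value"
    parentIndex: [-1, 0, 0, 2],    // tree structure as parent pointers
  },
};

const texts = [];
doc.nodes.nodeType.forEach((type, i) => {
  if (type !== TEXT_NODE) return;
  const parent = doc.nodes.parentIndex[i];
  // skip text whose parent element is script/style, as processDoc does
  if (parent >= 0 && FORBIDDEN_TEXT_PARENT.has(strings[doc.nodes.nodeName[parent]])) return;
  const si = doc.nodes.nodeValue[i];
  if (si >= 0) texts.push(strings[si]);
});

console.log(texts.join(' ')); // "Hello world" (the script text is skipped)
```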

  async function isReady() {
    return await untilTrue(() => Status.loaded);
  }

  async function loadFuzzy({fromMemOnly: fromMemOnly = false} = {}) {
    if ( ! fromMemOnly ) {
      const fuzzyDocs = Fs.readFileSync(getFuzzyPath()).toString();
      State.Docs = new Map(JSON.parse(fuzzyDocs).map(doc => {
        doc.i_url = getURI(doc.url);
        doc.contentSignature = getContentSig(doc);
        return [doc.url, doc];
      }));
    }
    State.Fuzzy = fuzzy = new Fuzzy({source: [...State.Docs.values()], keys: FUZZ_OPTS.keys});
    DEBUG.verboseSlow && console.log('Fuzzy loaded');
  }

  function getContentSig(doc) {
    // title is included twice, seemingly to up-weight title matches
    return doc.title + ' ' + doc.title + ' ' + doc.content + ' ' + getURI(doc.url);
  }

  function getURI(url) {
    return url.split(URI_SPLIT).join(' ');
  }

  function saveFuzzy(basePath) {
    const docs = [...State.Docs.values()]
      .map(({url, title, content, id}) => ({url, title, content, id}));
    if ( docs.length === 0 ) return;
    const path = getFuzzyPath(basePath);
    Fs.writeFileSync(
      path,
      JSON.stringify(docs, null, 2)
    );
    DEBUG.verboseSlow && console.log(`Wrote fuzzy to ${path}`);
  }

  function clearSavers() {
    if ( State.saver ) {
      clearInterval(State.saver);
      State.saver = null;
    }

    if ( State.indexSaver ) {
      clearTimeout(State.indexSaver);
      State.indexSaver = null;
    }

    if ( State.ftsIndexSaver ) {
      clearTimeout(State.ftsIndexSaver);
      State.ftsIndexSaver = null;
    }
  }

  async function loadFiles() {
    let cacheFile = CACHE_FILE();
    let indexFile = INDEX_FILE();
    let ftsDir = FTS_INDEX_DIR();
    let someError = false;

    try {
      State.Cache = new Map(JSON.parse(Fs.readFileSync(cacheFile)));
    } catch(e) {
      console.warn(e+'');
      State.Cache = new Map();
      someError = true;
    }

    try {
      State.Index = new Map(JSON.parse(Fs.readFileSync(indexFile)));
    } catch(e) {
      console.warn(e+'');
      State.Index = new Map();
      someError = true;
    }

    try {
      const flexBase = getFlexBase();
      Fs.readdirSync(flexBase, {withFileTypes:true}).forEach(dirEnt => {
        if ( dirEnt.isFile() ) {
          const content = Fs.readFileSync(Path.resolve(flexBase, dirEnt.name)).toString();
          Flex.import(dirEnt.name, JSON.parse(content));
        }
      });
      DEBUG.verboseSlow && console.log('Flex loaded');
    } catch(e) {
      console.warn(e+'');
      someError = true;
    }

    try {
      loadNDXIndex(NDX_FTSIndex);
    } catch(e) {
      console.warn(e+'');
      someError = true;
    }

    try {
      await loadFuzzy();
    } catch(e) {
      console.warn(e+'');
      someError = true;
    }

    if ( someError ) {
      const rl = readline.createInterface({input, output});
      const question = util.promisify(rl.question).bind(rl);
      console.warn('Error reading archive file. Your archive directory is corrupted. We will attempt to patch it so you can use it going forward, but because we replace missing or corrupt index, cache, or full-text search index files with new blank copies, existing resources already indexed and cached may become inaccessible from your new index. A future version of this software should be able to more completely repair your archive directory, reconnecting and re-indexing all cached resources, and notifying you about, and purging from the index, any missing resources.\n');
      console.log('Sorry about this, we are not sure why this happened, but we know this must be very distressing for you.\n');
      console.log(`For your information, the corrupted archive directory is at: ${args.getBasePath()}\n`);
      console.info('Because this repair as described above is not a perfect solution, we will give you a choice of how to proceed. You have two options: 1) attempt a basic repair that may leave some resources inaccessible from the repaired archive, or 2) do not touch the corrupted archive, but instead create a new fresh blank archive to begin saving to. Which option would you like to proceed with?');
      console.log('1) Basic repair with possible inaccessible pages');
      console.log('2) Leave the corrupt archive untouched, start a new archive');
      let correctAnswer = false;
      let newBasePath = '';
      while(!correctAnswer) {
        let answer = await question('Which option would you like (1 or 2)? ');
        answer = parseInt(answer);
        switch(answer) {
          case 1: {
            console.log('Alright, selecting option 1. Using the existing archive and applying a simple repair.');
            newBasePath = args.getBasePath();
            correctAnswer = true;
          } break;
          case 2: {
            console.log('Alright, selecting option 2. Leaving the existing archive alone and creating a new, fresh, blank archive.');
            let correctAnswer2 = false;
            while( ! correctAnswer2 ) {
              try {
                newBasePath = Path.resolve(os.homedir(), await question(
                  'Please enter a directory name for your new archive.\n' +
                  `${os.homedir()}/`
                ));
                correctAnswer2 = true;
              } catch(e2) {
                console.warn(e2);
                console.info('Sorry that was not a valid directory name.');
                await question('enter to continue');
              }
            }
            correctAnswer = true;
          } break;
          default: {
            correctAnswer = false;
            console.log('Sorry, that was not a valid option. Please input 1 or 2.');
          } break;
        }
      }
      console.log('Resetting base path', newBasePath);
      args.updateBasePath(newBasePath, {force:true, before: [
        () => Archivist.beforePathChanged(newBasePath, {force:true})
      ]});
      saveFiles({forceSave:true});
    }

    Id = Math.round(State.Index.size / 2) + 3;
    NDXId = State.Index.has(NDX_ID_KEY) ? State.Index.get(NDX_ID_KEY) + 1003000 : (Id + 1000000);
    if ( !Number.isInteger(NDXId) ) NDXId = Id;
    DEBUG.verboseSlow && console.log({firstFreeId: Id, firstFreeNDXId: NDXId});

    State.SavedCacheFilePath = cacheFile;
    State.SavedIndexFilePath = indexFile;
    State.SavedFTSIndexDirPath = ftsDir;
    DEBUG.verboseSlow && console.log(`Loaded cache key file ${cacheFile}`);
    DEBUG.verboseSlow && console.log(`Loaded index file ${indexFile}`);
    DEBUG.verboseSlow && console.log(`Need to load FTS index dir ${ftsDir}`);

    try {
      if ( !Fs.existsSync(NO_FILE()) ) {
        DEBUG.verboseSlow && console.log(`The 'No file' (${NO_FILE()}) does not exist, ignoring...`); 
        State.No = null;
      } else {
        State.No = new RegExp(JSON.parse(Fs.readFileSync(NO_FILE()))
          .join('|')
          .replace(/\./g, '\\.')
          .replace(/\*/g, '.*')
          .replace(/\?/g, '.?')
        );
      }
    } catch(e) {
      DEBUG.verboseSlow && console.warn('Error compiling regex from No file', e);
      State.No = null;
    }
  }
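// The 'No file' handling above compiles glob-like host patterns into one RegExp:
// dots are escaped, '*' widens to '.*', and '?' relaxes to '.?'. Note the result
// is unanchored, so State.No.test(url.host) blocks any host *containing* a match.
// A sketch with invented sample patterns (not part of the repository):

```javascript
function compileNoPatterns(patterns) {
  return new RegExp(patterns
    .join('|')
    .replace(/\./g, '\\.')   // literal dots
    .replace(/\*/g, '.*')    // glob star -> any run of characters
    .replace(/\?/g, '.?')    // glob '?' -> zero or one character
  );
}

const no = compileNoPatterns(['*.ads.example.com', 'tracker?.example.net']);
console.log(no.test('media.ads.example.com')); // true
console.log(no.test('example.org'));           // false
```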

  function getMode() { return Mode; }

  function saveFiles({useState: useState = false, forceSave:forceSave = false} = {}) {
    if ( State.Index.size === 0 ) return;
    clearSavers();
    State.Index.set(NDX_ID_KEY, NDXId);
    if ( useState ) {
      // save to the previously loaded file paths
      saveCache(State.SavedCacheFilePath);
      saveIndex(State.SavedIndexFilePath);
      saveFTS(State.SavedFTSIndexDirPath, {forceSave});
    } else {
      saveCache();
      saveIndex();
      saveFTS(null, {forceSave});
    }
  }

  async function changeMode(mode) { 
    saveFiles({forceSave:true});
    Mode = mode;
    await collect({chrome_port:args.chrome_port, mode});
    DEBUG.verboseSlow && console.log('Mode changed', Mode);
  }

  function getDetails(id) {
    const url = State.Index.get(id);
    const {title} = State.Index.get(url);
    const {content} = State.Docs.get(url);
    return {url, title, id, content};
  }

  function findOffsets(query, doc, maxLength = 0) {
    if ( maxLength ) {
      doc = Array.from(doc).slice(0, maxLength).join('');
    }
    Object.assign(fuzzy.options, HIGHLIGHT_OPTIONS_FUZZY);
    const hl = fuzzy.highlight(doc); 
    DEBUG.verboseSlow && console.log(query, hl, maxLength);
    return hl;
  }

  function beforePathChanged(new_path, {force: force = false} = {}) {
    const currentBasePath = args.getBasePath();
    if ( !force && (currentBasePath == new_path) ) {
      return false;
    }
    saveFiles({useState:true, forceSave:true});
    // clear all memory cache, index and full text indexes
    State.Index.clear();
    State.Cache.clear();
    State.Docs.clear();
    State.NDX_FTSIndex = NDX_FTSIndex = new NDXIndex(NDX_FIELDS);
    State.Flex = Flex = new FTSIndex(FLEX_OPTS);
    State.fuzzy = fuzzy = new Fuzzy({source: [...State.Docs.values()], keys: FUZZ_OPTS.keys});
    return true;
  }

  async function afterPathChanged() { 
    DEBUG.verboseSlow && console.log({libraryPathChange:args.library_path()});
    saveFiles({useState:true, forceSave:true});
    // reloads from new path and updates Saved FilePaths
    await loadFiles();
  }

  function saveCache(path) {
    //DEBUG.verboseSlow && console.log("Writing to", path || CACHE_FILE());
    if ( State.Cache.size === 0 ) return;
    Fs.writeFileSync(path || CACHE_FILE(), JSON.stringify([...State.Cache.entries()],null,2));
  }

  function saveIndex(path) {
    if ( State.saveInProgress || Mode == 'serve' ) return;
    if ( State.Index.size === 0 ) return;
    State.saveInProgress = true;

    clearTimeout(State.indexSaver);

    DEBUG.verboseSlow && console.log(
      `INDEXLOG: Writing Index (size: ${State.Index.size}) to`, path || INDEX_FILE()
    );
    //DEBUG.verboseSlow && console.log([...State.Index.entries()].sort(SORT_URLS));
    Fs.writeFileSync(
      path || INDEX_FILE(), 
      JSON.stringify([...State.Index.entries()].sort(SORT_URLS),null,2)
    );

    State.indexSaver = setTimeout(saveIndex, 11001);

    State.saveInProgress = false;
  }

  function getIndex() {
    const idx = JSON.parse(Fs.readFileSync(INDEX_FILE()))
      .filter(([key]) => typeof key === 'string' && !hiddenKey(key))
      .sort(([,{date:a}], [,{date:b}]) => b-a);
    DEBUG.verboseSlow && console.log(idx);
    return idx;
  }

  async function deleteFromIndexAndSearch(url) {
    if ( State.Index.has(url) ) {
      const {id, ndx_id, title, /*date,*/} = State.Index.get(url);
      // delete index entries
      State.Index.delete(url); 
      State.Index.delete(id);
      State.Index.delete('ndx'+ndx_id);
      // delete FTS entries (where we can)
      State.NDX_FTSIndex.remove(ndx_id);
      State.Flex.remove(id);
      State.Docs.delete(url);
      // save it all (to ensure we don't load data from disk that contains deleted entries)
      saveFiles({forceSave:true});
      // and just rebuild the whole FTS index (where we must)
      await loadFuzzy({fromMemOnly:true});
      return {title};
    }
  }

  async function search(query) {
    const flex = (await Flex.searchAsync(query, args.results_per_page))
      .map(id=> ({id, url: State.Index.get(id)}));
    const ndx = NDX_FTSIndex.search(query)
      .map(r => ({
        ndx_id: r.key, 
        url: State.Index.get('ndx'+r.key), 
        score: r.score
      }));
    Object.assign(fuzzy.options, REGULAR_SEARCH_OPTIONS_FUZZY);
    const fuzzRaw = fuzzy.search(query);
    const fuzz = processFuzzResults(fuzzRaw);

    const results = combineResults({flex, ndx, fuzz});
    //console.log({flex,ndx,fuzz});
    const ids = new Set(results);

    const HL = new Map();
    const highlights = fuzzRaw.filter(({id}) => ids.has(id)).map(obj => {
      const title = State.Index.get(obj.url)?.title;
      return {
        id: obj.id,
        url: Archivist.findOffsets(query, obj.url, MAX_URL_LENGTH) || obj.url,
        title: Archivist.findOffsets(query, title, MAX_TITLE_LENGTH) || title,
      };
    });
    highlights.forEach(hl => HL.set(hl.id, hl));

    return {query,results, HL};
  }

  function combineResults({flex,ndx,fuzz}) {
    DEBUG.verboseSlow && console.log({flex,ndx,fuzz});
    const score = {};
    flex.forEach(countRank(score));
    ndx.forEach(countRank(score));
    fuzz.forEach(countRank(score));
    DEBUG.verboseSlow && console.log(score);
  
    const results = [...Object.values(score)].map(obj => {
      try {
        const {id} = State.Index.get(obj.url); 
        obj.id = id;
        return obj;
      } catch(e) {
        DEBUG.verboseSlow && console.log({obj, index:State.Index, e, ndx, flex, fuzz});
        console.error("Error", e);
        return obj;
      }
    });
    results.sort(({score:scoreA}, {score:scoreB}) => scoreB-scoreA);
    DEBUG.verboseSlow && console.log(results);
    const resultIds = results.map(({id}) => id).filter(v => !!v);
    return resultIds;
  }

  function countRank(record, weight = 1.0) {
    return ({url, score:res_score = 1.0}, rank, all) => {
      let result = record[url];
      if ( ! result ) {
        result = record[url] = {
          url,
          score: 0
        };
      }

      result.score += res_score*weight*(all.length - rank)/all.length;
    };
  }
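// combineResults/countRank above implement a rank-weighted vote: each engine's
// hit list contributes score * weight * (n - rank) / n per URL, so earlier ranks
// count more and URLs surfaced by several engines accumulate score. Worked
// through with invented URLs:

```javascript
function countRank(record, weight = 1.0) {
  return ({url, score: res_score = 1.0}, rank, all) => {
    const result = record[url] ||= {url, score: 0};
    result.score += res_score * weight * (all.length - rank) / all.length;
  };
}

const score = {};
// engine 1 returns two hits (default score 1 each)
[{url: 'https://a.example'}, {url: 'https://b.example'}].forEach(countRank(score));
// engine 2 returns one scored hit
[{url: 'https://b.example', score: 2}].forEach(countRank(score));

// a: 1 * (2-0)/2 = 1.0;  b: 1 * (2-1)/2 + 2 * (1-0)/1 = 2.5
const top = Object.values(score).sort((x, y) => y.score - x.score)[0];
console.log(top.url); // "https://b.example"
```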

  function processFuzzResults(docs) {
    const docIds = docs.map(({id}) => id); 
    const uniqueIds = new Set(docIds);
    return [...uniqueIds.keys()].map(id => ({id, url:State.Index.get(id)}));
  }

  async function saveFTS(path = undefined, {forceSave:forceSave = false} = {}) {
    if ( State.ftsSaveInProgress || Mode == 'serve' ) return;
    State.ftsSaveInProgress = true;

    clearTimeout(State.ftsIndexSaver);

    DEBUG.verboseSlow && console.log("Writing FTS index to", path || FTS_INDEX_DIR());
    const dir = path || FTS_INDEX_DIR();

    if ( forceSave || UpdatedKeys.size ) {
      DEBUG.verboseSlow && console.log(`${UpdatedKeys.size} keys updated since last write`);
      const flexBase = getFlexBase(dir);
      Flex.export((key, data) => {
        key = key.split('.').pop();
        try {
          Fs.writeFileSync(
            Path.resolve(flexBase, key),
            JSON.stringify(data, null, 2)
          );
        } catch(e) {
          console.error('Error writing full text search index', e);
        }
      });
      DEBUG.verboseSlow && console.log(`Wrote Flex to ${flexBase}`);
      NDX_FTSIndex.save(dir);
      saveFuzzy(dir);
      UpdatedKeys.clear();
    } else {
      DEBUG.verboseSlow && console.log("No FTS keys updated, no writes needed this time.");
    }

    State.ftsIndexSaver = setTimeout(saveFTS, 31001);
    State.ftsSaveInProgress = false;
  }

  function shutdown(then) {
    DEBUG.verboseSlow && console.log(`Archivist shutting down...`);  
    saveFiles({forceSave:true});
    Close && Close();
    DEBUG.verboseSlow && console.log(`Archivist shut down.`);
    return then && then();
  }

  function b64(s) {
    return Buffer.from(s).toString('base64');
  }

  function NDXIndex(fields) {
    let retVal;

    // source: 
      // adapted from:
      // https://github.com/ndx-search/docs/blob/94530cbff6ae8ea66c54bba4c97bdd972518b8b4/README.md#creating-a-simple-indexer-with-a-search-function

    if ( ! new.target ) { throw `NDXIndex must be called with 'new'`; }

    // `createIndex()` creates an index data structure.
    // First argument specifies how many different fields we want to index.
    const index = NDX(fields.length);
    // `fieldAccessors` is an array of functions used to retrieve data from the different fields.
    const fieldAccessors = fields.map(f => doc => doc[f.name]);
    const fieldBoostFactors = fields.map(f => f.boost);
    
    retVal = {
      index,
      // `add()` function will add documents to the index.
      add: doc => ndx(
        retVal.index,
        fieldAccessors,
        // Tokenizer is a function that breaks text into words, phrases, symbols, or other meaningful elements
        // called tokens.
        // Lodash function `words()` splits string into an array of its words, see https://lodash.com/docs/#words for
        // details.
        words,
        // Filter is a function that processes tokens and returns terms, terms are used in Inverted Index to
        // index documents.
        termFilter,
        // Document key: it can be a unique document id, or a reference to the document if you want to store all documents
        // in memory.
        doc.ndx_id,
        // Document.
        doc,
      ),
      remove: id => {
        removeDocumentFromIndex(retVal.index, NDXRemoved, id);
        maybeClean();
      },
      update: (doc, old_id) => {
        retVal.remove(old_id);
        retVal.add(doc);
      },
      // `search()` function will be used to perform queries.
      search: q => NDXQuery(
        retVal.index,
        fieldBoostFactors,
        // BM25 ranking function constants:
        1.2,  // BM25 k1 constant, controls non-linear term frequency normalization (saturation).
        0.75, // BM25 b constant, controls to what degree document length normalizes tf values.
        words,
        termFilter,
        // Set of removed documents. Unlike the upstream example (which passes
        // `undefined` to disable removal), we pass NDXRemoved so tombstoned documents are excluded.
        NDXRemoved, 
        q,
      ),
      save: (basePath) => {
        maybeClean(true);
        const obj = toSerializable(retVal.index);
        const objStr = JSON.stringify(obj, null, 2);
        const path = getNDXPath(basePath);
        Fs.writeFileSync(
          path,
          objStr
        );
        DEBUG.verboseSlow && console.log("Write NDX to ", path);
      },
      load: newIndex => {
        retVal.index = newIndex;
      }
    };

    DEBUG.verboseSlow && console.log('ndx setup', {retVal});
    return retVal;

    function maybeClean(doIt = false) {
      if ( (doIt && NDXRemoved.size) || NDXRemoved.size >= REMOVED_CAP_TO_VACUUM_NDX ) {
        vacuumIndex(retVal.index, NDXRemoved);
      }
    }
  }
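// NDXIndex's remove/maybeClean above defer physical deletion: removed ids are
// tombstoned in NDXRemoved, and the index is only vacuumed once the set passes
// REMOVED_CAP_TO_VACUUM_NDX or on save. A library-free sketch of the same
// tombstone-then-vacuum idea over a plain Map (the real code needs the ndx package):

```javascript
const REMOVED_CAP = 3; // stand-in for REMOVED_CAP_TO_VACUUM_NDX
const removed = new Set();
const docs = new Map([[1, 'a'], [2, 'b'], [3, 'c'], [4, 'd']]);

function remove(id) {
  removed.add(id);   // tombstone only; no physical delete yet
  maybeClean();
}

function search() {
  // queries just skip tombstoned ids
  return [...docs.keys()].filter(id => !removed.has(id));
}

function maybeClean(doIt = false) {
  if ((doIt && removed.size) || removed.size >= REMOVED_CAP) {
    for (const id of removed) docs.delete(id); // the actual vacuum
    removed.clear();
  }
}

remove(1); remove(2);
console.log(docs.size, search().length); // 4 2  (tombstoned, not yet vacuumed)
remove(3);                               // hits the cap, triggers the vacuum
console.log(docs.size, search().length); // 1 1
```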

  function loadNDXIndex(ndxFTSIndex) {
    if ( Fs.existsSync(getNDXPath()) ) {
      const indexContent = Fs.readFileSync(getNDXPath()).toString();
      const index = fromSerializable(JSON.parse(indexContent));
      ndxFTSIndex.load(index);
    }
    DEBUG.verboseSlow && console.log('NDX loaded');
  }

  function toNDXDoc({id, url, title, pageText}) {
    // use existing defined id or a new one
    return {
      id, 
      ndx_id: NDXId++,
      url,
      i_url: getURI(url),
      title, 
      content: pageText
    };
  }

  function ndxDocFields({namesOnly:namesOnly = false} = {}) {
    if ( !namesOnly && !NDX_OLD ) {
      /* new format (for newer ndx >= v1) */
      return [
        /* we index over the special indexable url field, not the regular url field */
        { name: "title", boost: 1.3 },
        { name: "i_url", boost: 1.15 }, 
        { name: "content", boost: 1.0 },
      ];
    } else {
      /* old format (for older ndx ~ v0.4) */
      return [
        "title",
        "i_url",
        "content"
      ];
    }
  }

  async function untilHas(thing, key, {timeout: timeout = false} = {}) {
    if ( thing instanceof Map ) {
      if ( thing.has(key) ) {
        return thing.get(key);
      } else {
        let resolve;
        const pr = new Promise(res => resolve = res);
        const then = Date.now();
        const checker = setInterval(() => {
          const now = Date.now();
          if ( thing.has(key) || (timeout && (now-then) >= timeout) ) {
            clearInterval(checker);
            resolve(thing.get(key));
          } else {
            DEBUG.verboseSlow && console.log(thing, "not have", key);
          }
        }, CHECK_INTERVAL);

        return pr;
      }
    } else if ( thing instanceof Set ) {
      if ( thing.has(key) ) {
        return true;
      } else {
        let resolve;
        const pr = new Promise(res => resolve = res);
        const then = Date.now();
        const checker = setInterval(() => {
          const now = Date.now();
          if ( thing.has(key) || (timeout && (now-then) >= timeout) ) {
            clearInterval(checker);
            resolve(true);
          } else {
            DEBUG.verboseSlow && console.log(thing, "not have", key);
          }
        }, CHECK_INTERVAL);

        return pr;
      }
    } else if ( typeof thing === "object" ) {
      if ( thing[key] ) {
        return true;
      } else {
        let resolve;
        const pr = new Promise(res => resolve = res);
        const then = Date.now();
        const checker = setInterval(() => {
          const now = Date.now();
          if ( thing[key] || (timeout && (now-then) >= timeout) ) {
            clearInterval(checker);
            resolve(true);
          } else {
            DEBUG.verboseSlow && console.log(thing, "not have", key);
          }
        }, CHECK_INTERVAL);

        return pr;
      }
    } else {
      throw new TypeError(`untilHas with thing of type ${typeof thing} is not yet implemented!`);
    }
  }

  function getNDXPath(basePath) {
    return Path.resolve(args.ndx_fts_index_dir(basePath), 'index.ndx');
  }

  function getFuzzyPath(basePath) {
    return Path.resolve(args.fuzzy_fts_index_dir(basePath), 'docs.fzz');
  }

  function getFlexBase(basePath) {
    return args.flex_fts_index_dir(basePath);
  }

  function addFrameNode(observedFrame) {
    const {frameId, parentFrameId} = observedFrame;
    const node = {
      id: frameId,
      parentId: parentFrameId,
      parent: State.FrameNodes.get(parentFrameId)
    };

    DEBUG.verboseSlow && console.log({observedFrame});

    State.FrameNodes.set(node.id, node);

    return node;
  }

  function updateFrameNode(frameNavigated) {
    const {
      frame: {
        id: frameId, 
        parentId, url: rawUrl, urlFragment, 
        /*
        domainAndRegistry, unreachableUrl, 
        adFrameStatus
        */
      }
    } = frameNavigated;
    const url = urlFragment?.startsWith(rawUrl.slice(0,4)) ? urlFragment : rawUrl;
    let frameNode = State.FrameNodes.get(frameId);

    DEBUG.verboseSlow && console.log({frameNavigated});

    if ( ! frameNode ) {
      // Note
        // This is not actually a panic because
        // it can happen. It may just mean 
        // this isn't a sub frame.
        // So rather than panicking:
          /*
          throw new TypeError(
            `Sanity check failed: frameId ${
              frameId
            } is not in our FrameNodes data, which currently has ${
              State.FrameNodes.size
            } entries.`
          );
          */
        // We do this instead (just add it):
      frameNode = addFrameNode({frameId, parentFrameId: parentId});
    }

    if ( frameNode.id !== frameId ) {
      throw new TypeError(
        `Sanity check failed: Child frameId ${
          frameNode.frameId
        } was supposed to be ${
          frameId
        }`
      );
    }

    // Note:
      // use the urlFragment (a URL + the hash fragment identifier) 
      // only if it's actually a URL

    // Update frame node url (and possible parent)
      frameNode.url = url;
      if ( parentId !== frameNode.parentId ) {
        console.info(`Interesting. Frame parent changed from ${frameNode.parentId} to ${parentId}`);
        frameNode.parentId = parentId;
        frameNode.parent = State.FrameNodes.get(parentId);
        if ( parentId && !frameNode.parent ) {
          throw new TypeError(
            `!! FrameNode ${
              frameId
            } uses parentId ${
              parentId
            } but we don't have any record of ${
              parentId
            } in our FrameNodes data`
          );
        }
      }

    // comment out these details but reserve for possible future use
      /*
      frameNode.detail = {
        unreachableUrl, urlFragment,  
        domainAndRegistry, adFrameStatus
      };
      */
  }

  /*
  function removeFrameNode(frameDetached) {
    const {frameId, reason} = frameDetached;
    throw new TypeError(`removeFrameNode is not implemented`);
  }
  */

  function getRootFrameURL(frameId) {
    let frameNode = State.FrameNodes.get(frameId);
    if ( ! frameNode ) {
      DEBUG.verboseSlow && console.warn(new TypeError(
        `Sanity check failed: frameId ${
          frameId
        } is not in our FrameNodes data, which currently has ${
          State.FrameNodes.size
        } entries.`
      ));
      return;
    }
    if ( frameNode.id !== frameId ) {
      throw new TypeError(
        `Sanity check failed: Child frameId ${
          frameNode.id
        } was supposed to be ${
          frameId
        }`
      );
    }
    while(frameNode.parent) {
      frameNode = frameNode.parent;
    }
    return frameNode.url;
  }

// crawling
  async function archiveAndIndexURL(url, {
      crawl, 
      createIfMissing:createIfMissing = false, 
      timeout, 
      depth, 
      TargetId,
      program,
    } = {}) {
      DEBUG.verboseSlow && console.log('ArchiveAndIndex', url, {crawl, createIfMissing, timeout, depth, TargetId, program});
      if ( Mode == 'serve' ) {
        throw new TypeError(`archiveAndIndexURL can not be used in 'serve' mode.`);
      }
      if ( program ) {
        State.program = program;
      }
      let targetId = TargetId;
      let sessionId;
      if ( ! dontCache({url}) ) {
        const {send, on, close} = State.connection;
        const {targetInfos:targs} = await send("Target.getTargets", {});
        const targets = targs.reduce((M,T) => {
          M.set(T.url, T);
          M.set(T.targetId, T);
          return M;
        }, new Map);
        DEBUG.verboseSlow && console.log('Targets', targets);
        if ( targets.has(url) || targets.has(targetId) ) {
          DEBUG.verboseSlow && console.log('We have target', url, targetId);
          const targetInfo = targets.get(url) || targets.get(targetId);
          ({targetId} = targetInfo);
          if ( crawl && ! State.CrawlData.has(targetId) ) {
            State.CrawlIndexing.add(targetId);
            State.CrawlData.set(targetId, {depth, links:[]});
            if ( State.visited.has(url) ) {
              return [];
            } else {
              State.visited.add(url);
            }
          }
          sessionId = State.Sessions.get(targetId);
          DEBUG.verboseSlow && console.log(
            "Reloading to archive and index in select (Bookmark) mode", 
            url
          );
          if ( State.program && ! dontCache(targetInfo) ) {
            const fs = Fs;
            const path = Path;
            try {
              await sleep(500);
              await eval(`(async () => {
                try {
                  ${State.program}
                } catch(e) {
                  console.warn('Error in program', e, State.program);
                }
              })();`);
              await sleep(500);
            } catch(e) {
              console.warn(`Error evaluating program`, e);
            }
          }

          await untilTrue(async () => {
            const {result:{value:loaded}} = await send("Runtime.evaluate", {
              expression: `(function () {
                return document.readyState === 'complete'; 
              }())`,
              returnByValue: true
            }, sessionId);
            DEBUG.verboseSlow && console.log({loaded, targetInfo});
            return loaded;
          });
          //send("Page.stopLoading", {}, sessionId);
          send("Page.reload", {}, sessionId);
          if ( crawl ) {
            let resolve;
            const pageLoaded = new Promise(res => resolve = res).then(() => sleep(1000));
            {
              on("Page.loadEventFired", resolve);
              //console.log(targets, targetId, targets.get(targetId));
              const {result:{value:loaded}} = await send("Runtime.evaluate", {
                expression: `(function () {
                  return document.readyState === 'complete'; 
                }())`,
                returnByValue: true
              }, sessionId);
              if ( loaded ) {
                resolve(true);
              }
            }
            let notifyStable;
            const pageHTMLStabilized = new Promise(res => notifyStable = res);
            setTimeout(async () => {
              const timeout = MAX_TIME_PER_PAGE / 4;
              const checkDurationMsecs = 1618;
              const maxChecks = timeout / checkDurationMsecs;
              let lastSize = 0;
              let checkCounts = 1;
              let countStableSizeIterations = 0;
              const minStableSizeIterations = 3;

              while(checkCounts++ <= maxChecks) {
                const flatDoc = await send("DOMSnapshot.captureSnapshot", {
                  computedStyles: [],
                }, sessionId);
                const pageText = processDoc(flatDoc).replace(STRIP_CHARS, ' ');
                const currentSize = pageText.length;

                if(lastSize != 0 && currentSize == lastSize) 
                  countStableSizeIterations++;
                else 
                  countStableSizeIterations = 0; //reset the counter

                if(countStableSizeIterations >= minStableSizeIterations) {
                  notifyStable(true);
                  break; // page text size stable long enough; stop polling
                }

                lastSize = currentSize;
                await sleep(checkDurationMsecs);
              }

              notifyStable(false);
            }, 0);

            await pageLoaded;
            
            if ( State.program && ! dontCache(targetInfo) ) {
              const fs = Fs;
              const path = Path;
              try {
                await sleep(500);
                await eval(`(async () => {
                  try {
                    ${State.program}
                  } catch(e) {
                    console.warn('Error in program', e, State.program);
                  }
                })();`);
                await sleep(500);
              } catch(e) {
                console.warn(`Error evaluating program`, e);
              }
            }

            await Promise.race([
              Promise.all([
                pageHTMLStabilized,
                untilTrue(() => !State.CrawlIndexing.has(targetId), timeout/5, timeout),
                sleep(State.minPageCrawlTime || MIN_TIME_PER_PAGE)
              ]),
              sleep(State.maxPageCrawlTime || MAX_TIME_PER_PAGE)
            ]);

            console.log(`Closing page ${url}, at target ${targetId}`);

            await send("Target.closeTarget", {targetId});
            State.CrawlTargets.delete(targetId);
          }
        } else if ( createIfMissing ) {
          DEBUG.verboseSlow && console.log('We create target', url);
          try {
            targetId = null;
            ({targetId} = await send("Target.createTarget", {
              url: `${GO_SECURE ? 'https://localhost' : 'http://127.0.0.1'}:${args.server_port}/redirector.html?url=${
                encodeURIComponent(url)
              }`
            }));
          } catch(e) {
            console.warn("Error creating new tab for url", url, e);
            return;
          }
          if ( crawl && ! State.CrawlData.has(targetId) ) {
            State.CrawlTargets.add(targetId);
            State.CrawlIndexing.add(targetId);
            State.CrawlData.set(targetId, {depth, links:[]});
          }
          return archiveAndIndexURL(url, {
            crawl, timeout, depth, createIfMissing: false, /* prevent redirect loops */
            TargetId: targetId,
            program,
          });
        }
      } else {
        DEBUG.verboseSlow && console.warn(
          `archiveAndIndexURL called in mode ${
            Mode
           } for URL ${
            url
           } but that URL is not in our Bookmarks list.`
        );
      }
      if ( crawl && State.CrawlData.has(targetId) ) {
        const {links} = State.CrawlData.get(targetId);
        console.log({targetId,links});
        State.CrawlData.delete(targetId);
        return links;
      } else {
        return [];
      }
  }

  export async function startCrawl({
    urls, timeout, depth, saveToFile: saveToFile = false,
    batchSize,
    minPageCrawlTime, 
    maxPageCrawlTime,
    program,
  } = {}) {
    if ( State.crawling ) {
      console.log('Already crawling...');
      return;
    }
    if ( saveToFile ) {
      logName = `crawl-${(new Date).toISOString()}.urls.txt`; 
      logStream = Fs.createWriteStream(Path.resolve(args.CONFIG_DIR, logName), {flags:'as+'});
    }
    console.log('StartCrawl', {urls, timeout, depth, batchSize, saveToFile, minPageCrawlTime, maxPageCrawlTime, program});
    State.crawling = true;
    State.crawlDepth = depth;
    State.crawlTimeout = timeout;
    State.visited = new Set();
    Object.assign(State,{
      batchSize,
      minPageCrawlTime,
      maxPageCrawlTime
    });
    const batch_sz = State.batchSize || BATCH_SIZE;
    let totalBytes = 0;
    setTimeout(async () => {
      try {
        while(urls.length >= batch_sz) {
          const jobs = [];
          const batch = urls.splice(urls.length-batch_sz,batch_sz);
          console.log({urls, batch});
          for( let i = 0; i < batch_sz; i++ ) {
            const {depth,url} = batch.shift();
            const pr = archiveAndIndexURL(
              url, 
              {crawl: true, depth, timeout, createIfMissing:true, getLinks: depth >= 1, program}
            );
            jobs.push(pr);
          }
          const links = (await Promise.all(jobs)).flat().filter(({url}) => !Q.has(url));
          if ( links.length ) {
            urls.push(...links);
            links.forEach(({url}) => Q.add(url)); 
          }
        }
        while(urls.length) {
          const {depth,url} = urls.pop();
          const links = (await archiveAndIndexURL(
            url, 
            {crawl: true, depth, timeout, createIfMissing:true, getLinks: depth >= 1, program}
          )).filter(({url}) => !Q.has(url));
          console.log(links, Q);
          if ( links.length ) {
            urls.push(...links);
            links.forEach(({url}) => Q.add(url)); 
          }
        }
      } catch(e) {
        console.warn(e);
        throw new RichError({status:500, message: e.message});
      } finally {
        await untilTrue(() => State.CrawlData.size === 0 && State.CrawlTargets.size === 0, -1);
        State.crawling = false;
        State.crawlDepth = false;
        State.crawlTimeout = false;
        State.visited = false;
        if ( saveToFile ) {
          logStream.close();
          totalBytes = logStream.bytesWritten;
          console.log(`Wrote ${totalBytes} bytes of URLs to ${logName}`);
        }
        console.log(`Crawl finished.`);
      }
    }, 0);
  }
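
The crawl code above decides a page has finished rendering by snapshotting the DOM repeatedly and waiting for the extracted text length to stay unchanged for several consecutive checks (three, with the polling cadence set by checkDurationMsecs). A minimal standalone sketch of that stability rule, reduced to a pure function over the observed sizes (hypothetical helper name, not part of the repo):

```javascript
// Sketch (not repo code): given the sequence of text sizes observed on each
// poll, return the index of the poll at which the page counted as "stable"
// (size unchanged for `minStableIterations` consecutive polls), or -1 if it
// never stabilized within the observed polls.
function stableAfter(sizes, minStableIterations = 3) {
  let last = 0;
  let stableCount = 0;
  for (let i = 0; i < sizes.length; i++) {
    const current = sizes[i];
    if (last !== 0 && current === last) {
      stableCount++;        // size unchanged since the previous poll
    } else {
      stableCount = 0;      // size changed (or first poll): reset the counter
    }
    if (stableCount >= minStableIterations) {
      return i;             // stability reached on this poll
    }
    last = current;
  }
  return -1;                // still growing/shrinking when polling stopped
}
```

In the real loop the same counter gates `notifyStable(true)`, racing against the page-load event and the per-page time budget.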


================================================
FILE: src/args.js
================================================
import os from 'os';
import path from 'path';
import fs from 'fs';

const server_port = process.env.PORT || process.argv[2] || 22120;
const mode = process.argv[3] || 'save';
const chrome_port = process.argv[4] || 9222;

const Pref = {};
export const CONFIG_DIR = path.resolve(os.homedir(), '.config', 'dosyago', 'DownloadNet');
fs.mkdirSync(CONFIG_DIR, {recursive:true});
const pref_file = path.resolve(CONFIG_DIR, 'config.json');
const cacheId = Math.random();

loadPref();

let BasePath = Pref.BasePath;
export const archive_root = () => path.resolve(BasePath, '22120-arc');
export const no_file = () => path.resolve(archive_root(), 'no.json');
export const temp_browser_cache = () => path.resolve(archive_root(), 'temp-browser-cache' + cacheId);
export const library_path = () => path.resolve(archive_root(), 'public', 'library');
export const cache_file = () => path.resolve(library_path(), 'cache.json');
export const index_file = () => path.resolve(library_path(), 'index.json');
export const fts_index_dir = () => path.resolve(library_path(), 'fts');

const flex_fts_index_dir = base => path.resolve(base || fts_index_dir(), 'flex');
const ndx_fts_index_dir = base => path.resolve(base || fts_index_dir(), 'ndx');
const fuzzy_fts_index_dir = base => path.resolve(base || fts_index_dir(), 'fuzzy');

const results_per_page = 10;

updateBasePath(process.argv[5] || Pref.BasePath || CONFIG_DIR);

const args = {
  mode,

  server_port, 
  chrome_port,

  updateBasePath,
  getBasePath,

  library_path,
  no_file,
  temp_browser_cache,
  cache_file,
  index_file,
  fts_index_dir,
  flex_fts_index_dir,
  ndx_fts_index_dir,
  fuzzy_fts_index_dir,

  results_per_page,
  CONFIG_DIR
};

export default args;

function updateBasePath(new_base_path, {force:force = false, before: before = []} = {}) {
  new_base_path = path.resolve(new_base_path);
  if ( !force && (BasePath == new_base_path) ) {
    return false;
  }

  console.log(`Updating base path from ${BasePath} to ${new_base_path}...`);
  BasePath = new_base_path;

  if ( Array.isArray(before) ) {
    for( const task of before ) {
      try { task(); } catch(e) { 
        console.error(`before updateBasePath task failed. Task: ${task}`);
      }
    }
  } else
================================================
SYMBOL INDEX (201 symbols across 18 files)
================================================

FILE: public/find_cleaned_duplicates.mjs
  constant CLEAN (line 13) | const CLEAN = true;
  constant CONCURRENT (line 14) | const CONCURRENT = 7;
  function make (line 30) | async function make() {
  function cleanup (line 68) | function cleanup() {
  function clean (line 87) | function clean(urlString) {
  function clean2 (line 108) | function clean2(urlString) {
  function curlCommand (line 114) | function curlCommand(url) {

FILE: public/find_crawlable.mjs
  constant CLEAN (line 7) | const CLEAN = false;
  constant CONCURRENT (line 8) | const CONCURRENT = 7;
  function make (line 22) | async function make() {
  function cleanup (line 34) | function cleanup() {
  function clean (line 46) | function clean(urlString) {
  function clean2 (line 67) | function clean2(urlString) {
  function curlCommand (line 73) | function curlCommand(url) {

FILE: public/injection.js
  constant DEBUG (line 3) | const DEBUG = debug || false;
  function getInjection (line 5) | function getInjection({sessionId}) {

FILE: public/make_top.mjs
  constant CLEAN (line 7) | const CLEAN = false;
  constant CONCURRENT (line 8) | const CONCURRENT = 7;
  function make (line 25) | async function make() {
  function make_v2 (line 90) | async function make_v2() {
  function cleanup (line 153) | function cleanup() {
  function make_v1 (line 180) | async function make_v1() {
  function clean (line 204) | function clean(urlString) {
  function clean2 (line 225) | function clean2(urlString) {
  function curlCommand (line 231) | function curlCommand(url) {

FILE: public/problem_find.mjs
  constant CLEAN (line 13) | const CLEAN = false;
  constant CONCURRENT (line 14) | const CONCURRENT = 7;
  function make (line 30) | async function make() {
  function cleanup (line 53) | function cleanup() {
  function clean (line 71) | function clean(urlString) {
  function clean2 (line 92) | function clean2(urlString) {
  function curlCommand (line 98) | function curlCommand(url) {

FILE: src/app.js
  constant BROWSERS (line 21) | const BROWSERS = [
  constant BASE_CHROME_FLAGS (line 111) | const BASE_CHROME_FLAGS = [
  function promptUser (line 159) | async function promptUser(question, options) {
  function findExecutablePath (line 180) | async function findExecutablePath(browserDef) {
  function detectInstalledBrowsers (line 215) | async function detectInstalledBrowsers() {
  function checkIsConnectable (line 227) | async function checkIsConnectable(browserDef) { // Takes browserDef
  function detectBrowsers (line 250) | async function detectBrowsers() {
  function killBrowser (line 290) | async function killBrowser(browserName) {
  function cleanTempCache (line 332) | async function cleanTempCache() {
  function start (line 348) | async function start() {
  function cleanup (line 573) | async function cleanup(reason, err, { exit = false } = {}) {

FILE: src/archivist.js
  constant DEBUG (line 55) | const DEBUG = debug || false;
  constant STRIP_CHARS (line 58) | const STRIP_CHARS = /[\u0001-\u001a\0\v\f\r\t\n]/g;
  constant NDX_OLD (line 61) | const NDX_OLD = false;
  constant USE_FLEX (line 62) | const USE_FLEX = true;
  constant FTS_INDEX_DIR (line 63) | const FTS_INDEX_DIR = args.fts_index_dir;
  constant URI_SPLIT (line 64) | const URI_SPLIT = /[/.]/g;
  constant NDX_ID_KEY (line 65) | const NDX_ID_KEY = 'ndx_id';
  constant INDEX_HIDDEN_KEYS (line 66) | const INDEX_HIDDEN_KEYS = new Set([
  constant FLEX_OPTS (line 81) | const FLEX_OPTS = {
  constant REMOVED_CAP_TO_VACUUM_NDX (line 92) | const REMOVED_CAP_TO_VACUUM_NDX = 10;
  constant NDX_FIELDS (line 93) | const NDX_FIELDS = ndxDocFields();
  constant REGULAR_SEARCH_OPTIONS_FUZZY (line 99) | const REGULAR_SEARCH_OPTIONS_FUZZY = {
  constant HIGHLIGHT_OPTIONS_FUZZY (line 102) | const HIGHLIGHT_OPTIONS_FUZZY = {
  constant FUZZ_OPTS (line 105) | const FUZZ_OPTS = {
  constant BLANK_STATE (line 131) | const BLANK_STATE = {
  constant BODYLESS (line 172) | const BODYLESS = new Set([
  constant NEVER_CACHE (line 178) | const NEVER_CACHE = new Set([
  constant CACHE_FILE (line 188) | const CACHE_FILE = args.cache_file;
  constant INDEX_FILE (line 189) | const INDEX_FILE = args.index_file;
  constant NO_FILE (line 190) | const NO_FILE = args.no_file;
  constant TBL (line 191) | const TBL = /(:\/\/|:|@)/g;
  constant UNCACHED_BODY (line 192) | const UNCACHED_BODY = b64('We have not saved this data');
  constant UNCACHED_CODE (line 193) | const UNCACHED_CODE = 404;
  constant UNCACHED_HEADERS (line 194) | const UNCACHED_HEADERS = [
  constant UNCACHED (line 198) | const UNCACHED = {
  function collect (line 216) | async function collect({chrome_port:port, mode} = {}) {
  function neverCache (line 866) | function neverCache(url) {
  function dontCache (line 870) | function dontCache(request) {
  function processDoc (line 878) | function processDoc({documents, strings}) {
  function isReady (line 922) | async function isReady() {
  function loadFuzzy (line 926) | async function loadFuzzy({fromMemOnly: fromMemOnly = false} = {}) {
  function getContentSig (line 939) | function getContentSig(doc) {
  function getURI (line 943) | function getURI(url) {
  function saveFuzzy (line 947) | function saveFuzzy(basePath) {
  function clearSavers (line 959) | function clearSavers() {
  function loadFiles (line 976) | async function loadFiles() {
  function getMode (line 1107) | function getMode() { return Mode; }
  function saveFiles (line 1109) | function saveFiles({useState: useState = false, forceSave:forceSave = fa...
  function changeMode (line 1125) | async function changeMode(mode) {
  function getDetails (line 1132) | function getDetails(id) {
  function findOffsets (line 1139) | function findOffsets(query, doc, maxLength = 0) {
  function beforePathChanged (line 1149) | function beforePathChanged(new_path, {force: force = false} = {}) {
  function afterPathChanged (line 1165) | async function afterPathChanged() {
  function saveCache (line 1172) | function saveCache(path) {
  function saveIndex (line 1178) | function saveIndex(path) {
  function getIndex (line 1199) | function getIndex() {
  function deleteFromIndexAndSearch (line 1207) | async function deleteFromIndexAndSearch(url) {
  function search (line 1226) | async function search(query) {
  function combineResults (line 1257) | function combineResults({flex,ndx,fuzz}) {
  function countRank (line 1282) | function countRank(record, weight = 1.0) {
  function processFuzzResults (line 1296) | function processFuzzResults(docs) {
  function saveFTS (line 1302) | async function saveFTS(path = undefined, {forceSave:forceSave = false} =...
  function shutdown (line 1337) | function shutdown(then) {
  function b64 (line 1345) | function b64(s) {
  function NDXIndex (line 1349) | function NDXIndex(fields) {
  function loadNDXIndex (line 1433) | function loadNDXIndex(ndxFTSIndex) {
  function toNDXDoc (line 1442) | function toNDXDoc({id, url, title, pageText}) {
  function ndxDocFields (line 1454) | function ndxDocFields({namesOnly:namesOnly = false} = {}) {
  function untilHas (line 1473) | async function untilHas(thing, key, {timeout: timeout = false} = {}) {
  function getNDXPath (line 1536) | function getNDXPath(basePath) {
  function getFuzzyPath (line 1540) | function getFuzzyPath(basePath) {
  function getFlexBase (line 1544) | function getFlexBase(basePath) {
  function addFrameNode (line 1548) | function addFrameNode(observedFrame) {
  function updateFrameNode (line 1563) | function updateFrameNode(frameNavigated) {
  function getRootFrameURL (line 1647) | function getRootFrameURL(frameId) {
  function archiveAndIndexURL (line 1675) | async function archiveAndIndexURL(url, {
  function startCrawl (line 1876) | async function startCrawl({

FILE: src/args.js
  constant CONFIG_DIR (line 10) | const CONFIG_DIR = path.resolve(os.homedir(), '.config', 'dosyago', 'Dow...
  function updateBasePath (line 59) | function updateBasePath(new_base_path, {force:force = false, before: bef...
  function getBasePath (line 126) | function getBasePath() {
  function loadPref (line 130) | function loadPref() {
  function savePref (line 144) | function savePref() {
  function clone (line 152) | function clone(o) {

FILE: src/blockedResponse.js
  constant BLOCKED_CODE (line 1) | const BLOCKED_CODE = 200;
  constant BLOCKED_BODY (line 2) | const BLOCKED_BODY = Buffer.from(`
  constant BLOCKED_HEADERS (line 7) | const BLOCKED_HEADERS = [
  constant BLOCKED_RESPONSE (line 16) | const BLOCKED_RESPONSE = `

FILE: src/bookmarker.js
  constant DEBUG (line 7) | const DEBUG = debug || false;
  constant FS_WATCH_OPTS (line 12) | const FS_WATCH_OPTS = {
  constant UDD_PATHS (line 18) | const UDD_PATHS = {
  constant PLAT_TABLE (line 26) | const PLAT_TABLE = {
  constant PROFILE_DIR_NAME_REGEX (line 30) | const PROFILE_DIR_NAME_REGEX = /^(Default|Profile \d+)$/i;
  constant BOOKMARK_FILE_NAME_REGEX (line 32) | const BOOKMARK_FILE_NAME_REGEX = /^Bookmarks$/i;
  function shutdown (line 176) | async function shutdown() {
  function hasBookmark (line 198) | function hasBookmark(url) {
  function getProfileRootDir (line 206) | function getProfileRootDir() {
  function flatten (line 246) | function flatten(bookmarkObj, {toMap: toMap = false, map} = {}) {
  function resolveEnvironmentVariablesToPathSegments (line 312) | function resolveEnvironmentVariablesToPathSegments(path) {

FILE: src/common.js
  constant DEEB (line 9) | const DEEB = process.env.DEBUG_22120_VERBOSE || false;
  constant DEBUG (line 11) | const DEBUG = {
  constant SHOW_FETCH (line 23) | const SHOW_FETCH = false;
  constant PUBLIC_SERVER (line 30) | const PUBLIC_SERVER = true;
  constant MIN_TIME_PER_PAGE (line 33) | const MIN_TIME_PER_PAGE = 10000;
  constant MAX_TIME_PER_PAGE (line 34) | const MAX_TIME_PER_PAGE = 32000;
  constant MIN_WAIT (line 35) | const MIN_WAIT = 200;
  constant MAX_WAITS (line 36) | const MAX_WAITS = 300;
  constant BATCH_SIZE (line 37) | const BATCH_SIZE = 5;
  constant MAX_REAL_URL_LENGTH (line 38) | const MAX_REAL_URL_LENGTH = 2**15 - 1;
  constant CHECK_INTERVAL (line 40) | const CHECK_INTERVAL = 400;
  constant TEXT_NODE (line 41) | const TEXT_NODE = 3;
  constant MAX_HIGHLIGHTABLE_LENGTH (line 42) | const MAX_HIGHLIGHTABLE_LENGTH = 0;
  constant MAX_TITLE_LENGTH (line 43) | const MAX_TITLE_LENGTH = 140;
  constant MAX_URL_LENGTH (line 44) | const MAX_URL_LENGTH = 140;
  constant MAX_HEAD (line 45) | const MAX_HEAD = 140;
  constant LOCALP (line 47) | const LOCALP = path.resolve(os.homedir(), 'local-sslcerts', 'privkey.pem');
  constant ANYP (line 48) | const ANYP = path.resolve(os.homedir(), 'sslcerts', 'privkey.pem');
  constant GO_SECURE (line 49) | const GO_SECURE = fs.existsSync(LOCALP) || fs.existsSync(ANYP);
  class RichError (line 53) | class RichError extends Error {
    method constructor (line 54) | constructor(msg) {
  constant FORBIDDEN_TEXT_PARENT (line 68) | const FORBIDDEN_TEXT_PARENT = new Set([
  constant ERROR_CODE_SAFE_TO_IGNORE (line 76) | const ERROR_CODE_SAFE_TO_IGNORE = new Set([
  constant SNIP_CONTEXT (line 91) | const SNIP_CONTEXT = 31;
  constant NO_SANDBOX (line 93) | const NO_SANDBOX = (process.env.DEBUG_22120 && process.env.SET_22120_NO_...
  constant APP_ROOT (line 95) | const APP_ROOT = __ROOT;
  function say (line 99) | function say(o) {
  function clone (line 103) | function clone(o) {
  function untilTrue (line 107) | async function untilTrue(pred, waitOverride = MIN_WAIT, maxWaits = MAX_W...

FILE: src/gem-highlighter.js
  constant MAX_ACCEPT_SCORE (line 6) | const MAX_ACCEPT_SCORE = 0.5;
  constant CHUNK_SIZE (line 7) | const CHUNK_SIZE = 12;
  function params (line 9) | function params(qLength, chunkSize = CHUNK_SIZE) {
  function markText (line 35) | function markText(text, query) {
  function highlight (line 48) | function highlight(query, doc, {
  function getFragmenter (line 260) | function getFragmenter(chunkSize, {overlap = false, step = 1} = {}) {
  function trilight (line 312) | function trilight(query, doc, {

FILE: src/highlighter.js
  constant MAX_ACCEPT_SCORE (line 6) | const MAX_ACCEPT_SCORE = 0.5;
  constant CHUNK_SIZE (line 7) | const CHUNK_SIZE = 12;
  function internalMarkText (line 11) | function internalMarkText(textToMark, queryToFind) {
  function calculateUkkonenParams (line 25) | function calculateUkkonenParams(queryLength, chunkSize = CHUNK_SIZE) {
  function highlight (line 36) | function highlight(query, docString, {
  function getFragmenter (line 202) | function getFragmenter(chunkSize, {overlap = false, symbolsArray = null,...
  function trilight (line 252) | function trilight(query, docString, {

FILE: src/installBrowser.js
  constant SUPPORTED_BROWSERS (line 11) | const SUPPORTED_BROWSERS = ['chrome', 'brave', 'vivaldi', 'edge', 'chrom...
  constant PLATFORM (line 12) | const PLATFORM = process.platform;
  constant ARCH (line 13) | const ARCH = process.arch;
  function installBrowser (line 19) | async function installBrowser(browserName) {
  function checkBrowserAvailability (line 32) | async function checkBrowserAvailability(browserName) {
  function installBrowserForPlatform (line 39) | async function installBrowserForPlatform(browserName) {
  function installOnWindows (line 51) | async function installOnWindows(browserName) {
  function installOnMacOS (line 87) | async function installOnMacOS(browserName) {
  function installOnLinux (line 118) | async function installOnLinux(browserName) {
  function installOnDebian (line 134) | async function installOnDebian(browserName) {
  function installOnFedora (line 175) | async function installOnFedora(browserName) {
  function getLinuxDistro (line 203) | async function getLinuxDistro() {
  function downloadBinary (line 229) | async function downloadBinary(url, outputPath) {
  function getDownloadUrl (line 239) | function getDownloadUrl(browserName, platform, arch) {

FILE: src/launcher.js
  function launch (line 12) | function launch(executablePath, browserArgs = [], options = {}) {

FILE: src/libraryServer.js
  constant SITE_PATH (line 23) | const SITE_PATH = path.resolve(APP_ROOT, '..', 'public');
  function PageLayout (line 43) | function PageLayout({ title, content, currentNav, layoutType = 'default'...
  function start (line 79) | async function start({server_port}) {
  function addHandlers (line 139) | function addHandlers() {
  function stop (line 401) | async function stop() {
  function MainApplicationView (line 419) | function MainApplicationView() {
  function IndexView (line 585) | function IndexView(urls, {edit = false} = {}) {
  function SearchResultView (line 669) | function SearchResultView({results, query, HL, page, hasMore = false}) {

FILE: src/protocol.js
  constant ROOT_SESSION (line 4) | const ROOT_SESSION = "browser";
  constant MESSAGES (line 5) | const MESSAGES = new Map();
  function connect (line 16) | async function connect({port:port = 9222} = {}) {

FILE: src/root.cjs
  constant APP_ROOT (line 6) | const APP_ROOT = dir;
    "chars": 101,
    "preview": ".\\scripts\\sign_windows_release.ps1 -ExePath .\\build\\bin\\dn-win.exe -KeyVaultName codeSigningForever\n\n"
  },
  {
    "path": "src/app.js",
    "chars": 26799,
    "preview": "// app.js\nimport os from 'os';\nimport path from 'path';\nimport fs from 'fs/promises';\nimport { exec } from 'child_proces"
  },
  {
    "path": "src/archivist.js",
    "chars": 66715,
    "preview": "// Licenses\n  // FlexSearch is Apache-2.0 licensed\n    // Source: https://github.com/nextapps-de/flexsearch/blob/bffb255"
  },
  {
    "path": "src/args.js",
    "chars": 4553,
    "preview": "import os from 'os';\nimport path from 'path';\nimport fs from 'fs';\n\nconst server_port = process.env.PORT || process.argv"
  },
  {
    "path": "src/blockedResponse.js",
    "chars": 995,
    "preview": "export const BLOCKED_CODE = 200;\nexport const BLOCKED_BODY = Buffer.from(`\n  <style>:root { font-family: system-ui, mono"
  },
  {
    "path": "src/bookmarker.js",
    "chars": 10364,
    "preview": "import os from 'os';\nimport Path from 'path';\nimport fs from 'fs';\n\nimport {DEBUG as debug} from './common.js';\n\nconst D"
  },
  {
    "path": "src/common.js",
    "chars": 3851,
    "preview": "import path from 'path';\nimport {fileURLToPath} from 'url';\nimport fs from 'fs';\nimport os from 'os';\nimport { root } fr"
  },
  {
    "path": "src/gem-highlighter.js",
    "chars": 22676,
    "preview": "// highlighter.js\n\nimport ukkonen from 'ukkonen';\nimport {DEBUG} from './common.js';\n\nconst MAX_ACCEPT_SCORE = 0.5;\ncons"
  },
  {
    "path": "src/hello.js",
    "chars": 53,
    "preview": "console.log(`hello...is it me you're looking for?`);\n"
  },
  {
    "path": "src/highlighter.js",
    "chars": 22368,
    "preview": "// highlighter.js\n\nimport ukkonen from 'ukkonen';\nimport {DEBUG} from './common.js';\n\nconst MAX_ACCEPT_SCORE = 0.5;\ncons"
  },
  {
    "path": "src/index.js",
    "chars": 89,
    "preview": " \nrequire = require('esm')(module/*, options*/);\nmodule.exports = require('./app.js');\n \n"
  },
  {
    "path": "src/installBrowser.js",
    "chars": 13223,
    "preview": "import { exec } from 'child_process';\nimport { promisify } from 'util';\nimport { createWriteStream } from 'fs';\nimport {"
  },
  {
    "path": "src/launcher.js",
    "chars": 2191,
    "preview": "// launcher.js\nimport { spawn } from 'child_process';\nimport { DEBUG } from './common.js'; // Assuming common.js is acc"
  },
  {
    "path": "src/libraryServer.js",
    "chars": 27716,
    "preview": "import sea from 'node:sea';\nimport http from 'http';\nimport https from 'https';\nimport fs from 'fs';\nimport os from 'os'"
  },
  {
    "path": "src/protocol.js",
    "chars": 5229,
    "preview": "import Ws from 'ws';\nimport {sleep, untilTrue, SHOW_FETCH, DEBUG, ERROR_CODE_SAFE_TO_IGNORE} from './common.js';\n\nconst "
  },
  {
    "path": "src/root.cjs",
    "chars": 215,
    "preview": "const path = require('path');\nconst url = require('url');\n\nconst file = __filename;\nconst dir = path.dirname(file);\ncons"
  },
  {
    "path": "src/root.js",
    "chars": 399,
    "preview": "import path from 'path';\nimport url from 'url';\n\nlet mod;\nlet esm = false;\n\ntry {\n  const [a, b] = [__dirname, __filenam"
  },
  {
    "path": "stampers/macos-new.sh",
    "chars": 13523,
    "preview": "#!/bin/bash\n\n# macOS Single Executable Application (SEA) Stamper, Signer, and Conditional Notarizer for DownloadNet\n\nset"
  },
  {
    "path": "stampers/macos.sh",
    "chars": 1343,
    "preview": "#!/bin/bash\n\nsource $HOME/.nvm/nvm.sh\n\n# Variables\nEXE_NAME=\"$1\"\nJS_SOURCE_FILE=\"$2\"\nOUTPUT_FOLDER=\"$3\"\n\n# Ensure nvm is"
  },
  {
    "path": "stampers/nix.sh",
    "chars": 1182,
    "preview": "#!/bin/bash\n\nsource $HOME/.nvm/nvm.sh\n\n# Variables\nEXE_NAME=\"$1\"\nJS_SOURCE_FILE=\"$2\"\nOUTPUT_FOLDER=\"$3\"\n\n# Ensure nvm is"
  },
  {
    "path": "stampers/notarize_macos.sh",
    "chars": 4499,
    "preview": "#!/bin/bash\n\n# create-notarized-pkg.sh\n# Creates a notarized and stapled .pkg installer from a code-signed binary, signi"
  },
  {
    "path": "stampers/win.bat",
    "chars": 1671,
    "preview": "@echo off\nsetlocal\n\n:: Check for required arguments\nif \"%~3\"==\"\" (\n  echo Usage: %0 executable_name js_source_file outpu"
  },
  {
    "path": "test.sh",
    "chars": 1250,
    "preview": "#!/bin/bash\n\n# Variables\nEXE_NAME=\"$1\"\nJS_SOURCE_FILE=\"$2\"\nOUTPUT_FOLDER=\"$3\"\n\n# Ensure nvm is installed\nif ! command -v"
  }
]

// ... and 1 more file (download for full content)
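The manifest above is a plain JSON array of `{ path, chars, preview }` entries, so it can be consumed programmatically — for example, to rank files by size when deciding what to feed into a context window. A minimal sketch, using two sample entries copied from the listing so the snippet is self-contained (in practice you would `JSON.parse` the saved manifest file instead):

```javascript
// Sketch: working with the GitExtract file manifest.
// The two entries below are copied from the listing above;
// a real script would load the full array from disk.
const manifest = [
  { path: 'src/archivist.js', chars: 66715 },
  { path: 'src/hello.js', chars: 53 },
];

// Sort largest-first to see which files dominate the token budget.
const bySize = [...manifest].sort((a, b) => b.chars - a.chars);

// Total character count across the selected entries.
const totalChars = manifest.reduce((sum, f) => sum + f.chars, 0);

console.log(bySize[0].path, totalChars);
```

The same pattern extends to filtering by path prefix (e.g. only `src/`) before handing the text to a model.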

About this extraction

This page contains the full source code of the dosyago/DiskerNet GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 64 files (322.0 KB), approximately 83.6k tokens, and a symbol index with 201 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.